Saturday, February 19, 2011

Howto Configure Heartbeat Cluster

Server clustering is becoming more and more popular these days.
One of the leading open-source products for HA is called "Heartbeat" (http://www.linux-ha.org/wiki/Heartbeat).

Heartbeat offers a great (and free!) way to achieve HA on a Linux machine.
While configuring Heartbeat may not be the easiest thing, it's surely worth the effort, especially when we are talking about a production machine that requires HA.

In this article I will demonstrate how to configure Heartbeat and achieve full HA. Read on.

For my test I used 2 virtual machines (both running CentOS 5.5 x64): node1 & node2.

Note that every step needs to be done on both hosts.
Also make sure that your firewall is configured properly (UDP port 694 needs to be allowed).

Step 1:
The first thing you want to do is edit your /etc/hosts. In my case it looks like this:


192.168.0.133 node1
192.168.0.134 node1-e1
192.168.0.139 node2
192.168.0.140 node2-e1

Here node1-e1 & node2-e1 are virtual interfaces on the two machines (more explanation ahead).

Step 2:
Configure virtual interfaces on both of the nodes:

root@node1# ifconfig eth0:1 192.168.0.134 netmask 255.255.255.0
root@node2# ifconfig eth0:1 192.168.0.140 netmask 255.255.255.0
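
Aliases created with ifconfig are lost on reboot. The following is a minimal sketch of making them persistent, assuming the stock CentOS 5 network-scripts layout. write_alias_cfg is a hypothetical helper (not part of CentOS or Heartbeat), and the target directory is a parameter so the function can be tried in a scratch directory first:

```shell
#!/bin/bash
# write_alias_cfg is a hypothetical helper, not part of Heartbeat or CentOS.
# Usage: write_alias_cfg <dir> <alias-device> <ip> <netmask>
write_alias_cfg() {
    local dir=$1 dev=$2 ip=$3 mask=$4
    # Write an ifcfg file for the alias so the network service brings it up at boot
    cat > "$dir/ifcfg-$dev" <<EOF
DEVICE=$dev
BOOTPROTO=static
IPADDR=$ip
NETMASK=$mask
ONBOOT=yes
EOF
}

# On node1 (as root) this would be:
#   write_alias_cfg /etc/sysconfig/network-scripts eth0:1 192.168.0.134 255.255.255.0
# On node2:
#   write_alias_cfg /etc/sysconfig/network-scripts eth0:1 192.168.0.140 255.255.255.0
```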

Step 3: 
Install heartbeat on both of the nodes:
# yum install heartbeat

Step 4:
Edit "authkeys" file (/etc/ha.d/authkeys) to include:

auth 1
1 crc
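
The crc method only detects corrupted packets and provides no real authentication, and Heartbeat expects the authkeys file to be readable by root only (mode 600). As a hedged alternative, here is a sketch that writes an authkeys file using sha1 with a random key. write_authkeys is a hypothetical helper; the same generated file would have to be copied to both nodes so that the keys match:

```shell
#!/bin/bash
# write_authkeys is a hypothetical helper; on a real node pass /etc/ha.d/authkeys.
write_authkeys() {
    local file=$1 key
    # 16 random bytes rendered as 32 hex characters
    key=$(od -An -tx1 -N16 /dev/urandom | tr -d ' \n')
    # Create the file inside a subshell so the restrictive umask does not leak out
    ( umask 077; printf 'auth 1\n1 sha1 %s\n' "$key" > "$file" )
    # Enforce mode 600 even if the file already existed with looser permissions
    chmod 600 "$file"
}

# Example (as root, on one node only, then copy the file to the other node):
#   write_authkeys /etc/ha.d/authkeys
```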

Step 5:
Edit "haresources" file (/etc/ha.d/haresources) to include:

node1 IPaddr2::192.168.0.133/24/eth0:2
node2 IPaddr2::192.168.0.139/24/eth0:1

Step 6:
Edit "ha.cf" file (/etc/ha.d/ha.cf) to include:

logfile /var/log/ha-log
debugfile /var/log/ha-debug
logfacility local0
udpport 694
ucast eth0  192.168.0.133
ucast eth0  192.168.0.134
ucast eth0  192.168.0.139
ucast eth0  192.168.0.140
ping node1-e1 node2-e1
node node1 node2
respawn hacluster /usr/lib64/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster
auto_failback on
keepalive 2
deadtime 10
warntime 3
deadping 10
realtime on
initdead 50
debug 1
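
Heartbeat matches the names on the "node" line against the output of uname -n, so a mismatch there is a common reason for a node refusing to join. A small sketch of checking this before restarting; node_in_ha_cf is a hypothetical helper and takes the path to ha.cf as a parameter:

```shell
#!/bin/bash
# node_in_ha_cf is a hypothetical helper.
# Succeeds if <name> appears on a "node" directive in the given ha.cf.
node_in_ha_cf() {
    local cf=$1 name=$2 word
    # Collect every word that follows the "node" keyword, one per line
    for word in $(awk '$1 == "node" { for (i = 2; i <= NF; i++) print $i }' "$cf"); do
        [ "$word" = "$name" ] && return 0
    done
    return 1
}

# On a real node:
#   node_in_ha_cf /etc/ha.d/ha.cf "$(uname -n)" || echo "$(uname -n) is not listed in ha.cf"
```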


Step 7:
Restart Heartbeat:
# /etc/init.d/heartbeat restart

That should be it. Now let's test the HA.
I'm pinging node1 from a third (Windows) machine and then rebooting node1; after a short time node2 takes over and the ping responses resume.


 

Tuesday, February 15, 2011

Arithmetic expressions with BASH

Lately I needed to write a BASH script that iterates over a number of hosts and counts the total number of CPU cores.
Since I had many hosts and wanted to see the overall progress of the script (what percentage is completed), I had to use an arithmetic expression in my script. Let's see how it's done:

I created a file that stores the list of all the hosts in my environment; the file is called "servers.txt".
The script iterates over that file and checks the cores on each server via -
"cat /proc/cpuinfo|grep -i proc|wc -l".

If the server is reachable via ping, the script reads its CPU core count, adds it to the running total, and at the end displays the final result.

The problem was displaying the progress as a percentage, and after scratching my head a bit I came up with this:

echo  "$COUNT*100/$SERVERS_COUNT"|bc -l ;echo -n "Percent done..."

We multiply the current count ($COUNT) by 100 and divide the result by the total number of servers ($SERVERS_COUNT), then pipe the expression to a command called bc, which does the actual calculation.
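
When a whole-number percentage is enough, the same calculation can be sketched with bash's built-in arithmetic expansion instead of bc. This is an alternative to the line above, not what the script below uses; percent_done is a hypothetical helper name:

```shell
#!/bin/bash
# percent_done is a hypothetical helper, not part of the original script.
# Usage: percent_done <count> <total>
percent_done() {
    local count=$1 total=$2
    # Guard against division by zero when the server list is empty
    if [ "$total" -eq 0 ]; then
        echo 0
        return
    fi
    # $(( )) does integer arithmetic, so the result is truncated
    echo $(( count * 100 / total ))
}

percent_done 3 12   # prints 25
```

Multiplying before dividing matters here: with integer arithmetic, 3 / 12 * 100 would evaluate to 0.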

This is the script, enjoy:

#!/bin/bash
# Count the total number of CPU cores across all hosts listed in servers.txt.
core_count() {
    clear
    echo "Starting core count, please wait this might take a while..."

    SERVERS=$(cat servers.txt)
    SERVERS_COUNT=$(wc -l < servers.txt)
    TOTAL_CORES=0
    COUNT=0

    echo "Going to check: $SERVERS_COUNT servers"; sleep 1
    for i in $SERVERS; do
        # Only query hosts that answer a single ping
        if ping -c1 "$i" &>/dev/null; then
            # Anchored form of "grep -i proc|wc -l": counts only the "processor"
            # lines, so model names containing "Processor" are not counted as well
            CURRENT_CORES=$(ssh "$i" "grep -c '^processor' /proc/cpuinfo")
            let TOTAL_CORES+=CURRENT_CORES
            let COUNT+=1
            clear
            echo "Working, please be patient..."
            # bc prints the percentage of servers processed so far
            echo "$COUNT*100/$SERVERS_COUNT" | bc -l; echo -n "Percent done..."
        else
            echo "Server: $i is DOWN, can't determine core number"
        fi
    done

    #############################################
    echo -e "\n"
    echo "Total cores: $TOTAL_CORES"
}

# Main Program
core_count 2>/dev/null

Thursday, February 3, 2011

Howto enable passwordless SSH on NetApp filer

When working in an environment with lots of filers, you might consider enabling passwordless SSH on the filer for easier administration (from some administration host).
The concept stays the same (a public key exchange), but as you know the NetApp Data ONTAP OS uses slightly different syntax, so I decided to write this small guide to help you out:

OK, let's get busy. I'll assume that the filer already has networking configured correctly.


1) The filer does not have SSH enabled by default, so log in via telnet:
admin_host> telnet filer01


2) Set up root password:
filer01> passwd


3) Next, you need to enable SSH (preferably version 2 - as it's more secure):
filer01> secureadmin setup ssh
filer01> secureadmin enable ssh2


4) Make sure your exports file is edited correctly and vol0 is exported to admin_host:
admin_host> showmount -e filer01

You can edit the exports file from the filer with the "wrfile" command. If you have modified the file, remember to re-export the new exports with:

filer01> exportfs -av

5) Next, mount vol0 from the NetApp filer on the administration host:

admin_host> mkdir -p /nfs/filer01/vol0
admin_host> mount -t nfs filer01:/vol/vol0 /nfs/filer01/vol0

Check that you see the mounted volume:

admin_host> ls /nfs/filer01/vol0

If not, you probably have an issue with your firewall or with the exports on the filer side.

6) This is the most critical part: here you create the ssh directory and append your root public key to the filer's authorized_keys:

admin_host> mkdir -p /nfs/filer01/vol0/etc/sshd/root/.ssh/

admin_host> cat /root/.ssh/id_rsa.pub >> /nfs/filer01/vol0/etc/sshd/root/.ssh/authorized_keys
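
Running the append command twice adds the same key twice. A hedged sketch of an idempotent variant for the admin host follows; append_key is a hypothetical helper, and the paths in the usage comment are the ones from the steps above:

```shell
#!/bin/bash
# append_key is a hypothetical helper, not a NetApp or OpenSSH tool.
# Usage: append_key <public-key-file> <authorized_keys-file>
append_key() {
    local pubkey_file=$1 auth_file=$2 key
    # A public key file holds the key on a single line
    key=$(head -n 1 "$pubkey_file")
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -F fixed string, -x whole line, -q quiet: append only if the key is not there yet
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}

# With vol0 mounted as in step 5:
#   append_key /root/.ssh/id_rsa.pub /nfs/filer01/vol0/etc/sshd/root/.ssh/authorized_keys
```

Afterwards, something like "ssh root@filer01 version" from the admin host should run without a password prompt, assuming the key pair in /root/.ssh has no passphrase.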


7) Last, you may want to turn off the rsh and telnet services (for obvious security reasons):

filer01> options rsh.enable off
filer01> options telnet.enable off


You are done.

Wednesday, February 2, 2011

Change date on Linux machine

Every so often we need to adjust the time on a Linux server manually (for example, when the NTP server is not reachable for some reason).

The date command can be used to set the time and date. 

To set the time manually, execute as root:
# date -s "15:13:00"
  Wed Feb 2 15:13:00 CST 2011

If you need to adjust the full date, and not just the time:

# date -s "15:13:30 Feb 2, 2011"
Wed Feb 2 15:13:30 PDT 2011

There is also another way to set the date and time (less elegant):

# date 033121422011.55
Thu Mar 31 21:42:55 PST 2011

The above command does not use the -s option, and the fields are arranged like this:  MMDDhhmmCCYY.ss
where MM = month, DD = day, hh = hour, mm = minute, CCYY = 4 digit year, and ss = seconds.
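
Since the field order is easy to get wrong, the argument can be assembled from its parts. A minimal sketch; date_arg is a hypothetical helper that only builds the string and does not change the clock:

```shell
#!/bin/bash
# date_arg is a hypothetical helper: it prints MMDDhhmmCCYY.ss for the given fields.
# Usage: date_arg <month> <day> <hour> <minute> <year> <seconds>
date_arg() {
    # 10#... forces base 10, so inputs like 08 or 09 are not parsed as octal
    printf '%02d%02d%02d%02d%04d.%02d\n' \
        "$((10#$1))" "$((10#$2))" "$((10#$3))" "$((10#$4))" "$((10#$5))" "$((10#$6))"
}

date_arg 3 31 21 42 2011 55   # prints 033121422011.55
```

As root, the result can then be passed straight to date, for example: date "$(date_arg 3 31 21 42 2011 55)"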