Friday, November 21, 2014

First Steps with Packer

In this short tutorial I'll share some of my first attempts to create images with Packer. I will try to "pack" an EC2 instance with my own modifications into an AMI.

Let the fun start...

We will create a JSON file that describes the Packer build flow:

The configuration is pretty self-explanatory. The interesting part here is the provisioner (which is just a shell script called '' in this case).
The builder used here targets EC2; there are other builders for other cloud providers and even for Docker and VMware.
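As a rough illustration of what such a template contains, here is a minimal sketch that builds one in Python and prints it as JSON. It is not the template used in this post: the AMI ID, instance type, SSH user, AMI name and script name (setup.sh) are placeholders I made up, while the keys are the usual amazon-ebs builder and shell provisioner options.

```python
import json

# Hypothetical values throughout: the AMI ID, region, names and script
# path are placeholders, not the settings used in this post.
template = {
    "builders": [
        {
            "type": "amazon-ebs",
            "access_key": "xxxx",
            "secret_key": "xxxx",
            "region": "eu-west-1",
            "source_ami": "ami-xxxxxxxx",
            "instance_type": "t2.micro",
            "ssh_username": "ubuntu",
            "ami_name": "packer-example {{timestamp}}",
        }
    ],
    "provisioners": [
        # The shell provisioner runs a local script inside the instance.
        {"type": "shell", "script": "setup.sh"}
    ],
}


def render_template(data):
    """Serialize the template dictionary to pretty-printed JSON."""
    return json.dumps(data, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render_template(template))
```

Redirecting the output to a file such as test.json gives something that can be passed to 'packer build'.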

For the example we will make something simple such as:

Make sure both files are in the current working directory. What's left is to run Packer:

 packer build test.json 
...and voila! a new AMI is ready ;)

Saturday, September 20, 2014

Enable/Disable HAproxy Backend Servers via Python

Sometimes we may want to automatically enable or disable machines behind our HAProxy (during a deploy or maintenance). This is how it's done via Python code: the idea is to use HAProxy's Unix socket based API.
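A minimal sketch of the idea, under a few assumptions: HAProxy is configured with an admin-level stats socket, the socket lives at /var/run/haproxy.sock (my placeholder path), and the backend and server names are hypothetical. The 'enable server' and 'disable server' commands are the standard socket commands; everything else here is illustrative rather than the exact code I run.

```python
import socket


def build_command(action, backend, server):
    """Build an 'enable server' / 'disable server' socket command."""
    if action not in ("enable", "disable"):
        raise ValueError("action must be 'enable' or 'disable'")
    return "{0} server {1}/{2}\n".format(action, backend, server)


def send_command(command, socket_path="/var/run/haproxy.sock"):
    """Send one command to the HAProxy Unix socket and return the reply.

    The socket path is an assumption; use whatever 'stats socket'
    points to in your haproxy.cfg.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.connect(socket_path)
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # HAProxy closes the connection after replying
                break
            chunks.append(data)
        return b"".join(chunks).decode("ascii", "replace")
    finally:
        sock.close()
```

For example, send_command(build_command("disable", "web", "app01")) would take the hypothetical server app01 out of the 'web' backend before a deploy, and the matching "enable" call would put it back.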

Monday, August 11, 2014

Experimenting with Ruby, Sinatra & AWS

Lately I have started to explore Ruby, and as part of my ramp-up I am getting familiar with all kinds of interesting libraries and frameworks this language has to offer. One of these libraries is called Sinatra, which basically allows you to create web-based REST API server-side applications with minimum effort. The installation is as simple as -

sudo gem install sinatra

I immediately wanted to test it, and the first thing that came to my mind was: wouldn't it be cool to receive a list of all my AWS EC2 instances by making a REST API call? Sounded easy enough. To accomplish this task I have also used the AWS SDK gem, which similarly can be installed with:

sudo gem install aws-sdk 

Make sure to have the Ruby development package installed for that to work (on Ubuntu 14.04 the package is called ruby1.9.1-dev).

OK, so now that we have all the prerequisites installed, it's time to get our hands dirty...
First of all, we will need to import these libraries into our Ruby code:

require 'aws-sdk'
require 'sinatra'

Next I'll create a connector to AWS EC2 and pass it our AWS credentials and the relevant region:

ec2 = AWS::EC2::Client.new(region: 'eu-west-1',
    access_key_id: 'xxxx',
    secret_access_key: 'xxxx')

We will use the describe_instances method on the 'ec2' object to query our instances.
An object called 'resp' will hold the response from EC2, which includes all the information about our environment, presented as a nested hash:

resp = ec2.describe_instances

Next, we will iterate over the relevant hash keys to filter out the desired output from the nested hash, collecting everything into an array called 'machine_array':

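machine_array = []
resp[:reservation_set].each do |reservation_set|
  reservation_set[:instances_set].each do |instance_set|
    # sort_by on the tag key: a plain .sort fails on more than one tag hash
    instance_set[:tag_set].sort_by { |tag| tag[:key] }.each do |name|
      machine_array.push(name[:value] + " ")
    end
  end
end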
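To make the shape of that nested structure easier to see, here is the same walk sketched in Python over a small hand-written sample. The sample data (instance names, the "Env" tag) is entirely made up, and I am assuming the reservation / instances / tags nesting described above; it only mirrors the iteration logic and does not talk to AWS.

```python
# Hypothetical sample shaped like the nested response described above.
sample_response = {
    "reservation_set": [
        {"instances_set": [
            {"tag_set": [{"key": "Name", "value": "web01"}]},
            {"tag_set": [{"key": "Name", "value": "web02"},
                         {"key": "Env", "value": "prod"}]},
        ]},
        {"instances_set": [
            {"tag_set": [{"key": "Name", "value": "db01"}]},
        ]},
    ]
}


def collect_tag_values(response):
    """Walk reservations -> instances -> tags and collect the tag values."""
    values = []
    for reservation in response["reservation_set"]:
        for instance in reservation["instances_set"]:
            # Sort the tags by key so the output order is stable.
            for tag in sorted(instance["tag_set"], key=lambda t: t["key"]):
                values.append(tag["value"])
    return values


if __name__ == "__main__":
    print(collect_tag_values(sample_response))
```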

I will embed it into a function called 'aws_list' that will return the machine_array (see final piece of code below):

Finally, we will invoke the Sinatra web server and configure it in such a way that our function will be executed each time an HTTP GET request to /aws_list is made:

get '/aws_list' do
  aws_list
end

That's all!
This is the code I have ended up with:

require 'aws-sdk'
require 'sinatra'

def aws_list
  machine_array = []
  ec2 = AWS::EC2::Client.new(region: 'eu-west-1',
                             access_key_id: 'xxxx',
                             secret_access_key: 'xxxx')
  resp = ec2.describe_instances
  resp[:reservation_set].each do |reservation_set|
    reservation_set[:instances_set].each do |instance_set|
      # sort_by on the tag key: a plain .sort fails on more than one tag hash
      instance_set[:tag_set].sort_by { |tag| tag[:key] }.each do |name|
        machine_array.push(name[:value] + " ")
      end
    end
  end
  machine_array
end

# Invoke aws_list when an HTTP GET request to /aws_list is received
get '/aws_list' do
  aws_list
end

Sunday, August 10, 2014

Optimizing Hadoop - Part1 (Hardware, Linux Tunings)

In this series of posts I'll share some of my experience with configuring Hadoop clusters for optimized performance and provide you with general guidance for efficiently optimizing your existing Hadoop cluster.
I will start with the low-level (hardware) configurations and tunings, then cover OS-level tunings, possible JVM tunings and finally the Hadoop platform-level tunings.

Hardware Level configurations, tunings and checks:

Before we begin, it is extremely important to make sure our cluster nodes are aligned with their HW specs. Do all DataNode/TaskTracker nodes have the same amount of memory? Do all the DIMMs operate at the same speed? What about the number of disks and their speed? What about NIC speed? Are there any dropped packets? It is also important that the actual number of installed DIMMs corresponds to the number of channels per CPU, otherwise performance will be sub-optimal.

A good idea is to run some custom scripts combining commands such as 'dmidecode', 'lspci', 'ifconfig', 'ethtool', 'netstat -s', 'fdisk -l' and 'cat /proc/cpuinfo' with a tool such as clustershell, and make sure our nodes are indeed aligned and healthy. Mitigating low-level (HW) issues is mandatory before we begin benchmarking our HW.
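Once the values are collected, spotting the odd node out is easy to automate. The sketch below assumes the per-node values (for example total memory as reported by each node) have already been gathered into a dictionary; the node names and values are made up, and the "majority wins" rule is just one simple way to define what aligned means.

```python
from collections import Counter


def find_outliers(reported):
    """Return the nodes whose value differs from the most common value."""
    if not reported:
        return {}
    majority, _count = Counter(reported.values()).most_common(1)[0]
    return {node: value for node, value in reported.items()
            if value != majority}


if __name__ == "__main__":
    # Hypothetical per-node memory sizes collected from the cluster.
    memory = {"dn01": "64GB", "dn02": "64GB", "dn03": "48GB", "dn04": "64GB"}
    print(find_outliers(memory))
```

The same function can be reused for any single property: DIMM speed, number of disks, NIC link speed and so on.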

A couple of things I would suggest checking -
  • RAID stripe size - Hadoop benefits most from running in JBOD mode. However, certain controllers out there require each disk to be configured as a separate RAID0 array; in that case you should tune the stripe size from 64K to 256K, which may have a significant impact on disk IO (I have observed a ~25% performance boost when going up from 64K to 256K). Another thing is to enable write-back mode if your controller has a battery.
  • Memory - disable power saving mode to increase memory frequency (usually from 1333 to 1600 MHz) and throughput.
  • NIC - limiting the NIC interrupt rate significantly reduces context switching during the shuffle/sort phase (where network load is highest). A good idea is to consult your vendor on how to achieve this.

After we are sure our cluster nodes are aligned with our plan and do not suffer from any HW issue/anomaly, we can continue and conduct the appropriate HW performance tests:

Memory tests - STREAM is a great tool that will help you measure memory bandwidth per node. Nowadays, dual-socket Xeon nodes with 8 memory channels (4 per CPU) and 1600MHz DIMMs can deliver 70-80GB/sec "Triad" results.

Network tests - Should be conducted from each node to each other node, sequentially as well as concurrently, with a tool such as 'iperf'. You should expect about 90% of the NIC bandwidth, meaning roughly 115MB/sec for a 1Gbit network or 1150MB/sec for a 10Gbit network.
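The arithmetic behind those numbers is simple enough to script, which is handy when checking iperf results across many nodes. This is only a sketch: the 0.9 efficiency factor is a rule of thumb (I am assuming decimal megabytes, so 1 Gbit/s is 125 MB/s on the wire), and the 10% tolerance in the check is an arbitrary choice of mine.

```python
def expected_throughput_mb(link_gbit, efficiency=0.9):
    """Expected usable throughput in MB/sec for a link speed in Gbit/s."""
    line_rate_mb = link_gbit * 1000 / 8.0  # raw line rate in MB/sec
    return line_rate_mb * efficiency


def is_healthy(measured_mb, link_gbit, tolerance=0.1):
    """True if the measured result is within tolerance of the expectation."""
    return measured_mb >= expected_throughput_mb(link_gbit) * (1 - tolerance)


if __name__ == "__main__":
    for speed in (1, 10):
        print(speed, "Gbit ->", expected_throughput_mb(speed), "MB/sec")
```

With a flat 0.9 factor this gives 112.5 and 1125 MB/sec, the same ballpark as the figures quoted above.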

Disk tests - Tools such as IOzone will help benchmark our disks. Current 10K RPM SAS disks achieve at best about 170MB/sec, and 7.2K RPM SATA disks can reach about 140MB/sec for sequential reads/writes; random reads/writes will be roughly half of that.

Have you found sub-optimal performance in any of the components above? Perhaps there is still a HW-level issue that needs to be solved before diving into higher-level optimizations.

Linux Tunings

Kernel parameters -

At minimum, we do not want our cluster to ever swap. We also want to decrease the number of TCP re-transmit retries (we do not want to keep re-transmitting to faulty nodes); note that this setting is not recommended for multi-tenant (cloud) environments with higher latency and a higher possible error rate.
It's also a good idea to enable memory over-committing, since Hadoop processes tend to reserve more memory than they actually use. Another important tuning is increasing somaxconn, the socket listen backlog, to be able to deal with connection bursts.

echo 'vm.swappiness = 0' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_retries2 = 2' >> /etc/sysctl.conf
echo 'vm.overcommit_memory = 1' >> /etc/sysctl.conf
echo 'net.core.somaxconn = 4096' >> /etc/sysctl.conf
sysctl -p
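Since appending with 'echo >>' makes it easy to end up with missing or conflicting entries across nodes, a small checker can help. The sketch below parses sysctl.conf-style text and reports every recommended key that is absent or set differently; the function names are my own, and it only looks at the file contents, not at the live kernel values.

```python
RECOMMENDED = {
    "vm.swappiness": "0",
    "net.ipv4.tcp_retries2": "2",
    "vm.overcommit_memory": "1",
    "net.core.somaxconn": "4096",
}


def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()  # the last occurrence wins
    return settings


def missing_or_different(text, recommended=RECOMMENDED):
    """Map each non-matching key to a (current, recommended) tuple."""
    current = parse_sysctl(text)
    return {key: (current.get(key), value)
            for key, value in recommended.items()
            if current.get(key) != value}


if __name__ == "__main__":
    with open("/etc/sysctl.conf") as handle:
        print(missing_or_different(handle.read()))
```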

OS limits -

Linux default limits are too tight for Hadoop. Make sure to tune the limits (for example in /etc/security/limits.conf) for the user running Hadoop services:

hadoop - memlock unlimited
hadoop - core unlimited
hadoop - nofile 65536
hadoop - nproc unlimited
hadoop - nice -10
hadoop - renice -10

File-system tunings -

Make sure the /etc/fstab mount options for your Hadoop disks include the 'noatime' parameter. The gain is that the access-time metadata does not have to be updated on every file read, which improves IO performance.

/dev/sdc  /data01  ext4  defaults,noatime  0 0
/dev/sdd  /data02  ext4  defaults,noatime  0 0
/dev/sde  /data03  ext4  defaults,noatime  0 0
/dev/sdf  /data04  ext4  defaults,noatime  0 0
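A quick way to verify this across nodes is to parse the fstab contents and list the data mounts that still lack the option. The sketch below assumes the Hadoop disks are mounted under a common prefix (/data, as in the example above); the function name is my own.

```python
def mounts_without_noatime(fstab_text, prefix="/data"):
    """Return mount points under 'prefix' that lack the noatime option."""
    missing = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        mount_point, options = fields[1], fields[3].split(",")
        if mount_point.startswith(prefix) and "noatime" not in options:
            missing.append(mount_point)
    return missing


if __name__ == "__main__":
    with open("/etc/fstab") as handle:
        print(mounts_without_noatime(handle.read()))
```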

Also, make sure to reclaim the filesystem blocks that are reserved for privileged processes. By default, 5% of the total filesystem capacity is reserved. This is especially important on big disks (2TB and larger), since a lot of storage space can be reclaimed -

tune2fs -m 0 /dev/sdc
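To get a feeling for how much space this is, here is the arithmetic as a tiny sketch. The disk sizes and the 12-disk node are example figures of mine, and I am using decimal units (1TB = 1000GB).

```python
def reserved_gb(disk_tb, reserved_percent=5.0):
    """Space held back by the reserved-blocks percentage, in GB."""
    return disk_tb * 1000 * reserved_percent / 100.0


if __name__ == "__main__":
    print(reserved_gb(2))        # one 2TB disk
    print(12 * reserved_gb(4))   # a hypothetical node with 12 x 4TB disks
```

So a single 2TB disk gives back 100GB, and the hypothetical 12 x 4TB node gives back 2400GB.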

Disable Transparent Huge Pages (RHEL6+) -

RHEL 6.x includes a feature called "transparent hugepage compaction" which interacts poorly with Hadoop workloads. This can cause a serious performance regression compared to other operating system versions on the same hardware. The symptom is very high kernel space (sys) CPU usage.

echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled

echo 'echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled' >> /etc/rc.local
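To confirm that the setting took effect (and survived a reboot), you can read the same file back: the kernel marks the active value with square brackets. A minimal sketch, assuming the RHEL 6 path used above; on other distributions the file usually lives under /sys/kernel/mm/transparent_hugepage instead.

```python
import re


def active_thp_setting(contents):
    """Return the bracketed (active) value, e.g. 'never' from 'always [never]'."""
    match = re.search(r"\[(\w+)\]", contents)
    return match.group(1) if match else None


def read_thp_setting(
        path="/sys/kernel/mm/redhat_transparent_hugepage/enabled"):
    """Read the sysfs file and return the active setting."""
    with open(path) as handle:
        return active_thp_setting(handle.read())
```

A monitoring check can then simply alert when read_thp_setting() returns anything other than 'never'.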

Enable NSCD -

In environments synced to NIS/LDAP for central authentication, it's possible to enable the NSCD daemon so that user/group information is retrieved from a local cache and not from the server on every lookup.

Tuesday, April 22, 2014

Python - read configuration file

Consider the following configuration file, which consists of a section name and key/value pairs:

$ cat /opt/myapp/myapp.ini
[master]
host = host01
port = 2181

The following code will parse the configuration file, extracting values by section and corresponding key:

import ConfigParser
configParser = ConfigParser.RawConfigParser()
configFilePath = r'/opt/myapp/myapp.ini'
# Load the file; without this call the parser has no sections to read from
configParser.read(configFilePath)
myhost = configParser.get('master', 'host')
myport = configParser.get('master', 'port')
print "Checking host:"+myhost+",port:"+myport


$ ./
Checking host:host01,port:2181
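On Python 3 the module is called configparser (lowercase) and print is a function. Here is a minimal sketch of the same idea; to keep it self-contained it parses an inline sample string instead of the file on disk, and the sample contents are my assumption based on the section and keys used above.

```python
import configparser

# Assumed sample contents; replace with parser.read('/opt/myapp/myapp.ini')
# to load the real file.
SAMPLE = """\
[master]
host = host01
port = 2181
"""


def read_master(text):
    """Return (host, port) from the [master] section of the given text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    host = parser.get("master", "host")
    port = parser.getint("master", "port")  # converts the value to int
    return host, port


if __name__ == "__main__":
    host, port = read_master(SAMPLE)
    print("Checking host:{0},port:{1}".format(host, port))
```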

Monday, March 3, 2014

Random Useful Stuff

In this short post I'll try to collect some useful Linux CLI wisdom I find myself using again and again:

Prepend a timestamp to each line of command output and append to a log:

ps -U hadoop -opid=,vsize=,rss=,pcpu=,comm=,args| gawk '{ print strftime("%Y-%m-%dT%H:%M:%S"), $0; fflush(); }' >> $logfile

AWK - Check ascending incremental (by 1) order:

awk 'NR > 1 && $1 != prev + 1 {print "MISS"} {prev = $1; print $0}' /tmp/servers

AWK - Compare two string columns:

cat /tmp/test | awk '{if ($1 != $2) print $1 " <= not equal to => " $2 ;else print $0 }'

Run 'top' in batch mode (useful for debug):

top -p 7459 -b -d 30 | gawk '/7459/ { print strftime("%Y-%m-%dT%H:%M:%S"), $0; fflush(); }'

Find processes blocked on IO:

while [ 1 ] ;do ps aux|awk '{if ($8 ~ "D") print }';done

Quick GNUplot:

gnuplot -persist <(echo 'plot "testdata.txt" using 1:xticlabels(2) with boxes')

Clean Memory Cache:

sync && echo 3 > /proc/sys/vm/drop_caches

Share current directory quickly via web server (port 8000; on Python 3 use 'python3 -m http.server'):

python -m SimpleHTTPServer

Check URL latency:

curl -L -o /dev/null -s -w '%{time_connect}:%{time_starttransfer}:%{time_total}\n' ""

Determine external IP:

dig +short

Diff local and remote file via 'Meld':

meld local_file.txt <(ssh host cat remote_file.txt)

Share a terminal:

script| tee -a /dev/pts/1 

Parallel ping (multiple hosts in a file):

echo $(cat hosts-file)|xargs -P0 -n1 ping -c1 -w1 

DIY Parallel SSH:

cat hosts-file |xargs -P0 -I host ssh host uname -r

Create an archive of log files:

find /path/to/dir -name '*.log' | tar -c --files-from=- | bzip2 > logs-`date +%F`.tar.bz2

Compressed MySQL dump over SSH:

mysqldump -u root -h host --all-databases |pigz | ssh user@host "cat > /tmp/dump.sql.pgz" 

Start a script with 'nohup' over SSH:

ssh myhost "nohup /path/to/script/ > /path/to/script/nohup.out 2> /path/to/script/nohup.err < /dev/null &" 

Remove old kernels (Ubuntu):

echo $(dpkg --list | grep linux-image | awk '{ print $2 }' | sort -V | sed -n '/'`uname -r`'/q;p') $(dpkg --list | grep linux-headers | awk '{ print $2 }' | sort -V | sed -n '/'"$(uname -r | sed "s/\([0-9.-]*\)-\([^0-9]\+\)/\1/")"'/q;p') | xargs sudo apt-get -y purge

Copy all cron's from one machine to another over SSH:

 crontab -l |ssh myhost001 "crontab -"

Unattended SSH key generation:

ssh-keygen -q -t rsa -f ~/.ssh/id_rsa -N ""

Base conversion (Hex to Dec):

echo 'ibase=16;obase=A;FF' | bc

Check XML syntax:

xmllint --noout core-site.xml; echo $?

Pretty XML output:

xmllint --format yourxmlfile.xml

Check Puppet manifest syntax:

puppet parser validate /path/to/manifest/init.pp

Check ERB template syntax:

erb -x -T '-' mytemplate.erb | ruby -c

Check YAML syntax:

ruby -e "require 'yaml'; YAML.parse(File.read('/path/to/file.yaml'))"

Check Python script syntax:

python -m py_compile /path/to/ 

Check JSON syntax:

cat file.json | python -m json.tool

Java core diagnosis (stack trace):

jstack -J-d64 $JAVA_HOME/bin/java /opt/cores/java.core.7788 

Java core diagnosis (gdb):

gdb $JAVA_HOME/bin/java /opt/cores/java.core.7788 
(gdb) where

Close specific file descriptor (for example 123):

gdb -p PID

(gdb) call close(123)
$1 = 0