Sunday, April 5, 2015

Set Up ElasticSearch cluster on CoreOS (pt1)

For those who are not familiar with CoreOS, it's an extremely thin, Gentoo-derived Linux distribution designed to run and orchestrate Docker containers at scale. In the following tutorial I'll show how to deploy a test ES cluster on top of CoreOS.

1) On your admin node (your laptop?) generate a unique etcd discovery URL:

$ curl -L https://discovery.etcd.io/new

Copy/paste the output as this will be used by our cluster nodes for discovery.

2) Next, from the AWS console or the AWS CLI, launch 3 instances with the following user data (replace the discovery URL with the one you generated in step 1):

#cloud-config

coreos:
  etcd:
    discovery: https://discovery.etcd.io/78c03094374cc2140d261d116c6d31f3
    addr: $public_ipv4:4001
    peer-addr: $public_ipv4:7001
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start

I have used the following AMI ID: ami-0e300d13 (which is CoreOS stable 607).
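If you prefer the CLI, launching the three nodes could look roughly like this (instance type and security group are placeholders, the key name matches the key used later on, and the user-data file is the cloud-config shown above):

$ aws ec2 run-instances --image-id ami-0e300d13 --count 3 \
    --instance-type m3.medium --key-name test-private-key \
    --security-group-ids sg-xxxxxxxx --user-data file://cloud-config.yaml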


3) On the admin node add your AWS private key to the ssh-agent:

 $ eval `ssh-agent -s`
 $ ssh-add ~/.ssh/test-private-key.pem

4) Install Go and build fleetctl; we will use it to orchestrate our CoreOS cluster:

$ apt-get update; apt-get install golang -y
$ git clone https://github.com/coreos/fleet.git /opt/fleet
$ cd /opt/fleet ; ./build
$ echo "export PATH=${PATH}:/opt/fleet/bin" >> ~/.bashrc 
$ source ~/.bashrc

5) Check fleetctl functionality:

$ fleetctl --tunnel coreos1 list-machines
MACHINE         IP              METADATA
06673ee6...     172.31.13.15    -
3c5c65e8...     172.31.13.16    -
fd02bd21...     172.31.13.17    -

Where 'coreos1' is the external IP address (or a hostname resolving to it) of one of our cluster nodes.

Voila! Our three-node CoreOS cluster is up and running, brilliant ;)

6) Let's create a sample fleet unit file (the same format as a systemd unit) which will pull the ES Docker image and bind its port 9200 to the host:

$ cat << EOF > ES.service
[Unit]
Description=ElasticSearch
After=docker.service
Requires=docker.service

[Service]
ExecStartPre=/usr/bin/docker pull elasticsearch:latest
ExecStart=/usr/bin/docker run --name elasticsearch -p 9200:9200 elasticsearch
ExecStop=/usr/bin/docker kill elasticsearch
ExecStopPost=/usr/bin/docker rm elasticsearch
TimeoutStartSec=0
Restart=always
RestartSec=10s

[X-Fleet]
X-Conflicts=ES@*.service

EOF

7) Launch our newly created unit file:

$ fleetctl --tunnel coreos1 start ES.service
Unit ES.service launched on 06673ee6.../172.31.13.15

$ fleetctl --tunnel coreos1 list-units
UNIT            MACHINE                         ACTIVE  SUB
ES.service      06673ee6.../172.31.13.15        active  running

Boom! Our ES node is up and running. We can verify its functionality by executing a simple HTTP GET such as:

$ curl -L http://coreos1:9200/_status

We are still missing some important parts, such as persistent data for the ES Docker containers (to survive reboots), node discovery, monitoring and much more, so stay tuned for the next part.


Tuesday, March 3, 2015

Installing CoreOS on BareMetal Server

Installing CoreOS is a fairly simple task.

On the host from which you will administer the CoreOS nodes (aka the "admin machine"), copy an existing SSH public key or generate a new one; it will be used to authenticate to the CoreOS machine(s). If no key pair exists yet:

ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa

cat ~/.ssh/id_rsa.pub

Boot your machine with any Linux LiveCD (with an internet connection ;) ).
Create a cloud-init file - basically a YAML document that describes how the CoreOS machine is going to be installed and configured. At the very minimum it should contain the public key we previously generated/copied:

vim cloud-init.yaml


#cloud-config

ssh_authorized_keys:
  - ssh-rsa AAAblablabla

Save it, then fetch the CoreOS install script:

wget -O core-os-install.sh https://raw.githubusercontent.com/coreos/init/master/bin/coreos-install

Run the install script (this will wipe /dev/sda, of course):

bash ./core-os-install.sh -d /dev/sda -C stable -c cloud-init.yaml

When the installation is done, boot into the new CoreOS system and try to log in as user 'core' with the key pair whose public half you provided in the YAML above.
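For example (replace the address with your machine's IP or hostname):

ssh -i ~/.ssh/id_rsa core@<coreos-host>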

Wednesday, February 18, 2015

MySQL on Docker

Numerous articles have been written on how Docker is going to change IT as we know it by removing the need for full/para-virtualization and configuration management frameworks.

While these statements are a bit exaggerated in my opinion, it seems that the technology is here to stay and is being rapidly adopted, especially by SaaS/web companies, for obvious reasons such as portability and a lower footprint than traditional hypervisors.

In this short post I'd like to present a small "howto" on running a MySQL (nowadays MariaDB) database in a Docker container, and point out some potential pitfalls, so let's get started...

We will create a Dockerfile, which is the bootstrapping manifest for our DB image:
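A minimal sketch of what such a Dockerfile might look like (base image, package names and paths are assumptions on my part):

FROM ubuntu:14.04

# install the MariaDB server from the distribution repositories
RUN apt-get update && apt-get install -y mariadb-server

# don't bind to localhost only (the 'sed' magic mentioned below)
RUN sed -i 's/^bind-address.*/bind-address = 0.0.0.0/' /etc/mysql/my.cnf

EXPOSE 3306
CMD ["mysqld_safe"]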



OK, so what do we have here? We pull an Ubuntu image from the Docker registry, install the server and, with some 'sed' magic, make sure it is not bound to 'localhost' only; all in all pretty standard.

If more modifications to my.cnf were required (and in a real-life scenario this would probably be mandatory), 'sed' would obviously be an ugly way to make them, so instead we could keep a local copy of my.cnf, make all the modifications there, add it to our Dockerfile and run the build process:
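Assuming the customized copy is called my.cnf, sits next to the Dockerfile and is shipped with a line like 'ADD my.cnf /etc/mysql/my.cnf' (replacing the 'sed' hack), the build itself is just:

$ docker build -t local/mariadb .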

At this point we will be able to connect both from the host and from other containers through a TCP socket:
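For example (image and container names follow the sketches above; credentials depend on how the image was built):

$ docker run -d --name mariadb -p 3306:3306 local/mariadb
$ mysql -h 127.0.0.1 -P 3306 -u root -p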

But what about data persistence? Remember that all the local data inside the running container is ephemeral... While we could do something like this:
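A sketch of that naive approach - mounting an empty host directory straight over the data directory (host path and image name are assumptions):

$ docker run -d --name mariadb -p 3306:3306 -v /data/mysql:/var/lib/mysql local/mariadb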
This would leave us without our system data (the metadata tables created at image build time), so what's the solution?

We need to add a wrapper script that will re-initialize the DB in case no metadata is available yet. The script can be added to the Dockerfile via an 'ADD' statement:
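A minimal sketch of such a wrapper (file name and init command are assumptions):

#!/bin/bash
# run.sh - initialize the data directory only if it is empty, then start the server
if [ ! -d /var/lib/mysql/mysql ]; then
    mysql_install_db --user=mysql --datadir=/var/lib/mysql
fi
exec mysqld_safe

In the Dockerfile it would be wired in with something like 'ADD run.sh /run.sh' plus 'CMD ["/run.sh"]'.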



That's better: now our DB runs on persistent storage (/data/mysql on the host machine, which can itself be backed by external SAN or NAS storage).


Friday, November 21, 2014

First Steps with Packer

In this short tutorial I'll share some of my first attempts at creating images with Packer; I will try to "pack" an EC2 instance with my own modifications into an AMI.

Let the fun start...

We will create a JSON file that describes the Packer build flow:
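A minimal sketch of what test.json might look like (region, source AMI, instance type and SSH user are placeholders; AWS credentials are assumed to come from the environment):

{
  "builders": [{
    "type": "amazon-ebs",
    "region": "us-east-1",
    "source_ami": "ami-xxxxxxxx",
    "instance_type": "t2.micro",
    "ssh_username": "ubuntu",
    "ami_name": "my-packed-ami {{timestamp}}"
  }],
  "provisioners": [{
    "type": "shell",
    "script": "bootstrap.sh"
  }]
}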


The configuration is pretty self-explanatory; the builder describes the machine Packer launches (there are builders for the major cloud providers and even for Docker and VMware), and the interesting part here is the provisioner, which in this case is just a shell script called 'bootstrap.sh'.

For the example we will make something simple such as:
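A tiny bootstrap.sh that just bakes a couple of packages into the image would do (package names are arbitrary):

#!/bin/bash
# bootstrap.sh - runs inside the temporary build instance
sudo apt-get update
sudo apt-get install -y nginx htop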

Make sure both files are in your current working directory. All that's left is to run Packer:

 packer build test.json 
 
...and voila! a new AMI is ready ;)

Saturday, September 20, 2014

Enable/Disable HAproxy Backend Servers via Python

Sometimes we may want to enable/disable machines behind our HAProxy automatically (during a deploy or maintenance); here is how it can be done from Python code. The idea is to use HAProxy's Unix-socket-based admin API.
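As a minimal sketch of that socket API (assuming haproxy.cfg contains 'stats socket /var/run/haproxy.sock level admin' and that there is a backend 'web' with a server 'web01'), the commands such a script sends look like this:

$ echo "disable server web/web01" | socat stdio /var/run/haproxy.sock
$ echo "enable server web/web01" | socat stdio /var/run/haproxy.sock

A Python script does the same thing: it connects to the Unix socket, writes the 'disable server'/'enable server' command and reads back the response.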

Sunday, August 10, 2014

Optimizing Hadoop - Part1 (Hardware, Linux Tunings)

In this series of posts I'll share some of my experience with configuring Hadoop clusters for optimized performance and provide general guidance for efficiently tuning an existing Hadoop cluster.
I will start with the low-level (hardware) configuration and tuning, then cover OS-level tunings, possible JVM tunings and finally Hadoop platform-level tunings.

Hardware Level configurations, tunings and checks:

Before we begin, it is extremely important to make sure our cluster nodes actually match their intended HW specs. Do all DataNode/TaskTracker nodes have the same amount of memory? Do all the DIMMs operate at the same speed? What about the number of disks and their speed? What about NIC speed? Are there any dropped packets? It is also important that the actual number of installed DIMMs corresponds to the number of memory channels per CPU, otherwise performance will be sub-optimal.

A good idea is to run some custom scripts combining commands such as 'dmidecode', 'lspci', 'ifconfig', 'ethtool', 'netstat -s', 'fdisk -l' and 'cat /proc/cpuinfo' with a tool such as clustershell, and make sure our nodes are indeed aligned and healthy. Mitigating low-level (HW) issues is mandatory before we begin benchmarking the hardware.
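For example, with clustershell something along these lines can quickly reveal mismatched nodes (node names are placeholders):

clush -b -w node[01-20] "dmidecode -t memory | grep -i speed | sort | uniq -c"
clush -b -w node[01-20] "ethtool eth0 | grep Speed"
clush -b -w node[01-20] "grep -c processor /proc/cpuinfo"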

Couple of things I would suggest checking -
  • RAID stripe size - Hadoop benefits most from running in JBOD mode; however, certain controllers require each disk to be configured as a separate RAID0 array. In that case you should raise the stripe size from 64K to 256K, which can have a significant impact on disk IO (I have observed a ~25% performance boost going from 64K to 256K). Also enable write-back mode if your controller has a battery.
  • Memory - disable power-saving mode to increase memory frequency (usually from 1333 to 1600 MHz) and throughput.
  • Limiting the NIC interrupt rate can significantly reduce context switching during the shuffle/sort phase (where network load is highest) - it's a good idea to consult your vendor on how to achieve this.
After we are sure our cluster nodes match our plan and do not suffer from any HW issue/anomaly, we can continue and conduct the appropriate HW performance tests:

Memory tests - Stream is a great tool that will help you measure memory bandwidth per node; nowadays dual-socket Xeon nodes with 1600 MHz DIMMs can deliver 70-80 GB/s "Triad" results.

Network tests - should be conducted from each node to every other node, sequentially as well as concurrently, with a tool such as 'iperf'; you should expect about 90% of the NIC bandwidth, meaning roughly 115 MB/s on a 1 Gbit link or 1150 MB/s on a 10 Gbit link.

Disk tests - tools such as IOzone will help benchmark our disks. Current 10K RPM SAS disks optimally achieve about ~170 MB/s and 7.2K RPM SATA disks about ~140 MB/s for sequential reads/writes; random reads/writes will be roughly half of that.
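For reference, typical invocations look roughly like this (node names, file sizes and paths are placeholders). Start 'iperf -s' on the target node, then from each source node run:

iperf -c node02 -P 4 -t 30

and for a sequential write/read pass on a single data disk:

iozone -i 0 -i 1 -r 1M -s 16g -f /data01/iozone.tmp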

Have you found sub-optimal performance in any of the components above? Perhaps there is still a HW-level issue that needs to be solved before diving into higher-level optimizations.


Linux Tunings

Kernel parameters -

At a minimum, we do not want our cluster to ever swap. We also want to decrease the number of TCP retransmit retries (we do not want to keep retransmitting to faulty nodes); note that this setting is not recommended for multi-tenant (cloud) environments with higher latency and a higher possible error rate.
It's also a good idea to enable memory overcommit, since Hadoop processes tend to reserve more memory than they actually use. Another important tuning is increasing somaxconn, the socket listen backlog, to be able to deal with connection bursts.

echo 'vm.swappiness = 0' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_retries2 = 2' >> /etc/sysctl.conf
echo 'vm.overcommit_memory = 1' >> /etc/sysctl.conf

echo 'net.core.somaxconn = 4096' >> /etc/sysctl.conf
sysctl -p


OS limits -

Linux default limits are too tight for Hadoop; make sure to raise the limits (in /etc/security/limits.conf) for the user running the Hadoop services:

hadoop - memlock unlimited
hadoop - core unlimited
hadoop - nofile 65536
hadoop - nproc unlimited
hadoop - nice -10
hadoop - priority -10


File-system tunings -

Make sure the /etc/fstab mount options for the Hadoop data disks include the 'noatime' parameter; the gain is that no access-time metadata has to be updated on every read, improving IO performance.

/dev/sdc  /data01  ext4  defaults,noatime  0 0
/dev/sdd  /data02  ext4  defaults,noatime  0 0
/dev/sde  /data03  ext4  defaults,noatime  0 0
/dev/sdf  /data04  ext4  defaults,noatime  0 0

Also, make sure to reclaim the filesystem blocks that are by default reserved for privileged processes. By default 5% of the total filesystem capacity is reserved; this matters especially on big disks (2TB+), since a lot of storage space can be reclaimed -

tune2fs -m 0 /dev/sdc

Disable Transparent Huge Pages (RHEL6+) -

RHEL 6.x includes a feature called "transparent hugepage compaction" which interacts poorly with Hadoop workloads. It can cause a serious performance regression compared to other operating system versions on the same hardware; the symptom is very high kernel-space (sys) CPU usage.

echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled

echo 'echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled' >> /etc/rc.local

Enable NSCD -

In environments synced to NIS/LDAP for central authentication, it is worth enabling the NSCD daemon so user/group information is served from a local cache instead of being fetched from the server on every lookup.
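On RHEL/CentOS this typically boils down to:

yum install nscd -y
chkconfig nscd on
service nscd start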

Tuesday, April 22, 2014

Python - read configuration file


Consider the following configuration file, which consists of a section name and key/value pairs:

$ cat /opt/myapp/myapp.ini
[master]
host=host01
port=2181


The following code (myapp.py) will parse the configuration file, extracting values by section and corresponding key:

#!/usr/bin/python
import ConfigParser
configParser = ConfigParser.RawConfigParser()
configFilePath = r'/opt/myapp/myapp.ini'
configParser.read(configFilePath)
myhost = configParser.get('master', 'host')
myport = configParser.get('master', 'port')
print "Checking host:"+myhost+",port:"+myport



Run:

$ ./myapp.py
Checking host:host01,port:2181

Sunday, June 30, 2013

Unattended backups for Cisco appliances using scp


It's good practice to keep your Cisco running configuration backed up to a remote repository on a regular basis; the most convenient way I have found is using the 'archive' feature in IOS and transferring the configuration over 'scp':
 
router01#conf t
router01(config)#archive
router01(config-archive)#path scp://bkpadmin:passw0rd@10.10.1.10//backup/bkp-$h-$t
router01(config-archive)#time-period 720
router01(config-archive)#do wr

Where:
  • 10.10.1.10 - is my backup server
  • bkpadmin/passw0rd - my remote user credentials.
  • $h - is the hostname of the appliance
  • $t - is the backup time stamp
  • Backup time interval is specified in minutes, so in my case the backup occurs twice a day (720 minutes = 12 hours).

Your running-config will be saved in a file such as:

#ls /backup
bkp-router01-Jun-30-11-27-15.585-0

Saturday, March 9, 2013

AWS VPC port forwarding techniques

Port forwarding using 'iptables' is extremely useful for ad-hoc interactions with instances located in the private subnet of your VPC, in situations where you do not wish to redesign your network architecture.
As you probably already know, instances in a private subnet cannot interact with the outside world unless they are configured to use a NAT instance (located in the public subnet) as their gateway.

So, for example, let's say I want to forward any requests coming from the outside world on port 8080, via my NAT instance's Elastic IP (which is an external, routable IP address), to an instance located in my private subnet, the Puppet Master server. In this setup:
  • My NAT instance external IP address (Elastic IP) is: 123.123.123.123
  • My NAT instance internal IP address is: 10.0.0.254
  • My Puppet Master internal IP address is: 10.0.1.239

First, on the NAT instance make sure IP forwarding is enabled:
[root@ip-10-0-0-254 ~]#cat /proc/sys/net/ipv4/ip_forward
1
[root@ip-10-0-0-254 ~]#
We are good to go....
Next, we will redirect any requests coming in on port 8080 to 10.0.1.239, port 8080:
 
[root@ip-10-0-0-254 ~]# iptables -t nat -I PREROUTING -i eth0 -p tcp --dport 8080 -j DNAT --to 10.0.1.239:8080

Note that in some cases you will want to limit this to traffic coming from the outside only, since the rule above will forward any request destined for port 8080 (even from inside the VPC); the best solution is to match on the destination IP address of the NAT instance -

[root@ip-10-0-0-254 ~]# iptables -t nat -I PREROUTING -d 10.0.0.254 -p tcp --dport 8080 -j DNAT --to 10.0.1.239:8080

Pay attention that I've specified the NAT instance's internal IP address. The reason is that the destination IP of the packet is in fact the NAT instance's internal IP - Amazon EC2 already performs NAT when mapping Elastic IPs to instance internal IP addresses.

Verify the command worked with:

[root@ip-10-0-0-254 ~]#iptables -L -t nat -v


Save your iptables configuration:
[root@ip-10-0-0-254 ~]#iptables-save > fw_conf_`date +%F`
[root@ip-10-0-0-254 ~]#/etc/init.d/iptables save

Make sure the security group your NAT instance is currently using allows relevant incoming traffic.

Finally, test the connection from outside of the VPC (make sure traffic is not blocked by any security group):

>telnet 123.123.123.123 8080

Your request should now be redirected to the back-end node in the private subnet of the VPC.

Cheers.

Tuesday, June 19, 2012

OpenLDAP with phpLDAPadmin (CentOS6)

In the following tutorial I will demonstrate how to install and configure OpenLDAP with phpLDAPadmin extension for convenient directory administration on CentOS 6.2 x86_64 machine.

Install OpenLDAP:

1) Install the relevant packages:
#yum install openldap-servers openldap-clients -y

#chkconfig slapd on


Configure OpenLDAP:

This is where things start to get nasty :)

Edit the server configuration file (create it if it does not exist):
#vi /etc/openldap/slapd.conf

And add the following lines (they specify LDAP pid file and arguments file):
pidfile     /var/run/openldap/slapd.pid
argsfile    /var/run/openldap/slapd.args

You can remove the config files under /etc/openldap/slapd.d:
# \rm -rf /etc/openldap/slapd.d/*
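If you wiped slapd.d, regenerate the configuration database from slapd.conf before editing the per-entry LDIF files referenced below; one way to do this (adjust to your setup) is:

#slaptest -f /etc/openldap/slapd.conf -F /etc/openldap/slapd.d/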

Next we will need to add a couple of configuration entries:
#vi /etc/openldap/slapd.d/cn=config/olcDatabase\={0}config.ldif

Comment out:
#olcAccess: {0}to *  by * none
...and insert:
olcAccess:  {0}to * by dn.exact=gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth manage by * break

Another configuration (create a new file if it does not exist):
#vi /etc/openldap/slapd.d/cn=config/olcDatabase\={1}monitor.ldif

Insert the following content:
dn: olcDatabase={1}monitor
objectClass: olcDatabaseConfig
olcDatabase: {1}monitor
olcAccess: {1}to * by dn.exact=gidNumber=0+uidNumber=0,cn=peercred,cn=external,cn=auth manage by * break
olcAddContentAcl: FALSE
olcLastMod: TRUE
olcMaxDerefDepth: 15
olcReadOnly: FALSE
olcMonitoring: FALSE
structuralObjectClass: olcDatabaseConfig
creatorsName: cn=config
modifiersName: cn=config

Make sure the configuration files are owned by the 'ldap' user (if the installation has not created this user, you may add it manually with useradd):
#chown ldap.ldap -R /etc/openldap/slapd.d/
#chmod -R 700 /etc/openldap/slapd.d/

Start the LDAP server and check it is listening on port 389:
#/etc/init.d/slapd start
#netstat -ntulp|grep 389

Import all the needed schemas:
#ldapadd -Y EXTERNAL -H ldapi:/// -f /etc/openldap/schema/core.ldif
#ldapadd -Y EXTERNAL -H ldapi:/// -f /etc/openldap/schema/cosine.ldif
#ldapadd -Y EXTERNAL -H ldapi:/// -f /etc/openldap/schema/nis.ldif
#ldapadd -Y EXTERNAL -H ldapi:/// -f /etc/openldap/schema/inetorgperson.ldif

Change the LDAP admin password:
#slappasswd

Save the SSHA hash; we will need it in the next stage.
It's time to create our LDAP frontend/backend LDIF files.

The backend LDIF file (server_backend.ldif) will look like this (make sure you paste your SSHA hash on the 'olcRootPW' line and replace dc=yourdomain,dc=com with your own domain components):

dn: cn=module,cn=config
objectClass: olcModuleList
cn: module
olcModulepath: /usr/lib64/openldap
olcModuleload: back_hdb

dn: olcDatabase=hdb,cn=config
objectClass: olcDatabaseConfig
objectClass: olcHdbConfig
olcDatabase: {2}hdb
olcSuffix: dc=yourdomain,dc=com
olcDbDirectory: /var/lib/ldap
olcRootDN: cn=admin,dc=yourdomain,dc=com
olcRootPW: {SSHA}xxxxxx
olcDbConfig: set_cachesize 0 2097152 0
olcDbConfig: set_lk_max_objects 1500
olcDbConfig: set_lk_max_locks 1500
olcDbConfig: set_lk_max_lockers 1500
olcDbIndex: objectClass eq
olcLastMod: TRUE
olcMonitoring: TRUE
olcDbCheckpoint: 512 30
olcAccess: to attrs=userPassword by dn="cn=admin,dc=yourdomain,dc=com" write by anonymous auth by self write by * none
olcAccess: to attrs=shadowLastChange by self write by * read
olcAccess: to dn.base="" by * read
olcAccess: to * by dn="cn=admin,dc=yourdomain,dc=com" write by * read

Import the backend LDIF file:
#ldapadd -Y EXTERNAL -H ldapi:/// -f server_backend.ldif


The frontend file will look like this:

dn: dc=yourdomain,dc=com
objectClass: top
objectClass: dcObject
objectclass: organization
o: Test Domain
dc: yourdomain

dn: cn=admin,dc=yourdomain,dc=com
objectClass: simpleSecurityObject
objectClass: organizationalRole
cn: admin
userPassword: {SSHA}wPkUaeo450ckN5rT8ZRE7HEpP7W7V3vJ

dn: ou=users,dc=yourdomain,dc=com
objectClass: organizationalUnit
ou: users

dn: ou=groups,dc=yourdomain,dc=com
objectClass: organizationalUnit
ou: groups

Import the frontend LDIF file (server_frontend.ldif):
#ldapadd -x -D cn=admin,dc=yourdomain,dc=com -W -f server_frontend.ldif

Basic configuration is done.
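At this point a quick sanity check with ldapsearch should return the entries we just imported (the bind DN and search base assume the values used above):

#ldapsearch -x -D cn=admin,dc=yourdomain,dc=com -W -b dc=yourdomain,dc=com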


Add users/groups:

We will create 2 files: users.ldif, groups.ldif.

users.ldif:
dn: uid=paul,ou=users,dc=yourdomain,dc=com
objectClass: inetOrgPerson
objectClass: posixAccount
objectClass: shadowAccount
uid: paul
sn: paul
givenName: paul
cn: paul
displayName: paul
uidNumber: 500
gidNumber: 500
userPassword: {crypt}!!$1$ErpqdrvZ$MtK5dCLSh2EHuqxMVjsKJ/
gecos: paul
loginShell: /bin/bash
homeDirectory: /home/paul
shadowExpire: -1
shadowFlag: 0
shadowWarning: 7
shadowMin: 0
shadowMax: 99999
shadowLastChange: 15114

Let's add the user:
#ldapadd -x -D cn=admin,dc=yourdomain,dc=com -W -f users.ldif
 ...

groups.ldif will look like this:

dn: cn=engineering,ou=groups,dc=yourdomain,dc=com
objectClass: posixGroup
cn: engineering
gidNumber: 500

dn: cn=support,ou=groups,dc=yourdomain,dc=com
objectClass: posixGroup
cn: support
gidNumber: 501

We will add the groups via:
#ldapadd -x -D cn=admin,dc=yourdomain,dc=com -W -f groups.ldif




Install phpLDAPadmin:


Get the EPEL repository:
#rpm -Uvh http://ftp-stud.hs-esslingen.de/pub/epel/6/i386/epel-release-6-7.noarch.rpm

Install phpLDAPadmin:
#yum install phpldapadmin -y

Edit phpLDAPadmin configuration file:
#vi /etc/phpldapadmin/config.php

Comment out the line:
//$servers->setValue('login','attr','uid');

Un-comment the line:
$servers->setValue('login','attr','dn');

Make sure the apache ACL settings are correct for phpLDAPadmin:
#grep -i -E 'deny|allow' /etc/httpd/conf.d/phpldapadmin.conf 

  Order Deny,Allow
  Deny from all
  Allow from 10.100.50.0/24
  Allow from ::1

In my case only 10.100.50.0/24 subnet can access phpLDAPadmin.

Restart Apache:
#/etc/init.d/httpd restart

You can access phpLDAPadmin via:
http://your-server/ldapadmin