I will start with low-level hardware configurations, optimizations and tunings, then cover OS-level tunings, possible JVM tunings, and finally Hadoop platform-level tunings.
Hardware-level configurations, tunings and checks:
Before we begin, it is extremely important to make sure our cluster nodes match their HW specs. Do all DataNode/TaskTracker nodes have the same amount of memory? Do all DIMMs operate at the same speed? What about the number of disks and their speed? What about NIC speed? Are there any dropped packets? It is also important that the number of installed DIMMs matches the number of memory channels per CPU, otherwise memory performance will be sub-optimal.
A good idea is to run some custom scripts combining commands such as 'dmidecode', 'lspci', 'ifconfig', 'ethtool', 'netstat -s', 'fdisk -l' and 'cat /proc/cpuinfo' with a tool such as ClusterShell, to make sure our nodes are indeed aligned and healthy. Mitigating low-level (HW) issues is mandatory before we begin benchmarking our HW.
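For example, a minimal ClusterShell sketch (assuming the nodes are reachable over SSH and grouped as 'datanodes' in /etc/clustershell/groups; the group and NIC names are assumptions, adjust them to your environment). 'clush -b' folds identical output together, so any node that differs stands out immediately:
clush -b -g datanodes "grep -c ^processor /proc/cpuinfo"                 # logical CPU count
clush -b -g datanodes "dmidecode -t 17 | grep -c 'Size: [0-9]'"          # populated DIMM count
clush -b -g datanodes "ethtool eth0 | grep Speed"                        # NIC negotiated speed
clush -b -g datanodes "netstat -s | grep -i retrans"                     # TCP retransmissions
clush -b -g datanodes "fdisk -l 2>/dev/null | grep -c '^Disk /dev/sd'"   # disk count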
A couple of things I would suggest checking -
- RAID stripe size - Hadoop benefits most from running disks in JBOD mode; however, certain controllers require each disk to be configured as a separate RAID0 array. In that case you should tune the stripe size up from 64K to 256K, which can have a significant impact on disk IO (I have observed a ~25% performance boost going from 64K to 256K). Another thing is to enable write-back mode if your controller has a battery (see the sketch after this list).
- Memory - disable power-saving mode to increase memory frequency (usually from 1333MHz to 1600MHz) and throughput.
- Limiting the NIC interrupt rate significantly reduces context switching during the shuffle/sort phase (where network load is highest) - it's a good idea to consult your vendor on how to achieve this; a generic coalescing example is shown below.
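As a hedged sketch only - controller and NIC tooling varies, so treat the MegaCli syntax, the enclosure:slot address [252:0], and the eth0 device name below as assumptions to verify against your vendor's documentation:
# LSI-style controller: single-disk RAID0 with a 256KB stripe and
# write-back (WB) caching - only safe with a working battery/BBU.
MegaCli64 -CfgLdAdd -r0 [252:0] WB -strpsz256 -a0
# Reduce the NIC interrupt rate via interrupt coalescing: wait up to
# 100us before raising an RX interrupt instead of one per packet.
ethtool -C eth0 rx-usecs 100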
Memory tests - STREAM is a great tool that will help you measure memory bandwidth per node; nowadays Xeon CPUs with 4 memory channels per socket and 1600MHz DIMMs can deliver 70-80GB/sec "Triad" results on a dual-socket node.
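A minimal sketch of building and running STREAM (the array size and thread count below are assumptions - size the arrays well beyond the CPU caches and match the threads to your physical core count):
wget http://www.cs.virginia.edu/stream/FTP/Code/stream.c
# 80M elements x 8 bytes x 3 arrays ~= 1.9GB, far larger than any CPU cache.
gcc -O3 -fopenmp -DSTREAM_ARRAY_SIZE=80000000 stream.c -o stream
OMP_NUM_THREADS=16 ./stream    # read the "Triad" MB/s line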
Network tests - should be conducted from each node to each node, sequentially as well as concurrently, with a tool such as 'iperf'. You should expect about 90% of the NIC bandwidth, meaning ~115MB/s for a 1Gbit or ~1150MB/s for a 10Gbit network.
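A minimal iperf sketch (node01 is a placeholder hostname):
iperf -s                      # on the receiving node
iperf -c node01 -t 30         # on the sender: a single stream for 30 seconds
iperf -c node01 -t 30 -P 4    # 4 parallel streams to saturate the link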
Disk tests - tools such as IOzone will help benchmark our disks. Current 10K RPM SAS disks optimally achieve about ~170MB/sec and 7.2K RPM SATA disks can reach ~140MB/sec for sequential reads/writes; random reads/writes will be roughly half that.
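A minimal IOzone sketch per disk (the mount point and the 32g file size are assumptions - use a file at least twice the node's RAM so the page cache doesn't mask the disk):
# -i 0 write/rewrite, -i 1 read/reread, -i 2 random read/write,
# -r 1m record size, -s 32g file size, -f test file on the disk under test
iozone -i 0 -i 1 -i 2 -r 1m -s 32g -f /data01/iozone.tmp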
Have you found sub-optimal performance in any of the components above? Then there is probably still a HW-level issue to solve before diving into higher-level optimizations.
Linux Tunings
Kernel parameters -
At a minimum, we never want our cluster to swap. We also want to decrease the number of TCP retransmit retries (we do not want to keep retransmitting to faulty nodes); note that this setting is not recommended for multi-tenant (cloud) environments with higher latency and a higher possible error rate.
It's also a good idea to enable memory over-committing, since Hadoop processes tend to reserve more memory than they actually use. Another important tuning is increasing somaxconn, the socket listen backlog, so nodes can handle connection bursts.
echo 'vm.swappiness = 0' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_retries2 = 2' >> /etc/sysctl.conf
echo 'vm.overcommit_memory = 1' >> /etc/sysctl.conf
echo 'net.core.somaxconn = 4096' >> /etc/sysctl.conf
sysctl -p
OS limits -
Linux default limits are too tight for Hadoop; make sure to tune the limits for the user running the Hadoop services, e.g. in /etc/security/limits.conf:
hadoop - memlock unlimited
hadoop - core unlimited
hadoop - nofile 65536
hadoop - nproc unlimited
hadoop - nice -10
hadoop - priority -10
File-system tunings -
Make sure your /etc/fstab mount options for the Hadoop data disks include the 'noatime' parameter; the gain is that access-time metadata no longer has to be updated on every read, improving IO performance.
/dev/sdc /data01 ext4 defaults,noatime 0 0
/dev/sdd /data02 ext4 defaults,noatime 0 0
/dev/sde /data03 ext4 defaults,noatime 0 0
/dev/sdf /data04 ext4 defaults,noatime 0 0
Also, make sure to reclaim the filesystem blocks that are reserved for privileged processes: by default 5% of total filesystem capacity is reserved. This is especially important on big disks (2TB+), since a lot of storage space can be reclaimed -
tune2fs -m 0 /dev/sdc
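To apply this across all data disks and verify it took effect (the device names follow the fstab example above):
for dev in /dev/sdc /dev/sdd /dev/sde /dev/sdf; do
    tune2fs -m 0 $dev
done
tune2fs -l /dev/sdc | grep 'Reserved block count'   # should report 0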
Disable Transparent Huge Pages (RHEL6+) -
RHEL 6.x includes a feature called "transparent hugepage compaction" which interacts poorly with Hadoop workloads. It can cause a serious performance regression compared to other operating system versions on the same hardware; the symptom is very high kernel-space (sys) CPU usage.
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo 'echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled' >> /etc/rc.local
Enable NSCD -
In environments synced to NIS/LDAP for central authentication, enable the NSCD daemon so that user/group information is retrieved from a local cache rather than from the server on every lookup.
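A minimal sketch on RHEL 6 (SysV init assumed):
yum install -y nscd
service nscd start
chkconfig nscd on    # persist across reboots
nscd -g              # print cache statistics to confirm lookups are being cached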