Three identical Apache Hadoop clusters were provisioned on Joyent infrastructure, each on a different platform: SmartOS, Ubuntu, and KVM virtual machines. Monitoring showed that the Ubuntu and KVM clusters spent more time in the OS kernel during I/O operations than the SmartOS cluster did. The SmartOS cluster used CPU resources more efficiently and scaled to more mappers and reducers. Getting the basic cluster configuration right and tuning the number of map and reduce tasks are both important for optimizing Hadoop performance.
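As a rough illustration of what tuning the number of map and reduce tasks can involve, the sketch below derives per-node task limits from a worker's core count and memory. The function name `suggest_task_slots`, the 1 GB task heap, the 2 GB reserved for the OS and Hadoop daemons, and the two-thirds map share are all assumptions made for this example; none of them come from the benchmark described above.

```python
def suggest_task_slots(cores, memory_mb, task_heap_mb=1024,
                       reserved_mb=2048, map_share=2 / 3):
    """Return a (map_slots, reduce_slots) starting point for one worker node.

    Hypothetical heuristic: cap concurrent tasks by whichever is scarcer,
    CPU cores or memory left after reserving room for the OS and daemons,
    then split the total between map and reduce tasks.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if memory_mb <= reserved_mb:
        raise ValueError("memory_mb must exceed reserved_mb")

    # How many task JVMs fit in the memory that remains for tasks.
    by_memory = (memory_mb - reserved_mb) // task_heap_mb

    # The scarcer resource sets the total number of concurrent tasks.
    total = min(cores, by_memory)
    if total < 2:
        raise ValueError("node too small for one map and one reduce slot")

    # Give maps the larger share, but always leave one slot for reduces.
    map_slots = min(total - 1, max(1, round(total * map_share)))
    reduce_slots = total - map_slots
    return map_slots, reduce_slots


if __name__ == "__main__":
    # Example: an 8-core worker with 16 GB of RAM (assumed hardware).
    print(suggest_task_slots(cores=8, memory_mb=16384))
```

On a cluster running classic (MRv1) MapReduce, values like these would typically be applied through `mapred.tasktracker.map.tasks.maximum` and `mapred.tasktracker.reduce.tasks.maximum`. Treat the output only as a starting point, and adjust it while watching how much CPU time each node spends in user code versus the kernel.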