HW09 Hadoop Vaidya

Hadoop Vaidya
Viraj Bhat ([email_address])
Suhas Gogate ([email_address])
Milind Bhandarkar ([email_address])
Cloud Computing & Data Infrastructure Group, Yahoo! Inc.
Hadoop World, October 2, 2009

Hadoop & Job Optimization: Why?
* Hadoop is a highly configurable commodity cluster computing framework
* Performance tuning of Hadoop jobs is a significant challenge!
  * 165+ tunable parameters
  * Tuning one parameter adversely affects others
* Hadoop Job Optimization
  * Job Performance – User perspective
    * Reduce end-to-end execution time
    * Yield quicker analysis of data
  * Cluster Utilization – Provider perspective
    * Efficient sharing of cluster resources across multiple users
    * Increase overall throughput in terms of number of jobs/unit time

Hadoop Vaidya -- Rule based performance diagnostics Tool
* Rule based performance diagnosis of M/R jobs
  * M/R performance analysis expertise is captured and provided as an input through a set of pre-defined diagnostic rules
  * Detects performance problems by postmortem analysis of a job by executing the diagnostic rules against the job execution counters
  * Provides targeted advice against individual performance problems
* Extensible framework
  * You can add your own rules, based on a rule template and published job counters data structures
  * Write complex rules using existing simpler rules
Vaidya: An expert (versed in his own profession, esp. in medical science), skilled in the art of healing, a physician

Hadoop Vaidya: Status
* Input data used for evaluating the rules
  * Job History, Job Configuration (xml)
* A Contrib project under Apache Hadoop
  * Available in Hadoop version 0.20.0
  * http://issues.apache.org/jira/browse/HADOOP-4179
* Automated deployment for analysis of thousands of daily jobs on the Yahoo! Grids
  * Helps quickly identify inefficient user jobs that use more resources than necessary, and advise their owners appropriately
  * Helps certify user jobs before moving to production clusters (compliance)

Diagnostic Test Rule
<DiagnosticTest>
  <Title>Balanced Reduce Partitioning</Title>
  <ClassName>org.apache.hadoop.vaidya.postexdiagnosis.tests.BalancedReducePartitioning</ClassName>
  <Description>This rule tests as to how well the input to reduce tasks is balanced</Description>
  <Importance>High</Importance>
  <SuccessThreshold>0.20</SuccessThreshold>
  <Prescription>advice</Prescription>
  <InputElement>
    <PercentReduceRecords>0.85</PercentReduceRecords>
  </InputElement>
</DiagnosticTest>
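As a rough illustration of how a rule definition like this could be read by a script, the sketch below parses the XML shown on the slide with Python's standard library. It is not Vaidya's own loader (Vaidya is written in Java); the helper name `parse_rule` and the dict layout are hypothetical, and only the element names are taken from the slide.

```python
import xml.etree.ElementTree as ET

# Rule definition copied from the slide.
RULE_XML = """
<DiagnosticTest>
  <Title>Balanced Reduce Partitioning</Title>
  <ClassName>org.apache.hadoop.vaidya.postexdiagnosis.tests.BalancedReducePartitioning</ClassName>
  <Description>This rule tests as to how well the input to reduce tasks is balanced</Description>
  <Importance>High</Importance>
  <SuccessThreshold>0.20</SuccessThreshold>
  <Prescription>advice</Prescription>
  <InputElement>
    <PercentReduceRecords>0.85</PercentReduceRecords>
  </InputElement>
</DiagnosticTest>
"""


def parse_rule(xml_text):
    """Return the rule's fields as a plain dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # Simple text fields: Title, ClassName, Description, Importance, ...
    rule = {
        child.tag: (child.text or "").strip()
        for child in root
        if child.tag != "InputElement"
    }
    rule["SuccessThreshold"] = float(rule["SuccessThreshold"])
    # Rule-specific numeric parameters live under <InputElement>.
    inputs = root.find("InputElement")
    rule["Inputs"] = (
        {} if inputs is None else {p.tag: float(p.text) for p in inputs}
    )
    return rule


rule = parse_rule(RULE_XML)
print(rule["Title"], rule["SuccessThreshold"], rule["Inputs"])
```

Keeping the rule-specific parameters in a separate `Inputs` mapping mirrors the slide's `<InputElement>` wrapper, so a different rule could carry different parameters without changing the parser.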

Diagnostic Report Element
<TestReportElement>
  <TestTitle>Balanced Reduce Partitioning</TestTitle>
  <TestDescription>This rule tests as to how well the input to reduce tasks is balanced</TestDescription>
  <TestImportance>HIGH</TestImportance>
  <TestResult>POSITIVE(FAILED)</TestResult>
  <TestSeverity>0.98</TestSeverity>
  <ReferenceDetails>
    * TotalReduceTasks: 1000
    * BusyReduceTasks processing 85% of total records: 2
    * Impact: 0.98
  </ReferenceDetails>
  <TestPrescription>
    * Use the appropriate partitioning function
    * For streaming job consider following partitioner and hadoop config parameters
    * org.apache.hadoop.mapred.lib.KeyFieldBasedPartitioner
    * -jobconf stream.map.output.field.separator, -jobconf stream.num.map.output.key.fields
  </TestPrescription>
</TestReportElement>
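To make the relationship between the rule parameters and the report more concrete, here is a minimal Python sketch of a balanced-reduce-partitioning check. The slides do not give the actual formulas, so several things here are assumptions: impact is taken as `1 - busy_tasks / total_tasks`, severity as an importance weight times impact (with High weighted 1.0), and a test counts as failed when severity reaches the success threshold. The function name and return layout are hypothetical, and this assumed formula is not guaranteed to reproduce the 0.98 shown in the report.

```python
def balanced_reduce_partitioning(reduce_input_records,
                                 percent_reduce_records=0.85,
                                 success_threshold=0.20,
                                 importance_weight=1.0):
    """Sketch of the check; formulas are assumptions, not Vaidya's code.

    reduce_input_records: input record count of each reduce task.
    percent_reduce_records / success_threshold: defaults match the
    values in the rule definition on the slide.
    """
    total_tasks = len(reduce_input_records)
    total_records = sum(reduce_input_records)
    if total_tasks == 0 or total_records == 0:
        return {"busy_tasks": 0, "impact": 0.0, "severity": 0.0, "failed": False}

    # Count the fewest tasks (largest first) that together process the
    # configured share of all reduce input records.
    target = percent_reduce_records * total_records
    covered = 0
    busy_tasks = 0
    for records in sorted(reduce_input_records, reverse=True):
        covered += records
        busy_tasks += 1
        if covered >= target:
            break

    # Assumed: few busy tasks out of many means heavy skew, high impact.
    impact = 1.0 - busy_tasks / total_tasks
    severity = importance_weight * impact
    return {
        "busy_tasks": busy_tasks,
        "impact": impact,
        "severity": severity,
        "failed": severity >= success_threshold,
    }


# Skewed job: two reducers receive almost all of the records.
skewed = balanced_reduce_partitioning([450_000, 450_000] + [100] * 998)
# Balanced job: every reducer receives the same number of records.
balanced = balanced_reduce_partitioning([1_000] * 1_000)
print(skewed["busy_tasks"], skewed["failed"])
print(balanced["busy_tasks"], balanced["failed"])
```

Under these assumptions a perfectly balanced job needs 85% of its reducers to cover 85% of the records, giving an impact of about 0.15, which stays under the 0.20 threshold; a job where a handful of reducers do nearly all the work lands close to 1.0 and is flagged.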

Hadoop Vaidya Rules - Examples
* Balanced Reduce Partitioning
  * Checks if intermediate data is well partitioned among reducers
* Map/Reduce tasks reading HDFS files as a side effect
  * Checks if HDFS files are being read as a side effect, causing an access bottleneck across map/reduce tasks
* Percent re-execution of Map/Reduce tasks
* Map tasks data locality
  * Checks the % data locality for Map tasks
* Use of Combiner & Combiner efficiency
  * Checks if there is potential benefit in using a combiner after the map stage
* Intermediate data compression
  * Checks if intermediate data is compressed to lower the shuffle time
* Currently there are 15 rules

Performance Analysis for sample set of Jobs
[Chart not reproduced. Axis label: Vaidya Rules. Total jobs analyzed = 794]

Future Enhancements
* Online progress analysis of the Map/Reduce jobs to improve utilization
* Correlation of various prescriptions suggested by Hadoop Vaidya to detect larger performance bottlenecks
* Proactive SLA monitoring
  * Detect inefficiently executing jobs early enough or those that would eventually fail due to any resource constraints
* Integration with the Job History viewer
* Production Job Certification

Results of Hadoop Vaidya
* Total jobs analyzed = 22602
* Rules which yielded POSITIVE (TEST FAILED)
  * Balanced Reduce Partitioning (4247 jobs / 18.79%)
  * Impact of Map tasks re-execution (1 job)
  * Impact of Reduce tasks re-execution (8 jobs)
  * #Maps/Reduces tasks reading HDFS data as side effect (20570 jobs / 91%)
  * Map side disk spill (864 jobs / 3.8%)
