Page1 © Hortonworks Inc. 2014
Tez: UI & Debugging
Fall 2014
Version 1.0
gopalv@apache.org
Page2 © Hortonworks Inc. 2014 FOR: BAY AREA HADOOP USER GROUP MEETUP #49
TEZ (nomenclature)
• DAG
• Vertex
• Task
• Attempt
• Container
• Edge
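The terms above nest: a DAG is made of vertices joined by edges, each vertex runs as a set of tasks, each task has one or more attempts, and every attempt executes inside a container. A minimal sketch of that hierarchy follows; the class and field names are illustrative assumptions, not Tez's own classes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attempt:
    attempt_id: str
    container_id: str  # the container this attempt ran in
    succeeded: bool


@dataclass
class Task:
    task_id: str
    attempts: List[Attempt] = field(default_factory=list)

    def succeeded(self):
        # A task succeeds if any one of its attempts succeeded.
        return any(a.succeeded for a in self.attempts)


@dataclass
class Vertex:
    name: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Edge:
    source: str  # name of the producing vertex
    target: str  # name of the consuming vertex


@dataclass
class DAG:
    name: str
    vertices: List[Vertex] = field(default_factory=list)
    edges: List[Edge] = field(default_factory=list)

    def failed_tasks(self):
        # (vertex name, task id) for every task with no successful attempt.
        return [(v.name, t.task_id)
                for v in self.vertices
                for t in v.tasks
                if not t.succeeded()]


# Made-up example: one task recovers on its second attempt, one never does.
retried = Task("task_0", [Attempt("a0", "container_1", False),
                          Attempt("a1", "container_2", True)])
broken = Task("task_1", [Attempt("a0", "container_1", False)])
dag = DAG("example",
          vertices=[Vertex("Map 1", [retried, broken]), Vertex("Reducer 2")],
          edges=[Edge("Map 1", "Reducer 2")])
```

Here `dag.failed_tasks()` reports only `task_1`, since a failed attempt does not fail the task as long as a later attempt succeeds.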
Directed Acyclic Graphs
How to view raw DAGs from logs
• Tez Application logs contain .dot files in Graphviz format
• To generate images: dot -Tpng -o dag.png dag.dot
• Or use the JavaScript version: http://people.apache.org/~gopalv/dagviz/
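When an application's logs contain several .dot files, the dot command can be scripted. The sketch below assumes the logs have already been copied to a local directory and that Graphviz is installed; the helper names `dot_command` and `render_all` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def dot_command(dot_file, fmt="png"):
    """Build the Graphviz command line that renders dot_file to an image."""
    src = Path(dot_file)
    out = src.with_suffix("." + fmt)
    return ["dot", "-T" + fmt, "-o", str(out), str(src)]


def render_all(log_dir, fmt="png"):
    """Render every .dot file found under log_dir; return the output paths."""
    if shutil.which("dot") is None:
        raise RuntimeError("Graphviz 'dot' executable not found on PATH")
    rendered = []
    for src in sorted(Path(log_dir).rglob("*.dot")):
        cmd = dot_command(src, fmt)
        subprocess.run(cmd, check=True)  # raises if dot reports an error
        rendered.append(cmd[3])          # cmd[3] is the output image path
    return rendered
```

Calling `render_all("logs/")` would write one image next to each .dot file under that (assumed) directory.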
TEZ-8 JIRA & branch
• TEZ UI for progress tracking and history
• https://issues.apache.org/jira/browse/TEZ-8
• https://github.com/apache/tez/tree/TEZ-8
• UI-centric branch
Tez UI: Landing page
Tez UI: DAG view
Tez UI: Vertex view
Tez UI: Vertex -> Tasks view
Tez UI: Task logs
Tez UI: Task counters
Tez UI: Task counters
Search for counters
Tez UI: Per-edge shuffle counters
Map 3 to Map 1 only
Tez UI: Payload view
Tez UI: Failed DAGs (diagnostic)
Tez UI: Failed tasks indication
Tez UI: Failed tasks
Tez UI: Failed attempts
Post-hoc/Ad-hoc analysis helpers
• tez/tez-tools ships with two helper tools
• swimlanes
• tez-tfile-parser
Swimlanes
• ./yarn-swimlanes.sh application_1415860665053_0098
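The script takes a YARN application ID as its argument. Such IDs have the form application_<ResourceManager start time in milliseconds>_<sequence number>, so a quick check before passing one to a tool can catch copy-and-paste mistakes. This is a small sketch; the helper names are hypothetical and not part of tez-tools.

```python
import re
from datetime import datetime, timezone

APP_ID = re.compile(r"^application_(\d+)_(\d+)$")


def parse_application_id(app_id):
    """Split a YARN application ID into (cluster_start_ms, sequence_number)."""
    m = APP_ID.match(app_id)
    if m is None:
        raise ValueError("not a YARN application ID: %r" % app_id)
    return int(m.group(1)), int(m.group(2))


def cluster_start_time(app_id):
    """ResourceManager start time encoded in the ID, as a UTC datetime."""
    ms, _ = parse_application_id(app_id)
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


cluster_ms, sequence = parse_application_id("application_1415860665053_0098")
```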
TFile parser
• Tez logs can be parsed with Pig
• This lets us treat our logs exactly like we treat our big data
• Processing uses "pig -x tez" plus UDFs [1]
rawLogs = load '/app-logs/root/logs/application_1409012059361_0539/*' using
    org.apache.tez.tools.TFileLoader() as (machine:chararray, key:chararray, line:chararray);
[1] - https://github.com/rajeshbalamohan/tez_log_parser/blob/master/src/main/resources/pig/udf.groovy
TFile parser (contd)
• For instance, the shuffle INFO log lines can be parsed for the time taken and the machine involved
Problematic machine
TFile parser (node/rack traffic at 350 nodes)
Problematic machine
• The fetcher on node-100 is always slow, irrespective of where it is pulling data from
Other faulty nodes
• Map output served from node-100 to node-120, to any node, is always slow
Questions?
• Thanks to all Tez contributors for their efforts!
• FYI, the Hadoop Summit 2015 (Europe) call for papers is out

