Wei’s Notes on Resource Awareness
March 2011

Example workloads
  • IO-bound
      • Indexing
      • Searching
      • Grouping
      • Decoding/decompressing
      • Data importing and exporting
  • CPU-bound
      • Machine learning
      • Complex text mining
      • Natural language processing
      • Feature extraction

IO/CPU intensive?
  • How to judge if a job is IO/CPU intensive?
  • Simplify: let user specify
  • Otherwise:
      • Does it make more sense to find the pattern at the job level or task level?
      • Could a job be CPU intensive but with reduce tasks being IO intensive?

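One way to explore the job-level versus task-level question is to label each (job, task type) pair separately from sampled utilisation. The sketch below is only an illustration: the sample format, the `classify_task_types` helper, and the "larger average wins" rule are all assumptions, not something the slides specify.

```python
from collections import defaultdict
from statistics import mean


def classify_task_types(samples):
    """Label each (job, task_type) pair as 'cpu' or 'io' intensive.

    samples: iterable of (job, task_type, cpu_fraction, io_fraction).
    Each fraction is the task's share of a machine's CPU or IO capacity,
    in [0, 1] (hypothetical units chosen for this sketch).
    """
    grouped = defaultdict(list)
    for job, task_type, cpu, io in samples:
        grouped[(job, task_type)].append((cpu, io))

    labels = {}
    for key, values in grouped.items():
        avg_cpu = mean(v[0] for v in values)
        avg_io = mean(v[1] for v in values)
        # Assumed rule: whichever resource is used more on average wins.
        labels[key] = "cpu" if avg_cpu >= avg_io else "io"
    return labels


# Made-up samples: the map tasks are CPU heavy, the reduce task is IO heavy.
samples = [
    ("job1", "map", 0.80, 0.10),
    ("job1", "map", 0.70, 0.20),
    ("job1", "reduce", 0.15, 0.60),
]
labels = classify_task_types(samples)
```

With these made-up samples the same job gets different labels for its map and reduce tasks, which is the situation the last question on the slide asks about.
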
Goal
  • Make task/job placement resource aware
  • Proposal: provide a profiling mechanism that periodically quantifies demand and supply per job, per task type, and per machine, like a 3D score sheet. Any scheduler could generically adopt the score sheet and assign slots/tasks based on the weighted task/slot.
  • [Figure: 3D score sheet with axes Job_TaskType, time, machine]

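The 3D score sheet could be held in a small nested mapping keyed by (job, task type), machine, and sampling time. The following is a minimal sketch under assumptions: the `ScoreSheet` class, its method names, and the convention that a higher score means a better placement are all hypothetical.

```python
from collections import defaultdict


class ScoreSheet:
    """Scores indexed by (job, task_type), machine, and sampling time."""

    def __init__(self):
        # (job, task_type) -> machine -> list of (timestamp, score)
        self._scores = defaultdict(lambda: defaultdict(list))

    def record(self, job, task_type, machine, timestamp, score):
        """Store one periodic measurement."""
        self._scores[(job, task_type)][machine].append((timestamp, score))

    def latest(self, job, task_type, machine):
        """Return the most recent score, or None if nothing was recorded."""
        history = self._scores.get((job, task_type), {}).get(machine)
        if not history:
            return None
        return max(history, key=lambda entry: entry[0])[1]

    def best_machine(self, job, task_type):
        """Return the machine with the highest latest score, or None."""
        machines = self._scores.get((job, task_type), {})
        if not machines:
            return None
        return max(machines, key=lambda m: self.latest(job, task_type, m))


sheet = ScoreSheet()
sheet.record("job1", "map", "node-a", 0, 0.4)
sheet.record("job1", "map", "node-a", 60, 0.9)
sheet.record("job1", "map", "node-b", 60, 0.7)
```

A scheduler would only need `latest` or `best_machine` to consult the sheet, which is one way to read "any scheduler could generically adopt the score sheet".
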
Proposed scheme
  1. Quantify resource capacities at cluster start.
  2. Quantify machine/network variables periodically.
  3. Profile the resource demand of tasks/jobs whenever a job is submitted, the first mapper task finishes, or the mappers are done.
  4. Periodically assign a score per job, per task_type, and per possible machine placement (all slots on a given machine are homogeneous), based on the profiles obtained in steps 1, 2 and 3.

Variables
  * Traffic on the link over which a given node has to transfer data

Idle Cluster: 1 Task – M Slots
Policy (without network IO; picking only, not scoring; ONLY for brainstorming):

    List<Node> nodes, s.t. availability_io > demand_io && availability_cpu > demand_cpu
    if nodes.size() == 1
        DONE!
    else if nodes.size() > 1
        for each node in nodes
            // try to balance io usage and cpu usage on a machine
            io_cpu_dist = dist(availability_io - demand_io, availability_cpu - demand_cpu)
        pick node with min(io_cpu_dist)
        DONE!
    else if nodes.size() == 0
        for each node in the cluster
            shortage = dist(availability_io, demand_io) + dist(availability_cpu, demand_cpu)
        pick node with min(shortage)
        DONE

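The picking policy can be sketched in Python as follows. This is an illustration under assumptions, not the author's implementation: `dist(a, b)` is taken to be the absolute difference `|a - b|`, IO and CPU are assumed to be normalised to the same scale, and the `Node` type and `pick_node` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    availability_io: float
    availability_cpu: float


def pick_node(nodes, demand_io, demand_cpu):
    """Pick a node for a single task on an otherwise idle cluster."""
    if not nodes:
        raise ValueError("no nodes to choose from")

    def imbalance(n):
        # Assumed dist(): gap between leftover IO and leftover CPU.
        return abs((n.availability_io - demand_io)
                   - (n.availability_cpu - demand_cpu))

    def shortage(n):
        # Assumed dist(): absolute difference, summed over both resources.
        return (abs(n.availability_io - demand_io)
                + abs(n.availability_cpu - demand_cpu))

    fits = [n for n in nodes
            if n.availability_io > demand_io
            and n.availability_cpu > demand_cpu]
    if len(fits) == 1:
        return fits[0]
    if fits:
        # Several nodes fit: keep leftover IO and CPU as balanced as possible.
        return min(fits, key=imbalance)
    # No node fits: fall back to the smallest total shortage.
    return min(nodes, key=shortage)


nodes = [
    Node("a", availability_io=0.9, availability_cpu=0.3),
    Node("b", availability_io=0.6, availability_cpu=0.5),
    Node("c", availability_io=0.1, availability_cpu=0.1),
]
balanced = pick_node(nodes, demand_io=0.2, demand_cpu=0.2)
fallback = pick_node(nodes, demand_io=1.0, demand_cpu=1.0)
```

Because the fallback follows the slide literally, a surplus in one resource counts against a node just as a deficit does. Counting only deficits, for example `max(0, demand - availability)`, would be a reasonable variation.
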
Busy Cluster: 1 Slot – M Tasks
  • Closer to the usage pattern of production clusters
  • Similar algorithm to the idle case; the same algorithm can be extended to assign scores.

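For the busy case the roles flip: one slot frees up and several pending tasks compete for it. The sketch below turns the idle-cluster distances into a score, as a guess at what "extended to assign scores" could look like. The `TaskDemand` type, the `score` function, and the sign convention are hypothetical, and all quantities are assumed to be normalised to [0, 1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskDemand:
    job: str
    task_type: str
    demand_io: float
    demand_cpu: float


def score(task, availability_io, availability_cpu):
    """Higher is better. With inputs in [0, 1], fitting tasks score in
    (-1, 0] and non-fitting tasks score at or below -1."""
    left_io = availability_io - task.demand_io
    left_cpu = availability_cpu - task.demand_cpu
    if left_io > 0 and left_cpu > 0:
        # Fits: prefer tasks that leave IO and CPU evenly used.
        return -abs(left_io - left_cpu)
    # Does not fit: penalise by the total distance from availability.
    return -1.0 - (abs(left_io) + abs(left_cpu))


def pick_task(tasks, availability_io, availability_cpu):
    """Pick the pending task with the highest score for the free slot."""
    return max(tasks, key=lambda t: score(t, availability_io, availability_cpu))


tasks = [
    TaskDemand("job1", "map", demand_io=0.4, demand_cpu=0.1),
    TaskDemand("job2", "map", demand_io=0.2, demand_cpu=0.2),
    TaskDemand("job3", "reduce", demand_io=0.9, demand_cpu=0.1),
]
chosen = pick_task(tasks, availability_io=0.5, availability_cpu=0.5)
```

The per-task scores computed here are the kind of value that could be written into the score sheet for each (job, task type, machine) combination.
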
Limitations
  • Score sheet only has scores of running tasks (extending to tasks from the same job of the same task type).
  • Doesn’t benefit the very first mapper task or the very first reducer task.

Measurement & Quantification
  • Profile a task type of a job by sampling
  • How to measure IO and CPU of a given machine at a given time?
      • Availability = Capacity - (sum of resource consumption of running tasks). Capacity?
      • Or better: Availability = (sum of resource consumption of running tasks) * (1 / usage percentage - 1)
        * This availability is based on the average demand of the currently running tasks. Step 1 in the proposed scheme could potentially be skipped! Well… but that could come in handy when placing the very first task.
  • How to normalize IO and CPU against each other?
      • Use percentages? Then demands have to be normalized with the same multipliers, for IO and CPU respectively.

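The second availability formula follows from the first if the usage percentage is defined as consumption divided by capacity: capacity = consumption / usage, so capacity - consumption = consumption * (1 / usage - 1). The sketch below checks this numerically and shows one way to normalise a demand with the same multiplier. The function names and the MB/s figures are made up for illustration.

```python
import math


def availability_from_usage(consumption, usage_fraction):
    """Availability = (sum of consumption) * (1 / usage - 1).

    consumption: per-task resource consumption of the running tasks.
    usage_fraction: machine-wide usage of that resource, in (0, 1].
    """
    if not 0.0 < usage_fraction <= 1.0:
        raise ValueError("usage_fraction must be in (0, 1]")
    return sum(consumption) * (1.0 / usage_fraction - 1.0)


def demand_as_fraction(demand, consumption, usage_fraction):
    """Express a demand as a fraction of the implied capacity, so IO and
    CPU demands share the scale used for their availabilities."""
    if not 0.0 < usage_fraction <= 1.0:
        raise ValueError("usage_fraction must be in (0, 1]")
    implied_capacity = sum(consumption) / usage_fraction
    return demand / implied_capacity


# Made-up example: two running tasks consume 30 and 50 MB/s of disk IO
# while the machine reports 80% IO usage, so the implied capacity is 100 MB/s.
running = [30.0, 50.0]
available = availability_from_usage(running, 0.8)
fraction = demand_as_fraction(10.0, running, 0.8)

# Same result as Capacity - consumption with the implied capacity of 100.
assert math.isclose(available, 100.0 - sum(running))
```

The formula is undefined when usage is zero, that is, on a machine with no running tasks. This is consistent with the remark that the capacities measured in step 1 remain useful for placing the very first task.
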


Wei's notes on hadoop resource awareness


Editor's Notes

  • #3: Source: http://www.cloudera.com/blog/2010/03/clouderas-support-team-shares-some-basic-hardware-recommendations/