Hadoop on Dockers
What is Docker?
• Docker is a tool designed to make it easier to create, deploy, and run
applications by using containers.
• At a high level, Docker is a Linux utility that can efficiently create, ship, and
run containers.
• Docker containers wrap a piece of software in a complete file system that
contains everything needed to run it: code, runtime, system tools, and system
libraries.
• Docker enables you to deploy applications quickly, reliably, and
consistently, regardless of environment.
What are containers?
• Linux containers are self-contained execution environments with
their own isolated CPU, memory, block I/O, and network resources.
• A container feels like a virtual machine, but sheds the weight and
startup overhead of a guest operating system.
• Containers allow a developer to package up an application with all of
the parts it needs, such as libraries and other dependencies, and ship
it all out as one package.
Containers vs Virtual Machines
• Containers and virtual machines have similar resource isolation and
allocation benefits -- but a different architectural approach allows
containers to be more portable and efficient.
VIRTUAL MACHINES
Virtual machines include the application, the
necessary binaries and libraries, and an entire guest
operating system -- all of which can amount to tens
of GBs.
CONTAINERS
Containers include the application and all of its
dependencies -- but share the kernel with other containers,
running as isolated processes in user space on the host
operating system. Docker containers are not tied to any
specific infrastructure: they run on any computer, on any
infrastructure, and in any cloud.
[Figures: Containers; Virtual Machines; Docker]
Advantages
• Containers running on a single machine share the same operating system
kernel; they start in seconds and use less RAM than virtual machines. Images are
constructed from layered file systems and share common files, making disk usage
and image downloads much more efficient.
• Docker containers are based on open standards, enabling containers to
run on all major Linux distributions and on Microsoft Windows -- and on
top of any infrastructure.
• Containers isolate applications from one another and the underlying
infrastructure, while providing an added layer of protection for the
application.
• Eliminates environment inconsistencies.
• Makes it easy to distribute and share content.
• Lets you share your application with others without worrying about their
environment.
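
The point about layered images sharing common files can be made concrete with a toy calculation. This is only a sketch: the image names, layer names, and sizes below are invented for illustration and do not come from any real image.

```python
# Toy model of layered images. Layer names and sizes (in MB) are made up.
LAYER_SIZES_MB = {
    "base-os": 120,
    "java-runtime": 200,
    "hadoop": 350,
    "spark": 250,
}

# Each hypothetical image is an ordered stack of layers.
IMAGES = {
    "hadoop-node": ["base-os", "java-runtime", "hadoop"],
    "spark-node": ["base-os", "java-runtime", "spark"],
}


def size_without_sharing(images, layer_sizes):
    """Disk use if every image kept a private copy of each of its layers."""
    return sum(layer_sizes[layer] for layers in images.values() for layer in layers)


def size_with_sharing(images, layer_sizes):
    """Disk use when each distinct layer is stored once and reused by all images."""
    unique_layers = {layer for layers in images.values() for layer in layers}
    return sum(layer_sizes[layer] for layer in unique_layers)


if __name__ == "__main__":
    print("without sharing:", size_without_sharing(IMAGES, LAYER_SIZES_MB), "MB")
    print("with sharing:   ", size_with_sharing(IMAGES, LAYER_SIZES_MB), "MB")
```

In this model the two images need 1240 MB as private copies but 920 MB when the common base and runtime layers are stored once. The same reasoning applies to downloads, since a layer already present locally does not have to be pulled again.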
Advantages (continued)
• Scales quickly.
• Docker makes it easy to identify issues, isolate the problem
container, roll back quickly to make the necessary changes, and then
push the updated container into production.
• Docker lets you build once and run anywhere.
Dockerfile and Image
Example:
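
As a rough illustration of how a Dockerfile relates to an image, the sketch below reads a small Dockerfile and lists which instructions add a file-system layer. The Dockerfile text is a hypothetical sample written for this sketch (the base image, paths, and start script are assumptions), and the parser is deliberately minimal: it ignores line continuations. It assumes the behaviour described in current Docker documentation, where RUN, COPY, and ADD create layers and other instructions only record metadata.

```python
# Hypothetical Dockerfile text, used only as sample input for this sketch.
SAMPLE_DOCKERFILE = """\
# Base image to build on
FROM ubuntu:22.04
ENV APP_HOME=/opt/app
RUN apt-get update && apt-get install -y curl
COPY app/ /opt/app/
CMD ["/opt/app/start.sh"]
"""

# Assumption: these instructions add a file-system layer to the image;
# the remaining ones (FROM, ENV, CMD, ...) set the base or record metadata.
LAYER_INSTRUCTIONS = {"RUN", "COPY", "ADD"}


def parse_instructions(dockerfile_text):
    """Return (INSTRUCTION, arguments) pairs, skipping blank lines and comments."""
    pairs = []
    for raw_line in dockerfile_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        keyword, _, arguments = line.partition(" ")
        pairs.append((keyword.upper(), arguments.strip()))
    return pairs


def layer_creating(pairs):
    """Keep only the instruction keywords that add a file-system layer."""
    return [keyword for keyword, _ in pairs if keyword in LAYER_INSTRUCTIONS]


if __name__ == "__main__":
    instructions = parse_instructions(SAMPLE_DOCKERFILE)
    for keyword, arguments in instructions:
        print(f"{keyword:5} {arguments}")
    print("layer-creating instructions:", layer_creating(instructions))
```

Building such a file with `docker build` produces an image, and starting that image with `docker run` produces a container.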
HADOOP ON DOCKERS
• Docker is the New Quick Start Option for Apache Hadoop and
Cloudera
http://blog.cloudera.com/blog/2015/12/docker-is-the-new-quickstart-option-for-apache-hadoop-and-cloudera/
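
A minimal sketch of what starting such a QuickStart container might look like, written as a Python helper that only assembles the command line and does not run it. The image name `cloudera/quickstart`, the hostname, the `/usr/bin/docker-quickstart` start script, and the published port are assumptions based on how the QuickStart image was commonly documented; check the linked post for the exact invocation before relying on it.

```python
import shlex


def quickstart_command(
    image="cloudera/quickstart:latest",   # assumed image name and tag
    hostname="quickstart.cloudera",       # assumed hostname the services expect
    publish_ports=(8888,),                # assumed port to expose (e.g. a web UI)
):
    """Build the argv list for an interactive `docker run` of the QuickStart image."""
    argv = ["docker", "run", f"--hostname={hostname}", "--privileged=true", "-t", "-i"]
    for port in publish_ports:
        # Map the same port number on the host and in the container.
        argv += ["-p", f"{port}:{port}"]
    # Assumed start script inside the image that launches the Hadoop services.
    argv += [image, "/usr/bin/docker-quickstart"]
    return argv


if __name__ == "__main__":
    # Print a copy-pasteable shell command instead of executing it.
    print(shlex.join(quickstart_command()))
```

Keeping the command as a list rather than a single string makes it safe to pass to `subprocess.run` later without shell quoting problems.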
HADOOP ON DOCKERS
Some Challenges
• Which container manager to choose? Swarm, Kubernetes, AWS ECS, or
Mesos?
• How to handle storage configuration? Overlay file systems, Flocker, Convoy?
• Which network configuration?
• Software compatibility: which OS (Linux/Ubuntu), build of Hadoop, and
application layer, and how to make sure all of these work together?
• Maintenance: availability, multi-container setups, upgrades, patches, backups?
Hadoop on Dockers
References:
https://www.youtube.com/watch?v=pGYAg7TMmp0
https://www.youtube.com/watch?v=biJTvobZm1A
https://www.youtube.com/watch?v=YFl2mCHdv24



Editor's Notes

  • #9: Lightweight; open source; security; eliminates environment inconsistencies