SlideShare a Scribd company logo
Installing Hadoop on
Ubuntu 16
INSTALL OPEN JDK
1
Install Java
 Do I have Java? Type on terminal: java -version
 If I see the output below, then I don’t have java installed, follow instructions next
slide
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
2
Install Java
 Type:
 sudo apt-get install openjdk-8-jdk
 Type Y to continue the installation process (it will take a while to complete the
installation)
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
3
Do I have java?
 To confirm java ins installed on my Ubuntu system type:
 java –version
 You will see output below
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
4
Install Openssh
 Is mandatory to install openssh server:
sudo apt-get install openssh-server
 If ssh server is installed then
generate keys, run command below:
ssh-keygen -t rsa
 Enter file, press enter
 Enter passphrase, press enter
 Enter same passphrase again press
 enter
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
5
SSH Keys
 Now we will copy the key to the user and host, in my case my user is hadoop and
host is hadoopdev
 ssh-copy-id hadoop@hadoopdev
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
6
Download and Install
Hadoop
DOWNLOAD HADOOP FROM APACHE WEB PAGE
7
Download Apache Hadoop
 Type in the terminal the following command to create new folder within my home
linux folder, in this case/home/Hadoop/:
 mkdir hadoop_install
 Then go into this new folder:
 cd hadoop_install
 And copy the command below:
 wget http://www-eu.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-
2.7.3.tar.gz
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
8
Download Apache Hadoop
 You will see windows reflecting the progress of the download
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
9
Unzip Hadoop folder
 Once download is complete
 Type the following command:
 tar -xvf hadoop-2.7.3.tar.gz
 Now you will see 2 folders, the new directory is called hadoop-2.7.3:
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
10
Setup bashrc
 This is the java location (very important for next steps):
 Edit bashrc
 Type:
 Sudo gedit ~/.bashrc
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
11
Setup ~/.bashrc
 Add this lines to the .bashrc
 Pls note on previous slide the java path is displayed, need to point bashrc to the
actual java path
 #HADOOP VARIABLES START
 export JAVA_HOME=/usr/lib/jvm/ java-1.8.0-openjdk-amd64
 export HADOOP_INSTALL=/home/hadoop/hadoop_install
 export PATH=$PATH:$HADOOP_INSTALL/bin
 export PATH=$PATH:$HADOOP_INSTALL/sbin
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
12
Testing hadoop installation
 Type the following command to refresh ~/.bashrc changes (no need to restart)
 source ~/.basrch
 Type the command below (if at this point you see an output like this you’re
doing well)
hadoop version
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
13
Setup single node
INSTALL OPEN JDK
14
Point your java to hadoop conf file
 Go to the path:
 /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop
 Edit the file:
 sudo gedit Hadoop-env.sh
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
15
Modifying hadoop-env.sh
 Modify the value for Java Home in the file: hadoop-env.sh
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
16
Modify core-site.xml
 Create a folder called tmp in /home/hadoop/hadoop_install
 Add the following text to the core-site.xml , file is on the path:
/home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/hadoop_install/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system.</description>
</property>
</configuration>
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
17
Modify mapred-site.xml
 By default there is a file called: mapred-site.xml.template, needs to be renamed to
mapred-site.xml and then add the code below:
 File is on path: /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs at. </description>
</property>
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
18
Modify hdfs-site.xml
 We need to créate 2 new folders which will contain name node and data node:
 I placed these 2 folders on: /home/hadoop/hadoop_install/
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
19
Modify hdfs-site.xml
Add the code below in the file hdfs-site.xml, the paths for namnode and datanode are the 2 new folders
you just created on previous slide.
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hadoop/hadoop_install/namenode</value>
</property>
<property>
<name>dfs.data.node.name.dir</name>
<value>file:///home/hadoop/hadoop_install/datanode</value>
</property>
</configuration>
#hdfs-site.xml is located on the path: /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
20
Format the namenode
 Run the following command:
 hadoop namenode –format
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
21
Format the namenode part 2
 If everything is ok you will see message below:
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
22
Running Hadoop Single node
 Run the command:
 startall.sh
 Then execute the command:
 jps, you will see the following output
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
23
Stop Cluster
 We run stop-all.sh
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
24
Web Interface: localhost:50070
 In the browser go to: localhost:50070
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
25
Applies for:
 This installation runs under:
 Ubuntu 16
 Hadoop 2.7.3
 Virtual Machine:
 2 Processors
 2 Gb Ram
 2 Network Interface, 1 as Bridge, 2nd as Nat
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
26
You need help?
 Contact name:
 Enrique Davila Gutierrez
 Enrique.davila@Gmail.com
10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com
27

More Related Content

PPT
Power point on linux commands,appache,php,mysql,html,css,web 2.0
PPT
Linux presentation
DOCX
Document Management: Opendocman and LAMP installation on Cent OS
PDF
Pluggable database 3
DOCX
Network Manual
PDF
Lamp Server With Drupal Installation
PPT
PPT
Linux apache installation
Power point on linux commands,appache,php,mysql,html,css,web 2.0
Linux presentation
Document Management: Opendocman and LAMP installation on Cent OS
Pluggable database 3
Network Manual
Lamp Server With Drupal Installation
Linux apache installation

What's hot (16)

PDF
Pluggable database tutorial 2
PDF
DSpace Manual for BALID Trainee
PDF
PDF
Hadoop completereference
PDF
Instalar PENTAHO 5 en CentOS 6
DOCX
Asm disk group migration from
PDF
Step by Step Restore rman to different host
PDF
Apache
PPT
Linux Webserver Installation Command and GUI.ppt
PPT
PPT
PDF
Pluggable database tutorial
TXT
Easy install
DOCX
Content server installation guide
PDF
linux-commandline-magic-Joomla-World-Conference-2014
Pluggable database tutorial 2
DSpace Manual for BALID Trainee
Hadoop completereference
Instalar PENTAHO 5 en CentOS 6
Asm disk group migration from
Step by Step Restore rman to different host
Apache
Linux Webserver Installation Command and GUI.ppt
Pluggable database tutorial
Easy install
Content server installation guide
linux-commandline-magic-Joomla-World-Conference-2014
Ad

Viewers also liked (20)

PDF
Hadoopを40分で理解する #cwt2013
PPT
はやわかりHadoop
PDF
Hadoopの概念と基本的知識
PDF
40分でわかるHadoop徹底入門 (Cloudera World Tokyo 2014 講演資料)
PPTX
Trabajo mariale
PPTX
Falta de ejecución, recaudo insuficiente y aplazamientos indebidos: mis preoc...
PDF
The Most Common Soccer Injuries
PPTX
Slideshare22
PDF
+Software development methodologies
PDF
Herramientas
PPTX
3 deciduous biome
DOCX
Aprendizje vivencial
PDF
commentario delle stanze di dzyan
PDF
Legal-landscape-struggles-to-keep-pace-with-the-rise-of-Telemedicine
PDF
Gestión del tiempo: cómo sobrevivir con 24 al día
PPT
Alondra basquet 1
PPT
Prueba 1
PDF
Matriz de invitacion 1
PPTX
Túbulo contorneado proximal
PDF
Manual html-actualizaasddo
Hadoopを40分で理解する #cwt2013
はやわかりHadoop
Hadoopの概念と基本的知識
40分でわかるHadoop徹底入門 (Cloudera World Tokyo 2014 講演資料)
Trabajo mariale
Falta de ejecución, recaudo insuficiente y aplazamientos indebidos: mis preoc...
The Most Common Soccer Injuries
Slideshare22
+Software development methodologies
Herramientas
3 deciduous biome
Aprendizje vivencial
commentario delle stanze di dzyan
Legal-landscape-struggles-to-keep-pace-with-the-rise-of-Telemedicine
Gestión del tiempo: cómo sobrevivir con 24 al día
Alondra basquet 1
Prueba 1
Matriz de invitacion 1
Túbulo contorneado proximal
Manual html-actualizaasddo
Ad

Similar to 簡単にApache Hadoopのインストール (20)

PPTX
Session 03 - Hadoop Installation and Basic Commands
PPTX
Hadoop installation on windows
PPTX
Hadoop 2.4 installing on ubuntu 14.04
PDF
Big data using Hadoop, Hive, Sqoop with Installation
DOCX
Hadoop installation
DOCX
Run wordcount job (hadoop)
PPT
Hadoop Installation
PDF
Setting up a HADOOP 2.2 cluster on CentOS 6
PDF
Hadoop installation and Running KMeans Clustering with MapReduce Program on H...
PDF
Hadoop installation, Configuration, and Mapreduce program
PDF
Hadoop installation steps
PPTX
Big Data Course - BigData HUB
PDF
Hadoop single node installation on ubuntu 14
PPTX
Exp-3.pptx
PDF
Single node hadoop cluster installation
PDF
02 Hadoop deployment and configuration
PDF
DA Lab Manual Data Analysis Data AnalysisData AnalysisData AnalysisData Analysis
PPTX
Hadoop single node setup
DOCX
Setup and run hadoop distrubution file system example 2.2
Session 03 - Hadoop Installation and Basic Commands
Hadoop installation on windows
Hadoop 2.4 installing on ubuntu 14.04
Big data using Hadoop, Hive, Sqoop with Installation
Hadoop installation
Run wordcount job (hadoop)
Hadoop Installation
Setting up a HADOOP 2.2 cluster on CentOS 6
Hadoop installation and Running KMeans Clustering with MapReduce Program on H...
Hadoop installation, Configuration, and Mapreduce program
Hadoop installation steps
Big Data Course - BigData HUB
Hadoop single node installation on ubuntu 14
Exp-3.pptx
Single node hadoop cluster installation
02 Hadoop deployment and configuration
DA Lab Manual Data Analysis Data AnalysisData AnalysisData AnalysisData Analysis
Hadoop single node setup
Setup and run hadoop distrubution file system example 2.2

More from Enrique Davila (6)

PPTX
Installing apache sqoop
PPTX
Load data into hive and csv
PPTX
Installing hive on ubuntu 16
PPTX
安装Apache Hadoop的轻松
PPTX
Installing hadoop on ubuntu 16
PPTX
Installing hadoop on ubuntu 16
Installing apache sqoop
Load data into hive and csv
Installing hive on ubuntu 16
安装Apache Hadoop的轻松
Installing hadoop on ubuntu 16
Installing hadoop on ubuntu 16

Recently uploaded (20)

PPTX
DISORDERS OF THE LIVER, GALLBLADDER AND PANCREASE (1).pptx
PPTX
Introduction to Firewall Analytics - Interfirewall and Transfirewall.pptx
PPTX
Introduction to machine learning and Linear Models
PPT
Miokarditis (Inflamasi pada Otot Jantung)
PPTX
AI Strategy room jwfjksfksfjsjsjsjsjfsjfsj
PDF
Foundation of Data Science unit number two notes
PPTX
01_intro xxxxxxxxxxfffffffffffaaaaaaaaaaafg
PPTX
Database Infoormation System (DBIS).pptx
PPTX
ALIMENTARY AND BILIARY CONDITIONS 3-1.pptx
PPTX
The THESIS FINAL-DEFENSE-PRESENTATION.pptx
PDF
Recruitment and Placement PPT.pdfbjfibjdfbjfobj
PDF
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...
PDF
Galatica Smart Energy Infrastructure Startup Pitch Deck
PPTX
Microsoft-Fabric-Unifying-Analytics-for-the-Modern-Enterprise Solution.pptx
PDF
Lecture1 pattern recognition............
PDF
annual-report-2024-2025 original latest.
PPTX
Data_Analytics_and_PowerBI_Presentation.pptx
PPTX
oil_refinery_comprehensive_20250804084928 (1).pptx
PPTX
iec ppt-1 pptx icmr ppt on rehabilitation.pptx
PPTX
Supervised vs unsupervised machine learning algorithms
DISORDERS OF THE LIVER, GALLBLADDER AND PANCREASE (1).pptx
Introduction to Firewall Analytics - Interfirewall and Transfirewall.pptx
Introduction to machine learning and Linear Models
Miokarditis (Inflamasi pada Otot Jantung)
AI Strategy room jwfjksfksfjsjsjsjsjfsjfsj
Foundation of Data Science unit number two notes
01_intro xxxxxxxxxxfffffffffffaaaaaaaaaaafg
Database Infoormation System (DBIS).pptx
ALIMENTARY AND BILIARY CONDITIONS 3-1.pptx
The THESIS FINAL-DEFENSE-PRESENTATION.pptx
Recruitment and Placement PPT.pdfbjfibjdfbjfobj
22.Patil - Early prediction of Alzheimer’s disease using convolutional neural...
Galatica Smart Energy Infrastructure Startup Pitch Deck
Microsoft-Fabric-Unifying-Analytics-for-the-Modern-Enterprise Solution.pptx
Lecture1 pattern recognition............
annual-report-2024-2025 original latest.
Data_Analytics_and_PowerBI_Presentation.pptx
oil_refinery_comprehensive_20250804084928 (1).pptx
iec ppt-1 pptx icmr ppt on rehabilitation.pptx
Supervised vs unsupervised machine learning algorithms

簡単にApache Hadoopのインストール

  • 1. Installing Hadoop on Ubuntu 16 INSTALL OPEN JDK 1
  • 2. Install Java  Do I have Java? Type on terminal: java -version  If I see the output below, then I don’t have java installed, follow instructions next slide 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 2
  • 3. Install Java  Type:  sudo apt-get install openjdk-8-jdk  Type Y to continue the installation process (it will take a while to complete the installation) 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 3
  • 4. Do I have java?  To confirm java ins installed on my Ubuntu system type:  java –version  You will see output below 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 4
  • 5. Install Openssh  Is mandatory to install openssh server: sudo apt-get install openssh-server  If ssh server is installed then generate keys, run command below: ssh-keygen -t rsa  Enter file, press enter  Enter passphrase, press enter  Enter same passphrase again press  enter 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 5
  • 6. SSH Keys  Now we will copy the key to the user and host, in my case my user is hadoop and host is hadoopdev  ssh-copy-id hadoop@hadoopdev 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 6
  • 7. Download and Install Hadoop DOWNLOAD HADOOP FROM APACHE WEB PAGE 7
  • 8. Download Apache Hadoop  Type in the terminal the following command to create new folder within my home linux folder, in this case/home/Hadoop/:  mkdir hadoop_install  Then go into this new folder:  cd hadoop_install  And copy the command below:  wget http://www-eu.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop- 2.7.3.tar.gz 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 8
  • 9. Download Apache Hadoop  You will see windows reflecting the progress of the download 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 9
  • 10. Unzip Hadoop folder  Once download is complete  Type the following command:  tar -xvf hadoop-2.7.3.tar.gz  Now you will see 2 folders, the new directory is called hadoop-2.7.3: 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 10
  • 11. Setup bashrc  This is the java location (very important for next steps):  Edit bashrc  Type:  Sudo gedit ~/.bashrc 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 11
  • 12. Setup ~/.bashrc  Add this lines to the .bashrc  Pls note on previous slide the java path is displayed, need to point bashrc to the actual java path  #HADOOP VARIABLES START  export JAVA_HOME=/usr/lib/jvm/ java-1.8.0-openjdk-amd64  export HADOOP_INSTALL=/home/hadoop/hadoop_install  export PATH=$PATH:$HADOOP_INSTALL/bin  export PATH=$PATH:$HADOOP_INSTALL/sbin 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 12
  • 13. Testing hadoop installation  Type the following command to refresh ~/.bashrc changes (no need to restart)  source ~/.basrch  Type the command below (if at this point you see an output like this you’re doing well) hadoop version 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 13
  • 15. Point your java to hadoop conf file  Go to the path:  /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop  Edit the file:  sudo gedit Hadoop-env.sh 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 15
  • 16. Modifying hadoop-env.sh  Modify the value for Java Home in the file: hadoop-env.sh 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 16
  • 17. Modify core-site.xml  Create a folder called tmp in /home/hadoop/hadoop_install  Add the following text to the core-site.xml , file is on the path: /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop <configuration> <property> <name>hadoop.tmp.dir</name> <value>/home/hadoop/hadoop_install/tmp</value> <description>A base for other temporary directories.</description> </property> <property> <name>fs.default.name</name> <value>hdfs://localhost:54310</value> <description>The name of the default file system.</description> </property> </configuration> 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 17
  • 18. Modify mapred-site.xml  By default there is a file called: mapred-site.xml.template, needs to be renamed to mapred-site.xml and then add the code below:  File is on path: /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop <configuration> <property> <name>mapred.job.tracker</name> <value>localhost:54311</value> <description>The host and port that the MapReduce job tracker runs at. </description> </property> 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 18
  • 19. Modify hdfs-site.xml  We need to créate 2 new folders which will contain name node and data node:  I placed these 2 folders on: /home/hadoop/hadoop_install/ 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 19
  • 20. Modify hdfs-site.xml Add the code below in the file hdfs-site.xml, the paths for namnode and datanode are the 2 new folders you just created on previous slide. <configuration> <property> <name>dfs.replication</name> <value>1</value> </property> <property> <name>dfs.namenode.name.dir</name> <value>file:///home/hadoop/hadoop_install/namenode</value> </property> <property> <name>dfs.data.node.name.dir</name> <value>file:///home/hadoop/hadoop_install/datanode</value> </property> </configuration> #hdfs-site.xml is located on the path: /home/hadoop/hadoop_install/hadoop-2.7.3/etc/hadoop 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 20
  • 21. Format the namenode  Run the following command:  hadoop namenode –format 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 21
  • 22. Format the namenode part 2  If everything is ok you will see message below: 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 22
  • 23. Running Hadoop Single node  Run the command:  startall.sh  Then execute the command:  jps, you will see the following output 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 23
  • 24. Stop Cluster  We run stop-all.sh 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 24
  • 25. Web Interface: localhost:50070  In the browser go to: localhost:50070 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 25
  • 26. Applies for:  This installation runs under:  Ubuntu 16  Hadoop 2.7.3  Virtual Machine:  2 Processors  2 Gb Ram  2 Network Interface, 1 as Bridge, 2nd as Nat 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 26
  • 27. You need help?  Contact name:  Enrique Davila Gutierrez  Enrique.davila@Gmail.com 10/24/2016Enrique Davila Big Data Instructor enrique.davila@gmail.com 27