Apache Hadoop 2 Installation in Pseudo Mode
Download URLs
1. Hadoop: https://archive.apache.org/dist/hadoop/core/stable/
2. Hive: https://archive.apache.org/dist/hive/hive-0.11.0/
3. Pig: https://archive.apache.org/dist/pig/pig-0.12.0/
4. HBase: https://archive.apache.org/dist/hbase/hbase-0.96.0/
Step 1: Generate ssh key
$ssh-keygen -t rsa -P ""
Step 2: Copy id_rsa.pub to authorized_keys
$cd .ssh
$cp id_rsa.pub authorized_keys
$chmod 644 authorized_keys
Step 3: Passwordless ssh to localhost
$cd ~
$ssh localhost
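If the key setup worked, ssh logs you in without a password prompt; type exit to return. For a quick non-interactive check (a sketch, assuming OpenSSH):
$ssh -o BatchMode=yes localhost echo ok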
Step 4: Untar tarballs
$tar -xvzf hadoop-2.2.0.tar.gz
Step 5: Configuration files
$cd hadoop-2.2.0/etc/hadoop/
$vim core-site.xml
Add the following properties in core-site.xml (the NameNode address here must match the
hbase.rootdir configured later for HBase):
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:8020</value>
</property>
<property>
<name>io.native.lib.available</name>
<value>true</value>
</property>
$vim hdfs-site.xml
Add the following properties in hdfs-site.xml
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/hadoop-2.2.0/pseudo/dfs/data</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/hadoop-2.2.0/pseudo/dfs/name</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
$vim mapred-site.xml
Add the following properties in mapred-site.xml
<property>
<name>mapreduce.cluster.temp.dir</name>
<value>/home/hadoop/hadoop-2.2.0/temp</value>
<final>true</final>
</property>
<property>
<name>mapreduce.cluster.local.dir</name>
<value>/home/hadoop/hadoop-2.2.0/local</value>
<final>true</final>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
$vim yarn-site.xml
Add the following properties in yarn-site.xml
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>localhost:6000</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>localhost:6001</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>localhost:6002</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/home/hadoop/hadoop-2.2.0/yarn_nodemanager</value>
</property>
<property>
<name>yarn.nodemanager.address</name>
<value>0.0.0.0:6003</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>10240</value>
</property>
<property>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/home/hadoop/hadoop-2.2.0/app-logs</value>
</property>
<property>
<name>yarn.nodemanager.log-dirs</name>
<value>/home/hadoop/hadoop-2.2.0/logs</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
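The local directories referenced in the files above must be writable by the hadoop user; Hadoop creates most of them on startup, but creating them up front (a sketch, assuming the paths configured above) avoids permission surprises:
$mkdir -p /home/hadoop/hadoop-2.2.0/pseudo/dfs/name /home/hadoop/hadoop-2.2.0/pseudo/dfs/data
$mkdir -p /home/hadoop/hadoop-2.2.0/temp /home/hadoop/hadoop-2.2.0/local
$mkdir -p /home/hadoop/hadoop-2.2.0/yarn_nodemanager /home/hadoop/hadoop-2.2.0/app-logs /home/hadoop/hadoop-2.2.0/logs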
$vim slaves
Add localhost in the slaves file
Step 6: set .bashrc
$cd ~
$vim .bashrc
export JAVA_HOME=/usr
export HADOOP_HOME=/home/hadoop/hadoop-2.2.0
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PIG_HOME=/home/hadoop/pig-0.12.0
export HBASE_HOME=/home/hadoop/hbase-0.96.0-hadoop2
export HIVE_HOME=/home/hadoop/hive-0.11.0
export PIG_CLASSPATH=$HADOOP_CONF_DIR
export CLASSPATH=$PIG_HOME/pig-withouthadoop.jar:\
$HADOOP_HOME/share/hadoop/common/hadoop-common-2.2.0.jar:\
$HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-2.2.0.jar:\
$HBASE_HOME/lib/hbase-client-0.96.0-hadoop2.jar:\
$HBASE_HOME/lib/hbase-common-0.96.0-hadoop2.jar:\
$HBASE_HOME/lib/hbase-server-0.96.0-hadoop2.jar:\
$HBASE_HOME/lib/commons-httpclient-3.1.jar:\
$HBASE_HOME/lib/commons-collections-3.2.1.jar:\
$HBASE_HOME/lib/commons-lang-2.6.jar:\
$HBASE_HOME/lib/jackson-mapper-asl-1.8.8.jar:\
$HBASE_HOME/lib/jackson-core-asl-1.8.8.jar:\
$HBASE_HOME/lib/guava-12.0.1.jar:\
$HBASE_HOME/lib/protobuf-java-2.5.0.jar:\
$HBASE_HOME/lib/commons-codec-1.7.jar:\
$HBASE_HOME/lib/zookeeper-3.4.5.jar:\
$HIVE_HOME/lib/hive-jdbc-0.11.0.jar:\
$HIVE_HOME/lib/hive-metastore-0.11.0.jar:\
$HIVE_HOME/lib/hive-serde-0.11.0.jar:\
$HIVE_HOME/lib/hive-common-0.11.0.jar:\
$HIVE_HOME/lib/hive-service-0.11.0.jar:\
$HIVE_HOME/lib/libfb303-0.9.0.jar:\
$HIVE_HOME/lib/postgresql-9.2-1003.jdbc3.jar:\
$HIVE_HOME/lib/libthrift-0.9.0.jar:\
$HIVE_HOME/lib/slf4j-api-1.6.1.jar:\
$HIVE_HOME/lib/commons-logging-1.0.4.jar:\
/home/hadoop/Hadoop2Training.jar
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PIG_HOME/bin:$HBASE_HOME/bin:$HIVE_HOME/bin:$PATH
Step 7: Load .bashrc
$cd ~
$. .bashrc
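To confirm the environment took effect (a quick check):
$echo $HADOOP_HOME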
Step 8: Formatting the NameNode
$cd ~
$hdfs namenode -format
Step 9: Starting Cluster
$cd ~/hadoop-2.2.0/sbin
$./start-all.sh
Note: start-all.sh is deprecated in Hadoop 2; running ./start-dfs.sh followed by ./start-yarn.sh is the preferred equivalent.
To view the started daemons
$ jps
This should show the started daemons:
NameNode
DataNode
SecondaryNameNode
NodeManager
ResourceManager
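As a quick smoke test (a sketch, assuming the PATH set in Step 6), create and list a directory in HDFS, then check the daemon web UIs on their Hadoop 2 default ports:
$hadoop fs -mkdir -p /user/hadoop
$hadoop fs -ls /
NameNode UI: http://localhost:50070
ResourceManager UI: http://localhost:8088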
Apache HBase Installation in Pseudo Mode
Step 1: Untar the tarballs
$tar -xvzf hbase-0.96.0-hadoop2.tar.gz
Step 2: Configuration files
$cd hbase-0.96.0-hadoop2/conf
$vim hbase-site.xml
Copy the following properties into hbase-site.xml
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:8020/hbase</value>
<description>The directory shared by RegionServers</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
$vim regionservers
Add localhost in the regionservers file
Step 3: Add the Hadoop jars from the Hadoop directory to the HBase lib directory
$cd /home/hadoop/hadoop-2.2.0/share/hadoop/common/
$cp hadoop-common-2.2.0.jar /home/hadoop/hbase-0.96.0-hadoop2/lib/
Step 4: Start HBase
$cd ~
$start-hbase.sh
Step 5: To view the started daemons
$ jps
HMaster
HRegionServer
HQuorumPeer
Step 6: To view hbase shell
$hbase shell
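A minimal sanity check inside the shell (the table and column family names below are illustrative):
hbase(main):001:0> create 'test', 'cf'
hbase(main):002:0> put 'test', 'row1', 'cf:a', 'value1'
hbase(main):003:0> scan 'test'
hbase(main):004:0> exit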
Step 7: Before connecting to HBase using Java
Start the HBase REST service by executing the following command
$hbase-daemon.sh start rest -p 8090
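To confirm the REST gateway is listening (on port 8090 as started above; assumes curl is available):
$curl http://localhost:8090/version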
Apache Hive Installation
Step 1: Untar the tarballs
$tar -xvzf hive-0.11.0.tar.gz
Step 2: Configuring a remote PostgreSQL database for the Hive Metastore
Before you can run the Hive metastore with a remote PostgreSQL database, you must configure a
connector to the remote PostgreSQL database, set up the initial database schema, and configure the
PostgreSQL user account for the Hive user.
Install and start PostgreSQL if you have not already done so. Edit the postgresql.conf file and set
listen_addresses to '*' so the server accepts connections from the network. Then configure
authentication for your network by adding a line to pg_hba.conf, as shown below.
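For example, a line of the following form (the subnet is illustrative; restrict it to your own network):
host    metastore    hiveuser    192.168.1.0/24    md5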
Start PostgreSQL Server
$ su postgres
$cd $postgres_home/bin
$./pg_ctl start -D path_to_data_dir
Install the Postgres JDBC Driver
Copy the PostgreSQL JDBC driver jar into $HIVE_HOME/lib/
Create the metastore database and user account
Proceed as in the following example:
$sudo -u postgres psql
postgres=# CREATE USER hiveuser WITH PASSWORD 'mypassword';
postgres=# CREATE DATABASE metastore;
postgres=# \q
$psql -h localhost -U hiveuser -d metastore
You are now connected to database 'metastore' as user 'hiveuser'.
metastore=# \i /home/hadoop/hive-0.11.0/scripts/metastore/upgrade/postgres/hive-schema-0.10.0.postgres.sql
Step 3: Configuration files
$cd hive-0.11.0/conf
$vim hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:postgresql://<postgresql instance ip>:5432/metastore</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.postgresql.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hiveuser</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>mypassword</value>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://<namenode ip>:9083</value>
<description>IP address (or fully-qualified domain name) and port of the metastore
host</description>
</property>
<property>
<name>datanucleus.autoStartMechanism</name>
<value>SchemaTable</value>
</property>
</configuration>
Step 4: Start the Hive metastore
$hive --service metastore
Step 5: To view hive console
$hive
hive>show tables;
OK
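A minimal end-to-end check against the metastore (the table name is illustrative):
hive> CREATE TABLE test_tbl (id INT, name STRING);
hive> SHOW TABLES;
hive> DROP TABLE test_tbl;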
Step 6: Before connecting to Hive using Java
Start HiveServer by executing the following command
$hive --service hiveserver
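Clients can then connect through the Hive JDBC driver; with HiveServer1, as shipped in Hive 0.11, the default port is 10000, so the connection URL takes the form jdbc:hive://<hiveserver ip>:10000/default.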
Apache Pig Installation
Step 1: Untar the tarballs
$tar -xvzf pig-0.12.0.tar.gz
Step 2: Delete the two bundled jars (pig.jar and pig-withouthadoop.jar) from the Pig home
directory and add a Hadoop 2-compatible pig-withouthadoop.jar to the Pig installation directory
(uploaded to Knowmax at the same path)
Step 3: To open the Pig Grunt shell
$pig
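A quick sanity check in the Grunt shell (assumes HDFS is running; quit exits the shell):
grunt> fs -ls /
grunt> quit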