Apache Hive
Presentation by,
Gyathri,
Dibakaran,
Dhanush,
Deveraj
Hive
• A data warehousing package built on top of Hadoop.
• Used for data analysis on structured data.
• Targeted at users who are comfortable with SQL.
• Its query language, HiveQL, is similar to SQL.
• Abstracts away the complexity of Hadoop.
• No Java programming is required to write queries.
• Originally developed at Facebook.
Features of Hive
How Is It Different from SQL?
• The major difference is that a Hive query executes on Hadoop infrastructure rather than on a traditional database.
• This allows Hive to handle huge data sets, including ones too large for high-end, expensive, traditional databases to process.
• Internally, a Hive query is executed as a series of automatically generated MapReduce jobs.
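As a rough intuition for the last point, the shell sketch below mimics the map, shuffle, and reduce phases that a query such as SELECT word, COUNT(*) FROM words GROUP BY word might be compiled into. It uses local Unix tools rather than Hadoop, and the table name words, the file names, and the sample data are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical input: one word per line, standing in for a Hive table `words`.
printf 'hive\nhadoop\nhive\npig\nhive\nhadoop\n' > words.txt

# Map: emit a (key, 1) pair for every input row.
map_phase() { awk '{ print $1 "\t" 1 }' "$1"; }

# Shuffle: bring identical keys together (Hadoop does this between map and reduce).
shuffle_phase() { sort -k1,1; }

# Reduce: sum the values for each key, then sort the result by key.
reduce_phase() {
  awk -F '\t' '{ count[$1] += $2 } END { for (k in count) print k "\t" count[k] }' | sort
}

map_phase words.txt | shuffle_phase | reduce_phase > word_counts.txt
cat word_counts.txt
```

On a real cluster each phase runs in parallel across many machines; the pipeline here only shows the shape of the computation.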
Hive Modes
To start the Hive shell, type hive and press Enter.
• Hive in local mode
No HDFS is required; all files are read from and written to the local file system.
hive> SET mapred.job.tracker=local;
• Hive in MapReduce (Hadoop) mode
Jobs are submitted to the cluster's JobTracker, here assumed to be at master:9001.
hive> SET mapred.job.tracker=master:9001;
Hive Architecture
Components
• Thrift Client
Hive can be used from any programming language that supports Thrift, for example:
Python
Ruby
• JDBC Driver
Hive provides a pure-Java JDBC driver that lets Java applications connect to Hive, defined in the class
org.apache.hadoop.hive.jdbc.HiveDriver
• ODBC Driver
An ODBC driver allows applications that support the ODBC protocol to connect to Hive.
Hive Program Structure
• The Hive Shell
  - The shell is the primary way that we will interact with Hive, by issuing commands in HiveQL.
  - HiveQL is heavily influenced by MySQL, so if you are familiar with MySQL, you should feel at home using Hive.
  - Each command must be terminated with a semicolon to tell Hive to execute it.
  - HiveQL is generally case-insensitive.
  - The Tab key autocompletes Hive keywords and functions.
• Hive can also run in non-interactive mode.
  - Use the -f option to run the commands in a specified file:
    hive -f script.hql
  - For short scripts, use the -e option to specify the commands inline, in which case the final semicolon is not required:
    hive -e 'SELECT * FROM dummy'
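The two non-interactive options can be combined in a small wrapper script. The sketch below is one way to do it, assuming the hive command may or may not be installed. The file name script.hql and the table dummy are taken from the examples above; the statements inside the file are illustrative only.

```shell
#!/bin/sh
# Write a few HiveQL statements to a script file.
cat > script.hql <<'EOF'
-- Each statement in a script file ends with a semicolon.
SHOW TABLES;
SELECT * FROM dummy;
EOF

# Run the script only when the hive CLI is on the PATH; otherwise just report what would run.
if command -v hive >/dev/null 2>&1; then
  hive -f script.hql               # run all statements from the file
  hive -e 'SELECT * FROM dummy'    # run a single inline statement
else
  echo "hive not found; would run: hive -f script.hql"
fi
```

Keeping the statements in a file makes them easy to version and rerun, while -e suits one-off checks.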
Hive Tables
A Hive table is logically made up of the data being stored, typically in HDFS, and the associated metadata describing the layout of the data in the table. The metadata is kept in a relational database such as MySQL rather than in HDFS.
• Managed Table
  - When you create a managed table in Hive and load data into it, the data is moved into Hive’s warehouse directory.
    CREATE TABLE managed_table (dummy STRING);
    LOAD DATA INPATH '/user/tom/data.txt' INTO TABLE managed_table;
• External Table
  - Alternatively, you may create an external table, which tells Hive to refer to data at an existing location outside the warehouse directory.
  - The location of the external data is specified at table creation time:
    CREATE EXTERNAL TABLE external_table (dummy STRING)
    LOCATION '/user/tom/external_table';
    LOAD DATA INPATH '/user/tom/data.txt' INTO TABLE external_table;
• When you drop an external table, Hive leaves the data untouched and deletes only the metadata. Dropping a managed table deletes both the metadata and the data.
• Hive does not transform data while loading it into tables. Load operations are currently pure copy/move operations that move data files into the locations corresponding to Hive tables.
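The difference between the two table types comes down to who owns the files. The sketch below imitates that behaviour with ordinary directories on the local file system instead of HDFS. The directory names and the helper functions load_data, drop_managed, and drop_external are hypothetical stand-ins for what LOAD DATA INPATH and DROP TABLE do; they are not Hive commands.

```shell
#!/bin/sh
# Hypothetical local stand-ins for HDFS locations.
root=hive_tables_demo
rm -rf "$root"
mkdir -p "$root/warehouse/managed_table" "$root/external_table" "$root/incoming"

printf 'row1\nrow2\n' > "$root/incoming/a.txt"
printf 'row3\n'       > "$root/incoming/b.txt"

# LOAD DATA INPATH: the file is moved as-is into the table's directory,
# with no parsing or conversion of its contents.
load_data() { mv "$1" "$2/"; }

load_data "$root/incoming/a.txt" "$root/warehouse/managed_table"
load_data "$root/incoming/b.txt" "$root/external_table"

# DROP TABLE on a managed table removes the metadata and the data.
drop_managed() { rm -rf "$1"; }
# DROP TABLE on an external table removes only the metadata, so the files stay in place.
drop_external() { :; }

drop_managed  "$root/warehouse/managed_table"
drop_external "$root/external_table"

ls "$root/external_table"   # b.txt is still there
```

Because loading is a move rather than a copy, the source files no longer exist under incoming once they have been loaded.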
Thank You
• Question?
• Feedback?
