wwww.edureka.co/big-data-and-hadoop
Big Data Analytics for Non-Programmers
Agenda for the day
• Can Hadoop be learnt without knowing Java?
• How can Pig be used in place of MapReduce?
• Querying data with HiveQL
Can Hadoop be learnt without knowing Java?
YES !!
Hadoop can be learnt without knowing Java
Pig & Hive
Tools like Pig and Hive, which are built on top of Hadoop, offer high-level languages for working with data.
If you want to write a MapReduce program, you can use Pig and its language, Pig Latin, for which knowledge of Java is not required.
If you want to query data stored in HDFS and view it in a readable form, you can use Hive, which again does not require any knowledge of Java.
Why Pig?
But why Pig?
Pig simplifies complex MapReduce programs by using Pig Latin.
Additionally, if you want to write your own MapReduce code, you can do so in any language (e.g. Perl, Python, Ruby, C, etc.).
But the most attractive features of Pig are:
• Roughly 10 lines of Pig can replace about 200 lines of Java
• Built-in operations like:
  • Join
  • Group
  • Filter
  • Sort
  • and more…
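To make the "any language" point concrete, here is a minimal sketch of a word count written as a map step and a reduce step in Python. It assumes the usual contract of a streaming-style MapReduce job: the mapper emits key/value pairs, the framework sorts them by key, and the reducer sees each key's values together. The function names (`mapper`, `reducer`, `run_local`) and the sample lines are hypothetical, and the "shuffle" is simulated in a single process with `sorted`, so this illustrates the idea rather than a real cluster job.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Map step: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.strip().lower().split():
            yield word, 1


def reducer(sorted_pairs):
    """Reduce step: sum the counts per word; input must already be sorted by word."""
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce inside one process (no Hadoop involved)."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines of a file in HDFS.
    sample = ["big data with pig", "big data with hive"]
    print(run_local(sample))
```

Even this small job needs explicit map, sort, and reduce plumbing, which is the kind of boilerplate Pig's built-in operations are meant to hide.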
Why Pig?
• An ad-hoc way of creating and executing map-reduce jobs on very large data sets
• Provides common data operations (filters, joins, ordering, etc.) and nested data types (tuples, bags, and maps) missing from MapReduce
• It is open source and is actively supported by a community of developers
• Can take any data: structured, semi-structured, and unstructured
• Easy to learn, easy to read and write: similar to SQL, reads like a series of steps
• Extensible by UDF (User Defined Functions): Java, Python, JavaScript, Ruby
• Java not required
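The "series of steps" style can be sketched in plain Python. The example below is only an analogy: the `logs` records are made up, Python lists of tuples stand in for Pig relations and bags, and the Pig Latin lines in the comments are rough equivalents, not a tested Pig script.

```python
from collections import defaultdict

# Hypothetical input: (user, page, status) tuples, standing in for rows loaded from HDFS.
logs = [
    ("alice", "/home", 200),
    ("bob", "/cart", 500),
    ("alice", "/cart", 500),
    ("carol", "/home", 200),
    ("bob", "/home", 500),
]

# Step 1, filter: keep only failed requests.
# Roughly: errors = FILTER logs BY status == 500;
errors = [row for row in logs if row[2] == 500]

# Step 2, group: collect the rows for each page into a "bag" (here, a list of tuples).
# Roughly: grouped = GROUP errors BY page;
grouped = defaultdict(list)
for row in errors:
    grouped[row[1]].append(row)

# Step 3, aggregate: count the rows in each bag.
# Roughly: counts = FOREACH grouped GENERATE group, COUNT(errors);
counts = [(page, len(bag)) for page, bag in grouped.items()]

# Step 4, sort: order by count, largest first.
# Roughly: ranked = ORDER counts BY $1 DESC;
ranked = sorted(counts, key=lambda pair: pair[1], reverse=True)

print(ranked)
```

Each step names an intermediate result and feeds it to the next one, which is why a Pig script tends to read top to bottom like a recipe.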
Why Hive?
Why Hive?
• Data warehouse infrastructure
• Defines a SQL-like query language called HiveQL
• Allows programmers to plug in custom mappers and reducers
• Provides tools to enable easy ETL
Features of Hive
You can use Hive to read and write files on Hadoop and run your reports from a BI tool.
Hive applications:
• Predictive Modeling & Hypothesis Testing
• Document Indexing
• Customer-facing Business Intelligence
• Log Processing
• Data Mining
Demo
Thank You
Questions/Queries/Feedback
Recording and presentation will be made available to you within 24 hours
