Hadoop Developer
Training
Session 04 - PIG
Page 2Classification: Restricted
Agenda
PIG
• Loads in Pig Continued
• Verification
• Filters
• Macros in Pig
Load in Pig
Inner Join is used quite frequently; it is also referred to as equijoin. An inner
join returns rows when there is a match in both tables.
It creates a new relation by combining column values of two relations (say A
and B) based upon the join-predicate. The query compares each row of A
with each row of B to find all pairs of rows which satisfy the join-predicate.
When the join-predicate is satisfied, the column values for each matched pair
of rows of A and B are combined into a result row.
Syntax
Here is the syntax of performing inner join operation using the JOIN operator.
grunt> result = JOIN relation1 BY columnname, relation2 BY columnname;
Example
Let us perform inner join operation on the two relations customers and
orders as shown below.
grunt> customer_orders = JOIN customers BY id, orders BY customer_id;
Verification
Verify the relation customer_orders using the DUMP operator as shown
below.
grunt> Dump customer_orders;
Output
You will get the following output, which displays the contents of the relation
named customer_orders.
(2,Khilan,25,Delhi,1500,101,2009-11-20 00:00:00,2,1560)
(3,kaushik,23,Kota,2000,100,2009-10-08 00:00:00,3,1500)
(3,kaushik,23,Kota,2000,102,2009-10-08 00:00:00,3,3000)
(4,Chaitali,25,Mumbai,6500,103,2008-05-20 00:00:00,4,2060)
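To make the join-predicate concrete, here is a minimal Python sketch of what an inner equijoin computes. It only models the semantics and is not how Pig executes JOIN. The function name `inner_join`, the variable `joined`, and the shortened tuples (id and name for customers; order id, customer id, and amount for orders) are assumptions made for illustration.

```python
from collections import defaultdict

def inner_join(left, left_key, right, right_key):
    """Return left + right tuples for every pair whose key fields are equal."""
    # Index the right relation by its join key (a simple hash join).
    index = defaultdict(list)
    for row in right:
        index[row[right_key]].append(row)
    # Emit one combined row per matching pair; unmatched rows are dropped.
    return [l + r for l in left for r in index.get(l[left_key], [])]

# Shortened, illustrative rows: (id, name) and (order_id, customer_id, amount).
customers = [(2, "Khilan"), (3, "kaushik"), (4, "Chaitali"), (7, "Carolyn")]
orders = [(101, 2, 1560), (100, 3, 1500), (102, 3, 3000), (103, 4, 2060)]

joined = inner_join(customers, 0, orders, 1)
for row in joined:
    print(row)
```

Customer 3 appears twice because two orders match it, and customer 7 does not appear at all because no order refers to it.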
Outer Join: Unlike inner join, outer join returns all the rows from at least one
of the relations. An outer join operation is carried out in three ways:
• Left outer join
• Right outer join
• Full outer join
Left Outer Join
The left outer Join operation returns all rows from the left table, even if
there are no matches in the right relation.
Syntax
Given below is the syntax of performing left outer join operation using the
JOIN operator.
grunt> Relation3_name = JOIN Relation1_name BY id LEFT OUTER,
Relation2_name BY customer_id;
Example
Let us perform left outer join operation on the two relations customers and
orders as shown below.
grunt> outer_left = JOIN customers BY id LEFT OUTER, orders BY
customer_id;
Verification
Verify the relation outer_left using the DUMP operator as shown below.
grunt> Dump outer_left;
Output
It will produce the following output, displaying the contents of the relation
outer_left.
(1,Peter,32,Salt Lake City,2000,,,,)
(2,Aaron,25,Salt Lake City,1500,101,2009-11-20 00:00:00,2,1560)
(3,Danny,23,Salt Lake City,2000,100,2009-10-08 00:00:00,3,1500)
(3,Danny,23,Salt Lake City,2000,102,2009-10-08 00:00:00,3,3000)
(4,Angela,25,Salt Lake City,6500,103,2008-05-20 00:00:00,4,2060)
(5,Peggy,27,Bhopal,8500,,,,)
(6,King,22,MP,4500,,,,)
(7,Carolyn,24,Indore,10000,,,,)
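The empty fields in the output above are nulls supplied for customers that have no order. A minimal Python sketch of that behaviour follows; it models the semantics only. The function `left_outer_join`, its `right_width` parameter, and the shortened tuples are assumptions for illustration, and `None` stands in for a Pig null.

```python
def left_outer_join(left, left_key, right, right_key, right_width):
    """Keep every left row; pad with None when the right side has no match."""
    result = []
    for l in left:
        matches = [r for r in right if r[right_key] == l[left_key]]
        if matches:
            result.extend(l + r for r in matches)
        else:
            # No partner on the right: its fields become nulls (None here).
            result.append(l + (None,) * right_width)
    return result

# Shortened, illustrative rows: (id, name) and (order_id, customer_id, amount).
customers = [(1, "Peter"), (2, "Aaron"), (3, "Danny")]
orders = [(101, 2, 1560), (100, 3, 1500), (102, 3, 3000)]

padded = left_outer_join(customers, 0, orders, 1, right_width=3)
for row in padded:
    print(row)
```

Every customer survives the join; only the customer without orders is padded.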
Right Outer Join
The right outer join operation returns all rows from the right table, even if
there are no matches in the left table.
Syntax
Given below is the syntax of performing right outer join operation using the
JOIN operator.
grunt> outer_right = JOIN customers BY id RIGHT, orders BY customer_id;
Example
Let us perform right outer join operation on the two relations customers and
orders as shown below.
grunt> outer_right = JOIN customers BY id RIGHT, orders BY customer_id;
Verification
Verify the relation outer_right using the DUMP operator as shown below.
grunt> Dump outer_right;
Output
It will produce the following output, displaying the contents of the relation
outer_right.
(2,Khilan,25,Delhi,1500,101,2009-11-20 00:00:00,2,1560)
(3,kaushik,23,Kota,2000,100,2009-10-08 00:00:00,3,1500)
(3,kaushik,23,Kota,2000,102,2009-10-08 00:00:00,3,3000)
(4,Chaitali,25,Mumbai,6500,103,2008-05-20 00:00:00,4,2060)
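In the output above every order has a matching customer, so the result has the same rows as the inner join. The difference only shows when an order has no customer. The Python sketch below models that case; the function `right_outer_join`, its `left_width` parameter, and order 104 with customer id 9 are hypothetical and exist only for illustration.

```python
def right_outer_join(left, left_key, right, right_key, left_width):
    """Keep every right row; pad the left side with None when unmatched."""
    result = []
    for r in right:
        matches = [l for l in left if l[left_key] == r[right_key]]
        if matches:
            result.extend(l + r for l in matches)
        else:
            # No partner on the left: its fields become nulls (None here).
            result.append((None,) * left_width + r)
    return result

# Shortened, illustrative rows: (id, name) and (order_id, customer_id, amount).
customers = [(2, "Khilan"), (3, "kaushik")]
orders = [(101, 2, 1560), (100, 3, 1500), (104, 9, 500)]  # order 104 is hypothetical

kept_orders = right_outer_join(customers, 0, orders, 1, left_width=2)
for row in kept_orders:
    print(row)
```

A full outer join would combine both behaviours: unmatched rows from either side are kept and padded with nulls.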
The SPLIT operator is used to split a relation into two or more relations.
Syntax
Given below is an example of the SPLIT operator. It sends students younger than 23
to student_details1 and students aged 23 or 24 to student_details2.
grunt> SPLIT student_details INTO student_details1 IF age < 23, student_details2 IF
(age > 22 AND age < 25);
grunt> Dump student_details1;
grunt> Dump student_details2;
Output
It will produce the following output, displaying the contents of the relations
student_details1 and student_details2 respectively.
grunt> Dump student_details1;
(1, Peter, Burke, 4353521729, Salt Lake City)
(2, Aaron, Kimberlake, 8013528191, Salt Lake City)
(3, Danny, Jacob, 2958295582, Salt Lake City)
(4, Angela, Kouth, 2938811911, Salt Lake City)
grunt> Dump student_details2;
(5, Peggy, Karter, 3202289119, Salt Lake City)
(6, King, Salmon, 2398329282, Salt Lake City)
(7, Carolyn, Fisher, 2293322829, Salt Lake City)
(8, John, Hopkins, 2102392020, Salt Lake City)
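The routing that SPLIT performs can be modelled in a few lines of Python. This is a sketch of the semantics only: the helper `split` is hypothetical, the conditions used are age < 23 and 22 < age < 25, and the (id, age) pairs are taken from the FOREACH output shown later in this session. As in Pig, a record goes to every output whose condition it satisfies and is dropped if it satisfies none.

```python
def split(records, **conditions):
    """Route each record to every output whose predicate it satisfies."""
    outputs = {name: [] for name in conditions}
    for rec in records:
        for name, predicate in conditions.items():
            if predicate(rec):
                outputs[name].append(rec)
    return outputs

# (id, age) pairs for the eight students.
students = [(1, 21), (2, 22), (3, 22), (4, 21), (5, 23), (6, 23), (7, 24), (8, 24)]

parts = split(
    students,
    student_details1=lambda rec: rec[1] < 23,
    student_details2=lambda rec: 22 < rec[1] < 25,
)
print([rec[0] for rec in parts["student_details1"]])
print([rec[0] for rec in parts["student_details2"]])
```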
The FILTER operator is used to select the required tuples from a relation
based on a condition.
Syntax
Given below is the syntax of the FILTER operator.
grunt> Relation2_name = FILTER Relation1_name BY (condition);
Example
Assume that we have a file named student_details.txt in the HDFS directory
/pig_data/ as shown below.
student_details.txt
1, Peter, Burke, 4353521729, Salt Lake City
2, Aaron, Kimberlake, 8013528191, Salt Lake City
3, Danny, Jacob, 2958295582, Salt Lake City
4, Angela, Kouth, 2938811911, Salt Lake City
5, Peggy, Karter, 3202289119, Salt Lake City
6, King, Salmon, 2398329282, Salt Lake City
7, Carolyn, Fisher, 2293322829, Salt Lake City
8, John, Hopkins, 2102392020, Salt Lake City
And we have loaded this file into Pig with the relation name student_details
as shown below.
grunt> student_details = LOAD '/pig_data/student_details.txt' USING
PigStorage(',') as (id:int, firstname:chararray, lastname:chararray, age:int,
phone:chararray, city:chararray);
grunt> filter_data = FILTER student_details BY city == 'Chennai';
Verification
Verify the relation filter_data using the DUMP operator as shown below.
grunt> Dump filter_data;
Output
It will produce the following output, displaying the contents of the relation
filter_data as follows.
(6, King, Salmon, 2398329282, Salt Lake City)
(8, John, Hopkins, 2102392020, Salt Lake City)
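As a rough model, FILTER keeps the tuples for which the condition is true and discards the rest. The Python sketch below illustrates this; the helper `filter_relation` is hypothetical, and the city values in the sample rows are made up for illustration so that some rows match and some do not.

```python
def filter_relation(records, predicate):
    """Keep only the records for which the predicate returns True."""
    return [rec for rec in records if predicate(rec)]

# (id, firstname, city) rows; the city values are illustrative assumptions.
sample = [
    (5, "Peggy", "Salt Lake City"),
    (6, "King", "Chennai"),
    (7, "Carolyn", "Salt Lake City"),
    (8, "John", "Chennai"),
]

matching = filter_relation(sample, lambda rec: rec[2] == "Chennai")
for rec in matching:
    print(rec)
```

The comparison is exact, so a value with a leading space or different capitalisation would not match.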
The DISTINCT operator is used to remove redundant (duplicate) tuples from
a relation.
grunt> distinct_data = DISTINCT student_details;
grunt> Dump distinct_data;
The FOREACH operator is used to generate specified data transformations
based on the column data.
grunt> foreach_data = FOREACH student_details GENERATE id,age,city;
grunt> Dump foreach_data;
Output
It will produce the following output, displaying the contents of the relation
foreach_data.
(1,21,Salt Lake City)
(2,22, Salt Lake City)
(3,22, Salt Lake City)
(4,21, Salt Lake City)
(5,23, Salt Lake City)
(6,23, Salt Lake City)
(7,24, Salt Lake City)
(8,24, Salt Lake City)
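FOREACH ... GENERATE as used here is a projection: it keeps the named columns of every tuple. DISTINCT then removes tuples that are identical in every field. The Python sketch below models both on three of the student rows; the variable names `projected` and `unique_pairs` are assumptions, and the ages are the ones shown in the output above.

```python
# (id, firstname, lastname, age, phone, city), following the LOAD schema.
students = [
    (1, "Peter", "Burke", 21, "4353521729", "Salt Lake City"),
    (2, "Aaron", "Kimberlake", 22, "8013528191", "Salt Lake City"),
    (3, "Danny", "Jacob", 22, "2958295582", "Salt Lake City"),
]

# Projection: keep id, age and city from every row.
projected = [(sid, age, city) for (sid, _first, _last, age, _phone, city) in students]

# Duplicate removal on a narrower projection: rows 2 and 3 collapse into one.
unique_pairs = sorted({(age, city) for (_sid, age, city) in projected})

print(projected)
print(unique_pairs)
```

Projecting first and then removing duplicates is a common combination, since whole student rows rarely repeat but projected columns often do.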
The ASSERT operator is used for data validation. The script fails if any record does not
meet the condition specified in the assertion.
Save the following data in a file on the desktop:
12,23
23,34
-21,22
grunt> a = LOAD '/home/mishra/Desktop/exp' USING PigStorage(',') AS (id:int, roll:int);
Now apply the ASSERT operator.
grunt> ASSERT a BY id > 0, 'a cant be neg';
grunt> Dump a;
An error is generated because one of the values in id is negative.
Check the details in the log file that Pig generates. At the end of the file you
will find:
Assertion violated: a cant be neg
Now consider another example with the same data.
grunt> b = LOAD '/home/mishra/Desktop/exp' USING PigStorage(',') AS
(id:int, roll:int);
grunt> ASSERT b BY id > 13, 'value is below 13';
grunt> Dump b;
An error is generated because some of the values in id are not greater than 13.
Check the details in the log file that Pig generates. At the end of the file you
will find:
Assertion violated: value is below 13
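The behaviour of ASSERT can be pictured as a check over every record that aborts on the first violation. The Python sketch below is an analogy, not Pig's implementation; the exception class `AssertionViolated` and the helper `pig_assert` are hypothetical names, and the data is the three-line sample file from above.

```python
class AssertionViolated(Exception):
    """Raised when a record fails the validation condition."""

def pig_assert(records, predicate, message):
    """Return the records unchanged, or fail on the first violating record."""
    for rec in records:
        if not predicate(rec):
            raise AssertionViolated(f"Assertion violated: {message}")
    return records

a = [(12, 23), (23, 34), (-21, 22)]  # (id, roll) pairs from the sample file

try:
    pig_assert(a, lambda rec: rec[0] > 0, "a cant be neg")
    outcome = "all records passed"
except AssertionViolated as err:
    outcome = str(err)

print(outcome)
```

With the second condition, id > 13, the very first record (12, 23) already fails, so the check stops there.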
Macros in Pig
Macros make Pig Latin scripts more reusable. A macro is a kind of function
written in Pig Latin.
The DEFINE keyword is used to create macros.
You can define a function by writing a macro and then reuse that macro.
Save the following student data, with the fields id, name, fees, and rollno
respectively, in a file (here /home/ands/Desktop/xyz.txt):
10,Peter,10000,1
15,Aaron,20000,25
30,Danny,30000,1
40,Angela,40000,35
Move the data to HDFS:
hadoop fs -put /home/ands/Desktop/xyz.txt /pigy
Create a file on the desktop with the .pig extension (say macro.pig) and paste in the
lines below:
DEFINE myfilter(relvar, colvar) RETURNS x {
    $x = FILTER $relvar BY $colvar == 1;
};
stu = LOAD '/pigy' USING PigStorage(',') AS (id, name, fees, rollno);
studrollno1 = myfilter(stu, rollno);
DUMP studrollno1;
The macro above takes two inputs: a relation (relvar) and a column (colvar).
It keeps only the tuples whose colvar value equals 1.
Now change directory to the desktop and run:
pig -f macro.pig
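For readers more familiar with general-purpose languages, the macro behaves much like a small function that takes a relation and a column name. The Python sketch below is only an analogy, since Pig expands macros inside the script rather than calling them at run time. It reuses the names `myfilter`, `stu`, and `studrollno1` from the script and the four student rows from the sample file; representing rows as dictionaries is an assumption made for readability.

```python
def myfilter(relvar, colvar):
    """Python stand-in for the macro: keep rows whose `colvar` field equals 1."""
    return [row for row in relvar if row[colvar] == 1]

stu = [
    {"id": 10, "name": "Peter", "fees": 10000, "rollno": 1},
    {"id": 15, "name": "Aaron", "fees": 20000, "rollno": 25},
    {"id": 30, "name": "Danny", "fees": 30000, "rollno": 1},
    {"id": 40, "name": "Angela", "fees": 40000, "rollno": 35},
]

studrollno1 = myfilter(stu, "rollno")
print([row["name"] for row in studrollno1])

# The same definition can be reused on a different column.
print(myfilter(stu, "id"))
```

The second call shows the point of a macro: the filtering logic is written once and applied to whichever relation and column are passed in.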
Topics to be covered in next session
• Sqoop
• Sqoop Installation
• Exporting the data
• Exporting from Hadoop to SQL
Thank you!
