Analytics Functions in MySQL, Oracle and Hive
Rank, Hierarchical, Pivot queries.
MySQL Employee, Manager Relationship
 mysql> select * from employee;
 +------+---------+------+
 | id   | name    | p_id |
 +------+---------+------+
 |    1 | Manager | NULL |
 |    2 | Emp1    |    1 |
 |    3 | Emp2    |    1 |
 |    4 | Emp1_1  |    2 |
 |    5 | Emp1_2  |    2 |
 |    6 | Emp2_1  |    3 |
 |    7 | Emp2_2  |    3 |
 +------+---------+------+
 mysql> select * from employee a,employee b where a.id=b.p_id;
 +------+---------+------+------+--------+------+
 | id   | name    | p_id | id   | name   | p_id |
 +------+---------+------+------+--------+------+
 |    1 | Manager | NULL |    2 | Emp1   |    1 |
 |    1 | Manager | NULL |    3 | Emp2   |    1 |
 |    2 | Emp1    |    1 |    4 | Emp1_1 |    2 |
 |    2 | Emp1    |    1 |    5 | Emp1_2 |    2 |
 |    3 | Emp2    |    1 |    6 | Emp2_1 |    3 |
 |    3 | Emp2    |    1 |    7 | Emp2_2 |    3 |
 +------+---------+------+------+--------+------+
MySQL Nth Salary
 mysql> select * from employee;
 +------+---------+------+--------+
 | id   | name    | p_id | salary |
 +------+---------+------+--------+
 |    1 | Manager | NULL | 300000 |
 |    2 | Emp1    |    1 | 200000 |
 |    3 | Emp2    |    1 | 200000 |
 |    4 | Emp1_1  |    2 | 100000 |
 |    5 | Emp1_2  |    2 | 100000 |
 |    6 | Emp2_1  |    3 | 100000 |
 |    7 | Emp2_2  |    3 | 100000 |
 +------+---------+------+--------+
 mysql> select max(salary) from employee;
 +-------------+
 | max(salary) |
 +-------------+
 |      300000 |
 +-------------+
 Nth highest salary (second highest when N=2):-
 select col1,col2,salary from tab1 where N-1 = (select count(distinct salary) from tab2 where tab2.salary>tab1.salary);
 In our case (N=2, so N-1=1):-
 mysql> select id,name,salary from employee a where 1=(select count(distinct salary) from employee b where b.salary>a.salary);
 +------+------+--------+
 | id   | name | salary |
 +------+------+--------+
 |    2 | Emp1 | 200000 |
 |    3 | Emp2 | 200000 |
 +------+------+--------+
 Explanation:- The above query is a correlated subquery, i.e. the inner query depends on the current row of the outer query.
 In layman's terms: for each row of the outer query (select id,name,salary from employee a), the inner query counts how many distinct salaries are greater than that row's salary. For Manager the inner query becomes (select count(distinct salary) from employee b where b.salary > 300000), which returns 0, so Manager is rejected. For Emp1 and Emp2 it returns 1, so these rows satisfy the condition and are the second highest.
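The counting rule can be checked for several values of N. This is a small sketch, assuming SQLite (via Python's sqlite3) in place of MySQL; the helper name nth_highest is hypothetical and not part of the original slides.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the query is standard SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER, name TEXT, p_id INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Manager", None, 300000), (2, "Emp1", 1, 200000), (3, "Emp2", 1, 200000),
     (4, "Emp1_1", 2, 100000), (5, "Emp1_2", 2, 100000),
     (6, "Emp2_1", 3, 100000), (7, "Emp2_2", 3, 100000)],
)


def nth_highest(conn, n):
    """Return rows whose salary has exactly n-1 distinct salaries above it."""
    return conn.execute(
        """
        SELECT id, name, salary
        FROM employee a
        WHERE ? - 1 = (SELECT COUNT(DISTINCT salary)
                       FROM employee b
                       WHERE b.salary > a.salary)
        ORDER BY id
        """,
        (n,),
    ).fetchall()


for n in (1, 2, 3):
    print(n, nth_highest(conn, n))
```

Because the subquery counts distinct salaries, ties such as Emp1 and Emp2 are returned together for the same N.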
 If you have a separate department table and want the second (Nth) highest salary in each department, extend the query with joins. Note that this version compares with >=, so the count is matched against N (here 2) instead of N-1:-
 SELECT DISTINCT d1.deptno, e1.sal
 FROM emp e1, dept d1
 WHERE 2 = (SELECT count(DISTINCT e2.sal)
            FROM emp e2, dept d2
            WHERE e2.sal >= e1.sal
              AND d2.deptno = e2.deptno
              AND d1.deptno = d2.deptno)
   AND d1.deptno = e1.deptno;
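The department-wise query has no sample data in the slides, so the sketch below invents some. The emp and dept rows, the department numbers, and the helper name nth_salary_per_dept are all assumptions made for illustration; SQLite again stands in for MySQL.

```python
import sqlite3

# Assumption: hypothetical emp/dept rows, invented only for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (deptno INTEGER)")
conn.execute("CREATE TABLE emp (ename TEXT, sal INTEGER, deptno INTEGER)")
conn.executemany("INSERT INTO dept VALUES (?)", [(10,), (20,)])
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("A", 5000, 10), ("B", 4000, 10), ("C", 4000, 10), ("D", 3000, 10),
     ("E", 6000, 20), ("F", 2000, 20)],
)


def nth_salary_per_dept(conn, n):
    """Nth highest distinct salary in each department (>= convention, so compare with n)."""
    return conn.execute(
        """
        SELECT DISTINCT d1.deptno, e1.sal
        FROM emp e1, dept d1
        WHERE ? = (SELECT COUNT(DISTINCT e2.sal)
                   FROM emp e2, dept d2
                   WHERE e2.sal >= e1.sal
                     AND d2.deptno = e2.deptno
                     AND d1.deptno = d2.deptno)
          AND d1.deptno = e1.deptno
        ORDER BY d1.deptno
        """,
        (n,),
    ).fetchall()


print(nth_salary_per_dept(conn, 2))
```

The extra conditions on deptno restrict the inner count to salaries from the same department as the outer row.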
MySQL Pivot Query
 mysql> select * from sales;
 +-------+----------+--------+
 | name  | item     | amount |
 +-------+----------+--------+
 | Mark  | Computer |  20000 |
 | Mark  | Laptop   | 100000 |
 | Bill  | Laptop   | 100000 |
 | Steve | Mobile   | 150000 |
 | Steve | Headset  |  15000 |
 +-------+----------+--------+
 mysql> select (case when item='Computer' then sum(amount) else 0 end) as Computer_sale,
     ->        (case when item='Laptop' then sum(amount) else 0 end) as Laptop_sale,
     ->        (case when item='Mobile' then sum(amount) else 0 end) as Mobile_sale,
     ->        (case when item='Headset' then sum(amount) else 0 end) as Headset_sale
     -> from sales group by item;
 +---------------+-------------+-------------+--------------+
 | Computer_sale | Laptop_sale | Mobile_sale | Headset_sale |
 +---------------+-------------+-------------+--------------+
 |         20000 |           0 |           0 |            0 |
 |             0 |           0 |           0 |        15000 |
 |             0 |      200000 |           0 |            0 |
 |             0 |           0 |      150000 |            0 |
 +---------------+-------------+-------------+--------------+
Oracle Employee, Manager Relationship
 Now let us convert these queries to Oracle, which is a bit simpler because Oracle provides built-in constructs for all of them.
 For employee-manager (hierarchical) queries, Oracle has CONNECT BY PRIOR, together with the LEVEL pseudocolumn and the START WITH clause.
 select id,name,LEVEL from employee connect by prior id=p_id;
 Without START WITH, every row is treated as a root, so the result contains one subtree per employee. The subtree rooted at Emp1 is:
 ID NAME    LEVEL
  2 Emp1        1
  4 Emp1_1      2
  5 Emp1_2      2
 select id,name,LEVEL from employee start with id=1 connect by prior id=p_id ORDER SIBLINGS BY name;
 https://docs.oracle.com/cd/B19306_01/server.102/b14200/queries003.htm
 Output (first four rows):
 ID NAME    LEVEL
  1 Manager     1
  2 Emp1        2
  4 Emp1_1      3
  5 Emp1_2      3
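Outside Oracle, the same tree walk is usually written as a recursive common table expression. The sketch below swaps CONNECT BY for WITH RECURSIVE and runs it on SQLite through Python's sqlite3; the column names lvl and path are assumptions of this example. Sorting by the accumulated path imitates ORDER SIBLINGS BY name for this data set, but it is not a general replacement.

```python
import sqlite3

# Assumption: in-memory SQLite with the same employee rows as the slides.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, p_id INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Manager", None), (2, "Emp1", 1), (3, "Emp2", 1),
     (4, "Emp1_1", 2), (5, "Emp1_2", 2), (6, "Emp2_1", 3), (7, "Emp2_2", 3)],
)

hierarchy = conn.execute(
    """
    WITH RECURSIVE tree(id, name, lvl, path) AS (
        -- Anchor member: plays the role of START WITH id = 1.
        SELECT id, name, 1, name FROM employee WHERE id = 1
        UNION ALL
        -- Recursive member: plays the role of CONNECT BY PRIOR id = p_id.
        SELECT e.id, e.name, t.lvl + 1, t.path || '/' || e.name
        FROM employee e
        JOIN tree t ON e.p_id = t.id
    )
    SELECT id, name, lvl FROM tree ORDER BY path
    """
).fetchall()

for emp_id, name, lvl in hierarchy:
    # Indent each name by its depth in the tree.
    print("  " * (lvl - 1) + f"{emp_id} {name} (level {lvl})")
```

The lvl column is computed by adding 1 at each recursion step, which gives the same numbers as Oracle's LEVEL when the walk starts from the single root.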
Oracle Nth Salary
 For Nth salary and similar queries, Oracle provides analytic functions such as ROW_NUMBER(), RANK() and DENSE_RANK().
 select name, salary, ROW_NUMBER() over (order by salary desc) as rn from employee;
 Output (first four rows):
 NAME    SALARY RN
 Manager 300000  1
 Emp2    200000  2
 Emp1    200000  3
 Emp2_1  100000  4
 Now comparing all three in one query:-
 select name, salary,
        ROW_NUMBER() over (order by salary desc) as rn,
        RANK() over (order by salary desc) as ranks,
        DENSE_RANK() over (order by salary desc) as dense_ranks
 from employee;
 NAME    SALARY RN RANKS DENSE_RANKS
 Manager 300000  1     1           1
 Emp2    200000  2     2           2
 Emp1    200000  3     2           2
 Emp2_1  100000  4     4           3
 Emp1_2  100000  5     4           3
 Emp2_2  100000  6     4           3
 Emp1_1  100000  7     4           3
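The three ranking functions can be compared on the same data without Oracle. This sketch assumes a Python build whose bundled SQLite is version 3.25 or later, where window functions are available. The id tiebreaker in ROW_NUMBER is an addition of this example so that the numbering of tied salaries is deterministic.

```python
import sqlite3

# Assumption: SQLite >= 3.25 (window function support) stands in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER, name TEXT, p_id INTEGER, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Manager", None, 300000), (2, "Emp1", 1, 200000), (3, "Emp2", 1, 200000),
     (4, "Emp1_1", 2, 100000), (5, "Emp1_2", 2, 100000),
     (6, "Emp2_1", 3, 100000), (7, "Emp2_2", 3, 100000)],
)

ranked = conn.execute(
    """
    SELECT name, salary,
           ROW_NUMBER() OVER (ORDER BY salary DESC, id) AS rn,
           RANK()       OVER (ORDER BY salary DESC)     AS ranks,
           DENSE_RANK() OVER (ORDER BY salary DESC)     AS dense_ranks
    FROM employee
    ORDER BY rn
    """
).fetchall()

# Nth highest salary: filter on DENSE_RANK, which leaves no gaps after ties.
second_highest = conn.execute(
    """
    SELECT name, salary
    FROM (SELECT id, name, salary,
                 DENSE_RANK() OVER (ORDER BY salary DESC) AS dr
          FROM employee) AS ranked_rows
    WHERE dr = ?
    ORDER BY id
    """,
    (2,),
).fetchall()

for row in ranked:
    print(row)
print(second_highest)
```

RANK skips to 4 after the two rows tied at 2, while DENSE_RANK continues with 3, which is why DENSE_RANK is the one to filter on for an Nth salary query.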
 Above, the OVER clause ranks salaries across the whole table. But what if the table also has a department (or year) column and we want the salary ranking within each department? For this Oracle has the PARTITION BY clause: OVER with only ORDER BY runs the analytic function over all rows, while PARTITION BY splits the rows into one partition per value of that column and runs the analytic function separately within each partition.
 select name, salary, ROW_NUMBER() over (partition by department order by salary desc) as rn from employee;
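The employee table in the slides has no department column, so the sketch below adds one with invented values. The department names, employee names and salaries are assumptions for illustration only, and SQLite 3.25 or later stands in for Oracle.

```python
import sqlite3

# Assumption: hypothetical rows with a department column, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("A1", "Sales", 300), ("A2", "Sales", 200),
     ("B1", "IT", 500), ("B2", "IT", 400), ("B3", "IT", 100)],
)

# ROW_NUMBER restarts at 1 inside every department partition.
per_department = conn.execute(
    """
    SELECT department, name, salary,
           ROW_NUMBER() OVER (PARTITION BY department
                              ORDER BY salary DESC) AS rn
    FROM employee
    ORDER BY department, rn
    """
).fetchall()

# Keep only the top earner of each department.
top_earners = [(dept, name) for dept, name, _salary, rn in per_department if rn == 1]

for row in per_department:
    print(row)
print(top_earners)
```

Filtering on rn = 1 gives the highest paid employee per department; filtering on another value gives the Nth, provided salaries within a department are distinct.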
Oracle Pivot Query
 Oracle provides a PIVOT clause directly.
 select * from (select name,item,amount from sales)
 pivot (sum(amount) as sum_amount for item in ('Computer','Laptop','Mobile','Headset'));
 The MySQL queries shown earlier use standard SQL, so they are valid in Oracle as well.
Hive Queries
 The MySQL queries can be applied almost as they are, with slight changes, for example in how string literals are quoted.
 Also, from Hive 0.11 onward, analytic functions like those in Oracle are available.
 https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics#LanguageManualWindowingAndAnalytics-WindowingandAnalyticsFunctions
