Advanced Database Management Systems: 39
Algorithms for PROJECT and SET Operations
Prof Neeraj Bhargava
Vaibhav Khanna
Department of Computer Science
School of Engineering and Systems Sciences
Maharshi Dayanand Saraswati University Ajmer
Slide 15- 2
Algorithms for PROJECT and SET Operations (1)
• Algorithm for PROJECT operations (Figure 15.3b)
π<attribute list>(R)
1. If <attribute list> includes a key of relation R, extract all tuples from
R with only the values for the attributes in <attribute list>. No duplicates
can arise, because the key values are distinct.
2. If <attribute list> does NOT include a key of relation R,
duplicate tuples must be removed from the result.
• Methods to remove duplicate tuples
1. Sorting
2. Hashing
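A minimal sketch of the two duplicate-elimination methods, assuming rows are held in memory as Python dictionaries. The function names and the `includes_key` flag are hypothetical, chosen only for illustration; a real DBMS would work on disk blocks rather than lists.

```python
def project_hash(rows, attrs, includes_key=False):
    """Project rows onto attrs; remove duplicates by hashing unless attrs include a key."""
    result = []
    seen = set()  # hash table of projected tuples already emitted
    for row in rows:
        t = tuple(row[a] for a in attrs)
        if includes_key:
            result.append(t)  # key present: no duplicates are possible
        elif t not in seen:
            seen.add(t)
            result.append(t)
    return result


def project_sort(rows, attrs):
    """Project rows onto attrs; remove duplicates by sorting and skipping equal neighbours."""
    projected = sorted(tuple(row[a] for a in attrs) for row in rows)
    result = []
    for t in projected:
        if not result or result[-1] != t:  # duplicates are adjacent after sorting
            result.append(t)
    return result
```

The hashing variant preserves the input order, while the sorting variant returns the result in sorted order.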
Slide 15- 3
Algorithms for PROJECT and SET Operations (2)
• Algorithm for SET operations
• Set operations:
– UNION, INTERSECTION, SET DIFFERENCE and CARTESIAN
PRODUCT
• The CARTESIAN PRODUCT of relations R and S includes all possible
combinations of records from R and S. The attributes of the
result include all attributes of R and S.
• Cost analysis of CARTESIAN PRODUCT
– If R has n records and j attributes and S has m records and k
attributes, the result relation will have n*m records and j+k
attributes.
• The CARTESIAN PRODUCT operation is very expensive, because its result
grows as the product of the input sizes, and should be avoided if possible.
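The size claim can be checked with a small sketch, assuming each relation is a list of tuples. The helper name `cartesian_product` is hypothetical.

```python
from itertools import product


def cartesian_product(r, s):
    """Concatenate every tuple of r with every tuple of s."""
    return [tr + ts for tr, ts in product(r, s)]


# R has n = 3 records and j = 2 attributes; S has m = 2 records and k = 1 attribute.
R = [(1, "a"), (2, "b"), (3, "c")]
S = [(10,), (20,)]
RxS = cartesian_product(R, S)
print(len(RxS), len(RxS[0]))  # n*m records, j+k attributes
```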
Slide 15- 4
Algorithms for PROJECT and SET Operations (3)
• Algorithm for SET operations (contd.)
• UNION (See Figure 15.3c)
– Sort the two relations on the same attributes.
– Scan and merge both sorted files concurrently. Whenever the
same tuple exists in both relations, only one copy is kept in the
merged result.
• INTERSECTION (See Figure 15.3d)
– Sort the two relations on the same attributes.
– Scan and merge both sorted files concurrently, keeping in the
merged result only those tuples that appear in both relations.
• SET DIFFERENCE R-S (See Figure 15.3e)
– Sort the two relations on the same attributes.
– Scan and merge both sorted files concurrently, keeping in the
merged result only those tuples that appear in relation R but not
in relation S.
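The three sort-merge variants differ only in which tuples are emitted during the merge. The following is a sketch under the assumption that each input is an in-memory list of comparable tuples with no duplicates inside a single relation; `merge_set_op` and its `op` parameter are hypothetical names.

```python
def merge_set_op(r, s, op):
    """Sort-merge UNION, INTERSECTION, or DIFFERENCE (R - S) of two relations."""
    r, s = sorted(r), sorted(s)  # sort both relations on the same attributes
    i = j = 0
    out = []
    while i < len(r) and j < len(s):
        if r[i] < s[j]:          # tuple appears only in R
            if op in ("union", "difference"):
                out.append(r[i])
            i += 1
        elif r[i] > s[j]:        # tuple appears only in S
            if op == "union":
                out.append(s[j])
            j += 1
        else:                    # tuple appears in both: keep one copy at most
            if op in ("union", "intersection"):
                out.append(r[i])
            i += 1
            j += 1
    if op in ("union", "difference"):
        out.extend(r[i:])        # leftover R tuples have no match in S
    if op == "union":
        out.extend(s[j:])        # leftover S tuples have no match in R
    return out
```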
Slide 15- 5
Implementing Aggregate Operations and Outer
Joins (1)
• Implementing Aggregate Operations:
• Aggregate operators:
– MIN, MAX, SUM, COUNT and AVG
• Options to implement aggregate operators:
– Table Scan
– Index
• Example
– SELECT MAX (SALARY) FROM EMPLOYEE;
• If an (ascending) index on SALARY exists for the EMPLOYEE relation, then
the optimizer could decide to traverse the index for the largest value,
which would entail following the rightmost pointer in each index node
from the root to a leaf.
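The traversal can be sketched on a toy tree-structured index. The `IndexNode` class below is a hypothetical simplification (keys plus child pointers, no record pointers or sibling links), not the layout of any particular DBMS. MIN is handled symmetrically by following the leftmost pointer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IndexNode:
    keys: List[int]                                            # ascending key values
    children: List["IndexNode"] = field(default_factory=list)  # empty for a leaf


def index_max(root):
    """Follow the rightmost pointer from the root to a leaf; return its last key."""
    node = root
    while node.children:
        node = node.children[-1]
    return node.keys[-1]


def index_min(root):
    """Follow the leftmost pointer from the root to a leaf; return its first key."""
    node = root
    while node.children:
        node = node.children[0]
    return node.keys[0]
```

Only one node per level is visited, so the cost is proportional to the height of the index rather than to the number of EMPLOYEE records.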
Slide 15- 6
Implementing Aggregate Operations and Outer
Joins (2)
• Implementing Aggregate Operations (contd.):
• SUM, COUNT and AVG
• For a dense index (each record has one index entry):
– Apply the associated computation to the values in the index.
• For a non-dense index:
– The actual number of records associated with each index entry must be
accounted for in the computation.
• With GROUP BY: the aggregate operator must be applied separately to
each group of tuples.
– Use sorting or hashing on the group attributes to partition the file into
the appropriate groups;
– Compute the aggregate function for the tuples in each group.
• What if we have a clustering index on the grouping attributes? Then the
records are already partitioned into groups, so only the per-group
computation remains.
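A sketch of the hashing option for GROUP BY, assuming rows are in-memory dictionaries. The function name `group_aggregate` and the attribute names in the usage note are hypothetical. Each group keeps a running count and sum, so the file is read only once.

```python
def group_aggregate(rows, group_attr, value_attr):
    """Hash-partition rows on group_attr and compute COUNT, SUM, and AVG per group."""
    acc = {}  # group value -> [count, sum]
    for row in rows:
        entry = acc.setdefault(row[group_attr], [0, 0])
        entry[0] += 1
        entry[1] += row[value_attr]
    return {
        g: {"count": c, "sum": s, "avg": s / c}
        for g, (c, s) in acc.items()
    }
```

For example, calling `group_aggregate(rows, "dno", "salary")` corresponds roughly to `SELECT DNO, COUNT(*), SUM(SALARY), AVG(SALARY) FROM EMPLOYEE GROUP BY DNO`.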
Slide 15- 7
Implementing Aggregate Operations and Outer
Joins (3)
• Implementing Outer Join:
• Outer Join Operators:
– LEFT OUTER JOIN
– RIGHT OUTER JOIN
– FULL OUTER JOIN.
• The full outer join produces a result which is equivalent to the union of the results
of the left and right outer joins.
• Example:
SELECT FNAME, DNAME
FROM (EMPLOYEE LEFT OUTER JOIN DEPARTMENT
ON DNO = DNUMBER);
• Note: The result of this query is a table of employee names and their associated
departments. It is similar to a regular join result, with the exception that if an
employee does not have an associated department, the employee's name will still
appear in the resulting table, although the department name would be indicated
as null.
Slide 15- 8
Implementing Aggregate Operations and Outer
Joins (4)
• Implementing Outer Join (contd.):
• Modifying Join Algorithms:
– Nested Loop or Sort-Merge joins can be modified to implement
outer join. E.g.,
• For left outer join, use the left relation as outer relation and
construct result from every tuple in the left relation.
• If there is a match, the concatenated tuple is saved in the result.
• However, if an outer tuple does not match, then the tuple is still
included in the result but is padded with null value(s).
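The modified nested-loop join can be sketched as follows, assuming in-memory lists of dictionaries. The function and parameter names are hypothetical, and Python's `None` stands in for the SQL null.

```python
def left_outer_join(left, right, left_attr, right_attr, right_attrs):
    """Nested-loop left outer join: every left tuple appears in the result."""
    result = []
    for l in left:                      # left relation is the outer loop
        matched = False
        for r in right:                 # right relation is the inner loop
            # A null join value never matches, as in SQL.
            if l[left_attr] is not None and l[left_attr] == r[right_attr]:
                result.append({**l, **r})   # concatenate the matching tuples
                matched = True
        if not matched:
            # No match: keep the left tuple, padded with nulls for the right side.
            result.append({**l, **{a: None for a in right_attrs}})
    return result
```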
Slide 15- 9
Implementing Aggregate Operations and Outer
Joins (5)
• Implementing Outer Join (contd.):
• Executing a combination of relational algebra operators.
• Implement the previous left outer join example
– {Compute the JOIN of the EMPLOYEE and DEPARTMENT tables}
• TEMP1 ← π FNAME,DNAME (EMPLOYEE ⋈ DNO=DNUMBER DEPARTMENT)
– {Find the EMPLOYEEs that do not appear in the JOIN}
• TEMP2 ← π FNAME (EMPLOYEE) − π FNAME (TEMP1)
– {Pad each tuple in TEMP2 with a null DNAME field}
• TEMP2 ← TEMP2 × 'null'
– {UNION the temporary tables to produce the LEFT OUTER JOIN}
• RESULT ← TEMP1 ∪ TEMP2
• The cost of the outer join, as computed above, would include the cost of
the associated steps (i.e., join, projections and union).
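The four steps map directly onto set operations. This is a sketch that assumes EMPLOYEE is given as (FNAME, DNO) pairs and DEPARTMENT as (DNAME, DNUMBER) pairs; the function name is hypothetical and `None` stands in for null.

```python
def left_outer_join_algebra(employee, department):
    """Left outer join built from join, projection, difference, and union."""
    # TEMP1: project FNAME, DNAME from the join on DNO = DNUMBER
    temp1 = {
        (fname, dname)
        for fname, dno in employee
        for dname, dnumber in department
        if dno == dnumber
    }
    # TEMP2: employees (by FNAME) that do not appear in the join
    temp2 = {fname for fname, _ in employee} - {fname for fname, _ in temp1}
    # Pad each TEMP2 tuple with a null DNAME field
    temp2 = {(fname, None) for fname in temp2}
    # RESULT: union of the two temporary tables
    return temp1 | temp2
```

Like the algebra expression it mirrors, this version identifies employees by FNAME alone, so two employees who share a first name are not distinguished.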
Slide 15- 10
Combining Operations using Pipelining (1)
• Motivation
– A query is mapped into a sequence of operations.
– Each execution of an operation produces a temporary result.
– Generating and saving temporary files on disk is time-consuming
and expensive.
• Alternative:
– Avoid constructing temporary results as much as possible.
– Pipeline the data through multiple operations - pass the result
of a previous operator to the next without waiting to complete
the previous operation.
Slide 15- 11
Combining Operations using Pipelining (2)
• Example:
– For a 2-way join, combine the 2 selections on the
input and one projection on the output with the
Join.
• Dynamic generation of code to allow for
multiple operations to be pipelined.
• Results of a select operation are fed in a
"Pipeline" to the join algorithm.
• Also known as stream-based processing.
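Python generators give a compact way to illustrate the idea: each operator pulls tuples from its input one at a time, so no full temporary result is built between the selections, the join, and the projection. This is only a sketch; the operator names, attribute names, and predicates are hypothetical, and the inner join input is buffered here so that it can be rescanned.

```python
def select(rows, predicate):
    """Yield the rows that satisfy the predicate, one at a time."""
    for row in rows:
        if predicate(row):
            yield row


def nested_loop_join(outer, inner, condition):
    """Join two streams; the inner side is buffered so it can be rescanned."""
    inner = list(inner)
    for o in outer:
        for i in inner:
            if condition(o, i):
                yield {**o, **i}


def project(rows, attrs):
    """Yield only the requested attributes of each row."""
    for row in rows:
        yield tuple(row[a] for a in attrs)


def pipelined_query(employees, departments):
    """Two selections on the inputs, one join, one projection on the output."""
    emp = select(employees, lambda e: e["SALARY"] > 30000)
    dept = select(departments, lambda d: d["DNUMBER"] == 5)
    joined = nested_loop_join(emp, dept, lambda e, d: e["DNO"] == d["DNUMBER"])
    return project(joined, ["FNAME", "DNAME"])
```

Nothing is computed until the caller iterates over the result of `pipelined_query`; each output tuple is produced as soon as its inputs have passed through the pipeline.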
Assignment
• Explain the Algorithms for PROJECT and SET Operations
