Chapter Two
Query Processing and Optimization
Introduction
Query Processing
• The activities involved in retrieving data from the database.
• These include translating high-level queries into low-level
expressions that can be used at the physical level of the
file system, optimizing the query, and actually executing
it to get the result.
Query Processing…
Aims of query processing (QP):
 Transform query written in high-level language (e.g.,
SQL), into correct and efficient execution strategy
expressed in low-level language that implements
relational algebra (RA);
 Execute strategy to retrieve required data.
Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
Parsing and translation
• Scanner: identifies the language tokens, such as SQL keywords,
attribute names, and relation names, in the text of the query.
• Parser: checks the query syntax to determine whether it is
formulated according to the syntax rules of the query language.
• Validation: checks that all attribute and relation names are valid
and semantically meaningful names in the schema of the particular
database being queried.
Parsing and translation
 Query is converted to relational algebra by SQL
interpreter.
 Relational Algebra converted to annotated tree,
joins as branches
 Each operator has implementation choices.
Translating SQL Queries into Relational Algebra
Query block:
The basic unit that can be translated into the algebraic operators
and optimized.
A query block contains a single SELECT-FROM-WHERE expression,
as well as GROUP BY and HAVING clause if these are part of the
block.
Nested queries:
Within a query are identified as separate query blocks.
Aggregate operators in SQL must be included in the extended
algebra.
Translation Example
Possible SQL Query:
• SELECT balance FROM account WHERE balance < 2500
Possible Relational Algebra Query:
• π balance (σ balance<2500 (account))
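The translation example can be sketched in Python by treating a relation as a list of dictionaries. This is only an illustration: the sample rows and the attribute account_no are assumptions, and select and project are hypothetical helper names, not part of any DBMS.

```python
# Hypothetical sample data; the slides only name the relation and one attribute.
account = [
    {"account_no": "A-101", "balance": 500},
    {"account_no": "A-215", "balance": 700},
    {"account_no": "A-305", "balance": 3500},
]

def select(relation, predicate):
    """sigma: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """pi: keep only the named attributes and drop duplicate tuples."""
    seen = set()
    result = []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result

# pi balance (sigma balance<2500 (account))
result = project(select(account, lambda r: r["balance"] < 2500), ["balance"])
print(result)
```

The project helper removes duplicates, which follows relational algebra semantics; a plain SQL SELECT would keep them.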
Translating SQL Queries into Relational Algebra
Consider: to find names of employees making more than
everyone in department 5.
SELECT lname, fname FROM employee WHERE salary > (
SELECT MAX(salary) FROM employee WHERE dno=5)
Translating SQL Queries into Relational Algebra
2 query blocks:
SELECT lname, fname
FROM employee
WHERE salary > constant
SELECT MAX(salary)
FROM employee
WHERE dno=5
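Relational Algebra:
π lname, fname (σ salary>cons (employee))
where cons is the result from:
ℱ MAX salary (σ dno=5 (employee))
(MAX is an aggregate function, so the inner block uses the aggregate
operator ℱ of the extended algebra rather than π.)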
Translating SQL Queries into Relational Algebra
Consider: to find the names of employees together with the
names of their departments.
SELECT lname, fname, dname FROM employee e,
department d WHERE e.dno = d.dno
Relational Algebra:
π lname, fname, dname (employee ⋈ e.dno=d.dno department)
Optimization
The query optimizer selects the execution plan with the lowest
estimated cost among functionally equivalent forms.
• A relational algebra expression may have many equivalent
expressions, each of which gives rise to a different evaluation plan.
• π bala (σ bala>100 (Account)) and σ bala>100 (π bala (Account))
are equivalent queries, i.e. they produce the same result.
Amongst all equivalent evaluation plans, choose the one with the
lowest cost.
Execution plan
An internal representation of the query is then created, usually as a
tree data structure called a query tree.
The DBMS must then devise an execution strategy or plan for
retrieving the results of the query from the database files.
A query typically has many possible execution strategies, and the
process of choosing a suitable one for processing a query is known as
query optimization.
Evaluation
How does the database answer a query once it arrives?
The query-execution engine takes a query-evaluation plan,
executes that plan, and returns the answers to the query.
Relational Algebra: overview
Project (unary)
• π <attr list> (R)
• <attr list> is a list of attributes (columns) from R only
• Ex: π title, year, length (Movie), a “vertical restriction”
• The input R has n attributes A1, A2, …, An and i tuples; the result
has k attributes A1, A2, …, Ak and j tuples, with n ≥ k.
Project
PROJECT can produce many tuples with same value
 Relational algebra semantics says remove duplicates
 SQL does not -- one difference between formal and
actual query languages
Relational Algebra: Select
Select or Restrict
• σ <predicate> (R)
• <predicate> is a conditional expression of the type that we are
familiar with from conventional programming languages
  - <attribute> <op> <attribute>
  - <attribute> <op> <constant>
  - attribute ∈ R
  - op ∈ {=, ≠, <, >, ≤, …, AND, OR}
• Ex: σ length≥100 (Movie), a “horizontal restriction”
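Pictorially
Selection maps a relation with attributes A1, A2, …, An and i tuples to
a relation with the same attributes and j tuples, where i ≥ j.
Movie
title          year  length  filmType
Star Wars      1977  124     color
Mighty Ducks   1991  104     color
Wayne’s World  1992  95      color
The tuples of Movie that satisfy the predicate form the result set.
The fraction of tuples selected by a condition is referred to as the
selectivity of the condition.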
Cartesian Product
 R x S
 Sets of all pairs that can be formed by choosing the first
element of the pair to be any element of R, the second any
element of S.
 Resulting schema may be ambiguous
 Use R.A or S.A to disambiguate an attribute that occurs in
both schemas
Example
R
A  B
1  2
3  4

S
B  C   D
2  5   6
4  7   8
9  10  11

R x S
A  R.B  S.B  C   D
1  2    2    5   6
1  2    4    7   8
1  2    9    10  11
3  4    2    5   6
3  4    4    7   8
3  4    9    10  11
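A minimal sketch of the Cartesian product, using the R and S values from this example. Representing tuples as Python tuples and the helper name cartesian_product are assumptions made for illustration.

```python
from itertools import product

R = [(1, 2), (3, 4)]                      # schema (A, B)
S = [(2, 5, 6), (4, 7, 8), (9, 10, 11)]   # schema (B, C, D)

def cartesian_product(r, s):
    """Pair every tuple of r with every tuple of s."""
    return [t + u for t, u in product(r, s)]

RxS = cartesian_product(R, S)             # schema (A, R.B, S.B, C, D)
for row in RxS:
    print(row)
```

The result always has len(R) * len(S) tuples, which is why optimizers try to avoid evaluating a bare product.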
Join Operations
Natural Join (binary)
• R ⋈ S
• Match only those tuples from R and S that agree in whatever
attributes are common to the schemas of R and S
• If a tuple r from R and a tuple s from S are successfully paired, the
result is called a joined tuple
• This is the same join operation used in an earlier section to
recombine relations that had been projected onto two subsets of
their attributes (e.g., as a result of a BCNF decomposition)
Example
R
A  B
1  2
3  4

S
B  C   D
2  5   6
4  7   8
9  10  11

R ⋈ S
A  B  C  D
1  2  5  6
3  4  7  8
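The natural join in this example can be sketched as follows. The function natural_join and its schema arguments are hypothetical; it matches tuples on every attribute name the two schemas share and keeps one copy of each shared attribute.

```python
def natural_join(r, r_schema, s, s_schema):
    """Join r and s on all attributes common to both schemas."""
    common = [a for a in r_schema if a in s_schema]
    s_extra = [a for a in s_schema if a not in common]
    out_schema = list(r_schema) + s_extra
    result = []
    for t in r:
        t_row = dict(zip(r_schema, t))
        for u in s:
            u_row = dict(zip(s_schema, u))
            # Keep the pair only if it agrees on every common attribute.
            if all(t_row[a] == u_row[a] for a in common):
                result.append(t + tuple(u_row[a] for a in s_extra))
    return out_schema, result

schema, rows = natural_join(
    [(1, 2), (3, 4)], ("A", "B"),
    [(2, 5, 6), (4, 7, 8), (9, 10, 11)], ("B", "C", "D"),
)
print(schema, rows)
```

The S tuple (9, 10, 11) has no partner in R, so it does not appear in the result.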
Optimization
A relational algebra expression may have many equivalent expressions
• E.g., σ salary<75000 (π salary (instructor)) is equivalent to
π salary (σ salary<75000 (instructor))
Each relational algebra operation can be evaluated using one of several
different algorithms
• Correspondingly, a relational-algebra expression can be evaluated
in many ways.
• E.g., can use an index on salary to find instructors with salary <
75000,
• or can perform a complete relation scan and discard instructors
with salary ≥ 75000
Optimization….
Annotated expression specifying detailed evaluation strategy is called an
evaluation-plan.
Query Optimization: Amongst all equivalent evaluation plans choose the
one with lowest cost.
Cost is estimated using statistical information from the database catalog
e.g. number of tuples in each relation, size of tuples, etc.
 Total cost= CPU cost + I/O cost + communication cost
Three Key Concepts in QPO
1. Building blocks
• Most DBMSs have a few building blocks:
  - select (point query, range query), join, sorting, ...
• An SQL query is decomposed into building blocks
2. Query processing strategies for building blocks
• The DBMS keeps a few processing strategies for each building
block
  - e.g. a point query can be answered via an index or by scanning
the data file
3. Query optimization
• For each building block of a given query, the DBMS QPO tries
to choose
  - the “most efficient” strategy given database parameters
  - parameter examples: table size, available indices, …
  - e.g. index search is chosen for a point query if an index is
available
Query tree
Query tree: a tree data structure that corresponds to a
relational algebra expression. It represents the input
relations of the query as leaf nodes of the tree, and
represents the relational algebra operations as internal
nodes.
An execution of the query tree consists of executing an
internal node operation whenever its operands are available
and then replacing that internal node by the relation that
results from executing the operation.
Tree Representation of Relational Algebra
π balance (σ balance<2500 (account))
π balance
  σ balance<2500
    account
Making An Evaluation Plan
Annotate the query tree with evaluation instructions:
π balance
  σ balance<2500 (use index 1)
    account
The query can now be executed by the query execution engine.
Tree Representation of Relational Algebra
π A1,…,An (σ P (R1 x R2 x … x Rk))
π A1,…,An
  σ P
    x
      x
        x
          R1
          R2
        R3
      Rk
Why Learn about QPO?
Why learn about QPO in a DBMS?
 Identify performance bottleneck for a query
• is it the physical data model or QPO ?
 How to help QPO speed up processing of a query ?
• providing hints, rewriting query, etc.
 How to enhance physical data model to speed up
queries?
• add indices, change file- structures, …
Measures of Query Cost
Cost is generally measured as total elapsed time for answering
query
 Many factors contribute to time cost
• disk accesses, CPU, or even network communication
Typically disk access is the predominant cost, and is also
relatively easy to estimate. Measured by taking into account
 Number of seeks * average-seek-cost
 Number of blocks read * average-block-read-cost
 Number of blocks written * average-block-write-cost
• Cost to write a block is greater than cost to read a block
• data is read back after being written to ensure that the write
was successful
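The disk-cost estimate above can be written as a small function. The per-operation costs used in the example (in microseconds) are made-up values for illustration only, and disk_cost is a hypothetical name.

```python
def disk_cost(seeks, blocks_read, blocks_written,
              seek_cost, read_cost, write_cost):
    """Estimated disk cost: each count multiplied by its average cost."""
    return (seeks * seek_cost
            + blocks_read * read_cost
            + blocks_written * write_cost)

# Assumed figures: 4000 us per seek, 100 us per block read,
# 200 us per block write (a write costs more than a read).
scan_cost = disk_cost(seeks=1, blocks_read=100, blocks_written=0,
                      seek_cost=4000, read_cost=100, write_cost=200)
print(scan_cost)  # microseconds
```

With these assumed figures, one seek followed by 100 sequential block reads costs 14000 microseconds, most of it transfer time rather than seeking.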
Algorithms for select operations
Implementing the SELECT Operations
There are many algorithms for executing a select operation, which is
basically a search operation to locate the records in a disk file that
satisfy a certain condition.
Let us discuss the following relational operations.
• OP1: σ SSN=“123” (Employee)
• OP2: σ Dnumber>5 (department)
• OP3: σ Dno>5 (employee)
Search Methods for Simple Selection
S1.Linear search (brute force algorithm) : Retrieve every
record in the file, and test whether its attribute values
satisfy the selection condition.
S2. Binary search: If the selection condition involves an
equality comparison on a key attribute on which the file is
ordered, binary search—which is more efficient than linear
search—can be used.
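A minimal sketch of S1 and S2, assuming the file is an in-memory list of records ordered on the key attribute. The employee records and the function names are hypothetical.

```python
def linear_search(records, key, value):
    """S1: test every record against the selection condition."""
    return [r for r in records if r[key] == value]

def binary_search(sorted_records, key, value):
    """S2: equality search on the attribute the file is ordered on."""
    lo, hi = 0, len(sorted_records) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        current = sorted_records[mid][key]
        if current == value:
            return sorted_records[mid]
        if current < value:
            lo = mid + 1
        else:
            hi = mid - 1
    return None

# Hypothetical file ordered on ssn.
employees = [
    {"ssn": "111", "lname": "Abebe"},
    {"ssn": "123", "lname": "Smith"},
    {"ssn": "456", "lname": "Wong"},
]
print(linear_search(employees, "ssn", "123"))
print(binary_search(employees, "ssn", "123"))
```

Linear search inspects all n records, while binary search inspects about log2(n), but only works when the file is ordered on the search attribute.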
Search Methods for Simple Selection
S3. Using a primary index: If the selection condition involves
an equality comparison on a key attribute with a primary index,
for example Eid = ‘123’, use the primary index to retrieve the
record. Note that this condition retrieves a single record (at
most).
S4. Using a primary index to retrieve multiple records: If
the comparison condition is >, >=, <, or <= on a key field with a
primary index, for example deptname > 5, use the index to find
the record satisfying the corresponding equality condition
(deptname = 5), then retrieve all subsequent records in the
(ordered) file. For the condition deptname < 5, retrieve all the
preceding records.
Search Methods for Simple Selection
S5. Using a clustering index to retrieve multiple records: If
the selection condition involves an equality comparison on a
non-key attribute with a clustering index, for example DNO = 5,
use the index to retrieve all the records satisfying the condition.
S6. Using a secondary (B+-tree) index on an equality
comparison: This search method can be used to retrieve a single
record if the indexing field is a key (has unique values) or to
retrieve multiple records if the indexing field is not a key. This
can also be used for comparisons involving >, >=, <, or <=.
Sorting
Sorting enables efficient evaluation of many operations.
In SQL, sorted output is requested with the ORDER BY keyword:
• SELECT cid, name FROM student ORDER BY name
Implementations
• Internal sorting (if records fit in main memory)
• External sorting
Why Sort?
• A classic problem in computing
• Data requested in sorted order
  - e.g., find students in increasing gpa order
• Sorting is useful for eliminating duplicate copies in a collection of
records
• Problem: If a list is too large to fit in main memory, the time required to
access a data value on a disk or tape dominates any efficiency analysis.
E.g. sort 10 GB of data with 1 GB of RAM.
• Solution: Develop external sorting algorithms that minimize disk
accesses.
External Sorting
Refers to sorting algorithms that are suitable for large
files of records stored on disk that do not fit entirely in
main memory.
External sorting handles a massive amount of data. This data
may be too big to fit in RAM of the computer device for
sorting. So data reside on slower external memory.
The typical external sorting algorithm uses a sort-merge
strategy, which starts by sorting small sub files called runs.
Basic External Sorting Algorithm
• Assume unsorted data is on disk at start
• Let M = maximum number of records that can be stored and sorted in
internal memory at one time
Algorithm
Sort phase:
First divide the file into runs such that each run is small enough to
fit into main memory.
1. Read M records into main memory and sort them internally.
2. Write this sorted sub-list onto disk. (This is one “run”.)
Repeat until all data is processed into runs.
Merge phase:
1. Merge two runs into one sorted run, starting by reading the first block of each run.
2. Move the smallest remaining records to the output buffer block until it is full.
3. Write the output block back to disk.
4. When a block of a run is exhausted, read the next block of that run.
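The two phases can be sketched in Python under a simplifying assumption: Python lists stand in for disk files and runs, so block-level buffering is not modelled. The function names and the sample values are illustrative only.

```python
def make_runs(data, m):
    """Sort phase: cut the input into sorted runs of at most m records."""
    return [sorted(data[i:i + m]) for i in range(0, len(data), m)]

def merge_two(a, b):
    """Merge two sorted runs into one sorted run."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

def external_sort(data, m):
    """Merge phase: repeat 2-way merge passes until one run remains."""
    runs = make_runs(data, m)
    while len(runs) > 1:
        runs = [merge_two(runs[i], runs[i + 1]) if i + 1 < len(runs) else runs[i]
                for i in range(0, len(runs), 2)]
    return runs[0] if runs else []

data = [81, 94, 11, 96, 12, 35, 17, 99, 28, 58, 41, 75, 15]
print(make_runs(data, 3))
print(external_sort(data, 3))
```

Each pass of the while loop halves the number of runs, so about log2(number of runs) merge passes are needed after the sort phase.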
2-Way Sort: Requires 3 Buffers
Phase 1: PREPARE.
• Read a page, sort it, write it.
• Only one buffer page is used.
Phase 2, 3, …, etc.: MERGE:
• Three buffer pages are used: two input buffers (INPUT 1, INPUT 2)
and one output buffer (OUTPUT), all in main memory.
• Pages are read from disk into the input buffers, merged into the
output buffer, and written back to disk.
Basic External Sorting
Unsorted data on disk:
81 94 11 96 12 35 17 99 28 58 41 75 15
Assume M = 3 (M would actually be much larger, of course.)
First step is to read 3 data items at a time into main memory,
sort them and write them back to disk as runs of length 3.
11 81 94 | 12 35 96 | 17 28 99 | 41 58 75 | 15
Next step is to merge the runs of length 3 into runs of length 6.
11 12 35 81 94 96 | 17 28 41 58 75 99 | 15
Next step is to merge the runs of length 6 into runs of length 12.
11 12 17 28 35 41 58 75 81 94 96 99 | 15
Next step is to merge the runs of length 12 into runs of length 24. Here we
have less than 24, so we’re finished.
11 12 15 17 28 35 41 58 75 81 94 96 99
Example 2
18 20
19 14
11 12
16 13 21
17 15
Sort-Merge Example
File (12 records; memory holds 3 records at a time):
d 95, a 12, x 44, s 95, f 12, o 73, t 45, n 67, e 87, z 11, v 22, b 38
Sorted runs after the sort phase:
R1: a 12, d 95, x 44
R2: f 12, o 73, s 95
R3: e 87, n 67, t 45
R4: b 38, v 22, z 11
After the first merge pass:
a 12, d 95, f 12, o 73, s 95, x 44
b 38, e 87, n 67, t 45, v 22, z 11
After the second merge pass:
a 12, b 38, d 95, e 87, f 12, n 67, o 73, s 95, t 45, v 22, x 44, z 11
Implementing the join operation
The JOIN operation is one of the most time-consuming
operations in query processing.
Many of the join operations encountered in queries are of the
EQUIJOIN and NATURAL JOIN varieties.
The algorithms we consider are for join operations of the
form: R ⋈ A=B S
where A and B are domain-compatible attributes of R and S,
respectively.
Methods for Implementing Joins
J1. Nested-loop join (brute force): For each record t in R
(outer loop), retrieve every record s from S (inner loop) and
test whether the two records satisfy the join condition t[A]
= s[B].
J2. Single-loop join (using an access structure to retrieve
the matching records):
If an index (or hash key) exists for one of the two join
attributes—say, B of S—retrieve each record t in R, one at a
time (single loop), and then use the access structure to
retrieve directly all matching records s from S that satisfy
s[B] = t[A].
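J1 and J2 can be sketched as follows. The employee and department tuples are hypothetical sample data, and a Python dictionary built in memory stands in for the existing index (or hash key) on S.B that J2 assumes.

```python
from collections import defaultdict

def nested_loop_join(r, s, a, b):
    """J1: compare every record t of r with every record u of s."""
    return [(t, u) for t in r for u in s if t[a] == u[b]]

def single_loop_join(r, s, a, b):
    """J2: one loop over r, using an access structure on s[b]."""
    index = defaultdict(list)      # stands in for an existing index on S.B
    for u in s:
        index[u[b]].append(u)
    return [(t, u) for t in r for u in index.get(t[a], [])]

employee = [("Smith", 5), ("Wong", 4), ("Zelaya", 4)]                       # (lname, dno)
department = [(4, "Administration"), (5, "Research"), (1, "Headquarters")]  # (dnumber, dname)

print(nested_loop_join(employee, department, 1, 0))
print(single_loop_join(employee, department, 1, 0))
```

Both functions return the same pairs; the difference is that J1 performs len(r) * len(s) comparisons while J2 performs one lookup per record of r.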
Methods for Implementing Joins
J3. Sort–merge join: If the records of R and S are physically sorted by
value of the join attributes A and B, respectively, we can implement the
join in the most efficient way possible. Both files are scanned
concurrently in order of the join attributes, matching the records that
have the same values for A and B.
If the files are not sorted, they may be sorted first by using external
sorting. In this method, pairs of file blocks are copied into memory
buffers in order and the records of each file are scanned only once each
for matching with the other file, unless both A and B are non-key
attributes, in which case the method must be adapted to handle repeated
join values.
Methods for implementing joins
J4. Hash-join: The records of files R and S are both hashed to the
same hash file, using the same hashing function on the join attributes A
of R and B of S as hash keys.
First, a single pass through the file with fewer records (say, R) hashes
its records to the hash file buckets; this is called the partitioning
phase, since the records of R are partitioned into the hash buckets.
In the second phase, called the probing phase, a single pass through the
other file (S) then hashes each of its records to probe the appropriate
bucket, and that record is combined with all matching records from R in
that bucket.
This simplified description of hash-join assumes that the smaller of the
two files fits entirely into memory buckets after the first phase.
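A minimal in-memory sketch of the two phases of the simplified hash join described above. The bucket count, the use of Python's built-in hash, and the sample tuples are assumptions for illustration; r is taken to be the smaller file.

```python
def hash_join(r, s, a, b, n_buckets=4):
    """Join r and s on r[a] == s[b] using one shared hash function."""
    # Partitioning phase: hash the smaller file r into buckets.
    buckets = [[] for _ in range(n_buckets)]
    for t in r:
        buckets[hash(t[a]) % n_buckets].append(t)
    # Probing phase: hash each record of s and check only its bucket.
    result = []
    for u in s:
        for t in buckets[hash(u[b]) % n_buckets]:
            if t[a] == u[b]:          # different keys can share a bucket
                result.append((t, u))
    return result

department = [(4, "Administration"), (5, "Research"), (1, "Headquarters")]  # (dnumber, dname)
employee = [("Smith", 5), ("Wong", 4), ("Zelaya", 4)]                       # (lname, dno)

joined = hash_join(department, employee, 0, 1)
print(joined)
```

The equality test inside the probe loop is still required, because records with different join values may hash to the same bucket.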
methods of query optimization
There are two methods of query optimization.
1. Cost based Optimization (Physical)
This is based on the cost of the query. The query can use
different paths based on indexes, constraints, sorting
methods etc.
This method mainly uses the statistics like record size,
number of records, number of records per block, number of
blocks, table size, whether whole table fits in a block,
organization of tables, uniqueness of column values, size of
columns etc
methods of query optimization cont…
2. Rule based optimization:
Use heuristics, called query rewrite rules
 eliminate many of the really bad plans
 Rules that will improve performance with very high
probability
 Getting queries into a form that we know how to handle best
 This method creates relational tree for the given query
based on the equivalence rules.
Applying these equivalence rules yields alternative ways of
writing and evaluating the query, from which the better
evaluation path is chosen.
Query Rewrite Rules
• Transform one logical plan into another
• Do not use statistics
• Equivalences in relational algebra
• Push down predicates
• Apply projections early
• Avoid cross-products if possible
• Use left-deep trees
• Use constraints, e.g., uniqueness
Query Rewrite Rules
 First, move SELECT operations down the query tree
 Second, perform the more restrictive SELECT operations
first
 Third, replace CARTESIAN PRODUCT and SELECT
combinations with JOIN operations
 Finally, move PROJECT operations down the query tree
 This is called heuristic optimization
Example Query
Select B,D
From R,S
Where R.A = “c” AND R.C=S.C
Initial Logical Plan
Relational Algebra: π B,D [σ R.A=“c” ∧ R.C=S.C (R x S)]
Select B,D
From R,S
Where R.A = “c” AND R.C=S.C
Query tree:
π B,D
  σ R.A = “c” ∧ R.C = S.C
    x
      R
      S
Apply Rewrite Rule (1)
π B,D [σ R.C=S.C [σ R.A=“c” (R x S)]]
Split the conjunction into two select predicates; the order
doesn’t matter.
Before:
π B,D
  σ R.A = “c” ∧ R.C = S.C
    x
      R
      S
After:
π B,D
  σ R.C = S.C
    σ R.A = “c”
      x
        R
        S
Apply Rewrite Rule (2)
π B,D [σ R.C=S.C [σ R.A=“c” (R) x S]]
Push the selection R.A = “c” down to R.
Before:
π B,D
  σ R.C = S.C
    σ R.A = “c”
      x
        R
        S
After:
π B,D
  σ R.C = S.C
    x
      σ R.A = “c”
        R
      S
Apply Rewrite Rule (3)
π B,D [σ R.A=“c” (R) ⋈ R.C=S.C S]
Replace the selection R.C = S.C over the Cartesian product with a join.
Before:
π B,D
  σ R.C = S.C
    x
      σ R.A = “c”
        R
      S
After:
π B,D
  ⋈ R.C = S.C
    σ R.A = “c”
      R
    S
One idea
• How do we execute this query?
- Do Cartesian product
- Select tuples
- Do projection
Select B,D
From R,S
Where R.A = “c” AND S.E = 2 AND R.C=S.C
R
A  B  C
a  1  10
b  1  20
c  2  10
d  2  35
e  3  45

S
C   D  E
10  x  2
20  y  2
30  z  2
40  x  1
50  y  3

Answer
B  D
2  x

Select B,D
From R,S
Where R.A = “c” AND S.E = 2 AND R.C=S.C
An Example (cont.)
Plan 1
• Cross product of R & S
• Select tuples using WHERE conditions
• Project on B & D
Algebra expression
π B,D (σ R.A=‘c’ ∧ S.E=2 ∧ R.C=S.C (R x S))
Query tree:
π B,D
  σ R.A=‘c’ ∧ S.E=2 ∧ R.C=S.C
    x
      R
      S
R x S
R.A  R.B  R.C  S.C  S.D  S.E
a    1    10   10   x    2
a    1    10   20   y    2
.
.
c    2    10   10   x    2    (Found! Got one...)
.
.
Select B,D
From R,S
Where R.A = “c” AND S.E = 2 AND R.C=S.C
An Example (cont.)
Plan 2
• Select R tuples with R.A=“c”
• Select S tuples with S.E=2
• Natural join
• Project B & D
Algebra expression
π B,D (σ R.A=“c” (R) ⋈ σ S.E=2 (S))
Query tree:
π B,D
  ⋈
    σ R.A=‘c’
      R
    σ S.E=2
      S
Relational Algebra Primer
Select: σ R.A=“c” ∧ R.C=10
Project: π B,D
Cartesian Product: R x S
Natural Join: R ⋈ S
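Plan 1 and Plan 2 can be compared on the R and S sample data from this example. The sketch below is illustrative only: plan1 and plan2 are hypothetical function names, tuples are addressed by position, and the second value each function returns is just the number of tuple pairs it had to form.

```python
R = [("a", 1, 10), ("b", 1, 20), ("c", 2, 10), ("d", 2, 35), ("e", 3, 45)]   # (A, B, C)
S = [(10, "x", 2), (20, "y", 2), (30, "z", 2), (40, "x", 1), (50, "y", 3)]   # (C, D, E)

def plan1(r, s):
    """Cross product, then select, then project."""
    cross = [t + u for t in r for u in s]        # (A, B, R.C, S.C, D, E)
    selected = [row for row in cross
                if row[0] == "c" and row[5] == 2 and row[2] == row[3]]
    return [(row[1], row[4]) for row in selected], len(cross)

def plan2(r, s):
    """Select on each relation first, then join, then project."""
    r_sel = [t for t in r if t[0] == "c"]
    s_sel = [u for u in s if u[2] == 2]
    joined = [t + u for t in r_sel for u in s_sel if t[2] == u[0]]
    return [(row[1], row[4]) for row in joined], len(r_sel) * len(s_sel)

print(plan1(R, S))
print(plan2(R, S))
```

Both plans return the same answer, but Plan 1 forms 25 pairs while Plan 2 forms only 3, which is the motivation for pushing selections down the query tree.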
Another idea:
Plan II
π B,D
  ⋈ (natural join)
    σ R.A = “c”
      R(A,B,C)
    σ S.E = 2
      S(C,D,E)
Select B,D
From R,S
Where R.A = “c” AND S.E = 2 AND R.C=S.C
Query Evaluation
How to evaluate individual relational operation?
 Selection: find a subset of rows in a table
 Join: connecting tuples from two tables
 Other operations: union, projection, …
How to estimate cost of individual operation?
How does available buffer affect the cost?
How to evaluate a relational algebraic expression?
Algebraic Laws
Commutative and Associative Laws
• R ∪ S = S ∪ R, R ∪ (S ∪ T) = (R ∪ S) ∪ T
• R ∩ S = S ∩ R, R ∩ (S ∩ T) = (R ∩ S) ∩ T
Algebraic Laws
Laws involving selection:
• σ C AND C’ (R) = σ C (σ C’ (R)) = σ C (R) ∩ σ C’ (R)
• σ C OR C’ (R) = σ C (R) ∪ σ C’ (R)
• σ C (R ∪ S) = σ C (R) ∪ σ C (S)
When C involves only attributes of R:
• σ C (R ⋈ S) = σ C (R) ⋈ S
• σ C (R – S) = σ C (R) – S
• σ C (R ∩ S) = σ C (R) ∩ S
Transformation Rules for RA Operations
Conjunctive Selection operations can cascade into
individual Selection operations (and vice versa).
σ p∧q∧r (R) = σp (σq (σr (R)))
Sometimes referred to as cascade of Selection.
σ branchNo='B003' ∧ salary>15000 (Staff) =
σ branchNo='B003' (σ salary>15000 (Staff))
Transformation Rules for RA Operations
Commutativity of Selection.
σp (σq (R)) = σq (σp (R))
For example:
σ branchNo='B003' (σ salary>15000 (Staff)) =
σ salary>15000 (σ branchNo='B003' (Staff))
Disk Structure
Storing Data: Disks and Files
The Storage Hierarchy
– Main memory (RAM) for currently used data.
– Disk for the main database (secondary storage).
– Tapes for archiving older versions of the data (tertiary storage).
Moving up the hierarchy, storage gets smaller and faster; moving down,
it gets bigger and slower.
Disks and Files
DBMS stores information on disks.
This has major implications for DBMS design!
 READ: transfer data from disk to main memory (RAM).
 WRITE: transfer data from RAM to disk.
 Both are high-cost operations, relative to in-memory
operations, so must be planned carefully!
Components of a Disk
(Figure labels: platters, spindle, disk head, arm movement, arm
assembly, tracks, sector.)
• The arm assembly is moved in or out to position a head on a
desired track.
• Tracks under heads make a cylinder (imaginary!).
• Block size is a multiple of sector size (which is fixed).
Disks
 Secondary storage device of choice.
 Main advantage over tapes: random access vs. sequential.
 Data is stored and retrieved in units called disk blocks or
pages.
 Unlike RAM, time to retrieve a disk block varies depending
upon location on disk.
 Therefore, relative placement of blocks on disk has
major impact on DBMS performance!
Accessing a Disk Page
Time to access (read/write) a disk block:
• seek time (moving arms to position disk head on track)
• rotational delay (waiting for block to rotate under head)
• transfer time (actually moving data to/from disk surface)
Seek time and rotational delay dominate.
• Seek time varies between about 0.3 and 10 msec
• Rotational delay varies from 0 to 4 msec
• Transfer time is around 0.08 msec
Key to lower I/O cost: reduce seek/rotation delays!
Hardware vs. software solutions?
Index and index structure
An index is
• A mechanism for efficiently locating rows without having to
scan the entire table. Ex: the author catalog in a library
• Based on a search key: rows having a particular value for
the search key attributes can be quickly located.
• Candidate key: a set of attributes that guarantees uniqueness.
• Search key: a sequence of attributes that does not guarantee
uniqueness.
• An index minimizes the number of disk accesses required, so it is
a way of optimizing the performance of a database.
Structure of index
• Search key: an attribute or set of attributes used to look
up records in a file.
• An index file consists of records (called index entries) of
the form (search-key, pointer).
• Index files are typically much smaller than the original
file.
• Pointer: holds the address of the particular disk block where the
key value can be found.
• Two basic kinds of indices:
  - Ordered indices: search keys are stored in sorted order
  - Hash indices: search keys are distributed uniformly
across “buckets” using a “hash function”.
Index Evaluation Metrics
Access types supported efficiently.
• Equality searches: records with a specified value in an attribute.
• Range searches: records with an attribute value falling within a
specified range
Access time: time to find a particular record or set of records
Insertion time: time to insert a new record
Deletion time: time to delete a record
Space overhead: how many extra bytes are needed for the index
itself.
Classification of Indexing
• In an ordered index, index entries are stored sorted on the search
key value
  - E.g. author catalog in a library
• Primary index: in a sequentially ordered file, the index whose search
key specifies the sequential order of the actual file. Also called
clustering index.
  - An index entry is created for the first record of each block
  - Number of index entries = number of blocks
• Secondary index: an index whose search key specifies an order
different from the sequential order of the file. Also called
non-clustering index. Number of entries in the index file = number of
entries in the main file
Primary Dense Index Files
• Dense index: an index record appears for every search-key value in the
file.
• Lookups are faster, but more space is required to store the index itself.
• E.g. index on ID
Dense Index Files (Cont.)
Dense index on dept_name, with instructor file sorted on dept_name
The index does not hold a pointer to every record, only one pointer for
each search-key value.
Primary Sparse Index Files
• Sparse index: contains index records for only some search-key
values
• To locate a record with search-key value K:
  - Find the index record with the largest search-key value ≤ K
  - Search the file sequentially starting at the record to which the
index record points
• The index leads to the nearest preceding record; from there, follow
the file in order to reach the target.
Sparse Index Files (Cont.)
Compared to dense indices:
 Less space and less maintenance overhead for
insertions and deletions.
 Generally slower than dense index for locating
records.
Good tradeoff: sparse index with an index entry for every
block in file, corresponding to least search-key value in the
block.
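A minimal sketch of a sparse index with one entry per block, holding the least search-key value of that block. The block contents are hypothetical, and a list of lists stands in for the sequentially ordered file on disk.

```python
from bisect import bisect_right

# Hypothetical sequential file: three blocks, sorted on the search key.
blocks = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]

# One index entry per block: (least search-key value in block, block number).
sparse_index = [(block[0], i) for i, block in enumerate(blocks)]

def lookup(k):
    """Return (block number, offset) of key k, or None if it is absent."""
    keys = [key for key, _ in sparse_index]
    pos = bisect_right(keys, k) - 1       # largest index key <= k
    if pos < 0:
        return None
    _, block_no = sparse_index[pos]
    # Sequential scan inside the block the index entry points to.
    for offset, value in enumerate(blocks[block_no]):
        if value == k:
            return block_no, offset
    return None

print(lookup(50))
print(lookup(55))
```

The index holds 3 entries for 9 records, and each lookup touches exactly one data block, which is the tradeoff described above.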
Problems with simple indexes
• Ex: 100,000 entries
• If we create a dense index, the index itself will be very large
• If we create a sparse index, it may still have 50,000 entries.
• Solution: create multiple levels of sparse index
Multilevel Index
If primary index does not fit in memory, access becomes
expensive.
Solution: treat primary index kept on disk as a sequential
file and construct a sparse index on it.
 outer index – a sparse index of primary index
 inner index – the primary index file
 If even outer index is too large to fit in main memory, yet
another level of index can be created, and so on.
 Indices at all levels must be updated on insertion or
deletion from the file.
Multilevel Index B+-Tree Index Files
• All the data is stored in leaf nodes.
• Every leaf is at the same level.
• Internal nodes store just keys.
• All the leaves are linked to each other, like a linked list, for
faster access.
• Keys in internal nodes are used for directing a search to the proper leaf.
• There is a threshold level M = maximum number of elements in a node.
B+-Tree Node Structure
Typical node
 Ki are the search-key values
 Pi are pointers to children or pointers to records (for
leaf nodes).
The search-keys in a node are ordered
K1 < K2 < K3 < . . . < Kn–1
B+-Tree Node Structure
Advantage of B+-tree index files:
 Automatically reorganizes itself with small, local
changes, in the time of insertions and deletions.
 Reorganization of entire file is not required to maintain
performance.
Disadvantage of B+-trees:
• Extra insertion and deletion overhead, space overhead.
• Despite this overhead, B+-trees are used extensively.
B+-Tree Node Structure
Internal nodes:
 Internal (non-leaf) nodes contain at least ⌈n/2⌉
pointers, except the root node.
 At most, an internal node can contain n pointers
Leaf nodes:
 Leaf nodes contain at least ⌈n/2⌉ record pointers and
⌈n/2⌉ key values.
 At most, a leaf node can contain n record pointers
and n key values.
 Every leaf node contains one block pointer P to point to
next leaf node and forms a linked list
B+ Tree Insertion
B+ trees are filled from the bottom, and each entry is inserted at
a leaf node.
If a leaf node overflows:
• Split the node into two parts.
• Partition at i = ⌊(m + 1)/2⌋.
• The first i entries are stored in one node.
• The rest of the entries, from i + 1 onwards, are
moved to a new node.
• The ith key is duplicated at the parent of the
leaf.
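The leaf split can be sketched as below, where m is the maximum number of keys a leaf may hold and the overflowing leaf holds m + 1 sorted keys. One assumption is made explicit: the key copied to the parent is taken to be the first key of the new right-hand node, which is what the worked examples later in these slides show (75 for the keys 50, 75, 100 with M = 2). The function name split_leaf is hypothetical.

```python
def split_leaf(keys, m):
    """Split an overflowing leaf of m + 1 sorted keys.

    Returns (left keys, right keys, key copied up to the parent).
    """
    if len(keys) != m + 1:
        raise ValueError("a leaf overflows only when it holds m + 1 keys")
    i = (m + 1) // 2                  # partition point, floor((m + 1) / 2)
    left, right = keys[:i], keys[i:]
    # Assumption: the separator copied up is the first key of the new node.
    return left, right, right[0]

print(split_leaf([50, 75, 100], m=2))
print(split_leaf([50, 55, 60, 65, 70], m=4))
```

Because the separator is copied rather than moved, it remains in the leaf level, so every key can still be found by scanning the linked leaves.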
B+ Tree Insertion
If a non-leaf node overflows:
• Split the node into two parts.
• Partition the node at i = ⌈(m + 1)/2⌉.
• Entries up to i are kept in one node.
• The rest of the entries are moved to a new node.
B+ Tree node structure
A B
Values <A Values>=A &&<B
3-95
B+ Tree Deletion
• B+ tree entries are deleted at the leaf nodes.
• The target entry is searched for and deleted.
  - If the key also appears in an internal node, delete it there and
replace it with the entry from the left position.
• After deletion, underflow is tested.
  - If underflow occurs, redistribute entries from the node to its left.
• If redistribution from the left is not possible,
  - redistribute from the node to its right.
• If redistribution is not possible from the left or from the right,
  - merge the node with the node to its left or right.
Rules for Constructing a B+ Tree
• M is the maximum number of key values in a node.
• The number of key values contained in a non-leaf node is 1 less
than the number of pointers.
• So if n is the order of the tree, the maximum number of keys is M = n-1.
• If the tree has order n, the number of occupied key values in a
leaf node must be between (n-1)/2 and n-1.
• If (n-1)/2 is not an integer, round up to determine the minimum
number of occupied key values.
• The tree must be balanced, that is, every path from the root
node to a leaf must have the same length.
Rules for Constructing a B+ Tree
Example n=5
Node Min Node Max Node Min Node Max Node
Root 1 1
Internal
node
N/2 N 3 5
Leaf node N/2 N-1 3 4
3-98
Searching
Searching does not change the structure of a B+ tree, so simply
compare the search key with the keys in the tree while descending
from the root, then return the result.
For example: find the values 45 and 15 in the tree below.
Searching
Result:
1. For the value 45: not found.
2. For the value 15: return the position that the
pointer refers to.
Insertion
• Inserting a value into a B+ tree may unbalance the tree,
so rearrange the tree if needed.
• Example #1: insert 28 into the tree below.
The affected leaf then holds 25 28 30.
Insertion
Result:
3-102
Insertion
 Example #2: insert 70 into below tree
3-103
Insertion
 Process: split the tree
50 55 60 65 70
50 55 60 65 70
104
Insertion
Result: chose the middle key 60, and place it
in the index page between 50 and 75.
3-105
Insertion
Exercise: add a key value 95 to the below
tree.
75 80 85 90 95
25 50 60 75 85
75 80 85 90 95
3-106
Insertion
Result: again put the middle key 60 to the
index page and rearrange the tree.
3-107
Deletion
• As with insertion, the tree has to be rearranged if the
result of the deletion violates the rules of the B+ tree.
• Example #1: delete 70 from the tree
The leaf then holds 60 65. This is OK.
Deletion
 Result:
3-109
Deletion
Example #2: delete 25 from below tree, but 25
appears in the index page.
28 30
But…
This is
OK. 3-110
Deletion
 Result: replace 28 in the index page.
Add 28
3-111
Deletion
 Example #3: delete 60 from the below tree
65
50 55 65 3-112
Deletion
 Result: delete 60 from the index page and
combine the rest of index pages.
3-113
Example 1
 Create B +Tree for the following data.
 Insert: 50,75,100,120
M=2
3-114
Insert 50, 75
a: * 50 * 75 *
Insert 100
Node a splits, creating 2 children: b and c
a: * 75 *
b: * 50 *    c: * 75 * 100 *
Insert 120
Root: * 75 * 100 *
Leaves: * 50 *  |  * 75 *  |  * 100 * 120 *
Exercise 1
 B+Tree of order n=4 then key=3
 Insert: 2,4,7,10,17,21,28
3-118
Exercise 1
 B+Tree of order n=6
 Insert: 10,20,30,40,50,60,70,80,90
M=5
3-119
Exercise 2
 B+Tree of order n=4
 Insert: 20,15,5,1,3,9,2
M=3
3-120
Exercise 3
 B+ Tree of order 3
 Insert: 5,8,4,16,23,10,15
M=2
3-121
Conclusion
 For a B+ Tree:
 It is easy to maintain its balance.
 The search time is shorter than in most other types of
trees.
3-122
Query Execution Plans
An execution plan for a relational algebra query consists of
a combination of the relational algebra query tree and
information about the access methods to be used for each
relation as well as the methods to be used in computing the
relational operators stored in the tree.
• Materialized evaluation: the result of an operation is
stored as a temporary relation.
• Pipelined evaluation: as the result of an operator is
produced, it is forwarded to the next operator in
sequence.
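The difference between the two evaluation styles can be sketched in Python, with lists standing in for temporary relations and generators standing in for a pipeline. The rows, column names and function names are hypothetical, and a real DBMS would write a materialized result to disk rather than keep it in a list.

```python
rows = [
    {"name": "Abel", "coursename": "Advanced DBs"},
    {"name": "Sara", "coursename": "Networks"},
    {"name": "Hana", "coursename": "Advanced DBs"},
]

def wanted(row):
    return row["coursename"] == "Advanced DBs"

# Materialized evaluation: each operator builds its whole result first.
def select_materialized(relation, predicate):
    return [row for row in relation if predicate(row)]

def project_materialized(relation, column):
    return [row[column] for row in relation]

# Pipelined evaluation: each operator forwards one tuple at a time.
def select_pipelined(relation, predicate):
    for row in relation:
        if predicate(row):
            yield row

def project_pipelined(relation, column):
    for row in relation:
        yield row[column]

temp = select_materialized(rows, wanted)           # temporary relation
names_materialized = project_materialized(temp, "name")

pipeline = project_pipelined(select_pipelined(rows, wanted), "name")
names_pipelined = list(pipeline)                   # no temporary relation

print(names_materialized)  # ['Abel', 'Hana']
print(names_pipelined)     # ['Abel', 'Hana']
```

Both strategies return the same answer; they differ only in whether the intermediate result of the selection is stored before the projection starts.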
3-123
124
Evaluation
 evaluate multiple operations in a plan
 materialization
 pipelining
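[Figure: annotated query tree. student and takes are joined on cid (hash join); that result is joined with course on courseid (index-nested loop); the tree also contains σcoursename=Advanced DBs and πname.]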
125
Materialization
 create and read temporary relations
 create implies writing to disk
 more page writes
[Figure: the annotated query tree over student, takes, and course (hash join on cid, index-nested loop join on courseid, σcoursename=Advanced DBs, πname).]
126
Pipelining (1/2)
 creating a pipeline of operations
 reduces number of read-write operations
 implementations
 demand-driven - data pull
 producer-driven - data push
[Figure: the annotated query tree over student, takes, and course (hash join on cid, index-nested loop join on courseid, σcoursename=Advanced DBs, πname).]
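A producer-driven (push) pipeline can be sketched with callbacks: the scan at the bottom pushes every tuple upward, and each operator hands its output to the next operator immediately. This is only an illustration; the rows and the helper names make_select and make_project are hypothetical. A demand-driven (pull) pipeline would instead have the top operator request tuples one at a time, for example through Python generators.

```python
rows = [
    {"name": "Abel", "coursename": "Advanced DBs"},
    {"name": "Sara", "coursename": "Networks"},
    {"name": "Hana", "coursename": "Advanced DBs"},
]

def make_project(column, consumer):
    """Build a projection operator that pushes row[column] to consumer."""
    def on_row(row):
        consumer(row[column])
    return on_row

def make_select(predicate, consumer):
    """Build a selection operator that pushes matching rows to consumer."""
    def on_row(row):
        if predicate(row):
            consumer(row)
    return on_row

results = []
# Assemble the plan top-down: select -> project -> result list.
plan = make_select(
    lambda row: row["coursename"] == "Advanced DBs",
    make_project("name", results.append),
)

# The producer (a table scan here) drives the pipeline by pushing tuples.
for row in rows:
    plan(row)

print(results)  # ['Abel', 'Hana']
```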
127
Pipelining (2/2)
 can pipelining always be used?
 any algorithm?
 cost of R ⋈ S
 materialization and hash join: B_R + 3(B_R + B_S)
 pipelining and indexed nested loop join: N_R * HT_i
[Figure: query tree with σcoursename=Advanced DBs and joins on cid and courseid over student, takes, and course; inputs are labelled pipelined or materialized, next to a generic join R ⋈ S.]
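The two cost formulas can be compared numerically. In this sketch B_R and B_S are assumed to be the numbers of blocks of R and S, N_R the number of tuples of R, and HT_i the height of the index used by the indexed nested loop join; the slides do not define these symbols, and all the numbers below are made up for illustration.

```python
def materialized_hash_join_cost(b_r, b_s):
    """Cost of materialization followed by a hash join: B_R + 3(B_R + B_S)."""
    return b_r + 3 * (b_r + b_s)

def pipelined_index_nl_join_cost(n_r, ht_i):
    """Cost of a pipelined indexed nested loop join: N_R * HT_i."""
    return n_r * ht_i

# Hypothetical statistics: R has 100 blocks, S has 400 blocks, index height 3.
hash_cost = materialized_hash_join_cost(100, 400)

# With many tuples arriving from R, the pipelined plan costs more.
many = pipelined_index_nl_join_cost(2000, 3)

# With few tuples arriving from R, the pipelined plan costs less.
few = pipelined_index_nl_join_cost(200, 3)

print(hash_cost, many, few)  # 1600 6000 600
```

Under these assumed numbers neither plan wins in every case, which is the point of the question on the slide: pipelining restricts the choice of join algorithm, and the cheaper plan depends on how many tuples flow through the pipeline.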

Query optimization and processing for advanced database systems

  • 2. 2 Introduction Query Processing  Activities involved in retrieving data from the database.  This includes translation of high –level queries into low level expressions that can be used at physical level of the file system, query optimization and actual execution of the query to get the result.
  • 3. 3 Query Processing… Aims of query processing (QP):  Transform query written in high-level language (e.g., SQL), into correct and efficient execution strategy expressed in low-level language that implements relational algebra (RA);  Execute strategy to retrieve required data.
  • 4. Basic Steps in Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation 3-4
  • 5. Parsing and translation  Scanner: The scanner specifies and recognizes the language tokens such as SQL Keywords, attribute names, and relation names in the text of the query.  Parser: The parser checks the query syntax to determine whether it is formulated according to the syntax rules of the query language.  Validation: The query must be validated by checking that all attributes and relation names are valid and semantically meaningful names in the schema of the particular database being queried. 3-5
  • 6. Parsing and translation  Query is converted to relational algebra by SQL interpreter.  Relational Algebra converted to annotated tree, joins as branches  Each operator has implementation choices. 6
  • 7. 7 Translating SQL Queries into Relational Algebra Query block: The basic unit that can be translated into the algebraic operators and optimized. A query block contains a single SELECT-FROM-WHERE expression, as well as GROUP BY and HAVING clause if these are part of the block. Nested queries: Within a query are identified as separate query blocks. Aggregate operators in SQL must be included in the extended algebra.
  • 8. Translation Example Possible SQL Query:  SELECT balance FROM account WHERE balance<2500 Possible Relational Algebra Query:  balance(balance<2500(account)) 3-8
  • 9. 9 Translating SQL Queries into Relational Algebra Consider: to find names of employees making more than everyone in department 5. SELECT lname, fname FROM employee WHERE salary > ( SELECT MAX(salary) FROM employee WHERE dno=5)
  • 10. 10 Translating SQL Queries into Relational Algebra 2 query blocks: SELECT lname, fname FROM employee WHERE salary > constant SELECT MAX(salary) FROM employee WHERE dno=5 Relational Algebra: π lname, fname (σsalary>cons (employee)) where cons is the result from: π MAX Salary (σdno=5(employee))
  • 11. 11 Translating SQL Queries into Relational Algebra consider: to find names of employees making more than everyone in department 5. SELECT lname,fname, dname FROM employee e, department d WHERE e.dno=d.dno Relational Algebra: π lname, fname (employee ⋈e.dno=d.dno department)
  • 12. Optimization The query optimizer selects an execution plan that has lowest and fastest but functionally equivalent form.  A relational algebra expression may have many equivalent expressions, each of which gives rise to a different evaluation plan.  Bala( bala>100(Account))   bala>100(Bala (Account)) both are equivalent query i.e. they display the same results. Amongst all equivalent evaluation plans choose the one with lowest cost. 3-12
  • 13. 13 Execution plan An internal representation of the query is then created, usually as a tree data structure called a query tree. The DBMS must then devise an execution strategy or plan for retrieving the results of the query from the database files. A query typically has many possible execution strategies, and the process of choosing a suitable one for processing a query is known as query optimization.
  • 14. Evaluation When the query came how the database answer it? The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query. 3-14
  • 15. 15 Relational Algebra: overview Project (unary)  <attr list> (R)  <attr list> is a list of attributes (columns) from R only  Ex: title, year, length (Movie) “horizontal restriction” A1 A2 A3 … An ... i A1 A2… Ak ... j  n K, n≥k
  • 16. 16 Project PROJECT can produce many tuples with same value  Relational algebra semantics says remove duplicates  SQL does not -- one difference between formal and actual query languages
  • 17. 17 Relational Algebra: Select Select or Restrict  <predicate> (R)  <predicate> is a conditional expression of the type that we are familiar with from conventional programming languages  <attribute> <op> <attribute>  <attribute> <op> <constant>  attribute in R  op  {=,,<,>,, …, AND, OR}  Ex: length100 (Movie) vertical restriction
  • 18. 18 Pictorially A1 A2 A3 … An ... i A1 A2 A3 … An ... j, i  j  title year length filmType Star Wars Mighty Ducks Wayne’s World 1977 1991 1992 124 104 95 color color color Movie result set # of selected tuples is referred to as the selectivity of the condition
  • 19. 19 Cartesian Product  R x S  Sets of all pairs that can be formed by choosing the first element of the pair to be any element of R, the second any element of S.  Resulting schema may be ambiguous  Use R.A or S.A to disambiguate an attribute that occurs in both schemas
  • 20. 20 Example A B 1 2 3 4 B C 2 5 4 7 D 6 8 9 10 11 x A R.BS.B C D R S 1 2 2 5 6 1 2 4 7 8 1 2 9 10 11 3 4 3 4 3 4 2 5 6 4 7 8 9 10 11
  • 21. 21 Join Operations Natural Join (binary)  R join S  Match only those tuples from R and S that agree in whatever attributes are common to the schemas of R and S  If r and s from r(R) and s(S) are successfully paired, result is called a joined tuple  This join operation is the same we used in earlier section to recombine relations that had been projected onto two subsets of their attributes (e.g., as a result of a BCNF decomposition)
  • 22. 22 Example A B 1 2 3 4 B C 2 5 4 7 D 6 8 9 10 11 join A B C D R S 1 2 5 6 3 4 7 8
  • 23. Optimization A relational algebra expression may have many equivalent expressions E.g.,salary75000(salary(instructor)) is equivalent to salary(salary75000(instructor)) Each relational algebra operation can be evaluated using one of several different algorithms  Correspondingly, a relational-algebra expression can be evaluated in many ways.  E.g., can use an index on salary to find instructors with salary < 75000,  or can perform complete relation scan and discard instructors with salary  75000 3-23
  • 24. Optimization…. Annotated expression specifying detailed evaluation strategy is called an evaluation-plan. Query Optimization: Amongst all equivalent evaluation plans choose the one with lowest cost. Cost is estimated using statistical information from the database catalog e.g. number of tuples in each relation, size of tuples, etc.  Total cost= CPU cost + I/O cost + communication cost 3-24
  • 25. Three Key Concepts in QPO 1. Building blocks  Similarly, most DBMS have few building blocks: • select (point query, range query), join, sorting, ...  SQL query is decomposed in building blocks 2. Query processing strategies for building blocks  DBMS keeps a few processing strategies for each building block • e.g. a point query can be answer via an index or via scanning data-file 3. Query optimization  For each building block of a given query, DBMS QPO tries to choose • “most efficient” strategy given database parameters • parameter examples: table size, available indices, … • ex. index search is chosen for a point query if the index is available 3-25
  • 26. Query tree Query tree: a tree data structure that corresponds to a relational algebra expression. It represents the input relations of the query as leaf nodes of the tree, and represents the relational algebra operations as internal nodes. An execution of the query tree consists of executing an internal node operation whenever its operands are available and then replacing that internal node by the relation that results from executing the operation. 3-26
  • 27. Tree Representation of Relational Algebra balancebalance<2500(account)) balance balance<2500 account 3-27
  • 28. Making An Evaluation Plan Annotate Query Tree with evaluation instructions: The query can now be executed by the query execution engine. balance balance<2500 account use index 1 3-28
  • 29. Tree Representation of Relational Algebra A1,,,,Anp( R1 x,….Rk)) A1,,,An P x x x R3 R2 Rk R1 3-29
  • 30. Why Learn about QPO? Why learn about QPO in a DBMS?  Identify performance bottleneck for a query • is it the physical data model or QPO ?  How to help QPO speed up processing of a query ? • providing hints, rewriting query, etc.  How to enhance physical data model to speed up queries? • add indices, change file- structures, … 3-30
  • 31. Measures of Query Cost Cost is generally measured as total elapsed time for answering query  Many factors contribute to time cost • disk accesses, CPU, or even network communication Typically disk access is the predominant cost, and is also relatively easy to estimate. Measured by taking into account  Number of seeks * average-seek-cost  Number of blocks read * average-block-read-cost  Number of blocks written * average-block-write-cost • Cost to write a block is greater than cost to read a block • data is read back after being written to ensure that the write was successful 3-31
  • 32. 32 Algorithms for select operations Implementing the SELECT Operations There are many algorithms for executing a select operation , which is basically a search operation to locate the records in a disk file that satisfy a certain condition. Let as discuss on the ff relational operations.  OP1: SSN=“123” (Employee)  OP2: Dnumber>5 (department)  OP3: Dno>5 (employee)
  • 33. 33 Search Methods for Simple Selection S1.Linear search (brute force algorithm) : Retrieve every record in the file, and test whether its attribute values satisfy the selection condition. S2. Binary search: If the selection condition involves an equality comparison on a key attribute on which the file is ordered, binary search—which is more efficient than linear search—can be used.
  • 34. 34 Search Methods for Simple Selection S3. Using a primary index : If the selection condition involves an equality comparison on a key attribute with a primary index. for example, Eid = ‘123’ use the primary index to retrieve the record. Note that this condition retrieves a single record (at most). S4.Using a primary index to retrieve multiple records: If the comparison condition is >,>=,<, or <= on a key field with a primary index—for example, deptname > 5 use the index to find the record satisfying the corresponding equality condition (deptname= 5), then retrieve all subsequent records in the (ordered) file. For the condition deptname < 5, retrieve all the
  • 35. Search Methods for Simple Selection S5. Using a clustering index to retrieve multiple records: If the selection condition involves an equality comparison on a non- key attribute with a clustering index—for example, DNO = 5 is use the index to retrieve all the records satisfying the condition. S6. Using a secondary ( B+ tree) index on an equality comparison: This search method can be used to retrieve a single record if the indexing field is a key (has unique values) or to retrieve multiple records if the indexing field is not a key. This can also be used for comparisons involving >, >=, <, or <=. 3-35
  • 36. 36 Sorting Efficient evaluation for many operations Sorting uses keyboard order by:  SELECT cid,name FROM student ORDER BY name Implementations  Internal sorting (if records fit in main memory)  External sorting
  • 37. Why Sort?  A classic problem in computing  Data requested in sorted order  e.g., find students in increasing gpa order  Sorting is useful for eliminating duplicate copies in a collection of records  Problem: If a list is too large to fit in main memory, the time required to access a data value on a disk or tape dominates any efficiency analysis. E.g sort 10GB of data with 1GB of RAM.  Solution: Develop external sorting algorithms that minimize disk accesses. 3-37
  • 38. 38 External Sorting Refers to sorting algorithms that are suitable for large files of records stored on disk that do not fit entirely in main memory. External sorting handles a massive amount of data. This data may be too big to fit in RAM of the computer device for sorting. So data reside on slower external memory. The typical external sorting algorithm uses a sort-merge strategy, which starts by sorting small sub files called runs.
  • 39. 39 Basic External Sorting Algorithm  Assume unsorted data is on disk at start  Let M = maximum number of records that can be stored & sorted in internal memory at one time Algorithm Sort phase: First divide the file into runs such that the size of runs small enough to fit into main memory. 1. Read M records into main memory & sort internally. 2. Write this sorted sub-list onto disk. (This is one “run”). Until all data is processed into runs Merge phase: 1. Merge two runs into one sorted run by reading the first block of runs. 2. Pass first recodes to buffer blocks till buffer block is full 3. Write this output back to disk 4. When a block of a run is exhausted next block of the run is read
  • 40. 2-Way Sort: Requires 3 Buffers Phase 1: PREPARE.  Read a page, sort it, write it.  only one buffer page is used Phase 2, 3, …, etc.: MERGE:  Three buffer pages used. Main memory buffers INPUT 1 INPUT 2 OUTPUT Disk Disk Disk input Main memory Disk 1 buffer 1 buffer 1 buffer 3-40
  • 41. 41 Basic External Sorting 11 96 12 35 17 99 28 58 41 75 15 94 81 Unsorted Data on Disk Assume M = 3 (M would actually be much larger, of course.) First step is to read 3 data items at a time into main memory, sort them and write them back to disk as runs of length 3. 11 94 81 96 12 35 17 99 28 58 41 75 15
  • 42. 42 Basic External Sorting Next step is to merge the runs of length 3 into runs of length 6. 11 94 81 96 12 35 17 99 28 58 41 75 15 11 94 81 96 12 35 17 99 28 58 41 75 15
  • 43. 43 Basic External Sorting Next step is to merge the runs of length 6 into runs of length 12. 11 94 81 96 12 35 17 99 28 58 41 75 15 15 11 94 81 96 12 35 17 99 28 58 41 75
  • 44. 44 Basic External Sorting Next step is to merge the runs of length 12 into runs of length 24. Here we have less than 24, so we’re finished. 11 94 81 96 12 35 17 99 28 58 41 75 15 11 94 81 96 12 35 17 99 28 58 41 75 15
  • 45. 45 Example 2 18 20 19 14 11 12 16 13 21 17 15
  • 46. 46 Sort-Merge Example d 95 a 12 x 44 s 95 f 12 o 73 t 45 n 67 e 87 z 11 v 22 b 38 file memory t 45 n 67 e 87 z 11 v 22 b 38 d 95 a 12 x 44 a 12 d 95 x 44 R1 f 12 o 73 s 95 R2 e 87 n 67 t 45 R3 b 38 v 22 z 11 R4 a 12 d 95 x 44 s 95 f 12 o 73 run pass pass v 22 t 45 s 95 z 11 x 44 o 73 a 12 b 38 n 67 f 12 d 95 e 87
  • 47. Implementing the join operation The JOIN operation is one of the most time-consuming operations in query processing. Many of the join operations encountered in queries are of the EQUIJOIN and NATURAL JOIN varieties. The algorithms we consider are for join operations of the form :R joinA=B S Where A and B are domain-compatible attributes of R and S, respectively 3-47
  • 48. Methods for Implementing Joins J1. Nested-loop join (brute force): For each record t in R (outer loop), retrieve every record s from S (inner loop) and test whether the two records satisfy the join condition t[A] = s[B]. J2. Single-loop join (using an access structure to retrieve the matching records): If an index (or hash key) exists for one of the two join attributes—say, B of S—retrieve each record t in R, one at a time (single loop), and then use the access structure to retrieve directly all matching records s from S that satisfy s[B] = t[A]. 3-48
  • 49. Methods for Implementing Joins J3. Sort–merge join: If the records of R and S are physically sorted by value of the join attributes A and B, respectively, we can implement the join in the most efficient way possible, in order of the join attributes A and B. If the files are not sorted, they may be sorted first by using external sorting. In this method, pairs of file blocks are copied into memory buffers in order and the records of each file are scanned only once each for matching with the other file. unless both A and B are non-key attributes, in which case the method needs indexing. R(i) to refer to the record in R.A variation of the sort-merge join. 3-49
  • 50. Methods for implementing joins J4. Hash-join: The records of files R and S are both hashed to the same hash file, using the same hashing function on the join attributes A of R and B of S as hash keys. First, a single pass through the file with fewer records (say, R) hashes its records to the hash file buckets; this is called the partitioning phase, since the records of R are partitioned into the hash buckets. In the second phase, called the probing phase, a single pass through the other file (S) then hashes each of its records to probe the appropriate bucket, and that record is combined with all matching records from R in that bucket. This simplified description of hash-join assumes that the smaller of the two files fits entirely into memory buckets after the first phase. 3-50
  • 51. methods of query optimization There are two methods of query optimization. 1. Cost based Optimization (Physical) This is based on the cost of the query. The query can use different paths based on indexes, constraints, sorting methods etc. This method mainly uses the statistics like record size, number of records, number of records per block, number of blocks, table size, whether whole table fits in a block, organization of tables, uniqueness of column values, size of columns etc 3-51
  • 52. methods of query optimization cont… 2. Rule based optimization: Use heuristics, called query rewrite rules  eliminate many of the really bad plans  Rules that will improve performance with very high probability  Getting queries into a form that we know how to handle best  This method creates relational tree for the given query based on the equivalence rules. When these equivalence rules provide an alternative way of writing and evaluating the query, gives the better path to evaluate the query. 3-52
  • 53. Query Rewrite Rules  Transform one logical plan into another  Do not use statistics  Equivalences in relational algebra  Push-down predicates  Write projects early  Avoid cross-products if possible  Use left-deep trees  Use of constraints, e.g., uniqueness 3-53
  • 54. Query Rewrite Rules  First, move SELECT operations down the query tree  Second, perform the more restrictive SELECT operations first  Third, replace CARTESIAN PRODUCT and SELECT combinations with JOIN operations  Finally, move PROJECT operations down the query tree  This is called heuristic optimization 3-54
  • 55. Example Query Select B,D From R,S Where R.A = “c”  R.C=S.C 3-55
  • 56. Initial Logical Plan Relational Algebra: B,D [ R.A=“c” R.C = S.C (RXS)] Select B,D From R,S Where R.A = “c”  R.C=S.C B,D R.A = “c” Λ R.C = S.C X R S 3-56
  • 57. Apply Rewrite Rule (1) B,D [ R.C=S.C [R.A=“c”(R X S)]] Split the conjunction into two select predicates, the order doesn’t matter B,D R.A = “c” Λ R.C = S.C X R S B,D R.A = “c” X R S R.C = S.C 3-57
  • 58. Apply Rewrite Rule (2) B,D [ R.C=S.C [R.A=“c”(R)] X S] B,D R.A = “c” X R S R.C = S.C B,D R.A = “c” X R S R.C = S.C 3-58
  • 59. Apply Rewrite Rule (2) B,D [ R.C=S.C [R.A=“c”(R)] X S] B,D R.A = “c” R S R.C = S.C B,D R.A = “c” X R S R.C = S.C 3-59
  • 60. • How do we execute this query? - Do Cartesian product - Select tuples - Do projection One idea Select B,D From R,S Where R.A = “c”  S.E = 2  R.C=S.C 3-60
  • 61. R A B C S C D E a 1 10 10 x 2 b 1 20 20 y 2 c 2 10 30 z 2 d 2 35 40 x 1 e 3 45 50 y 3 Answer B D 2 x Select B,D From R,S Where R.A = “c”  S.E = 2  R.C=S.C 3-61
  • 62. 62 An Example (cont.) Plan 1  Cross product of R & S  Select tuples using WHERE conditions  Project on B & D Algebra expression B,D R.A=‘c’ S.E=2 R.C=S.C  R S B,D(R.A=‘c’ S.E=2 R.C=S.C (R S))
  • 63. R X S R.A R.B R.C S.C S.D S.E a 1 10 10 x 2 a 1 10 20 y 2 . . c 2 10 10 x 2 . . Found! Got one... Select B,D From R,S Where R.A = “c”  S.E = 2  R.C=S.C 3-63
  • 64. 64 An Example (cont.) Plan 2  Select R tuples with R.A=“c”  Select S tuples with S.E=2  Natural join  Project B & D Algebra expression B,D S.E=2 R S R.A=‘c’ B,D( R.A=“c” (R) S.E=2 (S))
  • 65. Relational Algebra Primer Select: R.A=“c” R.C=10 Project: B,D Cartesian Product: R X S Natural Join: R S 3-65
  • 66. Another idea: B,D R.A = “c” S.E = 2 R(A,B,C) S(C,D,E) Plan II natural join Select B,D From R,S Where R.A = “c”  S.E = 2  R.C=S.C 3-66
  • 67. 67 Query Evaluation How to evaluate individual relational operation?  Selection: find a subset of rows in a table  Join: connecting tuples from two tables  Other operations: union, projection, … How to estimate cost of individual operation? How does available buffer affect the cost? How to evaluate a relational algebraic expression?
  • 68. Algebraic Laws Commutative and Associative Laws  R U S = S U R, R U (S U T) = (R U S) U T  R ∩ S = S ∩ R, R ∩ (S ∩ T) = (R ∩ S) ∩ T 3-68
  • 69. Algebraic Laws Laws involving selection:   C AND C’(R) =  C( C’(R)) =  C(R) ∩  C’(R)   C OR C’(R) =  C(R) U  C’(R)   C (R U S) =  C (R) U  C (S) • When C involves only attributes of R   C (R S) =  C (R) S   C (R – S) =  C (R) – S   C (R ∩ S) =  C (R) ∩ S    3-69
  • 70. Transformation Rules for RA Operations Conjunctive Selection operations can cascade into individual Selection operations (and vice versa). pqr(R) = p(q(r(R))) Sometimes referred to as cascade of Selection. branchNo='B003'  salary>15000(Staff) = branchNo='B003'(salary>15000(Staff)) 70 3-70
  • 71. Transformation Rules for RA Operations Commutativity of Selection. p(q(R)) = q(p(R)) For example: branchNo='B003'(salary>15000(Staff)) = salary>15000(branchNo='B003'(Staff)) 71 3-71
  • 72. Disk Structure Storing Data: Disks and Files 72 3-72
  • 73. The Storage Hierarchy –Main memory (RAM) for currently used data. –Disk for the main database (secondary storage). –Tapes for archiving older versions of the data (tertiary storage). Smaller, Faster Bigger, Slower 3-73
  • 74. Disks and Files DBMS stores information on disks. This has major implications for DBMS design!  READ: transfer data from disk to main memory (RAM).  WRITE: transfer data from RAM to disk.  Both are high-cost operations, relative to in-memory operations, so must be planned carefully! 3-74
  • 75. Components of a Disk Platters Spindle The arm assembly is moved in or out to position a head on a desired track. Tracks under heads make a cylinder (imaginary!). Disk head Arm movement Arm assembly Tracks Sector Block size is a multiple of sector size (which is fixed). 75
  • 76. Disks. Secondary storage device of choice. Main advantage over tapes: random access vs. sequential access. Data is stored and retrieved in units called disk blocks or pages. Unlike RAM, the time to retrieve a disk block varies depending on its location on disk. Therefore, the relative placement of blocks on disk has a major impact on DBMS performance!
  • 77. Accessing a Disk Page. Time to access (read/write) a disk block = seek time (moving the arm to position the disk head on the track) + rotational delay (waiting for the block to rotate under the head) + transfer time (actually moving data to/from the disk surface). Seek time and rotational delay dominate: seek time varies between about 0.3 and 10 msec; rotational delay varies from 0 to 4 msec; transfer time is around 0.08 msec. Key to lower I/O cost: reduce seek/rotation delays! Hardware vs. software solutions?
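The access-time breakdown can be turned into a small calculation. The sketch below uses the ranges quoted on the slide. The function names, the mid-range figures (5 msec seek, 2 msec rotation), and the assumption that a sequential read pays seek and rotational delay only once are illustrative simplifications, not measurements.

```python
def access_time_ms(seek_ms, rotation_ms, transfer_ms):
    """Time to read or write one block: seek + rotational delay + transfer."""
    return seek_ms + rotation_ms + transfer_ms

def random_read_ms(n_blocks, seek_ms, rotation_ms, transfer_ms):
    """Blocks scattered over the disk: every block pays all three costs."""
    return n_blocks * access_time_ms(seek_ms, rotation_ms, transfer_ms)

def sequential_read_ms(n_blocks, seek_ms, rotation_ms, transfer_ms):
    """Adjacent blocks (simplified): one seek and one rotational delay in total."""
    return seek_ms + rotation_ms + n_blocks * transfer_ms

best = access_time_ms(0.3, 0.0, 0.08)    # shortest seek, no rotational wait
worst = access_time_ms(10.0, 4.0, 0.08)  # longest seek and rotational wait

# Assumed mid-range values for reading 1000 blocks.
scattered = random_read_ms(1000, 5.0, 2.0, 0.08)
adjacent = sequential_read_ms(1000, 5.0, 2.0, 0.08)
```

With these assumed figures the scattered read costs about 7080 msec and the adjacent read about 87 msec, which is why block placement matters so much.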
  • 78. Index and index structure. An index is a mechanism for efficiently locating rows without having to scan the entire table (example: the author catalog in a library). It is based on a search key: rows having a particular value for the search-key attributes can be located quickly. Candidate key: a set of attributes that guarantees uniqueness. Search key: a sequence of attributes that does not guarantee uniqueness. An index minimizes the number of disk accesses required, so it is a way of optimizing database performance.
  • 79. Structure of index. Search key: an attribute or set of attributes used to look up records in a file. An index file consists of records (called index entries) of the form (search-key, pointer). Index files are typically much smaller than the original file. Pointer: holds the address of the particular disk block where the key value can be found. Two basic kinds of indices: ordered indices, where search keys are stored in sorted order, and hash indices, where search keys are distributed uniformly across “buckets” using a “hash function”.
  • 80. Index Evaluation Metrics. Access types supported efficiently: equality searches (records with a specified value in an attribute) and range searches (records with an attribute value falling within a specified range). Access time: time to find a record in the file. Insertion time: time to insert a new record. Deletion time: time to delete a record. Space overhead: how much extra space is needed for the index itself.
  • 81. Classification of Indexing. In an ordered index, index entries are stored sorted on the search-key value (e.g., the author catalog in a library). Primary index: in a sequentially ordered file, the index whose search key specifies the sequential order of the actual file. Also called clustering index. An index entry is created for the first record of each block, so number of index entries = number of blocks. Secondary index: an index whose search key specifies an order different from the sequential order of the file. Also called non-clustering index. Number of entries in the index file = number of entries in the main file.
  • 82. Primary Dense Index Files. Dense index: an index record appears for every search-key value in the file, that is, there is an entry for every possible search-key value. Lookups are faster, but more space is required to store the index itself. E.g., an index on ID.
  • 83. Dense Index Files (Cont.) Dense index on dept_name, with the instructor file sorted on dept_name. The index does not hold a pointer to every record, only one pointer per search-key value.
  • 84. Primary Sparse Index Files. Sparse index: contains index records for only some search-key values. To locate a record with search-key value K: find the index record with the largest search-key value ≤ K, then search the file sequentially starting at the record to which that index record points. In other words, follow the pointer to the nearest preceding record and scan forward from there.
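The sparse-index lookup can be sketched with a binary search over the index followed by a short sequential scan. This is an illustrative sketch only: it assumes one index entry per block (the first key of each block), unique keys, and made-up data, and the name `sparse_lookup` is hypothetical.

```python
from bisect import bisect_right

def sparse_lookup(index_keys, blocks, k):
    """Find the record with key k using a sparse index.

    index_keys[i] is the smallest key stored in blocks[i]; both are sorted.
    """
    # Largest index entry whose key is <= k.
    pos = bisect_right(index_keys, k) - 1
    if pos < 0:
        return None  # k is smaller than every key in the file
    # Sequential scan starting at the record the index entry points to.
    for key, record in blocks[pos]:
        if key == k:
            return record
        if key > k:
            break  # passed the place where k would be
    return None

# Hypothetical sorted file split into blocks of (key, record) pairs.
blocks = [
    [(10, "a"), (20, "b")],
    [(30, "c"), (40, "d")],
    [(50, "e"), (60, "f")],
]
index_keys = [block[0][0] for block in blocks]  # [10, 30, 50]

found = sparse_lookup(index_keys, blocks, 40)
missing = sparse_lookup(index_keys, blocks, 35)
```

The index here holds three entries for six records. A dense index would hold six, but it could answer the lookup without scanning a block.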
  • 85. Sparse Index Files (Cont.) Compared to dense indices: less space and less maintenance overhead for insertions and deletions, but generally slower than a dense index for locating records. Good tradeoff: a sparse index with an index entry for every block in the file, corresponding to the least search-key value in the block.
  • 86. Problems with simple indexes. Example: a file with 100,000 entries. If we create a dense index, the index will be very large. If we create a sparse index, we may still have 50,000 index entries. Solution: create multiple levels of sparse index.
  • 87. Multilevel Index. If the primary index does not fit in memory, access becomes expensive. Solution: treat the primary index kept on disk as a sequential file and construct a sparse index on it. Outer index: a sparse index of the primary index. Inner index: the primary index file. If even the outer index is too large to fit in main memory, yet another level of index can be created, and so on. Indices at all levels must be updated on insertion or deletion from the file.
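How many levels a multilevel index needs can be estimated by repeatedly building a sparse index (one entry per block) on the level below until a single block remains. The sketch below assumes a hypothetical capacity of 100 index entries per block and reuses the 100,000-entry figure from the earlier slide. The function name is illustrative.

```python
def index_levels(num_entries, entries_per_block):
    """Return the number of blocks at each index level, innermost first."""
    if entries_per_block < 2:
        raise ValueError("each block must hold at least two entries")
    levels = []
    n = num_entries
    while True:
        blocks = -(-n // entries_per_block)  # ceiling division
        levels.append(blocks)
        if blocks == 1:
            break  # this level fits in a single block
        n = blocks  # the next level has one entry per block of this level
    return levels

levels = index_levels(100_000, 100)
```

Under these assumptions the inner index occupies 1000 blocks, the outer index 10 blocks, and a third level fits in one block, so a lookup reads three index blocks instead of searching 1000.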
  • 89. Multilevel Index: B+-Tree Index Files. All the data is stored in the leaf nodes. Every leaf is at the same level. Internal nodes store only keys. All the leaves are linked to each other, like a linked list, for faster access. Keys are used to direct a search to the proper leaf. There is a threshold M = maximum number of elements in a node.
  • 90. B+-Tree Node Structure. Typical node: (P1, K1, P2, …, Pn–1, Kn–1, Pn). Ki are the search-key values. Pi are pointers to children (for non-leaf nodes) or pointers to records (for leaf nodes). The search keys in a node are ordered: K1 < K2 < K3 < . . . < Kn–1.
  • 91. B+-Tree Node Structure. Advantages of B+-tree index files: the tree automatically reorganizes itself with small, local changes in the face of insertions and deletions, and reorganization of the entire file is not required to maintain performance. Disadvantages of B+-trees: extra insertion and deletion overhead, and space overhead. Despite these costs, B+-trees are used extensively.
  • 92. B+-Tree Node Structure. Internal nodes: internal (non-leaf) nodes contain at least ⌈n/2⌉ pointers, except the root node; an internal node can contain at most n pointers. Leaf nodes: leaf nodes contain at least ⌈n/2⌉ record pointers and ⌈n/2⌉ key values; a leaf node can contain at most n record pointers and n key values. Every leaf node contains one block pointer P that points to the next leaf node, so the leaves form a linked list.
  • 93. B+ Tree Insertion. B+ trees are filled from the bottom, and each entry is inserted at a leaf node. If a leaf node overflows: split the node into two parts; partition at i = ⌊(m + 1)/2⌋; the first i entries are stored in one node; the rest of the entries, from i + 1 onwards, are moved to a new node; the ith key is duplicated at the parent of the leaf.
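The leaf split can be sketched as a small function. One interpretation is assumed here and should be read as such: the first i entries stay in the old node, and the key copied to the parent is the first key of the new node (the key at 0-based position i). This reading is chosen because it reproduces the deck's later example, where the leaf 50 55 60 65 70 is split and 60 goes to the index page. The name `split_leaf` is hypothetical.

```python
def split_leaf(keys, m):
    """Split an overfull, sorted leaf that may hold at most m keys.

    Returns (left_keys, right_keys, separator), where separator is the key
    copied up to the parent. The separator stays in the right leaf, because
    in a B+ tree all data lives in the leaves.
    """
    if len(keys) <= m:
        raise ValueError("leaf is not overfull; no split needed")
    i = (m + 1) // 2            # partition point, floor((m + 1) / 2)
    left, right = keys[:i], keys[i:]
    return left, right, right[0]

# Leaf with capacity 4 after inserting 70.
left, right, separator = split_leaf([50, 55, 60, 65, 70], m=4)
```

With m = 2 and keys 50, 75, 100, the same function yields leaves 50 and 75 100 with separator 75.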
  • 94. B+ Tree Insertion. If a non-leaf node overflows: split the node into two parts; partition the node at i = ⌈(m + 1)/2⌉; entries up to i are kept in one node; the rest of the entries are moved to a new node.
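  • 95. B+ Tree node structure. For a node with keys A and B: the pointer to the left of A leads to values < A, and the pointer between A and B leads to values ≥ A and < B.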
  • 96. B+ Tree Deletion. B+ tree entries are deleted at the leaf nodes. The target entry is searched for and deleted. If the key also appears in an internal node, delete it there and replace it with the entry from the left position. After deletion, the node is tested for underflow. If underflow occurs, redistribute entries from the node to its left. If redistribution from the left is not possible, redistribute from the node to its right. If redistribution is not possible from either side, merge the node with a neighboring node.
  • 97. Rules for Constructing a B+ Tree. M is the maximum number of key values in a node. The number of key values contained in a non-leaf node is 1 less than the number of pointers, so if n is the order, the maximum number of keys is M = n − 1. If the tree has order n, the number of occupied key values in a leaf node must be between (n − 1)/2 and n − 1. If (n − 1)/2 is not an integer, round up to determine the minimum number of occupied key values. The tree must be balanced, that is, every path from the root node to a leaf must have the same length.
  • 98. Rules for Constructing a B+ Tree. Example with n = 5 (columns: node type, min, max, min for n = 5, max for n = 5). Root: 1, 1. Internal node: N/2, N, 3, 5. Leaf node: N/2, N − 1, 3, 4.
  • 99. Searching. Searching does not change the structure of a B+ tree, so we simply compare the search key with the keys in the tree and return the result. For example: find the values 45 and 15 in the tree below.
  • 100. Searching. Result: 1. For the value 45: not found. 2. For the value 15: return the position that the pointer refers to.
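A B+ tree search can be sketched as a descent from the root to one leaf. The tree in the slides is a figure that did not survive extraction, so the tree below is hypothetical. The `Node` class and `search` function are illustrative names, and the routing rule (keys < A go left, keys ≥ A go right) follows the node-structure slide.

```python
from bisect import bisect_right
from dataclasses import dataclass, field

@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)  # empty for a leaf

def search(root, key):
    """Return True if key is stored in a leaf of the tree rooted at root."""
    node = root
    while node.children:
        # Child i holds keys in [keys[i-1], keys[i]); bisect_right picks it.
        node = node.children[bisect_right(node.keys, key)]
    return key in node.keys

# Hypothetical two-level tree (not the tree from the slides).
tree = Node(
    keys=[25, 50],
    children=[
        Node(keys=[5, 10, 15]),
        Node(keys=[25, 28, 30]),
        Node(keys=[50, 55, 60]),
    ],
)

found_15 = search(tree, 15)
found_45 = search(tree, 45)
```

In this tree 15 is found in the first leaf and 45 is not found, and neither search modifies any node.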
  • 101. Insertion. Inserting a value into a B+ tree may unbalance the tree, so rearrange the tree if needed. Example #1: insert 28 into the tree below. (Figure label: leaf 25 28 30.)
  • 103. Insertion. Example #2: insert 70 into the tree below.
  • 104. Insertion. Process: split the overfull leaf. (Figure: the leaf 50 55 60 65 70 is split in two.)
  • 105. Insertion. Result: choose the middle key 60 and place it in the index page between 50 and 75.
  • 106. Insertion. Exercise: add the key value 95 to the tree below. (Figure labels: leaf 75 80 85 90 95; index page 25 50 60 75 85.)
  • 107. Insertion. Result: again move the middle key 60 up to the index page and rearrange the tree.
  • 108. Deletion. As with insertion, the tree has to be rebuilt if the result of the deletion violates the rules of a B+ tree. Example #1: delete 70 from the tree. The leaf that remains, 60 65, is OK.
  • 110. Deletion. Example #2: delete 25 from the tree below. The leaf that remains, 28 30, is OK, but 25 also appears in the index page.
  • 111. Deletion. Result: replace 25 with 28 in the index page.
  • 112. Deletion. Example #3: delete 60 from the tree below. (Figure labels: 65; 50 55 65.)
  • 113. Deletion. Result: delete 60 from the index page and combine the remaining index pages.
  • 114. Example 1. Create a B+ tree for the following data, with M = 2. Insert: 50, 75, 100, 120.
  • 116. Insert 100. Node a splits, creating 2 children: b and c. (Figure labels: a = * 75 *; b = * 50 *; c = * 75 * 100 *.)
  • 117. Insert 120. (Figure labels: * 75 * 100 *; * 50 *; * 75 *; * 100 * 120 *.)
  • 118. Exercise 1. B+ tree of order n = 4, so at most 3 keys per node. Insert: 2, 4, 7, 10, 17, 21, 28.
  • 119. Exercise 1. B+ tree of order n = 6 (M = 5). Insert: 10, 20, 30, 40, 50, 60, 70, 80, 90.
  • 120. Exercise 2. B+ tree of order n = 4 (M = 3). Insert: 20, 15, 5, 1, 3, 9, 2.
  • 121. Exercise 3. B+ tree of order 3 (M = 2). Insert: 5, 8, 4, 16, 23, 10, 15.
  • 122. Conclusion. For a B+ tree: it is easy to maintain its balance, and search time is shorter than in most other types of trees.
  • 123. Query Execution Plans. An execution plan for a relational algebra query consists of a combination of the relational algebra query tree and information about the access methods to be used for each relation, as well as the methods to be used in computing the relational operators stored in the tree. Materialized evaluation: the result of an operation is stored as a temporary relation. Pipelined evaluation: as the result of an operator is produced, it is forwarded to the next operator in sequence.
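  • 124. Evaluation. How do we evaluate multiple operations in a plan? Two options: materialization and pipelining. (Figure: example plan with π_name on top of an index-nested-loop join on courseid, whose inputs are a hash join of student and takes on cid, and σ_{coursename='Advanced DBs'}(course).)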
  • 125. Materialization. Create and read temporary relations. Creating a temporary relation implies writing it to disk, which means more page writes.
  • 126. Pipelining (1/2). Create a pipeline of operations, which reduces the number of read-write operations. Implementations: demand-driven (data pull) and producer-driven (data push).
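Demand-driven (pull) pipelining maps naturally onto Python generators: each operator asks its child for the next tuple only when its own consumer asks for one. The sketch below is a toy illustration, not how any particular DBMS implements it. The operator functions, the `pulled` log, and the `course` rows are all hypothetical.

```python
from itertools import islice

pulled = []  # log of base tuples actually read, to show on-demand behavior

def scan(rows):
    """Leaf operator: hand out one base tuple at a time."""
    for row in rows:
        pulled.append(row["cid"])
        yield row

def select(pred, child):
    """sigma: pass on only the tuples that satisfy pred."""
    for row in child:
        if pred(row):
            yield row

def project(attrs, child):
    """pi: keep only the listed attributes."""
    for row in child:
        yield {a: row[a] for a in attrs}

# Hypothetical course relation.
course = [
    {"cid": 1, "coursename": "Networks"},
    {"cid": 2, "coursename": "Advanced DBs"},
    {"cid": 3, "coursename": "Compilers"},
]

plan = project(
    ["cid"],
    select(lambda r: r["coursename"] == "Advanced DBs", scan(course)),
)

# Pull a single result tuple from the top of the pipeline.
first = list(islice(plan, 1))
```

No temporary relation is created between the operators, and after one result has been pulled only the first two base tuples have been read. A materialized plan would instead build the full output of each operator before the next one starts.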
  • 127. Pipelining (2/2). Can pipelining always be used? With any algorithm? Cost of R ⋈ S: with materialization and hash join, B_R + 3(B_R + B_S); with pipelining and indexed nested loop join, N_R × HT_i. (Figure: the example plan with its pipelined and materialized inputs marked.)
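The two cost formulas can be compared numerically. The slides do not define the symbols, so the sketch assumes the usual reading: B_R and B_S are the numbers of blocks of R and S, N_R is the number of tuples of R, and HT_i is the height of the index on S. The input sizes are invented for illustration.

```python
def materialized_hash_join_cost(b_r, b_s):
    """B_R + 3(B_R + B_S), as given on the slide (assumed unit: block I/Os)."""
    return b_r + 3 * (b_r + b_s)

def pipelined_index_nl_join_cost(n_r, ht_i):
    """N_R * HT_i: one index traversal of height HT_i per tuple of R."""
    return n_r * ht_i

# Hypothetical sizes: R has 100 blocks, S has 400 blocks, index height 3.
hash_cost = materialized_hash_join_cost(b_r=100, b_s=400)

# Few qualifying tuples in R versus many.
inl_small = pipelined_index_nl_join_cost(n_r=50, ht_i=3)
inl_large = pipelined_index_nl_join_cost(n_r=1000, ht_i=3)
```

Under these assumptions the pipelined plan is far cheaper when R contributes 50 tuples (150 versus 1600) and more expensive when it contributes 1000 (3000 versus 1600). Pipelining is therefore not automatically the better choice.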