Overview of Query Evaluation
- The system catalog is used to find the best way to evaluate a query.
- SQL queries are translated into an extended form of relational algebra.
- Queries are composed of several operators, and the algorithms for the individual operators can be combined in many ways to evaluate a query.
System catalogs in Oracle
- Called the data dictionary.
- Access is allowed through views.
- Categories (used as a prefix): USER, ALL, DBA
- Tables: ALL_CATALOG, _TAB_COLUMNS, _TABLES, _INDEXES, _VIEWS (the last four take one of the prefixes above, e.g. USER_TABLES)

Examples of system catalog queries
  SELECT * FROM all_catalog WHERE owner = 'SMITH';
  SELECT table_name, column_name FROM user_tab_columns WHERE table_name = 'EMPLOYEE';
  SELECT num_rows, blocks, empty_blocks FROM user_tables WHERE table_name = 'EMPLOYEE';
  SELECT view_name, text FROM user_views;
  SELECT * FROM user_constraints;
  SELECT constraint_type FROM user_constraints WHERE table_name = 'STUD';

Query optimization
- A strength of relational query languages is the wide variety of ways in which a user can express a query and the system can evaluate it.
- Because queries can be written so flexibly, good or bad performance depends greatly on the quality of the query optimizer.
- Queries are parsed and then presented to the query optimizer, which is responsible for identifying an efficient execution plan.
- The optimizer generates alternative plans and chooses the plan with the least estimated cost.
- A query is essentially treated as a σ-П-join algebra expression, with the remaining operations carried out on the result of that expression.
- Query optimization is the process of identifying the access plan with the minimum cost, where cost = time taken to get all the answers.
- Starting with System R, most DBMSs use the same approach: generate most of the access plans and select the cheapest one.
- First, how do we determine the cost of a plan? Then, how long is this process going to take, and how do we make it faster?

Query evaluation
Alternative ways of evaluating a given query:
- Equivalent expressions
- Different algorithms for each operation

Query execution cost
- Query execution cost is usually a weighted sum of the I/O cost (number of disk accesses) and the CPU cost (msec): w * IO_COST + CPU_COST
- Basic idea: the cost of an operator depends on input data size, data distribution, and physical layout.
- The optimizer uses statistics about the relations to estimate the cost.
- Statistics are needed on base relations and on intermediate results.

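As a rough illustration of the weighted sum w * IO_COST + CPU_COST, the sketch below compares two plans. The plan names, the access counts, the CPU times, and the weight W are all made-up values for illustration, not figures from any real optimizer.

```python
def weighted_cost(io_accesses: int, cpu_msec: float, w: float) -> float:
    """Return w * IO_COST + CPU_COST for one plan."""
    return w * io_accesses + cpu_msec

# Hypothetical plans: (number of disk accesses, CPU time in msec).
plans = {"full_scan": (1000, 50.0), "index_lookup": (12, 5.0)}
W = 10.0  # assumed weight: one disk access counts as much as 10 msec of CPU

costs = {name: weighted_cost(io, cpu, W) for name, (io, cpu) in plans.items()}
cheapest = min(costs, key=costs.get)
print(costs, cheapest)
```

With these assumed numbers the I/O term dominates, which is the usual reason optimizers focus on reducing page accesses.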
CPU costing model for query
Platform: Oracle, DB version: 9.2
The formula for the cost of a query (using the CPU costing model) is:
  Cost = (#SRds * sreadtim + #MRds * mreadtim + #CPUCycles / cpuspeed) / sreadtim
where:
  #SRds = number of single-block reads
  #MRds = number of multiblock reads
  #CPUCycles = number of CPU cycles
  sreadtim = single-block read time
  mreadtim = multiblock read time
  cpuspeed = standard 'Oracle' CPU cycles per second
In words: the cost is the time spent on single-block reads, plus the time spent on multiblock reads, plus the CPU time required, all divided by the time it takes to do a single-block read.
This means that the cost of a query is the PREDICTED EXECUTION TIME, counted in number of single-block read times; the single-block read time is effectively the unit of measure of the cost.

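The formula above can be evaluated directly. The sketch below is only a worked example: the input values are invented, and it assumes all times are in milliseconds and that cpuspeed is given in cycles per millisecond so the units agree.

```python
def cpu_model_cost(srds, mrds, cpu_cycles, sreadtim, mreadtim, cpuspeed):
    """Cost in units of single-block read time, following the slide's formula."""
    io_time = srds * sreadtim + mrds * mreadtim   # time spent reading blocks
    cpu_time = cpu_cycles / cpuspeed              # time spent on the CPU
    return (io_time + cpu_time) / sreadtim

# Assumed inputs (not measured): 100 single-block reads at 5 ms,
# 20 multiblock reads at 10 ms, 1,000,000 cycles at 500,000 cycles/ms.
cost = cpu_model_cost(srds=100, mrds=20, cpu_cycles=1_000_000,
                      sreadtim=5.0, mreadtim=10.0, cpuspeed=500_000)
print(cost)  # (500 + 200 + 2) / 5
```

Dividing by sreadtim is what turns the predicted time into a count of single-block reads.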
Query evaluation plan
- It consists of an extended relational algebra tree, with information at each node indicating the access method to use for each table and the implementation method to use for each relational operator.
- Consider the query:
    SELECT S.sname FROM Reserves R, Sailors S WHERE R.sid = S.sid AND R.bid = 100 AND S.rating > 5;
- In relational algebra it can be expressed as:
    П sname (σ bid=100 ∧ rating>5 (Reserves ⋈ sid=sid Sailors))
(draw diag.)

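To make the algebra expression concrete, here is a minimal sketch that evaluates it bottom-up on tiny in-memory tables. The sample rows are invented, and a simple nested-loops join is assumed; a real plan would annotate each node with its chosen access method and join algorithm.

```python
# Hypothetical sample data for Sailors and Reserves.
sailors = [
    {"sid": 1, "sname": "Dustin", "rating": 7},
    {"sid": 2, "sname": "Lubber", "rating": 3},
    {"sid": 3, "sname": "Rusty", "rating": 9},
]
reserves = [
    {"sid": 1, "bid": 100},
    {"sid": 2, "bid": 100},
    {"sid": 3, "bid": 101},
]

def join(left, right):
    """Nested-loops join on sid (Reserves joined with Sailors)."""
    for r in left:
        for s in right:
            if r["sid"] == s["sid"]:
                yield {**r, **s}

def select(rows):
    """Selection: bid = 100 AND rating > 5."""
    return (t for t in rows if t["bid"] == 100 and t["rating"] > 5)

def project(rows):
    """Projection on sname."""
    return [t["sname"] for t in rows]

result = project(select(join(reserves, sailors)))
print(result)
```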
Query processing
A query is processed in 3 phases:
- Parsing: the DBMS parses the SQL query and chooses the most efficient access/execution plan.
- Execution: the DBMS executes the SQL query using the chosen execution plan.
- Fetching: the DBMS fetches the data and sends the result set back to the client.
The processing of DDL is different from that of DML: for DDL, the DBMS actually updates the data dictionary tables (system catalog), while DML manipulates end-user data.

SQL parsing phase
- The optimization process includes breaking down (parsing) the query into smaller units and transforming the original query into a slightly different version of the original SQL code.
- The transformed SQL query is fully equivalent and more efficient.
  - Fully equivalent means the optimized query's results are always the same as those of the original query.
  - More efficient means the optimized query will almost always execute faster than the original query.
- Parsing activities are performed by the query optimizer. The SQL query is:
  - validated for syntax compliance
  - validated against the data dictionary to ensure that table and column names are correct
  - validated against the data dictionary to ensure that the user has proper access permissions
  - analyzed and decomposed into more atomic components
  - prepared for execution by determining the most efficient execution plan

SQL parsing example
The following operations are performed during parsing.
- Validate the syntax of the statement: is the query a valid SQL statement?
    SQL> select nothing where 1=2;
    select nothing where 1=2
                   *
    ERROR at line 1:
    ORA-00923: FROM keyword not found where expected
- Validate the semantics of the statement: are the objects valid? Is there any ambiguity? Does the constant fit into the column?
    SQL> select col from not_existent_table;
    select col from not_existent_table
                    *
    ERROR at line 1:
    ORA-00942: table or view does not exist
- Search in the shared pool:
  - Is the query text already known (search among all the query texts)? If not, continue with the remaining steps (a hard parse).
  - Does the query reference the same objects (search among all versions of the query)? If not, continue with the remaining steps.
  - Is the execution environment identical (same search)? If yes, execute the query.
- Allocate memory in the shared pool to store the data about the query.
- Get the values of the bind variables and check that all values fit in the columns.

Parsing example (contd.)
  SQL> var v varchar2(20);
  SQL> exec :v := '12345678901'
  PL/SQL procedure successfully completed.
  SQL> insert into michel.t values (:v);
  insert into michel.t values (:v)
                               *
  ERROR at line 1:
  ORA-12899: value too large for column "MICHEL"."T"."COL" (actual: 11, maximum: 10)
- Optimize the query execution.
- Build the parse tree and the execution plan in a format that the SQL engine can use; this is named row source generation.
- Store the parse tree and the execution plan in the shared pool.

Parsing and execution
- Once the SQL statement is transformed, the DBMS creates what is commonly known as an access/execution plan.
- The access/execution plan contains the series of steps the DBMS will use to execute the query and return the result set in the most efficient way.
- SQL execution: all I/O operations indicated in the access plan are executed. When the execution plan is run, the proper locks are acquired for the data to be accessed, and the data is then retrieved from the data files and placed in the DBMS data cache.
- SQL fetching: after the parsing and execution phases are completed, all rows that match the specified conditions have been retrieved, sorted, and grouped and/or aggregated. In the fetching phase, the rows of the resulting query result set are returned to the client. During this phase, the DBMS may use temporary table space to store temporary data.

Query evaluation plan
An evaluation plan defines exactly what algorithm is used for each operation, and how the execution of the operations is coordinated.

Cost-based query optimization
- The cost difference between evaluation plans for a query can be enormous, e.g. seconds vs. days in some cases.
- Steps in cost-based query optimization:
  1. Generate logically equivalent expressions using equivalence rules.
  2. Annotate the resultant expressions to get alternative query plans.
  3. Choose the cheapest plan based on estimated cost.
- Estimation of plan cost is based on:
  - Statistical information about relations, e.g. number of tuples, number of distinct values for an attribute.
  - Statistics estimation for intermediate results, to compute the cost of complex expressions.
  - Cost formulae for algorithms, computed using statistics.

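The three steps can be sketched for the simplest kind of alternative, the order in which three relations are joined. Everything in this sketch is a simplifying assumption for illustration: the relation names and cardinalities, the single fixed JOIN_SELECTIVITY, and a cost model that just sums the estimated sizes of the intermediate results.

```python
from itertools import permutations

# Hypothetical statistics: number of tuples per relation.
cardinality = {"A": 1000, "B": 10, "C": 100}
JOIN_SELECTIVITY = 0.01  # assumed fraction of tuple pairs that match in any join

def estimated_cost(order):
    """Sum of estimated intermediate result sizes for a left-deep join order."""
    size = cardinality[order[0]]
    cost = 0.0
    for name in order[1:]:
        size = size * cardinality[name] * JOIN_SELECTIVITY  # estimated result size
        cost += size
    return cost

# Step 1 and 2: enumerate the equivalent join orders (the alternative plans).
plans = list(permutations(sorted(cardinality)))
# Step 3: choose the plan with the lowest estimated cost.
best_plan = min(plans, key=estimated_cost)
print(best_plan, estimated_cost(best_plan))
```

Under these assumptions the cheapest plans join the two smallest relations first, which keeps the intermediate result small.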
Optimization
  EXPLAIN PLAN FOR SELECT * FROM table_nm WHERE v_nm LIKE 'b%' ORDER BY column_nm;
Output: Explained.
  SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Output: PLAN_TABLE_OUTPUT, including the predicate information, a note, and the number of rows selected.

Optimization (contd.)
  ANALYZE TABLE table_nm COMPUTE STATISTICS;
  EXPLAIN PLAN FOR SELECT * FROM table_nm WHERE ….
  SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
Output: predicate information (identified by operation id), and the note "CPU costing is off".

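The same gather-statistics-then-explain workflow can be tried without an Oracle instance. The sketch below uses SQLite through Python's sqlite3 module as a stand-in: ANALYZE and EXPLAIN QUERY PLAN are SQLite's counterparts, not the Oracle commands shown above, and the emp table, its rows, and the index name are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO emp (name) VALUES (?)",
                 [("bob",), ("alice",), ("bill",)])
conn.execute("CREATE INDEX idx_emp_name ON emp(name)")
conn.execute("ANALYZE")  # gather statistics for the optimizer

# Ask the optimizer which plan it would use, without running the query.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM emp WHERE name = 'bob' ORDER BY name"
).fetchall()
plan_details = [row[-1] for row in plan_rows]  # last column is the description
for detail in plan_details:
    print(detail)
```

The exact wording of the plan text differs between SQLite versions, so it is best read rather than parsed.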
Query graph and query plan
- A query graph is a single graph corresponding to each query. It does not specify any order on which operation to perform first.
- A query plan (previous diagram) presents a specific order of operations for executing a query. It is a set of steps used to access or modify information in a SQL RDBMS.
- Since SQL is declarative, there are typically a large number of alternative ways to execute a given query, with widely varying performance.
- When a query is submitted to the database, the query optimizer evaluates some of the different correct possible plans for executing the query and returns what it considers the best alternative.
- The SQL query is first analysed and parsed into a query graph.

System catalog
- The collection of files corresponding to users' tables and indexes represents the data in the database.
- A relational DBMS also maintains information about every table and index that it contains.
- This descriptive information is stored in a collection of special tables called catalog tables.
- The catalog tables are also known as the data dictionary or the system catalog.

Information in the catalog
The catalog holds information such as the size of the buffer pool and the page size, plus the following information about tables, indexes, and views.
- For each table:
  - its name, the file name, and the structure of the file in which it is stored
  - the name and type of each attribute
  - the index name of each index on the table
  - integrity constraints
- For each index:
  - the index name and the structure of the index
  - the search key attributes
- For each view:
  - its view name and definition

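The idea that the catalog is itself queryable can be tried out locally. This sketch uses SQLite (via Python's sqlite3 module) rather than Oracle: sqlite_master and PRAGMA table_info are SQLite's catalog interfaces, and the employee table, index, and view are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("CREATE INDEX idx_employee_name ON employee(name)")
conn.execute("CREATE VIEW well_paid AS SELECT name FROM employee WHERE salary > 1000")

# One catalog row per table, index, and view, including the owning table.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY type, name"
).fetchall()

# Attribute names and types for one table.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(employee)")]

print(catalog)
print(columns)
```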
Statistics in the system catalog
(i) Cardinality: the number of tuples, NTuples(R), for each table R
(ii) Size: the number of pages, NPages(R), for each table R
(iii) Index cardinality: the number of distinct key values, NKeys(I), for each index I
(iv) Index size: the number of pages for each index I
(v) Index height: the number of non-leaf levels for each tree index I
(vi) Index range: the minimum present key value (low value) and the maximum present key value (high value) for each index I

Common techniques for operator evaluation
- Indexing: if a selection or join condition is specified, use an index to examine only the tuples that satisfy the condition.
- Iteration: examine all tuples in an input table, one after the other.
- Partitioning: partition the tuples on a sort key. Sorting and hashing are the two commonly used partitioning techniques.

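The three techniques can be contrasted on a small in-memory table. This is only a sketch: the rows are invented, a Python dict stands in for an index, and hash partitioning on the first column stands in for partitioning in general.

```python
from collections import defaultdict

# Hypothetical table of (id, category) rows.
rows = [(1, "a"), (2, "b"), (3, "a"), (4, "c")]

def scan_select(table, value):
    """Iteration: examine every tuple, one after the other."""
    return [r for r in table if r[1] == value]

def build_index(table):
    """Indexing: map each category to its tuples so lookups skip the rest."""
    index = defaultdict(list)
    for r in table:
        index[r[1]].append(r)
    return index

def hash_partition(table, n):
    """Partitioning: split tuples into n groups by hashing the id."""
    parts = [[] for _ in range(n)]
    for r in table:
        parts[hash(r[0]) % n].append(r)
    return parts

index = build_index(rows)
partitions = hash_partition(rows, 2)
print(scan_select(rows, "a"), index["a"], partitions)
```

The scan and the index lookup return the same tuples; the index simply avoids touching tuples that cannot match.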
Access paths and cost model
- An access path is a way of retrieving the desired tuples from a table. The selectivity of an access path is the number of pages retrieved (index pages plus data pages) when it is used to retrieve all desired tuples.
- If a table has an index that matches a given selection, there are at least two access paths:
  - the index
  - a scan of the data file
- The most selective access path is the one that retrieves the fewest pages; using it minimizes the cost of data retrieval.

- The selectivity of an access path depends on the primary conjuncts in the selection condition.
- Each conjunct acts as a filter on the table.
- The fraction of the tuples that satisfy a conjunct is called the reduction factor.
- Example: we have a hash index H on Sailors with search key (rname, bid, sid), and the selection condition is rname = 'joe' AND bid = 5 AND sid = 3. The index can be used to retrieve the tuples that satisfy all three conjuncts.

- The catalog contains the number of distinct key values, NKeys(H), in the hash index, as well as the number of pages, NPages, in the Sailors table.
- The fraction of pages satisfying the primary conjuncts is 1/NKeys(H), so the estimated number of pages retrieved is NPages(Sailors) * 1/NKeys(H).
Selection, projection and join
- Selection: has the form σ R.attr op value (R).
- Projection: the main task is eliminating duplicates, which is done using partitioning.
- Join: joins the relations.

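The page estimate NPages(Sailors) * 1/NKeys(H) for the hash index example is simple arithmetic. In the sketch below the catalog values (500 pages, 100 distinct keys) are invented purely to show the calculation.

```python
def reduction_factor(n_keys: int) -> float:
    """Fraction of the table expected to match one key value: 1/NKeys(H)."""
    return 1.0 / n_keys

def estimated_pages(n_pages: int, n_keys: int) -> float:
    """Estimated pages retrieved through the index: NPages * 1/NKeys(H)."""
    return n_pages * reduction_factor(n_keys)

# Assumed catalog statistics for Sailors and its hash index H.
pages_via_index = estimated_pages(n_pages=500, n_keys=100)
pages_via_scan = 500  # a scan of the data file reads every page

print(pages_via_index, pages_via_scan)
```

With these assumed statistics the index is the more selective access path, because it retrieves far fewer pages than the scan.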
Pipelined evaluation
- When a query is composed of several operators, the result of one operator can be pipelined to another operator without creating a temporary table to hold the intermediate result.
- If the output of an operator is saved in a temporary table for processing by the next operator, the result is said to be materialized.
- Pipelined evaluation has lower overhead costs than materialization, since materialization creates a new temporary table. (pg. 407)

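The difference between the two strategies can be sketched with Python generators, which pass one row at a time to the next operator. The table contents and operator names here are illustrative assumptions, and a Python list plays the role of the temporary table.

```python
# Hypothetical input table.
table = [
    {"sname": "Dustin", "rating": 7},
    {"sname": "Lubber", "rating": 3},
    {"sname": "Rusty", "rating": 9},
]

def scan(rows):
    for row in rows:
        yield row

def select(rows, predicate):
    for row in rows:
        if predicate(row):
            yield row  # handed to the next operator immediately

def project(rows, column):
    for row in rows:
        yield row[column]

# Pipelined: no intermediate result is stored between operators.
pipelined = list(project(select(scan(table), lambda r: r["rating"] > 5), "sname"))

# Materialized: the selection result is stored in full before projection runs.
temp_table = [r for r in table if r["rating"] > 5]
materialized = [r["sname"] for r in temp_table]

print(pipelined, materialized)
```

Both strategies produce the same answer; only the materialized version has to build and then re-read the intermediate result.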
