Distributed Database Systems                                                                            Distributed Database Systems




                               Distributed Database Systems
                                  Distributed Query Processing

                                   Katja Hose, Ralf Schenkel
                    Max-Planck-Institut für Informatik, Cluster of Excellence MMCI

                                      November 10, 2011
                                      November 17, 2011

      Contents I
1    Motivation
2    Detour on centralized query processing
         Translating SQL into relational algebra
         Phases of centralized query processing
         Query parsing
         Query transformation
         Query optimization
3    Basics of distributed query processing
         Phases of distributed query processing
         Introduction
         Meta data management
         Data localization
4    Global query optimization
         Main questions
            Katja Hose                 Distributed Database Systems       November 10, 2011   1 / 167               Katja Hose          Distributed Database Systems   November 10, 2011   2 / 167




      Contents II
         Global query optimizer
         Distributed cost model
         Join order optimization
         Total time models
         Response time models
5    Summary

      Motivation

The task of query processing is to answer user queries
Example
        How many students are at Saarland University?
        Answer: 18,000
Additional constraints
        Low response times
        High query throughput
        Efficient hardware usage
        ...

      Motivation

Differences to centralized query processing
        Considering the physical data distribution during query optimization
        Considering communication costs
Assumptions
        Data is distributed among multiple nodes
        Existence of a global conceptual schema, which is used by all nodes
        Queries are formulated on the global schema




      Translating SQL into relational algebra

SQL query structure:

        select distinct a1 , . . . , an
        from            R1 , . . . , Rk
        where           F

Algorithm:
    1    Translating the from clause
Let R1 , . . . , Rk be the relations in the from clause of the query
Construct expression:

\[
R = \begin{cases}
R_1 & \text{if } k = 1 \\
((\dots(R_1 \times R_2) \times \dots) \times R_k) & \text{otherwise}
\end{cases}
\]

Algorithm:
    2    Translating the where clause
Let F be the predicate in the where clause of the query (if a where clause exists)
Construct expression:

\[
W = \begin{cases}
R & \text{if there is no where clause} \\
\sigma_F(R) & \text{otherwise}
\end{cases}
\]

Algorithm:
    3    Translating the select clause
Let a1 , . . . , an (or "*") be the projection in the select clause of the query
Construct expression:

\[
S = \begin{cases}
W & \text{if the projection is ``*''} \\
\pi_{a_1,\dots,a_n}(W) & \text{otherwise}
\end{cases}
\]

Output:
        S




      Translating SQL into relational algebra

Example query
        select distinct e.EName, s.Salary
        from            Employees e, Salary s
        where           e.Title = s.Title and s.Salary ≥ 60000

\[
R = \begin{cases}
R_1 & \text{if } k = 1 \\
((\dots(R_1 \times R_2) \times \dots) \times R_k) & \text{otherwise}
\end{cases}
\]

\[
R = \text{Employees} \times \text{Salary}
\]

\[
W = \begin{cases}
R & \text{if there is no where clause} \\
\sigma_F(R) & \text{otherwise}
\end{cases}
\]

[Figure: Workflow for centralized query processing]

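The three translation steps (from, where, select) can be sketched in a few lines of Python. This is a minimal illustration, not the parser of any real system: the function name `translate` and the bracketed string notation `σ[...]` and `π[...]` are assumptions made for this sketch, and the where predicate is passed through as an opaque string.

```python
from typing import Optional, Sequence


def translate(select: Sequence[str], from_: Sequence[str],
              where: Optional[str] = None) -> str:
    """Build the canonical algebra expression S for a select-from-where query."""
    if not from_:
        raise ValueError("the from clause needs at least one relation")

    # Step 1 (from clause): left-deep chain of Cartesian products over R1..Rk.
    r = from_[0]
    for rel in from_[1:]:
        r = f"({r} × {rel})"

    # Step 2 (where clause): wrap R in a selection only if a predicate exists.
    w = f"σ[{where}]({r})" if where else r

    # Step 3 (select clause): wrap W in a projection unless the list is "*".
    if list(select) == ["*"]:
        return w
    return f"π[{', '.join(select)}]({w})"


# Usage with the example query on Employees and Salary.
expr = translate(
    ["e.EName", "s.Salary"],
    ["Employees", "Salary"],
    "e.Title = s.Title ∧ s.Salary ≥ 60000",
)
print(expr)
```

The result is the naive plan: it always contains a Cartesian product, which is why the later transformation and optimization phases are needed.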
      Query parsing

Transform a declarative query into an internal representation
        Query formulated using a declarative query language, e.g., SQL
        The parser translates the query into an internal representation
                Called naive query plan
                Plan described by an operator tree of relational algebra operators

      Example

        Database managing information about employees and projects
                Employees(EID, EName, Title)
                Assignment(ENo, PNo, Duration)
        Query: return the names of all employees working for project 'P1'
                SELECT EName
                FROM Employees e, Assignment a
                WHERE e.EID = ENo AND PNo='P1'




      Example

Query
        SELECT EName
        FROM Employees e, Assignment a
        WHERE e.EID = ENo AND PNo='P1'
Translation into relational algebra

\[
\pi_{\text{EName}}\bigl(\sigma_{\text{PNo}=\text{'P1'} \,\wedge\, \text{Employees.EID}=\text{Assignment.ENo}}(\text{Employees} \times \text{Assignment})\bigr)
\]

In contrast to the SQL statement, the algebra statement already contains
the required basic evaluation operators

      Operator tree

[Figure: Operator tree of the algebra expression above]

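To make the point about "basic evaluation operators" concrete, the naive plan for the example query can be evaluated directly, one operator at a time. The sketch below is an illustration only: the tuples in `employees` and `assignment` are invented toy data, and the helpers `cross`, `select`, and `project` are hypothetical names, not part of any database API.

```python
from itertools import product

# Hypothetical toy instances of Employees(EID, EName, Title)
# and Assignment(ENo, PNo, Duration).
employees = [
    {"EID": 1, "EName": "Smith", "Title": "Engineer"},
    {"EID": 2, "EName": "Jones", "Title": "Analyst"},
    {"EID": 3, "EName": "Lee", "Title": "Engineer"},
]
assignment = [
    {"ENo": 1, "PNo": "P1", "Duration": 12},
    {"ENo": 2, "PNo": "P2", "Duration": 6},
    {"ENo": 3, "PNo": "P1", "Duration": 4},
]


def cross(r, s):
    """Cartesian product; attribute names are assumed to be disjoint."""
    return [{**t, **u} for t, u in product(r, s)]


def select(pred, r):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in r if pred(t)]


def project(attrs, r):
    """Projection with duplicate elimination, keeping first-seen order."""
    seen, out = set(), []
    for t in r:
        key = tuple(t[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


# Evaluate the operator tree bottom-up: ×, then σ, then π.
intermediate = cross(employees, assignment)
naive_result = project(
    ["EName"],
    select(lambda t: t["PNo"] == "P1" and t["EID"] == t["ENo"], intermediate),
)
print(len(intermediate), naive_result)
```

Even on this tiny input the Cartesian product materializes every pairing of the two relations before the selection discards most of them.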
[Figure: Workflow for centralized query processing]

      Query transformation

Steps
    1   Name resolution
        Transforming object names into internal names
    2   Semantic analysis
        Checking for global relations and attributes, view expansion, global access control
    3   Normalization
        Transforming predicates into a canonical format
    4   Simple algebraic rewriting
        Application of heuristics to eliminate bad plans




      Semantic analysis

        Check if the global schema defines all attributes and relations referenced in the query
        If the query is formulated on a view, replace references to the view's relations/attributes with references to global relations/attributes
        Perform simple integrity checks, e.g., do the attributes used in comparison predicates have the same type?
        Initial check whether the query has the rights to access the referenced relations/attributes

      Normalization

Objective
        Simplify the subsequent optimization by transforming the query into a canonical format
        Selection and join predicates
                Conjunctive normal form vs. disjunctive normal form
                Conjunctive normal form:
                \[ (p_{11} \vee p_{12} \vee \dots \vee p_{1n}) \wedge \dots \wedge (p_{m1} \vee p_{m2} \vee \dots \vee p_{mn}) \]
                Disjunctive normal form:
                \[ (p_{11} \wedge p_{12} \wedge \dots \wedge p_{1n}) \vee \dots \vee (p_{m1} \wedge p_{m2} \wedge \dots \wedge p_{mn}) \]
        Transformation based on equivalence rules for logical operators

      Normalization

Equivalence rules
        $p_1 \wedge p_2 \iff p_2 \wedge p_1$ and $p_1 \vee p_2 \iff p_2 \vee p_1$
        $p_1 \wedge (p_2 \wedge p_3) \iff (p_1 \wedge p_2) \wedge p_3$ and $p_1 \vee (p_2 \vee p_3) \iff (p_1 \vee p_2) \vee p_3$
        $p_1 \wedge (p_2 \vee p_3) \iff (p_1 \wedge p_2) \vee (p_1 \wedge p_3)$ and $p_1 \vee (p_2 \wedge p_3) \iff (p_1 \vee p_2) \wedge (p_1 \vee p_3)$
        $\neg(p_1 \wedge p_2) \iff \neg p_1 \vee \neg p_2$ and $\neg(p_1 \vee p_2) \iff \neg p_1 \wedge \neg p_2$
        $\neg(\neg p_1) \iff p_1$

      Normalization

Example
        SELECT EName
        FROM Employees e, Assignment a
        WHERE e.EID = a.ENo AND Duration ≥ 3 AND (PNo='P1' OR PNo='P2')

Selection condition in disjunctive normal form

\[
(\text{EID} = \text{ENo} \wedge \text{Duration} \geq 3 \wedge \text{PNo} = \text{'P1'}) \vee
(\text{EID} = \text{ENo} \wedge \text{Duration} \geq 3 \wedge \text{PNo} = \text{'P2'})
\]

Selection condition in conjunctive normal form

\[
\text{EID} = \text{ENo} \wedge \text{Duration} \geq 3 \wedge (\text{PNo} = \text{'P1'} \vee \text{PNo} = \text{'P2'})
\]
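The step from the conjunctive to the disjunctive normal form in the example is an application of the distributive law. The following sketch shows one way to do that mechanically and checks the equivalence by brute force. It is an illustration under simplifying assumptions: atomic predicates are treated as independent Boolean variables identified by strings, negation is not handled, and `cnf_to_dnf`, `eval_cnf`, and `eval_dnf` are hypothetical helper names.

```python
from itertools import product


def cnf_to_dnf(cnf):
    """Distribute ∧ over ∨: pick one literal from every clause of the CNF."""
    dnf = []
    for choice in product(*cnf):
        term = tuple(dict.fromkeys(choice))  # drop repeated literals, keep order
        if term not in dnf:
            dnf.append(term)
    return dnf


def eval_cnf(cnf, env):
    """A CNF holds if every clause has at least one true literal."""
    return all(any(env[lit] for lit in clause) for clause in cnf)


def eval_dnf(dnf, env):
    """A DNF holds if at least one term has only true literals."""
    return any(all(env[lit] for lit in term) for term in dnf)


# Selection condition of the example in conjunctive normal form.
cnf = [["EID=ENo"], ["Duration>=3"], ["PNo='P1'", "PNo='P2'"]]
dnf = cnf_to_dnf(cnf)

# Truth-table check over all assignments of the four atomic predicates.
atoms = ["EID=ENo", "Duration>=3", "PNo='P1'", "PNo='P2'"]
equivalent = all(
    eval_cnf(cnf, dict(zip(atoms, values))) == eval_dnf(dnf, dict(zip(atoms, values)))
    for values in product([False, True], repeat=len(atoms))
)
print(dnf, equivalent)
```

Distribution can blow up exponentially in the number of clauses, which is one reason a system has to choose deliberately between the two normal forms.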




      Simple algebraic rewriting

Simple optimizations that are always beneficial regardless of system state
        Elimination of redundant predicates
        Simplification of expressions
        Unnesting of subqueries and views
Tasks
        Recognize and simplify all expressions/operations/subqueries that are "obviously" unnecessary, redundant, or contradictory.
        Do not consider system state information, e.g., size of tables, existence of indexes, etc.

[Figure: Workflow for centralized query processing]

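As a small illustration of rewriting that needs no system state, the sketch below removes redundant predicates from a conjunction and detects an obvious contradiction of the form p ∧ ¬p. It is purely syntactic and deliberately limited: predicates are compared as strings, and the function name `simplify_conjunction` and the `(predicate, is_positive)` encoding are assumptions of this example.

```python
def simplify_conjunction(literals):
    """Simplify a conjunction given as a list of (predicate, is_positive) pairs.

    Returns the list without repeated literals, or False if the conjunction
    contains both a predicate and its negation and can never be satisfied.
    """
    polarity = {}
    result = []
    for pred, positive in literals:
        if pred in polarity:
            if polarity[pred] != positive:
                return False  # p ∧ ¬p is contradictory
            continue  # p ∧ p: the repeated literal is redundant
        polarity[pred] = positive
        result.append((pred, positive))
    return result


redundant = simplify_conjunction(
    [("PNo='P1'", True), ("Duration>=3", True), ("PNo='P1'", True)]
)
contradictory = simplify_conjunction([("PNo='P1'", True), ("PNo='P1'", False)])
print(redundant, contradictory)
```

Neither check looks at table sizes or indexes, so the rewrite is safe to apply before any cost-based decision is made.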
      Query optimization

Steps
    1    Algebraic optimization
                Find a good relational algebra operator tree
                Heuristic query optimization
                Cost-based query optimization
                Statistical query optimization
    2    Physical optimization
                Find suitable algorithms for implementing the operations

      Heuristics

        Use simple heuristics which usually lead to better performance
        The goal is not to find the optimal plan but to avoid the really bad ones
        Heuristics
                Break selections
                Complex selection criteria should be broken into multiple parts
                Push projection and push selection
                Cheap selections and projections should be performed as early as possible to reduce the sizes of intermediate results
                Force joins
                In most cases, using a join is much cheaper than using a Cartesian product and a selection

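The effect of the three heuristics can be seen by counting intermediate tuples for two equivalent plans. The sketch below is a toy experiment, not a benchmark: the generated relations are invented, `EID` is assumed to be unique in `employees`, and the dictionary lookup merely stands in for a join algorithm.

```python
from itertools import product

# Hypothetical data: 100 employees, 100 assignments, 10 of them for project P1.
employees = [{"EID": i, "EName": f"E{i}"} for i in range(1, 101)]
assignment = [
    {"ENo": i, "PNo": "P1" if i % 10 == 0 else "P2"} for i in range(1, 101)
]

# Plan A: Cartesian product followed by one complex selection.
cross = [{**e, **a} for e, a in product(employees, assignment)]
plan_a = [t for t in cross if t["EID"] == t["ENo"] and t["PNo"] == "P1"]

# Plan B: break the selection, push PNo='P1' below the join,
# and replace product plus selection by a join on EID = ENo.
p1_only = [a for a in assignment if a["PNo"] == "P1"]
by_eid = {e["EID"]: e for e in employees}  # assumes EID is unique
plan_b = [{**by_eid[a["ENo"]], **a} for a in p1_only if a["ENo"] in by_eid]

print(len(cross), len(p1_only), plan_a == plan_b)
```

Both plans return the same tuples, but Plan A materializes the full product while Plan B never handles more tuples than the filtered `assignment` relation.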




      Algebraic optimization rules

Operator $\bowtie$ is commutative:

\[
r_1 \bowtie r_2 \iff r_2 \bowtie r_1
\]

Operator $\bowtie$ is associative:

\[
(r_1 \bowtie r_2) \bowtie r_3 \iff r_1 \bowtie (r_2 \bowtie r_3)
\]

When two projections $\pi$ are nested, the "outer" parameter dominates the "inner" one:

\[
\pi_X(\pi_Y(r_1)) \iff \pi_X(r_1) \quad \text{if } X \subseteq Y
\]

      Algebraic optimization rules

Cascaded selections $\sigma$ can be merged into a single selection using logical and ($\wedge$). The
order of the selections is arbitrary:

\[
\sigma_{F_1}(\sigma_{F_2}(r_1)) \iff \sigma_{F_1 \wedge F_2}(r_1) \iff \sigma_{F_2}(\sigma_{F_1}(r_1))
\]

Exploiting commutativity of $\wedge$

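The rules above can be spot-checked on small relations. The sketch below models a relation as a set of tuples, each tuple being a frozenset of (attribute, value) pairs, and tests commutativity and associativity of the natural join, the projection cascade, and the selection cascade. The relations `r1`, `r2`, `r3` are invented toy data and the helper names are hypothetical; a passing check on one instance illustrates the rules but does not prove them.

```python
def rel(rows):
    """Turn a list of dicts into a set-based relation."""
    return {frozenset(row.items()) for row in rows}


def natural_join(r, s):
    """Join tuples that agree on all common attributes."""
    out = set()
    for t in r:
        for u in s:
            dt, du = dict(t), dict(u)
            if all(dt[a] == du[a] for a in dt.keys() & du.keys()):
                out.add(frozenset({**dt, **du}.items()))
    return out


def project(attrs, r):
    return {frozenset((a, v) for a, v in t if a in attrs) for t in r}


def select(pred, r):
    return {t for t in r if pred(dict(t))}


r1 = rel([{"A": 1, "B": 1}, {"A": 2, "B": 2}])
r2 = rel([{"B": 1, "C": 5}, {"B": 2, "C": 6}, {"B": 3, "C": 7}])
r3 = rel([{"C": 5, "D": "x"}, {"C": 7, "D": "y"}])

join_commutes = natural_join(r1, r2) == natural_join(r2, r1)
join_associates = (
    natural_join(natural_join(r1, r2), r3) == natural_join(r1, natural_join(r2, r3))
)
# Projection cascade with X = {B} and Y = {B, C}, so X ⊆ Y holds.
projection_cascade = project({"B"}, project({"B", "C"}, r2)) == project({"B"}, r2)

f1 = lambda t: t["B"] >= 2
f2 = lambda t: t["C"] <= 6
selection_cascade = (
    select(f1, select(f2, r2))
    == select(lambda t: f1(t) and f2(t), r2)
    == select(f2, select(f1, r2))
)
print(join_commutes, join_associates, projection_cascade, selection_cascade)
```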
Distributed Database Systems                                                                             Distributed Database Systems
  Detour on centralized query processing                                                                   Detour on centralized query processing
    Query optimization                                                                                       Query optimization


      Algebraic optimization rules

Operators π and σ commute if predicate F is defined only on the
projection attributes:

        $\sigma_F(\pi_X(r_1)) \iff \pi_X(\sigma_F(r_1))$  if  $\mathrm{attr}(F) \subseteq X$

Alternatively, the order can be changed if the projection is extended by
all attributes that F needs:

        $\pi_{X_1}(\sigma_F(r_1)) \iff \pi_{X_1}(\sigma_F(\pi_{X_1 \cup X_2}(r_1)))$  if  $\mathrm{attr}(F) \subseteq X_1 \cup X_2$

      Algebraic optimization rules

Operators σ and ⋈ commute if all selection attributes are contained in the
same relation:

        $\sigma_F(r_1 \bowtie r_2) \iff \sigma_F(r_1) \bowtie r_2$  if  $\mathrm{attr}(F) \subseteq R_1$

A conjunctive selection predicate (F = F1 ∧ F2) can be split across a join if
the attributes referenced by F1 and F2 are contained in different relations:

        $\sigma_F(r_1 \bowtie r_2) \iff \sigma_{F_1}(r_1) \bowtie \sigma_{F_2}(r_2)$  if  $\mathrm{attr}(F_1) \subseteq R_1$ and $\mathrm{attr}(F_2) \subseteq R_2$

In any case, part of a selection can be pushed down: F1 contains the predicates
referencing attributes of R1 only, and F2 contains the remaining predicates,
which reference attributes of both relations:

        $\sigma_F(r_1 \bowtie r_2) \iff \sigma_{F_2}(\sigma_{F_1}(r_1) \bowtie r_2)$  if  $\mathrm{attr}(F_1) \subseteq R_1$
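
The rule for splitting a conjunctive selection across a join can be checked on a toy instance. The following is a minimal sketch, assuming relations are modeled as lists of dictionaries; the relation contents and the helper names `natural_join`, `select`, and `as_set` are hypothetical and only serve as illustration.

```python
def natural_join(r1, r2):
    """Natural join of two relations given as lists of dicts."""
    out = []
    for t1 in r1:
        for t2 in r2:
            shared = t1.keys() & t2.keys()
            if all(t1[a] == t2[a] for a in shared):
                out.append({**t1, **t2})
    return out


def select(pred, r):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in r if pred(t)]


def as_set(r):
    """Order-independent representation of a relation for comparison."""
    return {tuple(sorted(t.items())) for t in r}


# Hypothetical toy data
employees = [
    {"eid": 1, "name": "Ann", "dept": "DB"},
    {"eid": 2, "name": "Bob", "dept": "IR"},
]
assignments = [
    {"eid": 1, "pno": "P1"},
    {"eid": 2, "pno": "P1"},
    {"eid": 2, "pno": "P2"},
]

f1 = lambda t: t["dept"] == "DB"   # references attributes of employees only
f2 = lambda t: t["pno"] == "P1"    # references attributes of assignments only

# Selection after the join versus selections pushed below the join
late = select(lambda t: f1(t) and f2(t), natural_join(employees, assignments))
early = natural_join(select(f1, employees), select(f2, assignments))

assert as_set(late) == as_set(early)
```

Both variants return the same tuples, but the pushed-down variant joins smaller inputs, which is the reason the heuristic is usually applied.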




      Algebraic optimization rules

Commutativity of σ and ∪:

        $\sigma_F(r_1 \cup r_2) \iff \sigma_F(r_1) \cup \sigma_F(r_2)$

Commutativity of σ and −:

        $\sigma_F(r_1 - r_2) \iff \sigma_F(r_1) - \sigma_F(r_2)$

or, because every result tuple stems from r1, with the selection applied to r1 only:

        $\sigma_F(r_1 - r_2) \iff \sigma_F(r_1) - r_2$

      Algebraic optimization rules

Commutativity of π and ⋈:

        $\pi_X(r_1 \bowtie r_2) \iff \pi_X(\pi_{Y_1}(r_1) \bowtie \pi_{Y_2}(r_2))$

with

        $Y_1 = (X \cap R_1) \cup (R_1 \cap R_2)$

and

        $Y_2 = (X \cap R_2) \cup (R_1 \cap R_2)$

Pushing a projection down is possible if all Yi are defined in such a way that
they preserve all attributes necessary to perform the join.
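
The attribute sets Y1 and Y2 can be computed mechanically from X and the two schemas. The sketch below assumes set semantics for projection and a natural join on the shared attributes; the toy relations and the helper names (`pushed_attrs`, `project`, `natural_join`, `as_set`) are hypothetical.

```python
def natural_join(r1, r2):
    """Natural join of two relations given as lists of dicts."""
    out = []
    for t1 in r1:
        for t2 in r2:
            shared = t1.keys() & t2.keys()
            if all(t1[a] == t2[a] for a in shared):
                out.append({**t1, **t2})
    return out


def project(attrs, r):
    """Projection with duplicate elimination (set semantics)."""
    seen, out = set(), []
    for t in r:
        key = tuple((a, t[a]) for a in sorted(attrs))
        if key not in seen:
            seen.add(key)
            out.append(dict(key))
    return out


def pushed_attrs(x, schema1, schema2):
    """Y1 and Y2: requested attributes plus the join attributes."""
    join_attrs = schema1 & schema2
    return (x & schema1) | join_attrs, (x & schema2) | join_attrs


def as_set(r):
    return {tuple(sorted(t.items())) for t in r}


# Hypothetical toy data
employees = [
    {"eid": 1, "name": "Ann", "dept": "DB"},
    {"eid": 2, "name": "Bob", "dept": "IR"},
]
assignments = [
    {"eid": 1, "pno": "P1", "hours": 10},
    {"eid": 2, "pno": "P1", "hours": 5},
    {"eid": 2, "pno": "P2", "hours": 7},
]

x = {"name", "pno"}
y1, y2 = pushed_attrs(x, {"eid", "name", "dept"}, {"eid", "pno", "hours"})

late = project(x, natural_join(employees, assignments))
early = project(x, natural_join(project(y1, employees), project(y2, assignments)))

assert as_set(late) == as_set(early)
```

Note that the join attribute `eid` is kept in both Y1 and Y2 even though it is not part of X; dropping it before the join would turn the join into a Cartesian product.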





      Algebraic optimization rules

Further rules
        Commutativity of π and ∪:

                $\pi_X(r_1 \cup r_2) \iff \pi_X(r_1) \cup \pi_X(r_2)$

        Distributive law for ⋈ and ∪, distributive law for ⋈ and −
        Commutativity of renaming β with other operators, . . .
        Idempotence, e.g., A ∨ A ⇐⇒ A
        Operations involving empty relations
        Commutative and associative laws for ⋈, ∪ and ∩

      Heuristic algebraic optimization – Example

Use algebraic optimization heuristics
        Force join
        Push selection and projection





      Cost-based algebraic query optimization

Most non-distributed RDBMS strongly rely on cost-based optimizations
        Aim for better optimized plan with respect to system and data
        characteristics
        Join order optimization
        Basic approach
                Establish a cost model for various operations
                Enumerate all query plans and compute costs
                Pick the best query plan
        Usually, dynamic programming techniques are used to keep
        computational effort manageable

      Physical query optimization

Physical optimization
        Input:
                Optimized query plan consisting of algebra operators
        Choose an algorithm to compute a particular algebra operator
        Join:
                Block-Nested-Loop join, hash join, merge join, . . .
        Select:
                Full table scan, index lookup, ad-hoc index generation & lookup, . . .
Tasks
        Translating a query plan into an execution plan
Physical and algebraic optimization are often interleaved
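
The basic cost-based approach (cost model, plan enumeration, picking the cheapest plan, with dynamic programming to limit the effort) can be illustrated for join ordering. The following is a minimal sketch under simplifying assumptions that are not taken from the slides: only left-deep plans are considered, the cost of a plan is the sum of its intermediate result sizes, and join selectivities are independent. The function name `best_join_order` and the sample statistics are hypothetical.

```python
from itertools import combinations


def best_join_order(card, sel):
    """Cheapest left-deep join order found by dynamic programming over subsets.

    card: {relation name: cardinality}
    sel:  {frozenset({a, b}): join selectivity}; missing pairs count as 1.0
          (Cartesian product).
    Returns (cost, order); cost is the sum of all intermediate result sizes
    (an assumed, deliberately simple cost model).
    """
    rels = sorted(card)

    def size(subset):
        # Estimated cardinality of joining all relations in the subset
        n = 1.0
        for r in subset:
            n *= card[r]
        for a, b in combinations(sorted(subset), 2):
            n *= sel.get(frozenset((a, b)), 1.0)
        return n

    # Best plan per subset of relations; single relations cost nothing
    best = {frozenset([r]): (0.0, (r,)) for r in rels}
    for k in range(2, len(rels) + 1):
        for combo in combinations(rels, k):
            subset = frozenset(combo)
            out_size = size(subset)
            candidates = []
            for last in combo:
                # Reuse the best plan for the subset without `last`
                sub_cost, sub_order = best[subset - {last}]
                candidates.append((sub_cost + out_size, sub_order + (last,)))
            best[subset] = min(candidates)
    return best[frozenset(rels)]


# Hypothetical statistics
cardinalities = {"A": 1000, "B": 100, "C": 10}
selectivities = {frozenset(("A", "B")): 0.01, frozenset(("B", "C")): 0.1}
cost, order = best_join_order(cardinalities, selectivities)
print(order, cost)
```

The table `best` stores one optimal plan per subset of relations, so each larger subset is solved by extending already optimal smaller plans instead of enumerating every permutation.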

      Query optimization example

Output: query execution plan

  Basics of distributed query processing

1    Motivation
2    Detour on centralized query processing
       Translating SQL into relational algebra
       Phases of centralized query processing
       Query parsing
       Query transformation
       Query optimization
3    Basics of distributed query processing
       Phases of distributed query processing
       Introduction
       Meta data management
       Data localization
4    Global query optimization
       Main questions
       Global query optimizer
       Distributed cost model
       Join order optimization
       Total time models
       Response time models
5    Summary

  Basics of distributed query processing
    Phases of distributed query processing

      Workflow for distributed query processing




    Introduction

      Basic considerations

Distributed query processing
        Shares the properties of centralized query processing
        Similar problem but with different objectives and constraints
Objectives for centralized query processing
        Minimize the number of disk accesses
        Minimize computational time
Objectives for distributed query processing
        Minimize resource consumption
        Minimize response time
        Maximize throughput

      Basic considerations

Costs are more difficult to predict
        Join selectivity: is it worthwhile to push down a selection?
        Data is distributed: difficult to get meaningful statistics
        Network latency is very hard to predict
        Current workload at nodes, load shedding
Additional cost factors and constraints
        Extension of relational algebra (sending/receiving data)
        Data localization (which node holds relevant data)
        Replication and caching (where to compute an operation)
        Network models
        Response-time models
        Data and structural heterogeneity (federated databases . . . )


      Consequences

Optimization is much more difficult than in the central case
        Statistics and costs change over time, e.g., workload at a node,
        network load
        More conflicting optimization goals
                Increase throughput → reduce replication and parallelization
                Reduce query response time → increase parallelization
        More cost factors and constraints
Consequences
        Adaptive query plans (create an initial plan and optimize it on-the-fly)
        Do not aim for the best plan, but for a good plan

      Example

Query
        Return the names of all employees working for project 'P1'

        $\pi_{\mathit{EName}}(\pi_{\mathit{EID},\mathit{EName}}(\mathit{Employees}) \bowtie_{\mathit{Employees.EID} = \mathit{Assignment.ENo}} \pi_{\mathit{ENo}}(\sigma_{\mathit{PNo} = \text{'P1'}}(\mathit{Assignment})))$

Problems
        Relations are fragmented and distributed among five nodes
        The Employees relation uses primary horizontal fragmentation
                One fragment located at node 1, the other at node 2, no replication
        The Assignment relation uses derived horizontal fragmentation
                One fragment located at node 3, the other at node 4, no replication
        The query originates from node 5



      Example

Cost model and statistics
        Accessing a tuple costs 1 unit (acc)
        Transferring a tuple costs 10 units (trans)
        There are 400 employees and 1000 assignments
        20 assignments for project 'P1'
        All tuples are uniformly distributed, i.e., nodes 3 and 4 provide 10
        assignments for project 'P1' each
        There are local indexes on attribute PNo at nodes 3 and 4 (as well as
        indexes on primary keys at all nodes)
        Direct tuple access is possible on local sites, no scanning
        All nodes can directly communicate with each other
        Simplification: no costs for unions and projections



      Example

Simple execution plan - Version A
        Transfer all data to Node 5

      Example

Simple execution plan - Version B
        Ship intermediate results






      Example

Costs plan A: 23,000 units

      Example

Costs plan B: 440 units
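
The two totals can be reproduced from the cost model and statistics of the example. The plan diagrams with the original derivation are not part of this text, so the breakdown below is an assumption: it is one decomposition that is consistent with the stated unit costs and yields exactly the stated totals. In particular, it assumes that plan A scans and joins without indexes at node 5, and that plan B ships each selected assignment to exactly one Employees node, which then uses its primary-key index.

```python
# Unit costs and statistics taken from the example
ACC, TRANS = 1, 10
EMPLOYEES, ASSIGNMENTS = 400, 1000
P1_ASSIGNMENTS = 20


def cost_plan_a():
    """Version A: transfer all data to node 5 and evaluate there (assumed breakdown)."""
    ship_all = (EMPLOYEES + ASSIGNMENTS) * TRANS   # both relations go to node 5
    select_p1 = ASSIGNMENTS * ACC                  # no index at node 5: read every assignment
    join = P1_ASSIGNMENTS * EMPLOYEES * ACC        # compare each match with every employee
    return ship_all + select_p1 + join


def cost_plan_b():
    """Version B: ship intermediate results only (assumed breakdown)."""
    select_p1 = P1_ASSIGNMENTS * ACC               # index lookups on PNo at nodes 3 and 4
    ship_selected = P1_ASSIGNMENTS * TRANS         # selected tuples to nodes 1 and 2
    join = P1_ASSIGNMENTS * ACC                    # one primary-key lookup per shipped tuple
    ship_result = P1_ASSIGNMENTS * TRANS           # join results to node 5
    return select_p1 + ship_selected + join + ship_result


print(cost_plan_a(), cost_plan_b())
```

Under these assumptions, plan A is dominated by shipping 1,400 tuples and by the index-free join, whereas plan B touches and ships only the 20 relevant tuples.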






      Important aspects of distributed query processing

        Meta data management
        Data localization
        Global query optimization
        Post-processing




    Meta data management

      Workflow for distributed query processing

      Meta data management

Prerequisites to perform query optimization
        Meta data must be available
        Meta data is stored in the catalog
        Catalog provides information about the data distribution
Use this information to decide, for instance, whether it is worthwhile to execute a
selection very early.






      Meta data management

Typical contents of a catalog for distributed database management systems
        Database schema
                Definitions of tables, views, constraints, keys, . . .
        Partitioning schema
                Information about how the schema is partitioned and how tables can
                be reconstructed
        Allocation schema
                Information about which fragment can be found at which node
                (including information about replication)
        Network information
                Information about node connections, network model
        Additional physical information
                Information about indexes, data statistics (histograms, etc.),
                hardware resources (processing & storage), . . .

      Meta data management

Where to store the catalog in a distributed system?
        Central node
                Simple solution, bottleneck
        Replicated at all nodes
                Updates are expensive
        Fragmented
                In rare cases, the catalog may become very large
                Catalog has to be fragmented and allocated
        Caching
                Replicate only needed parts of a central catalog, anticipate potential
                inconsistencies


      Meta data management

Centralized catalog
        One instance of the global catalog at a central node
        Advantages
                No need to update copies
                Little memory consumption
        Disadvantages
                Communication with central node for each query
                Central node potentially represents a bottleneck

      Meta data management

Replicated catalog
        Full copy of the global catalog at each node
        Advantages
                Little communication overhead for queries
                Good availability
        Disadvantages
                High update costs






      Meta data management

Fragmented catalog
        Partitioning the global catalog and assigning partitions to nodes
        Advantages
                Sharing load among nodes
                Reducing update overhead
        Disadvantages
                Localizing necessary partitions of the global catalog

      Meta data management

Caching catalog data
        Caching non-local catalog data
        Advantages
                Avoiding remote access to frequently needed catalog data
                Reducing communication overhead
        Disadvantages
                Coherency control
                Invalidating cached copies in the presence of updates




      Meta data management

Caching catalog data
        Explicit invalidation
                Owner of catalog data remembers nodes with local copies
                In case of updates: sending an invalidation message to nodes with local
                copies
        Implicit invalidation
                Identifying old catalog data during runtime (adding version numbers
                and time stamps to query messages)

    Data localization

      Workflow for distributed query processing
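
Implicit invalidation can be illustrated with version numbers that travel with query messages. The sketch below is a simplified, single-process model and not a description of any particular system: the classes `OwnerNode` and `QueryNode`, the exception `StaleCatalogError`, and the single catalog entry are hypothetical. A node plans with its cached catalog copy, the owner rejects messages that carry an outdated version, and the sender then refreshes its cache and retries.

```python
class StaleCatalogError(Exception):
    """Raised when a query message carries an outdated catalog version."""


class OwnerNode:
    """Owns one catalog entry (here: the location of a fragment)."""

    def __init__(self):
        self.version = 1
        self.location = "node3"

    def update(self, location):
        # No invalidation messages are sent; only the version changes
        self.location = location
        self.version += 1

    def read_catalog(self):
        return self.version, self.location

    def execute(self, query, catalog_version):
        # Stale catalog data is identified at runtime via the version number
        if catalog_version != self.version:
            raise StaleCatalogError(query)
        return f"{query} executed using {self.location}"


class QueryNode:
    """Plans queries with a cached copy of the owner's catalog entry."""

    def __init__(self, owner):
        self.owner = owner
        self.cached = owner.read_catalog()
        self.refreshes = 0

    def run(self, query):
        try:
            return self.owner.execute(query, self.cached[0])
        except StaleCatalogError:
            # Refresh the cached copy and retry once
            self.cached = self.owner.read_catalog()
            self.refreshes += 1
            return self.owner.execute(query, self.cached[0])


owner = OwnerNode()
node = QueryNode(owner)
first = node.run("q1")          # cached version is still current
owner.update("node4")           # catalog changes, cache becomes stale
second = node.run("q2")         # stale version detected, cache refreshed
```

Compared with explicit invalidation, the owner does not need to remember which nodes hold copies; the price is an occasional rejected message and retry.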






      Data localization

Objective
        Creating subqueries in consideration of the data distribution
Assumptions
        Fragmentation is defined by fragmentation expressions
        Each fragment is allocated at only one node (no replication)
        Fragmentation expressions and locations of the fragments are stored
        in the catalog
Main tasks
        Replace accesses to global relations with accesses to the fragments
        Insert reconstruction expressions into the algebra query
        Basic algebraic simplifications of the query

      Example – horizontal reduction

Schema
        $\mathit{Projects}_1 = \sigma_{\mathit{Budget} \le 150{,}000}(\mathit{Projects})$
        $\mathit{Projects}_2 = \sigma_{150{,}000 < \mathit{Budget} \le 200{,}000}(\mathit{Projects})$
        $\mathit{Projects}_3 = \sigma_{\mathit{Budget} > 200{,}000}(\mathit{Projects})$
Reconstruction expression (horizontal fragmentation)
        $\mathit{Projects} = \mathit{Projects}_1 \cup \mathit{Projects}_2 \cup \mathit{Projects}_3$
Example query
        $\sigma_{\mathit{Location} = \text{'Saarbr.'} \wedge \mathit{Budget} \le 100{,}000}(\mathit{Projects})$
After replacing references to global relations
        $\sigma_{\mathit{Location} = \text{'Saarbr.'} \wedge \mathit{Budget} \le 100{,}000}(\mathit{Projects}_1 \cup \mathit{Projects}_2 \cup \mathit{Projects}_3)$
Further optimization is possible!


      Query simplification – horizontal reduction
Objective
        Eliminate unnecessary subqueries
Horizontal reduction rule
        Given fragments of R as FR = {R1, ..., Rn} with Ri = σ_{pi}(R)
        All fragments Ri for which σ_{ps}(Ri) = ∅ can be removed,
        with ps denoting the query's selection predicate
        σ_{ps}(Ri) = ∅ ⇐ ∀x ∈ R : ¬(ps(x) ∧ pi(x))
        The selection with the query predicate ps on fragment Ri is empty if
        ps contradicts the fragmentation predicate pi of Ri, i.e., ps and pi are
        never true at the same time for any tuple of R

      Example – horizontal reduction
Query with fragmentation expression
        σ_{Location='Saarbr.' ∧ Budget≤100.000}(Projects1 ∪ Projects2 ∪ Projects3)
Fragment definitions
        Projects1 = σ_{Budget≤150.000}(Projects)
        Projects2 = σ_{150.000<Budget≤200.000}(Projects)
        Projects3 = σ_{Budget>200.000}(Projects)
Because of
        σ_{Budget≤100.000}(Projects2) = ∅, σ_{Budget≤100.000}(Projects3) = ∅
We obtain the reduced query
        σ_{Location='Saarbr.'}(σ_{Budget≤100.000}(Projects1))
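
The horizontal reduction rule can be sketched for the Budget example. The code below assumes, for illustration only, that every predicate is a range on Budget of the form lo < Budget ≤ hi; the names `FRAGMENTS`, `overlaps`, and `relevant_fragments` are hypothetical.

```python
import math

# Fragmentation predicates as half-open ranges (lo, hi], i.e. lo < Budget <= hi.
# This representation is an assumption made for this sketch.
FRAGMENTS = {
    "Projects1": (-math.inf, 150_000),
    "Projects2": (150_000, 200_000),
    "Projects3": (200_000, math.inf),
}


def overlaps(a, b):
    """Two (lo, hi] ranges share at least one value iff max(lo) < min(hi)."""
    return max(a[0], b[0]) < min(a[1], b[1])


def relevant_fragments(query_range, fragments=FRAGMENTS):
    """Keep only fragments whose predicate does not contradict the query predicate."""
    return [name for name, rng in fragments.items() if overlaps(query_range, rng)]


# Query predicate Budget <= 100.000, written as the range (-inf, 100000].
kept = relevant_fragments((-math.inf, 100_000))
```

For this query only `Projects1` remains, which matches the reduced query of the example. A real optimizer would need a more general contradiction test for arbitrary predicates.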





      Query simplification – join reduction
Join reductions
        Larger joins are replaced by multiple partial joins on fragments
        Distributive law: (R1 ∪ R2) ⋈ S = (R1 ⋈ S) ∪ (R2 ⋈ S)
        Eliminate all union fragments that will return an empty result
Expectations
        Elimination of partial joins producing empty results
                Depends on fragmentation optimality
        Many joins on small relations have lower resource costs than one large
        join
                Depends on fragmentation and applied join algorithms
        Smaller joins can be executed in parallel
                Might decrease response time but might also increase communication
                costs

      Example – join reduction
Schema
        Projects(PNo, PName, Budget, Location)
                Projects1 = σ_{PNo='P1' ∨ PNo='P2'}(Projects)
                Projects2 = σ_{PNo='P3'}(Projects)
                Projects3 = σ_{PNo='P4'}(Projects)
        Assignment(ENo, PNo, Duration)
                Assignment1 = σ_{PNo='P1' ∨ PNo='P2'}(Assignment)
                Assignment2 = σ_{PNo='P3' ∨ PNo='P4'}(Assignment)
Example query
        select * from Projects p, Assignment a where p.PNo = a.PNo
In relational algebra
        Projects ⋈ Assignment


      Example – join reduction
Query
        Projects ⋈ Assignment
After replacing global relations with reconstruction expressions
        (Projects1 ∪ Projects2 ∪ Projects3) ⋈ (Assignment1 ∪ Assignment2)
After applying the distributive law
        (Projects1 ⋈ Assignment1) ∪ (Projects1 ⋈ Assignment2) ∪
        (Projects2 ⋈ Assignment1) ∪ (Projects2 ⋈ Assignment2) ∪
        (Projects3 ⋈ Assignment1) ∪ (Projects3 ⋈ Assignment2)
Further optimization is possible!

      Query simplification – join reduction
Join reduction rule
        Given fragments of R as FR = {R1, ..., Rn} and fragments of S as
        FS = {S1, ..., Sm}
        Apply distributive law, e.g.:
        (R1 ∪ R2) ⋈ (S1 ∪ S2) = (R1 ⋈ S1) ∪ (R1 ⋈ S2) ∪ (R2 ⋈ S1) ∪ (R2 ⋈ S2)
        All partial joins between fragments Ri and Sj for which Ri ⋈ Sj = ∅
        can be removed
        Ri ⋈ Sj = ∅ ⇐ ∀x ∈ Ri, y ∈ Sj : ¬(pi(x) ∧ pj(y))
        The join between fragments Ri and Sj is empty if their respective
        fragmentation predicates (on the join attribute) contradict, i.e., there
        is no tuple combination x and y such that both partitioning
        predicates are fulfilled at the same time.



      Example – join reduction
Query with fragmentation expression
        (Projects1 ⋈ Assignment1) ∪ (Projects1 ⋈ Assignment2) ∪
        (Projects2 ⋈ Assignment1) ∪ (Projects2 ⋈ Assignment2) ∪
        (Projects3 ⋈ Assignment1) ∪ (Projects3 ⋈ Assignment2)
Some of these partial joins are empty, e.g.:
        Projects1 ⋈ Assignment2 = ∅
Because their fragmentation expressions contradict:
        Projects1 = σ_{PNo='P1' ∨ PNo='P2'}(Projects) and
        Assignment2 = σ_{PNo='P3' ∨ PNo='P4'}(Assignment)
Reduced query
        (Projects1 ⋈ Assignment1) ∪ (Projects2 ⋈ Assignment2) ∪
        (Projects3 ⋈ Assignment2)

      Query simplification – join reduction for horizontal fragmentation
The easiest join reduction case follows from derived horizontal
fragmentation
        For each fragment of the first relation, there is exactly one matching
        fragment of the second relation
        Simply use the information contained in the reconstruction expression
        instead of comparing the fragmentation predicates to each other
Join reduction for arbitrary horizontal partitioning might not be beneficial
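
The join reduction of the Projects/Assignment example can be sketched as follows. As a simplifying assumption, each fragmentation predicate on the join attribute PNo is represented by the set of PNo values it admits; the names `PROJECTS`, `ASSIGNMENT`, and `reduced_joins` are hypothetical.

```python
from itertools import product

# Fragmentation predicates on the join attribute PNo, written as the sets of
# admitted values (an assumption that fits the equality predicates of the example).
PROJECTS = {
    "Projects1": {"P1", "P2"},
    "Projects2": {"P3"},
    "Projects3": {"P4"},
}
ASSIGNMENT = {
    "Assignment1": {"P1", "P2"},
    "Assignment2": {"P3", "P4"},
}


def reduced_joins(left, right):
    """Enumerate all partial joins (distributive law) and drop those whose
    fragmentation predicates contradict, i.e. admit no common PNo value."""
    return [
        (l_name, r_name)
        for (l_name, l_vals), (r_name, r_vals) in product(left.items(), right.items())
        if l_vals & r_vals
    ]


partial_joins = reduced_joins(PROJECTS, ASSIGNMENT)
```

Of the six partial joins, three remain: Projects1 with Assignment1, and Projects2 and Projects3 each with Assignment2.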

      Query simplification – join reduction for derived horizontal fragmentation
Example
        Projects(PNo, PName, Budget, Location)
                Projects1 = σ_{PNo='P1' ∨ PNo='P2'}(Projects)
                Projects2 = σ_{PNo='P3' ∨ PNo='P4'}(Projects)
        Assignment(ENo, PNo, Duration)
                Assignment1 = Assignment ⋉ Projects1
                Assignment2 = Assignment ⋉ Projects2
Query in relational algebra
        Projects ⋈ Assignment
After replacing global relations with reconstruction expressions
        (Projects1 ∪ Projects2) ⋈ (Assignment1 ∪ Assignment2)
After applying the distributive law
        (Projects1 ⋈ Assignment1) ∪ (Projects1 ⋈ Assignment2) ∪
        (Projects2 ⋈ Assignment1) ∪ (Projects2 ⋈ Assignment2)
Reduced query (using information about fragmentation of relation Assignment
directly)
        (Projects1 ⋈ Assignment1) ∪ (Projects2 ⋈ Assignment2)





      Query simplification – vertical reduction
Vertical fragmentation rule
        Given fragments of R as FR = {R1, ..., Rn} with Ri = π_{βi}(R) with
        βi representing the enumeration of a subset of R's attributes
        Avoid joining fragments containing "useless" attributes, i.e.,
        fragments containing only attributes that are not referenced in the
        query and not output in the result

      Example – vertical reduction
Schema
        Projects(PNo, PName, Budget, Location)
                Projects1 = π_{PNo,PName,Location}(Projects)
                Projects2 = π_{PNo,Budget}(Projects)
Reconstruction expression
        Projects = Projects1 ⋈ Projects2
Example query
        π_{PName}(Projects)
After replacing references to global relations
        π_{PName}(Projects1 ⋈ Projects2)
After removing unnecessary fragments
        π_{PName}(Projects1)
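
The vertical reduction of the example can be sketched as a simple attribute check. The sketch assumes that the key PNo is contained in every fragment and that every other attribute is stored in exactly one fragment; `FRAGMENTS`, `KEY`, and `needed_fragments` are hypothetical names.

```python
# Attribute sets of the vertical fragments from the example.
FRAGMENTS = {
    "Projects1": {"PNo", "PName", "Location"},
    "Projects2": {"PNo", "Budget"},
}
KEY = {"PNo"}  # assumed to be replicated in every fragment


def needed_fragments(query_attrs, fragments=FRAGMENTS, key=KEY):
    """Return the fragments that contribute at least one non-key attribute
    referenced by the query; all other fragments are 'useless' for it."""
    wanted = set(query_attrs) - key
    kept = [name for name, attrs in fragments.items() if attrs & wanted]
    # If only key attributes are referenced, any single fragment suffices.
    return kept or [next(iter(fragments))]


only_name = needed_fragments({"PName"})
name_and_budget = needed_fragments({"PName", "Budget"})
```

For the query that projects on PName, only `Projects1` is needed, so the join with `Projects2` disappears. A query that also references Budget needs both fragments.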



      Query simplification – hybrid fragmentation
        The reconstruction expression introduces combinations of joins and
        unions
        General guidelines
                Remove empty relations generated by contradicting selections on
                horizontal fragments
                Remove useless relations generated by vertical fragments
                Break and distribute joins, eliminate empty fragment joins

      Qualified relations
        Supporting algebraic optimization of queries involving fragments
        Annotating fragments and intermediate relations with predicates
        Estimating the size of a relation
        Extension of relational algebra
Definition: qualified relation
A qualified relation is a pair [R : qR] where R is a relation and qR is a
predicate.
Example
Representing horizontal fragments as qualified relations where the
qualification predicate corresponds to the fragmentation expression
        [Projects : σ_{PNo='P1' ∨ PNo='P2'}]
             Katja Hose                    Distributed Database Systems       November 10, 2011   77 / 167                Katja Hose                    Distributed Database Systems   November 17, 2011   78 / 167






      Qualified relations
Extended relational algebra
(1)     E   :=      σ_F [R : qR]                   → [E : F ∧ qR]
(2)     E   :=      π_A [R : qR]                   → [E : qR]
(3)     E   :=      [R : qR] × [S : qS]            → [E : qR ∧ qS]
(4)     E   :=      [R : qR] − [S : qS]            → [E : qR]
(5)     E   :=      [R : qR] ∪ [S : qS]            → [E : qR ∨ qS]
(6)     E   :=      [R : qR] ⋈_F [S : qS]          → [E : qR ∧ qS ∧ F]

      Qualified relations
Example query
        σ_{100.000≤Budget≤200.000}(Projects)
Qualified relations
        E1  =  σ_{100.000≤Budget≤200.000} [Projects1 : Budget ≤ 150.000]
               [E1 : (100.000 ≤ Budget ≤ 200.000) ∧ (Budget ≤ 150.000)]
               [E1 : 100.000 ≤ Budget ≤ 150.000]
        E2  =  σ_{100.000≤Budget≤200.000} [Projects2 : 150.000 < Budget ≤ 200.000]
               [E2 : (100.000 ≤ Budget ≤ 200.000) ∧
               (150.000 < Budget ≤ 200.000)]
               [E2 : 150.000 < Budget ≤ 200.000]
        E3  =  σ_{100.000≤Budget≤200.000} [Projects3 : Budget > 200.000]
               [E3 : (100.000 ≤ Budget ≤ 200.000) ∧ (Budget > 200.000)]
               E3 = ∅
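
Rule (1) of the extended algebra can be traced in code for the Budget example. This is a minimal sketch that assumes every qualification is a single range on Budget; the `Interval` class and the `select` function are hypothetical and not part of the lecture.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Qualification predicate as a range on Budget (assumption for this sketch)."""
    lo: float
    hi: float
    lo_closed: bool = True
    hi_closed: bool = True

    def intersect(self, other: "Interval") -> "Interval":
        """Conjunction of two range predicates: keep the tighter bound on each side."""
        if self.lo != other.lo:
            tight = self if self.lo > other.lo else other
            lo, lo_closed = tight.lo, tight.lo_closed
        else:
            lo, lo_closed = self.lo, self.lo_closed and other.lo_closed
        if self.hi != other.hi:
            tight = self if self.hi < other.hi else other
            hi, hi_closed = tight.hi, tight.hi_closed
        else:
            hi, hi_closed = self.hi, self.hi_closed and other.hi_closed
        return Interval(lo, hi, lo_closed, hi_closed)

    def is_empty(self) -> bool:
        if self.lo > self.hi:
            return True
        return self.lo == self.hi and not (self.lo_closed and self.hi_closed)


def select(f: Interval, qualification: Interval) -> Interval:
    """Rule (1): sigma_F [R : qR] -> [E : F AND qR]."""
    return f.intersect(qualification)


QUERY = Interval(100_000, 200_000)  # 100.000 <= Budget <= 200.000
FRAGMENTS = {
    "Projects1": Interval(-math.inf, 150_000, False, True),
    "Projects2": Interval(150_000, 200_000, False, True),
    "Projects3": Interval(200_000, math.inf, False, False),
}
results = {name: select(QUERY, q) for name, q in FRAGMENTS.items()}
```

The qualification of the third result is empty, so the subquery on `Projects3` can be dropped without accessing the fragment.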


Distributed Database Systems                                                                 Distributed Database Systems
  Global query optimization                                                                    Global query optimization



1    Motivation
2    Detour on centralized query processing
       Translating SQL into relational algebra
       Phases of centralized query processing
       Query parsing
       Query transformation
       Query optimization
3    Basics of distributed query processing
       Phases of distributed query processing
       Introduction
       Meta data management
       Data localization
4    Global query optimization
       Main questions
       Global query optimizer
       Distributed cost model
       Join order optimization
       Total time models
       Response time models
5    Summary




Distributed Database Systems                                                                 Distributed Database Systems
  Global query optimization                                                                    Global query optimization
    Main questions                                                                               Main questions


      Workflow for distributed query processing

      Introduction to global query optimization




                                                                                             Main questions
                                                                                                     When to optimize?
                                                                                                     What criteria to optimize?
                                                                                                     Where to execute the query?






      When to optimize?
Full compile time optimization
        The full query execution plan is computed at compile time
        Assumption
               Applications use canned queries
               Prepared and parameterized SQL statements
        Pros
               Queries can be executed directly
        Cons
               Complex to model
               Much information unknown or too expensive to gather
               Collecting statistics on all nodes?
               Statistics outdated
               Especially machine load and network properties are very volatile

      When to optimize?
Fully dynamic optimization
        Each query is optimized individually at runtime
        This technique heavily relies on heuristics, learning algorithms, and
        luck
        Pros
               Might produce very good plans
               Uses current network state
               Also usable for ad-hoc queries
        Cons
               Result quality might be very unpredictable
               Complex algorithms and heuristics
               Difficult to keep statistics up-to-date



      When to optimize?
Semi-dynamic optimization
        Pre-optimize the query
        During query execution, test whether execution runs as expected during
        optimization,
        e.g., are tuples/fragments delivered in time? Does the network adhere
        to the predicted properties? Are there any bad network latencies?
        If execution shows severe deviations, compute a new query plan for all
        parts that have not yet been executed
Only makes sense for queries that run for a longer time

      When to optimize?
Hierarchical optimization
        Plans are created in multiple stages
        Global-Local-Plans
               Global query optimizer creates a global query plan
               Focus on data transfer: which intermediate results are to be computed
               by which node? How should intermediate results be shipped?
               Local query optimizers create local query plans
               Decide on query plan layout, algorithms, indexes, etc. to deliver the
               requested intermediate result
        Two-Step-Plans
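
The control loop behind semi-dynamic optimization can be sketched as follows. Everything in this sketch is an assumption made for illustration: the plan representation as (step, predicted time) pairs, the `tolerance` threshold, and the `observe` and `reoptimize` callbacks are hypothetical and stand in for a real executor and optimizer.

```python
def execute(plan, observe, reoptimize, tolerance=2.0):
    """Run a pre-optimized plan step by step.

    plan:       list of (step_name, predicted_ms) pairs
    observe:    callback returning the measured time of a step in ms
    reoptimize: callback computing a new plan for the remaining steps
    """
    done = []
    remaining = list(plan)
    while remaining:
        step, predicted_ms = remaining.pop(0)
        observed_ms = observe(step)
        done.append(step)
        # Severe deviation from the prediction: re-plan only the parts
        # that have not been executed yet.
        if remaining and observed_ms > tolerance * predicted_ms:
            remaining = reoptimize(remaining)
    return done


# Hypothetical measurements: shipping Projects1 is much slower than predicted.
measured = {"scan_P1": 12, "ship_P1": 90, "join": 25, "ship_result": 5}
reoptimized = []


def reoptimize(rest):
    reoptimized.append([name for name, _ in rest])
    return rest  # placeholder: a real optimizer would return a new plan


plan = [("scan_P1", 10), ("ship_P1", 20), ("join", 30), ("ship_result", 5)]
done = execute(plan, measured.__getitem__, reoptimize)
```

Re-planning is triggered once, after the slow shipping step, and only covers the two steps that are still pending.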





      When to optimize?
Hierarchical optimization
        Plans are created in multiple stages
        Global-Local-Plans
        Two-Step-Plans
               During compile time, only stable parts of the plan are computed
               Join order, join methods, access paths, etc.
               During query execution, all missing plan elements are added
               Node selection, transfer policies, etc.
               Both steps can be performed using traditional query optimization
               techniques
                        Plan enumeration with dynamic programming
                        Complexity is manageable as each optimization problem is much easier
                        than a full optimization
                        During runtime optimization, fresh statistics are available
Most distributed database management systems use semi-dynamic or
hierarchical optimization techniques (or both)

      What criteria to optimize?
Important aspects for global optimization
        Communication operators
        Fragment cardinalities
        Order of operations
        Join ordering
        Because permutations of the joins within the query may lead to
        improvements of orders of magnitude
Most important alternative optimization criteria
        Query response time
        Resource consumption
        Total query execution costs
        ...
            Katja Hose                  Distributed Database Systems   November 17, 2011   89 / 167               Katja Hose           Distributed Database Systems   November 17, 2011   90 / 167




  Global query optimization
    Main questions

      Where to execute the query?
        Query optimizer has to decide which parts of the query have to be
        shipped to which node (cost model)
        In heavily replicated scenarios, clever hybrid shipping can effectively
        be used for load balancing
        Move expensive computations to lightly loaded nodes, avoid
        expensive communication

      Global query optimization
Global query optimization...
... deals with finding the “best” ordering of operations in the query
(extended by fragmentation expressions and including communication
operations) that minimizes a cost function.
        Input
        an algebraic query extended by fragmentation expressions
        Output
        an algebraic query or query execution plan with communication
        operations
  Global query optimization
    Global query optimizer

      Basics of global query optimization
Objective
        Choose a cost efficient execution plan based on the algebraic query
        plan given as input
        Decide which parts of the query have to be transferred to which node
Prerequisites
        Knowledge about fragmentation
        Knowledge about fragment/relation sizes
        Knowledge about data distribution
        Knowledge about costs of operations

      Optimizer components
The global optimizer has three main components
        The search space
        Set of alternative equivalent execution plans to represent the input
        query
        The cost model
        Predicts the costs of a given query execution plan
        The search strategy
        Explores the search space and selects the best plan




  Global query optimization
    Global query optimizer

      Phases of optimization
Phases
  1   Spanning the search space using
      transformation rules
      → equivalent search plans
  2   Applying a search strategy and a
      cost model
      → choose an efficient plan
Main focus: join trees and join
ordering

      Search space
Query
     SELECT EName, Title
     FROM Employees e, Assignment a, Project p
     WHERE e.EID = ENo AND a.PNo = p.PNo
[Figure: equivalent join trees]
O(N!) different join trees by applying commutativity and associativity
rules for N relations
  Global query optimization
    Global query optimizer

      Search space
Tree variants for join order optimization
        Linear join trees
               All inner nodes have at least one leaf node (base relation) as child
               Reduces search space
        Bushy trees
               May have inner nodes with no base relation as child
               High potential for parallelization
        linear join tree: ((R3 ⋈ R4) ⋈ R2) ⋈ R1
        bushy join tree: (R1 ⋈ R2) ⋈ (R3 ⋈ R4)

      Search strategies
A search strategy needs to reduce the search space
        Applying heuristics (similar to centralized algebraic optimization)
               Perform projections and selections when accessing base relations
               Avoid Cartesian products, enforce joins
        Applying further heuristics influencing the shape of the join tree
               Reducing the size of the search space vs. exhibiting parallelism
               Linear vs. bushy trees
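The effect of restricting the tree shape on the search space can be made concrete with a small counting sketch. It rests on simplifying assumptions that are not on the slides: linear trees are counted as left-deep trees only (one per permutation of the relations), and bushy trees are counted as ordered binary trees with N labeled leaves, i.e. Catalan(N-1) shapes times N! labelings. The function names are hypothetical.

```python
from math import factorial


def linear_join_trees(n: int) -> int:
    """Left-deep join trees for n relations: one per permutation."""
    return factorial(n)


def bushy_join_trees(n: int) -> int:
    """Ordered binary join trees with n labeled leaves.

    Catalan(n-1) tree shapes times n! leaf labelings,
    which simplifies to (2(n-1))! / (n-1)!.
    """
    return factorial(2 * (n - 1)) // factorial(n - 1)


for n in (2, 4, 6, 8):
    print(n, linear_join_trees(n), bushy_join_trees(n))
```

For four relations this gives 24 left-deep trees versus 120 trees when bushy shapes are allowed, which is why heuristics on the tree shape matter.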




  Global query optimization
    Global query optimizer

      Search strategies
Deterministic search strategy
        Systematic generation of query plans
        Starting with plans accessing the base relations
        Constructing complex plans by combining easier plans, e.g., joining
        one more relation at each step
Exhaustive search guarantees finding the best plan

      Search strategies
Example deterministic search strategies
        Dynamic programming
               (Almost) exhaustive search by building all possible plans (breadth-first)
               “Very bad” partial plans are pruned at an early stage
               Guaranteed to find the best plan
               Only possible for a small number (5-6) of relations
        Greedy algorithm
               Only one plan is built (depth-first)
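The difference between the two deterministic strategies can be sketched on a toy instance. Everything concrete here is an assumption for illustration: the statistics in CARD and SEL are made up, only left-deep plans are considered, the cost of a plan is taken to be the sum of its intermediate result cardinalities, and the dynamic programming variant keeps the best plan per subset rather than applying any additional pruning.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical statistics: base cardinalities and join selectivity factors.
CARD = {"A": 1000, "B": 10, "C": 100, "D": 20}
SEL = {
    frozenset(("A", "B")): Fraction(1, 100),
    frozenset(("B", "C")): Fraction(1, 5),
    frozenset(("C", "D")): Fraction(1, 5),
}


def card(rels):
    """Estimated size of joining rels; pairs without a predicate act as Cartesian products."""
    size = Fraction(1)
    for r in rels:
        size *= CARD[r]
    for pair in combinations(sorted(rels), 2):
        size *= SEL.get(frozenset(pair), 1)
    return size


def dp_plan(relations):
    """Dynamic programming over subsets: (cost, join order) of the cheapest left-deep plan."""
    best = {frozenset((r,)): (Fraction(0), (r,)) for r in relations}
    for k in range(2, len(relations) + 1):
        for subset in map(frozenset, combinations(relations, k)):
            out = card(subset)
            # Extend the best plan of every (k-1)-subset by the missing relation.
            best[subset] = min(
                (best[subset - {r}][0] + out, best[subset - {r}][1] + (r,))
                for r in subset
            )
    return best[frozenset(relations)]


def greedy_plan(relations):
    """Builds a single plan, always choosing the smallest next intermediate result."""
    first = min(combinations(relations, 2), key=card)
    order, cost = list(first), card(first)
    remaining = [r for r in relations if r not in first]
    while remaining:
        nxt = min(remaining, key=lambda r: card(order + [r]))
        order.append(nxt)
        remaining.remove(nxt)
        cost += card(order)
    return cost, tuple(order)


dp_cost, dp_order = dp_plan("ABCD")
greedy_cost, greedy_order = greedy_plan("ABCD")
print("DP:    ", dp_order, dp_cost)
print("greedy:", greedy_order, greedy_cost)
```

With these numbers the greedy plan starts with the smallest pair (A, B) and ends at cost 10100, while dynamic programming finds a plan of cost 9000 that starts with a larger pair.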
  Global query optimization
    Global query optimizer

      Search strategies
Randomized search strategy
        One or more start plans using a greedy strategy (depth-first search)
        Improving start plans by examining “neighbor plans”
        Neighbor plan: applying transformation rules, e.g., exchanging two
        arbitrarily chosen operations
        Better performance than deterministic strategies for a higher number of relations
No guarantee to find the best plan

  Global query optimization
    Distributed cost model

      Distributed cost model
Components
        Cost functions
        Estimating costs to execute operations
        Statistics
        Data about relation sizes, attribute domains, value distribution, etc.
        Formulas
        Determine cardinalities, sizes of intermediate results, etc.
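A randomized strategy of the kind described above can be sketched as simple iterative improvement. This is one possible reading, not the method of any particular system: the statistics are made up, a plan is a left-deep join order, a neighbor is obtained by exchanging two arbitrarily chosen relations, and the cost is the sum of intermediate result cardinalities. The fixed seed only makes the run repeatable.

```python
import random
from fractions import Fraction
from itertools import combinations

# Hypothetical statistics: base cardinalities and join selectivity factors.
CARD = {"A": 1000, "B": 10, "C": 100, "D": 20}
SEL = {
    frozenset(("A", "B")): Fraction(1, 100),
    frozenset(("B", "C")): Fraction(1, 5),
    frozenset(("C", "D")): Fraction(1, 5),
}


def card(rels):
    """Estimated size of joining rels; pairs without a predicate act as Cartesian products."""
    size = Fraction(1)
    for r in rels:
        size *= CARD[r]
    for pair in combinations(sorted(rels), 2):
        size *= SEL.get(frozenset(pair), 1)
    return size


def plan_cost(order):
    """Sum of the intermediate result sizes of a left-deep plan."""
    return sum(card(order[:k]) for k in range(2, len(order) + 1))


def iterative_improvement(start, steps=200, seed=0):
    """Repeatedly try a random neighbor and keep it only if it is cheaper."""
    rng = random.Random(seed)
    best, best_cost = list(start), plan_cost(start)
    for _ in range(steps):
        i, j = rng.sample(range(len(best)), 2)
        neighbor = best[:]
        neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
        cost = plan_cost(neighbor)
        if cost < best_cost:
            best, best_cost = neighbor, cost
    return best_cost, tuple(best)


start = ("A", "B", "C", "D")  # stands in for a greedy start plan
cost, order = iterative_improvement(start)
print(plan_cost(start), "->", cost, order)
```

The result is never worse than the start plan, but nothing guarantees that it is the best plan.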




  Global query optimization
    Distributed cost model

      Cost functions
Total execution time
        Sum of all costs, i.e., the sum of all processing times at all nodes
        involved in answering the query

               T_total = T_CPU · #insts + T_I/O · #ops_I/O + T_MSG · #msgs + T_TR · #bytes

               T_CPU: time to process a CPU instruction
               T_I/O: time for a disk access
               T_MSG: time to send and receive a message
               T_TR: time to transmit a data unit from one node to another
               #bytes is the sum of the sizes of all messages
               Typical assumption: T_TR is constant, although it might not be true
               for remote nodes

      Cost functions
Components of total execution time
        Local processing costs/time

               T_local = T_CPU · #insts + T_I/O · #ops_I/O

        Communication costs/time

               T_comm = T_MSG · #msgs + T_TR · #bytes

        Coefficients (T_CPU, T_I/O, T_MSG, T_TR) characterize a specific
        distributed database system
        WAN (Wide Area Network): communication time is dominant
        LAN (Local Area Network): local costs also play an important role
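The cost functions above translate directly into code. The sketch below is only an illustration: the class and function names are hypothetical, and the coefficient values for the WAN-like and LAN-like settings are invented to show how the balance between local and communication time shifts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostCoefficients:
    t_cpu: float  # time to process a CPU instruction
    t_io: float   # time for a disk access
    t_msg: float  # time to send and receive a message
    t_tr: float   # time to transmit a data unit (here: one byte)


def local_time(c, insts, io_ops):
    """T_local = T_CPU * #insts + T_I/O * #ops_I/O"""
    return c.t_cpu * insts + c.t_io * io_ops


def comm_time(c, msgs, nbytes):
    """T_comm = T_MSG * #msgs + T_TR * #bytes"""
    return c.t_msg * msgs + c.t_tr * nbytes


def total_time(c, insts, io_ops, msgs, nbytes):
    """T_total = T_local + T_comm"""
    return local_time(c, insts, io_ops) + comm_time(c, msgs, nbytes)


# Hypothetical coefficients in seconds.
wan = CostCoefficients(t_cpu=1e-8, t_io=1e-2, t_msg=1e-1, t_tr=1e-5)
lan = CostCoefficients(t_cpu=1e-8, t_io=1e-2, t_msg=1e-4, t_tr=1e-8)

# Hypothetical workload of one query.
insts, io_ops, msgs, nbytes = 10_000_000, 100, 20, 1_000_000

for name, c in (("WAN", wan), ("LAN", lan)):
    print(name, local_time(c, insts, io_ops), comm_time(c, msgs, nbytes),
          total_time(c, insts, io_ops, msgs, nbytes))
```

Under these assumed values communication accounts for most of the total time in the WAN setting, while local processing dominates in the LAN setting.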
  Global query optimization
    Distributed cost model

      Cost functions
Response time
        Time that elapses between query initiation and completion
        Considering parallel local processing and parallel communication

               T_response = T_CPU · seq_#insts + T_I/O · seq_#ops_I/O + T_MSG · seq_#msgs + T_TR · seq_#bytes

        where seq_#x represents the maximum number of instructions
        (insts), I/O operations (ops_I/O), messages (msgs), or bytes (bytes)
        that have to be processed sequentially

      Total time vs. response time
Communication costs

               T_comm,total = 2 · T_MSG + T_TR · (x + y)
               T_comm,response = max{T_MSG + T_TR · x, T_MSG + T_TR · y}

Minimizing response time does not imply that the total time is also
minimized!
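A small numeric sketch shows why the two objectives can disagree. It assumes that each transfer is one message, that all transfers of a plan run fully in parallel, and that T_MSG and T_TR take the made-up values below. The two plans being compared are hypothetical as well.

```python
T_MSG = 10  # hypothetical time units per message
T_TR = 1    # hypothetical time units per transferred data unit


def comm_total(transfers):
    """Total time: every message and every data unit is counted."""
    return sum(T_MSG + T_TR * size for size in transfers)


def comm_response(transfers):
    """Response time with fully parallel transfers: the slowest one decides."""
    return max(T_MSG + T_TR * size for size in transfers)


parallel_plan = [100, 100]  # two nodes ship x = 100 and y = 100 units in parallel
single_plan = [150]         # one node ships 150 units in a single transfer

for name, plan in (("parallel", parallel_plan), ("single", single_plan)):
    print(name, comm_total(plan), comm_response(plan))
```

The parallel plan has the lower response time (110 versus 160) but the higher total time (220 versus 160), so the better plan depends on which criterion is optimized.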




  Global query optimization
    Distributed cost model

      Statistics
Good statistics are crucial
        Most important cost factor:
        Size of intermediate results produced during execution
        Estimating sizes using statistics and formulas
        Tradeoff between precision and costs of managing statistics

      Typical statistics
Typical statistics for relation R fragmented as R_1, R_2, ..., R_r with
attributes A_1, ..., A_n
        Length of each attribute A_i in terms of bytes: length(A_i)
        Number of distinct values for each attribute A_i and for each fragment
        R_j: values_{A_i,R_j} := card(π_{A_i}(R_j))
        Minimum and maximum attribute values: min(A_i) and max(A_i)
        Number of distinct values (cardinality) of the attribute domains:
        card(dom[A_i])
        Number of tuples in each fragment R_j: card(R_j)
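As a minimal illustration of how such statistics could be collected, the sketch below computes them for one attribute of one fragment. The representation of a fragment as a list of dictionaries, the sample data, and the function name are assumptions. Minimum and maximum are computed per fragment here for simplicity.

```python
def fragment_statistics(fragment, attribute):
    """Collect card(R_j), values_{A,R_j}, min(A) and max(A) for one fragment."""
    column = [row[attribute] for row in fragment]
    return {
        "card": len(fragment),        # number of tuples in the fragment
        "values": len(set(column)),   # number of distinct attribute values
        "min": min(column),
        "max": max(column),
    }


# Hypothetical fragment of an Employees relation.
employees_1 = [
    {"EID": 1, "Salary": 3000},
    {"EID": 2, "Salary": 4500},
    {"EID": 3, "Salary": 3000},
]

stats = fragment_statistics(employees_1, "Salary")
print(stats)
```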
  Global query optimization
    Distributed cost model

      Additional statistics
Additional statistics
        Histogram for each attribute A_i to approximate the frequency
        distribution
        Join selectivity factor for some pairs of relations

               SF_J(R, S) = card(R ⋈ S) / (card(R) · card(S))

        good (high) selectivity: SF_J = 0.001
        bad (low) selectivity: SF_J = 0.5

      Cardinality estimation
Assumptions
        Independence between attributes
        Uniform distribution of attribute values
Selectivity
        Ratio between expected number of result tuples and tuples of the
        input relation

               SF = expected result size / cardinality of the input relation

        Example: σ_F(R) returns 10% of R’s tuples, i.e., SF_S(F, R) = 0.1
        (SF: selectivity factor)
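The two selectivity factors can be computed from observed sizes and then reused for estimation. The following sketch uses invented cardinalities and hypothetical function names. Exact fractions are used only to keep the arithmetic free of rounding effects.

```python
from fractions import Fraction


def join_selectivity(card_join, card_r, card_s):
    """SF_J(R, S) = card(R join S) / (card(R) * card(S))"""
    return Fraction(card_join, card_r * card_s)


def selection_selectivity(card_result, card_input):
    """SF = expected result size / cardinality of the input relation"""
    return Fraction(card_result, card_input)


def estimate_join(sf_j, card_r, card_s):
    """Estimated join size from a stored join selectivity factor."""
    return sf_j * card_r * card_s


# Hypothetical observation: R has 1000 tuples, S has 2000, the join returned 2000.
sf_j = join_selectivity(2000, 1000, 2000)
# Hypothetical observation: a selection kept 100 of 1000 tuples.
sf_s = selection_selectivity(100, 1000)

print(float(sf_j), float(sf_s))
# Reusing the stored factor after both relations have doubled in size.
print(estimate_join(sf_j, 2000, 4000))
```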




  Global query optimization
    Distributed cost model

      Cardinality estimation
Assumptions
        Independence between attributes
        Uniform distribution of attribute values
Cardinality
        Estimate result size (cardinality of the output relation)
        Example: SF_S(F, R) = 0.1

               card(σ_F(R)) = SF_S(F, R) · card(R)

      Selection
Cardinality

               card(σ_F(R)) = SF_S(F, R) · card(R)

Selectivity
        Selectivity depends on selection predicates p(A) and constants v

               SF_S(A = v, R) = 1 / values_{A,R} = 1 / card(π_A(R))
               SF_S(A < v, R) = (v − min(A)) / (max(A) − min(A))
               SF_S(A > v, R) = (max(A) − v) / (max(A) − min(A))
               SF_S(v_1 < A < v_2, R) = (v_2 − v_1) / (max(A) − min(A))
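The selectivity formulas for simple predicates can be written down as small functions. This is a sketch under the stated assumptions of uniformly distributed values. The function names and the sample statistics are hypothetical.

```python
def sf_equals(distinct_values):
    """SF_S(A = v, R) = 1 / values_{A,R}"""
    return 1 / distinct_values


def sf_less(v, a_min, a_max):
    """SF_S(A < v, R) = (v - min(A)) / (max(A) - min(A))"""
    return (v - a_min) / (a_max - a_min)


def sf_greater(v, a_min, a_max):
    """SF_S(A > v, R) = (max(A) - v) / (max(A) - min(A))"""
    return (a_max - v) / (a_max - a_min)


def sf_between(v1, v2, a_min, a_max):
    """SF_S(v1 < A < v2, R) = (v2 - v1) / (max(A) - min(A))"""
    return (v2 - v1) / (a_max - a_min)


def estimate_selection(sf, card_r):
    """card(sigma_F(R)) = SF_S(F, R) * card(R)"""
    return sf * card_r


# Hypothetical statistics for attribute A of R.
card_r, distinct, a_min, a_max = 10_000, 50, 0, 100

print(estimate_selection(sf_equals(distinct), card_r))
print(estimate_selection(sf_less(25, a_min, a_max), card_r))
print(estimate_selection(sf_greater(25, a_min, a_max), card_r))
print(estimate_selection(sf_between(25, 75, a_min, a_max), card_r))
```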
  Global query optimization
    Distributed cost model

      Selection
Cardinality

               card(σ_F(R)) = SF_S(F, R) · card(R)

Selectivity
        Selectivity depends on selection predicates p(A) and constants v

               SF_S(p(A_i) ∧ p(A_j), R) = SF_S(p(A_i), R) · SF_S(p(A_j), R)
               SF_S(p(A_i) ∨ p(A_j), R) = SF_S(p(A_i), R) + SF_S(p(A_j), R) − (SF_S(p(A_i), R) · SF_S(p(A_j), R))

      Projection
Cardinality
        Without duplicate elimination

               card(π_A(R)) = card(R)

        With duplicate elimination (if defined on an arbitrary attribute A):

               card(π_A(R)) = values_{A,R}

        With duplicate elimination (if one of the attributes is the primary key):

               card(π_{A_i,...}(R)) = card(R)

Cardinalities for projections on arbitrary combinations of attributes are
hard to predict because attribute correlations are unknown
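Combining the selectivities of two predicates follows from the independence assumption. The sketch below uses made-up selectivities and hypothetical function names.

```python
def sf_and(sf_i, sf_j):
    """Conjunction of two independent predicates: product of the selectivities."""
    return sf_i * sf_j


def sf_or(sf_i, sf_j):
    """Disjunction: sum of the selectivities minus the overlap counted twice."""
    return sf_i + sf_j - sf_i * sf_j


# Hypothetical selectivities of p(A_i) and p(A_j) on a relation with 8000 tuples.
sf_i, sf_j, card_r = 0.5, 0.25, 8000

print(sf_and(sf_i, sf_j) * card_r)  # estimated size for p(A_i) AND p(A_j)
print(sf_or(sf_i, sf_j) * card_r)   # estimated size for p(A_i) OR p(A_j)
```

If the attributes are correlated, the product overestimates or underestimates the true selectivity, which is the same limitation noted above for projections.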




  Global query optimization
    Distributed cost model

      Cartesian product
Cardinality

               card(R × S) = card(R) · card(S)

      Joins
Cardinality
        Given: R ⋈ S with R(A, B) and S(B, C)
        Upper bound: size of the Cartesian product
Natural join on attribute B
        No B values shared between R and S:

               card(R ⋈ S) = 0

        Foreign key relationship R.B → S.B:

               card(R ⋈ S) = card(R)

        All tuples in R.B and S.B have the same value:

               card(R ⋈ S) = card(R) · card(S)
Distributed Database Systems                                                                           Distributed Database Systems
  Global query optimization                                                                              Global query optimization
    Distributed cost model                                                                                 Distributed cost model


      Joins

Cardinality
        Given: R ⋈ S with R(A, B) and S(B, C)
        Upper bound: size of the Cartesian product
Natural join on attribute B
        Estimate
                card(R ⋈ S) = card(R) · card(S) / max{values_{B,R}, values_{B,S}}
        Store statistics (join selectivity factor SF_J) for important joins
                card(R ⋈ S) = SF_J · card(R) · card(S)

      Union and Difference

Cardinality
        Difficult to estimate because duplicates are removed
        Union
                Upper bound: card(R ∪ S) = card(R) + card(S)
                Lower bound: card(R ∪ S) = max{card(R), card(S)}
        Difference
                Upper bound: card(R − S) = card(R)
                Lower bound: card(R − S) = 0
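The cardinality formulas for joins, unions and differences can be tried out with a few lines of Python. This is only a minimal sketch: the function names and the example numbers are assumptions made for illustration and do not come from the slides.

```python
def join_card_estimate(card_r, card_s, values_b_r, values_b_s):
    """card(R ⋈ S) ≈ card(R) · card(S) / max{values_{B,R}, values_{B,S}}."""
    return card_r * card_s / max(values_b_r, values_b_s)


def join_card_from_sf(sf_j, card_r, card_s):
    """card(R ⋈ S) = SF_J · card(R) · card(S), with SF_J taken from stored statistics."""
    return sf_j * card_r * card_s


def union_card_bounds(card_r, card_s):
    """(lower, upper) bound for card(R ∪ S)."""
    return max(card_r, card_s), card_r + card_s


def difference_card_bounds(card_r, card_s):
    """(lower, upper) bound for card(R − S) as stated on the slide."""
    return 0, card_r


# Hypothetical statistics: 1000 tuples in R, 200 in S,
# 50 distinct B values in R and 200 distinct B values in S.
estimate = join_card_estimate(1000, 200, 50, 200)   # 1000.0
union_lo, union_hi = union_card_bounds(1000, 200)   # (1000, 1200)
```

The first estimate assumes uniformly distributed join values; a stored SF_J replaces that assumption with measured statistics.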



      Selectivity estimation using histograms

Histograms
        In reality, the distribution of attribute values in a relation is often not uniform
        Histograms consist of a set of buckets b_i
Example histogram on attribute A of relation R
        Each bucket b_i is defined by
                Range: range_i
                Range of values in the attribute domain dom[A]
                Frequency: f_i
                Number of tuples of R where R.A ∈ range_i
                Distinct values: d_i
                Number of distinct values of A where R.A ∈ range_i

Equality predicate
        Given predicate A = v
        Identify bucket b_i with v ∈ range_i
                SF_S(A = v, R) = 1 / d_i
                card(σ_{A=v}(R)) = SF_S(A = v, R) · f_i = f_i / d_i
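The equality-predicate estimate f_i / d_i can be illustrated with a small Python sketch. The `Bucket` class, the helper names and the example histogram are hypothetical; the slides only define range_i, f_i and d_i.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    lo: float      # min(range_i), inclusive
    hi: float      # max(range_i), inclusive
    freq: int      # f_i: number of tuples with R.A in range_i
    distinct: int  # d_i: number of distinct A values in range_i


def find_bucket(hist, v):
    """Return the bucket b_i with v in range_i, or None if no bucket covers v."""
    for b in hist:
        if b.lo <= v <= b.hi:
            return b
    return None


def card_equals(hist, v):
    """Estimate card(σ_{A=v}(R)) = SF_S(A = v, R) · f_i = f_i / d_i."""
    b = find_bucket(hist, v)
    if b is None or b.distinct == 0:
        return 0.0
    return b.freq / b.distinct


# Hypothetical histogram with two buckets on attribute A.
hist = [Bucket(0, 9, 40, 8), Bucket(10, 19, 100, 5)]
estimate = card_equals(hist, 12)  # 100 / 5 = 20.0
```

Values outside every bucket are estimated as 0 tuples, which is a design choice of this sketch rather than something the slides prescribe.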

      Selectivity estimation using histograms

Range predicates
        Given predicate A ≤ v
        Identify buckets that overlap the queried range
        Sum up frequencies
                card(σ_{A≤v}(R)) = Σ_{j=1}^{i−1} f_j + (v − min(range_i)) / (max(range_i) − min(range_i)) · f_i
        Bucket i only partially overlaps the queried range

    Join order optimization

      Phases of optimization

Phases
  1   Spanning the search space using transformation rules
      → equivalent search plans
  2   Applying a search strategy and a cost model
      → choose an efficient plan
Main focus: join trees and join ordering
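The histogram estimate for a range predicate A ≤ v (full buckets plus a linear share of the partially overlapping bucket) can be sketched in Python as follows. The `Bucket` class, the function name and the example buckets are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    lo: float  # min(range_i)
    hi: float  # max(range_i)
    freq: int  # f_i


def card_less_equal(hist, v):
    """Estimate card(σ_{A≤v}(R)) from a histogram on attribute A."""
    total = 0.0
    for b in sorted(hist, key=lambda b: b.lo):
        if b.hi <= v:
            # Bucket lies completely inside the queried range: add f_j.
            total += b.freq
        elif b.lo <= v:
            # Bucket i overlaps only partially: add a linear share of f_i.
            width = b.hi - b.lo
            share = (v - b.lo) / width if width > 0 else 1.0
            total += share * b.freq
    return total


# Hypothetical histogram with three buckets of width 10.
hist = [Bucket(0, 10, 40), Bucket(10, 20, 100), Bucket(20, 30, 60)]
estimate = card_less_equal(hist, 15)  # 40 + 0.5 * 100 = 90.0
```

The linear share assumes that values are spread uniformly inside a bucket, which is the same assumption the formula makes.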





      Join order optimization

Simplifying assumptions
        No distinction between fragments and relations
        Ignoring local processing time
        Ignoring other operations (selection, projection)
        No pipelining
        Ignoring data transfer to the result site

      Join order optimization for two relations

Determine the join order for two relations R ⋈ S
Transfer the smaller relation to minimize the network load






      Join order optimization for three relations

Determine the join order for three relations R ⋈_A S ⋈_B T
Possible orders
    1          nodeR: send R to nodeS
               nodeS: compute join R′ = R ⋈ S, send R′ to nodeT
               nodeT: compute join R′ ⋈ T
    2          nodeS: send S to nodeR
               nodeR: compute join R′ = R ⋈ S, send R′ to nodeT
               nodeT: compute join R′ ⋈ T
    3          nodeS: send S to nodeT
               nodeT: compute join S′ = S ⋈ T, send S′ to nodeR
               nodeR: compute join S′ ⋈ R
    4          nodeT: send T to nodeS
               nodeS: compute join S′ = S ⋈ T, send S′ to nodeR
               nodeR: compute join S′ ⋈ R
    5          nodeT: send T to nodeS
               nodeR: send R to nodeS
               nodeS: compute join R ⋈ S ⋈ T
Decision
    Based on the sizes of the base relations and intermediate results
    Perhaps exploiting parallelism of alternative 5

      Join order optimization with semijoins

Considering semijoins for joining two relations R (at nodeR) and S (at nodeS) results in three alternatives, assuming A is the join attribute
  1   R ⋈_A S = (R ⋉_A S) ⋈_A S = (R ⋉_A π_A(S)) ⋈_A S
  2   R ⋈_A S = R ⋈_A (S ⋉_A R)
  3   R ⋈_A S = (R ⋉_A S) ⋈_A (S ⋉_A R)
Workflow for alternative 1
    nodeS: compute S′ = π_A(S), send S′ to nodeR
    nodeR: compute R′ = R ⋉_A S′, send R′ to nodeS
    nodeS: compute R′ ⋈_A S
Transfer costs (neglecting T_MSG)
    T_TR · card(π_A(S)) + T_TR · card(R ⋉_A S′)
    Considering full joins (R ⋈_A S) only and assuming that card(R) < card(S), the complete relation R would have been sent to nodeS, costs: T_TR · card(R)

      Semijoins vs. joins

Conclusion
    Transfer costs semijoin: T_TR · card(π_A(S)) + T_TR · card(R ⋉_A S)
    Transfer costs standard join: T_TR · card(R)
The semijoin is preferable if
    card(π_A(S)) + card(R ⋉_A S) < card(R)

      Total time models

Basic strategy
    Coordinator (master) site
    Exhaustive search
    Optimization objective: total time
Input
    Relational algebra tree
    Cost model
    Statistics
    Location of relations
Output
    Optimized query execution plan
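The join-ordering decisions from the preceding slides (five transfer orders for R ⋈ S ⋈ T, and semijoin versus full join) can be compared with a short Python sketch. It follows the simplifying assumptions of the slides and counts transferred tuples only, so tuple widths, T_TR and T_MSG are ignored. All function names and cardinalities are hypothetical.

```python
def three_way_transfer_costs(card_r, card_s, card_t, card_rs, card_st):
    """Tuples transferred by each of the five orders for R ⋈ S ⋈ T.

    card_rs and card_st are the (estimated) sizes of R ⋈ S and S ⋈ T.
    """
    return {
        1: card_r + card_rs,  # R to nodeS, then R' = R ⋈ S to nodeT
        2: card_s + card_rs,  # S to nodeR, then R' = R ⋈ S to nodeT
        3: card_s + card_st,  # S to nodeT, then S' = S ⋈ T to nodeR
        4: card_t + card_st,  # T to nodeS, then S' = S ⋈ T to nodeR
        5: card_t + card_r,   # T and R to nodeS (transfers can run in parallel)
    }


def semijoin_preferable(card_proj_a_s, card_r_semijoin_s, card_r):
    """True if card(π_A(S)) + card(R ⋉_A S) < card(R)."""
    return card_proj_a_s + card_r_semijoin_s < card_r


# Hypothetical cardinalities.
costs = three_way_transfer_costs(1000, 200, 5000, 300, 400)
best = min(costs, key=costs.get)  # order 2: 200 + 300 = 500 tuples
```

Because the sketch counts tuples, it treats a tuple of π_A(S) like a full tuple of R. Weighting by tuple width would favor the semijoin further.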



      Total time models

Aspects
        Cost model
        Site selection and data transfer
        Join order optimization
        Join implementation

      Site selection and data transfer

Query shipping
        Query initiator (node at which the query is issued/optimized) sends the query to other nodes
        Receiver nodes compute the query result and ship the result back to the initiator






      Site selection and data transfer

Data shipping
        Query remains at the initiator
        Initiator sends data request messages to other nodes
        Receiver nodes ship all required data to the initiator
        Initiator computes result

Hybrid shipping
        Initiator sends partial queries to other nodes
        Other nodes execute some parts of the query and send intermediate results to the initiator
        Initiator executes remaining query operations (post-processing)




      Site selection and data transfer for joins

Problem
        Queries make extensive use of joins
        Computing joins is very expensive
        Especially in distributed systems: special attention because of fragments and replication
Basic strategies
        Ship whole
                Transferring the complete relation
        Fetch as needed
                Transferring the relation piecewise

Scenario
        2 nodes: one (nodeR) storing relation R, the other (nodeS) storing relation S
        The query asks for R ⋈ S

        R    A  B        S    B  C  D        R ⋈ S    A  B  C  D
             3  7             9  8  8                 1  1  5  1
             1  1             1  5  1                 4  5  7  8
             4  6             9  4  2
             7  7             4  3  3
             4  5             4  2  6
             6  2             5  7  8
             5  7





      Ship whole

(Relations R and S and the result R ⋈ S as in the scenario)

Execution at nodeR
        nodeR: send data request message (relation S) to nodeS
        nodeS: send requested data (relation S) to nodeR
Total costs: 2 messages, 18 attribute values

Execution at nodeS
        nodeS: send data request message (relation R) to nodeR
        nodeR: send requested data (relation R) to nodeS
Total costs: 2 messages, 14 attribute values



      Ship whole

(Relations R and S and the result R ⋈ S as in the scenario)

Execution at a third node nodeX
        nodeX: send data request message (relation R) to nodeR
        nodeX: send data request message (relation S) to nodeS
        nodeR: send requested data (relation R) to nodeX
        nodeS: send requested data (relation S) to nodeX
Total costs: 4 messages, 18 + 14 = 32 attribute values

      Fetch as needed

Execution at nodeR
        nodeR: send data request message (tuples of relation S with B = ‘7’) to nodeS
        nodeS: send requested data (0 tuples of relation S with B = ‘7’) to nodeR
        nodeR: send data request message (tuples of relation S with B = ‘1’) to nodeS
        nodeS: send requested data (1 tuple of relation S with B = ‘1’) to nodeR
        ...
Total costs: 7 · 2 = 14 messages, 7 + 2 · 3 = 13 attribute values
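The message and attribute-value counts for ship whole and for fetch as needed at nodeR can be reproduced from the example relations with a small Python sketch. The relation contents are taken from the scenario; the function names and the way costs are counted in code are assumptions made for illustration.

```python
# Example relations from the scenario: R(A, B) at nodeR, S(B, C, D) at nodeS.
R = [(3, 7), (1, 1), (4, 6), (7, 7), (4, 5), (6, 2), (5, 7)]
S = [(9, 8, 8), (1, 5, 1), (9, 4, 2), (4, 3, 3), (4, 2, 6), (5, 7, 8)]


def ship_whole(shipped):
    """One request and one reply; every attribute value of the relation is sent."""
    return 2, sum(len(t) for t in shipped)


def fetch_as_needed(outer, outer_key, inner, inner_key):
    """One request and one reply per outer tuple.

    Each request carries one join value; each reply carries the matching
    inner tuples (possibly none). Repeated join values are not cached.
    """
    messages = values = 0
    for t in outer:
        key = t[outer_key]
        matches = [u for u in inner if u[inner_key] == key]
        messages += 2
        values += 1 + sum(len(u) for u in matches)
    return messages, values


at_node_r = ship_whole(S)                  # (2, 18)
at_node_s = ship_whole(R)                  # (2, 14)
fetch_at_r = fetch_as_needed(R, 1, S, 0)   # (14, 13)
```

Execution at a third node nodeX corresponds to shipping both relations, which gives 4 messages and 18 + 14 = 32 attribute values.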


      Fetch as needed                                                                                                       Ship whole vs. fetch as needed

               R    A    B          S    B    C    D          R ⋈ S    A    B    C    D
                    3    7               9    8    8                   1    1    5    1
                    1    1               1    5    1                   4    5    7    8
                    4    6               9    4    2
                    7    7               4    3    3
                    4    5               4    2    6
                    6    2               5    7    8
                    5    7
                                                                                                                      Conclusion
                                                                                                                              Fetch as needed results in a high number of messages
                                                                                                                              Ship whole results in high amounts of transferred data
Execution at nodeS
                                                                                                                      More advanced strategies based on these two basic strategies
        nodeS : send data request message (tuples of relation R with B = '9') to nodeR
        nodeR : send requested data (0 tuples of relation R with B = '9') to nodeS                                            Semijoin
        nodeS : send data request message (tuples of relation R with B = '1') to nodeR                                        Bitvector join
        nodeR : send requested data (1 tuple of relation R with B = '1') to nodeS
        ...

Total costs: 6 · 2 = 12 messages, 6 + 2 · 2 = 10 attribute values
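
The totals quoted for ship whole and fetch as needed can be reproduced with a short script. This is only a sketch: it assumes the example relations R(A, B) and S(B, C, D) shown on the slide, counts one attribute value per request, and counts an empty reply as a message. The function names are illustrative and not part of the lecture.

```python
# Example relations from the slides: R(A, B) and S(B, C, D), joined on B.
R = [(3, 7), (1, 1), (4, 6), (7, 7), (4, 5), (6, 2), (5, 7)]
S = [(9, 8, 8), (1, 5, 1), (9, 4, 2), (4, 3, 3), (4, 2, 6), (5, 7, 8)]


def ship_whole(r, s):
    """Both relations are requested by, and shipped to, a third node."""
    messages = 4  # two request messages plus two data transfers
    values = sum(len(t) for t in r) + sum(len(t) for t in s)
    return messages, values


def fetch_as_needed(outer, outer_key, inner, inner_key):
    """The node holding `outer` asks for join partners one tuple at a time."""
    messages = 0
    values = 0
    for t in outer:
        messages += 2  # one request and one reply (the reply may be empty)
        values += 1    # the join attribute value carried by the request
        matches = [u for u in inner if u[inner_key] == t[outer_key]]
        values += sum(len(u) for u in matches)
    return messages, values


print(ship_whole(R, S))                # (4, 32)
print(fetch_as_needed(R, 1, S, 0))     # execution at nodeR: (14, 13)
print(fetch_as_needed(S, 0, R, 1))     # execution at nodeS: (12, 10)
```

The numbers match the slides: ship whole needs few messages but moves every attribute value, while fetch as needed moves little data at the price of two messages per outer tuple.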


      Semijoin                                                                                              Semijoin

Requesting all join partners in just one step

Basic consideration:
               R ⋈ S = R ⋈ (S ⋉ R) = R ⋈ (S ⋉ πB(R))
with B being the join attribute

Algorithm
        nodeR : determine πB(R) and send the result to nodeS
        nodeS : determine S′ = S ⋉ πB(R) = S ⋉ R and send the result to
        nodeR
        nodeR : determine R ⋈ S′ = R ⋈ S
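
The three steps can be traced on the example relations used for ship whole and fetch as needed. The sketch below is an illustration, not the lecture's own code: the function name and the way transferred attribute values are counted (distinct B values sent to nodeS, plus all attributes of the reduced S sent back) are assumptions.

```python
# Example relations from the earlier slides: R(A, B) and S(B, C, D).
R = [(3, 7), (1, 1), (4, 6), (7, 7), (4, 5), (6, 2), (5, 7)]
S = [(9, 8, 8), (1, 5, 1), (9, 4, 2), (4, 3, 3), (4, 2, 6), (5, 7, 8)]


def semijoin_strategy(r, s):
    # Step 1 (nodeR): project R onto the join attribute B, dropping duplicates.
    b_values = {b for (_, b) in r}
    # Step 2 (nodeS): keep only the S tuples that have a join partner in R.
    s_reduced = [t for t in s if t[0] in b_values]
    # Step 3 (nodeR): join R with the reduced S.
    result = [(a, b, c, d)
              for (a, b) in r
              for (b2, c, d) in s_reduced
              if b == b2]
    # Assumed cost measure: attribute values shipped in both directions.
    transferred = len(b_values) + sum(len(t) for t in s_reduced)
    return result, transferred


result, transferred = semijoin_strategy(R, S)
print(result)       # [(1, 1, 5, 1), (4, 5, 7, 8)]
print(transferred)  # 11 attribute values, sent in 2 messages
```

Under these counting assumptions the strategy needs only two messages, like one direction of ship whole, while shipping far fewer values than the 32 of ship whole.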



      Bitvector join                                                                                        Bitvector join



        Also known as hash filter join
                                                                                                      Algorithm
        Avoiding the transfer of all join attribute values to the other node
                                                                                                              nodeR : determine πB (R), apply hash function h to the result, set the
        Transfer a bitvector BV[1 . . . n] instead
                                                                                                              corresponding bits in BV to 1, and send the result to nodeS
Transformation                                                                                                nodeS : apply hash function h to the join attribute of relation S,
        Choose an appropriate hash function h                                                                 determine S′ = {t ∈ S | BV[h(t.B)] = 1}, send S′ to nodeR
        Apply h to transform attribute values to the range [1 . . . n]                                        nodeR : determine R ⋈ S′ = R ⋈ S
        Set the corresponding bits in the bitvector BV [1 . . . n] to 1
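
A minimal sketch of the bitvector join on the example relations R(A, B) and S(B, C, D) follows. The hash function h(v) = v mod n and the 0-based bitvector are assumptions made for the illustration (the slides use BV[1 . . . n] and leave h open); the function name is hypothetical.

```python
# Example relations from the earlier slides: R(A, B) and S(B, C, D).
R = [(3, 7), (1, 1), (4, 6), (7, 7), (4, 5), (6, 2), (5, 7)]
S = [(9, 8, 8), (1, 5, 1), (9, 4, 2), (4, 3, 3), (4, 2, 6), (5, 7, 8)]


def bitvector_join(r, s, n):
    def h(value):
        # Assumed hash function; maps a join value to a position in 0..n-1.
        return value % n

    # nodeR: set one bit per join attribute value of R, then ship the bitvector.
    bv = [0] * n
    for (_, b) in r:
        bv[h(b)] = 1

    # nodeS: keep the S tuples whose hashed join value hits a set bit.
    # Collisions can let through tuples that have no real join partner.
    s_candidates = [t for t in s if bv[h(t[0])] == 1]

    # nodeR: the final join compares actual values, so false positives vanish.
    result = [(a, b, c, d)
              for (a, b) in r
              for (b2, c, d) in s_candidates
              if b == b2]
    return result, s_candidates


result_small, candidates_small = bitvector_join(R, S, n=4)
result_large, candidates_large = bitvector_join(R, S, n=16)
print(len(candidates_small))  # 4 tuples shipped, 2 of them unnecessary
print(candidates_large)       # [(1, 5, 1), (5, 7, 8)]
print(result_small == result_large)  # True
```

With n = 4 the value 9 collides with 1 and 5, so two S tuples are shipped without having a join partner; with n = 16 the collisions disappear for this data. The join result is the same in both cases.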





      Bitvector join                                                                                    Bitvector join



                                                                                                  Conclusions
                                                                                                          Transferring the bitvector reduces network load
                                                                                                          Bitvector only indicates potential join partners because multiple
                                                                                                          attribute values might map to the same hash value
                                                                                                          Might result in transferring unnecessary tuples
                                                                                                          Requirements: an appropriate hash function h and n needs to be
                                                                                                          large enough to avoid a high number of collisions





      Response time models                                                                              Response time models

                                                                                                  Two different response times
                                                                                                          When does the first result tuple arrive?
        “Classic” cost models consider total resource consumption of a query                              When have all result tuples arrived?
               Good results for heavy computational load and slow network
               connections                                                                        Example situation
               By saving resources, many queries can be executed in parallel
                                                                                                          Given relations/fragments A, B, C, and D
               (minimum load, maximum throughput)
        Optimization for short response times                                                             Full replication, i.e., all relations/fragments are available on all nodes
               “Waste” some resources to get query results earlier                                        Compute (A ⋈ B) ⋈ (C ⋈ D)
               Take advantage of lightly loaded machines and fast connections                             Assumptions
               Utilize intraquery parallelism                                                                    Each join costs 20 time units (T_CPU + T_I/O)
                                                                                                                 Transferring an intermediate result costs 10 time units (T_MSG + T_TR)
                                                                                                                 Accessing a relation is free
                                                                                                                 Each node has one computational thread


      Example                                                                                            Example
Two plans
        Plan 1: Execute all operations on one node
        Total costs: 60
        Plan 2: Join on different nodes, ship results
        Total costs: 80




                                                                                                   Response time costs: 60 for plan 1, 50 for plan 2
                         Plan 1                                                                    ⇒ Plan 2 is better with respect to response time
                                                                                                   Because operations can be executed in parallel (exploiting intra-query
                                                                  Plan 2                           parallelism)
Plan 1 is obviously better with respect to total costs                                             Response time can be improved even more by applying pipelining
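
The figures for both plans follow from the stated assumptions (20 time units per join, 10 per transfer of an intermediate result). The plan diagrams are not reproduced here, so the sketch below assumes that Plan 2 computes the two inner joins on two different nodes in parallel and ships both intermediate results to a third node for the final join; that shape is inferred from the total cost of 80 and is not confirmed by the text.

```python
JOIN_COST = 20      # time units per join
TRANSFER_COST = 10  # time units per shipped intermediate result


def plan1_costs():
    """All three joins run one after another on a single node."""
    total = 3 * JOIN_COST
    response = total  # one computational thread, so nothing overlaps
    return total, response


def plan2_costs():
    """Assumed shape: two inner joins in parallel, both results shipped."""
    total = 3 * JOIN_COST + 2 * TRANSFER_COST
    # Each parallel branch joins and then ships its result.
    branch = JOIN_COST + TRANSFER_COST
    # The final join can start once the slower branch has delivered.
    response = max(branch, branch) + JOIN_COST
    return total, response


print(plan1_costs())  # (60, 60)
print(plan2_costs())  # (80, 50)
```

Total cost adds up all work regardless of where it runs, while response time only follows the longest chain of dependent steps, which is why the two measures rank the plans differently.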

      Pipelining                                                                                         Pipelining


Goal of applying pipelining
Good first tuple response times by executing queries in a pipelined fashion                         Problems
                                                                                                           Operations have different execution times
        Not pipelined                                                                                      If execution speed of operations in the pipeline differs, tuples are
               Each operation is fully completed and an intermediate result is created                     either cached or the pipeline blocks
               Next operation reads intermediate result and is then fully completed                        Some operations are more suitable than others
               Reading and writing of intermediate results costs resources                                        Good: scan, select, project, union, . . .
        Pipelined                                                                                                 Tricky: join, intersection, . . .
               Operations do not create intermediate results                                                      Very hard: sort
               Each processed tuple is fed directly into the next operation
               Tuples “flow” through the operations




      Pipelining example                                                                                Pipelining example
Simple query
    Tablescan, selection, projection
    1000 tuples are scanned, selectivity is 0.1
Costs
    Accessing one tuple during tablescan: 2 time units
    Selecting (testing) one tuple: 1 time unit
    Projecting one tuple: 1 time unit

Non-pipelined (IR1, IR2: intermediate results)
    time    event
       2    first tuple in IR1
    2000    all tuples in IR1
    2001    first tuple in IR2
    3000    all tuples in IR2
    3001    first tuple in Result
    3100    all tuples in Result

Pipelined
    time    event
       2    first tuple finished tablescan
       3    first tuple finished selection (if selected . . . )
       4    first tuple in Result
    3098    last tuple finished tablescan
    3099    last tuple finished selection
    3100    all tuples in Result
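
The main milestones of both timelines can be recomputed from the per-tuple costs. The sketch assumes a single thread and that the first tuple scanned passes the selection, which is what the "first tuple" entries in the tables imply; the function and key names are illustrative.

```python
N_TUPLES = 1000    # tuples scanned
N_SELECTED = 100   # 1000 tuples at selectivity 0.1
SCAN, SELECT, PROJECT = 2, 1, 1  # time units per tuple


def non_pipelined():
    """Each operation finishes completely before the next one starts."""
    scan_done = N_TUPLES * SCAN
    select_done = scan_done + N_TUPLES * SELECT
    return {
        "first tuple in IR1": SCAN,
        "all tuples in IR1": scan_done,
        "first tuple in IR2": scan_done + SELECT,
        "all tuples in IR2": select_done,
        "first tuple in Result": select_done + PROJECT,
        "all tuples in Result": select_done + N_SELECTED * PROJECT,
    }


def pipelined():
    """Each tuple passes through all operations before the next is read."""
    return {
        "first tuple in Result": SCAN + SELECT + PROJECT,
        "all tuples in Result": N_TUPLES * (SCAN + SELECT) + N_SELECTED * PROJECT,
    }


print(non_pipelined()["first tuple in Result"])  # 3001
print(pipelined()["first tuple in Result"])      # 4
print(pipelined()["all tuples in Result"])       # 3100
```

Both variants perform the same amount of work and finish at 3100 time units; pipelining only moves the arrival of the first result tuple from 3001 to 4.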

      Pipelining example                                                                                Pipelining example

Join query
        Joining two table subsets using a non-pipelined
        BNL (block nested loop) join
        Both pipelines run in parallel
Costs
        1000 tuples are scanned in each pipeline,
        selectivity 0.1
        Joining 100 × 100 tuples: 10000 time units
        (one time unit per combination)
Response time
        The first tuple arrives at the end of any pipeline after 4 time units
        All tuples have arrived at the end of the pipelines after 3100 time
        units
        Final result will be available after 13100 time units
               No benefit from pipelining with respect to response time
               First tuple arrives long after step 3100

      Joins and pipelining                                                                              Single-pipelined hash join


                                                                                                          “Classic” join algorithm
Suboptimal result because of the unpipelined join                                                         Basic idea: A ⋈ B
                                                                                                                 One input relation is read from an intermediate result (A), the other is
        Most traditional join algorithms are unsuitable for pipelining                                           pipelined through the join operation (B)
        Single/semi-pipelined: only one pipeline, the other intermediate result                                  All tuples of A are stored in a hash table
                                                                                                                           Hash function is used on the join attribute
        has to be available
                                                                                                                           All tuples with the same hash value for the join attribute are in the
        Fully pipelined: both inputs are processed in a pipelined fashion                                                  same bucket
                                                                                                                 Every incoming tuple (via pipeline) of B is hashed by join attributes
                                                                                                                 Compare tuple to each tuple in the respective A bucket
                                                                                                                 Return those tuples showing matching join attributes
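
The single-pipelined hash join described above can be sketched as a generator: A is fully available and goes into a hash table, while B is consumed tuple by tuple. The bucket count, the use of Python's built-in hash, and the function name are assumptions made for this illustration.

```python
def single_pipelined_hash_join(a_tuples, b_stream, a_key, b_key, num_buckets=8):
    """Build on the materialized input A, probe with the pipelined input B."""
    # Build phase: every A tuple goes into the bucket of its join value.
    buckets = [[] for _ in range(num_buckets)]
    for a in a_tuples:
        buckets[hash(a[a_key]) % num_buckets].append(a)

    # Probe phase: each incoming B tuple is hashed and compared only with
    # the A tuples in the matching bucket.
    for b in b_stream:
        for a in buckets[hash(b[b_key]) % num_buckets]:
            if a[a_key] == b[b_key]:
                yield a + b  # emitted immediately, no intermediate result


# Usage with the example relations R(A, B) and S(B, C, D), joined on B.
R = [(3, 7), (1, 1), (4, 6), (7, 7), (4, 5), (6, 2), (5, 7)]
S = [(9, 8, 8), (1, 5, 1), (9, 4, 2), (4, 3, 3), (4, 2, 6), (5, 7, 8)]
joined = list(single_pipelined_hash_join(R, iter(S), a_key=1, b_key=0))
print(joined)  # [(1, 1, 1, 5, 1), (4, 5, 5, 7, 8)]
```

Because the probe side is a plain iterator, result tuples appear as soon as the matching B tuple arrives, but nothing can be returned before the whole of A has been read.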





      Double-pipelined hash join                                                                        Double-pipelined hash join – example


        Dynamically build hashtables for A and B tuples – memory intensive!
        Process tuples upon arrival
               Cache tuples if necessary                                                                B(31, B2) arrives
               Balance between A and B tuples for better performance
               Rely on statistics for a good A:B ratio                                                  Insert into B Hash
        If a new tuple of relation A arrives                                                            Find matching A tuples
               Insert it into the A hashtable                                                                   Found A3
               Check in the B hashtable if there are join partners                                              Assume that A3 matches B3. . .
               If yes, return all combined AB tuples                                                    Add AB(A3, B2) to the result
        If a new B tuple arrives, process it analogously
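
A minimal sketch of the double-pipelined hash join, assuming that arrivals from both inputs are delivered as one interleaved sequence of (source, tuple) pairs. That input format, the function name, and the use of Python dictionaries as hash tables are assumptions for the illustration; caching and balancing between A and B are left out.

```python
from collections import defaultdict


def double_pipelined_hash_join(arrivals, a_key, b_key):
    """Process A and B tuples upon arrival, keeping a hash table per input."""
    a_table = defaultdict(list)
    b_table = defaultdict(list)
    for source, t in arrivals:
        if source == "A":
            # Insert into the A table, then look for partners seen so far in B.
            a_table[t[a_key]].append(t)
            for b in b_table.get(t[a_key], []):
                yield t + b
        else:
            # B tuples are handled analogously.
            b_table[t[b_key]].append(t)
            for a in a_table.get(t[b_key], []):
                yield a + t


# Usage: tuples of A(A, B) and B(B, C, D) arriving in mixed order.
arrivals = [
    ("A", (1, 1)),
    ("B", (1, 5, 1)),   # finds A tuple (1, 1)
    ("B", (5, 7, 8)),   # no partner yet
    ("A", (4, 5)),      # finds B tuple (5, 7, 8)
]
print(list(double_pipelined_hash_join(arrivals, a_key=1, b_key=0)))
# [(1, 1, 1, 5, 1), (4, 5, 5, 7, 8)]
```

Each pair of matching tuples is produced exactly once, namely when the later of the two arrives. Both tables grow with their inputs, which is the memory cost noted on the slide.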




      Pipelining in distributed setups                                                                Pipelining in distributed setups – tuple blocking


In pipelines, tuples “flow” through the operations
        Works well with one processing unit! (one node)                                         Minimize communication overhead by tuple blocking
        Problem: sending each tuple separately from one node to another
                                                                                                        Do not send single tuples, but blocks containing multiple tuples
        might be inefficient
                                                                                                               Burst transmission
        Communication costs                                                                                    Packets have to be cached
               Setting up transfer and opening communication channel                                           Block size should be at least the packet size of the underlying network
               Composing a message                                                                             protocol
               Transmitting message: header information and payload (minimum
               packet size is bigger than tuple)                                                Results in even more cost factors for the cost model
               Receiving and decoding a message
               Closing the channel
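
Tuple blocking, the remedy for these per-message costs, can be sketched as follows. The cost parameters (a fixed overhead per message and a cost per tuple) are hypothetical simplifications of the factors listed above, and the function names are illustrative.

```python
import math


def block_tuples(tuples, block_size):
    """Group a stream of tuples into blocks of at most `block_size` tuples."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    block = []
    for t in tuples:
        block.append(t)
        if len(block) == block_size:
            yield block
            block = []
    if block:
        yield block  # last, possibly smaller, block


def transfer_cost(num_tuples, block_size, per_message_overhead, per_tuple_cost):
    """Hypothetical cost model: fixed overhead per message plus payload."""
    messages = math.ceil(num_tuples / block_size)
    return messages * per_message_overhead + num_tuples * per_tuple_cost


print(list(block_tuples(range(10), 4)))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
print(transfer_cost(100, 1, 10, 1))      # 1100: one message per tuple
print(transfer_cost(100, 25, 10, 1))     # 140: four blocks of 25 tuples
```

Larger blocks amortize the per-message overhead, but tuples wait on the sender until their block is full, which works against short first-tuple response times.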




      Summary on global query optimization                                                            Summary I

                                                                                                        Detour on centralized query processing
                                                                                                               Query parsing
                                                                                                               Query transformation
Global query optimization has to deal with additional constraints and cost                                     Query optimization
factors compared to “classic” query optimization                                                        Basics of distributed query optimization
                                                                                                               Many steps can be reused from centralized query processing
        Network costs, network model, shipping policies
                                                                                                               Optimization in distributed systems is much more complex (network
        Fragmentation and allocation schemes                                                                   latency, selectivities, communication costs, response time, etc.)
        Different optimization goals (response time vs. total time)                                             Meta data management – where to store the global catalog?
                                                                                                               Data localization – consider fragmentation
                                                                                                        Distributed query optimization
                                                                                                               Very important question: where to execute which parts of the query?
                                                                                                               When to optimize: compile time vs. dynamic optimization, most
                                                                                                               common: semi-dynamic and hierarchical optimization
                                                                                                               Cost model (cost functions, statistics, cardinality estimation, etc.)

      Summary II                                                                                        References I


               Join order optimization
               Join implementations (ship whole, fetch as needed, semijoin, bitvector
               join, pipelined hash join, etc.)
               Total time and response time

                                                                                                         M. Tamer Özsu, P. Valduriez.
                                                                                                         Principles of Distributed Database Systems.
                                                                                                         Third Edition, Springer, 2011.
                                                                                                         E. Rahm.
                                                                                                         Mehrrechner-Datenbanksysteme.
                                                                                                         Addison-Wesley, Bonn, 1994.
                                                                                                         P. Dadam.
                                                                                                         Verteilte Datenbanken und Client/Server-Systeme.
                                                                                                         Springer-Verlag, Berlin, Heidelberg, 1996.




      References II



       Toby J. Teorey.
       Database Modeling and Design.
       Third Edition, Morgan Kaufmann Publishers, San Francisco, CA,
       1999.
       D. Kossmann.
       The State of the Art in Distributed Query Processing.
       ACM Computing Surveys,
       Vol. 32, No. 4, 2000, pp. 422-469.




            Katja Hose             Distributed Database Systems   November 17, 2011   167 / 167

Distributed_Database_System

  • 1. Distributed Database Systems Distributed Database Systems Contents I 1 Motivation Distributed Database Systems 2 Detour on centralized query processing Translating SQL into relational algebra Distributed Query Processing Phases of centralized query processing Query parsing Katja Hose, Ralf Schenkel Query transformation Query optimization Max-Planck-Institut f¨r Informatik, Cluster of Excellence MMCI u 3 Basics of distributed query processing Phases of distributed query processing November 10, 2011 Introduction November 17, 2011 Meta data management Data localization 4 Global query optimization Main questions Katja Hose Distributed Database Systems November 10, 2011 1 / 167 Katja Hose Distributed Database Systems November 10, 2011 2 / 167 Distributed Database Systems Distributed Database Systems Motivation Contents II Motivation Global query optimizer Distributed cost model The task of query processing is . . . Join order optimization . . . to answer user queries Total time models Response time models Example How many students are at Saarland University? Answer: 18.000 Additional constraints 5 Summary Low response times High query throughput Efficient hardware usage ... Katja Hose Distributed Database Systems November 10, 2011 3 / 167 Katja Hose Distributed Database Systems November 10, 2011 4 / 167
  • 2. Distributed Database Systems Distributed Database Systems Motivation Detour on centralized query processing Motivation 1 Motivation 2 Detour on centralized query processing Translating SQL into relational algebra Phases of centralized query processing Differences to centralized query processing Query parsing Considering the physical data distribution during query optimization Query transformation Query optimization Considering communication costs 3 Basics of distributed query processing Assumptions Phases of distributed query processing Data is distributed among multiple nodes Introduction Existence of a global conceptual schema, which is used by all nodes Meta data management Data localization Queries are formulated on the global schema 4 Global query optimization Main questions Global query optimizer Distributed cost model Katja Hose Distributed Database Systems November 10, 2011 5 / 167 Katja Hose Distributed Database Systems November 10, 2011 6 / 167 Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Translating SQL into relational algebra Join order optimization Translating SQL into relational algebra Total time models Response time models SQL query structure: select distinct a1 , . . . , an from R1 , . . . , Rn where p Algorithm: 5 Summary 1 Translating the from clause Let R1 , . . . , Rk be the relations in the from clause of the query Construct expression: R1 if k = 1 R= ((. . . (R1 × R2 ) × . . . ) × Rk ) otherwise Katja Hose Distributed Database Systems November 10, 2011 7 / 167 Katja Hose Distributed Database Systems November 10, 2011 8 / 167
  • 3. Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Translating SQL into relational algebra Translating SQL into relational algebra Translating SQL into relational algebra Translating SQL into relational algebra Algorithm : Algorithm : 2 Translating the where clause 3 Translating the select clause Let F be the predicate in the where clause of the query (if a where clause Let a1 , . . . , an (or “*”) be the projection in the select clause of the query exists) Construct expression: Construct expression: W if the projection is “*” S= R if there is no where clause πa1 ,...,an (W ) otherwise W = σF (R) otherwise Output: S Katja Hose Distributed Database Systems November 10, 2011 9 / 167 Katja Hose Distributed Database Systems November 10, 2011 10 / 167 Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Translating SQL into relational algebra Phases of centralized query processing Translating SQL into relational algebra Workflow for centralized query processing Example query select distinct e.EN ame, s.Salary from Employees e, Salary s where e.T itle = s.T itle and s.Salary ≥ 60.000 R1 if k = 1 R= ((. . . (R1 × R2 ) × . . . ) × Rk ) otherwise R = Employees × Salary R if there is no where clause W = σF (R) otherwise Katja Hose Distributed Database Systems November 10, 2011 11 / 167 Katja Hose Distributed Database Systems November 10, 2011 12 / 167
  • 4. Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Query parsing Query parsing Query parsing Example Transform a declarative query into an internal representation Query formulated using a declarative query language, e.g., SQL Example The Parser translates the query into an internal representation Database managing information about employees and projects Called naive query plan Employees(EID, EN ame, T itle) Plan described by an operator tree of relational algebra operators Assignment(EN o, P N o, Duration) Query: return the names of all employees working for project ’P1’ SELECT EName FROM Employees e, Assignment a WHERE e.EID = ENo AND PNo=’P1’ Katja Hose Distributed Database Systems November 10, 2011 13 / 167 Katja Hose Distributed Database Systems November 10, 2011 14 / 167 Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Query parsing Query parsing Example Operator tree πEN ame σP N o= P 1 ∧Employees.EID=Assignment.EN o Employees × Assignment Query SELECT EName FROM Employees e, Assignment a WHERE e.EID = ENo AND PNo=’P1’ Translation into relational algebra πEN ame σP N o= P 1 ∧Employees.EID=Assignment.EN o Employees × Assignment In contrast to the SQL statement, the algebra statement already contains the required basic evaluation operators Operator tree Katja Hose Distributed Database Systems November 10, 2011 15 / 167 Katja Hose Distributed Database Systems November 10, 2011 16 / 167
  • 5. Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Query transformation Query transformation Workflow for centralized query processing Query transformation Steps 1 Name resolution Transforming object names into internal names 2 Semantic analysis Checking for global relations and attributes, view expansion, global access control 3 Normalization Transforming predicates into a canonical format 4 Simple algebraic rewriting Application of heuristics to eliminate bad plans Katja Hose Distributed Database Systems November 10, 2011 17 / 167 Katja Hose Distributed Database Systems November 10, 2011 18 / 167 Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Query transformation Query transformation Semantic analysis Normalization Objective Check if the global schema defines all attributes and relations Simplification of the following optimization by transforming the query referenced in the query into a canonical format If the query is formulated on a view, replace references to Selection and join predicates relations/attributes with references to global relations/attributes Conjunctive normal form vs. disjunctive normal form Perform simple integrity checks, e.g., are the types of attributes Conjunctive normal form: used in comparison predicates of the same type? (p11 ∨ p12 ∨ · · · ∨ p1n ) ∧ · · · ∧ (pm1 ∨ pm2 ∨ · · · ∨ pmn ) Initial check if the query has the rights to access referenced Disjunctive normal form: (p11 ∧ p12 ∧ · · · ∧ p1n ) ∨ · · · ∨ (pm1 ∧ pm2 ∧ · · · ∧ pmn ) relations/attributes Transformation based on equivalence rules for logical operators Katja Hose Distributed Database Systems November 10, 2011 19 / 167 Katja Hose Distributed Database Systems November 10, 2011 20 / 167
  • 6. Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Query transformation Query transformation Normalization Normalization Example SELECT EName Equivalence rules FROM Employees e, Assignment a p1 ∧ p2 ⇐⇒ p2 ∧ p1 and p1 ∨ p2 ⇐⇒ p2 ∨ p1 WHERE e.EID = a.ENo AND Duration ≥ 3 AND (PNo=’P1’ OR PNo=’P2’) p1 ∧ (p2 ∧ p3 ) ⇐⇒ (p1 ∧ p2 ) ∧ p3 and p1 ∨ (p2 ∨ p3 ) ⇐⇒ (p1 ∨ p2 ) ∨ p3 p1 ∧ (p2 ∨ p3 ) ⇐⇒ (p1 ∧ p2) ∨ (p1 ∧ p3 ) and Selection condition in disjunctive normal form p1 ∨ (p2 ∧ p3 ) ⇐⇒ (p1 ∨ p2) ∧ (p1 ∨ p3 ) (EID = ENo ∧ Duration ≥ 3 ∧ PNo=’P1’) ∨ ¬(p1 ∧ p2 ) ⇐⇒ ¬p1 ∨ ¬p2 and ¬(p1 ∨ p2 ) ⇐⇒ ¬p1 ∧ ¬p2 (EID = ENo ∧ Duration ≥ 3 ∧ PNo=’P2’) ¬(¬p1 ) ⇐⇒ p1 Selection condition in conjunctive normal form EID = ENo ∧ Duration ≥ 3 ∧ (PNo=’P1’ ∨ PNo=’P2’) Katja Hose Distributed Database Systems November 10, 2011 21 / 167 Katja Hose Distributed Database Systems November 10, 2011 22 / 167 Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Query transformation Query optimization Simple algebraic rewriting Workflow for centralized query processing Simple optimizations that are always beneficial regardless of system state Elimination of redundant predicates Simplification of expressions Unnesting of subqueries and views Tasks Recognize and simplify all expressions/operations/subqueries that are “obviously” unnecessary, redundant, or contradictory. Do not consider system state information, e.g., size of tables, existence of indexes, etc. Katja Hose Distributed Database Systems November 10, 2011 23 / 167 Katja Hose Distributed Database Systems November 10, 2011 24 / 167
  • 7. Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Query optimization Query optimization Query optimization Heuristics Steps Use simple heuristics which usually lead to better performance 1 Algebraic optimization Not the optimal plan is needed, but the really bad ones should be Find a good relational algebra operator tree avoided Heuristic query optimization Heuristics Cost-based query optimization Statistical query optimization Break selections Complex selection criteria should be broken into multiple parts 2 Physical optimization Push projection and push selection Find suitable algorithms for implementing the operations Cheap selections and projections should be performed as early as possible to reduce the sizes of intermediate results Force joins In most cases, using a join is much cheaper than using a Cartesian product and a selection Katja Hose Distributed Database Systems November 10, 2011 25 / 167 Katja Hose Distributed Database Systems November 10, 2011 26 / 167 Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Query optimization Query optimization Algebraic optimization rules Algebraic optimization rules Operator is commutative: r1 r2 ⇐⇒ r2 r1 Combinations of selections σ can be combined using logical and (∧). The Operator is associative: order of the selections is arbitrary: (r1 r2 ) r3 ⇐⇒ r1 (r2 r3 ) σF1 (σF2 (r1 )) ⇐⇒ σF1 ∧F2 (r1 ) ⇐⇒ σF2 (σF1 (r1 )) For operator π in combination with another operator π, the “outer” Exploiting commutativity of ∧ parameter dominates the “inner” one: πX (πY (r1 )) ⇐⇒ πX (r1 ) if X ⊆ Y Katja Hose Distributed Database Systems November 10, 2011 27 / 167 Katja Hose Distributed Database Systems November 10, 2011 28 / 167
  • 8. Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Query optimization Query optimization Algebraic optimization rules Algebraic optimization rules Operators σ and commute if all selection attributes are contained in the same relation: Operators π and σ commute if predicate F is defined based on the σF (r1 r2 ) ⇐⇒ σF (r1 ) r2 if attr(F ) ⊆ R1 projection attributes: A selection predicate can be split up in conjunction with a join (F = F1 ∧ F2 ) if the attributes referred to by F1 and F2 are contained in different relations: σF (πX (r1 )) ⇐⇒ πX (σF (r1 )) if attr(F ) ⊆ X σF (r1 r2 ) ⇐⇒ σF1 (r1 ) σF2 (r2 ) Alternatively, change in ordering possible if the projection is extended by if attr(F1 ) ⊆ R1 and attr(F2 ) ⊆ R2 all necessary attributes: In any case, part of a selection can be split up by separating predicates F1 πX1 (σF (r1 )) ⇐⇒ πX1 (σF (πX1 ,X2 (r1 ))) if attr(F ) ⊇ X2 referencing attributes of R1 only, F2 contains the remaining predicates referencing attributes of both relations σF (r1 r2 ) ⇐⇒ σF2 (σF1 (r1 ) r2 ) if attr(F1 ) ⊆ R1 Katja Hose Distributed Database Systems November 10, 2011 29 / 167 Katja Hose Distributed Database Systems November 10, 2011 30 / 167 Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Query optimization Query optimization Algebraic optimization rules Algebraic optimization rules Commutativity of σ and ∪: Commutativity of π and : σF (r1 ∪ r2 ) ⇐⇒ σF (r1 ) ∪ σF (r2 ) πX (r1 r2 ) ⇐⇒ πX (πY1 (r1 ) πY2 (r2 )) Commutativity of σ and −: with Y1 = (X ∩ R1 ) ∪ (R1 ∩ R2 ) σF (r1 − r2 ) ⇐⇒ σF (r1 ) − σF (r2 ) and or in case F only references tuples in r1 : Y2 = (X ∩ R2 ) ∪ (R1 ∩ R2 ) σF (r1 − r2 ) ⇐⇒ σF (r1 ) − r2 Pushing a projection is possible if all Yi are defined in such a way that they preserve all attributes necessary to perform the join. 
Katja Hose Distributed Database Systems November 10, 2011 31 / 167 Katja Hose Distributed Database Systems November 10, 2011 32 / 167
  • 9. Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Query optimization Query optimization Algebraic optimization rules Heuristic algebraic optimization – Example Further rules Commutativity of π and ∪: πX (r1 ∪ r2 ) ⇐⇒ πX (r1 ) ∪ πX (r2 ) Use algebraic optimization heuristics Distributive law for and ∪, distributive law for and −, Commutativity of renaming β with other operators, . . . Force join Idempotence, e.g., A ∨ A ⇐⇒ A Push selection and projection Operations involving empty relations Commutative and associative laws for , ∪ und ∩ Katja Hose Distributed Database Systems November 10, 2011 33 / 167 Katja Hose Distributed Database Systems November 10, 2011 34 / 167 Distributed Database Systems Distributed Database Systems Detour on centralized query processing Detour on centralized query processing Query optimization Query optimization Cost-based algebraic query optimization Physical query optimization Physical optimization Most non-distributed RDBMS strongly rely on cost-based optimizations Input: Aim for better optimized plan with respect to system and data Optimized query plan consisting of algebra operators characteristics Choose an algorithm to compute a particular algebra operator Join order optimization Join: Basic approach Block-Nested-Loop join, hash join, merge join, . . . Establish a cost model for various operations Enumerate all query plans and compute costs Select: Pick the best query plan Full table scan, index lookup, ad-hoc index generation & lookup, . . . Usually, dynamic programming techniques are used to keep Tasks computational effort manageable Translating a query plan into an execution plan Physical and algebraic optimization are often interleaved Katja Hose Distributed Database Systems November 10, 2011 35 / 167 Katja Hose Distributed Database Systems November 10, 2011 36 / 167
  • 10. Distributed Database Systems Distributed Database Systems Detour on centralized query processing Basics of distributed query processing Query optimization Query optimization example 1 Motivation 2 Detour on centralized query processing Translating SQL into relational algebra Phases of centralized query processing Output: query execution plan Query parsing Query transformation Query optimization 3 Basics of distributed query processing Phases of distributed query processing Introduction Meta data management Data localization 4 Global query optimization Main questions Global query optimizer Distributed cost model Katja Hose Distributed Database Systems November 10, 2011 37 / 167 Katja Hose Distributed Database Systems November 10, 2011 38 / 167 Distributed Database Systems Distributed Database Systems Basics of distributed query processing Basics of distributed query processing Phases of distributed query processing Join order optimization Workflow for distributed query processing Total time models Response time models 5 Summary Katja Hose Distributed Database Systems November 10, 2011 39 / 167 Katja Hose Distributed Database Systems November 10, 2011 40 / 167
  • 11. Distributed Database Systems Distributed Database Systems Basics of distributed query processing Basics of distributed query processing Introduction Introduction Basic considerations Basic considerations Costs are more difficult to predict Distributed query processing Join selectivity: is it worthwhile to push down a selection? Shares the same properties of centralized query processing Data is distributed: difficult to get meaningful statistics Similar problem but with different objectives and constraints Network latency is very hard to predict Objectives for centralized query processing Current workload at nodes, load shedding Minimize the number of disk accesses Additional cost factors and constraints Minimize computational time Extension of relational algebra (sending/receiving data) Objectives for distributed query processing Data localization (which node holds relevant data) Minimize resource consumption Replication and caching (where to compute an operation) Minimize response time Network models Maximize throughput Response-time models Data and structural heterogeneity (federated databases . . . 
) Katja Hose Distributed Database Systems November 10, 2011 41 / 167 Katja Hose Distributed Database Systems November 10, 2011 42 / 167 Distributed Database Systems Distributed Database Systems Basics of distributed query processing Basics of distributed query processing Introduction Introduction Consequences Example Query Optimization is much more difficult than in the central case Return the names of all employees working for project ’P1’ Statistics and costs change over time, e.g., workload at a node, network load πEN ame (πEID,EN ame (Employees) Employees.EID=Assignment.EN o πEN o (σP N o= P 1 (Assignment))) More conflicting optimization goals Increase throughput → reduce replication and parallelization, Problems increase query response time → increase parallelization Relations are fragmented and distributed among five nodes More cost factors and constraints The Employees relation uses primary horizontal fragmentation Consequences One fragment located at node 1, the other at node 2, no replication Adaptive query plans (create an initial plan and optimize it on-the-fly) The Assignment relation uses derived horizontal fragmentation One fragment located at node 3, the other at node 4, no replication Do not aim for the best plan, but for a good plan The query originates from node 5 Katja Hose Distributed Database Systems November 10, 2011 43 / 167 Katja Hose Distributed Database Systems November 10, 2011 44 / 167
  • 12. Distributed Database Systems Distributed Database Systems Basics of distributed query processing Basics of distributed query processing Introduction Introduction Example Example Cost model and statistics Accessing a tuple costs 1 unit (acc) Transferring a tuple costs 10 units (trans) There are 400 employees and 1000 assignments 20 assignments for project ‘P1’ All tuples are uniformly distributed, i.e., nodes 3 and 4 provide 10 assignments for project ‘P1’ each There are local indexes on attribute P N o at nodes 3 and 4 (as well as indexes on primary keys at all nodes) Direct tuple access is possible on local sites, no scanning All nodes can directly communicate with each other Simplification: no costs for unions and projections Katja Hose Distributed Database Systems November 10, 2011 45 / 167 Katja Hose Distributed Database Systems November 10, 2011 46 / 167 Distributed Database Systems Distributed Database Systems Basics of distributed query processing Basics of distributed query processing Introduction Introduction Example Example Simple execution plan - Version B Simple execution plan - Version A Ship intermediate results Transfer all data to Node 5 Katja Hose Distributed Database Systems November 10, 2011 47 / 167 Katja Hose Distributed Database Systems November 10, 2011 48 / 167
  • 13. Distributed Database Systems Distributed Database Systems Basics of distributed query processing Basics of distributed query processing Introduction Introduction Example Example Costs plan B: 440 units Costs plan A: 23.000 units Katja Hose Distributed Database Systems November 10, 2011 49 / 167 Katja Hose Distributed Database Systems November 10, 2011 50 / 167 Distributed Database Systems Distributed Database Systems Basics of distributed query processing Basics of distributed query processing Introduction Introduction Important aspects of distributed query processing Important aspects of distributed query processing Meta data management Data localization Global query optimization Post-processing Katja Hose Distributed Database Systems November 10, 2011 51 / 167 Katja Hose Distributed Database Systems November 10, 2011 52 / 167
  • 14. Distributed Database Systems Distributed Database Systems Basics of distributed query processing Basics of distributed query processing Meta data management Meta data management Workflow for distributed query processing Meta data management Prerequisites to perform query optimization Meta data must be available Meta data is stored in the catalog Catalog provides information about the data distribution Use this information to decide, for instance, if it is worthwhile to execute a selection very early. Katja Hose Distributed Database Systems November 10, 2011 53 / 167 Katja Hose Distributed Database Systems November 10, 2011 54 / 167 Distributed Database Systems Distributed Database Systems Basics of distributed query processing Basics of distributed query processing Meta data management Meta data management Meta data management Meta data management Typical contents of a catalog for distributed database management systems Database schema Where to store the catalog in a distributed system? Definitions of tables, views, constraints, keys,. . . Central node Partitioning schema Simple solution, bottleneck Information about how the schema is partitioned and how tables can Replicated at all nodes be reconstructed Updates are expensive Allocation schema Fragmented Information about which fragment can be found at which node In rare cases, the catalog may become very large (including information about replication) Catalog has to be fragmented and allocated Network information Caching Information about node connections, network model Replicate only needed parts of a central catalog, anticipate potential Additional physical information inconsistencies Information about indexes, data statistics (histograms, etc.), hardware resources (processing & storage),. . . Katja Hose Distributed Database Systems November 10, 2011 55 / 167 Katja Hose Distributed Database Systems November 10, 2011 56 / 167
  • 15. Distributed Database Systems Distributed Database Systems Basics of distributed query processing Basics of distributed query processing Meta data management Meta data management Meta data management Meta data management Centralized catalog Replicated catalog One instance of the global catalog at a central node Full copy of the global catalog at each node Advantages Advantages No need to update copies Little communication overhead for queries Little memory consumption Good availability Disadvantages Disadvantages Communication with central node for each query High update costs Central node potentially represents a bottleneck Katja Hose Distributed Database Systems November 10, 2011 57 / 167 Katja Hose Distributed Database Systems November 10, 2011 58 / 167 Distributed Database Systems Distributed Database Systems Basics of distributed query processing Basics of distributed query processing Meta data management Meta data management Meta data management Meta data management Fragmented catalog Caching catalog data Partitioning the global catalog and assigning partitions to nodes Caching non-local catalog data Advantages Advantages Sharing load among nodes Avoiding remote access to frequently needed catalog data Reducing update overhead Reducing communication overhead Disadvantages Disadvantages Localizing necessary partitions of the global catalog Coherency control Invalidating cached copies in the presence of updates Katja Hose Distributed Database Systems November 10, 2011 59 / 167 Katja Hose Distributed Database Systems November 10, 2011 60 / 167
  • 16. Basics of distributed query processing: Meta data management, Data localization
    Caching catalog data. Explicit invalidation: the owner of catalog data remembers the nodes with local copies; in case of updates, it sends an invalidation message to those nodes. Implicit invalidation: old catalog data is identified during runtime (by adding version numbers and time stamps to query messages).
    Workflow for distributed query processing (figure).
    Data localization. Objective: creating subqueries in consideration of the data distribution. Assumptions: fragmentation is defined by fragmentation expressions; each fragment is allocated at only one node (no replication); fragmentation expressions and locations of the fragments are stored in the catalog. Main tasks: replace accesses to global relations with accesses to the fragments; insert the reconstruction expression into the algebra query; apply basic algebraic simplifications to the query.
    Example: horizontal reduction. Schema: $Projects_1 = \sigma_{Budget \le 150.000}(Projects)$, $Projects_2 = \sigma_{150.000 < Budget \le 200.000}(Projects)$, $Projects_3 = \sigma_{Budget > 200.000}(Projects)$. Reconstruction expression (horizontal fragmentation): $Projects = Projects_1 \cup Projects_2 \cup Projects_3$. Example query: $\sigma_{Location = 'Saarbr.' \land Budget \le 100.000}(Projects)$. After replacing references to global relations: $\sigma_{Location = 'Saarbr.' \land Budget \le 100.000}(Projects_1 \cup Projects_2 \cup Projects_3)$. Further optimization is possible!
  • 17. Basics of distributed query processing: Data localization
    Query simplification: horizontal reduction. Objective: eliminate unnecessary subqueries. Horizontal reduction rule: given fragments of $R$ as $F_R = \{R_1, \ldots, R_n\}$ with $R_i = \sigma_{p_i}(R)$, all fragments $R_i$ for which $\sigma_{p_s}(R_i) = \emptyset$ can be removed, with $p_s$ denoting the query's selection predicate: $\sigma_{p_s}(R_i) = \emptyset \Leftarrow \forall x \in R : \neg(p_s(x) \land p_i(x))$. The selection with the query predicate $p_s$ on fragment $R_i$ is empty if $p_s$ contradicts the fragmentation predicate $p_i$ of $R_i$, i.e., $p_s$ and $p_i$ are never both true for any tuple.
    Example: horizontal reduction. Query with fragmentation expression: $\sigma_{Location = 'Saarbr.' \land Budget \le 100.000}(Projects_1 \cup Projects_2 \cup Projects_3)$. Fragment definitions: $Projects_1 = \sigma_{Budget \le 150.000}(Projects)$, $Projects_2 = \sigma_{150.000 < Budget \le 200.000}(Projects)$, $Projects_3 = \sigma_{Budget > 200.000}(Projects)$. Because $\sigma_{Budget \le 100.000}(Projects_2) = \emptyset$ and $\sigma_{Budget \le 100.000}(Projects_3) = \emptyset$, we obtain the reduced query $\sigma_{Location = 'Saarbr.'}(\sigma_{Budget \le 100.000}(Projects_1))$.
    Query simplification: join reduction. Larger joins are replaced by multiple partial joins on fragments. Distributive law: $(R_1 \cup R_2) \bowtie S = (R_1 \bowtie S) \cup (R_2 \bowtie S)$. Eliminate all union fragments that will return an empty result. Expectations: elimination of partial joins producing empty results (depends on fragmentation optimality); many joins on small relations have lower resource costs than one large join (depends on fragmentation and the applied join algorithms); smaller joins can be executed in parallel (might decrease response time but might also increase communication costs).
    Example: join reduction. Schema: $Projects(PNo, PName, Budget, Location)$ with $Projects_1 = \sigma_{PNo = 'P1' \lor PNo = 'P2'}(Projects)$, $Projects_2 = \sigma_{PNo = 'P3'}(Projects)$, $Projects_3 = \sigma_{PNo = 'P4'}(Projects)$; $Assignment(ENo, PNo, Duration)$ with $Assignment_1 = \sigma_{PNo = 'P1' \lor PNo = 'P2'}(Assignment)$, $Assignment_2 = \sigma_{PNo = 'P3' \lor PNo = 'P4'}(Assignment)$. Example query: select * from Projects p, Assignment a where p.PNo = a.PNo. In relational algebra: $Projects \bowtie Assignment$.
  • 18. Basics of distributed query processing: Data localization
    Example: join reduction. Query: $Projects \bowtie Assignment$. After replacing global relations with reconstruction expressions: $(Projects_1 \cup Projects_2 \cup Projects_3) \bowtie (Assignment_1 \cup Assignment_2)$. After applying the distributive law: $(Projects_1 \bowtie Assignment_1) \cup (Projects_1 \bowtie Assignment_2) \cup (Projects_2 \bowtie Assignment_1) \cup (Projects_2 \bowtie Assignment_2) \cup (Projects_3 \bowtie Assignment_1) \cup (Projects_3 \bowtie Assignment_2)$. Further optimization is possible!
    Query simplification: join reduction. Join reduction rule: given fragments of $R$ as $F_R = \{R_1, \ldots, R_n\}$ and fragments of $S$ as $F_S = \{S_1, \ldots, S_m\}$, apply the distributive law, e.g., $(R_1 \cup R_2) \bowtie (S_1 \cup S_2) = (R_1 \bowtie S_1) \cup (R_1 \bowtie S_2) \cup (R_2 \bowtie S_1) \cup (R_2 \bowtie S_2)$. All partial joins between fragments $R_i$ and $S_j$ for which $R_i \bowtie S_j = \emptyset$ can be removed: $R_i \bowtie S_j = \emptyset \Leftarrow \forall x \in R_i, y \in S_j : \neg(p_i(x) \land p_j(y))$. The join between fragments $R_i$ and $S_j$ is empty if their respective fragmentation predicates (on the join attribute) contradict, i.e., there is no tuple combination $x$ and $y$ such that both partitioning predicates are fulfilled at the same time.
    Example: join reduction. Some of the six partial joins are empty, e.g., $Projects_1 \bowtie Assignment_2 = \emptyset$, because their fragmentation expressions contradict: $Projects_1 = \sigma_{PNo = 'P1' \lor PNo = 'P2'}(Projects)$ and $Assignment_2 = \sigma_{PNo = 'P3' \lor PNo = 'P4'}(Assignment)$. Reduced query: $(Projects_1 \bowtie Assignment_1) \cup (Projects_2 \bowtie Assignment_2) \cup (Projects_3 \bowtie Assignment_2)$.
    Query simplification: join reduction for horizontal fragmentation. The easiest join reduction case follows from derived horizontal fragmentation: for each fragment of the first relation, there is exactly one matching fragment of the second relation. Simply use the information contained in the reconstruction expression instead of comparing the reconstruction predicates to each other. Join reduction for arbitrary horizontal partitioning might not be beneficial.
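The join reduction example above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each fragment is described by the set of PNo values its fragmentation predicate admits, and the dictionary layout and the function name `reduce_joins` are hypothetical, not part of the slides.

```python
from itertools import product

# Assumption: each fragmentation predicate is a disjunction of equalities on
# the join attribute PNo, so it can be represented as a set of allowed values.
projects = {
    "Projects1": {"P1", "P2"},
    "Projects2": {"P3"},
    "Projects3": {"P4"},
}
assignment = {
    "Assignment1": {"P1", "P2"},
    "Assignment2": {"P3", "P4"},
}


def reduce_joins(r_frags, s_frags):
    """Apply the distributive law and drop partial joins whose
    fragmentation predicates contradict (no common join value)."""
    return [
        (r_name, s_name)
        for (r_name, r_vals), (s_name, s_vals) in product(r_frags.items(), s_frags.items())
        if r_vals & s_vals  # a non-empty intersection means the join may produce tuples
    ]


partial_joins = reduce_joins(projects, assignment)
```

With the fragments from the example, three of the six partial joins remain, matching the reduced query on the slide.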
  • 19. Basics of distributed query processing: Data localization
    Query simplification: join reduction for derived horizontal fragmentation. Example: $Projects(PNo, PName, Budget, Location)$ with $Projects_1 = \sigma_{PNo = 'P1' \lor PNo = 'P2'}(Projects)$, $Projects_2 = \sigma_{PNo = 'P3' \lor PNo = 'P4'}(Projects)$; $Assignment(ENo, PNo, Duration)$ with $Assignment_1 = Assignment \ltimes Projects_1$, $Assignment_2 = Assignment \ltimes Projects_2$. Query in relational algebra: $Projects \bowtie Assignment$. After replacing global relations with reconstruction expressions: $(Projects_1 \cup Projects_2) \bowtie (Assignment_1 \cup Assignment_2)$. After applying the distributive law: $(Projects_1 \bowtie Assignment_1) \cup (Projects_1 \bowtie Assignment_2) \cup (Projects_2 \bowtie Assignment_1) \cup (Projects_2 \bowtie Assignment_2)$. Reduced query (using information about the fragmentation of relation Assignment directly): $(Projects_1 \bowtie Assignment_1) \cup (Projects_2 \bowtie Assignment_2)$.
    Query simplification: vertical reduction. Vertical fragmentation rule: given fragments of $R$ as $F_R = \{R_1, \ldots, R_n\}$ with $R_i = \pi_{\beta_i}(R)$, where $\beta_i$ enumerates a subset of $R$'s attributes, avoid joining fragments containing "useless" attributes, i.e., fragments containing only attributes that are neither referenced in the query nor output in the result.
    Example: vertical reduction. Schema: $Projects(PNo, PName, Budget, Location)$ with $Projects_1 = \pi_{PNo, PName, Location}(Projects)$ and $Projects_2 = \pi_{PNo, Budget}(Projects)$. Reconstruction expression: $Projects = Projects_1 \bowtie Projects_2$. Example query: $\pi_{PName}(Projects)$. After replacing references to global relations: $\pi_{PName}(Projects_1 \bowtie Projects_2)$. After removing unnecessary fragments: $\pi_{PName}(Projects_1)$.
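As a minimal sketch of the vertical reduction rule, the following Python snippet keeps only those vertical fragments that contribute a non-key attribute needed by the query. The function name `reduce_vertical` and the set-based representation are assumptions made for illustration; the sketch also assumes that non-key attributes are not repeated across fragments.

```python
def reduce_vertical(fragments, key, needed):
    """Return the names of the vertical fragments that must be joined.

    fragments: {fragment name: set of attributes}, every fragment contains the key
    key:       set of key attributes shared by all fragments
    needed:    set of attributes referenced by the query or output in the result
    """
    kept = [name for name, attrs in fragments.items() if (attrs - key) & needed]
    if not kept:
        # The query touches key attributes only: any single fragment suffices.
        kept = [next(iter(fragments))]
    return kept


project_fragments = {
    "Projects1": {"PNo", "PName", "Location"},
    "Projects2": {"PNo", "Budget"},
}

# Example query from the slide: projection on PName only.
only_name = reduce_vertical(project_fragments, {"PNo"}, {"PName"})
name_and_budget = reduce_vertical(project_fragments, {"PNo"}, {"PName", "Budget"})
```

For the projection on PName only `Projects1` is kept, whereas a query that also needs Budget still has to join both fragments.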
  • 20. Basics of distributed query processing: Data localization
    Query simplification: hybrid fragmentation. The reconstruction expression introduces combinations of joins and unions. General guidelines: remove empty relations generated by contradicting selections on horizontal fragments; remove useless relations generated by vertical fragments; break and distribute joins, and eliminate empty fragment joins.
    Qualified relations. Purpose: supporting algebraic optimization of queries involving fragments; annotating fragments and intermediate relations with predicates; estimating the size of a relation; an extension of relational algebra. Definition: a qualified relation is a pair $[R : q_R]$ where $R$ is a relation and $q_R$ is a predicate. Example: horizontal fragments are represented as qualified relations whose qualification predicate corresponds to the fragmentation expression, e.g., $[Projects : \sigma_{PNo = 'P1' \lor PNo = 'P2'}]$.
    Extended relational algebra:
      (1) $E := \sigma_F [R : q_R] \rightarrow [E : F \land q_R]$
      (2) $E := \pi_A [R : q_R] \rightarrow [E : q_R]$
      (3) $E := [R : q_R] \times [S : q_S] \rightarrow [E : q_R \land q_S]$
      (4) $E := [R : q_R] - [S : q_S] \rightarrow [E : q_R]$
      (5) $E := [R : q_R] \cup [S : q_S] \rightarrow [E : q_R \lor q_S]$
      (6) $E := [R : q_R] \bowtie_F [S : q_S] \rightarrow [E : q_R \land q_S \land F]$
    Example query: $\sigma_{100.000 \le Budget \le 200.000}(Projects)$.
      $E_1 = \sigma_{100.000 \le Budget \le 200.000} [Projects_1 : Budget \le 150.000]$ gives $[E_1 : (100.000 \le Budget \le 200.000) \land (Budget \le 150.000)]$, i.e., $[E_1 : 100.000 \le Budget \le 150.000]$.
      $E_2 = \sigma_{100.000 \le Budget \le 200.000} [Projects_2 : 150.000 < Budget \le 200.000]$ gives $[E_2 : (100.000 \le Budget \le 200.000) \land (150.000 < Budget \le 200.000)]$, i.e., $[E_2 : 150.000 < Budget \le 200.000]$.
      $E_3 = \sigma_{100.000 \le Budget \le 200.000} [Projects_3 : Budget > 200.000]$ gives $[E_3 : (100.000 \le Budget \le 200.000) \land (Budget > 200.000)]$, which is contradictory, so $E_3 = \emptyset$.
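The contradiction test behind horizontal reduction and the qualified-relation example can be sketched for the special case of range predicates on a single attribute. The `Interval` class and the helper `relevant_fragments` below are hypothetical names introduced for this illustration; predicates on Budget are modeled as intervals, and a fragment is kept only if its interval overlaps the query interval.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Interval:
    """Range predicate lo <(=) Budget <(=) hi; *_open marks a strict bound."""
    lo: float
    hi: float
    lo_open: bool = False
    hi_open: bool = False

    def contains(self, v: float) -> bool:
        above = v > self.lo or (v == self.lo and not self.lo_open)
        below = v < self.hi or (v == self.hi and not self.hi_open)
        return above and below

    def intersects(self, other: "Interval") -> bool:
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        if lo != hi:
            return lo < hi
        # The ranges touch in a single point: both must include it.
        return self.contains(lo) and other.contains(lo)


# Fragmentation predicates of Projects1..Projects3 from the slides.
FRAGMENTS = {
    "Projects1": Interval(-math.inf, 150_000),
    "Projects2": Interval(150_000, 200_000, lo_open=True),
    "Projects3": Interval(200_000, math.inf, lo_open=True),
}


def relevant_fragments(query: Interval, fragments=FRAGMENTS):
    """Keep the fragments whose qualification does not contradict the query."""
    return [name for name, q in fragments.items() if query.intersects(q)]


budget_le_100k = relevant_fragments(Interval(-math.inf, 100_000))
budget_100k_to_200k = relevant_fragments(Interval(100_000, 200_000))
```

The first call corresponds to the horizontal reduction example (only `Projects1` survives); the second corresponds to the qualified-relation example, where `Projects3` is dropped because its qualification contradicts the query predicate.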
  • 21. Global query optimization
    Contents: 1 Motivation. 2 Detour on centralized query processing (translating SQL into relational algebra, phases of centralized query processing, query parsing, query transformation, query optimization). 3 Basics of distributed query processing (phases of distributed query processing, introduction, meta data management, data localization). 4 Global query optimization (main questions, global query optimizer, distributed cost model, join order optimization, total time models, response time models). 5 Summary.
    Workflow for distributed query processing (figure).
    Introduction to global query optimization. Main questions: When to optimize? What criteria to optimize? Where to execute the query?
  • 22. Global query optimization: Main questions
    When to optimize? Full compile time optimization: the full query execution plan is computed at compile time. Assumption: applications use canned queries (prepared and parameterized SQL statements). Pros: queries can be executed directly. Cons: complex to model; much information is unknown or too expensive to gather (collecting statistics on all nodes?); statistics become outdated, especially since machine load and network properties are very volatile.
    When to optimize? Fully dynamic optimization: each query is optimized individually at runtime. This technique heavily relies on heuristics, learning algorithms, and luck. Pros: might produce very good plans; uses the current network state; also usable for ad-hoc queries. Cons: result quality might be very unpredictable; complex algorithms and heuristics; difficult to keep statistics up-to-date.
    When to optimize? Semi-dynamic optimization: pre-optimize the query. During query execution, test whether execution runs as expected during optimization, e.g., are tuples/fragments delivered in time? Does the network adhere to the predicted properties? Are there any bad network latencies? If execution shows severe deviations, compute a new query plan for all the parts that have not yet been executed. This only makes sense for queries that run for a longer time.
    When to optimize? Hierarchical optimization: plans are created in multiple stages. Global-Local-Plans: the global query optimizer creates a global query plan, focusing on data transfer (which intermediate results are to be computed by which node? How should intermediate results be shipped?). Local query optimizers create local query plans, deciding on query plan layout, algorithms, indexes, etc. to deliver the requested intermediate result. The second variant is Two-Step-Plans.
  • 23. Global query optimization: Main questions
    When to optimize? Hierarchical optimization (continued): plans are created in multiple stages, as Global-Local-Plans or Two-Step-Plans. Two-Step-Plans: during compile time, only stable parts of the plan are computed (join order, join methods, access paths, etc.). During query execution, all missing plan elements are added (node selection, transfer policies, etc.). Both steps can be performed using traditional query optimization techniques: plan enumeration with dynamic programming; complexity is manageable because each optimization problem is much easier than a full optimization; during runtime optimization, fresh statistics are available. Most distributed database management systems use semi-dynamic or hierarchical optimization techniques (or both).
    What criteria to optimize? Important aspects for global optimization: communication operators; fragment cardinalities; order of operations; join ordering, because permutations of the joins within the query may lead to improvements of orders of magnitude. Most important alternative optimization criteria: query response time; resource consumption; total query execution costs; ...
    Where to execute the query? The query optimizer has to decide which parts of the query have to be shipped to which node (cost model). In heavily replicated scenarios, clever hybrid shipping can effectively be used for load balancing: move expensive computations to lightly loaded nodes and avoid expensive communication.
    Global query optimization deals with finding the "best" ordering of operations in the query (extended by fragmentation expressions and including communication operations) that minimizes a cost function. Input: an algebraic query extended by fragmentation expressions. Output: an algebraic query or query execution plan with communication operations.
  • 24. Global query optimization: Global query optimizer
    Basics of global query optimization. Objective: choose a cost-efficient execution plan based on the algebraic query plan given as input; decide which parts of the query have to be transferred to which node. Prerequisites: knowledge about fragmentation, fragment/relation sizes, data distribution, and costs of operations.
    Optimizer components. The global optimizer has three main components. The search space: the set of alternative equivalent execution plans that represent the input query. The cost model: predicts the costs of a given query execution plan. The search strategy: explores the search space and selects the best plan.
    Phases of optimization. (1) Spanning the search space using transformation rules, yielding equivalent search plans. (2) Applying a search strategy and a cost model to choose an efficient plan.
    Search space. Query: SELECT EName, Title FROM Employees e, Assignment a, Project p WHERE e.EID = ENo AND a.PNo=p.PNo. Equivalent join trees (figure). Main focus: join trees and join ordering. There are $O(N!)$ different join trees for $N$ relations when applying commutativity and associativity rules.
  • 25. Global query optimization: Global query optimizer
    Search space. Tree variants for join order optimization. Linear join trees: all inner nodes have at least one leaf node (base relation) as child; this reduces the search space. Bushy trees: may have inner nodes with no base relation as child; high potential for parallelization. (Figure: a bushy join tree and a linear join tree over R1, R2, R3, R4.)
    Search strategies. A search strategy needs to reduce the search space. Applying heuristics (similar to centralized algebraic optimization): perform projections and selections when accessing base relations; avoid Cartesian products and enforce joins. Applying further heuristics influencing the shape of the join tree: reducing the size of the search space vs. exhibiting parallelism; linear vs. bushy trees.
    Deterministic search strategy. Systematic generation of query plans: starting with plans accessing the base relations, complex plans are constructed by combining easier plans, e.g., joining one more relation at each step. Exhaustive search guarantees finding the best plan.
    Example deterministic search strategies. Dynamic programming: (almost) exhaustive search by building all possible plans (breadth-first); "very bad" partial plans are pruned at an early stage; guaranteed to find the best plan; only possible for a small number (5-6) of relations. Greedy algorithm: only one plan is built (depth-first).
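To make the dynamic programming strategy more concrete, here is a small sketch that enumerates linear (left-deep) join orders bottom-up and keeps only the cheapest plan per set of relations. It is an illustration, not the optimizer described in the slides: the cost of a plan is assumed to be the sum of its intermediate result sizes, the cardinalities and selectivities are made-up numbers, and the function name `best_linear_order` is hypothetical.

```python
from itertools import combinations


def best_linear_order(card, sel):
    """card: {relation: cardinality}
    sel:  {frozenset({a, b}): join selectivity} for pairs with a join predicate
    Returns (cost, result size, join order) or None if only plans with
    Cartesian products exist."""
    rels = sorted(card)
    # best[S] = (cost, size, order) of the cheapest left-deep plan for set S
    best = {frozenset([r]): (0.0, float(card[r]), (r,)) for r in rels}
    for k in range(2, len(rels) + 1):
        for subset in combinations(rels, k):
            s = frozenset(subset)
            for r in subset:  # r is the relation joined last
                rest = s - {r}
                if rest not in best:
                    continue
                preds = [sel[frozenset((r, o))] for o in rest if frozenset((r, o)) in sel]
                if not preds:  # heuristic from the slides: avoid Cartesian products
                    continue
                cost, size, order = best[rest]
                new_size = size * card[r]
                for p in preds:
                    new_size *= p
                new_cost = cost + new_size
                if s not in best or new_cost < best[s][0]:
                    best[s] = (new_cost, new_size, order + (r,))
    return best.get(frozenset(rels))


# Made-up statistics for the three relations of the example query.
card = {"Employees": 1000, "Assignment": 5000, "Project": 100}
sel = {
    frozenset(("Employees", "Assignment")): 0.001,
    frozenset(("Assignment", "Project")): 0.002,
}
plan = best_linear_order(card, sel)
```

Under these assumed numbers the cheapest linear plan joins Project and Assignment first and Employees last, because the first join produces the smaller intermediate result.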
  • 26. Global query optimization: Global query optimizer, Distributed cost model
    Search strategies. Randomized search strategy: one or more start plans are built using a greedy strategy (depth-first search); the start plans are improved by examining "neighbor plans". A neighbor plan results from applying transformation rules, e.g., exchanging two arbitrarily chosen operations. Better performance with a higher number of relations. No guarantee to find the best plan.
    Distributed cost model. Components. Cost functions: estimating the costs to execute operations. Statistics: data about relation sizes, attribute domains, value distribution, etc. Formulas: determine cardinalities, sizes of intermediate results, etc.
    Cost functions. Total execution time: the sum of all costs, i.e., the sum of all processing times at all nodes involved in answering the query: $T_{total} = T_{CPU} \cdot \#insts + T_{I/O} \cdot \#ops_{I/O} + T_{MSG} \cdot \#msgs + T_{TR} \cdot \#bytes$, where $T_{CPU}$ is the time to process a CPU instruction, $T_{I/O}$ the time for a disk access, $T_{MSG}$ the time to send and receive a message, and $T_{TR}$ the time to transmit a data unit from one node to another. $\#bytes$ is the sum of the sizes of all messages. Typical assumption: $T_{TR}$ is constant, although this might not be true for remote nodes.
    Cost functions. Components of total execution time. Local processing costs/time: $T_{local} = T_{CPU} \cdot \#insts + T_{I/O} \cdot \#ops_{I/O}$. Communication costs/time: $T_{comm} = T_{MSG} \cdot \#msgs + T_{TR} \cdot \#bytes$. The coefficients ($T_{CPU}$, $T_{I/O}$, $T_{MSG}$, $T_{TR}$) characterize a specific distributed database system. WAN (Wide Area Network): communication time is dominant. LAN (Local Area Network): local costs also play an important role.
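The total time formula translates directly into code. The sketch below is only an illustration: the class `CostCoefficients`, the function names, and all numeric coefficient values are assumptions, chosen so that communication dominates as in the WAN case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostCoefficients:
    """System-specific coefficients of the cost model (time units are arbitrary)."""
    t_cpu: float  # time to process one CPU instruction
    t_io: float   # time for one disk access
    t_msg: float  # time to send and receive one message
    t_tr: float   # time to transmit one data unit between nodes


def local_time(c, insts, io_ops):
    return c.t_cpu * insts + c.t_io * io_ops


def comm_time(c, msgs, nbytes):
    return c.t_msg * msgs + c.t_tr * nbytes


def total_time(c, insts, io_ops, msgs, nbytes):
    """T_total = T_local + T_comm, summed over all nodes involved."""
    return local_time(c, insts, io_ops) + comm_time(c, msgs, nbytes)


# Hypothetical WAN-like setting: messages and transfers are expensive.
wan = CostCoefficients(t_cpu=1, t_io=1_000, t_msg=100_000, t_tr=50)
t_local = local_time(wan, insts=10_000, io_ops=20)
t_comm = comm_time(wan, msgs=4, nbytes=8_000)
t_total = total_time(wan, insts=10_000, io_ops=20, msgs=4, nbytes=8_000)
```

Swapping in smaller values for `t_msg` and `t_tr` would model a LAN, where the local term is no longer negligible.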
  • 27. Global query optimization: Distributed cost model
    Cost functions. Communication costs (figure not preserved; the formulas describe two transfers of $x$ and $y$ data units): $T_{comm,total} = 2 \cdot T_{MSG} + T_{TR} \cdot (x + y)$ and $T_{comm,response} = \max\{T_{MSG} + T_{TR} \cdot x,\; T_{MSG} + T_{TR} \cdot y\}$.
    Total time vs. response time. Response time: the time that elapses between query initiation and completion, considering parallel local processing and parallel communication: $T_{response} = T_{CPU} \cdot seq\_\#insts + T_{I/O} \cdot seq\_\#ops_{I/O} + T_{MSG} \cdot seq\_\#msgs + T_{TR} \cdot seq\_\#bytes$, where $seq\_\#x$ represents the maximum number of instructions (insts), I/O operations (ops$_{I/O}$), messages (msgs), or bytes (bytes) that have to be processed sequentially. Minimizing response time does not imply that the total time is also minimized!
    Statistics. Good statistics are crucial. Most important cost factor: the size of intermediate results produced during execution. Sizes are estimated using statistics and formulas. There is a tradeoff between precision and the costs of managing statistics.
    Typical statistics for a relation $R$ fragmented as $R_1, R_2, \ldots, R_r$ with attributes $A_1, \ldots, A_n$: the length of each attribute $A_i$ in bytes, $length(A_i)$; the number of distinct values for each attribute $A_i$ and each fragment $R_j$, $values_{A_i,R_j} := card(\pi_{A_i}(R_j))$; minimum and maximum attribute values, $\min(A_i)$ and $\max(A_i)$; the number of distinct values (cardinality) of the attribute domains, $card(dom[A_i])$; the number of tuples in each fragment $R_j$, $card(R_j)$.
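A short worked example shows how the two communication cost formulas diverge when the transfers run in parallel. The function names and the numeric values are assumptions for illustration only.

```python
def comm_total(t_msg, t_tr, x, y):
    """Total communication time for two transfers of x and y data units."""
    return 2 * t_msg + t_tr * (x + y)


def comm_response(t_msg, t_tr, x, y):
    """Response time when both transfers proceed in parallel."""
    return max(t_msg + t_tr * x, t_msg + t_tr * y)


# Hypothetical values: message overhead 10, transfer time 1 per data unit.
total = comm_total(t_msg=10, t_tr=1, x=100, y=400)
response = comm_response(t_msg=10, t_tr=1, x=100, y=400)
```

The total time counts both transfers, while the response time is determined by the slower of the two; this is why a plan that minimizes one does not necessarily minimize the other.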
  • 28. Global query optimization: Distributed cost model
    Additional statistics. A histogram for each attribute $A_i$ to approximate the frequency distribution. A join selectivity factor for some pairs of relations: $SF_J(R, S) = \frac{card(R \bowtie S)}{card(R) \cdot card(S)}$. Good (high) selectivity: $SF_J = 0.001$; bad (low) selectivity: $SF_J = 0.5$.
    Cardinality estimation. Assumptions: independence between attributes; uniform distribution of attribute values. Selectivity: the ratio between the expected number of result tuples and the number of tuples of the input relation, $SF = \frac{\text{expected result size}}{\text{cardinality of the input relation}}$. Example: $\sigma_F(R)$ returns 10% of $R$'s tuples, so $SF_S(F, R) = 0.1$ (SF: selectivity factor). Cardinality: estimate the result size (cardinality of the output relation), e.g., with $SF_S(F, R) = 0.1$: $card(\sigma_F(R)) = SF_S(F, R) \cdot card(R)$.
    Selection. Cardinality: $card(\sigma_F(R)) = SF_S(F, R) \cdot card(R)$. The selectivity depends on the selection predicates $p(A)$ and constants $v$:
      $SF_S(A = v, R) = \frac{1}{values_{A,R}} = \frac{1}{card(\pi_A(R))}$
      $SF_S(A < v, R) = \frac{v - \min(A)}{\max(A) - \min(A)}$
      $SF_S(A > v, R) = \frac{\max(A) - v}{\max(A) - \min(A)}$
      $SF_S(v_1 < A < v_2, R) = \frac{v_2 - v_1}{\max(A) - \min(A)}$
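The selectivity formulas for simple predicates can be written as small helper functions. This is a sketch under the slide's assumptions (uniform distribution of attribute values); the function names and the example statistics are hypothetical, and the clamping to the range [0, 1] is an addition not stated in the slides.

```python
def _clamp(x):
    # Addition for robustness: selectivities outside [0, 1] are cut off.
    return max(0.0, min(1.0, x))


def sf_eq(distinct_values):
    """SF_S(A = v, R) = 1 / values_{A,R}"""
    return 1.0 / distinct_values


def sf_lt(v, a_min, a_max):
    """SF_S(A < v, R) = (v - min(A)) / (max(A) - min(A))"""
    return _clamp((v - a_min) / (a_max - a_min))


def sf_gt(v, a_min, a_max):
    """SF_S(A > v, R) = (max(A) - v) / (max(A) - min(A))"""
    return _clamp((a_max - v) / (a_max - a_min))


def sf_between(v1, v2, a_min, a_max):
    """SF_S(v1 < A < v2, R) = (v2 - v1) / (max(A) - min(A))"""
    return _clamp((v2 - v1) / (a_max - a_min))


def card_selection(sf, card_r):
    """card(sigma_F(R)) = SF_S(F, R) * card(R)"""
    return sf * card_r


# Hypothetical statistics: Budget ranges from 0 to 400000, R has 1000 tuples.
sf = sf_lt(100_000, a_min=0, a_max=400_000)
estimated = card_selection(sf, 1000)
```

With these assumed statistics, a quarter of the value range lies below the constant, so a quarter of the tuples is expected to qualify.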
  • 29. Global query optimization: Distributed cost model
    Selection. Cardinality: $card(\sigma_F(R)) = SF_S(F, R) \cdot card(R)$. The selectivity depends on the selection predicates $p(A)$ and constants $v$; for combined predicates:
      $SF_S(p(A_i) \land p(A_j), R) = SF_S(p(A_i), R) \cdot SF_S(p(A_j), R)$
      $SF_S(p(A_i) \lor p(A_j), R) = SF_S(p(A_i), R) + SF_S(p(A_j), R) - SF_S(p(A_i), R) \cdot SF_S(p(A_j), R)$
    Projection. Cardinality without duplicate elimination: $card(\pi_A(R)) = card(R)$. With duplicate elimination (if defined on an arbitrary attribute $A$): $card(\pi_A(R)) = values_{A,R}$. With duplicate elimination (if one of the attributes is the primary key): $card(\pi_{A_i, \ldots}(R)) = card(R)$. Cardinalities for projections on arbitrary combinations of attributes are hard to predict because attribute correlations are unknown.
    Cartesian product. Cardinality: $card(R \times S) = card(R) \cdot card(S)$.
    Joins. Given: $R \bowtie S$ with $R(A, B)$ and $S(B, C)$. Upper bound: the size of the Cartesian product. Natural join on attribute $B$: if no $B$ values are shared between $R$ and $S$, $card(R \bowtie S) = 0$; with a foreign key relationship $R.B \rightarrow S.B$, $card(R \bowtie S) = card(R)$; if all tuples in $R.B$ and $S.B$ have the same value, $card(R \bowtie S) = card(R) \cdot card(S)$.
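The rules for conjunctions and disjunctions follow from the independence assumption and can be sketched as two one-line functions. The names `sf_and` and `sf_or` and the example selectivities are made up for illustration.

```python
def sf_and(sf_i, sf_j):
    """Selectivity of p(A_i) AND p(A_j), assuming independent attributes."""
    return sf_i * sf_j


def sf_or(sf_i, sf_j):
    """Selectivity of p(A_i) OR p(A_j): inclusion-exclusion under independence."""
    return sf_i + sf_j - sf_i * sf_j


# Hypothetical selectivities of two predicates on different attributes.
both = sf_and(0.5, 0.2)
either = sf_or(0.5, 0.5)
```

If the attributes are correlated, the independence assumption breaks and these estimates can be far off, which is one reason the slides call good statistics crucial.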
  • 30. Global query optimization: Distributed cost model
    Joins. Cardinality. Given: $R \bowtie S$ with $R(A, B)$ and $S(B, C)$. Upper bound: the size of the Cartesian product. Natural join on attribute $B$, estimate: $card(R \bowtie S) = \frac{card(R) \cdot card(S)}{\max\{values_{B,R},\, values_{B,S}\}}$. Alternatively, store statistics (join selectivity $SF_J$) for important joins: $card(R \bowtie S) = SF_J \cdot card(R) \cdot card(S)$.
    Union and difference. Cardinality is difficult to estimate because duplicates are removed. Union: upper bound $card(R \cup S) = card(R) + card(S)$; lower bound $card(R \cup S) = \max\{card(R), card(S)\}$. Difference: upper bound $card(R - S) = card(R)$; lower bound $card(R - S) = 0$.
    Selectivity estimation using histograms. In reality, the distribution of attribute values in a relation is often not uniform. A histogram consists of a set of buckets $b_i$ (figure: example histogram on attribute $A$ of relation $R$). Each bucket $b_i$ is defined by: a range $range_i$, the range of values in the attribute domain $dom[A]$; a frequency $f_i$, the number of tuples of $R$ where $R.A \in range_i$; and distinct values $d_i$, the number of distinct values of $A$ where $R.A \in range_i$.
    Equality predicate. Given the predicate $A = v$, identify the bucket $b_i$ with $v \in range_i$. Within that bucket, $SF_S(A = v, R) = \frac{1}{d_i}$, so $card(\sigma_{A=v}(R)) = SF_S(A = v, R) \cdot f_i = \frac{f_i}{d_i}$.
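The join estimate and the bounds for union and difference can be captured in a few helper functions. This is a sketch with hypothetical function names and made-up statistics; it simply evaluates the formulas above.

```python
def card_join(card_r, card_s, values_b_r, values_b_s):
    """Estimate for a natural join on B:
    card(R) * card(S) / max(values_{B,R}, values_{B,S})"""
    return card_r * card_s / max(values_b_r, values_b_s)


def card_join_sf(sf_j, card_r, card_s):
    """Estimate using a stored join selectivity factor SF_J."""
    return sf_j * card_r * card_s


def union_bounds(card_r, card_s):
    """(lower bound, upper bound) for card(R union S)."""
    return max(card_r, card_s), card_r + card_s


def difference_bounds(card_r):
    """(lower bound, upper bound) for card(R - S)."""
    return 0, card_r


# Made-up statistics: 1000 and 5000 tuples, 100 distinct B values on each side.
join_estimate = card_join(1000, 5000, values_b_r=100, values_b_s=100)
u_low, u_high = union_bounds(1000, 5000)
```

Both join estimates agree when the stored selectivity factor equals the reciprocal of the larger distinct-value count.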
  • 31. Global query optimization: Distributed cost model, Join order optimization
    Selectivity estimation using histograms. Range predicates. Given the predicate $A \le v$, identify the buckets that overlap the queried range and sum up their frequencies: $card(\sigma_{A \le v}(R)) = \sum_{j=1}^{i-1} f_j + \frac{v - \min(range_i)}{\max(range_i) - \min(range_i)} \cdot f_i$, where buckets $1, \ldots, i-1$ lie completely within the queried range and bucket $i$ only partially overlaps it.
    Phases of optimization. (1) Spanning the search space using transformation rules, yielding equivalent search plans. (2) Applying a search strategy and a cost model to choose an efficient plan. Main focus: join trees and join ordering.
    Join order optimization. Simplifying assumptions: no distinction between fragments and relations; ignoring local processing time; ignoring other operations (selection, projection); no pipelining; ignoring data transfer to the result site.
    Join order optimization for two relations. Determine the join order for two relations $R \bowtie S$: transfer the smaller relation to minimize the network load.
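The histogram-based estimates for equality and range predicates can be sketched as follows. The `Bucket` class, the function names, and the example histogram are assumptions for illustration; buckets are taken to be sorted, non-overlapping, and half-open (`lo` inclusive, `hi` exclusive), which the slides do not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bucket:
    lo: float      # lower end of range_i (inclusive)
    hi: float      # upper end of range_i (exclusive, by assumption)
    freq: int      # f_i: number of tuples with A in range_i
    distinct: int  # d_i: number of distinct A values in range_i


def card_eq(buckets, v):
    """card(sigma_{A=v}(R)) = f_i / d_i for the bucket containing v."""
    for b in buckets:
        if b.lo <= v < b.hi:
            return b.freq / b.distinct
    return 0.0


def card_le(buckets, v):
    """card(sigma_{A<=v}(R)): full buckets below v plus a linear share
    of the bucket that only partially overlaps the queried range."""
    total = 0.0
    for b in buckets:
        if b.hi <= v:
            total += b.freq
        elif b.lo <= v:
            total += (v - b.lo) / (b.hi - b.lo) * b.freq
    return total


# Made-up histogram on attribute A.
histogram = [
    Bucket(0, 100, freq=50, distinct=10),
    Bucket(100, 200, freq=200, distinct=20),
    Bucket(200, 300, freq=30, distinct=5),
]
eq_estimate = card_eq(histogram, 150)
le_estimate = card_le(histogram, 150)
```

For the value 150, the equality estimate uses only the second bucket, while the range estimate adds the full first bucket and half of the second.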
  • 32. Global query optimization: join order optimization, total time models
    Join order optimization for three relations
      - Determine the join order for three relations $R \bowtie_A S \bowtie_B T$
      1. $R \to node_S$; $node_S$: $R' = R \bowtie S$, $R' \to node_T$; $node_T$: $R' \bowtie T$
      2. $S \to node_R$; $node_R$: $R' = R \bowtie S$, $R' \to node_T$; $node_T$: $R' \bowtie T$
      3. $S \to node_T$; $node_T$: $S' = S \bowtie T$, $S' \to node_R$; $node_R$: $S' \bowtie R$
      4. $T \to node_S$; $node_S$: $S' = S \bowtie T$, $S' \to node_R$; $node_R$: $S' \bowtie R$
      5. $T \to node_S$, $R \to node_S$; $node_S$: $R \bowtie S \bowtie T$
    Possible orders
      1. node_R: send R to node_S; node_S: compute join $R' = R \bowtie S$, send R' to node_T; node_T: compute join $R' \bowtie T$
      2. node_S: send S to node_R; node_R: compute join $R' = R \bowtie S$, send R' to node_T; node_T: compute join $R' \bowtie T$
      3. node_S: send S to node_T; node_T: compute join $S' = S \bowtie T$, send S' to node_R; node_R: compute join $S' \bowtie R$
      4. node_T: send T to node_S; node_S: compute join $S' = S \bowtie T$, send S' to node_R; node_R: compute join $S' \bowtie R$
      5. node_T: send T to node_S; node_R: send R to node_S; node_S: compute join $R \bowtie S \bowtie T$
      - Perhaps exploiting parallelism of alternative 5
    Join order optimization with semijoins
      - Considering semijoins for joining two relations R (at node_R) and S (at node_S) results in three alternatives, assuming A is the join attribute:
      1. $R \bowtie_A S = (R \ltimes_A S) \bowtie_A S = (R \ltimes_A \pi_A(S)) \bowtie_A S$
      2. $R \bowtie_A S = R \bowtie_A (S \ltimes_A R)$
      3. $R \bowtie_A S = (R \ltimes_A S) \bowtie_A (S \ltimes_A R)$
      Workflow for alternative 1
      - node_S: compute $S' = \pi_A(S)$, send S' to node_R
      - node_R: compute $R' = R \ltimes_A S'$, send R' to node_S
      - node_S: compute $R' \bowtie_A S$
      Transfer costs (neglecting $T_{MSG}$)
      - $T_{TR} \cdot \mathrm{card}(\pi_A(S)) + T_{TR} \cdot \mathrm{card}(R \ltimes_A S')$
      - Considering full joins ($R \bowtie_A S$) only and assuming that $\mathrm{card}(R) < \mathrm{card}(S)$, the complete relation R would have been sent to node_S, costs: $T_{TR} \cdot \mathrm{card}(R)$
    Semijoins vs. joins: conclusion
      - Decision based on the sizes of the base relations and intermediate results
      - Transfer costs semijoin: $T_{TR} \cdot \mathrm{card}(\pi_A(S)) + T_{TR} \cdot \mathrm{card}(R \ltimes_A S)$
      - Transfer costs standard join: $T_{TR} \cdot \mathrm{card}(R)$
      - The semijoin is preferable if $\mathrm{card}(\pi_A(S)) + \mathrm{card}(R \ltimes_A S) < \mathrm{card}(R)$
    Total time models: basic strategy
      - Coordinator (master) site
      - Exhaustive search
      - Optimization objective: total time
      Input
      - Relational algebra tree
      - Cost model
      - Statistics
      - Location of relations
      Output
      - Optimized query execution plan
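The decision rule between a semijoin-based strategy and a standard join can be written down directly from the two transfer cost formulas. The sketch below is only an illustration under the slides' assumptions (message costs $T_{MSG}$ neglected, costs proportional to the number of transferred tuples); the function names and the sample cardinalities are hypothetical.

```python
def join_transfer_cost(card_r, t_tr=1.0):
    """Standard join: ship the complete (smaller) relation R to node_S."""
    return t_tr * card_r


def semijoin_transfer_cost(card_proj_s, card_r_semi_s, t_tr=1.0):
    """Semijoin: ship pi_A(S) to node_R, then the reduced R back to node_S."""
    return t_tr * card_proj_s + t_tr * card_r_semi_s


def prefer_semijoin(card_r, card_proj_s, card_r_semi_s):
    """Semijoin is preferable if card(pi_A(S)) + card(R semijoin S) < card(R)."""
    return card_proj_s + card_r_semi_s < card_r


# Hypothetical cardinalities: a selective semijoin pays off ...
selective = prefer_semijoin(card_r=1000, card_proj_s=100, card_r_semi_s=200)
# ... while a barely reducing one does not.
unselective = prefer_semijoin(card_r=1000, card_proj_s=600, card_r_semi_s=500)
print(selective, unselective)  # True False
```

The rule only compares transfer volumes; local processing for the extra projection and semijoin is ignored, in line with the simplifying assumptions stated for join order optimization.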
  • 33. Global query optimization: total time models
    Total time models: aspects
      - Cost model
      - Site selection and data transfer
      - Join order optimization
      - Join implementation
    Site selection and data transfer: query shipping
      - Query initiator (node at which the query is issued/optimized) sends the query to other nodes
      - Receiver nodes compute the query result and ship the result back to the initiator
    Site selection and data transfer: data shipping
      - Query remains at the initiator
      - Initiator sends data request messages to other nodes
      - Receiver nodes ship all required data to the initiator
      - Initiator computes result
    Site selection and data transfer: hybrid shipping
      - Initiator sends partial queries to other nodes
      - Other nodes execute some parts of the query and send intermediate results to the initiator
      - Initiator executes remaining query operations (post-processing)
  • 34. Global query optimization: total time models
    Site selection and data transfer for joins: problem
      - Queries make extensive use of joins
      - Computing joins is very expensive
      - Especially in distributed systems: special attention because of fragments and replication
      Basic strategies
      - Ship whole: transferring the complete relation
      - Fetch as needed: transferring the relation piecewise
    Site selection and data transfer for joins: scenario
      - 2 nodes; one (node_R) storing relation R, the other (node_S) storing relation S
      - The query asks for $R \bowtie S$
      - R(A, B): (3, 7), (1, 1), (4, 6), (7, 7), (4, 5), (6, 2), (5, 7)
      - S(B, C, D): (9, 8, 8), (1, 5, 1), (9, 4, 2), (4, 3, 3), (4, 2, 6), (5, 7, 8)
      - $R \bowtie S$ (A, B, C, D): (1, 1, 5, 1), (4, 5, 7, 8)
    Ship whole: execution at node_R
      - node_R: send data request message (relation S) to node_S
      - node_S: send requested data (relation S) to node_R
      - Total costs: 2 messages, 18 attribute values
    Ship whole: execution at node_S
      - node_S: send data request message (relation R) to node_R
      - node_R: send requested data (relation R) to node_S
      - Total costs: 2 messages, 14 attribute values
  • 35. Global query optimization: total time models
    Ship whole: execution at a third node node_X (same example relations R and S)
      - node_X: send data request message (relation R) to node_R
      - node_X: send data request message (relation S) to node_S
      - node_R: send requested data (relation R) to node_X
      - node_S: send requested data (relation S) to node_X
      - Total costs: 4 messages, 18 + 14 = 32 attribute values
    Fetch as needed: execution at node_R
      - node_R: send data request message (tuples of relation S with B = '7') to node_S
      - node_S: send requested data (0 tuples of relation S with B = '7') to node_R
      - node_R: send data request message (tuples of relation S with B = '1') to node_S
      - node_S: send requested data (1 tuple of relation S with B = '1') to node_R
      - ...
      - Total costs: 7 · 2 = 14 messages, 7 + 2 · 3 = 13 attribute values
    Fetch as needed: execution at node_S
      - node_S: send data request message (tuples of relation R with B = '9') to node_R
      - node_R: send requested data (0 tuples of relation R with B = '9') to node_S
      - node_S: send data request message (tuples of relation R with B = '1') to node_R
      - node_R: send requested data (1 tuple of relation R with B = '1') to node_S
      - ...
      - Total costs: 6 · 2 = 12 messages, 6 + 2 · 2 = 10 attribute values
    Ship whole vs. fetch as needed: conclusion
      - Fetch as needed results in a high number of messages
      - Ship whole results in high amounts of transferred data
      - More advanced strategies based on these two basic strategies: semijoin, bitvector join
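The message and attribute-value counts for ship whole and fetch as needed can be reproduced with a short Python sketch. The relation contents below are a reconstruction of the flattened example tables on the slides (an assumption of this sketch, chosen so that all stated totals come out), and the function names are hypothetical.

```python
# Example relations as reconstructed from the slides (assumption).
R = [(3, 7), (1, 1), (4, 6), (7, 7), (4, 5), (6, 2), (5, 7)]            # R(A, B)
S = [(9, 8, 8), (1, 5, 1), (9, 4, 2), (4, 3, 3), (4, 2, 6), (5, 7, 8)]  # S(B, C, D)


def ship_whole(remote):
    """One request message plus one reply carrying the complete remote relation."""
    messages = 2
    values = sum(len(t) for t in remote)
    return messages, values


def fetch_as_needed(local, local_key, remote, remote_key):
    """One request/reply pair per local tuple; each request carries one join value."""
    messages = values = 0
    for t in local:
        key = t[local_key]
        matches = [u for u in remote if u[remote_key] == key]
        messages += 2                                # request + reply
        values += 1 + sum(len(u) for u in matches)   # join value + matching tuples
    return messages, values


print(ship_whole(S))                  # execution at node_R: (2, 18)
print(ship_whole(R))                  # execution at node_S: (2, 14)
print(fetch_as_needed(R, 1, S, 0))    # execution at node_R: (14, 13)
print(fetch_as_needed(S, 0, R, 1))    # execution at node_S: (12, 10)
```

The numbers make the trade-off concrete: fetch as needed moves fewer attribute values but needs six to seven times as many messages.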
  • 36. Global query optimization: total time models
    Semijoin
      - Requesting all join partners in just one step
      - Basic consideration: $R \bowtie S = R \bowtie (S \ltimes R) = R \bowtie (S \ltimes \pi_B(R))$ with B being the join attribute
      Algorithm
      - node_R: determine $\pi_B(R)$ and send the result to node_S
      - node_S: determine $S' = S \ltimes \pi_B(R) = S \ltimes R$ and send the result to node_R
      - node_R: determine $R \bowtie S' = R \bowtie S$
    Bitvector join
      - Also known as hash filter join
      - Avoiding the transfer of all join attribute values to the other node
      - Transfer bitvector $BV[1 \ldots n]$ instead
      Transformation
      - Choose an appropriate hash function h
      - Apply h to transform attribute values to the range $[1 \ldots n]$
      - Set the corresponding bits in the bitvector $BV[1 \ldots n]$ to 1
      Algorithm
      - node_R: determine $\pi_B(R)$, apply hash function h to the result, set the corresponding bits in BV to 1, and send the result to node_S
      - node_S: apply hash function h to the join attribute of relation S, determine $S' = \{t \in S \mid BV[h(t.B)] = 1\}$, send S' to node_R
      - node_R: determine $R \bowtie S' = R \bowtie S$
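Both algorithms can be simulated in a single process to see what would travel over the network. The following is a sketch only: the relation contents are reconstructed from the lecture's running example (an assumption), the hash function `value % n` and the vector length `n = 8` are arbitrary illustrative choices, and the bitvector is indexed from 0 rather than 1.

```python
R = [(3, 7), (1, 1), (4, 6), (7, 7), (4, 5), (6, 2), (5, 7)]            # R(A, B)
S = [(9, 8, 8), (1, 5, 1), (9, 4, 2), (4, 3, 3), (4, 2, 6), (5, 7, 8)]  # S(B, C, D)


def join(r, s):
    """Natural join of R(A, B) and S(B, C, D) on B."""
    return [(a, b, c, d) for (a, b) in r for (b2, c, d) in s if b == b2]


def semijoin_strategy(r, s):
    proj_b = {b for (_, b) in r}                  # node_R: pi_B(R), sent to node_S
    s_reduced = [t for t in s if t[0] in proj_b]  # node_S: S' = S semijoin pi_B(R)
    return join(r, s_reduced), len(proj_b), s_reduced   # node_R: R join S'


def bitvector_strategy(r, s, n=8, h=lambda value, n: value % n):
    bv = [0] * n                                  # node_R: build BV, sent to node_S
    for (_, b) in r:
        bv[h(b, n)] = 1
    # node_S: keep tuples whose hashed join value hits a set bit (may over-select).
    s_candidates = [t for t in s if bv[h(t[0], n)] == 1]
    return join(r, s_candidates), s_candidates    # node_R: final join removes false hits


result_sj, sent_values, s_reduced = semijoin_strategy(R, S)
result_bv, s_candidates = bitvector_strategy(R, S)
print(result_sj)          # [(1, 1, 5, 1), (4, 5, 7, 8)]
print(len(s_reduced))     # 2 tuples shipped back by the semijoin
print(len(s_candidates))  # 4 tuples shipped back by the bitvector join
```

With this particular hash function the value 9 collides with 1, so two S tuples without a join partner are shipped anyway; the final join at node_R still produces the same result as the semijoin strategy.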
  • 37. Global query optimization: total time models, response time models
    Bitvector join: conclusions
      - Transferring the bitvector reduces network load
      - Bitvector only indicates potential join partners because multiple attribute values might map to the same hash value
      - Might result in transferring unnecessary tuples
      - Requirements: an appropriate hash function h, and n needs to be large enough to avoid a high number of collisions
    Response time models
      - "Classic" cost models consider total resource consumption of a query
        - Good results for heavy computational load and slow network connections
        - By saving resources, many queries can be executed in parallel (minimum load, maximum throughput)
      - Optimization for short response times
        - "Waste" some resources to get query results earlier
        - Take advantage of lightly loaded machines and fast connections
        - Utilize intraquery parallelism
    Response time models: two different response times
      - When does the first result tuple arrive?
      - When have all result tuples arrived?
      Example situation
      - Given relations/fragments A, B, C, and D
      - Full replication, i.e., all relations/fragments are available on all nodes
      - Compute $(A \bowtie B) \bowtie (C \bowtie D)$
      Assumptions
      - Each join costs 20 time units ($T_{CPU} + T_{I/O}$)
      - Transferring an intermediate result costs 10 time units ($T_{MSG} + T_{TR}$)
      - Accessing a relation is for free
      - Each node has one computational thread
  • 38. Global query optimization: response time models
    Example: two plans
      - Plan 1: Execute all operations on one node. Total costs: 60
      - Plan 2: Join on different nodes, ship results. Total costs: 80
    Example: response time
      - Response time costs: 60 for plan 1, 50 for plan 2
      - ⇒ Plan 2 is better with respect to response time because operations can be executed in parallel (exploiting intra-query parallelism)
      - Plan 1 is obviously better with respect to total costs
      - Response time can be improved even more by applying pipelining
    Pipelining
      - Goal of applying pipelining: good first tuple response times by executing queries in a pipelined fashion
      Not pipelined
      - Each operation is fully completed and an intermediate result is created
      - Next operation reads intermediate result and is then fully completed
      - Reading and writing of intermediate results costs resources
      Pipelined
      - Operations do not create intermediate results
      - Each processed tuple is fed directly into the next operation
      - Tuples "flow" through the operations
    Pipelining: problems
      - Operations have different execution times
      - If execution speed of operations in the pipeline differs, tuples are either cached or the pipeline blocks
      - Some operations are more suitable than others
        - Good: scan, select, project, union, ...
        - Tricky: join, intersection, ...
        - Very hard: sort
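The total and response time figures of the two plans can be recomputed with a small scheduling sketch. The plan layouts are assumptions inferred from the stated numbers: in plan 2, A ⋈ B and C ⋈ D run in parallel on two nodes and both results are shipped to a third node for the final join. Task names, node names, and the function are hypothetical; each node (and each link) processes one task at a time.

```python
JOIN, SHIP = 20, 10  # time units from the example's assumptions


def total_and_response(tasks):
    """tasks: name -> (resource, cost, predecessors), listed in a valid execution order.
    A task starts once its inputs are ready and its resource is free."""
    finish, resource_free = {}, {}
    for name, (resource, cost, preds) in tasks.items():
        start = max([finish[p] for p in preds] + [resource_free.get(resource, 0)])
        finish[name] = start + cost
        resource_free[resource] = finish[name]
    total = sum(cost for _, cost, _ in tasks.values())
    return total, max(finish.values())


plan1 = {  # everything on one node
    "A_join_B": ("n1", JOIN, []),
    "C_join_D": ("n1", JOIN, []),
    "final":    ("n1", JOIN, ["A_join_B", "C_join_D"]),
}
plan2 = {  # assumed layout: two joins in parallel, results shipped to a third node
    "A_join_B": ("n1", JOIN, []),
    "C_join_D": ("n2", JOIN, []),
    "ship_AB":  ("n1->n3", SHIP, ["A_join_B"]),
    "ship_CD":  ("n2->n3", SHIP, ["C_join_D"]),
    "final":    ("n3", JOIN, ["ship_AB", "ship_CD"]),
}

print(total_and_response(plan1))  # (60, 60)
print(total_and_response(plan2))  # (80, 50)
```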
  • 39. Global query optimization: response time models
    Pipelining example: simple query
      - Tablescan, selection, projection
      - 1000 tuples are scanned, selectivity is 0.1
      Costs
      - Accessing one tuple during tablescan: 2 time units
      - Selecting (testing) one tuple: 1 time unit
      - Projecting one tuple: 1 time unit
      Non-pipelined (time: event)
      - 2: first tuple in IR1
      - 2000: all tuples in IR1
      - 2001: first tuple in IR2
      - 3000: all tuples in IR2
      - 3001: first tuple in Result
      - 3100: all tuples in Result
      Pipelined (time: event)
      - 2: first tuple finished tablescan
      - 3: first tuple finished selection (if selected ...)
      - 4: first tuple in Result
      - 3098: last tuple finished tablescan
      - 3099: last tuple finished selection
      - 3100: all tuples in Result
    Pipelining example: join query
      - Joining two table subsets using a non-pipelined BNL (Block-Nested-Loop) join
      - Both pipelines run in parallel
      Costs
      - 1000 tuples are scanned in each pipeline, selectivity 0.1
      - Joining 100 ⋈ 100 tuples: 10,000 time units (one time unit per combination)
      Response time
      - The first tuple arrives at the end of any pipeline after 4 time units
      - All tuples have arrived at the end of the pipelines after 3,100 time units
      - Final result will be available after 13,100 time units
      - No benefit from pipelining with respect to response time
      - First tuple arrives long after step 3,100
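The timelines of the simple scan/select/project query can be replayed with a few lines of Python. This is a sketch under assumptions not stated on the slides: every tenth tuple qualifies, starting with the first one, and a single thread processes one tuple at a time. All names are hypothetical.

```python
SCAN, SELECT, PROJECT = 2, 1, 1   # time units per tuple
N = 1000
# Selectivity 0.1; assumption: every tenth tuple qualifies, including the first.
selected = [i % 10 == 0 for i in range(N)]


def non_pipelined():
    scan_done = N * SCAN                      # all tuples in IR1
    select_done = scan_done + N * SELECT      # all tuples in IR2
    first_result = select_done + PROJECT
    all_results = select_done + sum(selected) * PROJECT
    return first_result, all_results


def pipelined():
    clock, first_result = 0, None
    for keep in selected:
        clock += SCAN + SELECT                # tuple flows through scan and selection
        if keep:
            clock += PROJECT                  # qualifying tuple reaches the result
            if first_result is None:
                first_result = clock
    return first_result, clock


print(non_pipelined())  # (3001, 3100)
print(pipelined())      # (4, 3100)

# Join query: two such pipelines in parallel feed a non-pipelined join of 100 x 100 tuples.
join_done = pipelined()[1] + 100 * 100
print(join_done)        # 13100
```

Both variants finish at the same time; pipelining only moves the first result tuple from time 3001 to time 4, and the blocking join afterwards cancels that advantage.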
  • 40. Global query optimization: response time models
    Joins and pipelining
      - Suboptimal result because of the unpipelined join
      - Most traditional join algorithms are unsuitable for pipelining
      - Single/semi-pipelined: only one pipeline, the other intermediate result has to be available
      - Fully pipelined: both inputs are processed in a pipelined fashion
    Single-pipelined hash join
      - "Classic" join algorithm
      Basic idea for $A \bowtie B$
      - One input relation is read from an intermediate result (A), the other is pipelined through the join operation (B)
      - All tuples of A are stored in a hash table
        - Hash function is used on the join attribute
        - All tuples with the same hash value for the join attribute are in the same bucket
      - Every incoming tuple (via pipeline) of B is hashed by join attributes
      - Compare tuple to each tuple in the respective A bucket
        - Return those tuples showing matching join attributes
    Double-pipelined hash join
      - Dynamically build hashtables for A and B tuples (memory intensive!)
      - Process tuples upon arrival
        - Cache tuples if necessary
        - Balance between A and B tuples for better performance
        - Rely on statistics for a good A:B ratio
      - If a new tuple of relation A arrives
        - Insert it into the A hashtable
        - Check in the B hashtable if there are join partners
        - If yes, return all combined AB tuples
      - If a new B tuple arrives, process it analogously
    Double-pipelined hash join: example
      - B(31, B2) arrives
      - Insert into B Hash
      - Find matching A tuples
      - Found A3
        - Assume that A3 matches B3 ...
      - Add AB(A3, B2) to the result
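The double-pipelined hash join described above can be sketched as a Python generator that emits result tuples as soon as both partners have arrived. This is a minimal in-memory illustration, not the lecture's implementation: the function name, the tagged input stream, and the sample tuples are hypothetical, and tuple caching or memory limits are not modelled.

```python
from collections import defaultdict


def double_pipelined_hash_join(stream, key_a, key_b):
    """stream yields ('A', tuple) or ('B', tuple) in arrival order.
    Yields joined (a, b) pairs as soon as both partners have arrived."""
    table_a, table_b = defaultdict(list), defaultdict(list)
    for source, tup in stream:
        if source == "A":
            k = key_a(tup)
            table_a[k].append(tup)             # insert into the A hashtable
            for b in table_b.get(k, []):       # probe the B hashtable
                yield (tup, b)
        else:
            k = key_b(tup)
            table_b[k].append(tup)             # insert into the B hashtable
            for a in table_a.get(k, []):       # probe the A hashtable
                yield (a, tup)


# Hypothetical arrival order; the first field is the join attribute.
arrivals = [
    ("A", (31, "A3")),
    ("B", (17, "B1")),
    ("B", (31, "B2")),   # finds A3 in the A hashtable
    ("A", (17, "A1")),   # finds B1 in the B hashtable
]
first = lambda t: t[0]
results = list(double_pipelined_hash_join(arrivals, first, first))
print(results)
# [((31, 'A3'), (31, 'B2')), ((17, 'A1'), (17, 'B1'))]
```

Because every tuple is inserted before the other table is probed, each matching pair is produced exactly once, regardless of which side arrives first.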
  • 41. Global query optimization: response time models; summary
    Pipelining in distributed setups
      - In pipelines, tuples "flow" through the operations
        - Works well with one processing unit! (one node)
        - Problem: sending each tuple separately from one node to another might be inefficient
      - Communication costs
        - Setting up transfer and opening communication channel
        - Composing a message
        - Transmitting message: header information and payload (minimum packet size is bigger than tuple)
        - Receiving and decoding a message
        - Closing the channel
    Pipelining in distributed setups: tuple blocking
      - Minimize communication overhead by tuple blocking
      - Do not send single tuples, but blocks containing multiple tuples
        - Burst transmission
        - Packets have to be cached
        - Block size should be at least the packet size of the underlying network protocol
      - Results in even more cost factors for the cost model
    Summary on global query optimization
      - Global query optimization has to deal with additional constraints and cost factors compared to "classic" query optimization
        - Network costs, network model, shipping policies
        - Fragmentation and allocation schemes
        - Different optimization goals (response time vs. total time)
    Summary I
      - Detour on centralized query processing
        - Query parsing
        - Query transformation
        - Query optimization
      - Basics of distributed query optimization
        - Many steps can be reused from centralized query processing
        - Optimization in distributed systems is much more complex (network latency, selectivities, communication costs, response time, etc.)
        - Meta data management: where to store the global catalog?
        - Data localization: consider fragmentation
      - Distributed query optimization
        - Very important question: where to execute which parts of the query?
        - When to optimize: compile time vs. dynamic optimization; most common: semi-dynamic and hierarchical optimization
        - Cost model (cost functions, statistics, cardinality estimation, etc.)
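Tuple blocking can be illustrated with a generator that groups a tuple stream into fixed-size blocks, so that one message carries many tuples. The sketch is hypothetical: the function names and the block size of 64 are illustrative choices, and real systems would size blocks by bytes relative to the network packet size rather than by tuple count.

```python
from itertools import islice


def blocks(tuples, block_size):
    """Group a (possibly pipelined) tuple stream into blocks of at most block_size tuples."""
    it = iter(tuples)
    while True:
        block = list(islice(it, block_size))
        if not block:
            return
        yield block   # one block corresponds to one message


def message_count(num_tuples, block_size):
    """Number of messages needed when sending num_tuples in blocks (ceiling division)."""
    return -(-num_tuples // block_size)


sent = list(blocks(range(1000), 64))
print(len(sent))                # 16 messages instead of 1000
print(len(sent[-1]))            # last block carries the remaining 40 tuples
print(message_count(1000, 1))   # tuple-at-a-time: 1000 messages
```

Larger blocks reduce the per-message overhead but delay the first tuple at the receiver, which is why the block size becomes another factor in the cost model.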
  • 42. Summary
    Summary II
      - Join order optimization
      - Join implementations (ship whole, fetch as needed, semijoin, bitvector join, pipelined hash join, etc.)
      - Total time and response time