3. Topics discussed in this chapter
Concepts for Object-Oriented Databases
Weaknesses of RDBMSs
Overview of Object-Oriented Concepts
Object Identity, Object Structure, and Type Constructors
Encapsulation of Operations, Methods, and Persistence
05/06/25
Advanced Database System
4. Weaknesses of RDBMSs
Poor Representation of “Real World” Entities.
Difficulty in representing complex data types
Poor Support for Integrity and Enterprise Constraints.
Limited Operations.
Schema changes are difficult.
5. What is an OODBMS?
OODBMS: Object-Oriented Database Management System
Data is represented in the form of objects, as in OOP.
It stores complex data without mapping it to relational
rows and columns.
One benefit of an OODBMS is that, when it is integrated with
an OOP language, there is much greater consistency between the
database and the programming language.
OODB = OOP concepts + DB concepts
6. What is an OODBMS?
Features of OODBs:
They allow both state and behavior to be specified.
Most OOP languages can use objects directly from the DB.
Object identity.
Methods and messages.
Classes, subclasses, super classes, and inheritance.
Overriding, Overloading, Polymorphism and Dynamic Binding.
7. Object-Oriented Concepts:
Abstraction, encapsulation, and information hiding.
Objects and classes
Object identity.
Methods and messages.
Subclasses, super classes, and inheritance.
Overriding, Overloading, Polymorphism and Dynamic Binding.
8. Objects
An object is a uniquely identifiable entity that contains both
the attributes that describe the state of a real-world
object and the actions associated with it.
This definition is very similar to that of an entity; however,
an object encapsulates both state and behavior, whereas
an entity models only state (attributes).
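As a minimal sketch of this difference, the hypothetical Python class below bundles state with behavior. The class name Rectangle and its attributes are illustrative assumptions, not part of the slides.

```python
class Rectangle:
    """Hypothetical example: an object bundles state and behavior."""

    def __init__(self, length, width):
        # State: the attribute values of this particular object.
        self.length = length
        self.width = width

    def area(self):
        # Behavior: an operation that works on the object's own state.
        return self.length * self.width


r = Rectangle(4, 3)
print(r.area())
```

An entity in the relational sense would correspond only to the two attribute values; the area() operation would live in application code outside the database.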
9. Specifying Object Persistence
The typical mechanisms for making an object persistent are Naming and
Reachability.
The Naming mechanism involves giving an object a unique persistent
name through which it can be retrieved by this and other programs.
The Reachability mechanism works by making the object reachable from
some persistent object.
An object B is said to be reachable from an object A if a sequence of
references in the object graph leads from object A to object B.
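Reachability can be sketched as a graph traversal starting from the named (persistent) roots. The function name, the dictionary-based object graph, and the object labels below are assumptions made for illustration only.

```python
from collections import deque


def reachable(graph, roots):
    """Return the set of objects reachable from the given persistent roots.

    graph maps each object id to the ids it references (hypothetical data).
    """
    seen = set(roots)
    queue = deque(roots)
    while queue:
        current = queue.popleft()
        for target in graph.get(current, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


# "A" is a named persistent root; nothing reachable from "A" refers to "D".
object_graph = {"A": ["B"], "B": ["C"], "C": [], "D": ["C"]}
persistent = reachable(object_graph, ["A"])
```

In this sketch B and C become persistent because a chain of references leads to them from A, while D stays transient.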
10. Object Identity (OID)
In an RDBMS, entity identity is value-based: the primary key is used to provide
uniqueness.
Primary keys do not provide the type of object identity required in OO
systems:
a key is unique only within a relation, not across the entire system;
a key is generally chosen from the attributes of a relation, making it
dependent on entity state.
Objects exist independently of their (current) values.
11. Object Identity (OID)
An OID is a unique, system-generated identifier for referring to
persistent objects.
OIDs cannot be based on ordinary values provided by the
application (value orientation). Instead, OIDs are:
Unique system-wide, i.e., not relation-based
Unchanged during the object's lifetime (immutable)
Not reused after object deletion
Generally system-managed
12. Object Identity (OID)
The main property required of an OID is that it be immutable; that is,
the OID value of a particular object should not change. This preserves
the identity of the real-world object being represented.
For example, assume an object named Obj1 has the OID A123. When
the object is deleted from the database, A123 cannot be used to refer to
any other object, because the OID is immutable; i.e., if an object is deleted, its OID
must not be assigned to any other object.
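One simple way to obtain these properties is a counter that only moves forward. The sketch below is an assumption-laden illustration (the ObjectStore class and its methods are hypothetical, and real systems use more elaborate OID schemes).

```python
import itertools


class ObjectStore:
    """Hypothetical in-memory store that hands out system-generated OIDs."""

    def __init__(self):
        # Monotonically increasing counter: it never rewinds.
        self._next_oid = itertools.count(1)
        self._objects = {}

    def insert(self, value):
        oid = next(self._next_oid)
        self._objects[oid] = value
        return oid

    def update(self, oid, value):
        # Changing the state does not change the identity.
        self._objects[oid] = value

    def delete(self, oid):
        # The OID is retired; the counter never hands it out again.
        del self._objects[oid]


store = ObjectStore()
a = store.insert("Obj1")
store.update(a, "Obj1 (renamed)")
store.delete(a)
b = store.insert("Obj2")
```

Here the second object receives a fresh OID even though the first one has been deleted, so no stale reference to the old OID can end up pointing at the new object.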
13. Complex Objects
A complex object is something that can be viewed as a single
object in the real world but actually consists of many sub-objects.
Two types of complex objects:
Unstructured complex objects:
• Their structure is hard to determine.
• They require a large amount of storage.
• BLOB (Binary Large Object): e.g., images.
• CLOB (Character Large Object): large strings.
Structured complex objects:
• Clear structure, e.g., a tuple.
14. Object Structure
In an OODB, the value of a complex object can be constructed from
other objects.
Each object can be viewed or represented as a triple (i, c, v), where
i is the unique object identifier (OID),
c is the constructor, an indication of how the object value is
constructed (operator), and
v is the value of the object (state).
Basic constructors are atom, tuple, and set.
Others: list, array
15. Type constructors
Type constructors: In OO databases, the state (current value)
of a complex object may be constructed from other
objects (other values) by using certain type constructors.
The constructor determines how the object is built and so tells us
the basic structure of the object.
16. Type constructors
The basic constructors are:
Atom: represents all basic atomic values, such as int, char,
float, and string.
Set: a collection of values of the same type with no duplicate items.
Bag: like a set, but duplicates are allowed.
List: an ordered collection of items of the same type, e.g., [123, 456, 678].
Array: similar to a list but with a fixed size.
Tuple: a collection of named elements, each of any of the above types.
17. Type Constructors
Tuple constructor TUPLE OF
Name|Set of locations|Array of emp
It is represented in the form <A1:i1, A2:i2, …, An:in>
E.g.: <Name: i1, Set of locations: i2, Array of emp: i3>
Set constructor SET OF
Many elements of the same type build a set.
Each element can be contained only once in the set.
Multi-set constructor BAG OF: like a set, but
one element can have several copies in the bag.
List constructor LIST OF: like a bag, but
the OIDs in a list are ordered, so we can refer to the first,
second, or nth object in the list; the sequence is of interest.
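The constructors have rough counterparts in most programming languages. The sketch below uses Python's built-in collections as stand-ins; the variable names and sample values are illustrative assumptions, and a Python dict is used here only to approximate a tuple with named components.

```python
from collections import Counter

# Set constructor: each element appears at most once.
locations = {"Sugarland", "Blue", "Sugarland"}

# Bag (multi-set) constructor: duplicates are counted, order is ignored.
grades = Counter(["A", "B", "A"])

# List constructor: ordered, so the first, second, or nth element is addressable.
waiting = ["i12", "i13", "i12"]

# Tuple constructor: named components, each of which may itself be a collection.
department = {"name": "Research", "locations": locations, "employees": waiting}

print(len(locations), grades["A"], waiting[0])
```

The duplicate "Sugarland" disappears from the set, is counted in the bag, and would be kept in place in the list.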
18. Object Structure
Object state v is interpreted based on the constructor c:
Type c | Object state v
Atom | A value from the domain of basic values
Set | A set of OIDs {i1, i2, i3, …, in}
Tuple | <a1:i1, a2:i2, …, an:in>
List | An ordered list of OIDs [i1, i2, i3, …, in]
Array | An array of OIDs
19. Object Structure
The value v can be interpreted on the basis of the
constructor c.
Example:
if c = atom then v = an atomic value
o1 = (i1, atom, ‘House’)
o2 = (i2, atom, ‘Blue’)
o3 = (i3, atom, ‘Sugarland’)
o4 = (i4, atom, 5)
o5 = (i5, atom, ‘Research’)
o6 = (i6, atom, ‘22-May-10’)
For o1, the value v is ‘House’.
20. Object Structure
Example:
if c = tuple then v = <a1:i1, …, an:in>
o8 = (i8, tuple, <dname: i5, dnumber: i4, mgr: i9,
locations: i7, employees: i10, projects: i11>)
o9 = (i9, tuple, <manager: i12, managerstartdate: i6>)
Here o8 is a department tuple.
21. Object Structure
Example:
if c = set then v = {i1, i2, i3}
o7 = (i7, set, {i1, i2, i3})
o10 = (i10, set, {i12, i13, i14})
o11 = (i11, set, {i15, i16, i17})
Here o10 is a set of employees.
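The (i, c, v) representation can be sketched as a dictionary from OID to (constructor, value), with a helper that follows OIDs to their values. The resolve function is a hypothetical helper, and the department tuple is cut down to the components whose objects are fully defined in the examples (mgr, employees, and projects are left out).

```python
# Each object is stored as OID -> (constructor, value), following (i, c, v).
objects = {
    "i1": ("atom", "House"),
    "i2": ("atom", "Blue"),
    "i3": ("atom", "Sugarland"),
    "i4": ("atom", 5),
    "i5": ("atom", "Research"),
    "i7": ("set", {"i1", "i2", "i3"}),
    "i8": ("tuple", {"dname": "i5", "dnumber": "i4", "locations": "i7"}),
}


def resolve(oid):
    """Recursively replace OIDs by the values they refer to."""
    constructor, value = objects[oid]
    if constructor == "atom":
        return value
    if constructor == "set":
        return {resolve(member) for member in value}
    if constructor == "tuple":
        return {name: resolve(ref) for name, ref in value.items()}
    raise ValueError(f"unknown constructor: {constructor}")


department = resolve("i8")
```

Because the tuple stores OIDs rather than copies, two departments that refer to i7 would share the same set of locations.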
22. Classes
Classes are blueprints for defining a set of similar
objects, i.e., a common description of similar objects.
Objects in a class are called instances.
A class is also an object, with its own class attributes and
class methods.
Objects created from the same class share the same
class attributes and methods.
23. Class Instances Share Attributes & Methods
Class BRANCH
Attributes: branchNo, street, city, postcode
Methods: print(), getPostCode(), numberOfStaff()
Instances:
BranchNo | Street | City | Postcode
B005 | 22 Deer Rd | London | SW1 4EH
B007 | 16 Argyll St | Aberdeen | AB2 3SU
B003 | 163 Main St | Glasgow | G11 9QX
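The BRANCH example can be sketched as a Python class whose instances each hold their own attribute values while sharing the methods defined once on the class. The Python naming (Branch, get_post_code) is an assumption, print() is approximated by __str__, and numberOfStaff() is omitted because the slide gives no staff data.

```python
class Branch:
    """Sketch of the BRANCH class; method names adapted to Python style."""

    def __init__(self, branch_no, street, city, postcode):
        self.branch_no = branch_no
        self.street = street
        self.city = city
        self.postcode = postcode

    def get_post_code(self):
        # Shared behavior: defined once, usable by every instance.
        return self.postcode

    def __str__(self):
        return f"{self.branch_no}, {self.street}, {self.city} {self.postcode}"


branches = [
    Branch("B005", "22 Deer Rd", "London", "SW1 4EH"),
    Branch("B007", "16 Argyll St", "Aberdeen", "AB2 3SU"),
    Branch("B003", "163 Main St", "Glasgow", "G11 9QX"),
]
postcodes = [b.get_post_code() for b in branches]
```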
24. OO Data Modelling: Unified Modeling Language (UML)
UML is a standard language for specifying, constructing,
visualizing, and documenting the artifacts of a software
system.
It includes many structural diagrams (class, object diagrams, …) and
behavioral diagrams (use case, sequence diagrams, …).
It is used to model objects and object relationships.
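[Figure: UML class notation. Each class box has three compartments: class name, attributes, methods.]
MANAGER: attributes StaffNo, sex, DOB, salary; method increasesalary()
PROPERTY: attributes PropertyNo, street, city, postcode, rooms, type
Associations: manage (1..1 to 1..1); offer / offered-by (1..1 to 1..*)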
25. Unified Modeling Language (UML)
[Figure: UML class diagram]
PERSON: Name (FName, LName)
STAFF: StaffNo, position, DOB, salary
OWNER: OwnerNo, address
CLIENT: ClientNo, telNO, prefType, MaxRent
MANAGER, SALESTAFF
BRANCH: BranchNo, address
PROPERTY: PropertyNo, rooms, rent
Association role names: Manages, WorksAt, Offers, Views, Owns
Inverse role names: ManagedBy, Has, OwnedBy, IsOfferedBy, ViewedBy
Multiplicity labels shown in the figure: 1, 1; 1, M; 1, M; 1, M; M, N
28. Introduction
Query processing: the activities involved in retrieving data from the database.
This includes the translation of high-level queries into low-level
expressions that can be used at the physical level of
the file system, query optimization, and the actual execution
of the query to get the result.
29. Query Processing…
Aims of query processing (QP):
Transform a query written in a high-level language (e.g.,
SQL) into a correct and efficient execution strategy
expressed in a low-level language that implements
relational algebra (RA);
Execute the strategy to retrieve the required data.
30. Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
31. Parsing and translation
Scanner: The scanner identifies the language tokens, such as
SQL keywords, attribute names, and relation names, in the text of the query.
Parser: The parser checks the query syntax to determine whether it is
formulated according to the syntax rules of the query language.
Validation: The query must be validated by checking that all attributes and
relation names are valid and semantically meaningful names in the schema of
the particular database being queried.
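The scanner and validation steps can be sketched in a few lines. Everything here is a simplifying assumption: the keyword set covers only three SQL keywords, the token pattern handles only names, integers, and a few symbols, and the schema dictionary (including the attribute account_number) is a hypothetical stand-in for the catalog. Syntax checking by a parser is not shown.

```python
import re

KEYWORDS = {"SELECT", "FROM", "WHERE"}  # tiny illustrative subset of SQL
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_]\w*)|(<=|>=|<>|[<>=,*()]))")


def scan(query):
    """Scanner: split the query text into (kind, text) tokens."""
    tokens, pos = [], 0
    query = query.rstrip()
    while pos < len(query):
        match = TOKEN_RE.match(query, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        number, word, symbol = match.groups()
        if number:
            tokens.append(("NUMBER", number))
        elif word:
            kind = "KEYWORD" if word.upper() in KEYWORDS else "NAME"
            tokens.append((kind, word))
        else:
            tokens.append(("SYMBOL", symbol))
        pos = match.end()
    return tokens


def validate(tokens, schema):
    """Validation: return every name that is not a known relation or attribute."""
    known = set(schema) | {a for attrs in schema.values() for a in attrs}
    return [text for kind, text in tokens if kind == "NAME" and text not in known]


schema = {"account": ["account_number", "balance"]}  # hypothetical catalog
tokens = scan("SELECT balance FROM account WHERE balance < 2500")
unknown = validate(tokens, schema)
```

An empty list from validate means that every attribute and relation name in the query exists in the schema.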
32. Parsing and translation
The query is converted to relational algebra by the SQL
interpreter.
The relational algebra expression is converted to an annotated tree, with joins as
branches.
Each operator has several implementation choices.
33. Translating SQL Queries into Relational Algebra
Query block:
The basic unit that can be translated into the algebraic operators
and optimized.
A query block contains a single SELECT-FROM-WHERE expression, as
well as GROUP BY and HAVING clauses if these are part of the block.
Nested queries:
Nested queries within a query are identified as separate query blocks.
Aggregate operators in SQL must be included in the extended
algebra.
34. Translation Example
Possible SQL Query:
SELECT balance FROM account WHERE balance<2500
Possible Relational Algebra Query:
π balance (σ balance<2500 (account))
35. Translating SQL Queries into Relational Algebra
Consider the query that finds the names of employees who earn more than
everyone in department 5:
SELECT lname, fname FROM employee WHERE salary >
( SELECT MAX(salary) FROM employee WHERE dno=5)
36. Translating SQL Queries into Relational Algebra
The query has 2 query blocks:
Outer block:
SELECT lname, fname
FROM employee
WHERE salary > constant
Inner block:
SELECT MAX(salary)
FROM employee
WHERE dno=5
Relational Algebra:
π lname, fname (σ salary>cons (employee))
where cons is the constant returned by the inner block:
ℱ MAX salary (σ dno=5 (employee))
(ℱ is the aggregate function operator of the extended algebra.)
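The two query blocks can be evaluated one after the other: the inner block yields a single constant, which the outer block then uses in its selection. The rows below are hypothetical sample data, not taken from the slides.

```python
employee = [  # hypothetical rows
    {"lname": "Smith", "fname": "John", "salary": 30000, "dno": 5},
    {"lname": "Wong", "fname": "Franklin", "salary": 40000, "dno": 5},
    {"lname": "Borg", "fname": "James", "salary": 55000, "dno": 1},
    {"lname": "Zelaya", "fname": "Alicia", "salary": 25000, "dno": 4},
]

# Inner block: SELECT MAX(salary) FROM employee WHERE dno = 5
cons = max(row["salary"] for row in employee if row["dno"] == 5)

# Outer block: SELECT lname, fname FROM employee WHERE salary > cons
result = [(row["lname"], row["fname"]) for row in employee if row["salary"] > cons]
```

Because the inner block does not depend on the outer one, it needs to be evaluated only once.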
37. Translating SQL Queries into Relational Algebra
Consider a query that lists each employee's name together with
the name of his or her department:
SELECT lname, fname, dname FROM employee e,
department d WHERE e.dno=d.dno
Relational Algebra:
π lname, fname, dname (employee ⋈ e.dno=d.dno department)
38. Optimization
The query optimizer selects, among the functionally equivalent forms of a query,
the execution plan with the lowest cost (typically the fastest one).
A relational algebra expression may have many
equivalent expressions, each of which gives rise to a
different evaluation plan.
π bala (σ bala>100 (Account)) and
σ bala>100 (π bala (Account)) are equivalent queries, i.e.,
they return the same result.
Amongst all equivalent evaluation plans, choose the one with the
lowest cost.
39. Execution plan
An internal representation of the query is then created, usually as a
tree data structure called a query tree.
The DBMS must then devise an execution strategy or plan for
retrieving the results of the query from the database files.
A query typically has many possible execution strategies, and the
process of choosing a suitable one for processing a query is known as
query optimization.
40. Evaluation
When a query arrives, how does the database answer it?
The query-execution engine takes a query-evaluation plan,
executes that plan, and returns the answers to the query.
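41. Relational Algebra: overview
Project (unary)
π <attr list> (R)
<attr list> is a list of attributes (columns) from R only
Ex: π title, year, length (Movie) “horizontal restriction” (the result is narrower: fewer columns)
Pictorially: a relation with attributes A1, A2, A3, …, An and i tuples is mapped to
a relation with attributes A1, A2, …, Ak and j tuples, where n ≥ k.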
42. Project
PROJECT can produce many tuples with the same value.
Relational algebra semantics says to remove duplicates;
SQL does not. This is one difference between formal and
actual query languages.
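The difference can be made concrete with two small projection helpers, one keeping duplicates as SQL does by default and one removing them as the algebra requires. The function names are illustrative assumptions; the rows are the Movie tuples used in this deck.

```python
def project_bag(rows, attrs):
    """SQL-style projection: duplicates are kept (unless DISTINCT is used)."""
    return [tuple(row[a] for a in attrs) for row in rows]


def project_set(rows, attrs):
    """Relational-algebra projection: duplicates are removed."""
    return set(project_bag(rows, attrs))


movie = [
    {"title": "Star Wars", "year": 1977, "length": 124, "filmType": "color"},
    {"title": "Mighty Ducks", "year": 1991, "length": 104, "filmType": "color"},
    {"title": "Wayne's World", "year": 1992, "length": 95, "filmType": "color"},
]

sql_like = project_bag(movie, ["filmType"])
algebra = project_set(movie, ["filmType"])
```

Projecting on filmType gives three identical tuples in the SQL-style result but a single tuple under set semantics.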
43. Relational Algebra: Select
Select or Restrict
σ <predicate> (R)
<predicate> is a conditional expression of the type that we are
familiar with from conventional programming languages:
<attribute> <op> <attribute>
<attribute> <op> <constant>
where each attribute is in R
op ∈ {=, ≠, <, >, ≤, ≥, …, AND, OR}
Ex: σ length≥100 (Movie) “vertical restriction” (the result is shorter: fewer rows)
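44. Pictorially
A relation with attributes A1, A2, A3, …, An and i tuples is mapped to
a relation with the same attributes A1, A2, A3, …, An and j tuples, where i ≥ j.
Movie:
title | year | length | filmType
Star Wars | 1977 | 124 | color
Mighty Ducks | 1991 | 104 | color
Wayne’s World | 1992 | 95 | color
The selected tuples form the result set.
The number of selected tuples (often expressed as a fraction of the tuples in the relation) is referred to as the selectivity of the condition.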
45. Cartesian Product
R x S
The set of all pairs that can be formed by choosing the first
element of the pair to be any element of R and the second any
element of S.
The resulting schema may be ambiguous.
Use R.A or S.A to disambiguate an attribute A that occurs in
both schemas.
46. Example
R:
A | B
1 | 2
3 | 4
S:
B | C | D
2 | 5 | 6
4 | 7 | 8
9 | 10 | 11
R x S:
A | R.B | S.B | C | D
1 | 2 | 2 | 5 | 6
1 | 2 | 4 | 7 | 8
1 | 2 | 9 | 10 | 11
3 | 4 | 2 | 5 | 6
3 | 4 | 4 | 7 | 8
3 | 4 | 9 | 10 | 11
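The example can be reproduced with a short sketch that pairs every tuple of R with every tuple of S and prefixes attribute names that occur in both schemas. The function name and the list-of-dicts representation are assumptions made for illustration.

```python
from itertools import product

R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
S = [{"B": 2, "C": 5, "D": 6}, {"B": 4, "C": 7, "D": 8}, {"B": 9, "C": 10, "D": 11}]


def cartesian(r_rows, s_rows, r_name="R", s_name="S"):
    """Pair every tuple of R with every tuple of S, prefixing shared names."""
    # Assumes both relations are non-empty so the schemas can be read off a row.
    shared = set(r_rows[0]) & set(s_rows[0])
    result = []
    for r, s in product(r_rows, s_rows):
        row = {(f"{r_name}.{k}" if k in shared else k): v for k, v in r.items()}
        row.update({(f"{s_name}.{k}" if k in shared else k): v for k, v in s.items()})
        result.append(row)
    return result


rxs = cartesian(R, S)
```

With 2 tuples in R and 3 in S, the result has 2 * 3 = 6 tuples, and the shared attribute B appears as R.B and S.B.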
47. Join Operations
Natural Join (binary)
R ⋈ S
Match only those tuples from R and S that agree on whatever
attributes are common to the schemas of R and S.
If tuples r from R and s from S are successfully paired, the result is
called a joined tuple.
This join operation is the same one we used in an earlier section to
recombine relations that had been projected onto two subsets of
their attributes (e.g., as a result of a BCNF decomposition).
48. Example
R:
A | B
1 | 2
3 | 4
S:
B | C | D
2 | 5 | 6
4 | 7 | 8
9 | 10 | 11
R ⋈ S:
A | B | C | D
1 | 2 | 5 | 6
3 | 4 | 7 | 8
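A nested-loop version of the natural join is enough to reproduce this example. This is a minimal sketch under the assumption that relations are non-empty lists of dictionaries; it is not meant as an efficient join algorithm.

```python
def natural_join(r_rows, s_rows):
    """Join tuples that agree on every attribute common to both schemas."""
    common = set(r_rows[0]) & set(s_rows[0])  # assumes non-empty inputs
    joined = []
    for r in r_rows:
        for s in s_rows:
            if all(r[a] == s[a] for a in common):
                # The common attributes appear only once in the joined tuple.
                joined.append({**r, **s})
    return joined


R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]
S = [{"B": 2, "C": 5, "D": 6}, {"B": 4, "C": 7, "D": 8}, {"B": 9, "C": 10, "D": 11}]
result = natural_join(R, S)
```

The S tuple with B = 9 finds no partner in R and therefore does not appear in the result.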
49. Optimization
A relational algebra expression may have many equivalent expressions.
E.g., σ salary<75000 (π salary (instructor)) is equivalent to
π salary (σ salary<75000 (instructor))
Each relational algebra operation can be evaluated using one of several
different algorithms.
Correspondingly, a relational-algebra expression can be evaluated
in many ways.
E.g., we can use an index on salary to find instructors with salary <
75000,
or we can perform a complete relation scan and discard instructors
with salary ≥ 75000.
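The two strategies can be compared on a toy relation. In this sketch a sorted list of salaries stands in for an index on salary; the instructor rows and the use of bisect are illustrative assumptions rather than how a DBMS actually stores an index.

```python
from bisect import bisect_left

# Hypothetical instructor rows: (name, salary).
instructor = [("Mozart", 40000), ("Einstein", 95000), ("Wu", 90000), ("El Said", 60000)]
LIMIT = 75000

# Strategy 1: complete relation scan, discarding salaries >= 75000.
scan_result = sorted(salary for _, salary in instructor if salary < LIMIT)

# Strategy 2: a sorted list of salaries stands in for an index on salary;
# a binary search finds where the qualifying entries end.
salary_index = sorted(salary for _, salary in instructor)
index_result = salary_index[:bisect_left(salary_index, LIMIT)]
```

Both strategies return the same salaries; they differ only in how many entries have to be examined.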
50. Optimization….
An annotated expression specifying a detailed evaluation strategy is called an
evaluation plan.
Query optimization: amongst all equivalent evaluation plans, choose the
one with the lowest cost.
Cost is estimated using statistical information from the database catalog,
e.g., the number of tuples in each relation, the size of tuples, etc.
Total cost = CPU cost + I/O cost + communication cost
51. Three Key Concepts in QPO
1. Building blocks
Most DBMSs have a few building blocks:
• select (point query, range query), join, sorting, ...
An SQL query is decomposed into building blocks.
2. Query processing strategies for building blocks
A DBMS keeps a few processing strategies for each building
block
• e.g., a point query can be answered via an index or by scanning
the data file
3. Query optimization
For each building block of a given query, the DBMS QPO (query processing and optimization) component tries
to choose
• the “most efficient” strategy given the database parameters
• parameter examples: table size, available indices, …
• e.g., an index search is chosen for a point query if an index is
available
52. Query tree
Query tree: a tree data structure that corresponds to a
relational algebra expression. It represents the input
relations of the query as leaf nodes of the tree, and
represents the relational algebra operations as internal
nodes.
An execution of the query tree consists of executing an
internal node operation whenever its operands are available
and then replacing that internal node by the relation that
results from executing the operation.
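Query-tree execution can be sketched as a recursive evaluation in which each internal node is evaluated once its operand is available and is then replaced by the resulting relation. The nested-tuple encoding of the tree, the evaluate function, and the account rows are all assumptions for illustration; the projection here keeps duplicates, as SQL does.

```python
account = [  # hypothetical rows
    {"account_number": "A-101", "balance": 500},
    {"account_number": "A-102", "balance": 3000},
    {"account_number": "A-103", "balance": 2000},
]

# Leaf nodes name input relations; internal nodes are (operator, argument, child).
tree = (
    "project",
    ["balance"],
    ("select", lambda row: row["balance"] < 2500, ("relation", "account")),
)


def evaluate(node, database):
    """Evaluate the child first, then replace the node by its result relation."""
    op = node[0]
    if op == "relation":
        return database[node[1]]
    _, arg, child = node
    rows = evaluate(child, database)
    if op == "select":
        return [row for row in rows if arg(row)]
    if op == "project":
        return [{a: row[a] for a in arg} for row in rows]
    raise ValueError(f"unknown operator: {op}")


result = evaluate(tree, {"account": account})
```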
53. Tree Representation of Relational Algebra
π balance (σ balance<2500 (account))
π balance
|
σ balance<2500
|
account
54. Making An Evaluation Plan
Annotate the query tree with evaluation instructions:
π balance
|
σ balance<2500 (use index 1)
|
account
The query can now be executed by the query execution engine.
55. Tree Representation of Relational Algebra
π A1, …, An (σ P (R1 x R2 x … x Rk))
π A1, …, An
|
σ P
|
(((R1 x R2) x R3) x … ) x Rk, drawn as a chain of x nodes with R1, R2, R3, …, Rk as leaves
56. Why Learn about QPO?
Why learn about QPO in a DBMS?
Identify performance bottleneck for a query
• is it the physical data model or QPO ?
How to help QPO speed up processing of a query ?
• providing hints, rewriting query, etc.
How to enhance physical data model to speed up
queries?
• add indices, change file- structures, …
57. Measures of Query Cost
Cost is generally measured as the total elapsed time for answering a
query.
Many factors contribute to time cost:
• disk accesses, CPU, or even network communication
Typically disk access is the predominant cost, and is also relatively
easy to estimate. It is measured by taking into account:
Number of seeks * average-seek-cost
Number of blocks read * average-block-read-cost
Number of blocks written * average-block-write-cost
• The cost to write a block is greater than the cost to read a block,
because data is read back after being written to ensure that the write
was successful.
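The three terms can be added up in a small helper. The function name and the numbers in the usage example are hypothetical; actual seek and transfer costs depend on the storage device.

```python
def disk_cost(seeks, blocks_read, blocks_written,
              avg_seek_cost, avg_read_cost, avg_write_cost):
    """Estimated disk-access cost as the sum of the three terms."""
    return (seeks * avg_seek_cost
            + blocks_read * avg_read_cost
            + blocks_written * avg_write_cost)


# Hypothetical figures in milliseconds; the write cost is set higher than the
# read cost, reflecting the read-back after each write.
cost_ms = disk_cost(seeks=10, blocks_read=100, blocks_written=20,
                    avg_seek_cost=4.0, avg_read_cost=0.1, avg_write_cost=0.2)
```

With these assumed figures the estimate is 10 * 4.0 + 100 * 0.1 + 20 * 0.2 = 54 ms.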
58. Algorithms for select operations
Implementing the SELECT Operations
There are many algorithms for executing a select operation, which is
basically a search operation to locate the records in a disk file that
satisfy a certain condition.
Let us discuss the following relational operations:
OP1: σ SSN=“123” (Employee)
OP2: σ Dnumber>5 (department)
OP3: σ Dno>5 (employee)
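Two of the simplest search strategies can be sketched for OP1 and OP3: a linear search that tests every record, and a lookup through an index on a key attribute. The employee rows are hypothetical, and a Python dictionary stands in for a hash index on SSN.

```python
employee = [  # hypothetical records
    {"ssn": "123", "name": "Smith", "dno": 5},
    {"ssn": "456", "name": "Wong", "dno": 7},
    {"ssn": "789", "name": "Zelaya", "dno": 4},
]


def linear_search(rows, predicate):
    """Brute-force scan: test every record against the condition."""
    return [row for row in rows if predicate(row)]


# A dictionary keyed on ssn stands in for a hash index on a key attribute.
ssn_index = {row["ssn"]: row for row in employee}

# OP1: equality on a key attribute, answered by scan and by index lookup.
op1_scan = linear_search(employee, lambda row: row["ssn"] == "123")
op1_index = [ssn_index["123"]] if "123" in ssn_index else []

# OP3: range condition on a non-key attribute, answered here by a scan.
op3 = linear_search(employee, lambda row: row["dno"] > 5)
```

The index lookup touches one entry for OP1, whereas the scan examines every record; a hash index like this one would not help with the range condition in OP3.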
#8: A state is the set of values of these property attributes.
A behavior is an operation that modifies or operates on the property attributes. Ex.: A = l x w… area operates on the attributes
#22: Age and height… its data type..
Users are allowed to define their own types