CSEC 321
Lecture 2
Relational Algebra
Relational Database: Definitions
• Relational database: a set of relations.
• Relation: made up of 2 parts:
– Schema : specifies name of relation, plus name
and type of each column.
• E.g. Students(sid: string, name: string, login: string, age:
integer, gpa: real)
– Instance : a table, with rows and columns.
• #rows = cardinality
• #fields = degree / arity
• Can think of a relation as a set of rows or
tuples.
– i.e., all rows are distinct
Ex: Instance of Students Relation
sid name login age gpa
53666 Jones jones@cs 18 3.4
53688 Smith smith@eecs 18 3.2
53650 Smith smith@math 19 3.8
• Cardinality = 3, arity = 5 , all rows distinct
• Do all values in each column of a relation instance
have to be distinct?
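A minimal sketch of these definitions, assuming a relation instance is modeled as a Python set of tuples (the names STUDENTS_SCHEMA and students are illustrative, not from the lecture). It computes cardinality and arity and checks the question above: whole rows are distinct, but values within one column may repeat.

```python
# Schema: relation name plus (field name, type) pairs, as on the slide.
STUDENTS_SCHEMA = ("Students", [("sid", str), ("name", str), ("login", str),
                                ("age", int), ("gpa", float)])

# Instance: a set of tuples, so duplicate rows cannot occur.
students = {
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@eecs", 18, 3.2),
    ("53650", "Smith", "smith@math", 19, 3.8),
}

cardinality = len(students)          # number of rows
arity = len(STUDENTS_SCHEMA[1])      # number of fields

# Whole rows are distinct, but a single column may hold repeated values.
names = [row[1] for row in students]
name_column_distinct = len(set(names)) == len(names)

print(cardinality, arity, name_column_distinct)
```

Here the name column contains Smith twice, so the answer to the question is no.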
Relational Query Languages
• Query languages: Allow manipulation and retrieval of
data from a database.
• Relational model supports simple, powerful QLs:
– Strong formal foundation based on logic.
– Allows for much optimization.
• Query Languages != programming languages!
– QLs not intended to be used for complex calculations.
– QLs support easy, efficient access to large data sets.
Preliminaries
• A query is applied to relation instances, and the
result of a query is also a relation instance.
Relational Algebra: 5 Basic Operations
• Selection ($\sigma$): selects a subset of rows from a relation (horizontal).
• Projection ($\pi$): retains only wanted columns from a relation (vertical).
• Cross-product ($\times$): allows us to combine two relations.
• Set-difference ($-$): tuples in r1, but not in r2.
• Union ($\cup$): tuples in r1 or in r2.
Since each operation returns a relation, operations can be composed! (Algebra is “closed”.)
Example Instances
R1:
sid  bid  day
22   101  10/10/96
58   103  11/12/96
S1:
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
S2:
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0
Boats:
bid  bname      color
101  Interlake  blue
102  Interlake  red
103  Clipper    green
104  Marine     red
Projection ($\pi$)
• Examples: $\pi_{age}(S2)$; $\pi_{sname,rating}(S2)$
• Retains only attributes that are in the “projection list”.
• Schema of result:
– exactly the fields in the projection list, with the same names that they had in the input relation.
• Projection operator has to eliminate duplicates (How do they arise? Why remove them?)
– Note: real systems typically don’t do duplicate elimination unless the user explicitly asks for it. (Why not?)
Projection ($\pi$)
S2:
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0
$\pi_{sname,rating}(S2)$:
sname   rating
yuppy   9
lubber  8
guppy   5
rusty   10
$\pi_{age}(S2)$:
age
35.0
55.5
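A small sketch of projection, assuming relations are Python sets of tuples and that the helper name project is made up for illustration. Building the result as a set performs the duplicate elimination the slide asks about: three rows of S2 share age 35.0, yet the projection on age has only two rows.

```python
S2_FIELDS = ("sid", "sname", "rating", "age")
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}

def project(rows, fields, wanted):
    """Keep only the wanted columns; the set drops duplicate result rows."""
    idx = [fields.index(f) for f in wanted]
    return {tuple(row[i] for i in idx) for row in rows}

ages = project(S2, S2_FIELDS, ["age"])
names_ratings = project(S2, S2_FIELDS, ["sname", "rating"])

print(sorted(ages))
print(sorted(names_ratings))
```

A real system would typically return a list (a bag) here and skip the deduplication step unless asked, since removing duplicates costs extra work.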
Selection ($\sigma$)
• Selects rows that satisfy selection condition.
• Result is a relation. Schema of result is same as that of the input relation.
• Do we need to do duplicate elimination?
S2:
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0
$\sigma_{rating>8}(S2)$:
sid  sname   rating  age
28   yuppy   9       35.0
58   rusty   10      35.0
$\pi_{sname,rating}(\sigma_{rating>8}(S2))$:
sname   rating
yuppy   9
rusty   10
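A sketch of selection and of composing it with projection, again assuming sets of tuples; select is a hypothetical helper name. Because selection only drops rows from a relation whose rows are already distinct, no duplicate elimination is needed.

```python
S2 = {
    (28, "yuppy", 9, 35.0),
    (31, "lubber", 8, 55.5),
    (44, "guppy", 5, 35.0),
    (58, "rusty", 10, 35.0),
}

def select(rows, predicate):
    """Keep the rows that satisfy the selection condition."""
    return {row for row in rows if predicate(row)}

# sigma_{rating > 8}(S2); rating is the third field (index 2).
high_rated = select(S2, lambda row: row[2] > 8)

# pi_{sname, rating}(sigma_{rating > 8}(S2)): the algebra is closed, so
# the output of select can be fed straight into a projection.
names_ratings = {(row[1], row[2]) for row in high_rated}

print(sorted(high_rated))
print(sorted(names_ratings))
```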
Union and Set-Difference
• Both of these operations take two input relations,
which must be union-compatible:
– Same number of fields.
– `Corresponding’ fields have the same type.
• For which, if any, is duplicate elimination required?
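A sketch of union and set-difference on the S1 and S2 example instances, assuming sets of tuples; S_TYPES and union_compatible are illustrative names. Union has to eliminate duplicates (lubber and rusty appear in both inputs), while set-difference can never create any.

```python
S_TYPES = (int, str, int, float)   # types of sid, sname, rating, age

def union_compatible(types_r, types_s):
    """Same number of fields, and corresponding fields have the same type."""
    return tuple(types_r) == tuple(types_s)

S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
S2 = {(28, "yuppy", 9, 35.0), (31, "lubber", 8, 55.5),
      (44, "guppy", 5, 35.0), (58, "rusty", 10, 35.0)}

assert union_compatible(S_TYPES, S_TYPES)

union = S1 | S2          # 3 + 4 input rows, but only 5 distinct rows
s1_minus_s2 = S1 - S2    # tuples in S1 but not in S2
s2_minus_s1 = S2 - S1    # tuples in S2 but not in S1

print(len(union), sorted(s1_minus_s2), sorted(s2_minus_s1))
```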
Union
S1:
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
S2:
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0
$S1 \cup S2$:
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
44   guppy   5       35.0
28   yuppy   9       35.0
Set Difference
S1:
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
S2:
sid  sname   rating  age
28   yuppy   9       35.0
31   lubber  8       55.5
44   guppy   5       35.0
58   rusty   10      35.0
$S1 - S2$:
sid  sname   rating  age
22   dustin  7       45.0
$S2 - S1$:
sid  sname   rating  age
28   yuppy   9       35.0
44   guppy   5       35.0
Cross-Product
• S1 $\times$ R1: Each row of S1 paired with each row of R1.
• Q: How many rows in the result?
• Result schema has one field per field of S1 and R1, with field names ‘inherited’ if possible.
– May have a naming conflict: Both S1 and R1 have a field with the same name.
– In this case, can use the renaming operator:
$\rho(C(1 \rightarrow sid1, 5 \rightarrow sid2), S1 \times R1)$
Cross Product Example
S1:
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
R1:
sid  bid  day
22   101  10/10/96
58   103  11/12/96
S1 $\times$ R1 =
(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  22     101  10/10/96
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  22     101  10/10/96
31     lubber  8       55.5  58     103  11/12/96
58     rusty   10      35.0  22     101  10/10/96
58     rusty   10      35.0  58     103  11/12/96
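A sketch of the cross-product on these instances, assuming sets of tuples and using itertools.product from the standard library. The list renamed is an illustrative stand-in for the renaming operator: it relabels positions 1 and 5 (counting from 1) as sid1 and sid2 to resolve the name clash.

```python
from itertools import product

S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

# Pair each row of S1 with each row of R1 by concatenating the tuples.
cross = {s + r for s, r in product(S1, R1)}

# Inherited field names clash on sid, so rename positions 1 and 5.
fields = ["sid", "sname", "rating", "age", "sid", "bid", "day"]
renamed = list(fields)
renamed[0] = "sid1"
renamed[4] = "sid2"

print(len(cross), renamed)
```

The row count answers the question on the previous slide: the result has |S1| times |R1| rows, here 3 × 2 = 6.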
Compound Operator: Intersection
• In addition to the 5 basic operators, there are several
additional “Compound Operators”
– These add no computational power to the language, but are
useful shorthands.
– Can be expressed solely with the basic ops.
• Intersection takes two input relations, which must be
union-compatible.
• Q: How to express it using basic operators?
$R \cap S = R - (R - S)$
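A quick check of this identity on the S1 and S2 instances, assuming sets of tuples; intersect is a hypothetical helper that uses set-difference only, and Python's built-in & serves purely as a cross-check.

```python
S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
S2 = {(28, "yuppy", 9, 35.0), (31, "lubber", 8, 55.5),
      (44, "guppy", 5, 35.0), (58, "rusty", 10, 35.0)}

def intersect(r, s):
    """R intersect S written with the basic set-difference operator only."""
    return r - (r - s)

both = intersect(S1, S2)
print(sorted(both))
```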
Natural Join Example
S1:
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
R1:
sid  bid  day
22   101  10/10/96
58   103  11/12/96
S1 $\bowtie$ R1 =
sid  sname   rating  age   bid  day
22   dustin  7       45.0  101  10/10/96
58   rusty   10      35.0  103  11/12/96
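Natural join is also a compound operator. The sketch below, assuming sets of tuples and a hypothetical helper natural_join_on_sid, builds it from three basic operators: cross-product, selection on equal sid values, and a projection that drops the second copy of sid.

```python
from itertools import product

S1 = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
R1 = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

def natural_join_on_sid(s_rows, r_rows):
    crossed = {s + r for s, r in product(s_rows, r_rows)}     # S1 x R1
    matched = {row for row in crossed if row[0] == row[4]}    # keep rows with equal sids
    return {row[:4] + row[5:] for row in matched}             # drop the duplicate sid column

joined = natural_join_on_sid(S1, R1)
print(sorted(joined))
```

Sailor 31 (lubber) has no reservation in R1, so no row for lubber appears in the result.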
Other Types of Joins
• Condition Join (or “theta-join”): $R \bowtie_c S = \sigma_c(R \times S)$
• Result schema same as that of cross-product.
• May have fewer tuples than cross-product.
• Equi-Join: Special case: condition c contains only conjunction of equalities.
$S1 \bowtie_{S1.sid < R1.sid} R1$:
(sid)  sname   rating  age   (sid)  bid  day
22     dustin  7       45.0  58     103  11/12/96
31     lubber  8       55.5  58     103  11/12/96
Examples
Sailors:
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
Boats:
bid  bname      color
101  Interlake  blue
102  Interlake  red
103  Clipper    green
104  Marine     red
Reserves:
sid  bid  day
22   101  10/10/96
58   103  11/12/96
Find names of sailors who’ve reserved boat #103
• Solution 1: $\pi_{sname}((\sigma_{bid=103} Reserves) \bowtie Sailors)$
• Solution 2: $\pi_{sname}(\sigma_{bid=103}(Reserves \bowtie Sailors))$
Find names of sailors who’ve reserved a red boat
• Information about boat color only available in Boats; so need an extra join:
$\pi_{sname}((\sigma_{color='red'} Boats) \bowtie Reserves \bowtie Sailors)$
• A more efficient solution:
$\pi_{sname}(\pi_{sid}((\pi_{bid}(\sigma_{color='red'} Boats)) \bowtie Reserves) \bowtie Sailors)$
• A query optimizer can find this given the first solution!
Find sailors who’ve reserved a red boat or a green boat
• Can identify all red or green boats, then find sailors who’ve reserved one of these boats:
$\rho(Tempboats, (\sigma_{color='red' \vee color='green'} Boats))$
$\pi_{sname}(Tempboats \bowtie Reserves \bowtie Sailors)$
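A sketch of this two-step query on the example instances, assuming sets of tuples and lowercase color values; the variable names mirror the algebra expression but the code itself is only an illustration. The two joins are written as one comprehension that matches bid and then sid.

```python
sailors = {(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)}
boats = {(101, "Interlake", "blue"), (102, "Interlake", "red"),
         (103, "Clipper", "green"), (104, "Marine", "red")}
reserves = {(22, 101, "10/10/96"), (58, 103, "11/12/96")}

# rho(Tempboats, sigma_{color='red' OR color='green'} Boats)
tempboats = {b for b in boats if b[2] == "red" or b[2] == "green"}

# pi_sname(Tempboats join Reserves join Sailors)
names = {s[1]
         for b in tempboats
         for r in reserves
         for s in sailors
         if b[0] == r[1] and r[0] == s[0]}

print(sorted(names))
```

On this instance only rusty qualifies (boat 103 is green); dustin reserved the blue boat 101.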
Find sailors who’ve reserved a red and a green boat
• Cut-and-paste previous slide?
$\rho(Tempboats, (\sigma_{color='red' \wedge color='green'} Boats))$
$\pi_{sname}(Tempboats \bowtie Reserves \bowtie Sailors)$
Find sailors who’ve reserved a red and a green boat
• Previous approach won’t work! Must identify sailors who’ve reserved red boats, sailors who’ve reserved green boats, then find the intersection (note that sid is a key for Sailors):
$\rho(Tempred, \pi_{sid}((\sigma_{color='red'} Boats) \bowtie Reserves))$
$\rho(Tempgreen, \pi_{sid}((\sigma_{color='green'} Boats) \bowtie Reserves))$
$\pi_{sname}((Tempred \cap Tempgreen) \bowtie Sailors)$
Summary
• Relational Algebra: a small set of operators
mapping relations to relations
– Operational, in the sense that you specify the
explicit order of operations
– A closed set of operators! Can mix and match.
• Basic ops include: $\sigma$, $\pi$, $\times$, $\cup$, $-$
• Important compound ops: $\cap$, $\bowtie$
SQL - A language for Relational DBs
• SQL (a.k.a. “Sequel”), standard
language
• Data Definition Language (DDL)
– create, modify, delete relations
– specify constraints
– administer users, security, etc.
• Data Manipulation Language (DML)
– Specify queries to find tuples that satisfy
criteria
– add, modify, remove tuples
SQL Overview
• CREATE TABLE <name> ( <field> <domain>, … )
• INSERT INTO <name> (<field names>)
VALUES (<field values>)
• DELETE FROM <name>
WHERE <condition>
• UPDATE <name>
SET <field name> = <value>
WHERE <condition>
• SELECT <fields>
FROM <name>
WHERE <condition>
Creating Relations in SQL
• Creates the Students relation.
– Note: the type (domain) of each field is
specified, and enforced by the DBMS
whenever tuples are added or modified.
CREATE TABLE Students
(sid CHAR(20),
name CHAR(20),
login CHAR(10),
age INTEGER,
gpa FLOAT)
Table Creation (continued)
• Another example: the Enrolled table
holds information about courses
students take.
CREATE TABLE Enrolled
(sid CHAR(20),
cid CHAR(20),
grade CHAR(2))
Adding and Deleting Tuples
• Can insert a single tuple using:
INSERT INTO Students (sid, name, login, age, gpa)
VALUES ('53688', 'Smith', 'smith@ee', 18, 3.2)
• Can delete all tuples satisfying some condition
(e.g., name = Smith):
DELETE
FROM Students S
WHERE S.name = 'Smith'
Powerful variants of these commands are available;
more later!
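A runnable sketch of these statements, assuming SQLite (through Python's standard sqlite3 module) as a stand-in DBMS. The second inserted row and the variable names are illustrative, and the table alias is left out of the DELETE to keep the statement portable. Note that SQLite applies type affinity rather than strictly enforcing declared column types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Students
                (sid CHAR(20), name CHAR(20), login CHAR(10),
                 age INTEGER, gpa FLOAT)""")

insert = ("INSERT INTO Students (sid, name, login, age, gpa) "
          "VALUES (?, ?, ?, ?, ?)")
conn.execute(insert, ("53688", "Smith", "smith@ee", 18, 3.2))
conn.execute(insert, ("53666", "Jones", "jones@cs", 18, 3.4))
before = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]

# Delete all tuples satisfying the condition name = 'Smith'.
conn.execute("DELETE FROM Students WHERE name = 'Smith'")
remaining = conn.execute("SELECT name FROM Students").fetchall()

print(before, remaining)
```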
Keys
• Keys are a way to associate tuples in
different relations
• Keys are one form of integrity constraint
(IC)
Enrolled (sid is a FOREIGN key referring to Students):
sid    cid          grade
53666  Carnatic101  C
53666  Reggae203    B
53650  Topology112  A
53666  History105   B
Students (sid is the PRIMARY key):
sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8
Primary Keys
• A set of fields is a superkey if:
– No two distinct tuples can have same values in all key fields
• A set of fields is a key for a relation if :
– It is a superkey
– No proper subset of the fields is a superkey
• What if >1 key for a relation?
– One of the keys is chosen (by DBA) to be the primary key.
Other keys are called candidate keys.
• E.g.
– sid is a key for Students.
– What about name?
– The set {sid, gpa} is a superkey.
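A sketch of the superkey condition checked against one instance, assuming rows are tuples in a list; violates_superkey is a hypothetical helper. An instance can only refute a key claim: finding no violation here does not prove that the field set is a key for all legal instances.

```python
FIELDS = ("sid", "name", "login", "age", "gpa")
students = [
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@eecs", 18, 3.2),
    ("53650", "Smith", "smith@math", 19, 3.8),
]

def violates_superkey(rows, fields, key_fields):
    """True if two distinct tuples agree on every field in key_fields."""
    idx = [fields.index(f) for f in key_fields]
    seen = set()
    for row in rows:
        key = tuple(row[i] for i in idx)
        if key in seen:
            return True
        seen.add(key)
    return False

print(violates_superkey(students, FIELDS, ["name"]))        # two Smiths
print(violates_superkey(students, FIELDS, ["sid"]))
print(violates_superkey(students, FIELDS, ["sid", "gpa"]))
```

So name is not a key. The set {sid, gpa} passes the superkey test but is not a key, because its proper subset {sid} is already a superkey.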
Primary and Candidate Keys in SQL
• Possibly many candidate keys (specified using
UNIQUE), one of which is chosen as the primary key.
• Keys must be used carefully!
• “For a given student and course, there is a single grade.”
CREATE TABLE Enrolled
(sid CHAR(20),
cid CHAR(20),
grade CHAR(2),
PRIMARY KEY (sid,cid))
vs.
• “Students can take only one course, and no two students in a course receive the same grade.”
CREATE TABLE Enrolled
(sid CHAR(20),
cid CHAR(20),
grade CHAR(2),
PRIMARY KEY (sid),
UNIQUE (cid, grade))
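To see how much the choice of key changes what the table accepts, here is a sketch assuming SQLite via Python's sqlite3 module; try_inserts and the DDL variable names are illustrative. It loads the same four Enrolled rows under each declaration and counts how many are accepted.

```python
import sqlite3

DDL_PAIR_KEY = """CREATE TABLE Enrolled
    (sid CHAR(20), cid CHAR(20), grade CHAR(2),
     PRIMARY KEY (sid, cid))"""
DDL_SID_KEY = """CREATE TABLE Enrolled
    (sid CHAR(20), cid CHAR(20), grade CHAR(2),
     PRIMARY KEY (sid), UNIQUE (cid, grade))"""

ROWS = [("53666", "Carnatic101", "C"), ("53666", "Reggae203", "B"),
        ("53650", "Topology112", "A"), ("53666", "History105", "B")]

def try_inserts(ddl, rows):
    """Create a fresh table and count the rows the constraints accept."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    accepted = 0
    for row in rows:
        try:
            conn.execute("INSERT INTO Enrolled VALUES (?, ?, ?)", row)
            accepted += 1
        except sqlite3.IntegrityError:
            pass  # the DBMS rejected a tuple that violates a key constraint
    return accepted

accepted_pair_key = try_inserts(DDL_PAIR_KEY, ROWS)
accepted_sid_key = try_inserts(DDL_SID_KEY, ROWS)
print(accepted_pair_key, accepted_sid_key)
```

With PRIMARY KEY (sid) a student can appear only once, so the second and fourth rows for student 53666 are rejected.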
Foreign Keys, Referential Integrity
• Foreign key: Set of fields in one relation
that is used to `refer’ to a tuple in another
relation.
– Must correspond to the primary key of the other
relation.
– Like a `logical pointer’.
• If all foreign key constraints are enforced,
referential integrity is achieved (i.e., no
dangling references.)
Foreign Keys in SQL
• E.g. Only students listed in the Students relation
should be allowed to enroll for courses.
– sid is a foreign key referring to Students:
CREATE TABLE Enrolled
(sid CHAR(20),cid CHAR(20),grade CHAR(2),
PRIMARY KEY (sid,cid),
FOREIGN KEY (sid) REFERENCES Students )
Enrolled:
sid    cid          grade
53666  Carnatic101  C
53666  Reggae203    B
53650  Topology112  A
53666  History105   B
11111  English102   A    (not allowed: sid 11111 does not appear in Students)
Students:
sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@eecs  18   3.2
53650  Smith  smith@math  19   3.8
Enforcing Referential Integrity
• Consider Students and Enrolled; sid in Enrolled is a
foreign key that references Students.
• What should be done if an Enrolled tuple with a non-existent student id is inserted? (Reject it!)
• What should be done if a Students tuple is deleted?
– Also delete all Enrolled tuples that refer to it?
– Disallow deletion of a Students tuple that is referred to?
– Set sid in Enrolled tuples that refer to it to a default sid?
– (In SQL, also: Set sid in Enrolled tuples that refer to it to a
special value null, denoting `unknown’ or `inapplicable’.)
• Similar issues arise if primary key of Students tuple is
updated.
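A sketch of two of these choices, assuming SQLite via Python's sqlite3 module. SQLite leaves foreign key enforcement off unless the foreign_keys pragma is turned on per connection. ON DELETE CASCADE is picked here to illustrate the first deletion option; the trimmed-down Students table and the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.execute("CREATE TABLE Students (sid CHAR(20) PRIMARY KEY, name CHAR(20))")
conn.execute("""CREATE TABLE Enrolled
                (sid CHAR(20), cid CHAR(20), grade CHAR(2),
                 PRIMARY KEY (sid, cid),
                 FOREIGN KEY (sid) REFERENCES Students ON DELETE CASCADE)""")

conn.execute("INSERT INTO Students VALUES ('53666', 'Jones')")
conn.execute("INSERT INTO Students VALUES ('53650', 'Smith')")
conn.execute("INSERT INTO Enrolled VALUES ('53666', 'Carnatic101', 'C')")
conn.execute("INSERT INTO Enrolled VALUES ('53650', 'Topology112', 'A')")

# An Enrolled tuple with a non-existent student id is rejected.
try:
    conn.execute("INSERT INTO Enrolled VALUES ('11111', 'English102', 'A')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Deleting a Students tuple also deletes the Enrolled tuples that refer to it.
conn.execute("DELETE FROM Students WHERE sid = '53666'")
left = conn.execute("SELECT sid, cid FROM Enrolled").fetchall()

print(rejected, left)
```

The other options map to different clauses in SQL: omitting the ON DELETE clause disallows the deletion, while ON DELETE SET DEFAULT and ON DELETE SET NULL rewrite the referring sid values.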
Integrity Constraints (ICs)
• IC: condition that must be true for any
instance of the database; e.g., domain
constraints.
– ICs are specified when schema is defined.
– ICs are checked when relations are modified.
• A legal instance of a relation is one that
satisfies all specified ICs.
– DBMS should not allow illegal instances.
• If the DBMS checks ICs, stored data is more
faithful to real-world meaning.
– Avoids data entry errors, too!
Where do ICs Come From?
• ICs are based upon the semantics of the real-world
that is being described in the database relations.
• We can check a database instance to see if an IC is
violated, but we can NEVER infer that an IC is true by
looking at an instance.
– An IC is a statement about all possible instances!
– From example, we know name is not a key, but the
assertion that sid is a key is given to us.
• Key and foreign key ICs are the most common; more
general ICs supported too.
• In the real world, sometimes the constraint should
hold but doesn’t --> data cleaning!
Relational Query Languages
• A major strength of the relational model:
supports simple, powerful querying of data.
• Queries can be written intuitively, and the
DBMS is responsible for efficient evaluation.
– The key: precise semantics for relational queries.
– Allows the optimizer to extensively re-order
operations, and still ensure that the answer does
not change.
The SQL Query Language
• The most widely used relational query
language.
– Current std is SQL:2003; SQL92 is a basic subset
• To find all 18 year old students, we can write:
SELECT *
FROM Students S
WHERE S.age=18
• To find just names and logins, replace the first line:
SELECT S.name, S.login
Students:
sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@ee    18   3.2
53650  Smith  smith@math  19   3.8
Querying Multiple Relations
• What does the following query compute?
SELECT S.name, E.cid
FROM Students S, Enrolled E
WHERE S.sid=E.sid AND E.grade='A'
Given the following instance of Enrolled
sid    cid          grade
53831  Carnatic101  C
53831  Reggae203    B
53650  Topology112  A
53666  History105   B
we get:
S.name  E.cid
Smith   Topology112
Semantics of a Query
• A conceptual evaluation method for the previous
query:
1. do FROM clause: compute cross-product of Students and
Enrolled
2. do WHERE clause: Check conditions, discard tuples that fail
3. do SELECT clause: Delete unwanted fields
• Remember, this is conceptual. Actual evaluation will
be much more efficient, but must produce the same
answers.
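The three conceptual steps can be sketched directly, assuming tuples in Python lists and positional field access (indexes 0 to 4 for the Students fields, 5 to 7 for the Enrolled fields); the variable names are illustrative. This mirrors the naive strategy only and is not how a real DBMS would evaluate the query.

```python
from itertools import product

students = [("53666", "Jones", "jones@cs", 18, 3.4),
            ("53688", "Smith", "smith@ee", 18, 3.2),
            ("53650", "Smith", "smith@math", 19, 3.8)]
enrolled = [("53831", "Carnatic101", "C"), ("53831", "Reggae203", "B"),
            ("53650", "Topology112", "A"), ("53666", "History105", "B")]

# 1. FROM: cross-product of Students and Enrolled.
crossed = [s + e for s, e in product(students, enrolled)]

# 2. WHERE S.sid = E.sid AND E.grade = 'A': discard tuples that fail.
kept = [row for row in crossed if row[0] == row[5] and row[7] == "A"]

# 3. SELECT S.name, E.cid: delete unwanted fields.
answer = [(row[1], row[6]) for row in kept]

print(len(crossed), answer)
```

The cross-product has 3 × 4 = 12 rows, of which only one survives the WHERE clause.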
Cross-product of Students and Enrolled Instances
S.sid S.name S.login S.age S.gpa E.sid E.cid E.grade
53666 Jones jones@cs 18 3.4 53831 Carnatic101 C
53666 Jones jones@cs 18 3.4 53831 Reggae203 B
53666 Jones jones@cs 18 3.4 53650 Topology112 A
53666 Jones jones@cs 18 3.4 53666 History105 B
53688 Smith smith@ee 18 3.2 53831 Carnatic101 C
53688 Smith smith@ee 18 3.2 53831 Reggae203 B
53688 Smith smith@ee 18 3.2 53650 Topology112 A
53688 Smith smith@ee 18 3.2 53666 History105 B
53650 Smith smith@math 19 3.8 53831 Carnatic101 C
53650 Smith smith@math 19 3.8 53831 Reggae203 B
53650 Smith smith@math 19 3.8 53650 Topology112 A
53650 Smith smith@math 19 3.8 53666 History105 B
Relational Model: Summary
• A tabular representation of data.
• Simple and intuitive, currently the most widely used
– Object-relational support in most products
– XML support added in SQL:2003, most systems
• Integrity constraints can be specified by the DBA,
based on application semantics. DBMS checks for
violations.
– Two important ICs: primary and foreign keys
– In addition, we always have domain constraints.
• Powerful query languages exist.
– SQL is the standard commercial one
• DDL - Data Definition Language
• DML - Data Manipulation Language
GOSUB XML;
Databases for Programmers
• Programmers think about objects
(structs)
– Nested and interleaved
• Often want to “persist” these things
• Options
– encode opaquely and store
– translate to a structured form
• relational DB, XML file
– pros and cons?
YUCK!!
• How do I “relationalize” my objects?
• Have to write a converter for each
class?
• Think about when to save things into
the DB?
• Good news:
– Can all be automated
– With varying amounts of trouble
Object-Relational Mappings
• Roughly:
– Class ~ Entity Set
– Instance ~ Entity
– Data member ~ Attribute
– Reference ~ Foreign Key
Details, details
• We have to map this down to tables
• Which table holds which class of object?
• What about relationships?
• Solution #1: Declarative Configuration
– Write a description file (often in XML)
• E.g. Enterprise Java Beans (EJBs)
• Solution #2: Convention
– Agree to use some conventions
• E.g. Rails
Ruby on Rails
• Ruby: an OO scripting language
– and a pretty nice one, too
• Rails: a framework for web apps
– “convention over configuration”
• great for standard web-app stuff!
– allows overriding as needed
• Very ER-like
Rails and ER
• Models
– Employees
– Departments
[ER diagram: Employees (ssn, name, lot), Works_In (since), Departments (did, dname, budget)]
Some Rails “Models”
app/models/state.rb
class State < ActiveRecord::Base
  has_many :cities
end
app/models/city.rb
class City < ActiveRecord::Base
  belongs_to :state
end
Further Reading
• Chapter 18 (through 18.3) in Agile Web Development with Rails
Database managment System Relational Algebra

  • 2. Relational Database: Definitions • Relational database: a set of relations. • Relation: made up of 2 parts: – Schema : specifies name of relation, plus name and type of each column. • E.g. Students(sid: string, name: string, login: string, age: integer, gpa: real) – Instance : a table, with rows and columns. • #rows = cardinality • #fields = degree / arity • Can think of a relation as a set of rows or tuples. – i.e., all rows are distinct
  • 3. Ex: Instance of Students Relation sid name login age gpa 53666 Jones jones@cs 18 3.4 53688 Smith smith@eecs 18 3.2 53650 Smith smith@math 19 3.8 • Cardinality = 3, arity = 5 , all rows distinct • Do all values in each column of a relation instance have to be distinct?
  • 4. Relational Query Languages • Query languages: Allow manipulation and retrieval of data from a database. • Relational model supports simple, powerful QLs: – Strong formal foundation based on logic. – Allows for much optimization. • Query Languages != programming languages! – QLs not intended to be used for complex calculations. – QLs support easy, efficient access to large data sets.
  • 5. Preliminaries • A query is applied to relation instances, and the result of a query is also a relation instance.
  • 6. Relational Algebra: 5 Basic Operations • Selection ( s ) Selects a subset of rows from relation (horizontal). • Projection ( p ) Retains only wanted columns from relation (vertical). • Cross-product (  ) Allows us to combine two relations. • Set-difference ( — ) Tuples in r1, but not in r2. • Union (  ) Tuples in r1 or in r2. Since each operation returns a relation, operations can be composed! (Algebra is “closed”.)
  • 7. sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 sid bid day 22 101 10/10/96 58 103 11/12/96 R1 S1 S2 bid bname color 101 Interlake blue 102 Interlake red 103 Clipper green 104 Marine red Boats Example Instances
  • 8. Projection (p) page S( )2• Examples: ; • Retains only attributes that are in the “projection list”. • Schema of result: – exactly the fields in the projection list, with the same names that they had in the input relation. • Projection operator has to eliminate duplicates (How do they arise? Why remove them?) – Note: real systems typically don’t do duplicate elimination unless the user explicitly asks for it. (Why not?) psname rating S , ( )2
  • 9. Projection (p) age 35.0 55.5 sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 S2 sname rating yuppy 9 lubber 8 guppy 5 rusty 10 )2( , S ratingsname p page S( )2
  • 10. Selection (s) srating S 8 2( ) sname rating yuppy 9 rusty 10 p ssname rating rating S , ( ( )) 8 2 • Selects rows that satisfy selection condition. • Result is a relation. Schema of result is same as that of the input relation. • Do we need to do duplicate elimination? sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0
  • 11. Union and Set-Difference • Both of these operations take two input relations, which must be union-compatible: – Same number of fields. – `Corresponding’ fields have the same type. • For which, if any, is duplicate elimination required?
  • 12. Union sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 44 guppy 5 35.0 28 yuppy 9 35.0 sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 S1 S2 S S1 2
  • 13. Set Difference sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 sid sname rating age 28 yuppy 9 35.0 31 lubber 8 55.5 44 guppy 5 35.0 58 rusty 10 35.0 S1 S2 sid sname rating age 22 dustin 7 45.0 S2 – S1 sid sname rating age 28 yuppy 9 35.0 44 guppy 5 35.0 S S1 2
  • 14. Cross-Product • S1  R1: Each row of S1 paired with each row of R1. • Q: How many rows in the result? • Result schema has one field per field of S1 and R1, with field names `inherited’ if possible. – May have a naming conflict: Both S1 and R1 have a field with the same name. – In this case, can use the renaming operator:  ( ( , ), )C sid sid S R1 1 5 2 1 1  
  • 15. Cross Product Example (sid) sname rating age (sid) bid day 22 dustin 7 45.0 22 101 10/10/96 22 dustin 7 45.0 58 103 11/12/96 31 lubber 8 55.5 22 101 10/10/96 31 lubber 8 55.5 58 103 11/12/96 58 rusty 10 35.0 22 101 10/10/96 58 rusty 10 35.0 58 103 11/12/96 sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 sid bid day 22 101 10/10/96 58 103 11/12/96 R1 S1 S1 x R1 =
  • 16. Compound Operator: Intersection • In addition to the 5 basic operators, there are several additional “Compound Operators” – These add no computational power to the language, but are useful shorthands. – Can be expressed solely with the basic ops. • Intersection takes two input relations, which must be union-compatible. • Q: How to express it using basic operators? R  S = R  (R  S)
  • 17. Natural Join Example sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 sid bid day 22 101 10/10/96 58 103 11/12/96 R1 S1 S1 R1 = sid sname rating age bid day 22 dustin 7 45.0 101 10/10/96 58 rusty 10 35.0 103 11/12/96
  • 18. Other Types of Joins • Condition Join (or “theta-join”): • Result schema same as that of cross-product. • May have fewer tuples than cross-product. • Equi-Join: Special case: condition c contains only conjunction of equalities. R c S c R S  s ( ) (sid) sname rating age (sid) bid day 22 dustin 7 45.0 58 103 11/12/96 31 lubber 8 55.5 58 103 11/12/96 11 .1.1 RS sidRsidS  
  • 19. Examples sid sname rating age 22 dustin 7 45.0 31 lubber 8 55.5 58 rusty 10 35.0 bid bname color 101 Interlake Blue 102 Interlake Red 103 Clipper Green 104 Marine Red sid bid day 22 101 10/10/96 58 103 11/12/96 Reserves Sailors Boats
  • 20. Find names of sailors who’ve reserved boat #103 • Solution 1: p ssname bid serves Sailors(( Re ) ) 103  • Solution 2: p ssname bid serves Sailors( (Re )) 103 
  • 21. Find names of sailors who’ve reserved a red boat • Information about boat color only available in Boats; so need an extra join: p ssname color red Boats serves Sailors(( ' ' ) Re )     A more efficient solution: p p p ssname sid bid color red Boats s Sailors( (( ' ' ) Re ) )     A query optimizer can find this given the first solution!
  • 22. Find sailors who’ve reserved a red boat or a green boat • Can identify all red or green boats, then find sailors who’ve reserved one of these boats:  s( ,( ' ' ' ' ))Tempboats color red color green Boats    p sname Tempboats serves Sailors( Re ) 
  • 23. Find sailors who’ve reserved a red and a green boat • Cut-and-paste previous slide?   (Tempboats,(s color'red'color'green' Boats)) p sname Tempboats serves Sailors( Re ) 
  • 24. Find sailors who’ve reserved a red and a green boat • Previous approach won’t work! Must identify sailors who’ve reserved red boats, sailors who’ve reserved green boats, then find the intersection (note that sid is a key for Sailors):  p s( , (( ' ' ) Re ))Tempred sid color red Boats serves   p sname Tempred Tempgreen Sailors(( ) )   p s( , (( ' ' ) Re ))Tempgreen sid color green Boats serves  
  • 25. Summary • Relational Algebra: a small set of operators mapping relations to relations – Operational, in the sense that you specify the explicit order of operations – A closed set of operators! Can mix and match. • Basic ops include: s, p, , , — • Important compound ops: ,
  • 26. SQL - A language for Relational DBs • SQL (a.k.a. “Sequel”), standard language • Data Definition Language (DDL) – create, modify, delete relations – specify constraints – administer users, security, etc. • Data Manipulation Language (DML) – Specify queries to find tuples that satisfy criteria – add, modify, remove tuples
  • 27. SQL Overview • CREATE TABLE <name> ( <field> <domain>, … ) • INSERT INTO <name> (<field names>) VALUES (<field values>) • DELETE FROM <name> WHERE <condition> • UPDATE <name> SET <field name> = <value> WHERE <condition> • SELECT <fields> FROM <name> WHERE <condition>
  • 28. Creating Relations in SQL • Creates the Students relation. – Note: the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified. CREATE TABLE Students (sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa FLOAT)
  • 29. Table Creation (continued) • Another example: the Enrolled table holds information about courses students take. CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2))
  • 30. Adding and Deleting Tuples • Can insert a single tuple using: INSERT INTO Students (sid, name, login, age, gpa) VALUES (‘53688’, ‘Smith’, ‘smith@ee’, 18, 3.2) • Can delete all tuples satisfying some condition (e.g., name = Smith): DELETE FROM Students S WHERE S.name = ‘Smith’ Powerful variants of these commands are available; more later!
  • 31. Keys • Keys are a way to associate tuples in different relations • Keys are one form of integrity constraint (IC) sid name login age gpa 53666 Jones jones@cs 18 3.4 53688 Smith smith@eecs 18 3.2 53650 Smith smith@math 19 3.8 sid cid grade 53666 Carnatic101 C 53666 Reggae203 B 53650 Topology112 A 53666 History105 B Enrolled Students PRIMARY KeyFOREIGN Key
  • 32. Primary Keys • A set of fields is a superkey if: – No two distinct tuples can have the same values in all key fields • A set of fields is a key for a relation if: – It is a superkey – No proper subset of the fields is a superkey • What if there is more than one key for a relation? – One of the keys is chosen (by the DBA) to be the primary key. Other keys are called candidate keys. • E.g. – sid is a key for Students. – What about name? – The set {sid, gpa} is a superkey.
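The superkey and key definitions can be checked mechanically against one instance. The sketch below is illustrative only, with made-up function names. It can show that a set of fields is not a superkey (a violation is visible), but passing the check on one instance never proves that a set of fields is a key, since a key is a statement about all possible instances.

```python
from itertools import combinations

COLUMNS = ("sid", "name", "login", "age", "gpa")
STUDENTS = [("53666", "Jones", "jones@cs", 18, 3.4),
            ("53688", "Smith", "smith@eecs", 18, 3.2),
            ("53650", "Smith", "smith@math", 19, 3.8)]

def violates_superkey(rows, fields):
    """True if two rows of this instance agree on all the given fields
    (rows are assumed distinct, as in any relation instance)."""
    idx = [COLUMNS.index(f) for f in fields]
    seen = set()
    for r in rows:
        key = tuple(r[i] for i in idx)
        if key in seen:
            return True
        seen.add(key)
    return False

def looks_like_key(rows, fields):
    """True if fields is unviolated on this instance and every non-empty
    proper subset is violated (minimality, judged on this instance only)."""
    if violates_superkey(rows, fields):
        return False
    return all(violates_superkey(rows, sub)
               for n in range(1, len(fields))
               for sub in combinations(fields, n))

name_ok = not violates_superkey(STUDENTS, ("name",))        # two Smiths
sid_gpa_ok = not violates_superkey(STUDENTS, ("sid", "gpa"))
sid_gpa_minimal = looks_like_key(STUDENTS, ("sid", "gpa"))  # sid alone suffices
gpa_ok = not violates_superkey(STUDENTS, ("gpa",))
```

Note that `gpa_ok` comes out true on this tiny instance even though gpa is obviously not a key in the real world: the instance happens to have three different GPAs.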
  • 33. Primary and Candidate Keys in SQL • Possibly many candidate keys (specified using UNIQUE), one of which is chosen as the primary key. • Keys must be used carefully! Compare:
– “For a given student and course, there is a single grade.”
CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid))
vs.
– “Students can take only one course, and no two students in a course receive the same grade.”
CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid), UNIQUE (cid, grade))
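To see how differently the two declarations behave, the same tuples can be offered to both versions of Enrolled. This is a sketch using Python's `sqlite3` module with SQLite as a stand-in DBMS; the helper `try_inserts` and the last two test tuples are invented for the illustration.

```python
import sqlite3

def try_inserts(ddl, tuples):
    """Create Enrolled with the given DDL and return the tuples it accepts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    accepted = []
    for t in tuples:
        try:
            conn.execute(
                "INSERT INTO Enrolled (sid, cid, grade) VALUES (?, ?, ?)", t)
            accepted.append(t)
        except sqlite3.IntegrityError:
            pass  # the key constraint rejected this tuple
    conn.close()
    return accepted

# "For a given student and course, there is a single grade."
DDL_A = """CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2),
           PRIMARY KEY (sid, cid))"""
# "Students can take only one course, and no two students in a course
#  receive the same grade."
DDL_B = """CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2),
           PRIMARY KEY (sid), UNIQUE (cid, grade))"""

TUPLES = [("53666", "Carnatic101", "C"),
          ("53666", "Reggae203", "B"),    # same student, second course
          ("53650", "Carnatic101", "C"),  # same course and grade as the first
          ("53666", "Carnatic101", "A")]  # second grade for one (student, course)

accepted_a = try_inserts(DDL_A, TUPLES)
accepted_b = try_inserts(DDL_B, TUPLES)
```

Under the first declaration only the last tuple is refused; under the second, everything after the first tuple is refused, which is almost certainly not what a registrar wants.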
  • 34. Foreign Keys, Referential Integrity • Foreign key: Set of fields in one relation that is used to 'refer' to a tuple in another relation. – Must correspond to the primary key of the other relation. – Like a 'logical pointer'. • If all foreign key constraints are enforced, referential integrity is achieved (i.e., no dangling references).
  • 35. Foreign Keys in SQL • E.g. Only students listed in the Students relation should be allowed to enroll for courses. – sid is a foreign key referring to Students:
CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid), FOREIGN KEY (sid) REFERENCES Students)
Enrolled
sid cid grade
53666 Carnatic101 C
53666 Reggae203 B
53650 Topology112 A
53666 History105 B
11111 English102 A (dangling reference: no student with sid 11111 exists in Students)
Students
sid name login age gpa
53666 Jones jones@cs 18 3.4
53688 Smith smith@eecs 18 3.2
53650 Smith smith@math 19 3.8
  • 36. Enforcing Referential Integrity • Consider Students and Enrolled; sid in Enrolled is a foreign key that references Students. • What should be done if an Enrolled tuple with a nonexistent student id is inserted? (Reject it!) • What should be done if a Students tuple is deleted? – Also delete all Enrolled tuples that refer to it? – Disallow deletion of a Students tuple that is referred to? – Set sid in Enrolled tuples that refer to it to a default sid? – (In SQL, also: Set sid in Enrolled tuples that refer to it to a special value null, denoting 'unknown' or 'inapplicable'.) • Similar issues arise if the primary key of a Students tuple is updated.
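Two of the options above (reject a dangling insert, cascade a delete) can be observed directly. The following is a sketch with Python's `sqlite3` module, under these assumptions: SQLite is the DBMS, Students declares sid as its primary key so that `REFERENCES Students` has something to point at, and `ON DELETE CASCADE` is chosen as the delete policy purely for illustration. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute("""CREATE TABLE Students
                (sid CHAR(20) PRIMARY KEY, name CHAR(20), login CHAR(10),
                 age INTEGER, gpa FLOAT)""")
# ON DELETE CASCADE is one of several possible policies (assumed here)
conn.execute("""CREATE TABLE Enrolled
                (sid CHAR(20), cid CHAR(20), grade CHAR(2),
                 PRIMARY KEY (sid, cid),
                 FOREIGN KEY (sid) REFERENCES Students ON DELETE CASCADE)""")

conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
                 [("53666", "Jones", "jones@cs", 18, 3.4),
                  ("53688", "Smith", "smith@eecs", 18, 3.2),
                  ("53650", "Smith", "smith@math", 19, 3.8)])
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)",
                 [("53666", "Carnatic101", "C"),
                  ("53666", "Reggae203", "B"),
                  ("53650", "Topology112", "A"),
                  ("53666", "History105", "B")])

# Inserting an Enrolled tuple for a nonexistent student is rejected
try:
    conn.execute("INSERT INTO Enrolled VALUES ('11111', 'English102', 'A')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Deleting a student also deletes the Enrolled tuples that refer to it
conn.execute("DELETE FROM Students WHERE sid = '53666'")
left = conn.execute("SELECT sid, cid, grade FROM Enrolled").fetchall()
```

Replacing `ON DELETE CASCADE` with `ON DELETE SET NULL`, or omitting the clause so the delete itself is refused, corresponds to the other options in the list.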
  • 37. Integrity Constraints (ICs) • IC: condition that must be true for any instance of the database; e.g., domain constraints. – ICs are specified when schema is defined. – ICs are checked when relations are modified. • A legal instance of a relation is one that satisfies all specified ICs. – DBMS should not allow illegal instances. • If the DBMS checks ICs, stored data is more faithful to real-world meaning. – Avoids data entry errors, too!
  • 38. Where do ICs Come From? • ICs are based upon the semantics of the real world that is being described in the database relations. • We can check a database instance to see if an IC is violated, but we can NEVER infer that an IC is true by looking at an instance. – An IC is a statement about all possible instances! – From the example, we know name is not a key, but the assertion that sid is a key is given to us. • Key and foreign key ICs are the most common; more general ICs are supported too. • In the real world, sometimes the constraint should hold but doesn’t --> data cleaning!
  • 39. Relational Query Languages • A major strength of the relational model: supports simple, powerful querying of data. • Queries can be written intuitively, and the DBMS is responsible for efficient evaluation. – The key: precise semantics for relational queries. – Allows the optimizer to extensively re-order operations, and still ensure that the answer does not change.
  • 40. The SQL Query Language • The most widely used relational query language. – Current std is SQL:2003; SQL92 is a basic subset • To find all 18 year old students, we can write: SELECT * FROM Students S WHERE S.age=18 • To find just names and logins, replace the first line: SELECT S.name, S.login
Students
sid name login age gpa
53666 Jones jones@cs 18 3.4
53688 Smith smith@ee 18 3.2
53650 Smith smith@math 19 3.8
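Both forms of the query can be run against the three-tuple Students instance with Python's `sqlite3` module. This is a sketch with SQLite as a stand-in DBMS; the `ORDER BY S.sid` clause is an addition that is not on the slide, included only so the row order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE Students
                (sid CHAR(20), name CHAR(20), login CHAR(10),
                 age INTEGER, gpa FLOAT)""")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
                 [("53666", "Jones", "jones@cs", 18, 3.4),
                  ("53688", "Smith", "smith@ee", 18, 3.2),
                  ("53650", "Smith", "smith@math", 19, 3.8)])

# All fields of the 18-year-old students (ORDER BY added for determinism)
all_fields = conn.execute(
    "SELECT * FROM Students S WHERE S.age = 18 ORDER BY S.sid").fetchall()

# Same condition, but only names and logins are retained
names_logins = conn.execute(
    "SELECT S.name, S.login FROM Students S WHERE S.age = 18 ORDER BY S.sid"
).fetchall()
```

In relational algebra terms the WHERE clause plays the role of selection and the SELECT list the role of projection.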
  • 41. Querying Multiple Relations • What does the following query compute? SELECT S.name, E.cid FROM Students S, Enrolled E WHERE S.sid=E.sid AND E.grade='A' • Given the following instance of Enrolled:
sid cid grade
53831 Carnatic101 C
53831 Reggae203 B
53650 Topology112 A
53666 History105 B
we get:
S.name E.cid
Smith Topology112
  • 42. Semantics of a Query • A conceptual evaluation method for the previous query: 1. do FROM clause: compute cross-product of Students and Enrolled 2. do WHERE clause: Check conditions, discard tuples that fail 3. do SELECT clause: Delete unwanted fields • Remember, this is conceptual. Actual evaluation will be much more efficient, but must produce the same answers.
  • 43. Cross-product of Students and Enrolled Instances
S.sid S.name S.login S.age S.gpa E.sid E.cid E.grade
53666 Jones jones@cs 18 3.4 53831 Carnatic101 C
53666 Jones jones@cs 18 3.4 53831 Reggae203 B
53666 Jones jones@cs 18 3.4 53650 Topology112 A
53666 Jones jones@cs 18 3.4 53666 History105 B
53688 Smith smith@ee 18 3.2 53831 Carnatic101 C
53688 Smith smith@ee 18 3.2 53831 Reggae203 B
53688 Smith smith@ee 18 3.2 53650 Topology112 A
53688 Smith smith@ee 18 3.2 53666 History105 B
53650 Smith smith@math 19 3.8 53831 Carnatic101 C
53650 Smith smith@math 19 3.8 53831 Reggae203 B
53650 Smith smith@math 19 3.8 53650 Topology112 A
53650 Smith smith@math 19 3.8 53666 History105 B
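The three-step conceptual evaluation (FROM, then WHERE, then SELECT) can be spelled out in plain Python for the query on slide 41. This is only a sketch of the conceptual method; as the lecture says, a real DBMS evaluates the query far more efficiently. Tuples are positional, so the index numbers in the comments refer to the column order of the cross-product table.

```python
from itertools import product

# (sid, name, login, age, gpa)
students = [(53666, "Jones", "jones@cs", 18, 3.4),
            (53688, "Smith", "smith@ee", 18, 3.2),
            (53650, "Smith", "smith@math", 19, 3.8)]
# (sid, cid, grade)
enrolled = [(53831, "Carnatic101", "C"),
            (53831, "Reggae203", "B"),
            (53650, "Topology112", "A"),
            (53666, "History105", "B")]

# 1. FROM: cross-product of Students and Enrolled (3 x 4 = 12 tuples)
cross = [s + e for s, e in product(students, enrolled)]

# 2. WHERE S.sid = E.sid AND E.grade = 'A'
#    index 0 is S.sid, index 5 is E.sid, index 7 is E.grade
kept = [t for t in cross if t[0] == t[5] and t[7] == "A"]

# 3. SELECT S.name, E.cid: drop the unwanted fields
#    index 1 is S.name, index 6 is E.cid
result = [(t[1], t[6]) for t in kept]
```

Only one of the twelve combined tuples survives the WHERE step, and projecting it gives the single answer row (Smith, Topology112).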
  • 44. Relational Model: Summary • A tabular representation of data. • Simple and intuitive, currently the most widely used – Object-relational support in most products – XML support added in SQL:2003, most systems • Integrity constraints can be specified by the DBA, based on application semantics. DBMS checks for violations. – Two important ICs: primary and foreign keys – In addition, we always have domain constraints. • Powerful query languages exist. – SQL is the standard commercial one • DDL - Data Definition Language • DML - Data Manipulation Language
  • 46. Databases for Programmers • Programmers think about objects (structs) – Nested and interleaved • Often want to “persist” these things • Options – encode opaquely and store – translate to a structured form • relational DB, XML file – pros and cons?
  • 47. YUCK!! • How do I “relationalize” my objects? • Have to write a converter for each class? • Think about when to save things into the DB? • Good news: – Can all be automated – With varying amounts of trouble
  • 48. Object-Relational Mappings • Roughly: – Class ~ Entity Set – Instance ~ Entity – Data member ~ Attribute – Reference ~ Foreign Key
  • 49. Details, details • We have to map this down to tables • Which table holds which class of object? • What about relationships? • Solution #1: Declarative Configuration – Write a description file (often in XML) • E.g. Enterprise Java Beans (EJBs) • Solution #2: Convention – Agree to use some conventions • E.g. Rails
  • 50. Ruby on Rails • Ruby: an OO scripting language – and a pretty nice one, too • Rails: a framework for web apps – “convention over configuration” • great for standard web-app stuff! – allows overriding as needed • Very ER-like
  • 51. Rails and ER • Models – Employees – Departments
[ER diagram: entity set Employees (ssn, name, lot), relationship Works_In (since), entity set Departments (did, dname, budget)]
  • 52. Some Rails “Models”
app/models/state.rb
class State < ActiveRecord::Base
  has_many :cities
end
app/models/city.rb
class City < ActiveRecord::Base
  belongs_to :state
end
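What such a mapping has to do under the hood can be sketched by hand in Python. This is not how Rails is implemented; it is a hypothetical illustration of the correspondence Class ~ table, Instance ~ row, Data member ~ column, Reference ~ foreign key. The table names `states` and `cities` and the column `state_id` follow the Rails naming conventions; the `name` attribute and the helpers `save` and `cities_of` are invented for the example.

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class State:            # class ~ table "states"
    id: int
    name: str           # hypothetical attribute

@dataclass
class City:             # class ~ table "cities"
    id: int
    name: str           # hypothetical attribute
    state_id: int       # reference to a State ~ foreign key column

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE states (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("""CREATE TABLE cities
                (id INTEGER PRIMARY KEY, name TEXT,
                 state_id INTEGER REFERENCES states(id))""")

def save(obj, table):
    """Persist one instance as one row; data members become columns."""
    fields = list(vars(obj))
    marks = ", ".join("?" for _ in fields)
    conn.execute(
        f"INSERT INTO {table} ({', '.join(fields)}) VALUES ({marks})",
        [getattr(obj, f) for f in fields])

def cities_of(state):
    """Rough analogue of has_many :cities: follow the foreign key backwards."""
    rows = conn.execute(
        "SELECT id, name, state_id FROM cities WHERE state_id = ? ORDER BY id",
        (state.id,))
    return [City(*r) for r in rows]

ca = State(1, "California")
save(ca, "states")
save(City(1, "Berkeley", ca.id), "cities")
save(City(2, "Oakland", ca.id), "cities")
found = cities_of(ca)
```

A framework that relies on conventions can derive the table and column names from the class names, so none of this plumbing has to be written per class.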
  • 53. Further Reading • Chapter 18 (through 18.3) in Agile Web Development with Rails