Mohammad Imam Hossain, Lecturer, dept. of CSE, UIU. Email: imambuet11@gmail.com
Design Theory
Problems:
• Lots of data repetition.
• A single change (for example, a room change) needs a lot of update operations.
• Deletion causes unexpected data loss.
• Inserting incomplete data causes problems.
Here Lat and Lng depend on Room; Room and Time depend on Class.
That is, Room → { Lat, Lng } and Class → { Room, Time }.
Updated version: a more efficient solution is to decompose the table into 3 different tables based on these dependencies.
Data Anomalies >>
- Problems that occur when we try to cram too much into a single relation are called anomalies.
1. Redundancy: Information may be repeated unnecessarily in several tuples.
2. Update Anomaly: We may change information in one tuple but leave the same information unchanged in another.
3. Delete Anomaly: If a set of values gets deleted, we may lose other information as a side effect.
4. Insert Anomaly: We can’t insert a new row when a value is missing for an attribute that can’t be null.
In the single-table design:
• If every course is in only one room, the table contains redundant information!
• If we update the room number for one tuple, we get inconsistent data = an update anomaly
• If everyone drops the class, we lose what room the class was in! = a delete anomaly
• Similarly, we can’t reserve a room without students = an insert anomaly
After decomposition (without anomalies): is this form better?
• Any Redundancy?
• Any Update anomaly?
• Any Delete anomaly?
• Any Insert anomaly?
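The anomalies can be reproduced on a tiny table. The sketch below is only an illustration: the student names, room codes, and the helper `rooms_per_course` are hypothetical and do not come from the slides; only the course names cs145 and cs245 appear in the original figure.

```python
# Hypothetical single-table design: one row per (student, course); the room is repeated.
enrollments = [
    {"student": "Mary", "course": "cs145", "room": "B01"},
    {"student": "Joe", "course": "cs145", "room": "B01"},
    {"student": "Sam", "course": "cs245", "room": "C12"},
]


def rooms_per_course(rows):
    """Map each course to the set of rooms recorded for it."""
    rooms = {}
    for row in rows:
        rooms.setdefault(row["course"], set()).add(row["room"])
    return rooms


# Update anomaly: changing the room in only one tuple leaves cs145 with two rooms.
enrollments[0]["room"] = "D05"
inconsistent = rooms_per_course(enrollments)

# Delete anomaly: once Sam drops cs245, the room of cs245 disappears as well.
after_drop = [row for row in enrollments if row["student"] != "Sam"]
lost_courses = set(inconsistent) - set(rooms_per_course(after_drop))

# Decomposed design (assumed shape): the room is stored once per course.
course_room = {"cs145": "B01", "cs245": "C12"}
enrolled = [("Mary", "cs145"), ("Joe", "cs145"), ("Sam", "cs245")]
course_room["cs145"] = "D05"  # a single update, nothing left inconsistent
enrolled = [pair for pair in enrolled if pair[0] != "Sam"]  # cs245 keeps its room
```

In the decomposed form a room can also be recorded for a course before any student enrolls, which is the insert case.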
Normalization >>
Normalization is a systematic approach of decomposing tables to eliminate data redundancy (repetition) and
undesirable characteristics like insert, update and delete anomalies.
1.1) Functional Dependency:
- Let A = { A1, A2, …, Am } and B = { B1, B2, …, Bn } be sets of attributes in R.
- The functional dependency A → B holds on R if, for any tuples ti, tj in R:
ti[A] = tj[A] implies ti[B] = tj[B]
that is, whenever two or more tuples in R agree on all the attributes of A, they must also agree on all the attributes of B.
- Attribute by attribute: if the left side is equal, ti[A1] = tj[A1], ti[A2] = tj[A2], …, ti[Am] = tj[Am],
then the right side is also equal, ti[B1] = tj[B1], ti[B2] = tj[B2], …, ti[Bn] = tj[Bn].
- Flow diagram: if ti and tj agree on the attributes of A, they also agree on the attributes of B.
- An FD is a constraint that either holds or does not hold on an instance.
- A particular instance of R may coincidentally satisfy some FD even though this FD does not hold for R in general.
- If the FD holds for every instance of relation R, then the FD becomes a part of the relational schema.
- Example,
i. {position} → {phone} holds for this instance.
ii. {phone} → {position} doesn’t hold for this instance.
- Practice:
A B C
1 2 3
2 2 3
3 2 3
4 3 2
5 2 3
6 3 2
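Checking an FD on an instance is mechanical, so a short sketch may help with the practice. The function name `fd_holds` and the row representation are assumptions made for this illustration; the data is the practice table above.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds on this instance.

    rows is a list of dicts; lhs and rhs are iterables of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[b] for b in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True


columns = ("A", "B", "C")
data = [(1, 2, 3), (2, 2, 3), (3, 2, 3), (4, 3, 2), (5, 2, 3), (6, 3, 2)]
rows = [dict(zip(columns, values)) for values in data]

print(fd_holds(rows, "B", "C"))   # True: B = 2 always pairs with C = 3, B = 3 with C = 2
print(fd_holds(rows, "B", "A"))   # False: B = 2 appears with several values of A
print(fd_holds(rows, "BC", "A"))  # False
```

A result of True only says the FD holds on this instance; it does not show that the FD belongs to the schema.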
- Why we need FDs in Database Design:
i. First, we start with some relational schema (received from the ERD).
ii. [Task 1] Then we find its functional dependencies.
iii. [Task 2] Finally, using these FDs, we design a better schema that minimizes the possibility of anomalies.
1.2) Task 1 (Discover all FDs):
- Armstrong’s Axioms:
i. Reflexivity rule: If α is a set of attributes and β ⊆ α, then α → β holds.
Ex: AB → B, since B is a subset of AB.
ii. Augmentation rule: If α → β holds and γ is a set of attributes, then γα → γβ holds.
Ex: if AB → C holds, then ABD → CD holds.
iii. Transitivity rule: If α → β holds and β → γ holds, then α → γ holds.
Ex: if A → B and B → C, then A → C.
Answers for the practice instance with columns A, B, C:
FD       Result     FD       Result
A → A    Valid      AB → A   Valid
A → B    Valid      AB → B   Valid
A → C    Valid      AB → C   Valid
B → A    Invalid    BC → A   Invalid
B → B    Valid      BC → B   Valid
B → C    Valid      BC → C   Valid
C → A    Invalid    CA → A   Valid
C → B    Valid      CA → B   Valid
C → C    Valid      CA → C   Valid
- Additional Rules:
i. Union rule: If α → β holds and α → γ holds, then α → βγ holds.
Ex: if A → B and A → C, then A → BC.
ii. Decomposition rule: If α → βγ holds, then α → β holds and α → γ holds.
Ex: if A → BC, then A → B and A → C.
iii. Pseudo-transitivity rule: If α → β holds and γβ → δ holds, then αγ → δ holds.
Ex: if A → B and CB → D hold, then CA → D.
- Let, R = (A, B, C, G, H, I) and F = { A → B, A → C, CG → H, CG → I, B → H }
Then:
▹ A → H. Since A → B and B → H hold, we apply the transitivity rule.
▹ CG → HI. Since CG → H and CG → I , the union rule implies that CG → HI
▹ AG → I. Since A → C and CG → I, the pseudo-transitivity rule implies that AG → I holds.
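The three additional rules are not new axioms; each can be derived from reflexivity, augmentation, and transitivity. The derivations below are a sketch in the same α, β, γ, δ notation as the rules above.

```latex
\begin{align*}
\textbf{Union.}\quad
  & \alpha \to \beta,\ \alpha \to \gamma && \text{(given)} \\
  & \alpha \to \alpha\beta && \text{(augment } \alpha \to \beta \text{ with } \alpha\text{)} \\
  & \alpha\beta \to \beta\gamma && \text{(augment } \alpha \to \gamma \text{ with } \beta\text{)} \\
  & \alpha \to \beta\gamma && \text{(transitivity)} \\[6pt]
\textbf{Decomposition.}\quad
  & \alpha \to \beta\gamma && \text{(given)} \\
  & \beta\gamma \to \beta,\ \beta\gamma \to \gamma && \text{(reflexivity)} \\
  & \alpha \to \beta,\ \alpha \to \gamma && \text{(transitivity)} \\[6pt]
\textbf{Pseudo-transitivity.}\quad
  & \alpha \to \beta,\ \gamma\beta \to \delta && \text{(given)} \\
  & \gamma\alpha \to \gamma\beta && \text{(augment } \alpha \to \beta \text{ with } \gamma\text{)} \\
  & \alpha\gamma \to \delta && \text{(transitivity)}
\end{align*}
```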
- Functional Dependency Closure: [out of syllabus]
- Example:
Let, R = (A, B, C, D) and F = {A → B, B → C}
F+ = { … }
- Inefficient process!
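- Closure of Attribute Set:
Let α be a set of attributes. The set of all attributes functionally determined by α under a set F of functional dependencies is called the closure of α under F. We denote it by α+.
Algorithm: start with result = α. Repeatedly, for each FD β → γ in F with β ⊆ result, add γ to result. Stop when a full pass over F adds nothing new.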
Example:
Let, R = (A, B, C, G, H, I) and F = {A → B, A → C, CG → H, CG → I, B → H}
Now, the attribute closure of AG, that is (AG)+:
Initially, (AG)+ = AG
= AGB [using A → B, as A is a part of AG]
= AGBC [using A → C, as A is a part of AGB]
= AGBCH [using CG → H, as CG is a part of AGBC]
= AGBCHI [using CG → I, as CG is a part of AGBCH]
= AGBCHI [using B → H, as B is a part of AGBCHI; no change]
= AGBCHI [every FD has been checked and another pass adds nothing, so we stop]
Now, (AB)+ = AB
= AB [using A → B, as A is a part of AB; no change]
= ABC [using A → C, as A is a part of AB]
= ABCH [using B → H, as B is a part of ABC]
= ABCH [couldn’t use CG → H, as CG is not a part of ABCH]
= ABCH [couldn’t use CG → I, as CG is not a part of ABCH]
= ABCH [no more changes are possible]
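The two worked closures above can be reproduced with a few lines of code. This is a minimal sketch, not the listing from the slides: the function name `closure` and the encoding of each FD as a pair of attribute strings are assumptions chosen for brevity.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:  # repeat until a full pass adds nothing new
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# F = {A -> B, A -> C, CG -> H, CG -> I, B -> H}
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print("".join(sorted(closure("AG", F))))  # ABCGHI
print("".join(sorted(closure("AB", F))))  # ABCH
```

The outer loop matters in general: an FD skipped early in a pass may become applicable after a later FD has added attributes.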
Practice:
If R = (A, B, C, D, E) and F = {B → AC, C → AB, ABC → D, BD → A, AD → C, E → D}
a) Find the attribute closure of every single attribute of R.
b) Find the attribute closure of every set of two attributes of R.
Uses:
▹ Superkey check:
To test if α is a superkey, we compute α+ and check if α+ contains all attributes of R. Ex: (AG)+ = ABCGHI, so AG is a superkey.
▹ FD validity checking:
We can check if a functional dependency α → β holds (or, in other words, is in F+) by checking if β ⊆ α+.
Ex: AG → I is valid, as (AG)+ = ABCGHI contains I.
▹ Determine all FDs: [No need]
For each γ ⊆ R, we find the closure γ+, and for each S ⊆ γ+, we output a functional dependency γ → S.
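The first two uses reduce to one closure computation each. The sketch below assumes the same encoding as before (FDs as pairs of attribute strings); the helper names `is_superkey` and `implies` are made up for this illustration.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """attrs is a superkey if its closure contains every attribute of the schema."""
    return closure(attrs, fds) == set(schema)


def implies(fds, lhs, rhs):
    """lhs -> rhs is in F+ exactly when rhs is a subset of the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)


R = "ABCGHI"
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(is_superkey("AG", R, F))  # True
print(is_superkey("AB", R, F))  # False
print(implies(F, "AG", "I"))    # True
print(implies(F, "A", "G"))     # False
```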
1.3) Different types of Keys:
- Superkey:
Let R be a relation schema. A subset K of R is a superkey of R if, in any legal relation r(R), for all pairs t1 and t2 of tuples in r such that t1 ≠ t2, we have t1[K] ≠ t2[K].
A set X of attributes in R is a superkey of R if and only if X+ contains all attributes of R. In other words, X is a superkey if and only if it determines all other attributes.
- Candidate key:
X is a candidate key if and only if it is a superkey and none of its proper subsets is a superkey.
Algorithm for finding all candidate keys:
Observation 1: every candidate key must contain all attributes that do not appear on the RHS of any functional dependency. (Such attributes cannot be determined by any other attributes.)
Observation 2: if an attribute appears on the RHS of some FD but not on the LHS of any FD, then it cannot be in any candidate key. (Such attributes are determined by others, and no other attributes depend on them.)
Final Algorithm:
1) Find all the attributes that do not appear on the RHS of any FD. Denote this set by α.
2) Denote by β the set of attributes that appear on the RHS of some FD but not on the LHS of any FD.
3) Compute the closure α+. If α+ = R, then α is the only candidate key.
4) If α+ ≠ R, then for each attribute x in R - β, test whether α ∪ { x } is a candidate key. If not, try adding another attribute from R - β to α and test whether the result is a candidate key.
5) Repeat step 4 until all candidate keys have been found.
Example 1:
If R = (A, B, C, D, E) and F = {A → C, CD → B}
then α = { A, D, E }, β = { B }
Now α+ = ABCDE = R
So α is the only candidate key.
Example 2:
If R = (A, B, C, D, E) and F = {A → C, C → BD, D → A}
then α = { E }, β = { B }
Now α+ = { E }, so α is not a superkey/candidate key.
We will test each of { C, E }, { A, E }, { D, E } next (not { B, E }).
{ C, E }+ = { C, E, B, D, A }. Therefore { C, E } is a superkey. { C, E } is also a candidate key since neither { E } nor { C } is a superkey.
{ A, E }+ = { A, E, C, B, D }. Similar to the above, { A, E } is a candidate key.
Similarly, we can verify that { D, E } is a candidate key.
Therefore { C, E }, { A, E }, { D, E } are all of the candidate keys.
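The candidate key search can be sketched in code. This is one possible reading of the algorithm, not the author's implementation: it grows α by subsets of the remaining attributes in order of increasing size and skips any set that already contains a key found earlier. The names `candidate_keys` and `optional` are assumptions of this sketch.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Return all candidate keys of schema under fds as a list of sets."""
    schema = set(schema)
    lhs_attrs = set().union(*(set(lhs) for lhs, _ in fds))
    rhs_attrs = set().union(*(set(rhs) for _, rhs in fds))
    alpha = schema - rhs_attrs    # never on an RHS: part of every candidate key
    beta = rhs_attrs - lhs_attrs  # only on an RHS: part of no candidate key
    optional = sorted(schema - alpha - beta)
    keys = []
    for size in range(len(optional) + 1):
        for extra in combinations(optional, size):
            candidate = alpha | set(extra)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == schema:
                keys.append(candidate)
    return keys


F1 = [("A", "C"), ("CD", "B")]              # Example 1
F2 = [("A", "C"), ("C", "BD"), ("D", "A")]  # Example 2

print(["".join(sorted(k)) for k in candidate_keys("ABCDE", F1)])  # ['ADE']
print(["".join(sorted(k)) for k in candidate_keys("ABCDE", F2)])  # ['AE', 'CE', 'DE']
```

Trying smaller sets first is what guarantees minimality here: any superkey that is not a candidate key contains a smaller key that was already found.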
Practice 1:
If R = (A, B, C, D, E) and F = {A → BC, CD → E, B → D, E → A}
a) Compute the closure of β for each FD β → γ in F.
b) List the candidate keys of R.
Practice 2:
If R = (A, B, C, D, E) and F = {AC, BD, ACD, CDE, EA} then list the candidate keys of R
Practice 3:
If R = (P, Q, R, S, T, U) and F = {PQRTU, PRS, UP, RS, STPU} then list the candidate keys of R.
Practice 4:
If R = (U, V, X, Y, Z) and F = {UVXZ, UXY, XY, VZYX, ZUV} then list the candidate keys of R.
1.4) Extraneous Attribute Detection:
An attribute of a functional dependency is said to be extraneous if we can remove it without changing the
closure of the set of functional dependencies.
Let R be the relation schema, and let F be the given set of functional dependencies that hold on R. Consider an attribute A in a dependency α → β.
• If A ∈ β, to check if A is extraneous, consider the set F’ = (F - {α → β}) ∪ {α → (β - A)} and compute α+ (the closure of α) under F’; if α+ includes A, then A is extraneous in β.
Example:
F = { AB → CD, A → E, E → C }; check whether C is extraneous in AB → CD.
Here F’ = { AB → D, A → E, E → C } and (AB)+ under F’ is ABCDE, which includes C, so C is extraneous.
General pattern: if F = { P → QR, Q → R }, then R is extraneous in P → QR.
• If A ∈ α, to check if A is extraneous, let γ = α - {A} and compute γ+ (the closure of γ) under F; if γ+ includes all attributes in β, then A is extraneous in α.
Example:
F = { P → Q, PQ → R }; check whether Q is extraneous in PQ → R.
Here γ = P and P+ = PQR under F, which includes R, so Q is extraneous.
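Both extraneous-attribute tests translate directly into closure computations. The sketch below uses hypothetical helper names (`extraneous_in_rhs`, `extraneous_in_lhs`) and encodes each FD as a pair of attribute collections; it is meant as an illustration of the two rules, applied to the two examples in this section.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def extraneous_in_rhs(attr, lhs, rhs, fds):
    """Case attr in rhs: build F' with attr dropped from lhs -> rhs, then test lhs+."""
    reduced = [(l, r) for l, r in fds if (set(l), set(r)) != (set(lhs), set(rhs))]
    reduced.append((lhs, set(rhs) - {attr}))
    return attr in closure(lhs, reduced)


def extraneous_in_lhs(attr, lhs, rhs, fds):
    """Case attr in lhs: test whether (lhs - attr)+ under F still covers rhs."""
    gamma = set(lhs) - {attr}
    return set(rhs) <= closure(gamma, fds)


F_a = [("AB", "CD"), ("A", "E"), ("E", "C")]
print(extraneous_in_rhs("C", "AB", "CD", F_a))  # True
print(extraneous_in_rhs("D", "AB", "CD", F_a))  # False

F_b = [("P", "Q"), ("PQ", "R")]
print(extraneous_in_lhs("Q", "PQ", "R", F_b))   # True
print(extraneous_in_lhs("P", "PQ", "R", F_b))   # False
```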
1.5) Minimal Cover (No redundancy):
Given a set F of FDs, we say another set E of FDs is a minimal cover of F if
▸ Every FD in E has a single attribute on the RHS.
▸ F and E are equivalent, that is, every FD in E can be inferred from the FDs in F, and every FD in F can be inferred from the FDs in E.
▸ Every FD A → b in E is minimal in its LHS, that is, there is no proper subset C of A such that C → b.
▸ There is no redundant FD in E, that is, removing any FD from E results in a set of FDs that is not equivalent to F.
Algorithm:
Initially E = F
Step 1: rewrite each FD that has m attributes on the RHS into m FDs where the RHS is a single attribute.
Step 2: remove trivial FDs.
Step 3: minimize the LHS of each FD. For each FD X → y in E, and for each attribute x in X, if X - {x} → y is implied by E, then replace X → y with X - {x} → y.
Step 4: remove redundant FDs. For each FD in E, if it is implied by the other FDs in E, then remove it from E.
Example:
If R = (A, B, C, D, E, F) and F = { ABC → CDEF, C → E, A → B, D → F }
Final minimal cover: F = { AC → D, C → E, A → B, D → F }
Practices:
1. F = { AB → CD, B → C, BC → D, CD → EF, E → F }. Find a minimal cover for this FD set.
Solution: F = { B → CD, CD → E, E → F } is a minimal cover.
2. F = { A → BC, CD → E, E → C, D → AEH, ABH → BD, DH → BC }. Find a minimal cover for this FD set.
Solution: F = { A → BC, D → AEH, AH → D, E → C } is a minimal cover.
3. F = { AB → C, C → A, BC → D, ACD → B, D → E, D → G, BE → C, CG → B, CG → D, CE → A, CE → G }
Solution 1: { AB → C, C → A, BC → D, CD → B, D → E, D → G, BE → C, CG → D, CE → G }
Solution 2: { AB → C, C → A, BC → D, D → E, D → G, BE → C, CG → B, CE → G }
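Worked steps for the example F = { ABC → CDEF, C → E, A → B, D → F }:
Step 1
ABC → C
ABC → D
ABC → E
ABC → F
C → E
A → B
D → F
Step 2
ABC → C (cancel: trivial)
ABC → D
ABC → E
ABC → F
C → E
A → B
D → F
Step 3
C → E
A → B
D → F
AC → D
AC → F
ABC → E (cancel: it reduces to C → E, which is already present)
Step 4
C → E
A → B
D → F
AC → D
AC → F (cancel: implied by AC → D and D → F)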
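The four steps can be sketched in code as follows. This is an illustrative implementation under assumptions: FDs are pairs of attribute strings, attributes in each LHS are tried in alphabetical order, and the function name `minimal_cover` is made up here. A minimal cover is not unique in general, so a different processing order may return a different but equally valid cover.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    """Return one minimal cover as a list of (frozenset lhs, single-attribute rhs)."""
    # Steps 1 and 2: one attribute per RHS, trivial FDs dropped.
    cover = []
    for lhs, rhs in fds:
        for b in rhs:
            fd = (frozenset(lhs), b)
            if b not in lhs and fd not in cover:
                cover.append(fd)
    # Step 3: remove extraneous attributes from each LHS.
    for i in range(len(cover)):
        lhs, b = cover[i]
        for a in sorted(lhs):
            smaller = cover[i][0] - {a}
            if smaller and b in closure(smaller, cover):
                cover[i] = (smaller, b)
    cover = list(dict.fromkeys(cover))  # drop duplicates, keep order
    # Step 4: remove FDs implied by the remaining ones.
    for fd in list(cover):
        rest = [other for other in cover if other != fd]
        if fd[1] in closure(fd[0], rest):
            cover = rest
    return cover


F = [("ABC", "CDEF"), ("C", "E"), ("A", "B"), ("D", "F")]
result = minimal_cover(F)
print(sorted("".join(sorted(lhs)) + " -> " + b for lhs, b in result))
# ['A -> B', 'AC -> D', 'C -> E', 'D -> F']
```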

DBMS 11 | Design Theory [Normalization 1]
