Functional Dependency
& Normalization
Dependency
• A dependency occurs in a database when
information stored in the same database table
uniquely determines other information stored in the
same table.
Functional Dependency
• A functional dependency is defined as a constraint
between two sets of attributes in a relation from a
database.
• Given a relation R, a set of attributes X in R is said to
functionally determine another attribute Y, also in R,
(written X → Y) if and only if each X value is
associated with at most one Y value.
• A Functional Dependency describes a relationship between
attributes within a single relation.
• An attribute is functionally dependent on another if we can
use the value of one attribute to determine the value of
another.
• We use the arrow symbol → to indicate a functional dependency. X → Y is read "X functionally determines Y".
In other words…
X is the determinant set and Y is the dependent
attribute. Thus, given a tuple and the values of the
attributes in X, one can determine the corresponding
value of the Y attribute.
Example
Employee
SSN Name JobType DeptName
557-78-6587 Lance Smith Accountant Salary
214-45-2398 Lance Smith Engineer Product
Note: Name is functionally dependent on SSN because an employee’s name can be uniquely determined from their SSN. Name does not determine SSN, because more than one employee can have the same name.
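The Employee example can be checked mechanically. The sketch below is illustrative only: the helper name `fd_holds` and the dictionary layout are assumptions, not part of the slides. It tests whether a dependency holds in one given set of rows.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs holds in this particular set of rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)   # determinant value
        y = tuple(row[a] for a in rhs)   # dependent value
        # The first time we see x we remember y; later rows must agree.
        if seen.setdefault(x, y) != y:
            return False
    return True

employee = [
    {"SSN": "557-78-6587", "Name": "Lance Smith", "JobType": "Accountant", "DeptName": "Salary"},
    {"SSN": "214-45-2398", "Name": "Lance Smith", "JobType": "Engineer", "DeptName": "Product"},
]

print(fd_holds(employee, ["SSN"], ["Name"]))  # True
print(fd_holds(employee, ["Name"], ["SSN"]))  # False
```

A functional dependency is a constraint on every legal state of the table, so a check like this can refute a dependency from sample data but cannot prove it.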
Functional Dependence (FD)
Fully Functional Dependency
Candidate Functional Dependency (CFD)
• A candidate functional dependency is a functional dependency that includes all attributes of the table. A well-formed dependency diagram must have at least one candidate functional dependency, and there can be more than one candidate functional dependency for a given dependency diagram.
Primary Functional
Dependency (PFD)
• PFD is a candidate functional dependency that
is selected to determine the primary key. The
determinant of PFD is the primary key of the
relational database table. Each dependency
diagram must have one and only one primary
functional dependency. If a relational
database table has only one candidate
functional dependency, then it automatically
becomes the primary functional dependency.
Primary Functional
Dependency (PFD)
• Once the primary key has been determined,
there will be three possible types of functional
dependencies:
– A → B, where a key attribute functionally determines a non-key attribute.
– A → B, where a non-key attribute functionally determines a non-key attribute.
– A → B, where a non-key attribute functionally determines a key attribute.
Partial and Transitive Functional Dependency
• A Partial Functional Dependency is a functional dependency where the determinant consists of key attributes, but not the entire primary key, and the determined attributes are non-key attributes.
• A Transitive Functional Dependency is a functional dependency where the determinant and the determined attributes both consist of non-key attributes.
Multi-Value Dependency (MVD)
• A Multi-Value Dependency (MVD) occurs when two or more independent multi-valued facts about the same attribute occur within the same table. That is, if a relation R has attributes A, B and C, and B and C are multi-valued facts about A, written A →→ B and A →→ C, then a multi-value dependency exists only if B and C are independent of each other.
Trivial Functional Dependency
• Trivial: If an FD X → Y holds where Y is a subset of X, then it is called a trivial FD. Trivial FDs always hold.
• Non-trivial: If an FD X → Y holds where Y is not a subset of X, then it is called a non-trivial FD.
Keys
• A key is a set of attributes that uniquely identifies an entire tuple, whereas a functional dependency expresses a constraint that uniquely identifies the values of certain attributes.
• A candidate key is therefore always a determinant, but a determinant does not need to be a key.
Closure
• Let a relation R have some functional dependencies F
specified. The closure of F (usually written as F+) is the set of
all functional dependencies that may be logically derived from
F.
• F+, the closure, is the set of all the functional dependencies
including F and those that can be deduced from F.
• The closure is important and may, for example, be needed in
finding one or more candidate keys of the relation.
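In practice F+ is rarely enumerated. A common shortcut is the attribute closure X+, the set of attributes determined by X, since X → Y is in F+ exactly when Y is contained in X+. The sketch below assumes a hypothetical relation R(A, B, C, D) with A → B and B → C; the function names and the brute-force key search are illustrative choices, not something the slides prescribe.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know every attribute of lhs, we also know rhs.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure is the whole relation (brute force)."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            s = set(combo)
            if any(k <= s for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(s, fds) == set(all_attrs):
                keys.append(s)
    return keys

# Hypothetical relation R(A, B, C, D) with A -> B and B -> C.
fds = [(("A",), ("B",)), (("B",), ("C",))]
print(closure({"A"}, fds))
print(candidate_keys({"A", "B", "C", "D"}, fds))
```

Here the closure of {A} is {A, B, C}, so A alone is not a key, and the only candidate key is {A, D}. The brute-force search is exponential in the number of attributes and is only meant for small teaching examples.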
Axioms
Developed by Armstrong in 1974, the axioms are three inference rules from which all functional dependencies implied by a given set can be derived. Three further rules follow from them as theorems.
Axioms Cont…
1. Reflexivity Rule --- If X is a set of attributes and Y is a subset of X, then X → Y holds. That is, each subset of X is functionally dependent on X.
2. Augmentation Rule --- If X → Y holds and W is a set of attributes, then WX → WY holds.
3. Transitivity Rule --- If X → Y and Y → Z hold, then X → Z holds.
Derived Theorems from Axioms
4. Union Rule --- If X → Y and X → Z hold, then X → YZ holds.
5. Decomposition Rule --- If X → YZ holds, then so do X → Y and X → Z.
6. Pseudotransitivity Rule --- If X → Y and WY → Z hold, then so does WX → Z.
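The three derived rules follow from reflexivity, augmentation and transitivity alone. One possible derivation of each, written out as a sketch (the step order is one choice among several):

```latex
\begin{align*}
\textbf{Union: } & X \to Y \;\Rightarrow\; X \to XY && \text{(augment with } X\text{)} \\
& X \to Z \;\Rightarrow\; XY \to YZ && \text{(augment with } Y\text{)} \\
& X \to XY,\; XY \to YZ \;\Rightarrow\; X \to YZ && \text{(transitivity)} \\[4pt]
\textbf{Decomposition: } & YZ \to Y && \text{(reflexivity)} \\
& X \to YZ,\; YZ \to Y \;\Rightarrow\; X \to Y && \text{(transitivity)} \\[4pt]
\textbf{Pseudotransitivity: } & X \to Y \;\Rightarrow\; WX \to WY && \text{(augment with } W\text{)} \\
& WX \to WY,\; WY \to Z \;\Rightarrow\; WX \to Z && \text{(transitivity)}
\end{align*}
```

The decomposition argument for X → Z is symmetric, using YZ → Z in place of YZ → Y.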
Normalization
• Data normalization is a technique for organizing the data in a database.
• Normalization of data can be defined as a process during which redundant relation schemas are decomposed by breaking up their attributes into smaller relation schemas that possess desirable properties.
Objectives of Normalization
Normal Forms
First Normal Form
• A relation is said to be in 1NF if and only if
every cell/entry of the relation has at most a
single value. In other words “a relation is in
1NF if and only if all underlying domains
contain atomic values or single value only.”
• The objective of normalizing a table is to
remove its repeating groups and ensure that
all entries of the resulting table have at most a
single value.
Example: Unnormalized table
Course Code  Course Name   Teacher Name  Roll No  Name  System Used  Hourly Rate  Total Hours
C1           Visual Basic  ABC           100      A1    P – I        20           7
                                         101      A2    P – II       30           3
                                         102      A3    Celeron      10           6
                                         103      A4    P – IV       40           1
C2           Oracle & Dev  DEF           100      A1    P – I        20           7
                                         104      A5    P – III      35           3
                                         105      A6    P – II       30           1
                                         101      A2    P – II       30           2
C3           C++           KJP           106      A7    P – IV       40           3
                                         107      A8    P – IV       40           2
                                         108      A9    P – I        20           1
C4           Java          Kumar         109      A10   Cyrix        20           2
Approaches to normalize table
• In general, there are two basic approaches to
normalize tables.
STUDENT (Flattening)
(Normalized Table)
Course Code  Course Name   Teacher Name  Roll No  Name  System Used  Hourly Rate  Total Hours
C1           Visual Basic  ABC           100      A1    P – I        20           7
C1           Visual Basic  ABC           101      A2    P – II       30           3
C1           Visual Basic  ABC           102      A3    Celeron      10           6
C1           Visual Basic  ABC           103      A4    P – IV       40           1
C2           Oracle & Dev  DEF           100      A1    P – I        20           7
C2           Oracle & Dev  DEF           104      A5    P – III      35           3
C2           Oracle & Dev  DEF           105      A6    P – II       30           1
C2           Oracle & Dev  DEF           101      A2    P – II       30           2
C3           C++           KJP           106      A7    P – IV       40           3
C3           C++           KJP           107      A8    P – IV       40           2
C3           C++           KJP           108      A9    P – I        20           1
C4           Java          Kumar         109      A10   Cyrix        20           2
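Flattening simply repeats the course-level values on every student row. A minimal sketch, assuming the unnormalized data is held as nested Python structures (the field names `code`, `name`, `teacher` and `students` are made up for this illustration, and only a few rows are shown):

```python
def flatten(groups):
    """Repeat the course-level values on every student row (flattening to 1NF)."""
    rows = []
    for course in groups:
        for student in course["students"]:
            rows.append((course["code"], course["name"], course["teacher"]) + student)
    return rows

# Each student tuple is (RollNo, Name, SystemUsed, HourlyRate, TotalHours).
unnormalized = [
    {"code": "C1", "name": "Visual Basic", "teacher": "ABC",
     "students": [(100, "A1", "P-I", 20, 7), (101, "A2", "P-II", 30, 3)]},
    {"code": "C4", "name": "Java", "teacher": "Kumar",
     "students": [(109, "A10", "Cyrix", 20, 2)]},
]

flat = flatten(unnormalized)
for row in flat:
    print(row)
```

Every cell of `flat` now holds a single value, at the cost of repeating the course name and teacher once per enrolled student.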
Approaches to normalize table
STUDENT (Decomposition)
(Normalized Table)
COURSE_STUDENT, PK = (Course_code, RollNo)
Course Code  Roll No  Name  System Used  Hourly Rate  Total Hours
C1           100      A1    P – I        20           7
C1           101      A2    P – II       30           3
C1           102      A3    Celeron      10           6
C1           103      A4    P – IV       40           1
C2           100      A1    P – I        20           7
C2           104      A5    P – III      35           3
C2           105      A6    P – II       30           1
C2           101      A2    P – II       30           2
C3           106      A7    P – IV       40           3
C3           107      A8    P – IV       40           2
C3           108      A9    P – I        20           1
C4           109      A10   Cyrix        20           2
COURSE
Course Code  Course Name   Teacher Name
C1           Visual Basic  ABC
C2           Oracle & Dev  DEF
C3           C++           KJP
C4           Java          Kumar
Anomalies in 1NF Relations
• Redundancies in 1NF relations lead to a variety of data anomalies.
1. Insert anomalies:
We cannot insert information about a student until he joins a course, e.g. we cannot store information about roll no 110 until he joins a course. Similarly, we cannot store information about a course until a student enrolls in it.
These anomalies occur because (course_code, rollno) is the composite key and we cannot insert null into either of these two attributes.
Anomalies in 1NF Relations
2. Update anomalies:
This relation is also susceptible to update anomalies because the course in which a student studies may appear many times in the table. If a teacher moves to another course, we face two possibilities: either we search the entire table for that teacher and update his or her course_code value in every matching tuple, or we miss one or more tuples of the STUDENT relation and end up with an inconsistent database. For small tables this type of anomaly may not seem to be much of a problem, but for larger tables it can cause inconsistency.
Anomalies in 1NF Relations
3. Delete anomalies:
This relation experiences deletion anomalies whenever we delete the last tuple of a particular student. In this case, we not only delete the course information that connects that student to a particular course, but also lose other information about the system on which this student works.
For example, if we delete the information of the student with rollno 109, we also lose the information about course_code C4. Likewise, if we delete the information of the Java course, we lose the information about the student A10 (rollno 109).
Second Normal Form (2NF)
• A relation R is in 2NF if and only if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key.
[Figure: functional dependence diagram over Course_Code, RollNo, Name, System_Used, Hourly_Rate, Course_Name, Teacher_Name and Total_Hrs]
Second Normal Form (2NF)
• The 1NF relation with primary key (Course_Code, Rollno) does not satisfy the above rule: the non-key attributes Name, System_Used and Hourly_Rate are not fully dependent on the primary key, because they are functionally dependent on Rollno alone and Rollno is only a subset of the primary key. The relation therefore violates full functional dependence.
Rule to convert 1NF to 2NF
• Consider a relation where a primary key
consists of attributes A and B. These two
attributes determine all other attributes.
• Attribute C is fully dependent on the key.
• Attribute D is partially dependent on the key
because we only need attribute A to
functionally determine it.
• Attributes C and D are non-key attributes.
Rule to convert 1NF to 2NF
• The rule is to replace the original relation by two new relations. The first new relation has three attributes: A, B and C. The primary key of this relation is (A, B), i.e. the primary key of the original relation. The second relation has A and D as its only two attributes. Observe that attribute A has been designated as the primary key of the second relation and that attribute D is now fully dependent on the key.
Rule to transform 1NF to 2NF
1NF: (A*, B*, C, D)
Convert To
2NF: (A*, B*, C) and (A*, D)
(* marks a primary-key attribute)
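The conversion rule amounts to two relational projections. The sketch below assumes a hypothetical relation with attributes A, B, C, D, key (A, B) and the partial dependency A → D; the `project` helper and the sample rows are invented for the illustration.

```python
def project(rows, attrs):
    """Relational projection: keep attrs and drop duplicate tuples, preserving order."""
    seen, out = set(), []
    for row in rows:
        t = tuple(row[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out

# Hypothetical 1NF relation: key (A, B), C depends on the whole key, D on A only.
rows = [
    {"A": 1, "B": 1, "C": "x", "D": "p"},
    {"A": 1, "B": 2, "C": "y", "D": "p"},
    {"A": 2, "B": 1, "C": "z", "D": "q"},
]

r1 = project(rows, ["A", "B", "C"])  # key (A, B)
r2 = project(rows, ["A", "D"])       # key A
print(r1)
print(r2)
```

The value of D for A = 1 was stored twice in the original rows and is stored once in `r2`, which is where the redundancy behind the 1NF anomalies goes away.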
Second Normal Form (2NF)
HOURS_ASSIGNED
Course Code  Roll No  Total Hours
C1           100      7
C1           101      3
C1           102      6
C1           103      1
C2           100      7
C2           104      3
C2           105      1
C2           101      2
C3           106      3
C3           107      2
C3           108      1
C4           109      2
COURSE
Course Code  Course Name   Teacher Name
C1           Visual Basic  ABC
C2           Oracle & Dev  DEF
C3           C++           KJP
C4           Java          Kumar
STUDENT_SYSTEM_CHARGE
Roll No  Name  System Used  Hourly Rate
100      A1    P – I        20
101      A2    P – II       30
102      A3    Celeron      10
103      A4    P – IV       40
104      A5    P – III      35
105      A6    P – II       30
106      A7    P – IV       40
107      A8    P – IV       40
108      A9    P – I        20
109      A10   Cyrix        20
Removal of Anomalies
of 1NF Relations
• Insert Anomalies: It is now possible to insert information about a student who has not joined any course, e.g. we can store information about Rollno 110, who has not joined any course, in the STUDENT_SYSTEM_CHARGE relation. Similarly, we can now store information about a course that has no enrolled student, e.g. we can store that course C1 is Visual Basic in the COURSE table.
Removal of Anomalies
of 1NF Relations
• Update Anomalies: It is now possible to change the teacher for a particular course in the COURSE table through a single modification, so no data inconsistency will arise.
• Delete Anomalies: In the revised structure, we can delete the information of the student with Rollno 109 without losing the information about his course, i.e. C4.
Anomalies in 2NF
• Relations in 2NF are still subject to data anomalies. Let us assume that the system on which a student works functionally determines the hourly rate charged to the student, i.e.
System_Used → Hourly_Rate
• Because of this dependency, anomalies will still occur in 2NF.
Anomalies in 2NF Relations
• Insert Anomalies: Insertion anomalies occur in the Student_System_Charge relation. For example, consider a situation where we would like to set in advance the rate to be charged to the students for a particular system. We cannot insert this information until there is a student assigned to that type of system, because Rollno is the primary key for this relation and we cannot insert a null value into it.
Anomalies in 2NF
• Update Anomalies: Update anomalies will also occur in the Student_System_Charge relation because there may be several students who are working on the same type of system. If the Hourly_Rate for a particular system changes, we need to make sure that the corresponding rate is changed for all the students who work on that type of system. Otherwise, the database may end up in an inconsistent state.
Anomalies in 2NF
• Delete Anomalies: Delete anomalies will also occur in the Student_System_Charge relation. This type of anomaly occurs whenever we delete the tuple of a student who happens to be the only student left working on a particular system. In this case, we will also lose the information about the rate that we charge for that system.
Prime and nonprime attributes
• An attribute of a relation schema R is called
Prime attribute if it is a member of some
candidate key of R.
• An attribute is called non prime if it is not a
prime attribute i.e. it is not the member of any
candidate key.
Third Normal Form (3NF)
• A relation R is in 3NF if and only if the following conditions are satisfied simultaneously:
– R is already in 2NF
– No nonprime attribute functionally determines any other nonprime attribute; in other words, no nonprime attribute is transitively dependent on the key
• The objective of transforming relations into 3NF is to remove all transitive dependencies.
Third Normal Form (3NF)
[Figure: functional dependence diagram over RollNo, Name, System_Used and Hourly_Rate]
Rule to Resolve Transitivity Dependence
2NF: (A*, B, C)
Convert To
3NF: (A*, B) and (B, C)
(* marks a primary-key attribute)
RollNo → System_Used
System_Used → Hourly_Rate
It means RollNo → Hourly_Rate
Third Normal Form (3NF)
2NF: STUDENT_SYSTEM_CHARGE
Roll No  Name  System Used  Hourly Rate
100      A1    P – I        20
101      A2    P – II       30
102      A3    Celeron      10
103      A4    P – IV       40
104      A5    P – III      35
105      A6    P – II       30
106      A7    P – IV       40
107      A8    P – IV       40
108      A9    P – I        20
109      A10   Cyrix        20
Convert To
3NF: STUDENT_SYSTEM
Roll No  Name  System Used
100      A1    P – I
101      A2    P – II
102      A3    Celeron
103      A4    P – IV
104      A5    P – III
105      A6    P – II
106      A7    P – IV
107      A8    P – IV
108      A9    P – I
109      A10   Cyrix
3NF: CHARGES
System Used  Hourly Rate
Celeron      10
Cyrix        20
P – I        20
P – II       30
P – III      35
P – IV       40
Removal of Anomalies
of 2NF Relations
• Insert Anomalies: In the revised structure of STUDENT_SYSTEM and CHARGES, it is possible to insert in advance the rate to be charged to the students for a particular system.
• Update Anomalies: If the Hourly_Rate for a particular system changes, we only need to change a single record in the CHARGES relation for that system.
Removal of Anomalies
of 2NF Relations
• Delete Anomalies: We can delete the tuple of a student who happens to be the only student left working on a particular system without losing the information about the rate that we charge for that system.
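The effect of the 3NF split on updates can be seen with two small lookup tables. This is only a sketch: the dictionaries hold a few of the rows from STUDENT_SYSTEM and CHARGES, and the new rate of 32 is a made-up value.

```python
# RollNo -> System_Used (a few rows of STUDENT_SYSTEM)
student_system = {100: "P-I", 101: "P-II", 105: "P-II", 108: "P-I"}
# System_Used -> Hourly_Rate (a few rows of CHARGES)
charges = {"P-I": 20, "P-II": 30}

def hourly_rate(roll_no):
    """Look up a student's rate through the system they use (a join on System_Used)."""
    return charges[student_system[roll_no]]

charges["P-II"] = 32  # hypothetical new rate: one record changes

print(hourly_rate(101), hourly_rate(105), hourly_rate(100))
```

Both students on P-II see the new rate after a single change, and a rate can be stored in `charges` before any student is assigned to that system.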
Anomalies in 3NF
• Relations in 3NF are still susceptible to data anomalies, particularly when the relation has two overlapping candidate keys or when a non-prime attribute functionally determines a prime attribute. Let us consider an example that illustrates these anomalies:
Anomalies in 3NF
• Consider a Supplier_Part table with the following attributes:
Supplier_Part(Sno, Sname, Pno, Qty)
Let us suppose that Sname is unique for each Sno, as shown below:
Sno Sname Pno Qty
S1 Rahat P1 300
S2 Raju P2 200
S1 Rahat P3 100
S2 Raju P1 200
Anomalies in 3NF
• This relation has two candidate keys, (Sno, Pno) and (Sname, Pno), that overlap on the attribute Pno. The relation is in 3NF because there is a single nonprime attribute, Qty.
• The relation is susceptible to update anomalies, e.g. if one of the suppliers changes its name, we have to make as many changes as the number of parts supplied by that supplier.
Boyce-Codd Normal Form
• Boyce-Codd Normal Form (BCNF) is used to eliminate the anomalies of 3NF.
• A relation R is in BCNF if and only if every determinant is a candidate key.
• Here a determinant is a simple or composite attribute on which some other attribute is fully functionally dependent.
• E.g. in (Sno, Pno) → Qty, (Sno, Pno) is a composite determinant; in Sno → Sname, Sno is a simple determinant.
Boyce-Codd Normal Form
Functional Dependency Diagram of Supplier_Part Relation
[Figure: two functional dependency diagrams over Sno, Sname, Pno and Qty]
FDs of the above relation are:
(Sno, Pno) → Qty
(Sname, Pno) → Qty
Sno → Sname
Sname → Sno
Boyce-Codd Normal Form
• Both relations are in 3NF, because there is only one non-key attribute, Qty, and it is fully functionally dependent (FFD) and non-transitively dependent on the primary key.
• But the Supplier_Part relation is not in BCNF, because it has four determinants:
– (Sno, Pno), (Sname, Pno), (Sno), (Sname)
• Of these four determinants, (Sno, Pno) and (Sname, Pno) are candidate keys, but Sno and Sname are not.
Boyce-Codd Normal Form
• To bring this relation into BCNF, we non-loss decompose it into two projections, SN (Sno, Sname) and SP (Sno, Pno, Qty).
• SN has two determinants, Sno and Sname, and both are candidate keys.
• SP has one determinant, (Sno, Pno), which is also a candidate key.
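The BCNF test for Supplier_Part can be sketched with attribute closures: a determinant passes if its closure covers the whole relation. The function names below are illustrative, and the check uses "superkey" rather than "candidate key", which gives the same verdict for the minimal determinants listed in the slides.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """Non-trivial FDs whose determinant does not determine every attribute."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)]

fds = [
    (("Sno", "Pno"), ("Qty",)),
    (("Sname", "Pno"), ("Qty",)),
    (("Sno",), ("Sname",)),
    (("Sname",), ("Sno",)),
]

print(bcnf_violations({"Sno", "Sname", "Pno", "Qty"}, fds))  # Supplier_Part
print(bcnf_violations({"Sno", "Sname"}, fds[2:]))            # SN
print(bcnf_violations({"Sno", "Pno", "Qty"}, fds[:1]))       # SP
```

Supplier_Part reports Sno → Sname and Sname → Sno as violations, while the projections SN and SP report none.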
Decomposition of tables
• Decomposition means dividing a table into
more than one table. The main purpose of
decomposition is to eliminate redundancy by
decomposing a relation into several relations
in a higher normal form.
• Types of decomposition:
– Lossy decomposition
– Lossless decomposition
Lossy Decomposition
• Lossy decomposition results in the loss of information. Let R be a relation schema. A decomposition of R is a set of relation schemas {R1, R2, …, Rn} such that R = R1 ∪ R2 ∪ … ∪ Rn, where each Ri is a subset of R (for i = 1, 2, …, n).
• For example, for relation R(x, y, z) there can be 2 subsets:
R1(x, z) and R2(y, z)
The union of the attributes of R1 and R2 gives R, i.e. R = R1 ∪ R2
Lossy Decomposition
• The major problem with decomposition is that we may not be able to get the original relation back by joining the instances of the decomposed relations, which results in information loss.
Example : Problem with Decomposition
R
Model Name  Price  Category
a11         100    Canon
s20         200    Nikon
a70         150    Canon
R1
Model Name  Category
a11         Canon
s20         Nikon
a70         Canon
R2
Price  Category
100    Canon
200    Nikon
150    Canon
Example : Problem with Decomposition
Natural join of R1 and R2
Model Name  Price  Category
a11         100    Canon
a11         150    Canon
s20         200    Nikon
a70         100    Canon
a70         150    Canon
R
Model Name  Price  Category
a11         100    Canon
s20         200    Nikon
a70         150    Canon
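The spurious rows in this example can be reproduced with a small natural join. The `natural_join` helper below is a sketch written for this illustration, with rows held as dictionaries and the column name shortened to `Model`.

```python
def natural_join(r1, r2):
    """Natural join of two lists of dicts on their shared attribute names."""
    out = []
    for a in r1:
        for b in r2:
            shared = a.keys() & b.keys()
            if all(a[k] == b[k] for k in shared):
                merged = {**a, **b}
                if merged not in out:  # relations have no duplicate tuples
                    out.append(merged)
    return out

r1 = [{"Model": "a11", "Category": "Canon"},
      {"Model": "s20", "Category": "Nikon"},
      {"Model": "a70", "Category": "Canon"}]
r2 = [{"Price": 100, "Category": "Canon"},
      {"Price": 200, "Category": "Nikon"},
      {"Price": 150, "Category": "Canon"}]

joined = natural_join(r1, r2)
print(len(joined))  # 5
```

The join has five rows against three in R. The two extra rows pair each Canon model with the other Canon price, so the original pairing of model and price can no longer be recovered.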
Loss-less decomposition
• A decomposition {R1, R2,…, Rn} of a relation R is called
a lossless decomposition for R if the natural join of R1,
R2,…, Rn produces exactly the relation R.
• A decomposition is lossless if we can recover the original relation:
R(A, B, C)
Decompose: R1(A, B) and R2(A, C)
Recover: R'(A, B, C), the natural join of R1 and R2
Thus, R' = R
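For a split into two relations there is a standard test based on functional dependencies: the decomposition is lossless if the common attributes determine all of R1 or all of R2. The sketch below assumes A → B for the R(A, B, C) case, and assumes Model → Price, Category for the camera example; neither dependency is stated in the slides.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) tuples."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(r1, r2, fds):
    """Binary lossless-join test: (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2."""
    common = set(r1) & set(r2)
    determined = closure(common, fds)
    return set(r1) <= determined or set(r2) <= determined

# R(A, B, C) split on A, assuming A -> B.
print(is_lossless(("A", "B"), ("A", "C"), [(("A",), ("B",))]))

# Camera example split on Category, assuming Model -> Price, Category.
camera_fds = [(("Model",), ("Price", "Category"))]
print(is_lossless(("Model", "Category"), ("Price", "Category"), camera_fds))
```

The first split is lossless because A determines all of R1. The second is not, because Category alone determines neither side.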
Fourth Normal Form (4NF)
• A relation R is in 4NF if and only if the following conditions are satisfied simultaneously:
– R is already in 3NF or BCNF.
– R contains no multi-valued dependencies.
• Multi-Valued Dependency (MVD)
– An MVD is a dependency where one attribute value is potentially a ‘multi-valued fact’ about another.
Fourth Normal Form (4NF)
• MVD can be defined informally as follows:
– MVDs occur when two or more independent multi-valued facts about the same attribute occur within the same table. That is, if a relation R has attributes A, B and C, and B and C are multi-valued facts about A, written A →→ B and A →→ C, then a multi-valued dependency exists only if B and C are independent of each other.
Fourth Normal Form (4NF)
Course S_Name Text_Book
Physics Ankit Mechanics
Physics Ankit Optics
Physics Rahat Mechanics
Physics Rahat Optics
Chemistry Ankit Org. Chemistry
Chemistry Ankit Inorg. Chemistry
English Raj Eng. Literature
English Raj Eng. Grammar
Course_Student_Book
MVD exists :
Course →→ S_Name
Course →→ Text_Book
Fourth Normal Form (4NF)
• Anomalies of database with MVDs:
– If a new student joins the physics course, we have to make two insertions for that student, one for each physics textbook.
– If the name of a physics textbook changes, we have to update as many records as there are students in the physics course.
– If a physics textbook is deleted, we have to delete as many records as there are students in the physics course.
Fourth Normal Form (4NF)
• To put the Course_Student_Book relation into 4NF, two separate tables are formed as shown below:
Course_Student
Course     S_Name
Physics    Ankit
Physics    Rahat
Chemistry  Ankit
English    Raj
Course_Book
Course     Text_Book
Physics    Mechanics
Physics    Optics
Chemistry  Org. Chemistry
Chemistry  Inorg. Chemistry
English    Eng. Literature
English    Eng. Grammar
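A multi-valued dependency Course →→ S_Name holds exactly when the table equals the join of its two projections on Course. The sketch below checks this for the Course_Student_Book rows; the variable names are illustrative.

```python
rows = {
    ("Physics", "Ankit", "Mechanics"),
    ("Physics", "Ankit", "Optics"),
    ("Physics", "Rahat", "Mechanics"),
    ("Physics", "Rahat", "Optics"),
    ("Chemistry", "Ankit", "Org. Chemistry"),
    ("Chemistry", "Ankit", "Inorg. Chemistry"),
    ("English", "Raj", "Eng. Literature"),
    ("English", "Raj", "Eng. Grammar"),
}

# The two projections that make up the 4NF design.
course_student = {(c, s) for c, s, _ in rows}
course_book = {(c, b) for c, _, b in rows}

# Natural join of the projections on Course.
rejoined = {(c, s, b)
            for c, s in course_student
            for c2, b in course_book
            if c == c2}

print(rejoined == rows)                          # True
print(len(course_student), len(course_book))     # 4 6
```

Eight rows in the original table are replaced by four plus six rows that each state one independent fact, and adding a physics student becomes a single insertion.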
Fifth Normal Form (5NF)
• A relation R is in 5NF if and only if the
following conditions are satisfied
simultaneously:
– R is already in 4NF.
– It cannot be further non-loss decomposed.
• 5NF is of little practical use to the database
designer, but it is of interest from a theoretical
point of view.
Fifth Normal Form (5NF)
• In all of the normal forms discussed so far, non-loss decomposition was achieved by decomposing a single table into two separate tables. Non-loss decomposition is possible because of the availability of the join operator as part of the relational model. In considering 5NF, consideration must be given to tables where non-loss decomposition can only be achieved by decomposition into three or more separate tables.
Fifth Normal Form (5NF)
• Consider the table Agent_Company_Product below, which lists agents, the companies they work for and the products they sell for those companies. The agents do not necessarily sell all the products supplied by the companies they do business with.
Agent_Company_Product
Agent   Company  P_Name
Suneet  ABC      Nut
Suneet  ABC      Screw
Suneet  CDF      Bolt
Raj     ABC      Bolt
Fifth Normal Form (5NF)
• Suppose the table is decomposed into three projections, say P1, P2 and P3:
P1
Agent   Company
Suneet  ABC
Suneet  CDF
Raj     ABC
P2
Agent   P_Name
Suneet  Nut
Suneet  Screw
Suneet  Bolt
Raj     Bolt
P3
Company  P_Name
ABC      Nut
ABC      Screw
ABC      Bolt
CDF      Bolt
Fifth Normal Form (5NF)
• Apply Natural Join to projections P1 and P2 over the Agent column:
Natural Join of P1 & P2
Agent   Company  P_Name
Suneet  ABC      Nut
Suneet  ABC      Screw
Suneet  ABC      Bolt*
Suneet  CDF      Nut*
Suneet  CDF      Screw*
Suneet  CDF      Bolt
Raj     ABC      Bolt
• The resulting table is spurious, since the asterisked rows contain incorrect information.
Fifth Normal Form (5NF)
• Now join this result with projection P3 over the Company and P_Name columns:
Natural Join of (P1 & P2) & P3
Agent   Company  P_Name
Suneet  ABC      Nut
Suneet  ABC      Screw
Suneet  ABC      Bolt*
Suneet  CDF      Bolt
Raj     ABC      Bolt
• The result still contains a spurious row. It is simply not possible to decompose the Agent_Company_Product table without losing information.
Fifth Normal Form (5NF)
• Now consider a different case where, if an agent is an agent for a company and the company makes a product, then he always sells that product for the company. Under these circumstances, the Agent_Company_Product table is shown below:
Agent_Company_Product
Agent   Company  P_Name
Suneet  ABC      Nut
Raj     ABC      Bolt
Raj     ABC      Nut
Suneet  CDF      Bolt
Suneet  ABC      Bolt
Fifth Normal Form (5NF)
• The assumption is that ABC makes both Nuts and Bolts and that CDF makes Bolts only. This table can be decomposed into its three projections without loss of information as shown below:
P1
Agent   Company
Suneet  ABC
Suneet  CDF
Raj     ABC
P2
Agent   P_Name
Suneet  Nut
Suneet  Bolt
Raj     Bolt
Raj     Nut
P3
Company  P_Name
ABC      Nut
ABC      Bolt
CDF      Bolt
Fifth Normal Form (5NF)
• All redundancy is removed. If the natural join of P1 and P2 is taken, the result is:
Natural Join of P1 & P2
Agent   Company  P_Name
Suneet  ABC      Nut
Suneet  ABC      Bolt
Suneet  CDF      Nut*
Suneet  CDF      Bolt
Raj     ABC      Bolt
Raj     ABC      Nut
• The resulting table is spurious, since the asterisked row contains incorrect information.
Fifth Normal Form (5NF)
• Now, if this result is joined with P3 over the columns ‘Company’ and ‘P_Name’, the following table is obtained.
Natural Join of (P1 & P2) & P3
Agent   Company  P_Name
Suneet  ABC      Nut
Suneet  ABC      Bolt
Suneet  CDF      Bolt
Raj     ABC      Bolt
Raj     ABC      Nut
• This is a correct recomposition of the original table, so a non-loss decomposition into the three projections is achieved.
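Both Agent_Company_Product cases can be replayed with a three-way join. The helper `three_way_join` below is a sketch for this example, with rows held as (Agent, Company, P_Name) tuples and the company spelled CDF throughout.

```python
def three_way_join(rows):
    """Project onto the three attribute pairs, then join the projections back."""
    p1 = {(a, c) for a, c, _ in rows}   # Agent, Company
    p2 = {(a, p) for a, _, p in rows}   # Agent, P_Name
    p3 = {(c, p) for _, c, p in rows}   # Company, P_Name
    return {(a, c, p)
            for a, c in p1
            for a2, p in p2
            if a == a2 and (c, p) in p3}

first = {("Suneet", "ABC", "Nut"), ("Suneet", "ABC", "Screw"),
         ("Suneet", "CDF", "Bolt"), ("Raj", "ABC", "Bolt")}

second = {("Suneet", "ABC", "Nut"), ("Raj", "ABC", "Bolt"), ("Raj", "ABC", "Nut"),
          ("Suneet", "CDF", "Bolt"), ("Suneet", "ABC", "Bolt")}

print(three_way_join(first) - first)      # the spurious row
print(three_way_join(second) == second)   # True
```

For the first table the join adds the spurious row (Suneet, ABC, Bolt), so the decomposition loses information. For the second table the join returns exactly the original rows.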
Fifth Normal Form (5NF)
• If the original table is identical to the table formed by decomposing it into a number of tables and then joining those tables together, then the original table violates 5NF.
• Detecting that a table violates 5NF is very difficult in practice, and for this reason this normal form has little if any practical application.
Steps of Normalization
Step 1. Create Unnormalized Relation
Step 2. Separate Repeating & Non-repeating Attributes (result: 1NF)
Step 3. Remove Partial Dependencies (result: 2NF)
Step 4. Remove Transitive Dependencies (result: 3NF)
Step 5. Remove Multi-Valued Dependencies (result: 4NF)
Step 6. Decompose Table Such That Further Decomposition is not Possible (result: 5NF)

More Related Content

PPTX
STACKS IN DATASTRUCTURE
PPTX
Functional dependencies in Database Management System
PPTX
Database Modeling Using Entity.. Weak And Strong Entity Types
PDF
Array data structure
PPTX
Integrity Constraints
PPT
Binary search tree(bst)
PPTX
Normalization in RDBMS
PPTX
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
STACKS IN DATASTRUCTURE
Functional dependencies in Database Management System
Database Modeling Using Entity.. Weak And Strong Entity Types
Array data structure
Integrity Constraints
Binary search tree(bst)
Normalization in RDBMS
FUNCTION DEPENDENCY AND TYPES & EXAMPLE

What's hot (20)

PPTX
Application of Stack For Expression Evaluation by Prakash Zodge DSY 41.pptx
PDF
Normalization | (1NF) |(2NF) (3NF)|BCNF| 4NF |5NF
PPTX
Priority Queue in Data Structure
PPTX
Relational model
PPSX
Functional dependency
PDF
Database Normalization
PPT
Data structures using c
PPTX
serializability in dbms
PPTX
Presentation on array
PDF
Normalization in DBMS
PPTX
Slide 6 er strong & weak entity
PPT
Normalization
PPTX
Entity relation(1)
PPTX
Functional dependancy
PPT
Asymptotic notations
PPTX
Dbms normalization
PDF
All pairs shortest path algorithm
PPTX
Graph representation
PPT
Dbms relational model
PPT
Graph colouring
Application of Stack For Expression Evaluation by Prakash Zodge DSY 41.pptx
Normalization | (1NF) |(2NF) (3NF)|BCNF| 4NF |5NF
Priority Queue in Data Structure
Relational model
Functional dependency
Database Normalization
Data structures using c
serializability in dbms
Presentation on array
Normalization in DBMS
Slide 6 er strong & weak entity
Normalization
Entity relation(1)
Functional dependancy
Asymptotic notations
Dbms normalization
All pairs shortest path algorithm
Graph representation
Dbms relational model
Graph colouring
Ad

Similar to Normalisation (20)

PDF
Normalization.pdf
PPT
Database normalization
PPT
Normmmalizzarion.ppt
PPTX
DBMS_Module 3_Functional Dependencies and Normalization.pptx
PPTX
Unit 3 dbms
PPTX
Normalization-1NF,2NF,Functional dependencies.pptx
PPT
UNIT-IV.ppt
PDF
DBMS unit-3.pdf
PPTX
normalization ppt.pptx
PPTX
functional dependency in database (1).pptx
PPTX
Normalization
PPT
DBMS-Unit-3.0 Functional dependencies.ppt
PPTX
functional dependency in engineering.pptx
PPTX
Database normalization
PPT
Normalization by Sanu
PPTX
L1-Normalization 1NF 2NF 3NF 4NF BCNF.pptx
PDF
chapter 4-Functional Dependency and Normilization.pdf
PDF
UNIT 4 NORMALIZATION AND QUERY OPTIMIZATION 9.pdf
PPTX
Chapter-8 Relational Database Design
Normalization.pdf
Database normalization
Normmmalizzarion.ppt
DBMS_Module 3_Functional Dependencies and Normalization.pptx
Unit 3 dbms
Normalization-1NF,2NF,Functional dependencies.pptx
UNIT-IV.ppt
DBMS unit-3.pdf
normalization ppt.pptx
functional dependency in database (1).pptx
Normalization
DBMS-Unit-3.0 Functional dependencies.ppt
functional dependency in engineering.pptx
Database normalization
Normalization by Sanu
L1-Normalization 1NF 2NF 3NF 4NF BCNF.pptx
chapter 4-Functional Dependency and Normilization.pdf
UNIT 4 NORMALIZATION AND QUERY OPTIMIZATION 9.pdf
Chapter-8 Relational Database Design
Ad

More from Soumyajit Dutta (8)

PPTX
Er model
PPTX
Concurrency control
PPT
Aggregate functions
PPTX
Transaction processing
PPT
Single row functions
PPTX
Chapter 4
PPTX
Chapter 2
PPTX
Computer Organisation Design
Er model
Concurrency control
Aggregate functions
Transaction processing
Single row functions
Chapter 4
Chapter 2
Computer Organisation Design

Recently uploaded (20)

PDF
Supply Chain Operations Speaking Notes -ICLT Program
PPTX
human mycosis Human fungal infections are called human mycosis..pptx
PDF
VCE English Exam - Section C Student Revision Booklet
PPTX
Renaissance Architecture: A Journey from Faith to Humanism
PPTX
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES
PDF
Abdominal Access Techniques with Prof. Dr. R K Mishra
PDF
Origin of periodic table-Mendeleev’s Periodic-Modern Periodic table
PDF
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
PPTX
Cell Types and Its function , kingdom of life
PDF
Business Ethics Teaching Materials for college
PDF
STATICS OF THE RIGID BODIES Hibbelers.pdf
PPTX
PPH.pptx obstetrics and gynecology in nursing
PPTX
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
PPTX
Week 4 Term 3 Study Techniques revisited.pptx
PDF
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
PDF
O5-L3 Freight Transport Ops (International) V1.pdf
PDF
Anesthesia in Laparoscopic Surgery in India
PPTX
Microbial diseases, their pathogenesis and prophylaxis
PDF
Module 4: Burden of Disease Tutorial Slides S2 2025
PDF
Classroom Observation Tools for Teachers
Supply Chain Operations Speaking Notes -ICLT Program
human mycosis Human fungal infections are called human mycosis..pptx
VCE English Exam - Section C Student Revision Booklet
Renaissance Architecture: A Journey from Faith to Humanism
BOWEL ELIMINATION FACTORS AFFECTING AND TYPES
Abdominal Access Techniques with Prof. Dr. R K Mishra
Origin of periodic table-Mendeleev’s Periodic-Modern Periodic table
3rd Neelam Sanjeevareddy Memorial Lecture.pdf
Cell Types and Its function , kingdom of life
Business Ethics Teaching Materials for college
STATICS OF THE RIGID BODIES Hibbelers.pdf
PPH.pptx obstetrics and gynecology in nursing
IMMUNITY IMMUNITY refers to protection against infection, and the immune syst...
Week 4 Term 3 Study Techniques revisited.pptx
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
O5-L3 Freight Transport Ops (International) V1.pdf
Anesthesia in Laparoscopic Surgery in India
Microbial diseases, their pathogenesis and prophylaxis
Module 4: Burden of Disease Tutorial Slides S2 2025
Classroom Observation Tools for Teachers

Normalisation

  • 2. Dependency • A dependency occurs in a database when information stored in the same database table uniquely determines other information stored in the same table.
  • 3. Functional Dependency • A functional dependency is defined as a constraint between two sets of attributes in a relation from a database. • Given a relation R, a set of attributes X in R is said to functionally determine another attribute Y, also in R, (written X → Y) if and only if each X value is associated with at most one Y value.
  • 4. • A Functional Dependency describes a relationship between attributes within a single relation. • An attribute is functionally dependent on another if we can use the value of one attribute to determine the value of another. • We use the arrow symbol → to indicate a functional dependency. X → Y is read X functionally determines Y
  • 5. In other words…. X is the determinant set and Y is the dependent attribute. Thus, given a tuple and the values of the attributes in X, one can determine the corresponding value of the Y attribute.
  • 6. Example: Employee
      SSN | Name | JobType | DeptName
      557-78-6587 | Lance Smith | Accountant | Salary
      214-45-2398 | Lance Smith | Engineer | Product
      Note: Name is functionally dependent on SSN because an employee’s name can be uniquely determined from their SSN. Name does not determine SSN, because more than one employee can have the same name.
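The Employee example can be checked mechanically. The sketch below is illustrative only: `fd_holds` is a hypothetical helper name, and rows are assumed to be stored as Python dictionaries. Note that a table instance can only refute a dependency; a `True` result means the sample data does not contradict it.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True


employee = [
    {"SSN": "557-78-6587", "Name": "Lance Smith",
     "JobType": "Accountant", "DeptName": "Salary"},
    {"SSN": "214-45-2398", "Name": "Lance Smith",
     "JobType": "Engineer", "DeptName": "Product"},
]

print(fd_holds(employee, ["SSN"], ["Name"]))  # True: SSN -> Name
print(fd_holds(employee, ["Name"], ["SSN"]))  # False: Name does not determine SSN
```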
  • 10. Candidate Functional Dependency (CFD) • A candidate functional dependency is a functional dependency that includes all attributes of the table. A well-formed dependency diagram must have at least one candidate functional dependency, and there can be more than one candidate functional dependency for a given dependency diagram.
  • 11. Primary Functional Dependency (PFD) • PFD is a candidate functional dependency that is selected to determine the primary key. The determinant of PFD is the primary key of the relational database table. Each dependency diagram must have one and only one primary functional dependency. If a relational database table has only one candidate functional dependency, then it automatically becomes the primary functional dependency.
  • 12. Primary Functional Dependency (PFD) • Once the primary key has been determined, there will be three possible types of functional dependencies: – A → B, a key attribute functionally determines a non-key attribute. – A → B, a non-key attribute functionally determines a non-key attribute. – A → B, a non-key attribute functionally determines a key attribute.
  • 13. Primary Functional Dependency (PFD) • A Partial Functional Dependency is a functional dependency where the determinant consists of key attributes, but not the entire primary key, and the determined consists of non-key attributes. • A Transitive Functional Dependency is a functional dependency where the determinant and the determined both consist of non-key attributes.
  • 14. Primary Functional Dependency (PFD) • A Multi-Value Dependency (MVD) occurs when two or more independent multi-valued facts about the same attribute occur within the same table. That is, if in a relation R with attributes A, B and C, B and C are multi-valued facts about A (written A →→ B and A →→ C), then a multi-value dependency exists only if B and C are independent of each other.
  • 15. Trivial Functional Dependency • Trivial: If an FD X → Y holds where Y is a subset of X, then it is called a trivial FD. Trivial FDs always hold. • Non-trivial: If an FD X → Y holds where Y is not a subset of X, then it is called a non-trivial FD.
  • 16. Keys • A key is a set of attributes that uniquely identifies an entire tuple, whereas a functional dependency allows us to express constraints that uniquely identify the values of certain attributes. • A candidate key is always a determinant, but a determinant doesn’t need to be a key.
  • 17. Closure • Let a relation R have some functional dependencies F specified. The closure of F (usually written as F+) is the set of all functional dependencies that may be logically derived from F. • F+, the closure, is the set of all the functional dependencies including F and those that can be deduced from F. • The closure is important and may, for example, be needed in finding one or more candidate keys of the relation.
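In practice the full closure F+ is rarely enumerated; a common shortcut is to compute the closure of a set of attributes (all attributes it determines under F) and use that to find candidate keys. The sketch below assumes a hypothetical relation R(A, B, C, D) with F = {A → B, B → C}; the function names are illustrative, not from the slides.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes functionally determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole determinant, we gain its dependents.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Minimal attribute sets whose closure covers the whole relation."""
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(all_attrs):
                keys.append(combo)
    return keys


# Hypothetical example: R(A, B, C, D) with A -> B and B -> C.
fds = [(("A",), ("B",)), (("B",), ("C",))]
print(sorted(closure(["A"], fds)))                 # ['A', 'B', 'C']
print(candidate_keys(["A", "B", "C", "D"], fds))   # [('A', 'D')]
```

The brute-force key search is exponential in the number of attributes, which is acceptable for classroom-sized schemas only.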
  • 18. Axioms • Developed by Armstrong in 1974, these are six inference rules (three axioms and three rules derived from them) from which all functional dependencies implied by a given set can be derived.
  • 19. Axioms Cont… 1. Reflexivity Rule --- If X is a set of attributes and Y is a subset of X, then X → Y holds. Each subset of X is functionally dependent on X. 2. Augmentation Rule --- If X → Y holds and W is a set of attributes, then WX → WY holds. 3. Transitivity Rule --- If X → Y and Y → Z hold, then X → Z holds.
  • 20. Derived Theorems from Axioms 4. Union Rule --- If X → Y and X → Z hold, then X → YZ holds. 5. Decomposition Rule --- If X → YZ holds, then so do X → Y and X → Z. 6. Pseudotransitivity Rule --- If X → Y and WY → Z hold, then so does WX → Z.
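As a worked illustration of why rules 4 to 6 are called derived, each can be obtained from reflexivity, augmentation and transitivity alone. The derivations below are one standard way to do it, written in LaTeX notation:

```latex
\begin{align*}
\textbf{Union.}\quad
  & X \to Y,\; X \to Z            && \text{given} \\
  & X \to XY                      && \text{augment } X \to Y \text{ with } X \\
  & XY \to YZ                     && \text{augment } X \to Z \text{ with } Y \\
  & X \to YZ                      && \text{transitivity} \\[6pt]
\textbf{Decomposition.}\quad
  & X \to YZ                      && \text{given} \\
  & YZ \to Y,\; YZ \to Z          && \text{reflexivity} \\
  & X \to Y,\; X \to Z            && \text{transitivity} \\[6pt]
\textbf{Pseudotransitivity.}\quad
  & X \to Y,\; WY \to Z           && \text{given} \\
  & WX \to WY                     && \text{augment } X \to Y \text{ with } W \\
  & WX \to Z                      && \text{transitivity}
\end{align*}
```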
  • 21. Normalization • Data normalization is a technique for organizing the data in a database. • Normalization of data can be defined as a process during which redundant relation schemas are decomposed by breaking up their attributes into smaller relation schemas that possess desirable properties.
  • 24. First Normal Form • A relation is said to be in 1NF if and only if every cell/entry of the relation has at most a single value. In other words “a relation is in 1NF if and only if all underlying domains contain atomic values or single value only.” • The objective of normalizing a table is to remove its repeating groups and ensure that all entries of the resulting table have at most a single value.
  • 25. Example: Unnormalized table
      Course Code | Course Name | Teacher Name | Roll No | Name | System Used | Hourly Rate | Total Hours
      C1 | Visual Basic | ABC | 100 | A1 | P – I | 20 | 7
         |  |  | 101 | A2 | P – II | 30 | 3
         |  |  | 102 | A3 | Celeron | 10 | 6
         |  |  | 103 | A4 | P – IV | 40 | 1
      C2 | Oracle & Dev | DEF | 100 | A1 | P – I | 20 | 7
         |  |  | 104 | A5 | P – III | 35 | 3
         |  |  | 105 | A6 | P – II | 30 | 1
         |  |  | 101 | A2 | P – II | 30 | 2
      C3 | C++ | KJP | 106 | A7 | P – IV | 40 | 3
         |  |  | 107 | A8 | P – IV | 40 | 2
         |  |  | 108 | A9 | P – I | 20 | 1
      C4 | Java | Kumar | 109 | A10 | Cyrix | 20 | 2
  • 26. Approaches to normalize table • In general, there are two basic approaches to normalizing tables: flattening the table and decomposing it.
  • 27. STUDENT (Flattening) (Normalized Table)
      Course Code | Course Name | Teacher Name | Roll No | Name | System Used | Hourly Rate | Total Hours
      C1 | Visual Basic | ABC | 100 | A1 | P – I | 20 | 7
      C1 | Visual Basic | ABC | 101 | A2 | P – II | 30 | 3
      C1 | Visual Basic | ABC | 102 | A3 | Celeron | 10 | 6
      C1 | Visual Basic | ABC | 103 | A4 | P – IV | 40 | 1
      C2 | Oracle & Dev | DEF | 100 | A1 | P – I | 20 | 7
      C2 | Oracle & Dev | DEF | 104 | A5 | P – III | 35 | 3
      C2 | Oracle & Dev | DEF | 105 | A6 | P – II | 30 | 1
      C2 | Oracle & Dev | DEF | 101 | A2 | P – II | 30 | 2
      C3 | C++ | KJP | 106 | A7 | P – IV | 40 | 3
      C3 | C++ | KJP | 107 | A8 | P – IV | 40 | 2
      C3 | C++ | KJP | 108 | A9 | P – I | 20 | 1
      C4 | Java | Kumar | 109 | A10 | Cyrix | 20 | 2
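Flattening can be sketched in a few lines: the course-level values are copied onto every student row so that each cell holds a single value. This is a minimal illustration using a subset of the slide data; the nested-dictionary layout and the lowercase field names are assumptions made for the example.

```python
def flatten(courses):
    """Repeat the course-level values on every student row (1NF by flattening)."""
    rows = []
    for course in courses:
        shared = {k: v for k, v in course.items() if k != "students"}
        for student in course["students"]:
            rows.append({**shared, **student})
    return rows


# Subset of the unnormalized table: each course carries a repeating group.
unnormalized = [
    {"course_code": "C1", "course_name": "Visual Basic", "teacher_name": "ABC",
     "students": [
         {"roll_no": 100, "name": "A1", "system_used": "P-I",
          "hourly_rate": 20, "total_hours": 7},
         {"roll_no": 101, "name": "A2", "system_used": "P-II",
          "hourly_rate": 30, "total_hours": 3},
     ]},
    {"course_code": "C4", "course_name": "Java", "teacher_name": "Kumar",
     "students": [
         {"roll_no": 109, "name": "A10", "system_used": "Cyrix",
          "hourly_rate": 20, "total_hours": 2},
     ]},
]

rows = flatten(unnormalized)
for row in rows:
    print(row["course_code"], row["teacher_name"], row["roll_no"], row["total_hours"])
```

The repeated course name and teacher name on every row are exactly the redundancy that the later normal forms remove.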
  • 31. STUDENT (Decomposition) (Normalized Table)
      COURSE_STUDENT, PK = (Course_code, RollNo)
      Course Code | Roll No | Name | System Used | Hourly Rate | Total Hours
      C1 | 100 | A1 | P – I | 20 | 7
      C1 | 101 | A2 | P – II | 30 | 3
      C1 | 102 | A3 | Celeron | 10 | 6
      C1 | 103 | A4 | P – IV | 40 | 1
      C2 | 100 | A1 | P – I | 20 | 7
      C2 | 104 | A5 | P – III | 35 | 3
      C2 | 105 | A6 | P – II | 30 | 1
      C2 | 101 | A2 | P – II | 30 | 2
      C3 | 106 | A7 | P – IV | 40 | 3
      C3 | 107 | A8 | P – IV | 40 | 2
      C3 | 108 | A9 | P – I | 20 | 1
      C4 | 109 | A10 | Cyrix | 20 | 2
      COURSE
      Course Code | Course Name | Teacher Name
      C1 | Visual Basic | ABC
      C2 | Oracle & Dev | DEF
      C3 | C++ | KJP
      C4 | Java | Kumar
  • 32. Anomalies in 1NF Relations • Redundancies in 1NF relations lead to a variety of data anomalies. 1. Insert anomalies: We cannot insert information about a student until the student joins a course, e.g. we cannot store information about roll no 110 until that student joins a course. Similarly, we cannot store information about a course until a student enrolls in it. These anomalies occur because (course_code, rollno) is the composite key and we cannot insert null into either of these two attributes.
  • 33. Anomalies in 1NF Relations 2. Update anomalies: This relation is also susceptible to update anomalies because the course in which a student studies may appear many times in the table. If a teacher moves to another course, we face two possibilities: either we search the entire table for that teacher and update every matching tuple, or we miss one or more tuples and end up with an inconsistent database. For small tables this type of anomaly may not seem to be much of a problem, but for larger tables it can lead to inconsistency.
  • 34. Anomalies in 1NF Relations 3. Delete anomalies: This relation experiences deletion anomalies whenever we delete the last tuple of a particular student. In this case, we not only delete the information that connects that student to a particular course, but also lose other information, such as the system on which this student works. For example, if we delete the information of the student with rollno 109, we also lose the information about course_code C4. Likewise, if we delete the information of the Java course, we lose the information about the teacher Kumar.
  • 35. Second Normal Form (2NF) • A relation R is in 2NF if and only if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key. [Functional dependence diagram over Course_Code, RollNo, Name, System_Used, Hourly_Rate, Course_Name, Teacher_Name, Total_Hrs]
  • 36. Second Normal Form (2NF) • The 1NF relation with primary key (Course_Code, Rollno) does not satisfy the above rule: the non-key attributes Name, System_Used and Hourly_Rate are functionally dependent on Rollno alone, and Rollno is only a part of the primary key, so these attributes are not fully functionally dependent on the primary key.
  • 37. Rule to convert 1NF to 2NF • Consider a relation where a primary key consists of attributes A and B. These two attributes determine all other attributes. • Attribute C is fully dependent on the key. • Attribute D is partially dependent on the key because we only need attribute A to functionally determine it. • Attributes C and D are non-key attributes.
  • 38. Rule to convert 1NF to 2NF • The rule is to replace the original relation by two new relations. The first new relation has three attributes: A, B and C. The primary key of this relation is (A, B), i.e. the primary key of the original relation. The second relation has A and D as its only two attributes. Observe that attribute A has been designated as the primary key of the second relation and that attribute D is now fully dependent on the key.
  • 39. Rule to transform 1NF to 2NF • 1NF: (A*, B*, C, D) converts to 2NF: (A*, B*, C) and (A*, D), where * marks a primary-key attribute.
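The conversion rule can be sketched as a small function: any dependency whose determinant is a proper subset of the primary key is a partial dependency, and its dependents move to a new relation keyed by that subset. This is a simplified illustration (`to_2nf` is a hypothetical name, and the dependencies are assumed to be listed explicitly), not a general normalization algorithm.

```python
def to_2nf(attributes, primary_key, fds):
    """Split off attributes that depend on only part of the primary key.

    Returns a list of (attributes, key) pairs, one per resulting relation.
    """
    key = set(primary_key)
    split = {}  # partial determinant -> non-key attributes it determines
    for lhs, rhs in fds:
        lhs = frozenset(lhs)
        if lhs < key:  # proper subset of the key: a partial dependency
            split.setdefault(lhs, set()).update(set(rhs) - key)
    moved = set().union(*split.values())
    relations = [(sorted(set(attributes) - moved), sorted(key))]
    for lhs, dependents in split.items():
        relations.append((sorted(lhs | dependents), sorted(lhs)))
    return relations


# The abstract rule: key (A, B), C fully dependent, D dependent on A only.
abstract = to_2nf(["A", "B", "C", "D"], ["A", "B"],
                  [(("A", "B"), ("C",)), (("A",), ("D",))])
print(abstract)  # [(['A', 'B', 'C'], ['A', 'B']), (['A', 'D'], ['A'])]

# The flattened STUDENT table with key (Course_Code, RollNo).
student = to_2nf(
    ["Course_Code", "RollNo", "Name", "System_Used", "Hourly_Rate",
     "Course_Name", "Teacher_Name", "Total_Hours"],
    ["Course_Code", "RollNo"],
    [(("Course_Code", "RollNo"), ("Total_Hours",)),
     (("Course_Code",), ("Course_Name", "Teacher_Name")),
     (("RollNo",), ("Name", "System_Used", "Hourly_Rate"))],
)
for attrs, key in student:
    print(key, attrs)
```

For the STUDENT table this yields three relations whose attribute sets match HOURS_ASSIGNED, COURSE and STUDENT_SYSTEM_CHARGE.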
  • 40. Second Normal Form (2NF)
      HOURS_ASSIGNED
      Course Code | Roll No | Total Hours
      C1 | 100 | 7
      C1 | 101 | 3
      C1 | 102 | 6
      C1 | 103 | 1
      C2 | 100 | 7
      C2 | 104 | 3
      C2 | 105 | 1
      C2 | 101 | 2
      C3 | 106 | 3
      C3 | 107 | 2
      C3 | 108 | 1
      C4 | 109 | 2
      COURSE
      Course Code | Course Name | Teacher Name
      C1 | Visual Basic | ABC
      C2 | Oracle & Dev | DEF
      C3 | C++ | KJP
      C4 | Java | Kumar
      STUDENT_SYSTEM_CHARGE
      Roll No | Name | System Used | Hourly Rate
      100 | A1 | P – I | 20
      101 | A2 | P – II | 30
      102 | A3 | Celeron | 10
      103 | A4 | P – IV | 40
      100 | A1 | P – I | 20
      104 | A5 | P – III | 35
      105 | A6 | P – II | 30
      101 | A2 | P – II | 30
      106 | A7 | P – IV | 40
      107 | A8 | P – IV | 40
      108 | A9 | P – I | 20
      109 | A10 | Cyrix | 20
  • 41. Removal of Anomalies of 1NF Relations • Insert Anomalies: It is now possible to insert information about a student who has not joined any course, e.g. we can store the information about Rollno 110, who has not joined any course, in the STUDENT_SYSTEM_CHARGE relation. Similarly, we can now store information about a course that has no enrolled students, e.g. we can store that course C1 is Visual Basic in the COURSE table.
  • 42. Removal of Anomalies of 1NF Relations • Update Anomalies: It is now possible to change the teacher for a particular course in the COURSE table through a single modification, so no data inconsistency will arise. • Delete Anomalies: In the revised structure, we can delete the information of the student with Rollno 109 without losing the information about his course, i.e. C4.
  • 43. Anomalies in 2NF • Relations in 2NF are still subject to data anomalies. Let us assume that the system on which a student works functionally determines the hourly rate charged to the student, i.e. System_Used → Hourly_Rate. • Because of this dependency, anomalies will still occur in 2NF.
  • 44. Anomalies in 2NF Relations • Insert Anomalies: Insertion anomalies occur in the Student_System_Charge relation. For example, consider a situation where we would like to set in advance the rate to be charged to students for a particular system. We cannot insert this information until there is a student assigned to that type of system, because Rollno is the primary key for this relation and we cannot insert a null value into it.
  • 45. Anomalies in 2NF • Update Anomalies: Update anomalies will also occur in the Student_System_Charge relation because there may be several students who are working on the same type of system. If the Hourly_Rate for a particular system changes, we need to make sure that the corresponding rate is changed for all the students who work on that type of system. Otherwise, the database may end up in an inconsistent state.
  • 46. Anomalies in 2NF • Delete Anomalies: Delete anomalies will also occur in the Student_System_Charge relation. This type of anomaly occurs whenever we delete the tuple of a student who happens to be the only student left working on a particular system. In this case, we will also lose the information about the rate that we charge for that particular system.
  • 47. Prime and nonprime attributes • An attribute of a relation schema R is called Prime attribute if it is a member of some candidate key of R. • An attribute is called non prime if it is not a prime attribute i.e. it is not the member of any candidate key.
  • 48. Third Normal Form (3NF) • A relation R is in 3NF if and only if the following conditions are satisfied simultaneously: – R is already in 2NF. – No nonprime attribute functionally determines any other nonprime attribute (equivalently, no nonprime attribute is transitively dependent on the key). • The objective of transforming relations into 3NF is to remove all transitive dependencies.
  • 49. Third Normal Form (3NF) • [Functional dependence diagram over RollNo, Name, System_Used, Hourly_Rate] • Rule to resolve transitive dependence: 2NF: (A*, B, C) converts to 3NF: (A*, B) and (B*, C). • RollNo → System_Used and System_Used → Hourly_Rate, which means RollNo → Hourly_Rate.
  • 50. Third Normal Form (3NF) • Converting from 2NF to 3NF:
      STUDENT_SYSTEM_CHARGE (2NF)
      Roll No | Name | System Used | Hourly Rate
      100 | A1 | P – I | 20
      101 | A2 | P – II | 30
      102 | A3 | Celeron | 10
      103 | A4 | P – IV | 40
      100 | A1 | P – I | 20
      104 | A5 | P – III | 35
      105 | A6 | P – II | 30
      101 | A2 | P – II | 30
      106 | A7 | P – IV | 40
      107 | A8 | P – IV | 40
      108 | A9 | P – I | 20
      109 | A10 | Cyrix | 20
      STUDENT_SYSTEM (3NF)
      Roll No | Name | System Used
      100 | A1 | P – I
      101 | A2 | P – II
      102 | A3 | Celeron
      103 | A4 | P – IV
      100 | A1 | P – I
      104 | A5 | P – III
      105 | A6 | P – II
      101 | A2 | P – II
      106 | A7 | P – IV
      107 | A8 | P – IV
      108 | A9 | P – I
      109 | A10 | Cyrix
      CHARGES (3NF)
      System Used | Hourly Rate
      Celeron | 10
      Cyrix | 20
      P – I | 20
      P – II | 30
      P – III | 35
      P – IV | 40
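The 2NF to 3NF step amounts to two projections with duplicate rows removed. The sketch below is illustrative: `project` is a hypothetical helper, the column names are lowercased for the example, and each student is listed once.

```python
COLUMNS = ["roll_no", "name", "system_used", "hourly_rate"]
DATA = [
    (100, "A1", "P-I", 20), (101, "A2", "P-II", 30),
    (102, "A3", "Celeron", 10), (103, "A4", "P-IV", 40),
    (104, "A5", "P-III", 35), (105, "A6", "P-II", 30),
    (106, "A7", "P-IV", 40), (107, "A8", "P-IV", 40),
    (108, "A9", "P-I", 20), (109, "A10", "Cyrix", 20),
]
student_system_charge = [dict(zip(COLUMNS, values)) for values in DATA]


def project(rows, columns):
    """Keep only the given columns and drop duplicate rows."""
    result = []
    for row in rows:
        projected = {c: row[c] for c in columns}
        if projected not in result:
            result.append(projected)
    return result


# Remove the transitive dependency RollNo -> System_Used -> Hourly_Rate.
student_system = project(student_system_charge, ["roll_no", "name", "system_used"])
charges = project(student_system_charge, ["system_used", "hourly_rate"])

print(len(student_system), len(charges))  # 10 6
```

Each system's rate now appears once in `charges`, so a rate change touches a single row.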
  • 51. Removal of Anomalies of 2NF Relations • Insert Anomalies: In the revised structure of STUDENT_SYSTEM and CHARGES, it is possible to insert in advance the rate to be charged to students for a particular system. • Update Anomalies: If the Hourly_Rate for a particular system changes, we need only change a single record in the CHARGES relation for that particular system.
  • 52. Removal of Anomalies of 2NF Relations • Delete Anomalies: We can delete the tuple of a student who happens to be the only student left working on a particular system without losing the information about the rate that we charge for that particular system.
  • 53. Anomalies in 3NF • Relations in 3NF are susceptible to data anomalies, particularly when the relation has two overlapping candidate keys or when a non-prime attribute functionally determines a prime attribute. Let us consider an example which illustrates these anomalies:
  • 54. Anomalies in 3NF • We can take the case of a Supplier_Part table having the following attributes: Supplier_Part(Sno, Sname, Pno, Qty). Let's suppose that Sname is unique for each Sno, as shown below:
      Sno | Sname | Pno | Qty
      S1 | Rahat | P1 | 300
      S2 | Raju | P2 | 200
      S1 | Rahat | P3 | 100
      S2 | Raju | P1 | 200
  • 55. Anomalies in 3NF • This relation has two candidate keys, (Sno, Pno) and (Sname, Pno), that overlap on the attribute Pno. The relation is in 3NF because there is only a single nonprime attribute, Qty. • The relation is susceptible to update anomalies, e.g. if one of the suppliers changes its name, then we have to make as many changes as the number of parts supplied by that particular supplier.
  • 56. Boyce-Codd Normal Form • Boyce-Codd Normal Form (BCNF) is used to eliminate the anomalies of 3NF. • BCNF states that a relation R is in BCNF if and only if every determinant is a candidate key. • Here a determinant is a simple attribute or composite attribute on which some other attribute is fully functionally dependent. • E.g. (Sno, Pno) → Qty, where (Sno, Pno) is a composite determinant; Sno → Sname, where Sno is a simple attribute determinant.
  • 57. Boyce-Codd Normal Form • [Functional dependency diagrams of the Supplier_Part relation over Sno, Pno, Sname, Qty] • The FDs of the above relation are: (Sno, Pno) → Qty; (Sname, Pno) → Qty; Sno → Sname; Sname → Sno.
  • 58. Boyce-Codd Normal Form • Under either choice of primary key the relation is in 3NF, because there is only one non-key attribute, Qty, and it is fully functionally dependent (FFD) and non-transitively dependent on the primary key. • But the Supplier_Part relation is not in BCNF because this relation has four determinants: – (Sno, Pno), (Sname, Pno), (Sno), (Sname) • Out of these four determinants, (Sno, Pno) and (Sname, Pno) are candidate keys, but Sno and Sname are not.
  • 59. Boyce-Codd Normal Form • In order to bring this relation into BCNF, we non-loss decompose it into two projections, SN (Sno, Sname) and SP (Sno, Pno, Qty). • The SN relation has two determinants, Sno and Sname, and both are candidate keys. • SP has one determinant, (Sno, Pno), which is also a candidate key.
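The BCNF test above can be sketched with an attribute-closure helper: a non-trivial dependency violates BCNF when the closure of its determinant does not cover the whole relation. The function names are illustrative, and the dependencies are the four listed for Supplier_Part.

```python
def closure(attrs, fds):
    """Attributes functionally determined by attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Non-trivial FDs whose determinant is not a superkey of the relation."""
    all_attrs = set(attributes)
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)             # skip trivial FDs
            and closure(lhs, fds) != all_attrs]     # determinant is not a key


supplier_part = ["Sno", "Sname", "Pno", "Qty"]
fds = [
    (("Sno", "Pno"), ("Qty",)),
    (("Sname", "Pno"), ("Qty",)),
    (("Sno",), ("Sname",)),
    (("Sname",), ("Sno",)),
]
violations = bcnf_violations(supplier_part, fds)
print(violations)  # [(('Sno',), ('Sname',)), (('Sname',), ('Sno',))]

# After decomposition into SN and SP, every determinant is a key.
sn_ok = bcnf_violations(["Sno", "Sname"], [(("Sno",), ("Sname",)), (("Sname",), ("Sno",))])
sp_ok = bcnf_violations(["Sno", "Pno", "Qty"], [(("Sno", "Pno"), ("Qty",))])
print(sn_ok, sp_ok)  # [] []
```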
  • 60. Decomposition of tables • Decomposition means dividing a table into more than one table. The main purpose of decomposition is to eliminate redundancy by decomposing a relation into several relations in a higher normal form. • Types of decomposition: – Lossy decomposition – Lossless decomposition
  • 61. Lossy Decomposition • Lossy decomposition results in the loss of information. • Let R be a relation. A decomposition of R is a set of relation schemas (R1, R2, …, Rn) such that R = R1 U R2 U … U Rn, where each Ri is a subset of R (for i = 1, 2, …, n). • For example, for relation R(x, y, z) there can be two subsets, R1(x, z) and R2(y, z). If we union the attributes of R1 and R2, we get R, i.e. R = R1 U R2.
  • 62. Lossy Decomposition • The major problem with decomposition is that we may not be able to get the original relation back after joining the instances of the decomposed relations, which results in information loss.
  • 63. Example: Problem with Decomposition
      R
      Model Name | Price | Category
      a11 | 100 | Canon
      s20 | 200 | Nikon
      a70 | 150 | Canon
      R1
      Model Name | Category
      a11 | Canon
      s20 | Nikon
      a70 | Canon
      R2
      Price | Category
      100 | Canon
      200 | Nikon
      150 | Canon
  • 64. Example: Problem with Decomposition
      Natural join of R1 and R2 (over Category)
      Model Name | Price | Category
      a11 | 100 | Canon
      a11 | 150 | Canon
      s20 | 200 | Nikon
      a70 | 100 | Canon
      a70 | 150 | Canon
      R
      Model Name | Price | Category
      a11 | 100 | Canon
      s20 | 200 | Nikon
      a70 | 150 | Canon
  • 65. Loss-less decomposition • A decomposition {R1, R2,…, Rn} of a relation R is called a lossless decomposition for R if the natural join of R1, R2,…, Rn produces exactly the relation R. • A decomposition is lossless if we can recover the original: decompose R(A, B, C) into R1(A, B) and R2(A, C), then recover R’(A, B, C) by joining them. The decomposition is lossless when R’ = R.
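The difference between the two kinds of decomposition can be checked on the camera example by projecting, joining and comparing with the original. This is a minimal sketch: `natural_join`, `project` and `same_relation` are hypothetical helpers written for the illustration, with rows stored as dictionaries.

```python
def project(rows, columns):
    """Keep only the given columns and drop duplicate rows."""
    result = []
    for row in rows:
        projected = {c: row[c] for c in columns}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(r1, r2):
    """Combine rows that agree on every column the two relations share."""
    joined = []
    for a in r1:
        for b in r2:
            common = a.keys() & b.keys()
            if all(a[c] == b[c] for c in common):
                row = {**a, **b}
                if row not in joined:
                    joined.append(row)
    return joined


def same_relation(x, y):
    """Compare two relations as sets of rows, ignoring row order."""
    canon = lambda rows: {tuple(sorted(r.items())) for r in rows}
    return canon(x) == canon(y)


r = [
    {"model": "a11", "price": 100, "category": "Canon"},
    {"model": "s20", "price": 200, "category": "Nikon"},
    {"model": "a70", "price": 150, "category": "Canon"},
]

# Lossy: the shared column (category) does not determine either side.
lossy = natural_join(project(r, ["model", "category"]),
                     project(r, ["price", "category"]))
# Lossless: the shared column (model) determines the rest of the row.
lossless = natural_join(project(r, ["model", "category"]),
                        project(r, ["model", "price"]))

print(len(lossy), same_relation(lossy, r))        # 5 False
print(len(lossless), same_relation(lossless, r))  # 3 True
```

Note that the "loss" shows up as extra spurious rows rather than missing ones: the join can no longer tell which price belongs to which Canon model.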
  • 66. Fourth Normal Form (4NF) • A relation R is in 4NF if and only if the following conditions are satisfied simultaneously: – R is already in 3NF or BCNF. – It contains no multi-valued dependencies. • Multi-Valued Dependency (MVD) – An MVD is a dependency where one attribute value is potentially a ‘multi-valued fact’ about another.
  • 67. Fourth Normal Form (4NF) • MVD can be defined informally as follows: – MVDs occur when two or more independent multi-valued facts about the same attribute occur within the same table. That is, if in a relation R with attributes A, B and C, B and C are multi-valued facts about A (written A →→ B and A →→ C), then a multi-value dependency exists only if B and C are independent of each other.
  • 68. Fourth Normal Form (4NF)
      Course_Student_Book
      Course | S_Name | Text_Book
      Physics | Ankit | Mechanics
      Physics | Ankit | Optics
      Physics | Rahat | Mechanics
      Physics | Rahat | Optics
      Chemistry | Ankit | Org. Chemistry
      Chemistry | Ankit | Inorg. Chemistry
      English | Raj | Eng. Literature
      English | Raj | Eng. Grammar
      MVDs exist: Course →→ S_Name and Course →→ Text_Book
  • 69. Fourth Normal Form (4NF) • Anomalies of a database with MVDs: – If a new student joins the physics course, then we have to make two insertions for that student in the database, which is equal to the number of physics textbooks. – If the name of a physics textbook has to change, we have to update a number of records equal to the number of students in the physics course. – If a physics textbook has to be deleted, then we likewise have to delete one record for each student in the physics course.
  • 70. Fourth Normal Form (4NF) • To put the Course_Student_Book relation into 4NF, two separate tables are formed as shown below:
      Course_Student
      Course | S_Name
      Physics | Ankit
      Physics | Rahat
      Chemistry | Ankit
      English | Raj
      Course_Text_Book
      Course | Text_Book
      Physics | Mechanics
      Physics | Optics
      Chemistry | Org. Chemistry
      Chemistry | Inorg. Chemistry
      English | Eng. Literature
      English | Eng. Grammar
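Whether an MVD holds in a given table can be tested directly from its definition: for any two rows with the same course, the row that takes the student from one and the textbook from the other must also be present. The sketch below is illustrative; `mvd_holds` is a hypothetical helper and the lowercase column names are assumptions.

```python
def mvd_holds(rows, x, y):
    """Check whether the MVD x ->> y holds in this table instance."""
    attrs = list(rows[0].keys())
    existing = {tuple(r[a] for a in attrs) for r in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # Same x values: swap in t1's y values, keep t2's other values.
                mixed = {**t2, **{a: t1[a] for a in y}}
                if tuple(mixed[a] for a in attrs) not in existing:
                    return False
    return True


DATA = [
    ("Physics", "Ankit", "Mechanics"), ("Physics", "Ankit", "Optics"),
    ("Physics", "Rahat", "Mechanics"), ("Physics", "Rahat", "Optics"),
    ("Chemistry", "Ankit", "Org. Chemistry"),
    ("Chemistry", "Ankit", "Inorg. Chemistry"),
    ("English", "Raj", "Eng. Literature"), ("English", "Raj", "Eng. Grammar"),
]
course_student_book = [
    dict(zip(["course", "s_name", "text_book"], values)) for values in DATA
]

print(mvd_holds(course_student_book, ["course"], ["s_name"]))  # True
print(mvd_holds(course_student_book, ["s_name"], ["course"]))  # False

# Dropping one Physics row breaks the independence of students and books.
broken = [r for r in course_student_book
          if (r["s_name"], r["text_book"]) != ("Rahat", "Optics")]
print(mvd_holds(broken, ["course"], ["s_name"]))  # False
```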
  • 71. Fifth Normal Form (5NF) • A relation R is in 5NF if and only if the following conditions are satisfied simultaneously: – R is already in 4NF. – It cannot be further non-loss decomposed. • 5NF is of little practical use to the database designer, but it is of interest from a theoretical point of view.
  • 72. Fifth Normal Form (5NF) • In all of the normal forms discussed so far, non-loss decomposition was achieved by decomposing a single table into two separate tables. Non-loss decomposition is possible because of the availability of the join operator as part of the relational model. In considering 5NF, consideration must be given to tables where non-loss decomposition can only be achieved by decomposition into three or more separate tables.
  • 73. Fifth Normal Form (5NF) • Consider the table Agent_Company_Product below. The table lists agents, the companies they work for and the products they sell for those companies. The agents do not necessarily sell all the products supplied by the companies they do business with.
      Agent_Company_Product
      Agent | Company | P_Name
      Suneet | ABC | Nut
      Suneet | ABC | Screw
      Suneet | CDF | Bolt
      Raj | ABC | Bolt
  • 74. Fifth Normal Form (5NF) • Suppose the table is decomposed into three projections, say P1, P2 and P3:
      P1
      Agent | Company
      Suneet | ABC
      Suneet | CDF
      Raj | ABC
      P2
      Agent | P_Name
      Suneet | Nut
      Suneet | Screw
      Suneet | Bolt
      Raj | Bolt
      P3
      Company | P_Name
      ABC | Nut
      ABC | Screw
      ABC | Bolt
      CDF | Bolt
  • 75. Fifth Normal Form (5NF) • Apply a natural join to projections P1 and P2 over the Agent column:
      Natural Join of P1 & P2
      Agent | Company | P_Name
      Suneet | ABC | Nut
      Suneet | ABC | Screw
      Suneet | ABC | Bolt*
      Suneet | CDF | Nut*
      Suneet | CDF | Screw*
      Suneet | CDF | Bolt
      Raj | ABC | Bolt
      The resulting table is spurious, since the asterisked rows of the table contain incorrect information.
  • 76. Fifth Normal Form (5NF) • Apply a natural join of this result with P3 over the Company and P_Name columns:
      Natural Join of (P1 & P2) & P3
      Agent | Company | P_Name
      Suneet | ABC | Nut
      Suneet | ABC | Screw
      Suneet | ABC | Bolt*
      Suneet | CDF | Bolt
      Raj | ABC | Bolt
      The result still contains a spurious row. It is simply not possible to decompose the Agent_Company_Product table without losing information.
  • 77. Fifth Normal Form (5NF) • Now consider a different case where, if an agent is an agent for a company and the company makes a product that the agent sells, then the agent always sells that product for the company. Under these circumstances, the Agent_Company_Product table is as shown below:
      Agent_Company_Product
      Agent | Company | P_Name
      Suneet | ABC | Nut
      Raj | ABC | Bolt
      Raj | ABC | Nut
      Suneet | CDF | Bolt
      Suneet | ABC | Bolt
  • 78. Fifth Normal Form (5NF) • The assumption is that ABC makes both Nuts and Bolts and that CDF makes Bolts only. This table can be decomposed into its three projections without loss of information, as shown below:
      P1
      Agent | Company
      Suneet | ABC
      Suneet | CDF
      Raj | ABC
      P2
      Agent | P_Name
      Suneet | Nut
      Suneet | Bolt
      Raj | Bolt
      Raj | Nut
      P3
      Company | P_Name
      ABC | Nut
      ABC | Bolt
      CDF | Bolt
  • 79. Fifth Normal Form (5NF) • All redundancy is removed. If the natural join of P1 and P2 is taken, the result is:
      Natural Join of P1 & P2
      Agent | Company | P_Name
      Suneet | ABC | Nut
      Suneet | ABC | Bolt
      Suneet | CDF | Nut*
      Suneet | CDF | Bolt
      Raj | ABC | Bolt
      Raj | ABC | Nut
      The resulting table is spurious, since the asterisked row of the table contains incorrect information.
  • 80. Fifth Normal Form (5NF) • Now, if this result is joined with P3 over the columns ‘Company’ and ‘P_Name’, the following table is obtained:
      Natural Join of (P1 & P2) & P3
      Agent | Company | P_Name
      Suneet | ABC | Nut
      Suneet | ABC | Bolt
      Suneet | CDF | Bolt
      Raj | ABC | Bolt
      Raj | ABC | Nut
      This is a correct recomposition of the original table, so a non-loss decomposition into the three projections is achieved.
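Both cases can be reproduced by projecting onto the three column pairs and joining the projections back together. The sketch below is illustrative only: the helper names and the lowercase column names (`product` stands in for P_Name) are assumptions made for the example.

```python
def project(rows, columns):
    """Keep only the given columns and drop duplicate rows."""
    result = []
    for row in rows:
        projected = {c: row[c] for c in columns}
        if projected not in result:
            result.append(projected)
    return result


def natural_join(r1, r2):
    """Combine rows that agree on every column the two relations share."""
    joined = []
    for a in r1:
        for b in r2:
            if all(a[c] == b[c] for c in a.keys() & b.keys()):
                row = {**a, **b}
                if row not in joined:
                    joined.append(row)
    return joined


def rejoin_three_ways(rows):
    """Project onto the three column pairs and join the projections back."""
    p1 = project(rows, ["agent", "company"])
    p2 = project(rows, ["agent", "product"])
    p3 = project(rows, ["company", "product"])
    return natural_join(natural_join(p1, p2), p3)


def as_set(rows):
    return {(r["agent"], r["company"], r["product"]) for r in rows}


def table(data):
    return [dict(zip(["agent", "company", "product"], values)) for values in data]


first_case = table([("Suneet", "ABC", "Nut"), ("Suneet", "ABC", "Screw"),
                    ("Suneet", "CDF", "Bolt"), ("Raj", "ABC", "Bolt")])
second_case = table([("Suneet", "ABC", "Nut"), ("Raj", "ABC", "Bolt"),
                     ("Raj", "ABC", "Nut"), ("Suneet", "CDF", "Bolt"),
                     ("Suneet", "ABC", "Bolt")])

# First case: the three-way join adds the spurious row (Suneet, ABC, Bolt).
print(as_set(rejoin_three_ways(first_case)) - as_set(first_case))
# Second case: the three-way join reproduces the original table exactly.
print(as_set(rejoin_three_ways(second_case)) == as_set(second_case))  # True
```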
  • 81. Fifth Normal Form (5NF) • If the original table is identical to the table formed by decomposing it into a number of tables and then joining those tables together, then the original table violates 5NF. • Detecting that a table violates 5NF is very difficult in practice, and for this reason this normal form has little if any practical application.
  • 82. Steps of Normalization • Step 1. Create Unnormalized Relation • Step 2. Separate Repeating & Non-repeating Attributes (gives 1NF) • Step 3. Remove Partial Dependencies (gives 2NF) • Step 4. Remove Transitive Dependencies (gives 3NF) • Step 5. Remove Multi-Valued Dependencies (gives 4NF) • Step 6. Decompose Table Such That Further Decomposition is not Possible (gives 5NF)