Dept of CSE | III YEAR | V SEMESTER CS T53 | DATABASE MANAGEMENT SYSTEMS | UNIT 3
1 |Prepared By : Mr. PRABU.U/AP |Dept. of Computer Science and Engineering | SKCET
UNIT III
Relational Database Design: Features of Good Relational Designs – Atomic Domains
and First Normal Form – Second Normal Form – Decomposition Using Functional
Dependencies – Functional Dependency Theory – Algorithms for decomposition –
Decomposition Using Multi-valued Dependencies – More Normal Forms – Database
Design Process – Modeling Temporal Data
3.1 FEATURES OF GOOD RELATIONAL DESIGNS
3.1.1 Design Alternative: Larger Schemas
 Combined Schemas
 Combined Schema without repetition
3.1.2 Design Alternative: Smaller Schemas
3.1.1 Design Alternative: Larger Schemas
It is possible to generate a set of relation schemas directly from the E-R design.
The goodness (or badness) of the resulting set of schemas depends on how good the E-R
design was in the first place.
 Combined Schemas
 Suppose we combine borrower and loan to get
bor_loan = (customer_id, loan_type, amount)
• The result is possible repetition of information (L-1000 in the example below).

loan
loan_type     amount
...           ...
L-1000        1000
...           ...

borrower
customer_id   loan_type
...           ...
C0001         L-1000
C0002         L-1000
...           ...

bor_loan
customer_id   loan_type   amount
C0001         L-1000      1000
C0002         L-1000      1000

Figure 3.1: cust_loan table
 Combined Schemas without repetition
 Consider combining loan_branch and loan
loan_amt_br = (loan_number, amount, branch_name)
 No repetition (as suggested by example below)
loan
loan_number   amount
...           ...
25235         1000
...           ...

Figure 3.2: loan_branch table
3.1.2 Design Alternative: Smaller Schemas
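We need to write a rule that says “if there were a schema (dept_name, budget),
then dept_name is able to serve as the primary key.” This rule is specified as a
functional dependency:
dept_name → budget
Not all decompositions are good. Consider the schema
employee (ID, name, street, city, salary)
and suppose we decompose it into
employee1 (ID, name)
employee2 (name, street, city, salary)
The flaw is that two employees may share the same name. Joining employee1 and
employee2 on name then produces tuples that were never in employee, so we can no
longer tell which address and salary belong to which ID.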
Figure 3.3: Loss of information via a bad decomposition.
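The loss of information can be seen on a small instance. The following is a minimal sketch in Python; the two employee rows (both named "Kim") are assumed sample data, not taken from the notes. It decomposes employee into employee1 and employee2 and joins them back on name.

```python
# Assumed sample rows on employee(ID, name, street, city, salary).
# Two different employees share the name "Kim".
employee = {
    (57766, "Kim", "Main", "Perryridge", 75000),
    (98776, "Kim", "North", "Hampton", 67000),
}

# Decompose into employee1(ID, name) and employee2(name, street, city, salary).
employee1 = {(emp_id, name) for (emp_id, name, _, _, _) in employee}
employee2 = {(name, street, city, salary)
             for (_, name, street, city, salary) in employee}

# Natural join of the two pieces on their only shared attribute, name.
rejoined = {
    (emp_id, name1, street, city, salary)
    for (emp_id, name1) in employee1
    for (name2, street, city, salary) in employee2
    if name1 == name2
}

# Tuples that the join produces but that were never in employee.
spurious = rejoined - employee
print(len(employee), len(rejoined), len(spurious))  # 2 4 2
```

The join returns more tuples than the original relation, yet it carries less information: each ID is now paired with both addresses.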
loan_branch (second table of Figure 3.2)
loan_number   branch_name
...           ...
25235         Anna Nagar
...           ...

loan_amt_br (combined table of Figure 3.2)
loan_number   amount   branch_name
25235         1000     Anna Nagar
3.2 ATOMIC DOMAINS AND FIRST NORMAL FORM
3.2.1 Atomic Domains
3.2.2 First Normal Form
 Employee (unnormalized)
 Employee (normalized – 1 NF)
 Alterations
3.2.1 Atomic Domains
A domain is atomic if elements of the domain are considered to be indivisible
units. We say that a relation schema R is in first normal form (1NF) if the domains of
all attributes of R are atomic.
A set of names is an example of a nonatomic value. Nonatomic values
complicate storage and encourage redundant (repeated) storage of data.
3.2.2 First Normal Form
• A relation is said to be in first normal form if all of its attributes have domains
that are indivisible, or atomic. Such a relation is also called a flat file.
• Each attribute must be atomic: no repeating columns within a row, and no
multi-valued columns.
• Each row of data must have a unique identifier (a primary key).
Employee (unnormalized)
emp_no   name            dept_no   dept_name   skills
1        Kevin Jacobs    201       R&D         C, Perl, Java
2        Barbara Jones   224       IT          Linux, Mac
3        Jake Rivera     201       R&D         DB2, Oracle, Java

Employee (normalized – 1NF)
emp_no   name            dept_no   dept_name   skills
1        Kevin Jacobs    201       R&D         C
1        Kevin Jacobs    201       R&D         Perl
1        Kevin Jacobs    201       R&D         Java
2        Barbara Jones   224       IT          Linux
2        Barbara Jones   224       IT          Mac
3        Jake Rivera     201       R&D         DB2
3        Jake Rivera     201       R&D         Oracle
3        Jake Rivera     201       R&D         Java

Alterations
Update Anomaly
• To update the address of a student who occurs two or more times in a table, the
address column must be updated in every one of those rows.
Insertion Anomaly
• On student admission, sid, sname and address are known but the course is unknown,
which leads to the insertion of a NULL value.
Deletion Anomaly
• If student 101 discontinues a course, deleting that row also deletes the student's
other details.
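The step from the unnormalized Employee table to the 1NF table can be sketched in Python. This is only an illustration: it assumes the skills column is stored as a comma-separated string, and the function name to_first_normal_form is hypothetical.

```python
# Rows of the unnormalized Employee table; skills holds a list in one field.
unnormalized = [
    (1, "Kevin Jacobs", 201, "R&D", "C, Perl, Java"),
    (2, "Barbara Jones", 224, "IT", "Linux, Mac"),
    (3, "Jake Rivera", 201, "R&D", "DB2, Oracle, Java"),
]


def to_first_normal_form(rows):
    """Emit one row per (employee, skill) so that every value is atomic."""
    flat = []
    for emp_no, name, dept_no, dept_name, skills in rows:
        for skill in skills.split(","):
            flat.append((emp_no, name, dept_no, dept_name, skill.strip()))
    return flat


employee_1nf = to_first_normal_form(unnormalized)
print(len(employee_1nf))  # 8
```

Every value is now atomic, but name, dept_no and dept_name are repeated once per skill. That repetition is what the later normal forms remove.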
3.3 SECOND NORMAL FORM
• A relation is said to be in second normal form if it meets both of the following
conditions:
   ➢ The relation is in first normal form.
   ➢ All non-key attributes are functionally dependent on the entire primary
   key.
• Each non-key attribute must depend on the whole primary key, not on just a part of it.
• 2NF improves data integrity.
   ➢ It helps prevent update, insert, and delete anomalies.
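Employee (normalized – 1NF)
• name, dept_no, and dept_name are functionally dependent on emp_no:
(emp_no → name, dept_no, dept_name)
• skills is not functionally dependent on emp_no, since one emp_no can appear with
several skills. The key of the 1NF table is therefore (emp_no, skills), and name,
dept_no, and dept_name depend on only part of it. Moving skills into its own table
removes this partial dependency.
Employee (2NF), Skills (2NF)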
3.4 DECOMPOSITION USING FUNCTIONAL DEPENDENCIES
3.4.1 Keys and Functional Dependencies
3.4.2 Boyce–Codd Normal Form
3.4.3 BCNF and Dependency Preservation
3.4.4 Third Normal Form
3.4.5 Higher Normal Forms
3.4.1 Keys and Functional Dependencies
Keys
 A subset K of R is a super key of r (R) if, in any legal instance of r (R), for all pairs
t1 and t2 of tuples in the instance of r, if t1 ≠ t2, then t1[K] ≠ t2[K].
 That is, no two tuples in any legal instance of relation r (R) may have the same
value on attribute set K.
Tables for Section 3.3:

Employee (normalized – 1NF)
emp_no   name            dept_no   dept_name   skills
1        Kevin Jacobs    201       R&D         C
1        Kevin Jacobs    201       R&D         Perl
1        Kevin Jacobs    201       R&D         Java
2        Barbara Jones   224       IT          Linux
2        Barbara Jones   224       IT          Mac
3        Jake Rivera     201       R&D         DB2
3        Jake Rivera     201       R&D         Oracle
3        Jake Rivera     201       R&D         Java

Employee (2NF)
emp_no   name            dept_no   dept_name
1        Kevin Jacobs    201       R&D
2        Barbara Jones   224       IT
3        Jake Rivera     201       R&D

Skills (2NF)
emp_no   skills
1        C
1        Perl
1        Java
2        Linux
2        Mac
3        DB2
3        Oracle
3        Java
Functional Dependencies
• Y is functionally dependent on X (written X → Y) if the value of X determines
the value of Y: whenever two tuples agree on X, they must also agree on Y.
   ➢ For example, if Y = X + 1, the value of X determines the resulting value of Y.
   ➢ Y is dependent on X as a function of the value of X.
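Whether a functional dependency is violated by a given set of rows can be checked mechanically. The sketch below is illustrative only: the helper name fd_holds and the dictionary representation of rows are assumptions. Note that such a check can refute a dependency on one instance, but it cannot prove that the dependency holds on every legal instance.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is satisfied by these rows (a list of dicts)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # The first Y value seen for each X value must match all later ones.
        if seen.setdefault(x, y) != y:
            return False
    return True


rows = [
    {"emp_no": 1, "name": "Kevin Jacobs", "skills": "C"},
    {"emp_no": 1, "name": "Kevin Jacobs", "skills": "Perl"},
    {"emp_no": 2, "name": "Barbara Jones", "skills": "Linux"},
]
print(fd_holds(rows, ["emp_no"], ["name"]))    # True
print(fd_holds(rows, ["emp_no"], ["skills"]))  # False
```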
3.4.2 Boyce–Codd Normal Form
A relation schema R is in BCNF with respect to a set F of functional dependencies
if, for all functional dependencies in F+ of the form
α → β
where α ⊆ R and β ⊆ R, at least one of the following holds:
• α → β is trivial (i.e., β ⊆ α)
• α is a superkey for R
Example schema not in BCNF:
bor_loan = (customer_id, loan_number, amount)
because loan_number → amount holds on bor_loan but loan_number is not a superkey.
Decomposing a Schema into BCNF
Suppose we have a schema R and a nontrivial dependency α → β that causes a
violation of BCNF.
We decompose R into:
• (α ∪ β)
• (R − (β − α))
In our example,
α = loan_number
β = amount
and bor_loan is replaced by
(α ∪ β) = (loan_number, amount)
(R − (β − α)) = (customer_id, loan_number)
3.4.3 BCNF and Dependency Preservation
 Constraints, including functional dependencies, are costly to check in practice
unless they pertain to only one relation
 If it is sufficient to test only those dependencies on each individual relation of a
decomposition in order to ensure that all functional dependencies hold, then that
decomposition is dependency preserving.
 Because it is not always possible to achieve both BCNF and dependency
preservation, we consider a weaker normal form, known as third normal form.
3.4.4 Third Normal Form
A relation is said to be in third normal form if it meets both of the following
conditions:
• The relation is in second normal form.
• There is no transitive dependency; that is, every non-key attribute depends
only on the primary key and not on another non-key attribute.
Remove transitive dependencies.
• Any transitive dependencies are moved into a smaller (subset) table.
3NF further improves data integrity.
 Prevents update, insert, and delete anomalies.
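Employee (2NF), Skills (2NF)
In Employee (2NF), emp_no → dept_no and dept_no → dept_name, so dept_name depends
on emp_no only transitively. Moving dept_name into a Department table gives:
Employee (3NF), Department (3NF), Skills (3NF)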
3.4.5 Higher Normal Forms
Refer to Sections 3.7 and 3.8.
3.5 FUNCTIONAL DEPENDENCY THEORY
3.5.1 Closure of a Set of Functional Dependencies
3.5.2 Closure of Attribute Sets
3.5.3 Canonical Cover
3.5.4 Lossless-join Decomposition
3.5.5 Dependency Preservation
3.5.1 Closure of a Set of Functional Dependencies
Given a set F of functional dependencies, there are certain other functional
dependencies that are logically implied by F.
For example: if A → B and B → C, then we can infer that A → C.
The set of all functional dependencies logically implied by F is the closure of F.
We denote the closure of F by F+.
We can find all of F+ by applying Armstrong’s Axioms:
• if β ⊆ α, then α → β (reflexivity)
• if α → β, then γα → γβ (augmentation)
• if α → β and β → γ, then α → γ (transitivity)
These rules are
 sound (generate only functional dependencies that actually hold) and
 complete (generate all functional dependencies that hold).
Tables for Section 3.4.4:

Employee (2NF)
emp_no   name            dept_no   dept_name
1        Kevin Jacobs    201       R&D
2        Barbara Jones   224       IT
3        Jake Rivera     201       R&D

Employee (3NF)
emp_no   name            dept_no
1        Kevin Jacobs    201
2        Barbara Jones   224
3        Jake Rivera     201

Department (3NF)
dept_no   dept_name
201       R&D
224       IT

Skills (2NF and 3NF, unchanged)
emp_no   skills
1        C
1        Perl
1        Java
2        Linux
2        Mac
3        DB2
3        Oracle
3        Java
Procedure for Computing F+
F+ = F
repeat
    for each functional dependency f in F+
        apply reflexivity and augmentation rules on f
        add the resulting functional dependencies to F+
    for each pair of functional dependencies f1 and f2 in F+
        if f1 and f2 can be combined using transitivity
            then add the resulting functional dependency to F+
until F+ does not change any further
We can further simplify manual computation of F+ by using the following additional
rules.
• If α → β holds and α → γ holds, then α → βγ holds (union)
• If α → βγ holds, then α → β holds and α → γ holds (decomposition)
• If α → β holds and γβ → δ holds, then αγ → δ holds (pseudotransitivity)
The above rules can be inferred from Armstrong’s axioms.
3.5.2 Closure of Attribute Sets
Given a set of attributes α, define the closure of α under F (denoted by α+) as the
set of attributes that are functionally determined by α under F.
Algorithm to compute α+, the closure of α under F
result := α;
while (changes to result) do
    for each β → γ in F do
    begin
        if β ⊆ result then result := result ∪ γ
    end
Uses of attribute closure
There are several uses of the attribute closure algorithm:
• Testing for superkey
   ➢ To test if α is a superkey, we compute α+ and check if α+ contains all
   attributes of R.
• Testing functional dependencies
   ➢ To check if a functional dependency α → β holds (or, in other words, is in
   F+), just check if β ⊆ α+.
   ➢ That is, we compute α+ by using attribute closure, and then check if it
   contains β.
   ➢ This is a simple and cheap test, and very useful.
• Computing closure of F
   ➢ For each γ ⊆ R, we find the closure γ+, and for each S ⊆ γ+, we output a
   functional dependency γ → S.
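The attribute closure algorithm and its first two uses translate almost line for line into Python. This is a minimal sketch: the function names and the sample schema R = (A, B, C, G, H, I) with its dependencies are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where fds is a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If lhs is already determined, everything in rhs is determined too.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """attrs is a superkey if its closure contains every attribute of the schema."""
    return closure(attrs, fds) >= set(schema)


def fd_implied(lhs, rhs, fds):
    """lhs -> rhs is in F+ exactly when rhs is contained in the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)


# Assumed example: A -> B, A -> C, CG -> H, CG -> I, B -> H
R = set("ABCGHI")
F = [(set("A"), set("B")), (set("A"), set("C")), (set("CG"), set("H")),
     (set("CG"), set("I")), (set("B"), set("H"))]

print(sorted(closure("AG", F)))  # ['A', 'B', 'C', 'G', 'H', 'I']
print(is_superkey("AG", R, F))   # True
print(is_superkey("A", R, F))    # False
print(fd_implied("A", "H", F))   # True
```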
3.5.3 Canonical Cover
A canonical cover for F is a set of dependencies Fc such that
 F logically implies all dependencies in Fc, and
 Fc logically implies all dependencies in F, and
 No functional dependency in Fc contains an extraneous attribute, and
 Each left side of functional dependency in Fc is unique.
To compute a canonical cover for F
repeat
    Use the union rule to replace any dependencies in F of the form
        α1 → β1 and α1 → β2 with α1 → β1β2
    Find a functional dependency α → β with an extraneous attribute either in α or
    in β
    If an extraneous attribute is found, delete it from α → β
until F does not change
Computing a Canonical Cover
R = (A, B, C)
F = {A → BC, B → C, A → B, AB → C}
• Combine A → BC and A → B into A → BC
   ➢ Set is now {A → BC, B → C, AB → C}
• A is extraneous in AB → C
   ➢ Check if the result of deleting A from AB → C is implied by the other
   dependencies
      ✓ Yes: in fact, B → C is already present!
   ➢ Set is now {A → BC, B → C}
• C is extraneous in A → BC
   ➢ Check if A → C is logically implied by A → B and the other dependencies
      ✓ Yes: using transitivity on A → B and B → C.
• The canonical cover is:
A → B
B → C
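The canonical cover procedure can be sketched in Python as follows. The function names are hypothetical, and as a simplification the sketch never empties a left-hand side. It removes extraneous attributes in a different order from the hand computation, but on this F it reaches the same cover; in general a canonical cover need not be unique.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_reduction(fc):
    """Return (index, reduced dependency) for one extraneous attribute, or None."""
    for i, (lhs, rhs) in enumerate(fc):
        for a in sorted(lhs):
            # a is extraneous in lhs if (lhs - a) -> rhs already follows from fc.
            if len(lhs) > 1 and rhs <= closure(lhs - {a}, fc):
                return i, (lhs - {a}, rhs)
        for a in sorted(rhs):
            # a is extraneous in rhs if lhs -> a still follows once a is dropped.
            rest = fc[:i] + [(lhs, rhs - {a})] + fc[i + 1:]
            if a in closure(lhs, rest):
                return i, (lhs, rhs - {a})
    return None


def canonical_cover(fds):
    fc = [(frozenset(lhs), frozenset(rhs)) for lhs, rhs in fds]
    while True:
        # Union rule: merge dependencies that share a left-hand side.
        merged = {}
        for lhs, rhs in fc:
            merged[lhs] = merged.get(lhs, frozenset()) | rhs
        fc = [(lhs, rhs) for lhs, rhs in merged.items() if rhs]
        reduction = find_reduction(fc)
        if reduction is None:
            return fc
        index, reduced = reduction
        fc[index] = reduced


F = [("A", "BC"), ("B", "C"), ("A", "B"), ("AB", "C")]
for lhs, rhs in canonical_cover(F):
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
# A -> B
# B -> C
```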
3.5.4 Lossless-join Decomposition
A decomposition of R into R1 and R2 is lossless if, for all legal database
instances, relation r contains the same set of tuples as the result of the following
SQL query:
select * from (select R1 from r) natural join (select R2 from r)
This is stated more concisely in the relational algebra as:
ΠR1(r) ⋈ ΠR2(r) = r
R1 and R2 form a lossless decomposition of R if at least one of the following
functional dependencies is in F+:
R1 ∩ R2 → R1
R1 ∩ R2 → R2
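This test reduces to one attribute-closure computation on R1 ∩ R2. A minimal sketch follows, with hypothetical function names, using R = (A, B, C) and F = {A → B, B → C} as the sample input.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Check whether R1 ∩ R2 -> R1 or R1 ∩ R2 -> R2 is in F+."""
    common = set(r1) & set(r2)
    reachable = closure(common, fds)
    return set(r1) <= reachable or set(r2) <= reachable


F = [(set("A"), set("B")), (set("B"), set("C"))]
print(is_lossless("AB", "BC", F))  # True
print(is_lossless("AB", "AC", F))  # True
print(is_lossless("AC", "BC", F))  # False
```

The third call fails the test because the only common attribute, C, determines nothing beyond itself.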
Example
R = (A, B, C)
F = {A → B, B → C}
R can be decomposed in two different ways:
R1 = (A, B), R2 = (B, C)
• Lossless-join decomposition:
R1 ∩ R2 = {B} and B → BC
• Dependency preserving
R1 = (A, B), R2 = (A, C)
• Lossless-join decomposition:
R1 ∩ R2 = {A} and A → AB
• Not dependency preserving (B → C cannot be checked without computing R1 ⋈ R2)
3.5.5 Dependency Preservation
Let Fi be the set of dependencies in F+ that include only attributes in Ri.
• A decomposition is dependency preserving if (F1 ∪ F2 ∪ … ∪ Fn)+ = F+
• If it is not, then checking updates for violation of functional dependencies may
require computing joins, which is expensive.
Testing for Dependency Preservation
• To check if a dependency α → β is preserved in a decomposition of R into R1, R2,
…, Rn we apply the following test (with attribute closure done with respect to F)
result = α
while (changes to result) do
    for each Ri in the decomposition
        t = (result ∩ Ri)+ ∩ Ri
        result = result ∪ t
• If result contains all attributes in β, then the functional dependency α → β is
preserved.
• We apply the test on all dependencies in F to check if a decomposition is
dependency preserving.
• This procedure takes polynomial time, instead of the exponential time required
to compute F+ and (F1 ∪ F2 ∪ … ∪ Fn)+
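The preservation test can be written directly from the loop above. This is a sketch with hypothetical function names; it is run on the two decompositions of R = (A, B, C) with F = {A → B, B → C}.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def fd_preserved(lhs, rhs, decomposition, fds):
    """Test one dependency lhs -> rhs without computing F+."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            ri = set(ri)
            # t = (result ∩ Ri)+ ∩ Ri, with the closure taken under F.
            t = closure(result & ri, fds) & ri
            if not t <= result:
                result |= t
                changed = True
    return set(rhs) <= result


def is_dependency_preserving(decomposition, fds):
    return all(fd_preserved(lhs, rhs, decomposition, fds) for lhs, rhs in fds)


F = [(set("A"), set("B")), (set("B"), set("C"))]
print(is_dependency_preserving(["AB", "BC"], F))  # True
print(is_dependency_preserving(["AB", "AC"], F))  # False
```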
3.6 ALGORITHMS FOR DECOMPOSITION
3.6.1 BCNF Decomposition
3.6.2 3NF Decomposition
3.6.3 Correctness of the 3NF Algorithm
3.6.4 Comparison of BCNF and 3NF
3.6.1 BCNF Decomposition
The definition of BCNF can be used directly to test if a relation is in BCNF.
However, computation of F+ can be a tedious task.
Testing for BCNF
To check if a nontrivial dependency α → β causes a violation of BCNF:
1. compute α+ (the attribute closure of α), and
2. verify that it includes all attributes of R, that is, it is a superkey of R.
• Simplified test: To check if a relation schema R is in BCNF, it suffices to check
only the dependencies in the given set F for violation of BCNF, rather than
checking all dependencies in F+.
• If none of the dependencies in F causes a violation of BCNF, then none of the
dependencies in F+ will cause a violation of BCNF either.
However, using only F is incorrect when testing a relation in a decomposition of R.
Consider R = (A, B, C, D, E), with F = {A → B, BC → D}
• Decompose R into R1 = (A, B) and R2 = (A, C, D, E)
• Neither of the dependencies in F contains only attributes from (A, C, D, E), so we
might be misled into thinking R2 satisfies BCNF.
• In fact, dependency AC → D in F+ shows R2 is not in BCNF.
BCNF Decomposition Algorithm
If R is not in BCNF, we can decompose R into a collection of BCNF schemas R1,
R2, . . . , Rn by the following algorithm. The algorithm uses dependencies that
demonstrate violation of BCNF to perform the decomposition.
result := {R};
done := false;
compute F+;
while (not done) do
    if (there is a schema Ri in result that is not in BCNF)
    then begin
        let α → β be a nontrivial functional dependency that holds on Ri
            such that α → Ri is not in F+,
            and α ∩ β = ∅;
        result := (result – Ri) ∪ (Ri – β) ∪ (α, β);
    end
    else done := true;
Note: each Ri is in BCNF, and the decomposition is lossless-join.
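A Python sketch of the decomposition loop is given below. Instead of materializing F+, it looks for a violation on each Ri by taking the attribute closure of every proper subset α of Ri, a substitution made here for brevity that is exponential in the size of Ri and therefore suitable only for small schemas. The function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(ri, fds):
    """Return (alpha, beta) with alpha -> beta violating BCNF on ri, or None."""
    for size in range(1, len(ri)):
        for alpha in combinations(sorted(ri), size):
            alpha = frozenset(alpha)
            reach = closure(alpha, fds)
            beta = (reach & ri) - alpha
            # Violation: alpha determines something in ri but is not a superkey of ri.
            if beta and not ri <= reach:
                return alpha, frozenset(beta)
    return None


def bcnf_decompose(schema, fds):
    result = [frozenset(schema)]
    done = False
    while not done:
        done = True
        for ri in result:
            violation = find_violation(ri, fds)
            if violation:
                alpha, beta = violation
                # Replace Ri by (Ri - beta) and (alpha, beta), then rescan.
                result.remove(ri)
                result += [ri - beta, alpha | beta]
                done = False
                break
    return result


bor_loan = {"customer_id", "loan_number", "amount"}
F = [({"loan_number"}, {"amount"})]
print(sorted(sorted(r) for r in bcnf_decompose(bor_loan, F)))
# [['amount', 'loan_number'], ['customer_id', 'loan_number']]
```

On R = (A, B, C, D, E) with F = {A → B, BC → D}, the same sketch splits off (A, B) first and then finds AC → D on (A, C, D, E), which is the dependency that a test against F alone would miss.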
3.6.2 3NF Decomposition
There are some situations where BCNF is not dependency preserving, and
efficient checking for FD violation on updates is important.
Solution: Define a weaker normal form, called Third Normal Form (3NF)
 Allows some redundancy
 But functional dependencies can be checked on individual relations without
computing a join.
 There is always a lossless-join, dependency preserving decomposition into 3NF.
Let Fc be a canonical cover for F;
i := 0;
for each functional dependency α → β in Fc do
    if none of the schemas Rj, 1 ≤ j ≤ i contains αβ
    then begin
        i := i + 1;
        Ri := αβ
    end
if none of the schemas Rj, 1 ≤ j ≤ i contains a candidate key for R
then begin
    i := i + 1;
    Ri := any candidate key for R;
end
return (R1, R2, ..., Ri)
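The 3NF algorithm can be sketched in Python as below. The sketch assumes that its input fc is already a canonical cover, and it picks a candidate key by greedily dropping attributes from the full schema; both the function names and that key-finding strategy are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_key(schema, fds):
    """Shrink the full schema to one candidate key by dropping attributes."""
    key = set(schema)
    for a in sorted(schema):
        if closure(key - {a}, fds) >= set(schema):
            key.discard(a)
    return frozenset(key)


def synthesize_3nf(schema, fc):
    """fc is assumed to be a canonical cover, given as (lhs, rhs) set pairs."""
    schemas = []
    for lhs, rhs in fc:
        alpha_beta = frozenset(lhs) | frozenset(rhs)
        if not any(alpha_beta <= rj for rj in schemas):
            schemas.append(alpha_beta)
    # Some Rj contains a candidate key exactly when that Rj is a superkey of R.
    if not any(closure(rj, fc) >= set(schema) for rj in schemas):
        schemas.append(candidate_key(schema, fc))
    return schemas


Fc = [(set("A"), set("B")), (set("B"), set("C"))]
print([sorted(r) for r in synthesize_3nf("ABC", Fc)])
# [['A', 'B'], ['B', 'C']]
```

Here (A, B) already contains the candidate key A, so no extra schema is added. With R = (A, B, C, D) and the single dependency A → B, the sketch would add the key schema (A, C, D).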
3.6.3 Correctness of the 3NF Algorithm
Claim: if a relation Ri is in the decomposition generated by the algorithm, then Ri
satisfies 3NF. (The formal definition of 3NF requires, for each dependency γ → B
that holds on the schema, one of three alternatives: the dependency is trivial, γ is
a superkey, or B is contained in a candidate key.)
• Let Ri be generated from the dependency α → β.
• Let γ → B be any nontrivial functional dependency on Ri.
• Now, B can be in either β or α but not in both. Consider each case separately.
Case 1: B is in β.
• If γ is a superkey, the 2nd condition of 3NF is satisfied.
• Otherwise α must contain some attribute not in γ.
• Since γ → B is in F+, it must be derivable from Fc by using attribute closure on γ.
• Attribute closure cannot have used α → β. If it had been used, α must be contained
in the attribute closure of γ, which is not possible, since we assumed γ is not a
superkey.
• Now, using α → (β − {B}) and γ → B, we can derive α → B (since γ ⊆ αβ, and B ∉ γ
since γ → B is nontrivial).
• Then, B is extraneous in the right-hand side of α → β, which is not possible since
α → β is in Fc.
• Thus, if B is in β then γ must be a superkey, and the second condition of 3NF
must be satisfied.
Case 2: B is in α.
• Since α is a candidate key, the third alternative in the definition of 3NF is trivially
satisfied.
• In fact, we cannot show that γ is a superkey.
• This shows exactly why the third alternative is present in the definition of 3NF.
3.6.4 Comparison of BCNF and 3NF
1. We have seen BCNF and 3NF.
 It is always possible to obtain a 3NF design without sacrificing lossless-
join or dependency-preservation.
 If we do not eliminate all transitive dependencies, we may need to use
null values to represent some of the meaningful relationships.
 Repetition of information occurs.
2. These problems can be illustrated with Banker-schema.
• As banker-name → bname, we may want to express relationships between
a banker and his or her branch.
• The Banker-schema table shows how we must either have a corresponding value for
customer name, or include a null.
• Repetition of information also occurs.
• Every occurrence of the banker's name must be accompanied by the
branch name.
3. If we must choose between BCNF and dependency preservation, it is generally
better to opt for 3NF.
 If we cannot check for dependency preservation efficiently, we either pay
a high price in system performance or risk the integrity of the data.
 The limited amount of redundancy in 3NF is then a lesser evil.
4. To summarize, our goal for a relational database design is
 BCNF.
 Lossless-join.
 Dependency-preservation.
5. If we cannot achieve this, we accept
 3NF
 Lossless-join.
 Dependency-preservation.
6. A final point: there is a price to pay for decomposition. When we decompose a
relation, we have to use natural joins or Cartesian products to put the pieces
back together. This takes computational time.
3.7 DECOMPOSITION USING MULTI-VALUED DEPENDENCIES
3.7.1 Multi-valued Dependencies (MVDs)
3.7.2 Fourth Normal Form
3.7.3 4NF Decomposition
3.7.1 Multi-valued Dependencies (MVDs)
Let R be a relation schema and let α ⊆ R and β ⊆ R. The multivalued dependency
α →→ β
holds on R if in any legal relation r(R), for all pairs of tuples t1 and t2 in r such that
t1[α] = t2[α], there exist tuples t3 and t4 in r such that:
t1[α] = t2[α] = t3[α] = t4[α]
t3[β] = t1[β]
t3[R – β] = t2[R – β]
t4[β] = t2[β]
t4[R – β] = t1[R – β]
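The definition can be checked on a concrete instance by building, for each qualifying pair (t1, t2), the tuple t3 that the definition requires and looking for it in r. Because the loop visits both (t1, t2) and (t2, t1), the tuple t4 is covered as well. The relation (ID, child, phone) and its rows are assumed sample data, and mvd_holds is a hypothetical helper name.

```python
def mvd_holds(rows, schema, alpha, beta):
    """Check alpha ->> beta on one instance; rows are dicts keyed by attribute."""
    def project(t, attrs):
        return tuple(t[a] for a in attrs)

    present = {project(t, schema) for t in rows}
    for t1 in rows:
        for t2 in rows:
            if project(t1, alpha) != project(t2, alpha):
                continue
            t3 = dict(t2)                        # t3[R - beta] = t2[R - beta]
            t3.update({b: t1[b] for b in beta})  # t3[beta] = t1[beta]
            if project(t3, schema) not in present:
                return False
    return True


schema = ["ID", "child", "phone"]
r = [
    {"ID": 99999, "child": "David", "phone": "512-555-1234"},
    {"ID": 99999, "child": "David", "phone": "512-555-4321"},
    {"ID": 99999, "child": "William", "phone": "512-555-1234"},
    {"ID": 99999, "child": "William", "phone": "512-555-4321"},
]
print(mvd_holds(r, schema, ["ID"], ["child"]))      # True
print(mvd_holds(r[:3], schema, ["ID"], ["child"]))  # False
```

Dropping the last row breaks the dependency, because the combination of William with the second phone number is then missing.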
Banker-schema table (for Section 3.6.4):
ENAME   BANKER-NAME   BNAME
Bill    Jhon          SFU
Tom     Jhon          SFU
Mary    Jhon          SFU
Null    Tim           Austin
Theory of MVDs
From the definition of multivalued dependency, we can derive the following rule:
If α → β, then α →→ β
That is, every functional dependency is also a multivalued dependency.
The closure D+ of D is the set of all functional and multivalued dependencies logically
implied by D.
 We can compute D+ from D, using the formal definitions of functional
dependencies and multivalued dependencies.
 We can manage with such reasoning for very simple multivalued dependencies,
which seem to be most common in practice.
 For complex dependencies, it is better to reason about sets of dependencies
using a system of inference rules.
3.7.2 Fourth Normal Form
A relation schema R is in 4NF with respect to a set D of functional and multivalued
dependencies if for all multivalued dependencies in D+ of the form α →→ β, where
α ⊆ R and β ⊆ R, at least one of the following holds:
• α →→ β is trivial (i.e., β ⊆ α or α ∪ β = R)
• α is a superkey for schema R
If a relation is in 4NF it is in BCNF.
Restriction of Multivalued Dependencies
The restriction of D to Ri is the set Di consisting of
• All functional dependencies in D+ that include only attributes of Ri
• All multivalued dependencies of the form
α →→ (β ∩ Ri)
where α ⊆ Ri and α →→ β is in D+
3.7.3 4NF Decomposition
result := {R};
done := false;
compute D+;
Let Di denote the restriction of D+ to Ri
while (not done)
    if (there is a schema Ri in result that is not in 4NF) then
    begin
        let α →→ β be a nontrivial multivalued dependency that holds
            on Ri such that α → Ri is not in Di, and α ∩ β = ∅;
        result := (result – Ri) ∪ (Ri – β) ∪ (α, β);
    end
    else done := true;
Note: each Ri is in 4NF, and the decomposition is lossless-join.
3.8 MORE NORMAL FORMS
 Join dependencies generalize multivalued dependencies
 lead to project-join normal form (PJNF) (also called fifth normal form)
 A class of even more general constraints, leads to a normal form called domain
key normal form.
• The problem with these generalized constraints is that they are hard to reason
with, and no sound and complete set of inference rules exists for them. Hence they
are rarely used.
3.9 DATABASE DESIGN PROCESS
3.9.1 E-R Model and Normalization
3.9.2 Naming of Attributes and Relationships
3.9.3 Denormalization for Performance
3.9.4 Other Design Issues
We have assumed schema R is given
 R could have been generated when converting ER diagram to a set of tables.
 R could have been a single relation containing all attributes that are of interest
(called universal relation).
 Normalization breaks R into smaller relations.
 R could have been the result of some ad hoc design of relations, which we then
test/convert to normal form.
3.9.1 E-R Model and Normalization
 When an ER diagram is carefully designed, identifying all entities correctly, the
tables generated from the ER diagram should not need further normalization.
 However, in a real (imperfect) design, there can be functional dependencies from
nonkey attributes of an entity to other attributes of the entity.
 Example: an employee entity with attributes department_number and
department_address, and a functional dependency
department_number → department_address
• Good design would have made department an entity.
• Functional dependencies from nonkey attributes of a relationship set are possible,
but rare, since most relationships are binary.
3.9.2 Naming of Attributes and Relationships
A desirable feature of a database design is the unique-role assumption, which
means that each attribute name has a unique meaning in the database.
In large database schemas, relationship sets are often named via a concatenation
of the names of related entity sets, perhaps with an intervening hyphen or underscore.
We have used a few such names, for example inst_sec and student_sec.
3.9.3 Denormalization for Performance
• We may want to use a non-normalized schema for performance.
 For example, displaying customer_name along with account_number and balance
requires join of account with depositor.
Alternative 1: Use denormalized relation containing attributes of account as well as
depositor with all above attributes
 faster lookup
 extra space and extra execution time for updates
 extra coding work for programmer and possibility of error in extra code
Alternative 2: Use a materialized view defined as account ⋈ depositor
 Benefits and drawbacks same as above, except no extra coding work for
programmer and avoids possible errors.
3.9.4 Other Design Issues
 Some aspects of database design are not caught by normalization
 Examples of bad database design, to be avoided:
Instead of earnings (company_id, year, amount ), use
 earnings_2004, earnings_2005, earnings_2006, etc., all on the schema
(company_id, earnings).
   ➢ The above are in BCNF, but they make querying across years difficult
   and need a new table each year.
 company_year(company_id, earnings_2004, earnings_2005, earnings_2006)
 Also in BCNF, but also makes querying across years difficult and
requires new attribute each year.
 Is an example of a crosstab, where values for one attribute become
column names.
 Used in spreadsheets, and in data analysis tools.
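The difference between the crosstab and the preferred earnings (company_id, year, amount) schema can be illustrated by converting one into the other. This is a sketch only: the company rows and amounts are made-up sample data and unpivot is a hypothetical helper.

```python
# Made-up rows on company_year(company_id, earnings_2004, earnings_2005, earnings_2006).
company_year = [
    {"company_id": "C1", "earnings_2004": 10, "earnings_2005": 12, "earnings_2006": 15},
    {"company_id": "C2", "earnings_2004": 7, "earnings_2005": 9, "earnings_2006": 8},
]


def unpivot(rows, prefix="earnings_"):
    """Turn each earnings_<year> column into an (company_id, year, amount) row."""
    earnings = []
    for row in rows:
        for column, amount in row.items():
            if column.startswith(prefix):
                year = int(column[len(prefix):])
                earnings.append((row["company_id"], year, amount))
    return earnings


earnings = unpivot(company_year)

# A query across years is now a single filter over one relation.
total_c1 = sum(amount for company_id, year, amount in earnings if company_id == "C1")
print(total_c1)  # 37
```

With the earnings schema, a new year adds rows rather than a new column or a new table.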
3.10 MODELING TEMPORAL DATA
• Temporal data have an associated time interval during which the data are valid.
 A snapshot is the value of the data at a particular point in time.
 Several proposals to extend ER model by adding valid time to
 attributes, e.g. address of a customer at different points in time
 entities, e.g. time duration when an account exists
 relationships, e.g. time during which a customer owned an account
 But no accepted standard
 Adding a temporal component results in functional dependencies like
customer_id → customer_street, customer_city
not to hold, because the address varies over time.
• A temporal functional dependency from X to Y holds on schema R if the functional
dependency X → Y holds on all snapshots for all legal instances r(R).

More Related Content

PPTX
Dbms normalization
PPTX
database language ppt.pptx
PPTX
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
PPTX
Normalization
PPTX
Database Design
PPT
File organization 1
PPT
Database Triggers
PDF
Normalization in DBMS
Dbms normalization
database language ppt.pptx
FUNCTION DEPENDENCY AND TYPES & EXAMPLE
Normalization
Database Design
File organization 1
Database Triggers
Normalization in DBMS

What's hot (20)

PPTX
Types Of Keys in DBMS
PPT
Databases: Normalisation
PPTX
Purpose of DBMS and users of DBMS
PPTX
data abstraction in DBMS
PPTX
Structure of dbms
PDF
Nested Queries Lecture
PDF
Enhanced Entity-Relationship (EER) Modeling
PPT
11. Storage and File Structure in DBMS
PPTX
Applications of DBMS(Database Management System)
PPT
Normalization PRESENTATION
PPTX
View of data DBMS
PPTX
PPT
Sql dml & tcl 2
PPTX
Integrity Constraints
PPTX
Relational Data Model Introduction
PPSX
Functional dependency
PPTX
Normal forms
PPTX
trigger dbms
PPT
Aggregate functions
PPT
DBMS Unit 2 ppt.ppt
Types Of Keys in DBMS
Databases: Normalisation
Purpose of DBMS and users of DBMS
data abstraction in DBMS
Structure of dbms
Nested Queries Lecture
Enhanced Entity-Relationship (EER) Modeling
11. Storage and File Structure in DBMS
Applications of DBMS(Database Management System)
Normalization PRESENTATION
View of data DBMS
Sql dml & tcl 2
Integrity Constraints
Relational Data Model Introduction
Functional dependency
Normal forms
trigger dbms
Aggregate functions
DBMS Unit 2 ppt.ppt
Ad

Similar to Relational Database Design (20)

PPTX
DBMS: Week 10 - Database Design and Normalization
PPTX
normalisation jdsuhduswwhdusw cdscsacasc.pptx
PPTX
Normalization in Relational database management systems
PPTX
Normalization in rdbms types and examples
PPTX
Relational Database Design Functional Dependency – definition, trivial and no...
PDF
L8 design1
PPTX
Relational database
PDF
Normalization in DBMS
PPTX
DBMS_Module 3_Functional Dependencies and Normalization.pptx
PDF
Normalization
PPTX
UNIT 2 -PPT.pptx
PPTX
normalization ppt.pptx
PPT
Normalization_BCA_
PPTX
Normalization and three normal forms.pptx
PPTX
L1-Normalization 1NF 2NF 3NF 4NF BCNF.pptx
PDF
Database management system session 5
PPTX
Normalization
PPT
Database Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
PPT
Normalization.ppt
PDF
Normalization | (1NF) |(2NF) (3NF)|BCNF| 4NF |5NF
DBMS: Week 10 - Database Design and Normalization
normalisation jdsuhduswwhdusw cdscsacasc.pptx
Normalization in Relational database management systems
Normalization in rdbms types and examples
Relational Database Design Functional Dependency – definition, trivial and no...
L8 design1
Relational database
Normalization in DBMS
DBMS_Module 3_Functional Dependencies and Normalization.pptx
Normalization
UNIT 2 -PPT.pptx
normalization ppt.pptx
Normalization_BCA_
Normalization and three normal forms.pptx
L1-Normalization 1NF 2NF 3NF 4NF BCNF.pptx
Database management system session 5
Normalization
Database Normalization 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
Normalization.ppt
Normalization | (1NF) |(2NF) (3NF)|BCNF| 4NF |5NF
Ad


Relational Database Design

  • 1. Dept of CSE | III YEAR | V SEMESTER CS T53 | DATABASE MANAGEMENT SYSTEMS | UNIT 3 1 |Prepared By : Mr. PRABU.U/AP |Dept. of Computer Science and Engineering | SKCET

UNIT III
Relational Database Design: Features of Good Relational Designs – Atomic Domains and First Normal Form – Second Normal Form – Decomposition Using Functional Dependencies – Functional Dependency Theory – Algorithms for decomposition – Decomposition Using Multi-valued Dependencies – More Normal Forms – Database Design Process – Modeling Temporal Data

3.1 FEATURES OF GOOD RELATIONAL DESIGNS
3.1.1 Design Alternative: Larger Schemas
  - Combined Schemas
  - Combined Schemas without repetition
3.1.2 Design Alternative: Smaller Schemas

3.1.1 Design Alternative: Larger Schemas
It is possible to generate a set of relation schemas directly from the E-R design. The quality of the resulting set of schemas depends on how good the E-R design was in the first place.

Combined Schemas
- Suppose we combine borrower and loan to get
  bor_loan = (customer_id, loan_type, amount)
- The result is possible repetition of information (L-1000 in the example below).

loan
loan_type   amount
...         ...
L-1000      1000
...         ...

borrower
customer_id   loan_type
...           ...
C0001         L-1000
C0002         L-1000
...           ...

Combined
customer_id   loan_type   amount
C0001         L-1000      1000
C0002         L-1000      1000

Figure 3.1: cust_loan table
  • 2.
Combined Schemas without repetition
- Consider combining loan_branch and loan:
  loan_amt_br = (loan_number, amount, branch_name)
- No repetition (as suggested by the example below).

loan
loan_number   amount
...           ...
25235         1000
...           ...

loan_branch
loan_number   branch_name
...           ...
25235         Anna Nagar
...           ...

Combined
loan_number   amount   branch_name
25235         1000     Anna Nagar

Figure 3.2: loan_branch table

3.1.2 Design Alternative: Smaller Schemas
We need to write a rule that says "if there were a schema (dept_name, budget), then dept_name is able to serve as the primary key." This rule is specified as a functional dependency:
  dept_name → budget

Not all decompositions are good. Consider
  employee (ID, name, street, city, salary)
and suppose we decompose employee into
  employee1 (ID, name)
  employee2 (name, street, city, salary)
If two employees share the same name, the natural join of employee1 and employee2 cannot tell which street, city, and salary belong to which ID, so information is lost.

Figure 3.3: Loss of information via a bad decomposition.
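The loss of information in the employee1/employee2 decomposition can be made concrete with a small sketch. The two sample rows below are hypothetical (they are not from the notes); the only assumption that matters is that two employees share a name.

```python
# Hypothetical sample data: two employees who share the name "Kim".
# Columns: (ID, name, street, city, salary)
employee = {
    (57766, "Kim", "Main", "Perryridge", 75000),
    (98776, "Kim", "North", "Hampton", 67000),
}

# Project onto employee1(ID, name) and employee2(name, street, city, salary).
employee1 = {(emp_id, name) for emp_id, name, _, _, _ in employee}
employee2 = {(name, street, city, salary) for _, name, street, city, salary in employee}

# Natural join of the two projections on the shared attribute `name`.
rejoined = {
    (emp_id, name, street, city, salary)
    for emp_id, name in employee1
    for name2, street, city, salary in employee2
    if name == name2
}

# Tuples that appear after the join but were never in the original relation.
spurious = rejoined - employee
print(len(employee), len(rejoined), len(spurious))
```

The join returns more tuples than the original relation, yet we know less: it is no longer possible to say which address belongs to which ID.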
  • 3.
3.2 ATOMIC DOMAINS AND FIRST NORMAL FORM
3.2.1 Atomic Domains
3.2.2 First Normal Form
  - Employee (unnormalized)
  - Employee (normalized – 1NF)
  - Alterations

3.2.1 Atomic Domains
A domain is atomic if elements of the domain are considered to be indivisible units. We say that a relation schema R is in first normal form (1NF) if the domains of all attributes of R are atomic. A set of names is an example of a non-atomic value. Non-atomic values complicate storage and encourage redundant (repeated) storage of data.

3.2.2 First Normal Form
- A relation is said to be in first normal form if all of its attributes have domains that are indivisible or atomic. Such a relation is also called a flat file.
- Each attribute must be atomic: no repeating columns within a row and no multi-valued columns.
- Each row of data must have a unique identifier (or primary key).

Employee (unnormalized)
emp_no   name            dept_no   dept_name   skills
1        Kevin Jacobs    201       R&D         C, Perl, Java
2        Barbara Jones   224       IT          Linux, Mac
3        Jake Rivera     201       R&D         DB2, Oracle, Java

Employee (normalized – 1NF)
emp_no   name            dept_no   dept_name   skills
1        Kevin Jacobs    201       R&D         C
1        Kevin Jacobs    201       R&D         Perl
1        Kevin Jacobs    201       R&D         Java
2        Barbara Jones   224       IT          Linux
2        Barbara Jones   224       IT          Mac
3        Jake Rivera     201       R&D         DB2
3        Jake Rivera     201       R&D         Oracle
3        Jake Rivera     201       R&D         Java

Alterations
Update Anomaly
- If a student occurs in two or more rows of a table, updating the student's address requires updating the address column in all of those rows.
Insertion Anomaly
- On admission, a student's sid, sname, and address are known but the course is not, which forces a NULL value to be inserted.
Deletion Anomaly
- If student 101 discontinues a course, deleting that row also deletes the student's other details.
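The step from the unnormalized Employee table to its 1NF version can be sketched as follows. The function name `to_first_normal_form` is illustrative, and the sketch assumes skills are stored as a comma-separated string, as in the table above.

```python
# Employee (unnormalized): (emp_no, name, dept_no, dept_name, skills)
unnormalized = [
    (1, "Kevin Jacobs", 201, "R&D", "C, Perl, Java"),
    (2, "Barbara Jones", 224, "IT", "Linux, Mac"),
    (3, "Jake Rivera", 201, "R&D", "DB2, Oracle, Java"),
]


def to_first_normal_form(rows):
    """Emit one row per (employee, skill) so every skills value is atomic."""
    flat = []
    for emp_no, name, dept_no, dept_name, skills in rows:
        for skill in skills.split(","):
            flat.append((emp_no, name, dept_no, dept_name, skill.strip()))
    return flat


employee_1nf = to_first_normal_form(unnormalized)
for row in employee_1nf:
    print(row)
```

The result has one atomic skill per row, at the cost of repeating name, dept_no, and dept_name, which is what the later normal forms address.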
  • 4.
3.3 SECOND NORMAL FORM
- A relation is said to be in second normal form if it meets both of the following:
  - The relation is in first normal form.
  - All non-key attributes are functionally dependent on the entire primary key.
- Each attribute must be functionally dependent on the primary key.
- 2NF improves data integrity.
- Prevents update, insert, and delete anomalies.

Employee (normalized – 1NF)
emp_no   name            dept_no   dept_name   skills
1        Kevin Jacobs    201       R&D         C
1        Kevin Jacobs    201       R&D         Perl
1        Kevin Jacobs    201       R&D         Java
2        Barbara Jones   224       IT          Linux
2        Barbara Jones   224       IT          Mac
3        Jake Rivera     201       R&D         DB2
3        Jake Rivera     201       R&D         Oracle
3        Jake Rivera     201       R&D         Java

- name, dept_no, and dept_name are functionally dependent on emp_no (emp_no → name, dept_no, dept_name).
- skills is not functionally dependent on emp_no, since one emp_no can have several skills values.

Employee (2NF)
emp_no   name            dept_no   dept_name
1        Kevin Jacobs    201       R&D
2        Barbara Jones   224       IT
3        Jake Rivera     201       R&D

Skills (2NF)
emp_no   skills
1        C
1        Perl
1        Java
2        Linux
2        Mac
3        DB2
3        Oracle
3        Java

3.4 DECOMPOSITION USING FUNCTIONAL DEPENDENCIES
3.4.1 Keys and Functional Dependencies
3.4.2 Boyce–Codd Normal Form
3.4.3 BCNF and Dependency Preservation
3.4.4 Third Normal Form
3.4.5 Higher Normal Forms

3.4.1 Keys and Functional Dependencies
Keys
- A subset K of R is a superkey of r(R) if, in any legal instance of r(R), for all pairs t1 and t2 of tuples in the instance of r, if t1 ≠ t2, then t1[K] ≠ t2[K].
- That is, no two tuples in any legal instance of relation r(R) may have the same value on attribute set K.
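The superkey condition can be tested against a single instance, as sketched below with the Employee (1NF) rows. Note the limits of such a check: a superkey must hold in every legal instance, so one instance can only refute a candidate, never prove it. The helper name `has_duplicate_key` is illustrative.

```python
COLUMNS = ("emp_no", "name", "dept_no", "dept_name", "skills")

# Employee (normalized, 1NF) rows.
employee_1nf = [
    (1, "Kevin Jacobs", 201, "R&D", "C"),
    (1, "Kevin Jacobs", 201, "R&D", "Perl"),
    (1, "Kevin Jacobs", 201, "R&D", "Java"),
    (2, "Barbara Jones", 224, "IT", "Linux"),
    (2, "Barbara Jones", 224, "IT", "Mac"),
    (3, "Jake Rivera", 201, "R&D", "DB2"),
    (3, "Jake Rivera", 201, "R&D", "Oracle"),
    (3, "Jake Rivera", 201, "R&D", "Java"),
]


def has_duplicate_key(rows, key):
    """Return True if two distinct tuples agree on every attribute in `key`."""
    positions = [COLUMNS.index(attr) for attr in key]
    seen = set()
    for row in set(rows):  # compare distinct tuples only
        value = tuple(row[p] for p in positions)
        if value in seen:
            return True
        seen.add(value)
    return False


# emp_no alone repeats across rows, so it cannot be a superkey of this table.
print(has_duplicate_key(employee_1nf, ["emp_no"]))
# (emp_no, skills) is unique in this instance, so it is not refuted.
print(has_duplicate_key(employee_1nf, ["emp_no", "skills"]))
```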
  • 5.
Functional Dependencies
- Y is functionally dependent on X if the value of Y is determined by X.
- For example, if Y = X + 1, the value of X determines the resulting value of Y.
- Y is dependent on X as a function of the value of X.

3.4.2 Boyce–Codd Normal Form
A relation schema R is in BCNF with respect to a set F of functional dependencies if, for all functional dependencies in F+ of the form α → β, where α ⊆ R and β ⊆ R, at least one of the following holds:
- α → β is trivial (i.e., β ⊆ α)
- α is a superkey for R

Example schema not in BCNF:
  bor_loan = (customer_id, loan_number, amount)
because loan_number → amount holds on bor_loan but loan_number is not a superkey.

Decomposing a Schema into BCNF
Suppose we have a schema R and a nontrivial dependency α → β causes a violation of BCNF. We decompose R into:
- (α ∪ β)
- (R − (β − α))
In our example, α = loan_number, β = amount, and bor_loan is replaced by
  (α ∪ β) = (loan_number, amount)
  (R − (β − α)) = (customer_id, loan_number)

3.4.3 BCNF and Dependency Preservation
- Constraints, including functional dependencies, are costly to check in practice unless they pertain to only one relation.
- If it is sufficient to test only those dependencies on each individual relation of a decomposition in order to ensure that all functional dependencies hold, then that decomposition is dependency preserving.
- Because it is not always possible to achieve both BCNF and dependency preservation, we consider a weaker normal form, known as third normal form.

3.4.4 Third Normal Form
A relation is said to be in third normal form if it meets both of the following:
- The relation is in second normal form.
- There is no transitive dependence; that is, all the non-key attributes depend only on the primary key.
Remove transitive dependencies.
- Any transitive dependencies are moved into a smaller (subset) table.
- 3NF further improves data integrity.
  • 6.
- Prevents update, insert, and delete anomalies.

Employee (2NF)
emp_no   name            dept_no   dept_name
1        Kevin Jacobs    201       R&D
2        Barbara Jones   224       IT
3        Jake Rivera     201       R&D

Skills (2NF)
emp_no   skills
1        C
1        Perl
1        Java
2        Linux
2        Mac
3        DB2
3        Oracle
3        Java

Employee (3NF)
emp_no   name            dept_no
1        Kevin Jacobs    201
2        Barbara Jones   224
3        Jake Rivera     201

Department (3NF)
dept_no   dept_name
201       R&D
224       IT

Skills (3NF)
emp_no   skills
1        C
1        Perl
1        Java
2        Linux
2        Mac
3        DB2
3        Oracle
3        Java

3.4.5 Higher Normal Forms
Refer to 3.6 and 3.7.

3.5 FUNCTIONAL DEPENDENCY THEORY
3.5.1 Closure of a Set of Functional Dependencies
3.5.2 Closure of Attribute Sets
3.5.3 Canonical Cover
3.5.4 Lossless-join Decomposition
3.5.5 Dependency Preservation

3.5.1 Closure of a Set of Functional Dependencies
Given a set F of functional dependencies, there are certain other functional dependencies that are logically implied by F. For example, if A → B and B → C, then we can infer that A → C.
The set of all functional dependencies logically implied by F is the closure of F. We denote the closure of F by F+.
We can find all of F+ by applying Armstrong's Axioms:
- if β ⊆ α, then α → β (reflexivity)
- if α → β, then γα → γβ (augmentation)
- if α → β and β → γ, then α → γ (transitivity)
These rules are
- sound (generate only functional dependencies that actually hold) and
- complete (generate all functional dependencies that hold).
  • 7.
Procedure for Computing F+
  F+ = F
  repeat
      for each functional dependency f in F+
          apply reflexivity and augmentation rules on f
          add the resulting functional dependencies to F+
      for each pair of functional dependencies f1 and f2 in F+
          if f1 and f2 can be combined using transitivity
              then add the resulting functional dependency to F+
  until F+ does not change any further

We can further simplify manual computation of F+ by using the following additional rules.
- If α → β holds and α → γ holds, then α → βγ holds (union)
- If α → βγ holds, then α → β holds and α → γ holds (decomposition)
- If α → β holds and γβ → δ holds, then αγ → δ holds (pseudotransitivity)
The above rules can be inferred from Armstrong's axioms.

3.5.2 Closure of Attribute Sets
Given a set of attributes α, define the closure of α under F (denoted by α+) as the set of attributes that are functionally determined by α under F.
Algorithm to compute α+, the closure of α under F:
  result := α;
  while (changes to result) do
      for each β → γ in F do
      begin
          if β ⊆ result then result := result ∪ γ
      end

Uses of attribute closure
There are several uses of the attribute closure algorithm:
- Testing for superkey
  - To test if α is a superkey, we compute α+ and check if α+ contains all attributes of R.
- Testing functional dependencies
  - To check if a functional dependency α → β holds (or, in other words, is in F+), just check if β ⊆ α+.
  - That is, we compute α+ by using attribute closure, and then check if it contains β.
  - This is a simple and cheap test, and very useful.
- Computing closure of F
  - For each γ ⊆ R, we find the closure γ+, and for each S ⊆ γ+, we output a functional dependency γ → S.
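The attribute closure algorithm and its first two uses can be sketched as follows. This is a minimal sketch: the representation of a dependency as a pair of frozensets and the helper names `fd`, `closure`, `is_superkey`, and `implies` are assumptions made for illustration.

```python
def fd(lhs, rhs):
    """Build a functional dependency lhs -> rhs from two iterables of attributes."""
    return frozenset(lhs), frozenset(rhs)


def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`."""
    result = set(attrs)
    changed = True
    while changed:  # "while (changes to result)"
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_superkey(attrs, schema, fds):
    """attrs is a superkey if its closure contains every attribute of the schema."""
    return set(schema) <= closure(attrs, fds)


def implies(fds, lhs, rhs):
    """lhs -> rhs is in F+ exactly when rhs is contained in the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)


# Example: R = (A, B, C), F = {A -> B, B -> C}
R = "ABC"
F = [fd("A", "B"), fd("B", "C")]
print(sorted(closure("A", F)))
print(is_superkey("A", R, F), is_superkey("B", R, F))
print(implies(F, "A", "C"), implies(F, "C", "A"))
```

Each pass over F either adds at least one attribute or terminates the loop, so the number of passes is bounded by the number of attributes.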
  • 8.
3.5.3 Canonical Cover
A canonical cover for F is a set of dependencies Fc such that
- F logically implies all dependencies in Fc, and
- Fc logically implies all dependencies in F, and
- No functional dependency in Fc contains an extraneous attribute, and
- Each left side of a functional dependency in Fc is unique.

To compute a canonical cover for F:
  repeat
      Use the union rule to replace any dependencies in F of the form
          α1 → β1 and α1 → β2 with α1 → β1 β2
      Find a functional dependency α → β with an extraneous attribute either in α or in β
      If an extraneous attribute is found, delete it from α → β
  until F does not change

Computing a Canonical Cover
R = (A, B, C)
F = {A → BC, B → C, A → B, AB → C}
- Combine A → BC and A → B into A → BC
  - Set is now {A → BC, B → C, AB → C}
- A is extraneous in AB → C
  - Check if the result of deleting A from AB → C is implied by the other dependencies
  - Yes: in fact, B → C is already present!
  - Set is now {A → BC, B → C}
- C is extraneous in A → BC
  - Check if A → C is logically implied by A → B and the other dependencies
  - Yes: using transitivity on A → B and B → C.
- The canonical cover is:
    A → B
    B → C

3.5.4 Lossless-join Decomposition
The decomposition is lossless if, for all legal database instances, relation r contains the same set of tuples as the result of the following SQL query:
  select * from (select R1 from r) natural join (select R2 from r)
This is stated more concisely in the relational algebra as:
  Π_R1(r) ⋈ Π_R2(r) = r
R1 and R2 form a lossless decomposition of R if at least one of the following functional dependencies is in F+:
  R1 ∩ R2 → R1
  R1 ∩ R2 → R2
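The canonical cover procedure of Section 3.5.3 can be sketched as follows. The dictionary representation (left side mapped to merged right side, which applies the union rule automatically) and the function names are assumptions made for illustration. Because the order in which extraneous attributes are removed is not fixed, a set F may have more than one canonical cover; the sketch returns one of them.

```python
def closure(attrs, fds):
    """Closure of `attrs` under an iterable of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def canonical_cover(fds):
    """Return one canonical cover of `fds` as a dict {lhs: rhs}."""
    cover = {}
    for lhs, rhs in fds:  # union rule: merge dependencies with equal left sides
        lhs = frozenset(lhs)
        cover[lhs] = cover.get(lhs, frozenset()) | frozenset(rhs)

    changed = True
    while changed:
        changed = False
        for lhs, rhs in list(cover.items()):
            # An attribute a is extraneous in lhs if (lhs - a) still determines rhs.
            for a in sorted(lhs):
                reduced = lhs - {a}
                if reduced and rhs <= closure(reduced, cover.items()):
                    del cover[lhs]
                    cover[reduced] = cover.get(reduced, frozenset()) | rhs
                    changed = True
                    break
            if changed:
                break
            # An attribute a is extraneous in rhs if lhs still determines a
            # after a has been removed from this dependency.
            for a in sorted(rhs):
                trial = dict(cover)
                trial[lhs] = rhs - {a}
                if a in closure(lhs, trial.items()):
                    if rhs - {a}:
                        cover[lhs] = rhs - {a}
                    else:
                        del cover[lhs]
                    changed = True
                    break
            if changed:
                break
    return cover


# Example from the notes: F = {A -> BC, B -> C, A -> B, AB -> C}
F = [
    (frozenset("A"), frozenset("BC")),
    (frozenset("B"), frozenset("C")),
    (frozenset("A"), frozenset("B")),
    (frozenset("AB"), frozenset("C")),
]
Fc = canonical_cover(F)
for lhs, rhs in Fc.items():
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

For this input the sketch removes the extraneous attributes in a different order from the worked example, but arrives at the same cover, A → B and B → C.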
  • 9.
Example
R = (A, B, C)
F = {A → B, B → C}
R can be decomposed in two different ways:
- R1 = (A, B), R2 = (B, C)
  - Lossless-join decomposition: R1 ∩ R2 = {B} and B → BC
  - Dependency preserving
- R1 = (A, B), R2 = (A, C)
  - Lossless-join decomposition: R1 ∩ R2 = {A} and A → AB
  - Not dependency preserving

3.5.5 Dependency Preservation
Let Fi be the set of dependencies in F+ that include only attributes in Ri.
- A decomposition is dependency preserving if (F1 ∪ F2 ∪ … ∪ Fn)+ = F+
- If it is not, then checking updates for violation of functional dependencies may require computing joins, which is expensive.

Testing for Dependency Preservation
- To check if a dependency α → β is preserved in a decomposition of R into R1, R2, …, Rn, we apply the following test (with attribute closure done with respect to F):
    result = α
    while (changes to result) do
        for each Ri in the decomposition
            t = (result ∩ Ri)+ ∩ Ri
            result = result ∪ t
- If result contains all attributes in β, then the functional dependency α → β is preserved.
- We apply the test on all dependencies in F to check if a decomposition is dependency preserving.
- This procedure takes polynomial time, instead of the exponential time required to compute F+ and (F1 ∪ F2 ∪ … ∪ Fn)+.

3.6 ALGORITHMS FOR DECOMPOSITION
3.6.1 BCNF Decomposition
3.6.2 3NF Decomposition
3.6.3 Correctness of the 3NF Algorithm
3.6.4 Comparison of BCNF and 3NF

3.6.1 BCNF Decomposition
The definition of BCNF can be used directly to test if a relation is in BCNF. However, computation of F+ can be a tedious task.
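The lossless-join condition of Section 3.5.4 and the dependency-preservation test of Section 3.5.5 can both be sketched with attribute closure, and checked against the R = (A, B, C) example. Function names and the frozenset representation of dependencies are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Closure of `attrs` under an iterable of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """Check whether R1 ∩ R2 -> R1 or R1 ∩ R2 -> R2 is in F+."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


def is_preserved(lhs, rhs, decomposition, fds):
    """Polynomial-time test that lhs -> rhs is preserved by the decomposition."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for ri in decomposition:
            ri = set(ri)
            t = closure(result & ri, fds) & ri  # t = (result ∩ Ri)+ ∩ Ri
            if not t <= result:
                result |= t
                changed = True
    return set(rhs) <= result


def is_dependency_preserving(decomposition, fds):
    """Apply the test to every dependency in F."""
    return all(is_preserved(lhs, rhs, decomposition, fds) for lhs, rhs in fds)


# R = (A, B, C), F = {A -> B, B -> C}
F = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
first = ["AB", "BC"]
second = ["AB", "AC"]
print(is_lossless(*first, F), is_dependency_preserving(first, F))
print(is_lossless(*second, F), is_dependency_preserving(second, F))
```

In the second decomposition B → C is the dependency that is lost: B and C never appear together in one relation, and neither relation on its own implies it.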
  • 10.
Testing for BCNF
To check if a nontrivial dependency α → β causes a violation of BCNF:
1. compute α+ (the attribute closure of α), and
2. verify that it includes all attributes of R, that is, it is a superkey of R.

- Simplified test: To check if a relation schema R is in BCNF, it suffices to check only the dependencies in the given set F for violation of BCNF, rather than checking all dependencies in F+.
- If none of the dependencies in F causes a violation of BCNF, then none of the dependencies in F+ will cause a violation of BCNF either.

However, using only F is incorrect when testing a relation in a decomposition of R. Consider R = (A, B, C, D, E), with F = {A → B, BC → D}.
- Decompose R into R1 = (A, B) and R2 = (A, C, D, E).
- Neither of the dependencies in F contains only attributes from (A, C, D, E), so we might be misled into thinking R2 satisfies BCNF.
- In fact, dependency AC → D in F+ shows R2 is not in BCNF.

BCNF Decomposition Algorithm
If R is not in BCNF, we can decompose R into a collection of BCNF schemas R1, R2, . . . , Rn by the following algorithm. The algorithm uses dependencies that demonstrate violation of BCNF to perform the decomposition.
  result := {R};
  done := false;
  compute F+;
  while (not done) do
      if (there is a schema Ri in result that is not in BCNF)
      then begin
          let α → β be a nontrivial functional dependency that holds on Ri
              such that α → Ri is not in F+, and α ∩ β = ∅;
          result := (result – Ri) ∪ (Ri – β) ∪ (α, β);
      end
      else done := true;
Note: each Ri is in BCNF, and the decomposition is lossless-join.

3.6.2 3NF Decomposition
There are some situations where BCNF is not dependency preserving, and efficient checking for FD violation on updates is important.
Solution: Define a weaker normal form, called Third Normal Form (3NF).
- Allows some redundancy.
- But functional dependencies can be checked on individual relations without computing a join.
- There is always a lossless-join, dependency-preserving decomposition into 3NF.
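The BCNF decomposition algorithm of Section 3.6.1 can be sketched as follows. Instead of computing F+ explicitly, this sketch looks for a violation on a schema Ri by computing α+ for every proper subset α of Ri, which is exponential in the size of Ri but handles the decomposed-schema case described above. Function names are illustrative, and the resulting decomposition depends on which violation is found first.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under an iterable of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def find_violation(ri, fds):
    """Return (alpha, beta) violating BCNF on schema ri, or None if ri is in BCNF."""
    ri = frozenset(ri)
    for size in range(1, len(ri)):
        for alpha in combinations(sorted(ri), size):
            alpha = frozenset(alpha)
            alpha_plus = closure(alpha, fds)
            beta = (alpha_plus & ri) - alpha  # nontrivial part, disjoint from alpha
            if beta and not ri <= alpha_plus:  # alpha is not a superkey of ri
                return alpha, frozenset(beta)
    return None


def bcnf_decompose(schema, fds):
    """Split `schema` until no component has a BCNF violation."""
    result = [frozenset(schema)]
    done = False
    while not done:
        done = True
        for ri in result:
            violation = find_violation(ri, fds)
            if violation:
                alpha, beta = violation
                result.remove(ri)
                result.extend([ri - beta, alpha | beta])
                done = False
                break  # the list changed, so restart the scan
    return set(result)


# bor_loan = (customer_id, loan_number, amount) with loan_number -> amount
bor_loan = {"customer_id", "loan_number", "amount"}
F1 = [(frozenset({"loan_number"}), frozenset({"amount"}))]
print(bcnf_decompose(bor_loan, F1))

# R = (A, B, C, D, E) with F = {A -> B, BC -> D}
F2 = [(frozenset("A"), frozenset("B")), (frozenset("BC"), frozenset("D"))]
print(bcnf_decompose("ABCDE", F2))
```

In the second example the violation AC → D on (A, C, D, E) is detected even though it is not listed in F, which is the point made in the discussion of the simplified test.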
  • 11.
  Let Fc be a canonical cover for F;
  i := 0;
  for each functional dependency α → β in Fc do
      if none of the schemas Rj, 1 ≤ j ≤ i contains αβ
      then begin
          i := i + 1;
          Ri := αβ
      end
  if none of the schemas Rj, 1 ≤ j ≤ i contains a candidate key for R
  then begin
      i := i + 1;
      Ri := any candidate key for R;
  end
  return (R1, R2, ..., Ri)

3.6.3 Correctness of the 3NF Algorithm
If a relation Ri is in the decomposition generated by the algorithm, then Ri satisfies 3NF.
- Let Ri be generated from the dependency α → β.
- Let γ → B be any nontrivial functional dependency on Ri.
- Now, B can be in either β or α but not in both. Consider each case separately.

Case 1: B is in β.
- If γ is a superkey, the second condition of 3NF is satisfied.
- Otherwise α must contain some attribute not in γ.
- Since γ → B is in F+, it must be derivable from Fc by using attribute closure on γ.
- Attribute closure cannot have used α → β. If it had been used, α must be contained in the attribute closure of γ, which is not possible, since we assumed γ is not a superkey.
- Now, using α → (β − {B}) and γ → B, we can derive α → B (since γ ⊆ αβ, and B ∉ γ since γ → B is nontrivial).
- Then, B is extraneous in the right-hand side of α → β, which is not possible since α → β is in Fc.
- Thus, if B is in β then γ must be a superkey, and the second condition of 3NF must be satisfied.

Case 2: B is in α.
- Since α is a candidate key, the third alternative in the definition of 3NF is trivially satisfied.
- In fact, we cannot show that γ is a superkey.
- This shows exactly why the third alternative is present in the definition of 3NF.

3.6.4 Comparison of BCNF and 3NF
1. We have seen BCNF and 3NF.
   - It is always possible to obtain a 3NF design without sacrificing lossless-join or dependency-preservation.
   - If we do not eliminate all transitive dependencies, we may need to use null values to represent some of the meaningful relationships.
   - Repetition of information occurs.
  • 12.
2. These problems can be illustrated with Banker-schema.
   - As banker-name → bname, we may want to express relationships between a banker and his or her branch.

     ENAME   BANKER-NAME   BNAME
     Bill    John          SFU
     Tom     John          SFU
     Mary    John          SFU
     Null    Tim           Austin

   - This table shows how we must either have a corresponding value for customer name, or include a null.
   - Repetition of information also occurs.
   - Every occurrence of the banker's name must be accompanied by the branch name.
3. If we must choose between BCNF and dependency preservation, it is generally better to opt for 3NF.
   - If we cannot check for dependency preservation efficiently, we either pay a high price in system performance or risk the integrity of the data.
   - The limited amount of redundancy in 3NF is then a lesser evil.
4. To summarize, our goal for a relational database design is
   - BCNF.
   - Lossless-join.
   - Dependency-preservation.
5. If we cannot achieve this, we accept
   - 3NF.
   - Lossless-join.
   - Dependency-preservation.
6. A final point: there is a price to pay for decomposition. When we decompose a relation, we have to use natural joins or Cartesian products to put the pieces back together. This takes computational time.

3.7 DECOMPOSITION USING MULTI-VALUED DEPENDENCIES
3.7.1 Multi-valued Dependencies (MVDs)
3.7.2 Fourth Normal Form
3.7.3 4NF Decomposition

3.7.1 Multi-valued Dependencies (MVDs)
Let R be a relation schema and let α ⊆ R and β ⊆ R. The multivalued dependency
  α →→ β
holds on R if, in any legal relation r(R), for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], there exist tuples t3 and t4 in r such that:
  t1[α] = t2[α] = t3[α] = t4[α]
  t3[β] = t1[β]
  t3[R – β] = t2[R – β]
  t4[β] = t2[β]
  t4[R – β] = t1[R – β]
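The definition of a multivalued dependency can be checked against a single instance, as sketched below. The relation (name, skill, language) and its rows are hypothetical, chosen only to illustrate the definition. As with functional dependencies, one instance can refute α →→ β but cannot prove that it holds on every legal instance.

```python
def satisfies_mvd(rows, schema, alpha, beta):
    """Check alpha ->> beta on one instance; rows are tuples ordered as `schema`."""
    present = set(rows)
    alpha_pos = [schema.index(a) for a in alpha]
    beta_pos = {schema.index(b) for b in beta}
    for t1 in present:
        for t2 in present:
            if all(t1[p] == t2[p] for p in alpha_pos):
                # t3 takes its beta values from t1 and all other values from t2.
                t3 = tuple(t1[p] if p in beta_pos else t2[p] for p in range(len(schema)))
                if t3 not in present:
                    return False
    # t4 needs no separate check: it is the t3 of the swapped pair (t2, t1).
    return True


schema = ("name", "skill", "language")
complete = [
    ("Ann", "C", "English"),
    ("Ann", "C", "French"),
    ("Ann", "Perl", "English"),
    ("Ann", "Perl", "French"),
]
incomplete = complete[:3]  # ("Ann", "Perl", "French") is missing

print(satisfies_mvd(complete, schema, ["name"], ["skill"]))
print(satisfies_mvd(incomplete, schema, ["name"], ["skill"]))
```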
  • 13.
Theory of MVDs
From the definition of multivalued dependency, we can derive the following rule:
  If α → β, then α →→ β
That is, every functional dependency is also a multivalued dependency.
The closure D+ of D is the set of all functional and multivalued dependencies logically implied by D.
- We can compute D+ from D, using the formal definitions of functional dependencies and multivalued dependencies.
- We can manage with such reasoning for very simple multivalued dependencies, which seem to be most common in practice.
- For complex dependencies, it is better to reason about sets of dependencies using a system of inference rules.

3.7.2 Fourth Normal Form
A relation schema R is in 4NF with respect to a set D of functional and multivalued dependencies if, for all multivalued dependencies in D+ of the form α →→ β, where α ⊆ R and β ⊆ R, at least one of the following holds:
- α →→ β is trivial (i.e., β ⊆ α or α ∪ β = R)
- α is a superkey for schema R
If a relation is in 4NF, it is in BCNF.

Restriction of Multivalued Dependencies
The restriction of D to Ri is the set Di consisting of
- All functional dependencies in D+ that include only attributes of Ri
- All multivalued dependencies of the form α →→ (β ∩ Ri), where α ⊆ Ri and α →→ β is in D+

3.7.3 4NF Decomposition
  result := {R};
  done := false;
  compute D+;
  Let Di denote the restriction of D+ to Ri
  while (not done)
      if (there is a schema Ri in result that is not in 4NF) then
      begin
          let α →→ β be a nontrivial multivalued dependency that holds on Ri
              such that α → Ri is not in Di, and α ∩ β = ∅;
          result := (result − Ri) ∪ (Ri − β) ∪ (α, β);
      end
      else done := true;
Note: each Ri is in 4NF, and the decomposition is lossless-join.

3.8 MORE NORMAL FORMS
- Join dependencies generalize multivalued dependencies.
  - They lead to project-join normal form (PJNF), also called fifth normal form.
- A class of even more general constraints leads to a normal form called domain-key normal form.
  • 14.
- Problem with these generalized constraints: they are hard to reason with, and no sound and complete set of inference rules exists. Hence they are rarely used.

3.9 DATABASE DESIGN PROCESS
3.9.1 E-R Model and Normalization
3.9.2 Naming of Attributes and Relationships
3.9.3 Denormalization for Performance
3.9.4 Other Design Issues

We have assumed schema R is given.
- R could have been generated when converting an E-R diagram to a set of tables.
- R could have been a single relation containing all attributes that are of interest (called the universal relation).
- Normalization breaks R into smaller relations.
- R could have been the result of some ad hoc design of relations, which we then test/convert to normal form.

3.9.1 E-R Model and Normalization
- When an E-R diagram is carefully designed, identifying all entities correctly, the tables generated from the E-R diagram should not need further normalization.
- However, in a real (imperfect) design, there can be functional dependencies from non-key attributes of an entity to other attributes of the entity.
- Example: an employee entity with attributes department_number and department_address, and a functional dependency
    department_number → department_address
- Good design would have made department an entity.
- Functional dependencies from non-key attributes of a relationship set are possible, but rare, since most relationships are binary.

3.9.2 Naming of Attributes and Relationships
A desirable feature of a database design is the unique-role assumption, which means that each attribute name has a unique meaning in the database.
In large database schemas, relationship sets are often named via a concatenation of the names of related entity sets, perhaps with an intervening hyphen or underscore. We have used a few such names, for example inst_sec and student_sec.

3.9.3 Denormalization for Performance
- We may want to use a non-normalized schema for performance.
- For example, displaying customer_name along with account_number and balance requires a join of account with depositor.
Alternative 1: Use a denormalized relation containing attributes of account as well as depositor, with all the above attributes.
- faster lookup
- extra space and extra execution time for updates
- extra coding work for the programmer and possibility of error in the extra code
Alternative 2: Use a materialized view defined as account ⋈ depositor.
  • 15.
- Benefits and drawbacks are the same as above, except that there is no extra coding work for the programmer, which avoids possible errors.

3.9.4 Other Design Issues
- Some aspects of database design are not caught by normalization.
- Examples of bad database design, to be avoided: instead of earnings (company_id, year, amount), use
  - earnings_2004, earnings_2005, earnings_2006, etc., all on the schema (company_id, earnings).
    - These are in BCNF, but they make querying across years difficult and need a new table each year.
  - company_year (company_id, earnings_2004, earnings_2005, earnings_2006)
    - Also in BCNF, but it also makes querying across years difficult and requires a new attribute each year.
    - This is an example of a crosstab, where values for one attribute become column names.
    - Crosstabs are used in spreadsheets and in data analysis tools.

3.10 MODELING TEMPORAL DATA
- Temporal data have an associated time interval during which the data are valid.
- A snapshot is the value of the data at a particular point in time.
- There are several proposals to extend the E-R model by adding valid time to
  - attributes, e.g. the address of a customer at different points in time
  - entities, e.g. the time duration when an account exists
  - relationships, e.g. the time during which a customer owned an account
- But there is no accepted standard.
- Adding a temporal component causes functional dependencies such as
    customer_id → customer_street, customer_city
  to no longer hold, because the address varies over time.
- A temporal functional dependency X → Y holds on schema R if the functional dependency X → Y holds on all snapshots for all legal instances r(R).
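The difference between an ordinary and a temporal functional dependency can be sketched on a small history. The snapshot years and address values below are hypothetical, and the dictionary-of-snapshots representation is an assumption made for illustration; a real check would have to consider all legal instances, not one history.

```python
def holds_fd(rows, lhs, rhs):
    """Check lhs -> rhs on one set of rows, each row a dict of attribute values."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def holds_temporal_fd(snapshots, lhs, rhs):
    """A temporal FD holds if the FD holds on every snapshot."""
    return all(holds_fd(rows, lhs, rhs) for rows in snapshots.values())


# Hypothetical history: customer C0001 moves between the two snapshots.
snapshots = {
    2004: [{"customer_id": "C0001", "customer_street": "Main", "customer_city": "Salem"}],
    2005: [{"customer_id": "C0001", "customer_street": "North", "customer_city": "Erode"}],
}
lhs = ["customer_id"]
rhs = ["customer_street", "customer_city"]

# Every snapshot satisfies the dependency, so the temporal FD holds.
print(holds_temporal_fd(snapshots, lhs, rhs))

# Pooling the whole history into one relation breaks the ordinary FD.
all_rows = [row for rows in snapshots.values() for row in rows]
print(holds_fd(all_rows, lhs, rhs))
```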