Database Normalization
From definition to implementation
29-Dec-14 Mudasir Qazi - mudasirqazi00@gmail.com 1
Contents / Agenda
• Definition
• Objectives of Normalization
• Normalization Process
• Database Anomalies
• Database Dependencies
• Conversion to 1st NF (implementation)
• Conversion to 2nd NF (implementation)
• Conversion to 3rd NF (implementation)
Definition
• Normalization is a process for evaluating and correcting table structures to minimize data
redundancies, thereby reducing the likelihood of data anomalies. The normalization process involves
assigning attributes to tables based on the concepts of the relational database model.
• Normalization works through a series of stages called normal forms. The first three stages are
described as first normal form (1NF), second normal form (2NF), and third normal form (3NF). From
a structural point of view, 2NF is better than 1NF, and 3NF is better than 2NF. For most purposes in
business database design, 3NF is as high as you need to go in the normalization process.
• There are other normal forms: Boyce-Codd (BCNF, sometimes called 3.5NF), 4NF, 5NF, and Domain-Key
(DKNF), but they are less commonly used.
• Although normalization is a very important database design ingredient, you should not assume that
the highest level of normalization is always the most desirable. Generally, the higher the normal
form, the more relational join operations required to produce a specified output and the more
resources required by the database system to respond to end-user queries. A successful design must
also consider end-user demand for fast performance. Therefore, you will occasionally be expected to
denormalize some portions of a database design in order to meet performance requirements.
Denormalization produces a lower normal form; for example, a 3NF table may be converted to 2NF through
denormalization. However, the price you pay for increased performance through denormalization is
greater data redundancy.
Objective of Normalization
• The objective of normalization is to ensure that each table conforms to the
concept of well-formed relations, that is, tables that have the following
characteristics:
• Each table represents a single subject. For example, a course table will contain only
data that directly pertains to courses. Similarly, a student table will contain only
student data.
• No data item will be unnecessarily stored in more than one table (in short, tables
have minimum controlled redundancy). The reason for this requirement is to ensure
that the data are updated in only one place.
• All nonprime attributes in a table are dependent on the primary key—the entire
primary key and nothing but the primary key. The reason for this requirement is to
ensure that the data are uniquely identifiable by a primary key value.
• Each table is void of insertion, update, or deletion anomalies. This is to ensure the
integrity and consistency of the data.
Normalization Process
• To accomplish the objective, the normalization process takes you
through the steps that lead to successively higher normal forms. The
most common normal forms and their basic characteristics are listed in
the table below, and the coming slides cover each of these normal
forms in detail.
Database Anomalies
• Redundancies are unwanted because they can lead to the following
anomalies:
a. Update anomalies: An update anomaly occurs when the same data item is stored in
more than one place and therefore has to be updated more than once. If some copies
are updated but not all, the data become inconsistent.
b. Insertion anomalies: An insertion anomaly occurs when certain attributes cannot be
inserted into the database without the presence of other attributes.
c. Deletion anomalies: A deletion anomaly occurs when data is lost because of the
deletion of other data, or when a deletion is incomplete.
• A deletion anomaly can be worked around with cascading deletes, and the other
anomalies with database triggers, but normalization addresses all three
types of anomalies.
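As a rough illustration of the update and deletion anomalies described above, here is a minimal Python sketch. The rows, the second project number 18, employee 104, and the helper names_for are hypothetical and exist only for this example; only PROJ_NUM 15, EMP_NUM 103 and the name June E. Arbough come from the deck's sample data.

```python
# Hypothetical redundant table: the employee name is repeated once per assignment.
rows = [
    {"PROJ_NUM": 15, "EMP_NUM": 103, "EMP_NAME": "June E. Arbough"},
    {"PROJ_NUM": 18, "EMP_NUM": 103, "EMP_NAME": "June E. Arbough"},
    {"PROJ_NUM": 18, "EMP_NUM": 104, "EMP_NAME": "Hypothetical Employee"},
]


def names_for(table, emp_num):
    """Return the distinct names stored for one employee number."""
    return {r["EMP_NAME"] for r in table if r["EMP_NUM"] == emp_num}


# Update anomaly: only one of the duplicated copies is changed.
rows[0]["EMP_NAME"] = "June Arbough"
update_anomaly = len(names_for(rows, 103)) > 1

# Deletion anomaly: deleting project 18 also deletes the only row
# that records the existence of employee 104.
after_delete = [r for r in rows if r["PROJ_NUM"] != 18]
deletion_anomaly = len(names_for(after_delete, 104)) == 0

print(update_anomaly, deletion_anomaly)
```

Both flags come out true here because the table stores employee facts inside assignment rows; the later slides remove that redundancy by splitting the table.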
Dependencies
• Functional Dependency
Functional dependency is a relationship that exists when one attribute uniquely determines another
attribute.
Example: In a table listing employee characteristics including Social Security Number (SSN) and name, it
can be said that name is functionally dependent upon SSN (or SSN -> name) because an employee's
name can be uniquely determined from their SSN. However, the reverse statement (name -> SSN) is not
true because more than one employee can have the same name but different SSNs.
• Fully Functional Dependency
If attribute B is functionally dependent on a composite key A but not on any subset of that composite
key, the attribute B is fully functionally dependent on A.
• Partial Dependency
A dependency based on only a part of a composite primary key is called a partial dependency. That is, if
an attribute depends on only a part of a composite key, it is partially dependent on that
composite key.
• Transitive Dependency
A transitive dependency is a dependency of one nonprime attribute on another nonprime attribute.
The problem with transitive dependencies is that they still yield data anomalies.
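The functional dependency definition above can be checked against sample data. The sketch below is one possible way to do it in Python; the function name holds and the employee rows (including the SSN values) are made up for illustration. Note that a check like this can only show that the data do not violate a dependency; whether the dependency truly holds is a matter of business rules.

```python
def holds(rows, lhs, rhs):
    """Return True if every lhs value maps to exactly one rhs value in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical data: two employees share a name but have different SSNs.
employees = [
    {"SSN": "111-11-1111", "NAME": "Ann Lee"},
    {"SSN": "222-22-2222", "NAME": "Ann Lee"},
    {"SSN": "333-33-3333", "NAME": "Raj Patel"},
]

ssn_to_name = holds(employees, ["SSN"], ["NAME"])
name_to_ssn = holds(employees, ["NAME"], ["SSN"])
print(ssn_to_name, name_to_ssn)
```

Passing a list of attributes for lhs lets the same function test dependencies on a composite key, which is what the full and partial dependency definitions rely on.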
Conversion to 1st NF
• The term first normal form (1NF) describes the tabular format in which:
• All of the key attributes are defined.
• There are no repeating groups in the table. In other words, each row/column
intersection contains one and only one value, not a set of values.
• All attributes are dependent on the primary key.
• The three main steps to achieve 1NF are:
1. Eliminate Repeating Groups
2. Identify Primary Key
3. Identify all dependencies
• Repeating Group:
A repeating group derives its name from the fact that a group of multiple entries of the
same type can exist for any single key attribute occurrence.
Step 1
Eliminate Repeating Groups
Step 1 is to eliminate repeating groups from the table.
To eliminate the repeating groups, eliminate the nulls by
making sure that each repeating group attribute contains an
appropriate data value. That change converts the table in
Figure 1.1 to the 1NF table in Figure 1.2.
Figure 1.1
Figure 1.2
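Since the figures are not reproduced in this text, the following Python sketch shows the idea of Step 1 on a small stand-in table. It assumes the repeating group appears as null cells under the first row of each project; the function name fill_repeating_groups and every row except project 15 (Evergreen) with employee 103 are hypothetical.

```python
def fill_repeating_groups(rows, group_attrs):
    """Copy the last non-null value of each group attribute into null cells."""
    filled, last = [], {}
    for row in rows:
        new_row = dict(row)
        for attr in group_attrs:
            if new_row[attr] is None:
                # Raises KeyError if the very first row has a null here.
                new_row[attr] = last[attr]
            else:
                last[attr] = new_row[attr]
        filled.append(new_row)
    return filled


# Stand-in for Figure 1.1: project columns are null on continuation rows.
raw = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen", "EMP_NUM": 103},
    {"PROJ_NUM": None, "PROJ_NAME": None, "EMP_NUM": 101},
    {"PROJ_NUM": 18, "PROJ_NAME": "Hypothetical Project", "EMP_NUM": 114},
    {"PROJ_NUM": None, "PROJ_NAME": None, "EMP_NUM": 118},
]

flat = fill_repeating_groups(raw, ["PROJ_NUM", "PROJ_NAME"])
for row in flat:
    print(row)
```

After the fill, every row/column intersection holds exactly one value, which is the property the 1NF definition asks for.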
Step 2
Identify Primary Key
• The layout in Figure 1.2 represents more than a mere cosmetic
change. Even a casual observer will note that PROJ_NUM is not an
adequate primary key because the project number does not uniquely
identify all of the remaining entity (row) attributes. For example, the
PROJ_NUM value 15 can identify any one of five employees. To
maintain a proper primary key that will uniquely identify any attribute
value, the new key must be composed of a combination of PROJ_NUM
and EMP_NUM. For example, using the data shown in Figure 1.2, if you
know that PROJ_NUM = 15 and EMP_NUM = 103, the entries for the
attributes PROJ_NAME, EMP_NAME, JOB_CLASS, CHG_HOUR, and
HOURS must be Evergreen, June E. Arbough, Elect. Engineer, $84.50,
and 23.8, respectively.
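The argument for a composite key can be tested mechanically. The sketch below, in Python, checks whether a set of attributes identifies each row uniquely; is_unique_key is a hypothetical helper, and only the first row uses values quoted in the slide, while the other two rows are invented.

```python
from collections import Counter


def is_unique_key(rows, attrs):
    """Return True if no two rows share the same values for attrs."""
    counts = Counter(tuple(r[a] for a in attrs) for r in rows)
    return all(c == 1 for c in counts.values())


rows = [
    {"PROJ_NUM": 15, "EMP_NUM": 103, "HOURS": 23.8},
    {"PROJ_NUM": 15, "EMP_NUM": 101, "HOURS": 19.4},  # hypothetical row
    {"PROJ_NUM": 18, "EMP_NUM": 103, "HOURS": 12.0},  # hypothetical row
]

proj_alone = is_unique_key(rows, ["PROJ_NUM"])
emp_alone = is_unique_key(rows, ["EMP_NUM"])
composite = is_unique_key(rows, ["PROJ_NUM", "EMP_NUM"])
print(proj_alone, emp_alone, composite)
```

Neither attribute is unique on its own in this sample, but the combination is, which matches the slide's conclusion.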
Step 3
Identify all dependencies
1. The primary key attributes are bold, underlined, and shaded in a different color.
2. The arrows above the attributes indicate all desirable dependencies, that is,
dependencies that are based on the primary key.
3. The arrows below the dependency diagram indicate less desirable dependencies. Two
types of such dependencies exist:
a. Partial dependencies.
b. Transitive dependencies.
Figure 1.3
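Because the dependency diagram itself is missing from this text, the Python sketch below lists the dependencies that the later slides name and classifies each one. The classify function and its labels are an illustrative assumption, not part of the deck; it treats a determinant that is a proper subset of the key as partial and a determinant made only of nonprime attributes as transitive.

```python
PRIMARY_KEY = frozenset({"PROJ_NUM", "EMP_NUM"})

# (determinant, dependents) pairs taken from the schemas on the later slides.
DEPENDENCIES = [
    ({"PROJ_NUM", "EMP_NUM"}, {"HOURS"}),
    ({"PROJ_NUM"}, {"PROJ_NAME"}),
    ({"EMP_NUM"}, {"EMP_NAME", "JOB_CLASS", "CHG_HOUR"}),
    ({"JOB_CLASS"}, {"CHG_HOUR"}),
]


def classify(determinant, key=PRIMARY_KEY):
    """Label a dependency by how its determinant relates to the primary key."""
    det = frozenset(determinant)
    if det == key:
        return "full"
    if det < key:  # proper subset of the composite key
        return "partial"
    if not (det & key):  # determinant contains no key attribute
        return "transitive"
    return "other"


kinds = [classify(det) for det, _ in DEPENDENCIES]
for (det, deps), kind in zip(DEPENDENCIES, kinds):
    print(sorted(det), "->", sorted(deps), ":", kind)
```

The two partial dependencies are what the 2NF conversion removes, and the transitive one is what the 3NF conversion removes.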
Problems with 1st NF
• All relational tables satisfy the 1NF requirements. The problem with the 1NF table
structure shown in Figure 1.3 is that it contains partial dependencies—that is,
dependencies based on only a part of the primary key.
• While partial dependencies are sometimes used for performance reasons, they should
be used with caution. Such caution is warranted because a table that contains partial
dependencies is still subject to data redundancies, and therefore, to various anomalies.
• The data redundancies occur because every row entry requires duplication of data. For
example, if Alice K. Johnson submits her work log, then the user would have to make
multiple entries during the course of a day. For each entry, the EMP_NAME, JOB_CLASS,
and CHG_HOUR must be entered each time even though the attribute values are
identical for each row entered. Such duplication of effort is very inefficient. What’s more,
the duplication of effort helps create data anomalies; nothing prevents the user from
typing slightly different versions of the employee name, the position, or the hourly pay.
For instance, the employee name for EMP_NUM = 102 might be entered as Dave Senior
or D. Senior. The project name also might be entered correctly as Evergreen or
misspelled as Evergeen. Such data anomalies violate the relational database’s integrity
and consistency rules.
Conversion to 2nd NF
• A table is in second normal form (2NF) when:
It is in 1NF and it includes no partial dependencies; that is, no attribute is
dependent on only a portion of the primary key.
• Converting to 2NF is done only when the 1NF table has a composite
primary key. If the 1NF table has a single-attribute primary key, then the
table is automatically in 2NF.
• With the table already in 1NF, the main steps to convert it to 2NF are:
1. Write Each Key Component on a Separate Line.
2. Assign Corresponding Dependent Attributes.
Step 1
Write Each Key Component on a Separate Line
• Write each key component on a separate line; then write the original
(composite) key on the last line. For example:
PROJ_NUM
EMP_NUM
PROJ_NUM EMP_NUM
• Each component will become the key in a new table. In other words,
the original table is now divided into three tables (PROJECT,
EMPLOYEE, and ASSIGNMENT).
Step 2
Assign Corresponding Dependent Attributes
• Use Figure 1.3 to determine those attributes that are dependent on other
attributes. The dependencies for the original key components are found by
examining the arrows below the dependency diagram shown in Figure 1.3.
In other words, the three new tables (PROJECT, EMPLOYEE, and
ASSIGNMENT) are described by the following relational schemas:
PROJECT (PROJ_NUM, PROJ_NAME)
EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS, CHG_HOUR)
ASSIGNMENT (PROJ_NUM, EMP_NUM, ASSIGN_HOURS)
• Because the number of hours spent on each project by each employee is
dependent on both PROJ_NUM and EMP_NUM in the ASSIGNMENT table,
you place those hours in the ASSIGNMENT table as ASSIGN_HOURS.
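The decomposition described in Steps 1 and 2 can be sketched in Python as a set of relational projections over the 1NF rows. The helper project and the variable names are hypothetical, and apart from the first row (which uses the values quoted earlier in the deck) the sample data are invented for illustration.

```python
def project(rows, attrs):
    """Relational projection: keep attrs, drop duplicate rows, preserve order."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attrs, values)))
    return result


one_nf = [
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen", "EMP_NUM": 103,
     "EMP_NAME": "June E. Arbough", "JOB_CLASS": "Elect. Engineer",
     "CHG_HOUR": 84.50, "HOURS": 23.8},
    # The remaining rows are hypothetical.
    {"PROJ_NUM": 15, "PROJ_NAME": "Evergreen", "EMP_NUM": 201,
     "EMP_NAME": "Sam Rivera", "JOB_CLASS": "Systems Analyst",
     "CHG_HOUR": 96.75, "HOURS": 10.0},
    {"PROJ_NUM": 18, "PROJ_NAME": "Hypothetical Project", "EMP_NUM": 103,
     "EMP_NAME": "June E. Arbough", "JOB_CLASS": "Elect. Engineer",
     "CHG_HOUR": 84.50, "HOURS": 12.0},
]

project_table = project(one_nf, ["PROJ_NUM", "PROJ_NAME"])
employee_table = project(
    one_nf, ["EMP_NUM", "EMP_NAME", "JOB_CLASS", "CHG_HOUR"])
# HOURS is renamed to ASSIGN_HOURS in the ASSIGNMENT table.
assignment_table = [
    {"PROJ_NUM": r["PROJ_NUM"], "EMP_NUM": r["EMP_NUM"],
     "ASSIGN_HOURS": r["HOURS"]}
    for r in one_nf
]

print(len(project_table), len(employee_table), len(assignment_table))
```

Each project and each employee now appears once, while ASSIGNMENT keeps one row per project and employee pair.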
Result (Step 1-2)
The results of Steps 1 and 2 are displayed in Figure 2. At this point, most of the
anomalies discussed earlier have been eliminated. For example, if you now want
to add, change, or delete a PROJECT record, you need to go only to the PROJECT
table and make the change to only one row.
Figure 2
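To make the "only one row" claim concrete, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The column types, the second assignment row, and the new project name 'Evergreen II' are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PROJECT (
    PROJ_NUM  INTEGER PRIMARY KEY,
    PROJ_NAME TEXT
);
CREATE TABLE ASSIGNMENT (
    PROJ_NUM     INTEGER,
    EMP_NUM      INTEGER,
    ASSIGN_HOURS REAL,
    PRIMARY KEY (PROJ_NUM, EMP_NUM)
);
INSERT INTO PROJECT VALUES (15, 'Evergreen');
INSERT INTO ASSIGNMENT VALUES (15, 103, 23.8);
INSERT INTO ASSIGNMENT VALUES (15, 101, 19.4);  -- hypothetical row
""")

# Renaming the project touches exactly one row in PROJECT.
cur = conn.execute(
    "UPDATE PROJECT SET PROJ_NAME = 'Evergreen II' WHERE PROJ_NUM = 15")
rows_changed = cur.rowcount

# Every assignment sees the new name through the join.
names = conn.execute("""
    SELECT DISTINCT p.PROJ_NAME
    FROM ASSIGNMENT a JOIN PROJECT p ON p.PROJ_NUM = a.PROJ_NUM
""").fetchall()
print(rows_changed, names)
```

In the original 1NF table the same rename would have had to touch every assignment row of the project.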
Conversion to 3rd NF
• A table is in third normal form (3NF) when:
It is in 2NF and it contains no transitive dependencies.
• The main steps to convert your database to 3NF are:
1. Identify Each New Determinant
2. Identify the Dependent Attributes
3. Remove the Dependent Attributes from Transitive Dependencies
Step 1
Identify Each New Determinant
• For every transitive dependency, write its determinant as a PK for a
new table. A determinant is any attribute whose value determines
other values within a row. If you have three different transitive
dependencies, you will have three different determinants.
• Figure 2 shows only one table that contains a transitive dependency
(JOB_CLASS → CHG_HOUR). Therefore, write the determinant for this
transitive dependency as:
JOB_CLASS
Step 2
Identify the Dependent Attributes
• Identify the attributes that are dependent on each determinant
identified in Step 1 and identify the dependency. In this case, you
write:
JOB_CLASS → CHG_HOUR
• Name the table to reflect its contents and function. In this case, JOB
seems appropriate.
Step 3
Remove the Dependent Attributes from Transitive Dependencies
• Eliminate all dependent attributes in the transitive relationship(s) from each of
the tables that have such a transitive relationship. In this example, eliminate
CHG_HOUR from the EMPLOYEE table shown in Figure 2 to leave the EMPLOYEE
table dependency definition as:
EMP_NUM → EMP_NAME, JOB_CLASS
• Note that the JOB_CLASS remains in the EMPLOYEE table to serve as the FK.
• Draw a new dependency diagram to show all of the tables you have defined in
Steps 1−3. Check the new tables as well as the tables you modified in Step 3 to
make sure that each table has a determinant and that no table contains
inappropriate dependencies.
• PROJECT (PROJ_NUM, PROJ_NAME)
• EMPLOYEE (EMP_NUM, EMP_NAME, JOB_CLASS)
• JOB (JOB_CLASS, CHG_HOUR)
• ASSIGNMENT (PROJ_NUM, EMP_NUM, ASSIGN_HOURS)
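One way the four 3NF schemas above might be declared is sketched below with Python's built-in sqlite3 module. The column types, NOT NULL constraints, and foreign key clauses are assumptions, since the deck gives only attribute names; the sample row uses the values quoted on the Step 2 slide for the 1NF conversion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE PROJECT (
    PROJ_NUM  INTEGER PRIMARY KEY,
    PROJ_NAME TEXT NOT NULL
);
CREATE TABLE JOB (
    JOB_CLASS TEXT PRIMARY KEY,
    CHG_HOUR  REAL NOT NULL
);
CREATE TABLE EMPLOYEE (
    EMP_NUM   INTEGER PRIMARY KEY,
    EMP_NAME  TEXT NOT NULL,
    JOB_CLASS TEXT NOT NULL REFERENCES JOB (JOB_CLASS)
);
CREATE TABLE ASSIGNMENT (
    PROJ_NUM     INTEGER NOT NULL REFERENCES PROJECT (PROJ_NUM),
    EMP_NUM      INTEGER NOT NULL REFERENCES EMPLOYEE (EMP_NUM),
    ASSIGN_HOURS REAL NOT NULL,
    PRIMARY KEY (PROJ_NUM, EMP_NUM)
);
INSERT INTO PROJECT VALUES (15, 'Evergreen');
INSERT INTO JOB VALUES ('Elect. Engineer', 84.50);
INSERT INTO EMPLOYEE VALUES (103, 'June E. Arbough', 'Elect. Engineer');
INSERT INTO ASSIGNMENT VALUES (15, 103, 23.8);
""")

# Joining the four tables rebuilds the original 1NF row.
row = conn.execute("""
    SELECT p.PROJ_NAME, e.EMP_NAME, e.JOB_CLASS, j.CHG_HOUR, a.ASSIGN_HOURS
    FROM ASSIGNMENT a
    JOIN PROJECT  p ON p.PROJ_NUM  = a.PROJ_NUM
    JOIN EMPLOYEE e ON e.EMP_NUM   = a.EMP_NUM
    JOIN JOB      j ON j.JOB_CLASS = e.JOB_CLASS
    WHERE a.PROJ_NUM = 15 AND a.EMP_NUM = 103
""").fetchone()

# JOB_CLASS acts as the FK: an unknown job class is rejected.
try:
    conn.execute(
        "INSERT INTO EMPLOYEE VALUES (999, 'Hypothetical Person', 'No Such Job')")
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

print(row, fk_enforced)
```

The three joins in the query are the cost the Definition slide warns about: higher normal forms need more join operations to reproduce the same output.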
Result (Step 1-3)
