Please do not copy without permission. © ExploreAI 2023.
Normalisation
Views and normalisation
Normalisation is a database design technique used to organise data in a relational database. It
reduces data redundancy and ensures data integrity.
Normalisation involves breaking down a large, complex
table into smaller, related tables and establishing
relationships between them.
The process is guided by a set of rules and principles,
often described using normal forms.
The most common normal forms are:
1. _First Normal Form (1NF)_
2. _Second Normal Form (2NF)_
3. _Third Normal Form (3NF)_
and so on, up to higher levels of normalisation (such as 5NF).
Advantages
1. It helps maintain data integrity by reducing data redundancy
and preventing insertion, update, and deletion anomalies.
2. Reduced data redundancy leads to more efficient storage
utilisation.
3. It's easier to maintain and update because we don’t need to
update multiple places to change a single piece of data.
4. Normalised databases scale well: narrower tables can be indexed
efficiently and integrity constraints keep larger datasets consistent.
5. Normalised databases support flexible querying, because
complex queries can be composed through table relationships.
Disadvantages
1. Normalised databases may be complex to design and query
because the data are spread across multiple tables.
2. Query performance may be slow, especially with complex
joins and large datasets, resulting in long execution times.
3. Normalised designs generally suit write-heavy workloads. Read-heavy
workloads may suffer because related data must be reassembled with joins,
and writes that span several tables still incur foreign-key checks.
4. Normalisation reduces redundancy but adds some storage
overhead for the key columns and indexes that link tables.
5. Higher normal forms (e.g. 4NF, 5NF) can be challenging and
unnecessary, potentially resulting in overly complex designs.
Denormalisation
Denormalisation, on the other hand, is a database design technique that intentionally introduces redundancy
into a database by combining tables or adding redundant data to one or more tables.
Advantages
1. Redundant data offers faster query performance for complex
queries and reporting due to reduced join complexity.
2. It simplifies schema structures, aiding query development
and maintenance, ideal for reporting and analytics.
3. Denormalisation suits read-heavy scenarios, prioritising
faster queries over write complexities.
4. Fewer joins in denormalised databases mean simpler SQL
queries and reduced risk of performance issues.
Disadvantages
1. Write operations, such as inserts, updates, and deletes, can
be slower and more complex due to redundancy.
2. Data integrity is challenging because ensuring consistent
updates to redundant data is complex and error-prone.
3. Redundant data consumes more storage space, which can
be a significant concern in systems with large datasets.
4. As the database complexity increases, managing schema
changes becomes increasingly challenging.
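To make the trade-off concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (employees, states, employee_report) and the rename of Joburg to Johannesburg are assumptions made for illustration; the sample rows come from the employee table used in the later slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE states (state_code INTEGER PRIMARY KEY, home_state TEXT);
CREATE TABLE employees (
    employee_id TEXT PRIMARY KEY,
    name        TEXT,
    state_code  INTEGER REFERENCES states(state_code)
);
INSERT INTO states VALUES (26, 'Cape Town'), (56, 'Joburg'), (5, 'Nairobi');
INSERT INTO employees VALUES
    ('E001', 'Carmel', 26), ('E002', 'Stefanie', 56), ('E003', 'Lisa', 5);

-- Denormalised reporting copy: home_state is stored redundantly per employee.
CREATE TABLE employee_report AS
SELECT e.employee_id, e.name, e.state_code, s.home_state
FROM employees AS e JOIN states AS s ON s.state_code = e.state_code;
""")

# Reads against the copy need no join.
report_rows = conn.execute(
    "SELECT name, home_state FROM employee_report ORDER BY employee_id"
).fetchall()
print(report_rows)

# A change to the source table is not reflected in the redundant copy.
conn.execute("UPDATE states SET home_state = 'Johannesburg' WHERE state_code = 56")
stale = conn.execute(
    "SELECT home_state FROM employee_report WHERE employee_id = 'E002'"
).fetchone()[0]
print(stale)  # still the old value
```

The read is simpler, but the copy has to be refreshed whenever the source tables change, which is the write-side cost listed under the disadvantages.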
Data anomalies
Data anomalies are unexpected occurrences in a dataset that can erode data accuracy,
reliability, and usability, eventually undermining data quality for decision-making, analysis, and
reporting.
Causes
Data anomalies can arise from a variety of causes, including:
1. Data entry errors: Human-made mistakes like
typos or missing data.
2. Data storage inconsistencies: Differences in
how data are stored.
3. Integration issues: Merging data from diverse
sources with format conflicts.
4. System glitches: Technical problems in data
processing.
Prevention and mitigation
Strategies include:
1. Prevention:
a. Normalise data.
b. Apply validation rules.
c. Implement data entry controls.
2. Mitigation:
a. Perform data cleansing.
b. Conduct data audits.
c. Establish error-handling protocols.
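One way to apply validation rules is to declare them as constraints so the database rejects bad rows at entry. The sketch below uses Python's built-in sqlite3 module; the specific rules (an ID shaped like E followed by three characters, a non-blank name) and the helper try_insert are hypothetical choices for illustration, not rules stated in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employees (
        employee_id TEXT PRIMARY KEY CHECK (employee_id LIKE 'E___'),
        name        TEXT NOT NULL CHECK (length(trim(name)) > 0)
    )
""")

def try_insert(employee_id, name):
    """Return True if the row passes the validation rules, else False."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO employees VALUES (?, ?)", (employee_id, name))
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert("E001", "Carmel"),    # valid row
    try_insert("E001", "Carmel"),    # duplicate primary key
    try_insert("E002", "   "),       # blank name
    try_insert("2", "Stefanie"),     # malformed ID
]
print(results)
```

Only the first insert is accepted; the other three are rejected by the primary key and CHECK constraints before they can reach the table.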
Managing data anomalies
Data integration: Integrating data from multiple sources can increase data anomalies.
Data integrity: Anomalies can compromise data integrity by violating database rules and constraints.
Anomalies vs outliers: Outliers fall outside the expected range but, unlike anomalies, might not imply errors.
Data cleaning: Detecting and rectifying anomalies involves data cleaning to correct errors and inconsistencies.
Data governance: Vital for preventing, detecting, and addressing data anomalies to uphold data quality.
Continuous monitoring: Continuous monitoring and auditing are essential for maintaining anomaly-free data.
Data anomalies – Update anomaly
An update anomaly occurs in a database when updating a piece of information requires
modifying multiple rows or records in a table, and failing to do so can lead to inconsistencies or
inaccuracies in the data.
Let's say Carmel is promoted to
Head Chef and the company
wants to update this information.
Employee_id  Name      Job_code  Job_title  State_code  Home_state
E001         Carmel    J01       Chef       26          Cape Town
E001         Carmel    J02       Waiter     26          Cape Town
E002         Stefanie  J02       Waiter     56          Joburg
E002         Stefanie  J03       Bartender  56          Joburg
E003         Lisa      J01       Chef       5           Nairobi
In a denormalised table like this, the same fact is stored in
several rows: Carmel's details appear in two rows, and the title
for Job_code J01 appears in both Carmel's and Lisa's rows.
An update has to change every copy of the fact.
If we forget to update one of these
rows, it would lead to an
inconsistency in the data.
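The sketch below reproduces the problem in plain Python. It assumes the promotion is recorded by renaming the title stored against Job_code J01, and the helper conflicting_titles is a hypothetical name; the state columns are omitted for brevity.

```python
# Denormalised rows: the title for each Job_code is repeated per employee.
rows = [
    {"Employee_id": "E001", "Name": "Carmel", "Job_code": "J01", "Job_title": "Chef"},
    {"Employee_id": "E001", "Name": "Carmel", "Job_code": "J02", "Job_title": "Waiter"},
    {"Employee_id": "E002", "Name": "Stefanie", "Job_code": "J02", "Job_title": "Waiter"},
    {"Employee_id": "E002", "Name": "Stefanie", "Job_code": "J03", "Job_title": "Bartender"},
    {"Employee_id": "E003", "Name": "Lisa", "Job_code": "J01", "Job_title": "Chef"},
]

def conflicting_titles(table):
    """Return the job codes that map to more than one title."""
    titles = {}
    for row in table:
        titles.setdefault(row["Job_code"], set()).add(row["Job_title"])
    return {code: names for code, names in titles.items() if len(names) > 1}

before = conflicting_titles(rows)

# Incomplete update: Carmel's J01 row is changed, Lisa's J01 row is forgotten.
for row in rows:
    if row["Employee_id"] == "E001" and row["Job_code"] == "J01":
        row["Job_title"] = "Head Chef"

after = conflicting_titles(rows)
print(before)
print(after)
```

Before the update no job code has conflicting titles; afterwards J01 carries both the old and the new title, which is exactly the inconsistency described above.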
Data anomalies – Insertion anomaly
An insertion anomaly occurs when a new record can’t be added without introducing
incomplete data because certain required information is not yet available for the entity being
represented.
A new employee, John, doesn't
yet have a Job_code or
State_code.
Employee_id  Name      Job_code  Job_title  State_code  Home_state
E001         Carmel    J01       Chef       26          Cape Town
E002         Stefanie  J02       Waiter     56          Joburg
E003         Lisa      J01       Chef       5           Nairobi
E004         John      NULL      NULL       NULL        NULL
In a denormalised table like this,
you might be forced to insert
incomplete or inaccurate data.
This is common where one table describes several
entities at once, so a row for one entity (the employee)
cannot be completed without details of the others (the job and the location).
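The slide's table accepts NULLs, so John's row is stored incomplete. If the job columns are required instead, John cannot be added at all. The sketch below shows that stricter case with Python's built-in sqlite3 module; the table names, the NOT NULL declarations, and the helper can_insert are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalised: employee and job facts share one row, and the job is required.
CREATE TABLE employee_jobs_wide (
    employee_id TEXT NOT NULL,
    name        TEXT NOT NULL,
    job_code    TEXT NOT NULL,
    job_title   TEXT NOT NULL,
    PRIMARY KEY (employee_id, job_code)
);
-- Normalised: the employee exists independently of any job.
CREATE TABLE employees (
    employee_id TEXT NOT NULL PRIMARY KEY,
    name        TEXT NOT NULL
);
""")

def can_insert(sql, params):
    """Return True if the insert succeeds, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

wide_ok = can_insert(
    "INSERT INTO employee_jobs_wide VALUES (?, ?, ?, ?)",
    ("E004", "John", None, None),
)
normalised_ok = can_insert("INSERT INTO employees VALUES (?, ?)", ("E004", "John"))
print(wide_ok, normalised_ok)
```

With a separate employees table, John can be recorded now and linked to a job later.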
Data anomalies – Deletion anomaly
A deletion anomaly is a type of data anomaly in a database where deleting a single piece of
data results in the unintentional loss of related data that are still valid and necessary.
Let's say Lisa decides to leave the
company, and her record is
removed from the table.
Employee_id  Name      Job_code  Job_title  State_code  Home_state
E001         Carmel    J01       Chef       26          Cape Town
E002         Stefanie  J02       Waiter     56          Joburg
Removed row:
E003         Lisa      J01       Chef       5           Nairobi
All information related to Nairobi
(State_code 5) was removed
from the table.
Now the table no longer records that
State_code 5 refers to Nairobi, which is not
what we intended.
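A short sketch of the same deletion, using Python's built-in sqlite3 module, compares the single wide table with a design that keeps locations in their own table. The table names employees_wide, employees, and states are assumptions, and the job columns are left out to keep the example small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalised: the code-to-name mapping lives only inside employee rows.
CREATE TABLE employees_wide (
    employee_id TEXT PRIMARY KEY, name TEXT, state_code INTEGER, home_state TEXT
);
INSERT INTO employees_wide VALUES
    ('E001', 'Carmel', 26, 'Cape Town'),
    ('E002', 'Stefanie', 56, 'Joburg'),
    ('E003', 'Lisa', 5, 'Nairobi');

-- Normalised alternative: locations are stored once, in their own table.
CREATE TABLE states (state_code INTEGER PRIMARY KEY, home_state TEXT);
CREATE TABLE employees (
    employee_id TEXT PRIMARY KEY, name TEXT,
    state_code INTEGER REFERENCES states(state_code)
);
INSERT INTO states VALUES (26, 'Cape Town'), (56, 'Joburg'), (5, 'Nairobi');
INSERT INTO employees VALUES
    ('E001', 'Carmel', 26), ('E002', 'Stefanie', 56), ('E003', 'Lisa', 5);

-- Lisa leaves the company.
DELETE FROM employees_wide WHERE employee_id = 'E003';
DELETE FROM employees WHERE employee_id = 'E003';
""")

wide_lookup = conn.execute(
    "SELECT home_state FROM employees_wide WHERE state_code = 5"
).fetchone()
normalised_lookup = conn.execute(
    "SELECT home_state FROM states WHERE state_code = 5"
).fetchone()
print(wide_lookup, normalised_lookup)
```

In the wide table the lookup for State_code 5 finds nothing after the delete, while the separate states table still knows that code 5 is Nairobi.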
Key normalisation principles
● To normalise a database, one must know what
the requirements are for each of the three
normal forms that we’ll go over.
● One of the key requirements to remember is
that normal forms are progressive. That is, in
order to have 3NF we must have 2NF, and in
order to have 2NF we must have 1NF.
Atomic values: Each column in a table should contain
only atomic (indivisible) values. This eliminates
repeating groups and ensures data are stored in a
tabular format (1NF).
No partial dependencies: In a table with a composite
primary key, non-key attributes should be fully
functionally dependent on the entire primary key
(2NF).
No transitive dependencies: Non-key attributes
should not depend on other non-key attributes within
the same table (3NF).
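Partial and transitive dependencies are both statements of the form "these columns determine those columns". A rough way to probe them on sample data is sketched below in plain Python, using the employee rows from the anomaly slides. The function name fd_holds is hypothetical, and a check on sample rows can only refute a dependency, not prove that it holds for all future data.

```python
COLUMNS = ("Employee_id", "Name", "Job_code", "Job_title", "State_code", "Home_state")
ROWS = [
    ("E001", "Carmel", "J01", "Chef", 26, "Cape Town"),
    ("E001", "Carmel", "J02", "Waiter", 26, "Cape Town"),
    ("E002", "Stefanie", "J02", "Waiter", 56, "Joburg"),
    ("E002", "Stefanie", "J03", "Bartender", 56, "Joburg"),
    ("E003", "Lisa", "J01", "Chef", 5, "Nairobi"),
]

def fd_holds(lhs, rhs, rows=ROWS, columns=COLUMNS):
    """Return True if, in these rows, equal lhs values always imply equal rhs values."""
    index = {name: i for i, name in enumerate(columns)}
    seen = {}
    for row in rows:
        key = tuple(row[index[c]] for c in lhs)
        value = tuple(row[index[c]] for c in rhs)
        # setdefault stores the first value seen for a key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Job_title is determined by Job_code alone, i.e. by part of the key
# {Employee_id, Job_code}: a partial dependency (a 2NF concern).
partial = fd_holds(["Job_code"], ["Job_title"])
# Home_state is determined by the non-key attribute State_code:
# a transitive dependency (a 3NF concern).
transitive = fd_holds(["State_code"], ["Home_state"])
# Employee_id alone does not determine Job_title (E001 is both Chef and Waiter).
employee_to_title = fd_holds(["Employee_id"], ["Job_title"])
print(partial, transitive, employee_to_title)
```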
First Normal Form (1NF)
1NF is the initial stage of database normalisation. It
sets the foundation for organising data in a relational
database in a structured and consistent manner.
A table is said to be in 1NF if it meets the following
criteria:
1. Each cell in the table must not hold more than
one value, which is referred to as atomicity.
2. The table must have a primary key for
identification.
3. The table should have no duplicated rows or
columns.
How 1NF works
1. Atomicity: "Atomic" means that a value cannot
be divided into smaller parts that have meaning
on their own. Atomic values ensure that data are
stored at their smallest meaningful level. This
prevents the inclusion of arrays, lists, or multiple
values within a single cell.
2. Tabular structure: Data must be organised in a
tabular structure with rows and columns, where
each column represents a distinct attribute, and
each cell holds a single atomic value.
1NF lays the groundwork for addressing data anomalies by enforcing atomic values and a consistent
tabular structure. On its own it does not remove redundancy (a 1NF table can still repeat the same
facts across rows), but it exposes the dependencies that 2NF and 3NF go on to remove.
_Example:_
Suppose we have the following
unnormalised table:
Employee_id  Name      Job_codes  Job_titles         State_codes  Home_states
E001         Carmel    J01, J02   Chef, Waiter       26, 26       Cape Town
E002         Stefanie  J02, J03   Waiter, Bartender  56, 56       Joburg
E003         Lisa      J01        Chef               5            Nairobi
Several columns contain
non-atomic, comma-separated
values. We need to break down
these columns into separate rows,
for a more normalised structure.
Employee_id  Name      Job_code  Job_title  State_code  Home_state
E001         Carmel    J01       Chef       26          Cape Town
E001         Carmel    J02       Waiter     26          Cape Town
E002         Stefanie  J02       Waiter     56          Joburg
E002         Stefanie  J03       Bartender  56          Joburg
E003         Lisa      J01       Chef       5           Nairobi
In our 1NF table, each column
contains atomic values and each
row is uniquely identified by the
combination of Employee_id and Job_code.
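The step from the unnormalised table to the 1NF table can be sketched in plain Python as follows. The function name to_1nf is hypothetical, and the code assumes that the job codes and job titles in a row are listed in matching order and that the repeated state codes within a row are identical, as they are in the sample data.

```python
unnormalised = [
    ("E001", "Carmel", "J01, J02", "Chef, Waiter", "26, 26", "Cape Town"),
    ("E002", "Stefanie", "J02, J03", "Waiter, Bartender", "56, 56", "Joburg"),
    ("E003", "Lisa", "J01", "Chef", "5", "Nairobi"),
]

def to_1nf(rows):
    """Emit one row per (employee, job) pair so every cell holds a single value."""
    result = []
    for emp_id, name, job_codes, job_titles, state_codes, home_state in rows:
        codes = [c.strip() for c in job_codes.split(",")]
        titles = [t.strip() for t in job_titles.split(",")]
        # The repeated state codes are identical within a row, so keep the first.
        state_code = int(state_codes.split(",")[0])
        for code, title in zip(codes, titles):
            result.append((emp_id, name, code, title, state_code, home_state))
    return result

first_nf = to_1nf(unnormalised)
for row in first_nf:
    print(row)

# Each row is identified by the pair (Employee_id, Job_code).
keys = [(row[0], row[2]) for row in first_nf]
print(len(keys) == len(set(keys)))
```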
Second Normal Form (2NF)
2NF is the next level of database normalisation.
A table is considered to be in 2NF if it meets the
following criteria:
1. It is already in 1NF (First Normal Form).
2. It does not contain partial dependencies, which
means that non-key attributes are fully
functionally dependent on the entire primary
key.
How 2NF works
1. Identify primary key: Find the unique identifier
for each row, which can be a single attribute or
a combination.
2. Ensure full dependencies: Non-key attributes
must depend entirely on the primary key, with
no partial dependencies.
3. Table splitting (if needed): If partial
dependencies exist, consider dividing the table
into related tables, each with its own primary
key.
4. Create relationships: When tables are split,
establish links using foreign keys in child tables
referencing parent table primary keys for data
consistency.
2NF addresses data anomalies by eliminating partial
dependencies, which enhances data integrity,
reduces update anomalies, improves insertion and
deletion operations, and promotes structured data
organisation. It is crucial for maintaining a reliable
and efficient relational database.
_Example:_
Our 1NF table has a partial
dependency issue because the
combination of {Employee_id,
Job_code} forms the primary key,
and non-key attributes depend on only
part of it: Job_title depends only on Job_code,
while Name, State_code, and Home_state
depend only on Employee_id.
Employee_id  Name      Job_code  Job_title  State_code  Home_state
E001         Carmel    J01       Chef       26          Cape Town
E001         Carmel    J02       Waiter     26          Cape Town
E002         Stefanie  J02       Waiter     56          Joburg
E002         Stefanie  J03       Bartender  56          Joburg
E003         Lisa      J01       Chef       5           Nairobi
To bring this table towards 2NF, we
can split it into two separate
tables that are linked by the
Employee_id column:
1. Employee information.
Employee_id  Name      State_code  Home_state
E001         Carmel    26          Cape Town
E002         Stefanie  56          Joburg
E003         Lisa      5           Nairobi
2. Job-related information.
Employee_id  Job_code  Job_title
E001         J01       Chef
E001         J02       Waiter
E002         J02       Waiter
E002         J03       Bartender
E003         J01       Chef
Note that in the job-related table the key is still {Employee_id, Job_code}
and Job_title depends only on Job_code, so a strict 2NF design would also
move Job_title into its own table keyed by Job_code.
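The decomposition can be sketched in plain Python. This sketch goes one step further than the two tables shown: because Job_title depends on Job_code alone, it is placed in its own jobs lookup, leaving a pure link between employees and jobs. The names employees, jobs, and employee_jobs are assumptions made for illustration.

```python
first_nf = [
    ("E001", "Carmel", "J01", "Chef", 26, "Cape Town"),
    ("E001", "Carmel", "J02", "Waiter", 26, "Cape Town"),
    ("E002", "Stefanie", "J02", "Waiter", 56, "Joburg"),
    ("E002", "Stefanie", "J03", "Bartender", 56, "Joburg"),
    ("E003", "Lisa", "J01", "Chef", 5, "Nairobi"),
]

employees = {}         # Employee_id -> (Name, State_code, Home_state)
jobs = {}              # Job_code -> Job_title
employee_jobs = set()  # (Employee_id, Job_code) pairs linking the two

for emp_id, name, job_code, job_title, state_code, home_state in first_nf:
    employees[emp_id] = (name, state_code, home_state)
    jobs[job_code] = job_title
    employee_jobs.add((emp_id, job_code))

# Joining the pieces back together recovers the original rows (no information lost).
rebuilt = sorted(
    (emp_id, employees[emp_id][0], job_code, jobs[job_code],
     employees[emp_id][1], employees[emp_id][2])
    for emp_id, job_code in employee_jobs
)
print(len(employees), len(jobs), len(employee_jobs))
print(rebuilt == sorted(first_nf))
```

Each employee's details and each job title are now stored exactly once, so renaming a job or correcting a name touches a single entry.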
Third Normal Form (3NF)
A table is considered to be in 3NF if it satisfies the
following criteria:
1. It is already in 2NF.
2. It does not contain transitive dependencies: In
3NF, a table should ensure that non-key
attributes (attributes not part of the primary
key) are not transitively dependent on the
primary key through other non-key attributes.
How 3NF works
1. Eliminates transitive dependencies: Transitive
dependencies occur when a non-key attribute is
dependent on another non-key attribute. 3NF
requires that all non-key attributes be directly
dependent on the primary key.
2. Enhances data integrity: Maintains data
accuracy by preventing non-key attributes from
relying on other non-key attributes.
3. Reduces update anomalies: Minimises the risk
of unintended data inconsistencies during
updates.
4. Organised data: Encourages structured data by
linking non-key attributes directly to the primary
key.
3NF addresses data anomalies, particularly those
associated with transitive dependencies, by
enforcing direct dependencies on the primary key,
enhancing data integrity, reducing update
anomalies, and promoting structured data
organisation.
_Example:_
In the employee information table,
the primary key is {Employee_id},
and there are no partial
dependencies. However, there is
a transitive dependency:
Home_state is determined by State_code,
and State_code is determined by Employee_id,
so Home_state depends on the primary key
only indirectly, through State_code.
Employee_id  Name      State_code  Home_state
E001         Carmel    26          Cape Town
E002         Stefanie  56          Joburg
E003         Lisa      5           Nairobi
To achieve 3NF, we would split
this table into two separate tables
linked by State_code:
1. Employee information.
Employee_id  Name      State_code
E001         Carmel    26
E002         Stefanie  56
E003         Lisa      5
2. Location information.
State_code  Home_state
26          Cape Town
56          Joburg
5           Nairobi
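As a rough sketch of how the two 3NF tables can be linked in practice, the example below declares State_code as a foreign key using Python's built-in sqlite3 module. The table names employees and states and the test row for John with code 99 are assumptions made for illustration. Note that SQLite only enforces foreign keys when the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE states (
    state_code INTEGER PRIMARY KEY,
    home_state TEXT NOT NULL
);
CREATE TABLE employees (
    employee_id TEXT PRIMARY KEY,
    name        TEXT NOT NULL,
    state_code  INTEGER NOT NULL REFERENCES states(state_code)
);
INSERT INTO states VALUES (26, 'Cape Town'), (56, 'Joburg'), (5, 'Nairobi');
INSERT INTO employees VALUES
    ('E001', 'Carmel', 26), ('E002', 'Stefanie', 56), ('E003', 'Lisa', 5);
""")

# A join on State_code rebuilds the original employee information table.
rebuilt = conn.execute("""
    SELECT e.employee_id, e.name, e.state_code, s.home_state
    FROM employees AS e JOIN states AS s ON s.state_code = e.state_code
    ORDER BY e.employee_id
""").fetchall()
print(rebuilt)

# The foreign key rejects an employee whose State_code has no location row.
try:
    conn.execute("INSERT INTO employees VALUES ('E004', 'John', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

Each location name is stored once, and the join shows that the split loses no information.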
