2. What is Normalization?
• Normalization is the process of organizing the data in the database.
• Normalization is used to minimize redundancy in a relation or set of
relations. It is also used to eliminate undesirable characteristics such as
insertion, update, and deletion anomalies.
• Normalization divides a larger table into smaller tables and links them using
relationships.
• Normal forms are the rules used to reduce redundancy in a database table.
3. Problems Without Normalization
The main reason for normalizing relations is to remove insertion, update, and
deletion anomalies.
Failure to eliminate them leaves redundant data in place and can cause data
integrity and other problems as the database grows.
Normalization consists of a series of guidelines that help you
create a good database structure.
If a table is not properly normalized and contains redundant data, it not
only takes up extra storage space but also becomes difficult to maintain and
update without risking data loss.
Insertion, update, and deletion anomalies are very frequent if a database is not
normalized.
4. To understand these anomalies let us take an example of a Student table.
rollno  name  branch  hod    office_tel
401     Akon  CSE     Mr. X  53337
402     Bkon  CSE     Mr. X  53337
403     Ckon  CSE     Mr. X  53337
404     Dkon  CSE     Mr. X  53337
5. Example
The table above holds the data of 4 Computer Science students.
As we can see, the values of the fields branch, hod (Head of Department), and
office_tel are repeated for every student in the same branch of the college.
This is data redundancy.
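The redundancy can be made visible by loading the Student table into a database and counting. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database; the column types and the choice of SQLite are assumptions made only for illustration.

```python
import sqlite3

# In-memory database holding the Student table from the slide.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT, "
    "branch TEXT, hod TEXT, office_tel TEXT)"
)
rows = [
    (401, "Akon", "CSE", "Mr. X", "53337"),
    (402, "Bkon", "CSE", "Mr. X", "53337"),
    (403, "Ckon", "CSE", "Mr. X", "53337"),
    (404, "Dkon", "CSE", "Mr. X", "53337"),
]
conn.executemany("INSERT INTO student VALUES (?, ?, ?, ?, ?)", rows)

# Number of stored rows versus number of distinct branch facts.
total_rows = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
distinct_branch_facts = conn.execute(
    "SELECT COUNT(*) FROM "
    "(SELECT DISTINCT branch, hod, office_tel FROM student)"
).fetchone()[0]

print(total_rows, distinct_branch_facts)  # prints: 4 1
```

Four rows store a single branch fact (CSE, Mr. X, 53337), so three of the four copies are redundant.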
6. Insertion Anomaly
Suppose a new student is admitted. Until the student opts for a
branch, the student's data cannot be inserted, unless we set
the branch information to NULL.
Also, if we have to insert the data of 100 students of the same branch, the
branch information will be repeated for all 100 students.
These scenarios are insertion anomalies.
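The first scenario can be reproduced with a small sketch in Python's sqlite3 module. It assumes the branch columns are declared NOT NULL, which the slides do not state; the student "Ekon" with roll number 405 is a hypothetical new admission.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumption: branch details are mandatory in the single Student table.
conn.execute(
    "CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT NOT NULL, "
    "branch TEXT NOT NULL, hod TEXT NOT NULL, office_tel TEXT NOT NULL)"
)

# A new student who has not chosen a branch yet cannot be stored.
try:
    conn.execute("INSERT INTO student (rollno, name) VALUES (405, 'Ekon')")
    inserted_without_branch = True
except sqlite3.IntegrityError:
    inserted_without_branch = False

# The only workaround is to allow NULLs for the branch columns.
conn.execute(
    "CREATE TABLE student_nullable (rollno INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, branch TEXT, hod TEXT, office_tel TEXT)"
)
conn.execute("INSERT INTO student_nullable (rollno, name) VALUES (405, 'Ekon')")
stored = conn.execute(
    "SELECT branch, hod, office_tel FROM student_nullable WHERE rollno = 405"
).fetchone()

print(inserted_without_branch, stored)  # prints: False (None, None, None)
```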
7. Update Anomaly
What if Mr. X leaves the college,
or is no longer the HOD of the Computer Science department?
In that case every student record of that branch has to be updated, and if we
miss even one record by mistake, the data becomes inconsistent.
This is an update anomaly.
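A missed record is easy to simulate. The sketch below, again using Python's sqlite3 module, changes the HOD to a hypothetical successor "Mr. Y" but deliberately skips roll number 404; both the successor's name and the incomplete WHERE clause are assumptions made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT, "
    "branch TEXT, hod TEXT, office_tel TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        (401, "Akon", "CSE", "Mr. X", "53337"),
        (402, "Bkon", "CSE", "Mr. X", "53337"),
        (403, "Ckon", "CSE", "Mr. X", "53337"),
        (404, "Dkon", "CSE", "Mr. X", "53337"),
    ],
)

# The update misses roll number 404 by mistake.
conn.execute("UPDATE student SET hod = 'Mr. Y' WHERE rollno IN (401, 402, 403)")

# The same branch now reports two different HODs.
hods = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT hod FROM student WHERE branch = 'CSE' ORDER BY hod"
    )
]
print(hods)  # prints: ['Mr. X', 'Mr. Y']
```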
8. Deletion Anomaly
In our Student table, two different kinds of information are kept together:
student information and branch information.
Hence, if the student records are deleted at the end of the academic year,
we also lose the branch information.
This is a deletion anomaly.
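The sketch below contrasts the single table with one possible split into a student table and a branch table, using Python's sqlite3 module. The table names student_flat, student, and branch, and the choice of branch as the linking key, are illustrative assumptions rather than a design prescribed by the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
students = [(401, "Akon"), (402, "Bkon"), (403, "Ckon"), (404, "Dkon")]

# Single table: student and branch information stored together.
conn.execute(
    "CREATE TABLE student_flat (rollno INTEGER PRIMARY KEY, name TEXT, "
    "branch TEXT, hod TEXT, office_tel TEXT)"
)
conn.executemany(
    "INSERT INTO student_flat VALUES (?, ?, 'CSE', 'Mr. X', '53337')", students
)
conn.execute("DELETE FROM student_flat")  # end of the academic year
flat_branch_rows = conn.execute(
    "SELECT COUNT(*) FROM student_flat WHERE branch = 'CSE'"
).fetchone()[0]

# Split tables: branch information lives in its own table.
conn.execute(
    "CREATE TABLE branch (branch TEXT PRIMARY KEY, hod TEXT, office_tel TEXT)"
)
conn.execute(
    "CREATE TABLE student (rollno INTEGER PRIMARY KEY, name TEXT, "
    "branch TEXT REFERENCES branch(branch))"
)
conn.execute("INSERT INTO branch VALUES ('CSE', 'Mr. X', '53337')")
conn.executemany("INSERT INTO student VALUES (?, ?, 'CSE')", students)
conn.execute("DELETE FROM student")  # end of the academic year
surviving_branch = conn.execute(
    "SELECT hod, office_tel FROM branch WHERE branch = 'CSE'"
).fetchone()

print(flat_branch_rows, surviving_branch)  # prints: 0 ('Mr. X', '53337')
```

With the split tables, deleting every student leaves the HOD and office telephone of the CSE branch intact.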
9. Normalization Rule
Normalization rules are divided into the following normal forms:
1. First Normal Form
2. Second Normal Form
3. Third Normal Form
10. First Normal Form (1NF)
For a table to be in the First Normal Form, it should follow these
4 rules:
1. Each attribute/column should hold only single (atomic) values.
2. All values stored in a column should be of the same domain.
3. All the columns in a table should have unique names.
4. The order in which the data is stored should not matter.
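Rule 1 is the one most often violated in practice. The sketch below shows one way to bring a multi-valued column into atomic form in plain Python; the subject column and its values are hypothetical and do not come from the Student table above.

```python
# Hypothetical rows where one column holds several values at once.
unnormalized = [
    (401, "Akon", "OS, DBMS"),
    (402, "Bkon", "Java"),
]

# Emit one row per (student, subject) pair so every value is atomic.
atomic_rows = [
    (rollno, name, subject.strip())
    for rollno, name, subjects in unnormalized
    for subject in subjects.split(",")
]

for row in atomic_rows:
    print(row)
# prints:
# (401, 'Akon', 'OS')
# (401, 'Akon', 'DBMS')
# (402, 'Bkon', 'Java')
```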
17. Second Normal Form (2NF)
For a table to be in the Second Normal Form:
1. It should be in the First Normal Form.
2. It should have no partial dependency, that is, no non-key column should
depend on only part of a composite primary key.
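A partial dependency can be spotted by checking whether part of a composite key is enough to determine a non-key column. The sketch below uses a hypothetical score table keyed by (student_id, subject_id); the table, its values, and the helper name determines are all assumptions for illustration. A check on sample rows can only suggest a dependency, since the real rule comes from the meaning of the data.

```python
def determines(rows, lhs, rhs):
    """Return True if, in these rows, equal lhs values always give equal rhs values."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # Remember the first value seen for this key and compare later ones to it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical table with the composite key (student_id, subject_id).
score = [
    {"student_id": 401, "subject_id": 1, "marks": 70, "teacher": "Teacher A"},
    {"student_id": 401, "subject_id": 2, "marks": 75, "teacher": "Teacher B"},
    {"student_id": 402, "subject_id": 1, "marks": 80, "teacher": "Teacher A"},
]

# teacher is fixed by subject_id alone: a partial dependency.
teacher_on_subject = determines(score, ["subject_id"], ["teacher"])
# marks needs the whole key, so neither half determines it.
marks_on_subject = determines(score, ["subject_id"], ["marks"])
marks_on_student = determines(score, ["student_id"], ["marks"])

print(teacher_on_subject, marks_on_subject, marks_on_student)
# prints: True False False
```

In this sketch the teacher column would move to a separate subject table keyed by subject_id, leaving marks in the score table.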