Career Point University,
Kota (Raj.)
Minor 2 Assignment
Topic : Normalization
Presented To: Rohit Singh, Assistant Professor, CSE Dept.
Presented By: Wakil Kumar, UID- K12467, B.Tech (CSE), 4th Sem / 2nd Year
Normalization
In relational database design, normalization is the process of organizing data to minimize redundancy.
Normalization usually involves dividing a database into two or more tables and defining relationships
between the tables. The objective is to isolate data so that additions, deletions, and modifications of a
field can be made in just one table and then propagated through the rest of the database via the defined
relationships.
There are three main normal forms, each building on the one before it:
• First Normal Form (1NF): Each field in a row holds a single value, and there are no
repeating groups of fields. For example, an employee table would hold one birthdate per
row rather than a list of dates packed into one field.
• Second Normal Form (2NF): The table is in 1NF and every non-key field depends on the
whole of the primary key, not on just part of it. This only matters when the key is made up
of more than one field.
• Third Normal Form (3NF): The table is in 2NF and no non-key field depends on another
non-key field. For example, if an employee table held both a department code and a
department name, the name would depend on the code, so the two would be moved into a
separate department table that the employee table references by department code. Any
change to a department name would then be made in one place only.
The underlying ideas in normalization are simple enough. Through normalization we want to
design for our relational database a set of files that (1) contain all the data necessary for the
purposes that the database is to serve, (2) have as little redundancy as possible, (3)
accommodate multiple values for types of data that require them, (4) permit efficient updates
of the data in the database, and (5) avoid the danger of losing data unknowingly.
Normalization can be viewed as a series of steps (i.e., levels) designed, one after another, to
deal with ways in which tables can be "too complicated for their own good". The purpose of
normalization is to reduce the chances for anomalies to occur in a database. The definitions of
the various levels of normalization illustrate complications to be eliminated in order to reduce
the chances of anomalies. At all levels and in every case of a table with a complication, the
resolution of the problem turns out to be the establishment of two or more simpler tables
which, as a group, contain the same information as the original table but which, because of
their simpler individual structures, lack the complication.
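As a minimal sketch of that last point, the Python below splits one table into two simpler ones and joins them back together. The rows borrow identifiers from the shipping example in this report, and the helper names `project` and `natural_join` are hypothetical, chosen only for illustration.

```python
# Illustrative rows only: (ContractID, EquipmentID, SizeTypeCode).
original = {
    ("PONL40078678", "PONL12345", "4040"),
    ("PONL40078678", "PONL34567", "4020"),
    ("TNT9439287-5", "PONL34567", "4020"),
}

def project(rows, indexes):
    """Keep only the chosen columns; using a set drops duplicate rows."""
    return {tuple(row[i] for i in indexes) for row in rows}

# Two simpler tables that share the EquipmentID column.
shipment_equipment = project(original, (0, 1))  # ContractID, EquipmentID
equipment = project(original, (1, 2))           # EquipmentID, SizeTypeCode

def natural_join(left, right):
    """Join on EquipmentID (last column of left, first column of right)."""
    return {l + r[1:] for l in left for r in right if l[-1] == r[0]}

rebuilt = natural_join(shipment_equipment, equipment)
print(rebuilt == original)
print(len(equipment), "equipment rows instead of", len(original))
```

The join gives back exactly the rows we started with, while the size and type of container PONL34567 is now stored once rather than twice.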
To explain this concept we will use a typical business set of data – that commonly found
in the shipment of goods.
The shipment of goods between a consignee (who gets the goods) and a consignor (who sends
them) is also known as consignment, delivery, movement or transport of goods. It typically
involves a third-party organization that takes on the role of the freight forwarder. The forwarder
manages the scheduling of the required transport (ship, plane, train, truck, etc.) and may supply
the equipment necessary for efficient carriage of the goods, for example customized containers
for the holds of aircraft or refrigerated sea containers for carrying perishable goods. The
contract of a shipment between the carrier and either the consignee or the consignor is often
established by a document known as a shipping or forwarding instruction, or a bill, waybill or bill
of lading.
The following table (Table 1) gives three sample instances of data that may typically be
used for shipping. The first two rows refer to a consignment of goods being shipped by
sea and then road in a shipping container using two carriers. The third row relates to a
separate air freight shipment.
ContractID(1)  Transport Mode(2)  CarrierID(3)         EquipmentID(4)  SizeTypeCode(5)  SealNumber(6)  SealIssuer(7)
PONL40078678   Sea                P&ONL                PONL12345       4040             ABX123456      ABC
                                                       PONL34567       4020             ABX123457      ABC
TNT9439287-5   Road               TNT                  PONL34567       4020             ABX123457      ABC
                                                                                        GGDFG99        Customs
180-1234567    Air                KE-Korean Air Cargo  KAL12345        747-K10          XXX664
Table 1 Shipment Table
(1) identifies a shipping contract that allows the supplier to ship goods under specific
freight conditions or the carrier to bill against a specific contract.
(2) specifies the method or type of transportation of the shipment. Typically this may be
sea, air or road.
(3) the identifier assigned by the agency to the carrier. This identifies the carrier being
used for this stage of the shipment.
(4) identifies the information about one set of transport equipment related to the shipment.
The most common example of transport equipment is a shipping container.
(5) the size and type of the transport equipment.
(6) identifies the seal number of the equipment.
(7) which party issues and is responsible for the seal.
We shall use this set of data to demonstrate the principles of normalization.
Of course, we can present this data in any number of formats. Here is an XML instantiation of the same
data.
<Shipment>
  <ContractId>PONL40078678</ContractId>
  <TransportModeId>Sea</TransportModeId>
  <CarrierId>P&amp;ONL</CarrierId>
  <EquipmentId>PONL12345</EquipmentId>
  <EquipmentId>PONL34567</EquipmentId>
  <SizeTypeCode>4040</SizeTypeCode>
  <SizeTypeCode>4020</SizeTypeCode>
  <SealNumber>ABX123456</SealNumber>
  <SealIssuer>ABC</SealIssuer>
  <SealNumber>ABX123457</SealNumber>
  <SealIssuer>ABC</SealIssuer>
</Shipment>
<Shipment>
  <ContractId>TNT9439287-5</ContractId>
  <TransportModeId>Road</TransportModeId>
  <CarrierId>TNT</CarrierId>
  <EquipmentId>PONL34567</EquipmentId>
  <SizeTypeCode>4020</SizeTypeCode>
  <SealNumber>ABX123457</SealNumber>
  <SealIssuer>ABC</SealIssuer>
  <SealNumber>GGDFG99</SealNumber>
  <SealIssuer>Customs</SealIssuer>
</Shipment>
<Shipment>
  <ContractId>180-1234567</ContractId>
  <TransportModeId>Air</TransportModeId>
  <CarrierId>KE-Korean Air Cargo</CarrierId>
  <EquipmentId>KAL12345</EquipmentId>
  <SizeTypeCode>747-K10</SizeTypeCode>
  <SealNumber>XXX664</SealNumber>
</Shipment>
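The repeating elements become easy to see when the XML is read programmatically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree` module. Two things here are assumptions and not part of the sample data: the wrapping `<Shipments>` root element (an XML document needs a single root) and the shortened copy of the first shipment. Note also that the ampersand in the carrier name has to be written as `&amp;` for the XML to parse.

```python
import xml.etree.ElementTree as ET

# A shortened copy of the first shipment; "&" must be written as "&amp;" in XML.
fragment = """
<Shipment>
  <ContractId>PONL40078678</ContractId>
  <CarrierId>P&amp;ONL</CarrierId>
  <EquipmentId>PONL12345</EquipmentId>
  <EquipmentId>PONL34567</EquipmentId>
</Shipment>
"""

# Wrap the fragment in a hypothetical <Shipments> root so it parses as one document.
root = ET.fromstring("<Shipments>" + fragment + "</Shipments>")

parsed = []
for shipment in root.findall("Shipment"):
    contract = shipment.findtext("ContractId")
    carrier = shipment.findtext("CarrierId")
    # findall returns every repeated <EquipmentId>, which is the repeating group.
    equipment_ids = [e.text for e in shipment.findall("EquipmentId")]
    parsed.append((contract, carrier, equipment_ids))

for entry in parsed:
    print(entry)
```

One shipment yields a list of equipment identifiers rather than a single value, which is exactly the situation First Normal Form sets out to remove.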
The first thing to note is that the data presented is ‘flat’: we have one table/container called ‘shipment’ and
all attributes or nested elements sit within this one structure. When data is a single flat structure like this, it
is known as being in zero normal form (or unnormalized). The purpose of normalization is to put
structure or ‘depth’ into the data.
The second thing to be aware of is the primary key (or keys) of our data. Within any set of data, one or
more values may be used to uniquely identify a specific instance of an entry. For example, a ContractID
may be used to identify precisely one row in the shipment table. So if we have a ContractID of
“PONL40078678” then we should find one, and only one, entry with this value.
However, sometimes a single value may not be sufficiently individual to do this. For example, it is
possible for different carriers to use the same identification numbers for their contracts. Technically, we
could have two “PONL40078678” ContractIDs, one for P&ONL and another for OOCL shipping. There is
no business convention to guard against this. So we may need both the CarrierID and the ContractID to
be sure of uniqueness. At this stage, this particular issue would add to the complexity of our example,
so we will assume that ContractID is good enough on its own as a unique key. However, as always, real
business practice should be the guide for these decisions. It should suffice to say that when we talk of
keys we mean the ‘entire’ key or set of values that can uniquely identify a single entry in our data.
With this in mind, the first step is to progress our data into First Normal Form.
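Whether a set of columns can serve as a key is something we can test against sample data. The sketch below is illustrative only: the OOCL row is invented to reproduce the clash described above, and `is_unique_key` is a hypothetical helper name. Passing the check on sample rows does not prove a key is unique in real business practice, it only shows that the sample does not contradict it.

```python
from collections import Counter

# Hypothetical rows: (CarrierID, ContractID). The OOCL row is invented to show a clash.
rows = [
    ("P&ONL", "PONL40078678"),
    ("OOCL", "PONL40078678"),
    ("TNT", "TNT9439287-5"),
]

def is_unique_key(rows, indexes):
    """True if the chosen columns identify each row exactly once."""
    counts = Counter(tuple(row[i] for i in indexes) for row in rows)
    return all(n == 1 for n in counts.values())

contract_only = is_unique_key(rows, (1,))           # ContractID on its own
carrier_and_contract = is_unique_key(rows, (0, 1))  # CarrierID plus ContractID
print(contract_only, carrier_and_contract)
```

With the clashing row present, ContractID alone fails the check while the combination of CarrierID and ContractID passes.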
First Normal Form
The aim of first normal form is to ensure that all of the attributes are discrete, i.e. can only take a
single value. This is achieved by the removal of repeating groups into their own entities. For example, a
large Shipment may require several ‘equipments’ or containers. This means we can have repeating
EquipmentID, SealNumber and SizeTypeCode values in each cell of our table. First Normal Form says
that these should be separated into a separate table, as shown in Tables 2 and 3.
ContractID Transport Mode CarrierID
PONL40078678 Sea P&ONL
TNT9439287-5 Road TNT
180-1234567 Air KE-Korean Air Cargo
Table 2 Shipment Table - 1NF
ContractID    EquipmentID  SizeTypeCode  SealNumber  SealIssuer
PONL40078678  PONL12345    4040          ABX123456   ABC
PONL40078678  PONL34567    4020          ABX123457   ABC
TNT9439287-5  PONL34567    4020          ABX123457   ABC
                                         GGDFG99     Customs
180-1234567   KAL12345     747-K10       XXX664
Table 3 ShipmentEquipment Table - 0NF
A quick glance at the second table will reveal that we have included the ContractID in the second table
as well as the first. This is because whenever we move elements into a new table of their own we
include the key value of the original, parent table. We must do this to ensure we retain the association
between the two pieces of data. In relational modeling this is called the ‘foreign’ key: it is foreign
because its home is in the parent table.
Another, longer glance at the second table will show we still have repeating values in elements
SealNumber and SealIssuer. This is because a container may have several seals attached, each with its
own number. Therefore we need to separate SealNumber and SealIssuer from this new table, into a
table of their own. But before we can do this we need to establish the key fields for the new
ShipmentEquipment table. On the face of it, EquipmentID would appear sufficiently precise to be
unique. In fact, international shipping conventions ensure that container numbers are unique globally.
However, whilst at any particular moment in time an EquipmentID would be unique, containers are
re-used in other shipments. This is the case here, where container “PONL34567” is taken off a ship and
carried by road. So our key for ShipmentEquipment is both the ContractID and the EquipmentID. We
then end up with the following:
ContractID EquipmentID SizeTypeCode
PONL40078678 PONL12345 4040
PONL40078678 PONL34567 4020
TNT9439287-5 PONL34567 4020
180-1234567 KAL12345 747-K10
Table 4 ShipmentEquipment Table - 1NF
ContractID EquipmentID SealNumber SealIssuer
PONL40078678 PONL12345 ABX123456 ABC
PONL40078678 PONL34567 ABX123457 ABC
TNT9439287-5 PONL34567 ABX123457 ABC
TNT9439287-5 PONL34567 GGDFG99 Customs
180-1234567 KAL12345 XXX664
Table 5 ShipmentSeal Table - 1NF
The new table for ShipmentSeal has inherited the foreign key of both ContractID and EquipmentID. That
is to say, this piece of equipment when used in this shipment has this seal.
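The move to First Normal Form can be sketched in Python as a walk over nested records that emits one flat row per repeating item. The values are copied from the sea and road rows of Table 1 (the air shipment is left out for brevity); the nested dictionary layout and the variable names are assumptions made for this illustration.

```python
# Unnormalized input: one record per shipment, with repeating groups as nested lists.
# Values are copied from Table 1; the dictionary layout itself is an assumption.
shipments = [
    {"ContractID": "PONL40078678", "TransportMode": "Sea", "CarrierID": "P&ONL",
     "Equipment": [
         {"EquipmentID": "PONL12345", "SizeTypeCode": "4040",
          "Seals": [("ABX123456", "ABC")]},
         {"EquipmentID": "PONL34567", "SizeTypeCode": "4020",
          "Seals": [("ABX123457", "ABC")]},
     ]},
    {"ContractID": "TNT9439287-5", "TransportMode": "Road", "CarrierID": "TNT",
     "Equipment": [
         {"EquipmentID": "PONL34567", "SizeTypeCode": "4020",
          "Seals": [("ABX123457", "ABC"), ("GGDFG99", "Customs")]},
     ]},
]

shipment_rows, equipment_rows, seal_rows = [], [], []
for s in shipments:
    shipment_rows.append((s["ContractID"], s["TransportMode"], s["CarrierID"]))
    for e in s["Equipment"]:
        # The parent key (ContractID) travels with each child row as a foreign key.
        equipment_rows.append((s["ContractID"], e["EquipmentID"], e["SizeTypeCode"]))
        for number, issuer in e["Seals"]:
            # Seal rows carry both parent keys: ContractID and EquipmentID.
            seal_rows.append((s["ContractID"], e["EquipmentID"], number, issuer))

for row in seal_rows:
    print(row)
```

Every field in the three resulting lists holds a single value, and each child row can be traced back to its parent through the keys it carries.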
Second Normal Form
The aim of second normal form is to split off into separate tables any attributes that do not wholly
depend on the entire key.
For example, when we look closely at the ShipmentEquipment table we can see that SizeTypeCode does
not depend on both ContractID and EquipmentID (the two parts of our key).
We can say that the size and type of a container depends on the EquipmentID. Every container has one
EquipmentID and one size and type. “PONL34567” is a 40 foot container of standard features. If the
EquipmentID value changed (i.e. a different container was used), then we could not be sure the
SizeTypeCode would remain the same. SizeTypeCode is dependent on the EquipmentID. The same
cannot be said for ContractID. The value of ContractID can change without affecting the SizeTypeCode.
For example, when the container is transferred to the truck for road haulage, its size and type do not
change.
Second Normal Form tells us to separate these attributes that don’t depend on the entire key. In this
case it is the SizeTypeCode and the key it depends on, EquipmentID, that form a new Equipment
table.
ContractID EquipmentID
PONL40078678 PONL12345
PONL40078678 PONL34567
TNT9439287-5 PONL34567
180-1234567 KAL12345
Table 6 - ShipmentEquipment table - 2 NF
EquipmentID SizeTypeCode
PONL12345 4040
PONL34567 4020
KAL12345 747-K10
Table 7 Equipment table - 2 NF
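The dependency argument behind this split can be checked against the rows of Table 4. The sketch below is illustrative: `determines` is a hypothetical helper that reports whether one column's value always comes with the same value in another column. A check on sample rows can only show that a dependency is not contradicted by the data; whether it really holds is a business question.

```python
# Rows of the 1NF ShipmentEquipment table: (ContractID, EquipmentID, SizeTypeCode).
shipment_equipment_1nf = [
    ("PONL40078678", "PONL12345", "4040"),
    ("PONL40078678", "PONL34567", "4020"),
    ("TNT9439287-5", "PONL34567", "4020"),
    ("180-1234567", "KAL12345", "747-K10"),
]

def determines(rows, lhs, rhs):
    """True if equal values in column lhs always come with equal values in column rhs."""
    seen = {}
    for row in rows:
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True

equipment_determines_size = determines(shipment_equipment_1nf, 1, 2)
contract_determines_size = determines(shipment_equipment_1nf, 0, 2)

# Part of the key alone fixes SizeTypeCode, so it moves to its own table.
shipment_equipment_2nf = sorted({(c, e) for c, e, _ in shipment_equipment_1nf})
equipment = sorted({(e, size) for _, e, size in shipment_equipment_1nf})

print(equipment_determines_size, contract_determines_size)
print(equipment)
```

In this sample EquipmentID fixes the SizeTypeCode while ContractID does not, and the two projected lists match Tables 6 and 7.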
Third Normal Form
To achieve a data model in Third Normal Form we must ensure that all non-key attributes are
independent of one another. This is similar to Second Normal Form, but now we focus on the non-key
dependencies.
For example, if we look at the ShipmentSeal table, we see that SealNumber and SealIssuer are not
independent of each other. Neither is a key value, but there is a dependent relationship between
them: for example, if the SealIssuer were to change then the SealNumber would presumably change as
well. So we must move SealIssuer and the key it depends on, SealNumber, into a new table. In this
case we shall call it the Seal table.
ContractID EquipmentID SealNumber
PONL40078678 PONL12345 ABX123456
PONL40078678 PONL34567 ABX123457
TNT9439287-5 PONL34567 ABX123457
TNT9439287-5 PONL34567 GGDFG99
180-1234567 KAL12345 XXX664
Table 8 ShipmentSeal table - 3NF
SealNumber SealIssuer
ABX123456 ABC
ABX123457 ABC
GGDFG99 Customs
XXX664
Table 9 Seal table - 3 NF
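To see the finished design working as a whole, the tables can be declared in a relational database and joined back together. The following is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The column types, the constraint choices and the decision to load only the road shipment are assumptions made for this illustration, not part of the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign key checks off by default
conn.executescript("""
CREATE TABLE Shipment (
    ContractID TEXT PRIMARY KEY, TransportMode TEXT, CarrierID TEXT);
CREATE TABLE Equipment (
    EquipmentID TEXT PRIMARY KEY, SizeTypeCode TEXT);
CREATE TABLE Seal (
    SealNumber TEXT PRIMARY KEY, SealIssuer TEXT);
CREATE TABLE ShipmentEquipment (
    ContractID TEXT REFERENCES Shipment(ContractID),
    EquipmentID TEXT REFERENCES Equipment(EquipmentID),
    PRIMARY KEY (ContractID, EquipmentID));
CREATE TABLE ShipmentSeal (
    ContractID TEXT, EquipmentID TEXT,
    SealNumber TEXT REFERENCES Seal(SealNumber),
    PRIMARY KEY (ContractID, EquipmentID, SealNumber),
    FOREIGN KEY (ContractID, EquipmentID)
        REFERENCES ShipmentEquipment(ContractID, EquipmentID));
""")

# Load the road shipment only; parent rows go in before the rows that reference them.
conn.execute("INSERT INTO Shipment VALUES ('TNT9439287-5', 'Road', 'TNT')")
conn.execute("INSERT INTO Equipment VALUES ('PONL34567', '4020')")
conn.executemany("INSERT INTO Seal VALUES (?, ?)",
                 [("ABX123457", "ABC"), ("GGDFG99", "Customs")])
conn.execute("INSERT INTO ShipmentEquipment VALUES ('TNT9439287-5', 'PONL34567')")
conn.executemany("INSERT INTO ShipmentSeal VALUES ('TNT9439287-5', 'PONL34567', ?)",
                 [("ABX123457",), ("GGDFG99",)])

# Joining along the keys rebuilds the flat view of the shipment.
rows = conn.execute("""
    SELECT s.ContractID, s.TransportMode, e.EquipmentID, e.SizeTypeCode,
           seal.SealNumber, seal.SealIssuer
    FROM Shipment s
    JOIN ShipmentEquipment se ON se.ContractID = s.ContractID
    JOIN Equipment e ON e.EquipmentID = se.EquipmentID
    JOIN ShipmentSeal ss ON ss.ContractID = se.ContractID
                        AND ss.EquipmentID = se.EquipmentID
    JOIN Seal seal ON seal.SealNumber = ss.SealNumber
    ORDER BY seal.SealNumber
""").fetchall()
for row in rows:
    print(row)
```

The query returns one row per seal, so the information in the original flat row is recovered even though each fact is now stored only once.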
Summary
Normalization is a formal and well-established method of analyzing data structures. If applied
consistently, then this technique will identify the logical containers necessary for building re-usable XML
schemas. It can be used in conjunction with other analysis techniques to compare and refine data
models.
Whilst its original purpose was to organize data to minimize redundancy and avoid data duplication, it is
a powerful technique for improving the understanding of the data models necessary for re-usable
libraries of components such as UBL.
