UNIT 2 DATABASE DESIGN 9
Database design & E-R Model: Entity–Relationship model (E-R model)–E-R Diagrams-
Constraints-Extended E-R features. Introduction to Relational Model: Database schema–Keys-
Schema Diagrams – Relational Query languages – Relational Operations
ER (Entity Relationship) Diagram in DBMS
o ER model stands for Entity-Relationship model. It is a high-level data model. This
model is used to define the data elements and relationships for a specified system.
o It develops a conceptual design for the database. It also gives a simple,
easy-to-design view of the data.
o In ER modeling, the database structure is portrayed as a diagram called an
entity-relationship diagram.
For example, suppose we design a school database. In this database, the student will be an
entity with attributes like address, name, id, age, etc. The address can be another entity with
attributes like city, street name, pin code, etc., and there will be a relationship between them.
Components of ER Diagram
1. Entity:
An entity may be any object, class, person or place. In the ER diagram, an entity is
represented as a rectangle.
Consider an organization as an example: manager, product, employee, department, etc. can be
taken as entities.
a. Weak Entity
An entity that depends on another entity is called a weak entity. The weak entity doesn't contain
any key attribute of its own. The weak entity is represented by a double rectangle.
2. Attribute
An attribute is used to describe a property of an entity. An ellipse is used to represent an
attribute.
For example, id, age, contact number, name, etc. can be attributes of a student.
a. Key Attribute
The key attribute is used to represent the main characteristics of an entity. It represents a primary
key. The key attribute is represented by an ellipse with the text underlined.
b. Composite Attribute
An attribute that is composed of many other attributes is known as a composite attribute. The
composite attribute is represented by an ellipse, and the ellipses of its component attributes are
connected to that ellipse.
c. Multivalued Attribute
An attribute that can have more than one value is known as a multivalued attribute.
A double oval is used to represent a multivalued attribute.
For example, a student can have more than one phone number.
d. Derived Attribute
An attribute that can be derived from other attributes is known as a derived attribute. It is
represented by a dashed ellipse.
For example, a person's age changes over time and can be derived from another attribute like
date of birth.
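The idea of a derived attribute can be sketched in a few lines of Python. This is only an illustration: the function name age_on and the sample dates are assumptions, not part of the notes. The age is never stored; it is computed from the stored date of birth whenever it is needed.

```python
from datetime import date

def age_on(date_of_birth: date, today: date) -> int:
    """Derive age in completed years from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years

# Hypothetical student born on 20 August 2005, evaluated on two dates.
before_birthday = age_on(date(2005, 8, 20), date(2024, 3, 1))
on_birthday = age_on(date(2005, 8, 20), date(2024, 8, 20))
print(before_birthday, on_birthday)
```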
3. Relationship
A relationship is used to describe the relation between entities. A diamond (rhombus) is used to
represent the relationship.
Types of relationship are as follows:
a. One-to-One Relationship
When only one instance of each entity is associated with the relationship, it is known as a
one-to-one relationship.
For example, a female can marry one male, and a male can marry one female.
b. One-to-many relationship
When only one instance of the entity on the left, and more than one instance of the entity on the
right, is associated with the relationship, it is known as a one-to-many relationship.
For example, a scientist can make many inventions, but each invention is made by only one
specific scientist.
c. Many-to-one relationship
When more than one instance of the entity on the left, and only one instance of the entity on the
right, is associated with the relationship, it is known as a many-to-one relationship.
For example, a student enrolls in only one course, but a course can have many students.
d. Many-to-many relationship
When more than one instance of the entity on the left, and more than one instance of the entity on
the right, is associated with the relationship, it is known as a many-to-many relationship.
For example, an employee can be assigned to many projects, and a project can have many employees.
Notation of ER diagram
A database design can be represented using diagram notations. In an ER diagram, several notations
are used to express cardinality. These notations are as follows:
Fig: Notations of ER diagram
Constraints
Mapping Constraints
o A mapping constraint is a data constraint that expresses the number of entities to which
another entity can be related via a relationship set.
o It is most useful in describing binary relationship sets, although it can also help describe
relationship sets that involve more than two entity sets.
o For a binary relationship set R between entity sets E1 and E2, there are four possible mapping
cardinalities. These are as follows:
1. One to one (1:1)
2. One to many (1:M)
3. Many to one (M:1)
4. Many to many (M:M)
One-to-one
In one-to-one mapping, an entity in E1 is associated with at most one entity in E2, and an
entity in E2 is associated with at most one entity in E1.
One-to-many
In one-to-many mapping, an entity in E1 is associated with any number of entities in E2,
and an entity in E2 is associated with at most one entity in E1.
Many-to-one
In many-to-one mapping, an entity in E1 is associated with at most one entity in E2, and
an entity in E2 is associated with any number of entities in E1.
Many-to-many
In many-to-many mapping, an entity in E1 is associated with any number of entities in
E2, and an entity in E2 is associated with any number of entities in E1.
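The four mapping cardinalities can be checked mechanically on sample data. The sketch below is an illustration under stated assumptions: the function mapping_cardinality and the sample pairs are hypothetical, and a relationship instance can only show what is observed in the data, whereas the real constraint is a design-time rule.

```python
def mapping_cardinality(pairs):
    """Classify a binary relationship given as (e1, e2) pairs."""
    partners_of_e1 = {}
    partners_of_e2 = {}
    for e1, e2 in set(pairs):
        partners_of_e1.setdefault(e1, set()).add(e2)
        partners_of_e2.setdefault(e2, set()).add(e1)
    # Does some E1 entity relate to several E2 entities, and vice versa?
    e1_has_many_e2 = any(len(s) > 1 for s in partners_of_e1.values())
    e2_has_many_e1 = any(len(s) > 1 for s in partners_of_e2.values())
    if e1_has_many_e2 and e2_has_many_e1:
        return "M:M"
    if e1_has_many_e2:
        return "1:M"
    if e2_has_many_e1:
        return "M:1"
    return "1:1"

# Hypothetical instances mirroring the examples in the text.
invents = [("scientist_1", "invention_a"), ("scientist_1", "invention_b")]
enrolls = [("student_1", "course_x"), ("student_2", "course_x")]
works_on = [("emp_1", "proj_a"), ("emp_1", "proj_b"), ("emp_2", "proj_a")]
print(mapping_cardinality(invents), mapping_cardinality(enrolls), mapping_cardinality(works_on))
```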
Constraints
In a database table, we can add rules to a column known as constraints. These rules control the
data that can be stored in a column.
For example, if a column has NOT NULL constraint, it means the column cannot
store NULL values.
The constraints used in SQL are:
Constraint    Description
NOT NULL      values cannot be null
UNIQUE        values cannot duplicate an existing value in the column
PRIMARY KEY   used to uniquely identify a row
FOREIGN KEY   references a row in another table
CHECK         validates a condition before a new value is accepted
DEFAULT       sets a default value if none is supplied
CREATE INDEX  used to speed up reads
Note: These constraints are also called integrity constraints.
NOT NULL Constraint
The NOT NULL constraint in a column means that the column cannot store NULL values. For
example,
CREATE TABLE Colleges (
college_id INT NOT NULL,
college_code VARCHAR(20) NOT NULL,
college_name VARCHAR(50)
);
Here, the college_id and the college_code columns of the Colleges table won't
allow NULL values.
UNIQUE Constraint
The UNIQUE constraint in a column means that the column must have unique value. For
example,
CREATE TABLE Colleges (
college_id INT NOT NULL UNIQUE,
college_code VARCHAR(20) UNIQUE,
college_name VARCHAR(50)
);
Here, the value of the college_code column must be unique. Similarly, the value
of college_id must be unique, and it also cannot store NULL values.
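To see these two constraints rejecting bad rows, the table above can be tried out with Python's built-in sqlite3 module. This is a sketch, not part of the original notes: SQLite is chosen only because it needs no server, the helper try_insert is hypothetical, and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Colleges (
        college_id   INT NOT NULL UNIQUE,
        college_code VARCHAR(20) UNIQUE,
        college_name VARCHAR(50)
    )
""")
conn.execute("INSERT INTO Colleges VALUES (1, 'C001', 'First College')")

def try_insert(row):
    """Return 'ok', or the constraint error raised for this row."""
    try:
        conn.execute("INSERT INTO Colleges VALUES (?, ?, ?)", row)
        return "ok"
    except sqlite3.IntegrityError as exc:
        return str(exc)

null_id = try_insert((None, "C002", "Second College"))    # violates NOT NULL
duplicate_code = try_insert((2, "C001", "Third College"))  # violates UNIQUE
valid_row = try_insert((3, "C003", "Fourth College"))
print(null_id)
print(duplicate_code)
```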
PRIMARY KEY Constraint
The PRIMARY KEY constraint is simply a combination of NOT
NULL and UNIQUE constraints. It means that the column value is used to uniquely identify the
row. For example,
CREATE TABLE Colleges (
college_id INT PRIMARY KEY,
college_code VARCHAR(20) NOT NULL,
college_name VARCHAR(50)
);
Here, the value of the college_id column is a unique identifier for a row. It therefore cannot
store NULL values and must be unique.
FOREIGN KEY Constraint
The FOREIGN KEY (REFERENCES in some databases) constraint in a column is used to
reference a record that exists in another table. For example,
CREATE TABLE Orders (
order_id INT PRIMARY KEY,
customer_id int REFERENCES Customers(id)
);
Here, the value of the customer_id column references a row in another table
named Customers.
It means that the value of customer_id in the Orders table must be a value from the id column of
the Customers table.
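The effect of the foreign key can be checked with a small sqlite3 sketch. The Customers table definition and the sample rows are assumptions added for illustration; note also that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, which other database systems do not need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
# Hypothetical parent table; the notes only show the Orders side.
conn.execute("CREATE TABLE Customers (id INT PRIMARY KEY, name VARCHAR(50))")
conn.execute("""
    CREATE TABLE Orders (
        order_id    INT PRIMARY KEY,
        customer_id INT REFERENCES Customers(id)
    )
""")
conn.execute("INSERT INTO Customers VALUES (1, 'First Customer')")
conn.execute("INSERT INTO Orders VALUES (100, 1)")  # customer 1 exists

try:
    conn.execute("INSERT INTO Orders VALUES (101, 99)")  # no customer 99
    fk_result = "ok"
except sqlite3.IntegrityError as exc:
    fk_result = str(exc)

order_count = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
print(fk_result, order_count)
```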
CHECK Constraint
The CHECK constraint checks the condition before allowing values in a table. For example,
CREATE TABLE Orders (
order_id INT PRIMARY KEY,
amount int CHECK (amount >= 100)
);
Here, the value of the amount column must be greater than or equal to 100. If not, the SQL
statement results in an error.
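A short sqlite3 sketch of the CHECK constraint in action follows. The sample amounts are made up, and the exact wording of the error message differs between SQLite versions, so only its prefix is relied on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Orders (
        order_id INT PRIMARY KEY,
        amount   INT CHECK (amount >= 100)
    )
""")
conn.execute("INSERT INTO Orders VALUES (1, 250)")  # satisfies the condition

try:
    conn.execute("INSERT INTO Orders VALUES (2, 50)")  # 50 < 100, so it is rejected
    check_result = "ok"
except sqlite3.IntegrityError as exc:
    check_result = str(exc)

stored = conn.execute("SELECT order_id, amount FROM Orders").fetchall()
print(check_result, stored)
```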
DEFAULT Constraint
The DEFAULT constraint is used to set a default value for a column when no value is supplied for
it. For example,
CREATE TABLE Colleges (
college_id INT PRIMARY KEY,
college_code VARCHAR(20),
college_country VARCHAR(20) DEFAULT 'US'
);
Here, the default value of the college_country column is US.
If an INSERT statement does not supply a value for the college_country column, its value will be
US. An explicitly inserted NULL is still stored as NULL.
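The difference between omitting a column and explicitly inserting NULL can be verified with sqlite3. This is a sketch with made-up rows; the behaviour shown (the default applies only when the column is omitted) is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Colleges (
        college_id      INT PRIMARY KEY,
        college_code    VARCHAR(20),
        college_country VARCHAR(20) DEFAULT 'US'
    )
""")
# Column omitted: the default value is applied.
conn.execute("INSERT INTO Colleges (college_id, college_code) VALUES (1, 'C001')")
# Explicit NULL: NULL is stored, because a value was supplied.
conn.execute("INSERT INTO Colleges VALUES (2, 'C002', NULL)")

countries = conn.execute(
    "SELECT college_id, college_country FROM Colleges ORDER BY college_id"
).fetchall()
print(countries)
```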
CREATE INDEX Constraint
Strictly speaking, CREATE INDEX is a statement rather than a constraint. If a column has an
index, it is faster to retrieve data when we use that column for data retrieval. For example,
-- create table
CREATE TABLE Colleges (
college_id INT PRIMARY KEY,
college_code VARCHAR(20) NOT NULL,
college_name VARCHAR(50)
);
-- create index
CREATE INDEX college_index
ON Colleges(college_code);
Here, the SQL command creates an index named college_index on the Colleges table
using the college_code column.
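Whether a query actually uses the index can be inspected with the database's query planner. The sketch below uses SQLite's EXPLAIN QUERY PLAN, which is SQLite-specific; other systems have their own EXPLAIN variants, and the plan text format is an implementation detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Colleges (
        college_id   INT PRIMARY KEY,
        college_code VARCHAR(20) NOT NULL,
        college_name VARCHAR(50)
    )
""")
conn.execute("CREATE INDEX college_index ON Colleges(college_code)")

# Ask SQLite how it would run a lookup on the indexed column.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT college_name FROM Colleges WHERE college_code = ?",
    ("C001",),
).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # last column holds the description
print(plan_text)
```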
Extended or Enhanced ER model in DBMS
Extended ER is a high-level data model that incorporates extensions to the original ER
model. Enhanced ER models are high-level models that represent the requirements and
complexities of complex databases.
The extended Entity Relationship (ER) model adds three features, as given below −
o Aggregation
o Specialization
o Generalization
Specialization
The process of designating sub-groupings within an entity set is called specialization. It is a
top-down process. If the instances of an entity set can be differentiated according to the value
of a given attribute, then sub-classes (sub-entity sets) can be formed on the basis of that
attribute.
Example
Specialization of a person allows us to distinguish a person according to whether they are
employees or customers. Specialization of account creates two entity sets: savings account and
current account.
In the E-R diagram, specialization is represented by a triangle component labeled ISA. The ISA
relationship is also referred to as a superclass-subclass relationship, as shown below −
Generalization
It is the reverse process of specialization. It is a bottom-up approach.
It converts subclasses to superclasses. This process combines a number of entity sets that share
the same features into higher-level entity sets.
When sub-class information is given for an entity set, the ISA relationship type is
used to represent the connection between the subclass and the superclass, as shown below −
Example
Aggregation
It is an abstraction in which relationship sets are treated as higher level entity sets and can
participate in relationships. Aggregation allows us to indicate that a relationship set participates
in another relationship set.
Aggregation is used to simplify the details of a given database, where ternary relationships are
changed into binary relationships. A ternary relationship is a relationship that involves three
entities.
Aggregation is shown in the image below −
Introduction to Relational Model
The relational model was proposed by E.F. Codd to model data in the form of relations or
tables. After designing the conceptual model of the database using an ER diagram, we need to
convert the conceptual model into a relational model, which can be implemented using any
RDBMS such as Oracle or MySQL. So we will see what the relational model is.
What is the Relational Model?
The relational model represents how data is stored in Relational Databases. A relational
database stores data in the form of relations (tables). Consider a relation STUDENT with
attributes ROLL_NO, NAME, ADDRESS, PHONE, and AGE shown in Table 1.
Table 1: STUDENT
ROLL_NO  NAME    ADDRESS  PHONE       AGE
1        RAM     DELHI    9455123451  18
2        RAMESH  GURGAON  9652431543  18
3        SUJIT   ROHTAK   9156253131  20
4        SURESH  DELHI                18
IMPORTANT TERMINOLOGIES
o Attribute: Attributes are the properties that define a relation, e.g., ROLL_NO, NAME.
o Relation Schema: A relation schema represents the name of the relation with its attributes,
e.g., STUDENT (ROLL_NO, NAME, ADDRESS, PHONE, AGE) is the relation
schema for STUDENT. If a schema has more than one relation, it is called a relational
schema.
o Tuple: Each row in the relation is known as a tuple. The above relation contains 4 tuples,
one of which is shown as:
1 RAM DELHI 9455123451 18
o Relation Instance: The set of tuples of a relation at a particular instant of time is called a
relation instance. Table 1 shows the relation instance of STUDENT at a particular time. It
can change whenever there is an insertion, deletion, or update in the database.
o Degree: The number of attributes in the relation is known as the degree of the relation.
The STUDENT relation defined above has degree 5.
o Cardinality: The number of tuples in a relation is known as its cardinality.
The STUDENT relation defined above has cardinality 4.
o Column: A column represents the set of values for a particular attribute. The
column ROLL_NO extracted from the relation STUDENT is:
ROLL_NO
1
2
3
4
o NULL Values: A value which is not known or is unavailable is called a NULL value. It is
represented by a blank space, e.g., PHONE of the STUDENT having ROLL_NO 4 is NULL.
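The terms above can be made concrete by holding the STUDENT relation of Table 1 in plain Python structures. This is only an illustration; the variable names are assumptions, and None stands in for NULL.

```python
attributes = ("ROLL_NO", "NAME", "ADDRESS", "PHONE", "AGE")
student = [
    (1, "RAM", "DELHI", "9455123451", 18),
    (2, "RAMESH", "GURGAON", "9652431543", 18),
    (3, "SUJIT", "ROHTAK", "9156253131", 20),
    (4, "SURESH", "DELHI", None, 18),  # PHONE is NULL
]

degree = len(attributes)   # number of attributes
cardinality = len(student)  # number of tuples

# A column is the list of values of one attribute across all tuples.
roll_no_column = [t[attributes.index("ROLL_NO")] for t in student]

# Roll numbers of the tuples whose PHONE is NULL.
null_phones = [t[0] for t in student if t[attributes.index("PHONE")] is None]
print(degree, cardinality, roll_no_column, null_phones)
```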
Constraints in Relational Model
While designing the relational model, we define some conditions which must hold for the data
present in the database; these are called constraints. These constraints are checked before
performing any operation (insertion, deletion, or update) on the database. If any of the
constraints is violated, the operation will fail.
Domain Constraints: These are attribute-level constraints. An attribute can only take values
that lie inside its domain range, e.g., if a constraint AGE>0 is applied to the STUDENT relation,
inserting a negative value of AGE will result in failure.
Key Integrity: Every relation in the database should have at least one set of attributes that
defines a tuple uniquely. Such a set of attributes is called a key, e.g., ROLL_NO in STUDENT
is a key. No two students can have the same roll number. So a key has two properties:
o It should be unique for all tuples.
o It can't have NULL values.
Referential Integrity: When one attribute of a relation can only take values from another
attribute of the same relation or any other relation, it is called referential integrity. Let us
suppose we have 2 relations:
STUDENT
ROLL_NO  NAME    ADDRESS  PHONE       AGE  BRANCH_CODE
1        RAM     DELHI    9455123451  18   CS
2        RAMESH  GURGAON  9652431543  18   CS
3        SUJIT   ROHTAK   9156253131  20   ECE
4        SURESH  DELHI                18   IT
BRANCH
BRANCH_CODE  BRANCH_NAME
CS           COMPUTER SCIENCE
IT           INFORMATION TECHNOLOGY
ECE          ELECTRONICS AND COMMUNICATION ENGINEERING
CV           CIVIL ENGINEERING
BRANCH_CODE of STUDENT can only take the values which are present in
BRANCH_CODE of BRANCH which is called referential integrity constraint. The relation
which is referencing another relation is called REFERENCING RELATION (STUDENT in
this case) and the relation to which other relations refer is called REFERENCED RELATION
(BRANCH in this case).
ANOMALIES
An anomaly is an irregularity or something which deviates from the expected or normal state.
When designing databases, we identify three types of anomalies: Insert, Update and Delete.
Insertion Anomaly in Referencing Relation:
We can't insert a row in the REFERENCING RELATION if the referencing attribute's value is not
present in the referenced attribute, e.g., insertion of a student with BRANCH_CODE
'ME' in the STUDENT relation will result in an error because 'ME' is not present in
BRANCH_CODE of BRANCH.
Deletion/Update Anomaly in Referenced Relation:
We can't delete or update a row of the REFERENCED RELATION if the value of the
REFERENCED ATTRIBUTE is used as a value of the REFERENCING ATTRIBUTE, e.g., if
we try to delete the tuple from BRANCH having BRANCH_CODE 'CS', it will result in an error
because 'CS' is referenced by BRANCH_CODE of STUDENT, but if we try to delete the row
from BRANCH with BRANCH_CODE 'CV', it will be deleted as the value is not used by the
referencing relation. This can be handled by the following methods:
ON DELETE CASCADE: It will delete the tuples from the REFERENCING RELATION if the
value used by the REFERENCING ATTRIBUTE is deleted from the REFERENCED RELATION,
e.g., if we delete the row from BRANCH with BRANCH_CODE 'CS', the rows in the
STUDENT relation with BRANCH_CODE 'CS' (ROLL_NO 1 and 2 in this case) will be
deleted.
ON UPDATE CASCADE: It will update the REFERENCING ATTRIBUTE in the
REFERENCING RELATION if the attribute value used by the REFERENCING ATTRIBUTE is
updated in the REFERENCED RELATION, e.g., if we update the row from BRANCH with
BRANCH_CODE 'CS' to 'CSE', the rows in the STUDENT relation with BRANCH_CODE 'CS'
(ROLL_NO 1 and 2 in this case) will be updated with BRANCH_CODE 'CSE'.
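The rejected insertion and the two cascade rules can be reproduced on the STUDENT and BRANCH relations with sqlite3. This is a sketch under assumptions: only three STUDENT columns are kept for brevity, the student name used in the failing insert is made up, and SQLite needs PRAGMA foreign_keys = ON before it enforces foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE BRANCH (BRANCH_CODE TEXT PRIMARY KEY, BRANCH_NAME TEXT)")
conn.execute("""
    CREATE TABLE STUDENT (
        ROLL_NO     INTEGER PRIMARY KEY,
        NAME        TEXT,
        BRANCH_CODE TEXT REFERENCES BRANCH(BRANCH_CODE)
            ON DELETE CASCADE ON UPDATE CASCADE
    )
""")
conn.executemany("INSERT INTO BRANCH VALUES (?, ?)", [
    ("CS", "COMPUTER SCIENCE"), ("IT", "INFORMATION TECHNOLOGY"),
    ("ECE", "ELECTRONICS AND COMMUNICATION ENGINEERING"), ("CV", "CIVIL ENGINEERING"),
])
conn.executemany("INSERT INTO STUDENT VALUES (?, ?, ?)", [
    (1, "RAM", "CS"), (2, "RAMESH", "CS"), (3, "SUJIT", "ECE"), (4, "SURESH", "IT"),
])

def students():
    """Current (ROLL_NO, BRANCH_CODE) pairs in STUDENT."""
    return conn.execute(
        "SELECT ROLL_NO, BRANCH_CODE FROM STUDENT ORDER BY ROLL_NO"
    ).fetchall()

# Insertion with a branch code that BRANCH does not contain is rejected.
try:
    conn.execute("INSERT INTO STUDENT VALUES (5, 'NEW STUDENT', 'ME')")
    insert_result = "ok"
except sqlite3.IntegrityError as exc:
    insert_result = str(exc)

conn.execute("UPDATE BRANCH SET BRANCH_CODE = 'CSE' WHERE BRANCH_CODE = 'CS'")
after_update = students()  # ON UPDATE CASCADE rewrites roll numbers 1 and 2

conn.execute("DELETE FROM BRANCH WHERE BRANCH_CODE = 'CSE'")
after_delete = students()  # ON DELETE CASCADE removes roll numbers 1 and 2
print(insert_result, after_update, after_delete)
```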
SUPER KEYS:
Any set of attributes that allows us to identify unique rows (tuples) in a given relation is
known as a super key. A super key from which no attribute can be removed without losing
uniqueness (a minimal super key) is known as a candidate key, and one of the candidate keys is
chosen as the primary key. If a combination of two or more attributes is used as the primary
key, we call it a composite key.
Advantages:
o Simple model
o It is flexible
o It is secure
o Data accuracy
o Data integrity
o Operations can be applied easily
Disadvantages:
o Not well suited to very large databases
o Relationships between tables can sometimes become difficult to manage
Article contributed by Sonal Tuteja.
Database Schema
A database schema is a structure that represents the logical storage of the data in a
database. It represents the organization of data and provides information about the relationships
between the tables in a given database. In this topic, we will understand more about database
schemas and their types. Before understanding database schemas, let's first understand what a
database is.
What is Database?
A database is a place to store information. It can store the simplest data, such as a list of people
as well as the most complex data. The database stores the information in a well-structured
format.
What is Database Schema?
A database schema is the logical representation of a database, which shows how the data is
stored logically in the entire database. It contains a list of attributes and instructions that
inform the database engine how the data is organized and how the elements are related to each other.
A database schema contains schema objects that may include tables, fields, packages, views,
relationships, primary keys, and foreign keys.
In reality, the data is physically stored in files that may be in unstructured form, but to retrieve
and use it, we need to put it in a structured form. To do this, a database schema is used. It
provides knowledge about how the data is organized in a database and how it is associated with
other data.
The schema does not physically contain the data itself; instead, it gives information about the
shape of data and how it can be related to other tables or models.
A database schema includes the following:
o Consistent formatting for all data entries.
o Database objects and unique keys for all data entries.
o Tables with multiple columns, where each column has a name and a datatype.
The complexity and size of the schema vary with the size of the project. It helps developers to
easily manage and structure the database before coding it.
The given diagram is an example of a database schema. It contains three tables and the data types
of their columns. It also shows the relationships between the tables, as well as the primary keys
and foreign keys.
Types of Database Schema
The database schema is divided into three types, which are:
1. Physical Schema
2. Logical Schema
3. View Schema
1. Physical Database Schema
A physical database schema specifies how the data is stored physically on a storage system or
disk storage in the form of Files and Indices. Designing a database at the physical level is called
a physical schema.
2. Logical Database Schema
The logical database schema specifies all the logical constraints that need to be applied to the
stored data. It defines the views, integrity constraints, and tables. Here, integrity
constraints are the set of rules that the DBMS (Database Management System) uses to
maintain data quality during insertion and update. The logical schema represents how the data
is stored in the form of tables and how the attributes of a table are linked together.
Programmers and administrators work at this level, where the implementation of the data
structures is hidden.
Various tools are used to create a logical database schema, and these tools demonstrate the
relationships between the components of your data; this process is called ER modelling.
ER modelling stands for entity-relationship modelling, which specifies the relationships
between different entities.
We can understand it with an example of a basic commerce application. Below is the schema
diagram, a simple ER model representing the logical flow of a transaction in a commerce
application.
In the given example, the Ids are given in each circle, and these Ids are primary keys and foreign
keys.
The primary key is used to uniquely identify the entry in a document or record. The Ids of the
upper three circles are the primary keys.
A foreign key refers to the primary key of another table. FK represents the foreign key in
the diagram. It relates one table to another table.
3. View Schema
The view level design of a database is known as view schema. This schema generally describes
the end-user interaction with the database systems.
Difference between the Physical and Logical Database Schema
Physical database schema                              Logical database schema
It does not include the attributes.                   It includes the attributes.
It contains both primary & secondary keys.            It also contains both primary & secondary keys.
It contains the table name.                           It contains the names of the tables.
It contains the column names and their data types.    It does not contain any column name or datatype.
Are Database Instance and Database Schema the Same?
The terms database schema and database instance are related to each other and are sometimes
confused as being the same thing, but they are different.
A database schema is a representation of a planned database and does not actually contain the
data.
On the other hand, a database instance is a snapshot of an actual database as it existed at
an instant of time. Hence it varies over time. In contrast, the database
schema is static, and changing the structure of a database is very complex.
Instances and schemas are related to and impact each other through the DBMS. The DBMS
ensures that every database instance complies with the constraints imposed by the database
designers in the database schema.
Creating Schema
To create a schema, the "CREATE SCHEMA" statement is used in each type of database, but each
DBMS gives it a different meaning. Below we explain creating a schema in different
database systems:
1. MySQL
In MySQL, the "CREATE SCHEMA" statement creates the database. It is because, in MySQL,
the CREATE SCHEMA statement is similar to CREATE DATABASE statement, and schema is
a synonym for the database.
2. Oracle Database
In Oracle Database, a schema already exists for each database user. Hence CREATE
SCHEMA does not actually create a schema; rather, it lets you create multiple tables and
views and perform multiple grants in your own schema in a single transaction.
The "CREATE USER" statement is used to create a schema in Oracle.
3. SQL Server
In the SQL server, the "CREATE SCHEMA" statement creates a new schema with the name
provided by the user.
Database Schema Designs
Schema design is the first step in building a foundation for data management. Ineffective
schema designs are difficult to manage and consume more memory and other resources. The
design depends on the business requirements. Choosing the correct database
schema design eases the project lifecycle. Some popular database schema
designs are listed below:
o Flat Model
o Hierarchical Model
o Network Model
o Relational Model
o Star Schema
o Snowflake Schema
Flat Model
A flat model schema is a type of 2-D array in which each column contains the same type of data,
and elements within a row are related to each other. It can be understood as a single spreadsheet
or a database table with no relations. This schema design is most suitable for small applications
that don't contain complex data.
Hierarchical Model
The hierarchical model design contains a tree-like structure. The tree structure contains the root
node of data and its child nodes. Between each parent node and its child nodes, there is a
one-to-many relationship. Such database schemas are typically represented by XML or JSON files, as
these files can contain entities with their sub-entities.
Hierarchical schema models are best suited for storing nested data, such as the
Hominoid classification.
Network Model
The network model design is similar to the hierarchical design, as it represents a series of nodes
and edges. The main difference between the network model and the hierarchical model is that the
network model allows many-to-many relationships. In contrast, the hierarchical model only
allows one-to-many relationships.
The network model design is best suited for applications that require spatial calculations. It is
also good for representing workflows, mainly for cases with multiple paths to the same result.
Relational Model
The relational model is used for relational databases, which store data as relations (tables).
Relational operators are used to manipulate the data and calculate different
values from it.
Star Schema
The star schema is a different way of organizing data. It is best suited for
storing and analysing huge amounts of data, and it works on "Facts" and "Dimensions".
Here a fact is a numerical data point produced by a business process, and a dimension is a
description of a fact. With a star schema, we can structure the data of an RDBMS.
Snowflake Schema
The snowflake schema is an adaptation of the star schema. In the star schema, there is a main
"Fact" table that contains the main data points and references to its dimension tables. In the
snowflake schema, dimension tables can in turn have their own dimension tables.
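A star schema can be sketched as one fact table that references small dimension tables. Everything in the example below is hypothetical (the table names fact_sales, dim_product and dim_store, their columns, and the sample rows); it only illustrates how a numeric fact is summarized through a dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
    -- Fact table: a numeric measure plus a reference to each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        amount     INTEGER
    );
    INSERT INTO dim_product VALUES (1, 'Pen'), (2, 'Notebook');
    INSERT INTO dim_store   VALUES (1, 'Delhi'), (2, 'Rohtak');
    INSERT INTO fact_sales  VALUES (1, 1, 10), (1, 2, 5), (2, 1, 40);
""")

# Total sales per product: join the fact table to one dimension and aggregate.
totals = conn.execute("""
    SELECT p.product_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()
print(totals)
```

In a snowflake variant, dim_store could itself reference a further table (for example a table of cities), at the cost of one more join per query.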
Keys
o Keys play an important role in the relational database.
o They are used to uniquely identify any record or row of data in a table. They are also used to
establish and identify relationships between tables.
For example, ID is used as a key in the Student table because it is unique for each student. In the
PERSON table, passport_number, license_number, and SSN are keys since they are unique for each
person.
Types of keys:
1. Primary key
o It is the key chosen to identify one and only one instance of an entity uniquely. An
entity can contain multiple keys, as we saw in the PERSON table. The key which is most
suitable from that list becomes the primary key.
o In the EMPLOYEE table, ID can be the primary key since it is unique for each employee.
In the EMPLOYEE table, we could instead select License_Number or Passport_Number as
the primary key since they are also unique.
o For each entity, the choice of primary key depends on the requirements and the developers.
2. Candidate key
o A candidate key is a minimal attribute or set of attributes that can uniquely identify a tuple.
o The primary key is chosen from the candidate keys. The remaining attributes or attribute sets
that can also uniquely identify a tuple are the other candidate keys. The
candidate keys are as strong as the primary key.
For example: In the EMPLOYEE table, ID is best suited for the primary key. The other unique
attributes, like SSN, Passport_Number, License_Number, etc., are considered candidate keys.
3. Super Key
A super key is an attribute set that can uniquely identify a tuple. A super key is a superset of a
candidate key.
For example: In the above EMPLOYEE table, for (EMPLOYEE_ID, EMPLOYEE_NAME), the
names of two employees can be the same, but their EMPLOYEE_ID can't be the same. Hence, this
combination can also be a key.
The super keys would be (EMPLOYEE_ID), (EMPLOYEE_ID, EMPLOYEE_NAME), etc.
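The relationship between super keys and candidate keys can be explored by brute force on a small sample. The sketch below is an illustration with made-up rows and hypothetical function names; keep in mind that a key is a rule about all possible data, so a sample can only rule out attribute sets, never prove that one is a key.

```python
from itertools import combinations

def super_keys(attributes, rows):
    """Attribute sets whose values are distinct across all given rows."""
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            projected = [tuple(row[a] for a in combo) for row in rows]
            if len(set(projected)) == len(rows):
                found.append(frozenset(combo))
    return found

def candidate_keys(attributes, rows):
    """Minimal super keys: no proper subset is itself a super key."""
    keys = super_keys(attributes, rows)
    return [k for k in keys if not any(other < k for other in keys)]

attributes = ["EMPLOYEE_ID", "EMPLOYEE_NAME", "PASSPORT_NUMBER"]
rows = [  # made-up sample; two employees share a name
    {"EMPLOYEE_ID": 1, "EMPLOYEE_NAME": "RAM", "PASSPORT_NUMBER": "P100"},
    {"EMPLOYEE_ID": 2, "EMPLOYEE_NAME": "RAM", "PASSPORT_NUMBER": "P200"},
    {"EMPLOYEE_ID": 3, "EMPLOYEE_NAME": "SUJIT", "PASSPORT_NUMBER": "P300"},
]
all_super = super_keys(attributes, rows)
minimal = candidate_keys(attributes, rows)
print(len(all_super), sorted(sorted(k) for k in minimal))
```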
4. Foreign key
o Foreign keys are columns of a table that point to the primary key of another table.
o Every employee works in a specific department in a company, and employee and
department are two different entities. So we should not store the department's information in
the employee table. Instead, we link these two tables through the primary key of one
table.
o We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute
in the EMPLOYEE table.
o In the EMPLOYEE table, Department_Id is the foreign key, and the two tables are
related.
5. Alternate key
There may be one or more attributes or a combination of attributes that uniquely identify each
tuple in a relation. These attributes or combinations of attributes are called the candidate
keys. One key is chosen as the primary key from these candidate keys, and each remaining
candidate key, if any exists, is termed an alternate key. In other words, the total number of
alternate keys is the total number of candidate keys minus one (the primary key). An alternate key
may or may not exist. If there is only one candidate key in a relation, it does not have an alternate
key.
For example, the employee relation has two attributes, Employee_Id and PAN_No, that act as
candidate keys. In this relation, Employee_Id is chosen as the primary key, so the other candidate
key, PAN_No, acts as the alternate key.
6. Composite key
Whenever a primary key consists of more than one attribute, it is known as a composite key.
This key is also known as Concatenated Key.
For example, in an employee relation, we assume that an employee may be assigned multiple
roles, and an employee may work on multiple projects simultaneously. So the primary key will
be composed of all three attributes, namely Emp_ID, Emp_role, and Proj_ID in combination. So
these attributes act as a composite key since the primary key comprises more than one attribute.
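A composite primary key can be declared as a table-level constraint. The sketch below uses sqlite3; the table name Employee_Assignment and the sample rows are assumptions. Rows may repeat any single part of the key, but not the whole combination.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Employee_Assignment (
        Emp_ID   INT NOT NULL,
        Emp_role VARCHAR(20) NOT NULL,
        Proj_ID  INT NOT NULL,
        PRIMARY KEY (Emp_ID, Emp_role, Proj_ID)
    )
""")
conn.execute("INSERT INTO Employee_Assignment VALUES (1, 'tester', 10)")
# Same employee and project, different role: allowed.
conn.execute("INSERT INTO Employee_Assignment VALUES (1, 'developer', 10)")

try:
    # All three parts repeat an existing row: rejected.
    conn.execute("INSERT INTO Employee_Assignment VALUES (1, 'tester', 10)")
    duplicate_result = "ok"
except sqlite3.IntegrityError as exc:
    duplicate_result = str(exc)

row_count = conn.execute("SELECT COUNT(*) FROM Employee_Assignment").fetchone()[0]
print(duplicate_result, row_count)
```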
7. Artificial key
Keys created using arbitrarily assigned data are known as artificial keys. These keys are
created when a primary key is large and complex and has no relationship with many other
relations. The data values of artificial keys are usually numbered in serial order.
For example, the primary key composed of Emp_ID, Emp_role, and Proj_ID is large
in the employee relation. So it would be better to add a new virtual attribute to identify each
tuple in the relation uniquely.
Database Schema
A database schema is the skeleton structure that represents the logical view of the entire database.
It defines how the data is organized and how the relations among them are associated. It
formulates all the constraints that are to be applied on the data.
A database schema defines its entities and the relationship among them. It contains a descriptive
detail of the database, which can be depicted by means of schema diagrams. It’s the database
designers who design the schema to help programmers understand the database and make it
useful.
A database schema can be divided broadly into two categories −
o Physical Database Schema − This schema pertains to the actual storage of data and its
form of storage like files, indices, etc. It defines how the data will be stored in secondary
storage.
o Logical Database Schema − This schema defines all the logical constraints that need to be
applied on the data stored. It defines tables, views, and integrity constraints.
Database Instance
It is important to distinguish between a database schema and a database instance. A database
schema is the skeleton of the database. It is designed before the database exists. Once the
database is operational, it is very difficult to make any changes to the schema. A database schema
does not contain any data or information.
A database instance is the state of an operational database, with its data, at a given time. It contains a
snapshot of the database. Database instances tend to change with time. A DBMS ensures that
every instance (state) is valid by diligently following all the validations, constraints,
and conditions that the database designers have imposed.
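The distinction can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the loan table, its columns, and the helper instance() are hypothetical names chosen for the example. The CREATE TABLE statement plays the role of the schema, and each query result is one instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: structure and constraints only, no data.
conn.execute("""
    CREATE TABLE loan (
        loan_no     TEXT PRIMARY KEY,
        branch_name TEXT NOT NULL,
        amount      INTEGER CHECK (amount > 0)
    )
""")

def instance(connection):
    """Return the current instance (snapshot of rows) of the loan table."""
    query = "SELECT loan_no, branch_name, amount FROM loan ORDER BY loan_no"
    return connection.execute(query).fetchall()

empty_instance = instance(conn)          # schema exists, instance is empty

conn.execute("INSERT INTO loan VALUES ('L-17', 'Downtown', 1000)")
conn.execute("INSERT INTO loan VALUES ('L-23', 'Redwood', 2000)")
later_instance = instance(conn)          # same schema, different instance

# The DBMS refuses a change that would make the instance invalid.
try:
    conn.execute("INSERT INTO loan VALUES ('L-99', 'Mianus', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The schema is written once, while the instance changes with every insert. The rejected insert shows the DBMS keeping each instance consistent with the constraints in the schema.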
Relational Query Languages
Relational algebra is used to break user requests down into operations and instruct the DBMS to execute them.
A relational query language is used by the user to communicate with the database. Such languages are
generally at a higher level than general-purpose programming languages.
Relational query languages are divided into two types:
o Procedural Query Language
o Non-Procedural Language
Relational Algebra
Difference between Procedural and Non-Procedural language:
Procedural Language | Non-Procedural Language
It is a command-driven language. | It is a function-driven language.
It works through the state of the machine. | It works through mathematical functions.
Its semantics are quite tough. | Its semantics are very simple.
It returns only restricted data types and allowed values. | It can return any data type or value.
Overall efficiency is very high. | Overall efficiency is low compared to procedural languages.
The size of a program written in a procedural language is large. | The size of non-procedural language programs is small.
It is not suitable for time-critical applications. | It is suitable for time-critical applications.
Iterative loops and recursive calls are both used in procedural languages. | Recursive calls are used in non-procedural languages.
Relational algebra is a procedural query language. It gives a step by step process to obtain the
result of the query. It uses operators to perform queries.
Relational Operations
Types of Relational operation
1. Select Operation:
o The select operation selects tuples that satisfy a given predicate.
o It is denoted by sigma (σ).
Notation: σ p(r)
Where:
σ denotes selection, r is the relation, and p is the selection predicate: a propositional logic
formula which may use connectives such as AND, OR, and NOT, and relational
operators such as =, ≠, ≥, <, >, ≤.
For example: LOAN Relation
BRANCH_NAME LOAN_NO AMOUNT
Downtown L-17 1000
Redwood L-23 2000
Perryride L-15 1500
Downtown L-14 1500
Mianus L-13 500
Roundhill L-11 900
Perryride L-16 1300
Input:
σ BRANCH_NAME="Perryride" (LOAN)
Output:
BRANCH_NAME LOAN_NO AMOUNT
Perryride L-15 1500
Perryride L-16 1300
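The selection above can be mimicked in a few lines of Python. This is a minimal sketch, not a DBMS implementation: the LOAN relation is stored as a list of dicts, and the function name select and its argument order are assumptions made for the example.

```python
# LOAN relation from the example, one dict per tuple.
LOAN = [
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-17", "AMOUNT": 1000},
    {"BRANCH_NAME": "Redwood", "LOAN_NO": "L-23", "AMOUNT": 2000},
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-15", "AMOUNT": 1500},
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-14", "AMOUNT": 1500},
    {"BRANCH_NAME": "Mianus", "LOAN_NO": "L-13", "AMOUNT": 500},
    {"BRANCH_NAME": "Roundhill", "LOAN_NO": "L-11", "AMOUNT": 900},
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-16", "AMOUNT": 1300},
]

def select(predicate, relation):
    """Sketch of σ predicate (relation): keep tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

# σ BRANCH_NAME="Perryride" (LOAN)
perryride_loans = select(lambda t: t["BRANCH_NAME"] == "Perryride", LOAN)

# Predicates may combine comparisons with and / or / not.
big_downtown_loans = select(
    lambda t: t["BRANCH_NAME"] == "Downtown" and t["AMOUNT"] > 1200, LOAN
)
```

Selection never changes the attributes of the relation. It only reduces the set of tuples.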
2. Project Operation:
o This operation shows the list of those attributes that we wish to appear
in the result. The rest of the attributes are eliminated from the table.
o It is denoted by ∏.
Notation: ∏ A1, A2, ..., An (r)
Where
A1, A2, ..., An are attribute names of relation r.
Example: CUSTOMER RELATION
NAME STREET CITY
Jones Main Harrison
Smith North Rye
Hays Main Harrison
Curry North Rye
Johnson Alma Brooklyn
Brooks Senator Brooklyn
Input:
∏ NAME, CITY (CUSTOMER)
Output:
NAME CITY
Jones Harrison
Smith Rye
Hays Harrison
Curry Rye
Johnson Brooklyn
Brooks Brooklyn
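A projection can be sketched in the same style. The function name project is an assumption for this example. Because a relation is a set, the sketch also drops duplicate result tuples, which becomes visible when projecting on CITY alone.

```python
# CUSTOMER relation from the example, one dict per tuple.
CUSTOMER = [
    {"NAME": "Jones", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Smith", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Hays", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Curry", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Johnson", "STREET": "Alma", "CITY": "Brooklyn"},
    {"NAME": "Brooks", "STREET": "Senator", "CITY": "Brooklyn"},
]

def project(attributes, relation):
    """Sketch of ∏ attributes (relation): keep the listed attributes only."""
    seen = set()
    result = []
    for t in relation:
        values = tuple(t[a] for a in attributes)
        if values not in seen:      # duplicates are eliminated
            seen.add(values)
            result.append(values)
    return result

# ∏ NAME, CITY (CUSTOMER)
name_city = project(["NAME", "CITY"], CUSTOMER)

# ∏ CITY (CUSTOMER): six tuples collapse to three distinct cities.
cities = project(["CITY"], CUSTOMER)
```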
3. Union Operation:
o Suppose there are two relations R and S. The union operation contains all
the tuples that are either in R or S or both in R & S.
o It eliminates the duplicate tuples. It is denoted by ∪.
Notation: R ∪ S
A union operation must satisfy the following conditions:
o R and S must have the same number of attributes, and corresponding attributes must have compatible domains.
o Duplicate tuples are eliminated automatically.
Example:
DEPOSITOR RELATION
CUSTOMER_NAME ACCOUNT_NO
Johnson A-101
Smith A-121
Mayes A-321
Turner A-176
Johnson A-273
Jones A-472
Lindsay A-284
BORROW RELATION
CUSTOMER_NAME LOAN_NO
Jones L-17
Smith L-23
Hayes L-15
Jackson L-14
Curry L-93
Smith L-11
Williams L-17
Input:
∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)
Output:
CUSTOMER_NAME
Johnson
Smith
Hayes
Turner
Jones
Lindsay
Jackson
Curry
Williams
Mayes
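Python's built-in set type behaves like the union described here, so the example can be checked with a short sketch. The helper name customer_names is an assumption. It plays the role of ∏ CUSTOMER_NAME.

```python
# DEPOSITOR and BORROW relations from the example, as (name, number) tuples.
DEPOSITOR = [
    ("Johnson", "A-101"), ("Smith", "A-121"), ("Mayes", "A-321"),
    ("Turner", "A-176"), ("Johnson", "A-273"), ("Jones", "A-472"),
    ("Lindsay", "A-284"),
]
BORROW = [
    ("Jones", "L-17"), ("Smith", "L-23"), ("Hayes", "L-15"),
    ("Jackson", "L-14"), ("Curry", "L-93"), ("Smith", "L-11"),
    ("Williams", "L-17"),
]

def customer_names(relation):
    """∏ CUSTOMER_NAME: returns a set, so duplicate names disappear."""
    return {name for name, _ in relation}

# ∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)
all_customers = customer_names(BORROW) | customer_names(DEPOSITOR)
print(sorted(all_customers))
```

Both operands have a single attribute, so the same-number-of-attributes condition holds. Johnson, Smith, and Jones each appear only once in the result.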
4. Set Intersection:
o Suppose there are two relations R and S. The set intersection operation
contains all tuples that are in both R & S.
o It is denoted by ∩.
Notation: R ∩ S
Example: Using the above DEPOSITOR table and BORROW table
Input:
∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR)
Output:
CUSTOMER_NAME
Smith
Jones
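The same result can be reproduced with Python sets. In this sketch the two sets hold the customer names already projected from the BORROW and DEPOSITOR tables. The variable names are assumptions for the example.

```python
# ∏ CUSTOMER_NAME (BORROW) and ∏ CUSTOMER_NAME (DEPOSITOR), written out as sets.
borrowers = {"Jones", "Smith", "Hayes", "Jackson", "Curry", "Williams"}
depositors = {"Johnson", "Smith", "Mayes", "Turner", "Jones", "Lindsay"}

# R ∩ S: names present in both relations.
both = borrowers & depositors
print(sorted(both))
```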
5. Set Difference:
o Suppose there are two relations R and S. The set difference operation
contains all tuples that are in R but not in S.
o It is denoted by minus (-).
Notation: R - S
Example: Using the above DEPOSITOR table and BORROW table
Input:
∏ CUSTOMER_NAME (BORROW) - ∏ CUSTOMER_NAME (DEPOSITOR)
Output:
CUSTOMER_NAME
Jackson
Hayes
Williams
Curry
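A short sketch with Python sets shows the difference and also that, unlike union and intersection, the order of the operands matters. The sets hold the projected customer names from the BORROW and DEPOSITOR tables. The variable names are assumptions for the example.

```python
# ∏ CUSTOMER_NAME (BORROW) and ∏ CUSTOMER_NAME (DEPOSITOR), written out as sets.
borrowers = {"Jones", "Smith", "Hayes", "Jackson", "Curry", "Williams"}
depositors = {"Johnson", "Smith", "Mayes", "Turner", "Jones", "Lindsay"}

# R - S: borrowers who are not depositors.
only_borrowers = borrowers - depositors

# S - R gives a different relation: depositors who are not borrowers.
only_depositors = depositors - borrowers
```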
6. Cartesian product
o The Cartesian product combines each row in one table with
each row in the other table. It is also known as a cross product.
o It is denoted by X.
Notation: E X D
Example:
EMPLOYEE
EMP_ID EMP_NAME EMP_DEPT
1 Smith A
2 Harry C
3 John B
DEPARTMENT
DEPT_NO DEPT_NAME
A Marketing
B Sales
C Legal
Input:
EMPLOYEE X DEPARTMENT
Output:
EMP_ID EMP_NAME EMP_DEPT DEPT_NO DEPT_NAME
1 Smith A A Marketing
1 Smith A B Sales
1 Smith A C Legal
2 Harry C A Marketing
2 Harry C B Sales
2 Harry C C Legal
3 John B A Marketing
3 John B B Sales
3 John B C Legal
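The nine-row output can be reproduced with itertools.product from the Python standard library. This is a sketch with the two relations stored as lists of tuples. The variable names cross and matching are assumptions for the example.

```python
from itertools import product

# EMPLOYEE (EMP_ID, EMP_NAME, EMP_DEPT) and DEPARTMENT (DEPT_NO, DEPT_NAME).
EMPLOYEE = [(1, "Smith", "A"), (2, "Harry", "C"), (3, "John", "B")]
DEPARTMENT = [("A", "Marketing"), ("B", "Sales"), ("C", "Legal")]

# EMPLOYEE X DEPARTMENT: every employee tuple paired with every department tuple.
cross = [e + d for e, d in product(EMPLOYEE, DEPARTMENT)]

# Applying a selection EMP_DEPT = DEPT_NO to the product keeps only the
# pairs where the employee actually belongs to the department.
matching = [row for row in cross if row[2] == row[3]]

for row in cross:
    print(*row)
```

The product has 3 x 3 = 9 tuples with 3 + 2 = 5 attributes each. Only three of them pair an employee with the employee's own department.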
7. Rename Operation:
The rename operation is used to rename the output relation. It is denoted
by rho (ρ).
Example: We can use the rename operator to rename the STUDENT relation to
STUDENT1.
ρ(STUDENT1, STUDENT)
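A minimal sketch of the rename operation, assuming a relation is represented as a dict with a name, its attributes, and its tuples. The representation, the function name rename, and the two sample STUDENT tuples are all assumptions made for illustration.

```python
def rename(new_name, relation):
    """Sketch of ρ(new_name, relation): same attributes and tuples, new name."""
    return {**relation, "name": new_name}

# Hypothetical STUDENT relation with two illustrative tuples.
STUDENT = {
    "name": "STUDENT",
    "attributes": ("ROLL_NO", "NAME"),
    "tuples": {(1, "RAM"), (2, "RAMESH")},
}

# ρ(STUDENT1, STUDENT)
STUDENT1 = rename("STUDENT1", STUDENT)
```

The original relation is left untouched. Only the name of the result differs.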

  • 1. UNIT 2 DATABASE DESIGN 9 Database design & E-R Model: Entity–Relationship model (E-R model)–E-R Diagrams- Constraints-Extended E-R features. Introduction to Relational Model: Database schema–Keys- Schema Diagrams – Relational Query languages – Relational Operations ER (Entity Relationship) Diagram in DBMS o ER model stands for an Entity-Relationship model. It is a high-level data model. This model is used to define the data elements and relationship for a specified system. o It develops a conceptual design for the database. It also develops a very simple and easy to design view of data. o In ER modeling, the database structure is portrayed as a diagram called an entity- relationship diagram. For example, Suppose we design a school database. In this database, the student will be an entity with attributes like address, name, id, age, etc. The address can be another entity with attributes like city, street name, pin code, etc and there will be a relationship between them. Component of ER Diagram
  • 2. 1. Entity: An entity may be any object, class, person or place. In the ER diagram, an entity can be represented as rectangles. Consider an organization as an example- manager, product, employee, department etc. can be taken as an entity.
  • 3. a. Weak Entity An entity that depends on another entity called a weak entity. The weak entity doesn't contain any key attribute of its own. The weak entity is represented by a double rectangle. 2. Attribute The attribute is used to describe the property of an entity. Eclipse is used to represent an attribute. For example, id, age, contact number, name, etc. can be attributes of a student. a. Key Attribute The key attribute is used to represent the main characteristics of an entity. It represents a primary key. The key attribute is represented by an ellipse with the text underlined.
  • 4. b. Composite Attribute An attribute that composed of many other attributes is known as a composite attribute. The composite attribute is represented by an ellipse, and those ellipses are connected with an ellipse. c. Multivalued Attribute An attribute can have more than one value. These attributes are known as a multivalued attribute. The double oval is used to represent multivalued attribute. For example, a student can have more than one phone number.
  • 5. d. Derived Attribute An attribute that can be derived from other attribute is known as a derived attribute. It can be represented by a dashed ellipse. For example, A person's age changes over time and can be derived from another attribute like Date of birth. 3. Relationship A relationship is used to describe the relation between entities. Diamond or rhombus is used to represent the relationship.
  • 6. Types of relationship are as follows: a. One-to-One Relationship When only one instance of an entity is associated with the relationship, then it is known as one to one relationship. For example, A female can marry to one male, and a male can marry to one female. b. One-to-many relationship When only one instance of the entity on the left, and more than one instance of an entity on the right associates with the relationship then this is known as a one-to-many relationship. For example, Scientist can invent many inventions, but the invention is done by the only specific scientist. c. Many-to-one relationship When more than one instance of the entity on the left, and only one instance of an entity on the right associates with the relationship then it is known as a many-to-one relationship.
  • 7. For example, Student enrolls for only one course, but a course can have many students. d. Many-to-many relationship When more than one instance of the entity on the left, and more than one instance of an entity on the right associates with the relationship then it is known as a many-to-many relationship. For example, Employee can assign by many projects and project can have many employees. Notation of ER diagram Database can be represented using the notations. In ER diagram, many notations are used to express the cardinality. These notations are as follows:
  • 8. Fig: Notations of ER diagram Constraints Mapping Constraints o A mapping constraint is a data constraint that expresses the number of entities to which another entity can be related via a relationship set. o It is most useful in describing the relationship sets that involve more than two entity sets. o For binary relationship set R on an entity set A and B, there are four possible mapping cardinalities. These are as follows: 1. One to one (1:1) 2. One to many (1:M) 3. Many to one (M:1) 4. Many to many (M:M)
  • 9. One-to-one In one-to-one mapping, an entity in E1 is associated with at most one entity in E2, and an entity in E2 is associated with at most one entity in E1. One-to-many In one-to-many mapping, an entity in E1 is associated with any number of entities in E2, and an entity in E2 is associated with at most one entity in E1. Many-to-one In one-to-many mapping, an entity in E1 is associated with at most one entity in E2, and an entity in E2 is associated with any number of entities in E1.
  • 10. Many-to-many In many-to-many mapping, an entity in E1 is associated with any number of entities in E2, and an entity in E2 is associated with any number of entities in E1. Constraints In a database table, we can add rules to a column known as constraints. These rules control the data that can be stored in a column. For example, if a column has NOT NULL constraint, it means the column cannot store NULL values. The constraints used in SQL are: Constraint Description
  • 11. NOT NULL values cannot be null UNIQUE values cannot match any older value PRIMARY KEY used to uniquely identify a row FOREIGN KEY references a row in another table CHECK validates condition for new value DEFAULT set default value if not passed CREATE INDEX used to speedup the read process Note: These constraints are also called integrity constraints. NOT NULL Constraint The NOT NULL constraint in a column means that the column cannot store NULL values. For example, CREATE TABLE Colleges ( college_id INT NOT NULL, college_code VARCHAR(20) NOT NULL, college_name VARCHAR(50) );
  • 12. Run Code Here, the college_id and the college_code columns of the Colleges table won't allow NULL values. UNIQUE Constraint The UNIQUE constraint in a column means that the column must have unique value. For example, CREATE TABLE Colleges ( college_id INT NOT NULL UNIQUE, college_code VARCHAR(20) UNIQUE, college_name VARCHAR(50) ); Run Code Here, the value of the college_code column must be unique. Similarly, the value of college_id must be unique as well as it cannot store NULL values. PRIMARY KEY Constraint The PRIMARY KEY constraint is simply a combination of NOT NULL and UNIQUE constraints. It means that the column value is used to uniquely identify the row. For example, CREATE TABLE Colleges ( college_id INT PRIMARY KEY, college_code VARCHAR(20) NOT NULL, college_name VARCHAR(50) ); Run Code Here, the value of the college_id column is a unique identifier for a row. Similarly, it cannot store NULL value and must be UNIQUE.
  • 13. FOREIGN KEY Constraint The FOREIGN KEY (REFERENCES in some databases) constraint in a column is used to reference a record that exists in another table. For example, CREATE TABLE Orders ( order_id INT PRIMARY KEY, customer_id int REFERENCES Customers(id) ); Run Code Here, the value of the college_code column references the row in another table named Customers. It means that the value of customer_id in the Orders table must be a value from the id column of the Customers table. CHECK Constraint The CHECK constraint checks the condition before allowing values in a table. For example, CREATE TABLE Orders ( order_id INT PRIMARY KEY, amount int CHECK (amount >= 100) ); Run Code Here, the value of the amount column must be greater than or equal to 100. If not, the SQL statement results in an error. DEFAULT Constraint The DEFAULT constraint is used to set the default value if we try to store NULL in a column. For example, CREATE TABLE College ( college_id INT PRIMARY KEY, college_code VARCHAR(20),
  • 14. college_country VARCHAR(20) DEFAULT 'US' ); Run Code Here, the default value of the college_country column is US. If we try to store the NULL value in the college_country column, its value will be US. CREATE INDEX Constraint If a column has CREATE INDEX constraint, it's faster to retrieve data if we use that column for data retrieval. For example, -- create table CREATE TABLE Colleges ( college_id INT PRIMARY KEY, college_code VARCHAR(20) NOT NULL, college_name VARCHAR(50) ); -- create index CREATE INDEX college_index ON Colleges(college_code); Run Code Here, the SQL command creates an index named customers_index on the Customers table using customer_id column. Extended or Enhanced ER model in DBMS Extended ER is a high-level data model that incorporates the extensions to the original ER model. Enhanced ER models are high level models that represent the requirements and complexities of complex databases. The extended Entity Relationship (ER) models are three types as given below −  Aggregation  Specialization  Generalization Specialization
  • 15. The process of designing sub groupings within an entity set is called specialization. It is a top- down process. If an entity set is given with all the attributes in which the instances of the entity set are differentiated according to the given attribute value, then that sub-classes or the sub- entity sets can be formed from the given attribute. Example Specialization of a person allows us to distinguish a person according to whether they are employees or customers. Specialization of account creates two entity sets: savings account and current account. In the E-R diagram specialization is represented by triangle components labeled ISA. The ISA relationship is referred as superclass- subclass relationship as shown below − Generalization It is the reverse process of specialization. It is a bottom-up approach. It converts subclasses to superclasses. This process combines a number of entity sets that share the same features into higher-level entity sets.
  • 16. If the sub-class information is given for the given entity set then, ISA relationship type will be used to represent the connectivity between the subclass and superclass as shown below − Example Aggregation It is an abstraction in which relationship sets are treated as higher level entity sets and can participate in relationships. Aggregation allows us to indicate that a relationship set participates in another relationship set. Aggregation is used to simplify the details of a given database where ternary relationships will be changed into binary relationships. Ternary relation is only one type of relationship which is working between three entities. Aggregation is shown in the image below −
  • 17. Introduction to Relational Model The relational Model was proposed by E.F. Codd to model data in the form of relations or tables. After designing the conceptual model of the Database using ER diagram, we need to convert the conceptual model into a relational model which can be implemented using any RDBMS language like Oracle SQL, MySQL, etc. So we will see what the Relational Model is. What is the Relational Model? The relational model represents how data is stored in Relational Databases. A relational database stores data in the form of relations (tables). Consider a relation STUDENT with attributes ROLL_NO, NAME, ADDRESS, PHONE, and AGE shown in Table 1. STUDENT ROLL_NO NAME ADDRESS PHONE AGE 1 RAM DELHI 9455123451 18 2 RAMESH GURGAON 9652431543 18 3 SUJIT ROHTAK 9156253131 20 4 SURESH DELHI 18 IMPORTANT TERMINOLOGIES  Attribute: Attributes are the properties that define a relation. e.g.; ROLL_NO, NAME
  • 18.  Relation Schema: A relation schema represents the name of the relation with its attributes. e.g.; STUDENT (ROLL_NO, NAME, ADDRESS, PHONE, and AGE) is the relation schema for STUDENT. If a schema has more than 1 relation, it is called Relational Schema.  Tuple: Each row in the relation is known as a tuple. The above relation contains 4 tuples, one of which is shown as: 1 RAM DELHI 9455123451 18  Relation Instance: The set of tuples of a relation at a particular instance of time is called a relation instance. Table 1 shows the relation instance of STUDENT at a particular time. It can change whenever there is an insertion, deletion, or update in the database.  Degree: The number of attributes in the relation is known as the degree of the relation. The STUDENT relation defined above has degree 5.  Cardinality: The number of tuples in a relation is known as cardinality. The STUDENT relation defined above has cardinality 4.  Column: The column represents the set of values for a particular attribute. The column ROLL_NO is extracted from the relation STUDENT. ROLL_NO 1 2 3 4  NULL Values: The value which is not known or unavailable is called a NULL value. It is represented by blank space. e.g.; PHONE of STUDENT having ROLL_NO 4 is NULL. Constraints in Relational Model While designing the Relational Model, we define some conditions which must hold for data present in the database are called Constraints. These constraints are checked before performing any operation (insertion, deletion, and updation ) in the database. If there is a violation of any of the constraints, the operation will fail. Domain Constraints: These are attribute-level constraints. An attribute can only take values that lie inside the domain range. e.g; If a constraint AGE>0 is applied to STUDENT relation, inserting a negative value of AGE will result in failure. Key Integrity: Every relation in the database should have at least one set of attributes that defines a tuple uniquely. 
Those set of attributes is called keys. e.g.; ROLL_NO in STUDENT is a key. No two students can have the same roll number. So a key has two properties:  It should be unique for all tuples.
  • 19.  It can’t have NULL values. Referential Integrity: When one attribute of a relation can only take values from another attribute of the same relation or any other relation, it is called referential integrity. Let us suppose we have 2 relations STUDENT ROLL_NO NAME ADDRESS PHONE AGE BRANCH_CODE 1 RAM DELHI 9455123451 18 CS 2 RAMESH GURGAON 9652431543 18 CS 3 SUJIT ROHTAK 9156253131 20 ECE 4 SURESH DELHI 18 IT BRANCH BRANCH_CODE BRANCH_NAME CS COMPUTER SCIENCE IT INFORMATION TECHNOLOGY ECE ELECTRONICS AND COMMUNICATION ENGINEERING CV CIVIL ENGINEERING BRANCH_CODE of STUDENT can only take the values which are present in BRANCH_CODE of BRANCH which is called referential integrity constraint. The relation which is referencing another relation is called REFERENCING RELATION (STUDENT in this case) and the relation to which other relations refer is called REFERENCED RELATION (BRANCH in this case). ANOMALIES An anomaly is an irregularity or something which deviates from the expected or normal state. When designing databases, we identify three types of anomalies: Insert, Update and Delete. Insertion Anomaly in Referencing Relation: We can’t insert a row in REFERENCING RELATION if referencing attribute’s value is not present in the referenced attribute value. e.g.; Insertion of a student with BRANCH_CODE ‘ME’ in STUDENT relation will result in an error because ‘ME’ is not present in BRANCH_CODE of BRANCH.
  • 20. Deletion/ Updation Anomaly in Referenced Relation: We can’t delete or update a row from REFERENCED RELATION if the value of REFERENCED ATTRIBUTE is used in the value of REFERENCING ATTRIBUTE. e.g; if we try to delete a tuple from BRANCH having BRANCH_CODE ‘CS’, it will result in an error because ‘CS’ is referenced by BRANCH_CODE of STUDENT, but if we try to delete the row from BRANCH with BRANCH_CODE CV, it will be deleted as the value is not been used by referencing relation. It can be handled by the following method: ON DELETE CASCADE: It will delete the tuples from REFERENCING RELATION if the value used by REFERENCING ATTRIBUTE is deleted from REFERENCED RELATION. e.g; For, if we delete a row from BRANCH with BRANCH_CODE ‘CS’, the rows in STUDENT relation with BRANCH_CODE CS (ROLL_NO 1 and 2 in this case) will be deleted. ON UPDATE CASCADE: It will update the REFERENCING ATTRIBUTE in REFERENCING RELATION if the attribute value used by REFERENCING ATTRIBUTE is updated in REFERENCED RELATION. e.g;, if we update a row from BRANCH with BRANCH_CODE ‘CS’ to ‘CSE’, the rows in STUDENT relation with BRANCH_CODE CS (ROLL_NO 1 and 2 in this case) will be updated with BRANCH_CODE ‘CSE’. SUPER KEYS: Any set of attributes that allows us to identify unique rows (tuples) in a given relationship is known as super keys. Out of these super keys, we can always choose a proper subset among these which can be used as a primary key. Such keys are known as Candidate keys. If there is a combination of two or more attributes that are being used as the primary key then we call it a Composite key. Advantages:  Simple model  It is Flexible  It is Secure  Data accuracy  Data integrity  Operations can be applied easily Disadvantage:  Not good for large database  Relation between tables become difficult some time Basic Operators in Relational Algebra Article Contributed by Sonal Tuteja. 
Please write comments if you find anything incorrect, or if you want to share more information about the topic discussed above Database Schema A database schema is a structure that represents the logical storage of the data in a database. It represents the organization of data and provides information about the relationships between the tables in a given database. In this topic, we will understand more about database schema and its types. Before understanding database schema, lets first understand what a Database is.
  • 21. What is Database? A database is a place to store information. It can store the simplest data, such as a list of people as well as the most complex data. The database stores the information in a well-structured format. What is Database Schema? A database schema is the logical representation of a database, which shows how the data is stored logically in the entire database. It contains list of attributes and instruction that informs the database engine that how the data is organized and how the elements are related to each other. A database schema contains schema objects that may include tables, fields, packages, views, relationships, primary key, foreign key, In actual, the data is physically stored in files that may be in unstructured form, but to retrieve it and use it, we need to put it in a structured form. To do this, a database schema is used. It provides knowledge about how the data is organized in a database and how it is associated with other data. The schema does not physically contain the data itself; instead, it gives information about the shape of data and how it can be related to other tables or models. A database schema object includes the following: Consistent formatting for all data entries. Database objects and unique keys for all data entries. Tables with multiple columns, and each column contains its name and datatype. The complexity & the size of the schema vary as per the size of the project. It helps developers to easily manage and structure the database before coding it. The given diagram is an example of a database schema. It contains three tables, their data types. This also represents the relationships between the tables and primary keys as well as foreign keys.
  • 22. Types of Database Schema The database schema is divided into three types, which are: 1. Logical Schema 2. Physical Schema 3. View Schema
  • 23. 1. Physical Database Schema A physical database schema specifies how the data is stored physically on a storage system or disk storage in the form of Files and Indices. Designing a database at the physical level is called a physical schema. 2. Logical Database Schema The Logical database schema specifies all the logical constraints that need to be applied to the stored data. It defines the views, integrity constraints, and table. Here the term integrity constraints define the set of rules that are used by DBMS (Database Management System) to maintain the quality for insertion & update the data. The logical schema represents how the data is stored in the form of tables and how the attributes of a table are linked together. At this level, programmers and administrators work, and the implementation of the data structure is hidden at this level. Various tools are used to create a logical database schema, and these tools demonstrate the relationships between the component of your data; this process is called ER modelling. The ER modelling stands for entity-relationship modelling, which specifies the relationships between different entities. We can understand it with an example of a basic commerce application. Below is the schema diagram, the simple ER model representing the logical flow of transaction in a commerce application.
  • 24. In the given example, the Ids are given in each circle, and these Ids are primary keys and foreign keys. A primary key uniquely identifies an entry in a document or record. The Ids of the upper three circles are the primary keys. A foreign key is a column in one table that refers to the primary key of another table; it relates one table to another. FK marks the foreign keys in the diagram.
3. View Schema
The view-level design of a database is known as the view schema. This schema generally describes the end-user interaction with the database system.
Difference between the Physical and Logical Database Schema
Physical database schema | Logical database schema
It does not include the attributes. | It includes the attributes.
It contains both primary & secondary keys. | It also contains both primary & secondary keys.
It contains the table name. | It contains the names of the tables.
It contains the column names and their data types. | It does not contain any column name or datatype.
  • 25. Are Database Instance and Database Schema the same?
The terms database schema and database instance are related and are sometimes confused with each other, but they are different. A database schema is a representation of a planned database and does not actually contain the data. A database instance, on the other hand, is a snapshot of an actual database as it existed at an instant of time, so it changes over time. In contrast, the database schema is static, and changing the structure of a database is complex.
Instances and schemas are related to and affect each other through the DBMS. The DBMS ensures that every database instance complies with the constraints imposed by the database designers in the database schema.
Creating Schema
The "CREATE SCHEMA" statement exists in each of the database systems below, but each DBMS gives it a different meaning:
1. MySQL
In MySQL, the "CREATE SCHEMA" statement creates a database. This is because, in MySQL, CREATE SCHEMA is a synonym for CREATE DATABASE, and schema is a synonym for database.
2. Oracle Database
In Oracle Database, a schema already exists for each database user. Hence CREATE SCHEMA does not actually create a schema; rather, it lets a user create multiple tables and views in their own schema, and grant privileges on those objects, in a single transaction instead of issuing multiple separate SQL statements. The "CREATE USER" statement is what creates a schema in Oracle.
3. SQL Server
In SQL Server, the "CREATE SCHEMA" statement creates a new schema with the name provided by the user.
Database Schema Designs
  • 26. A schema design is the first step in building a foundation for data management. Ineffective schema designs are difficult to manage and consume more memory and other resources. The design depends on the business requirements, and choosing the correct database schema design eases the project lifecycle. Some popular database schema designs are listed below:
o Flat Model
o Hierarchical Model
o Network Model
o Relational Model
o Star Schema
o Snowflake Schema
Flat Model
A flat model schema is a kind of 2-D array in which each column contains the same type of data and the elements within a row are related to each other. It can be understood as a single spreadsheet or a database table with no relations. This schema design is most suitable for small applications that don't contain complex data.
Hierarchical Model
The hierarchical model has a tree-like structure. The tree contains a root node of data and its child nodes, with a one-to-many relationship between each parent node and its child nodes. Such schemas are commonly represented by XML or JSON files, as these files can contain entities with their sub-entities. Hierarchical schema models are best suited to storing nested data, such as the Hominoid classification.
Network Model
The network model is similar to the hierarchical model in that it represents a series of nodes and the links between them. The main difference is that the network model allows many-to-many relationships, whereas the hierarchical model only allows one-to-many relationships. The network model is best suited to applications that require spatial calculations. It is also good for representing workflows, mainly in cases with multiple paths to the same result.
Relational Model
The relational model is used for relational databases, which store data as relations (tables). Relational operators are used to manipulate the data and calculate different values from it.
  • 27. Star Schema
The star schema is a different way of organizing data. It is best suited to storing and analysing huge amounts of data, and it works on "Facts" and "Dimensions". A fact is a numerical data point about a business process, and a dimension is a description of a fact. With a star schema, we can structure the data of an RDBMS.
Snowflake Schema
The snowflake schema is an adaptation of the star schema. In the star schema, a main "Fact" table contains the main data points and references to its dimension tables. In the snowflake schema, the dimension tables can additionally have their own dimension tables.
Keys
o Keys play an important role in the relational database.
o A key is used to uniquely identify any record or row of data in a table. It is also used to establish and identify relationships between tables.
For example, ID is used as a key in the Student table because it is unique for each student. In the PERSON table, passport_number, license_number, and SSN are keys since they are unique for each person.
Types of keys:
  • 28. 1. Primary key
o It is the key chosen to identify one and only one instance of an entity uniquely. An entity can have multiple keys, as we saw in the PERSON table. The most suitable key from that list becomes the primary key.
o In the EMPLOYEE table, ID can be the primary key since it is unique for each employee. We could also select License_Number or Passport_Number as the primary key, since they are unique as well.
o For each entity, the primary key selection is based on the requirements and on the developers.
2. Candidate key
o A candidate key is a minimal attribute or set of attributes that can uniquely identify a tuple.
o The primary key is chosen from the candidate keys. The remaining candidate keys are as strong as the primary key.
For example: In the EMPLOYEE table, id is best suited for the primary key. The other unique attributes, such as SSN, Passport_Number, and License_Number, are also candidate keys.
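The candidate key definition above can be checked mechanically against sample data. The following is a minimal sketch, assuming a small EMPLOYEE instance whose rows are invented for illustration; the helper names `is_unique` and `candidate_keys` are hypothetical. Note that uniqueness in one sample only suggests a key: a real key is a rule that must hold for every valid instance of the table.

```python
from itertools import combinations

# Hypothetical EMPLOYEE rows; all values are invented for illustration.
EMPLOYEE = [
    {"ID": 1, "Name": "Asha", "SSN": "111", "Passport_Number": "P1", "License_Number": "L1"},
    {"ID": 2, "Name": "Ravi", "SSN": "222", "Passport_Number": "P2", "License_Number": "L2"},
    {"ID": 3, "Name": "Asha", "SSN": "333", "Passport_Number": "P3", "License_Number": "L3"},
]


def is_unique(rows, attrs):
    """Return True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Return the minimal attribute sets that are unique in this instance."""
    attrs = sorted(rows[0])
    keys = []
    # Try smaller attribute sets first so that minimal keys are found first.
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            # A superset of a key already found is unique but not minimal.
            if any(set(key) <= set(combo) for key in keys):
                continue
            if is_unique(rows, combo):
                keys.append(combo)
    return keys


print(candidate_keys(EMPLOYEE))
print(is_unique(EMPLOYEE, ["Name"]))  # two employees share a name
```

Any one of the sets returned by `candidate_keys` could be picked as the primary key; Name is excluded because two rows share the same value.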
  • 29. 3. Super Key
A super key is a set of attributes that can uniquely identify a tuple. Every super key is a superset of some candidate key.
For example: In the above EMPLOYEE table, consider (EMPLOYEE_ID, EMPLOYEE_NAME). The names of two employees can be the same, but their EMPLOYEE_ID can't be the same. Hence, this combination can also be a key.
The super keys would be EMPLOYEE_ID, (EMPLOYEE_ID, EMPLOYEE_NAME), etc.
4. Foreign key
o A foreign key is a column of a table that points to the primary key of another table.
  • 30. o Every employee works in a specific department in a company, and employee and department are two different entities. So we can't store the department's information in the employee table. Instead, we link the two tables through the primary key of one of them.
o We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute in the EMPLOYEE table.
o In the EMPLOYEE table, Department_Id is the foreign key, and the two tables are now related.
5. Alternate key
There may be one or more attributes, or combinations of attributes, that uniquely identify each tuple in a relation. These attributes or combinations of attributes are called candidate keys. One of the candidate keys is chosen as the primary key, and each remaining candidate key, if any exists, is termed an alternate key. In other words, the number of alternate keys is the number of candidate keys minus one (the primary key). An alternate key may or may not exist: if a relation has only one candidate key, it has no alternate key.
For example, the employee relation has two attributes, Employee_Id and PAN_No, that act as candidate keys. In this relation, Employee_Id is chosen as the primary key, so the other candidate key, PAN_No, acts as the alternate key.
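The foreign key and alternate key described above can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the column types, the Department_Name column, the sample values, and the helper `try_insert` are invented for illustration, and SQLite is only one possible DBMS. In SQLite, foreign keys are enforced only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE DEPARTMENT (
    Department_Id   INTEGER PRIMARY KEY,
    Department_Name TEXT NOT NULL            -- assumed column
);
CREATE TABLE EMPLOYEE (
    Employee_Id   INTEGER PRIMARY KEY,       -- primary key
    PAN_No        TEXT NOT NULL UNIQUE,      -- alternate key
    Department_Id INTEGER NOT NULL
                  REFERENCES DEPARTMENT(Department_Id)  -- foreign key
);
""")

conn.execute("INSERT INTO DEPARTMENT VALUES (10, 'Sales')")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'ABCDE1234F', 10)")


def try_insert(row):
    """Attempt an EMPLOYEE insert and report whether the constraints allowed it."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


print(try_insert((2, "ABCDE1234F", 10)))  # PAN_No already used by employee 1
print(try_insert((3, "PQRSX5678K", 99)))  # department 99 does not exist
print(try_insert((4, "LMNOP4321Z", 10)))  # satisfies every constraint
```

The first two inserts are refused by the UNIQUE and REFERENCES constraints respectively, which is the DBMS enforcing the alternate key and the foreign key on the designer's behalf.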
  • 31. 6. Composite key
Whenever a primary key consists of more than one attribute, it is known as a composite key. This key is also known as a concatenated key.
For example, in the employee relation, we assume that an employee may be assigned multiple roles and may work on multiple projects simultaneously. The primary key is therefore composed of all three attributes, Emp_ID, Emp_role, and Proj_ID, in combination. These attributes act as a composite key because the primary key comprises more than one attribute.
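The composite key example above can be sketched with Python's sqlite3 module. The table name EMPLOYEE_ASSIGNMENT, the column types, the sample rows, and the helper `insert_assignment` are assumptions made for illustration; only the three attribute names come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE EMPLOYEE_ASSIGNMENT (
    Emp_ID   INTEGER NOT NULL,
    Emp_role TEXT    NOT NULL,
    Proj_ID  INTEGER NOT NULL,
    PRIMARY KEY (Emp_ID, Emp_role, Proj_ID)  -- composite key
)
""")

# The same employee may hold several roles and work on several projects,
# as long as the three values taken together are never repeated.
rows = [(1, "Developer", 100), (1, "Tester", 100), (1, "Developer", 200)]
conn.executemany("INSERT INTO EMPLOYEE_ASSIGNMENT VALUES (?, ?, ?)", rows)


def insert_assignment(emp_id, role, proj_id):
    """Insert one assignment; return False if the composite key already exists."""
    try:
        conn.execute(
            "INSERT INTO EMPLOYEE_ASSIGNMENT VALUES (?, ?, ?)",
            (emp_id, role, proj_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(insert_assignment(1, "Developer", 100))  # exact combination already stored
print(insert_assignment(2, "Developer", 100))  # new combination
```

No single column is unique here; only the combination of all three is, which is why the key has to be declared at table level rather than on one column.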
  • 32. 7. Artificial key
Keys created using arbitrarily assigned data are known as artificial keys. These keys are created when a primary key is large and complex and has no relationship with many other relations. The data values of artificial keys are usually numbered in serial order.
For example, the primary key composed of Emp_ID, Emp_role, and Proj_ID is large in the employee relation. So it would be better to add a new virtual attribute that identifies each tuple in the relation uniquely.
Database Schema
A database schema is the skeleton structure that represents the logical view of the entire database. It defines how the data is organized and how the relations among the data are associated. It formulates all the constraints that are to be applied to the data.
A database schema defines its entities and the relationships among them. It contains descriptive detail of the database, which can be depicted by means of schema diagrams. It’s the database designers who design the schema to help programmers understand the database and make it useful.
  • 33. A database schema can be divided broadly into two categories:
o Physical Database Schema: This schema pertains to the actual storage of data and its form of storage, such as files and indices. It defines how the data will be stored in secondary storage.
o Logical Database Schema: This schema defines all the logical constraints that need to be applied to the stored data. It defines tables, views, and integrity constraints.
Database Instance
It is important to distinguish the terms schema and instance. A database schema is the skeleton of the database. It is designed before the database exists at all, and once the database is operational, it is very difficult to make changes to it. A database schema does not contain any data or information.
A database instance is the state of an operational database, with its data, at a given time. It is a snapshot of the database, and database instances tend to change with time. A DBMS ensures that every instance (state) is valid by diligently enforcing all the validations, constraints, and conditions that the database designers have imposed.
Relational Query Languages
A relational query language is used by the user to communicate with the database. Relational algebra is used to break down user requests and instruct the DBMS to execute them. Query languages generally work at a higher level than general-purpose programming languages.
  • 34. Relational query languages are further divided into two types:
o Procedural Query Language
o Non-Procedural Language
Difference between Procedural and Non-Procedural language:
Procedural Language | Non-Procedural Language
It is command-driven language. | It is a function-driven language.
It works through the state of machine. | It works through the mathematical functions.
Its semantics are quite tough. | Its semantics are very simple.
It returns only restricted data types and allowed values. | It can return any datatype or value.
Overall efficiency is very high. | Overall efficiency is low as compared to Procedural Language.
Size of the program written in Procedural language is large. | Size of the Non-Procedural language programs are small.
It is not suitable for time critical applications. | It is suitable for time critical applications.
Iterative loops and Recursive calls both are used in the Procedural languages. | Recursive calls are used in Non-Procedural languages.
Relational Algebra
Relational algebra is a procedural query language. It gives a step-by-step process to obtain the result of the query. It uses operators to perform queries.
  • 35. Relational Operations
Types of Relational operation
1. Select Operation:
o The select operation selects the tuples that satisfy a given predicate.
o It is denoted by sigma (σ).
Notation: σ p(r)
Where:
σ denotes the selection operation
r is the relation
p is the selection predicate, a propositional logic formula that may use connectives such as AND, OR, and NOT, and relational operators such as =, ≠, ≥, <, >, ≤.
For example: LOAN Relation
  • 36. BRANCH_NAME | LOAN_NO | AMOUNT
Downtown | L-17 | 1000
Redwood | L-23 | 2000
Perryride | L-15 | 1500
Downtown | L-14 | 1500
Mianus | L-13 | 500
Roundhill | L-11 | 900
Perryride | L-16 | 1300
Input:
σ BRANCH_NAME="Perryride" (LOAN)
Output:
BRANCH_NAME | LOAN_NO | AMOUNT
Perryride | L-15 | 1500
Perryride | L-16 | 1300
  • 37. 2. Project Operation:
o This operation lists the attributes that we wish to appear in the result. The rest of the attributes are eliminated from the table.
o It is denoted by ∏.
Notation: ∏ A1, A2, ..., An (r)
Where A1, A2, ..., An are attribute names of relation r.
Example: CUSTOMER RELATION
NAME | STREET | CITY
Jones | Main | Harrison
Smith | North | Rye
Hays | Main | Harrison
Curry | North | Rye
Johnson | Alma | Brooklyn
Brooks | Senator | Brooklyn
Input:
  • 38. ∏ NAME, CITY (CUSTOMER)
Output:
NAME | CITY
Jones | Harrison
Smith | Rye
Hays | Harrison
Curry | Rye
Johnson | Brooklyn
Brooks | Brooklyn
3. Union Operation:
o Suppose there are two relations R and S. The union operation contains all the tuples that are in R, in S, or in both R and S.
o It eliminates duplicate tuples. It is denoted by ∪.
Notation: R ∪ S
A union operation must satisfy the following conditions:
o R and S must have the same number of attributes.
o Duplicate tuples are eliminated automatically.
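Before moving on to the union example, the select and project queries shown above can be reproduced in a few lines of Python. This is a minimal sketch, assuming each relation is held as a list of dictionaries; the function names `select` and `project` are illustrative and not part of any library. The data is the LOAN and CUSTOMER relations from the slides, and the branch name is compared exactly as stored ("Perryride").

```python
LOAN = [
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-17", "AMOUNT": 1000},
    {"BRANCH_NAME": "Redwood", "LOAN_NO": "L-23", "AMOUNT": 2000},
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-15", "AMOUNT": 1500},
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-14", "AMOUNT": 1500},
    {"BRANCH_NAME": "Mianus", "LOAN_NO": "L-13", "AMOUNT": 500},
    {"BRANCH_NAME": "Roundhill", "LOAN_NO": "L-11", "AMOUNT": 900},
    {"BRANCH_NAME": "Perryride", "LOAN_NO": "L-16", "AMOUNT": 1300},
]

CUSTOMER = [
    {"NAME": "Jones", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Smith", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Hays", "STREET": "Main", "CITY": "Harrison"},
    {"NAME": "Curry", "STREET": "North", "CITY": "Rye"},
    {"NAME": "Johnson", "STREET": "Alma", "CITY": "Brooklyn"},
    {"NAME": "Brooks", "STREET": "Senator", "CITY": "Brooklyn"},
]


def select(relation, predicate):
    """Sigma: keep the tuples for which predicate(tuple) is true."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Pi: keep only the named attributes and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {name: row[name] for name in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


perryride_loans = select(LOAN, lambda t: t["BRANCH_NAME"] == "Perryride")
names_and_cities = project(CUSTOMER, ["NAME", "CITY"])
cities = project(CUSTOMER, ["CITY"])  # duplicates collapse to three cities

print(perryride_loans)
print(names_and_cities)
print(cities)
```

Projecting on CITY alone shows why the result of a projection is a relation and not just a column copy: the six customers collapse to three distinct tuples.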
  • 39. Example: DEPOSITOR RELATION
CUSTOMER_NAME | ACCOUNT_NO
Johnson | A-101
Smith | A-121
Mayes | A-321
Turner | A-176
Johnson | A-273
Jones | A-472
Lindsay | A-284
  • 40. BORROW RELATION
CUSTOMER_NAME | LOAN_NO
Jones | L-17
Smith | L-23
Hayes | L-15
Jackson | L-14
Curry | L-93
Smith | L-11
Williams | L-17
Input:
∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR)
Output:
CUSTOMER_NAME
Johnson
Smith
Hayes
Turner
Jones
Lindsay
Jackson
Curry
Williams
Mayes
  • 41. 4. Set Intersection:
o Suppose there are two relations R and S. The set intersection operation contains all tuples that are in both R and S.
o It is denoted by ∩.
Notation: R ∩ S
Example: Using the above DEPOSITOR table and BORROW table
Input:
∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR)
  • 42. Output:
CUSTOMER_NAME
Smith
Jones
5. Set Difference:
o Suppose there are two relations R and S. The set difference operation contains all tuples that are in R but not in S.
o It is denoted by minus (-).
Notation: R - S
Example: Using the above DEPOSITOR table and BORROW table
Input:
∏ CUSTOMER_NAME (BORROW) - ∏ CUSTOMER_NAME (DEPOSITOR)
Output:
CUSTOMER_NAME
Jackson
Hayes
Williams
Curry
  • 43. 6. Cartesian product
o The Cartesian product combines each row in one table with each row in the other table. It is also known as a cross product.
o It is denoted by ×.
Notation: E × D
Example:
EMPLOYEE
EMP_ID | EMP_NAME | EMP_DEPT
1 | Smith | A
2 | Harry | C
3 | John | B
DEPARTMENT
DEPT_NO | DEPT_NAME
A | Marketing
B | Sales
C | Legal
  • 44. Input:
EMPLOYEE × DEPARTMENT
Output:
EMP_ID | EMP_NAME | EMP_DEPT | DEPT_NO | DEPT_NAME
1 | Smith | A | A | Marketing
1 | Smith | A | B | Sales
1 | Smith | A | C | Legal
2 | Harry | C | A | Marketing
2 | Harry | C | B | Sales
2 | Harry | C | C | Legal
3 | John | B | A | Marketing
3 | John | B | B | Sales
3 | John | B | C | Legal
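The union, intersection, set difference, and Cartesian product examples above map directly onto Python's built-in sets and `itertools.product`. This is a small sketch using the customer names and the EMPLOYEE and DEPARTMENT rows from the slides; the variable names are illustrative. The name sets stand for the projections ∏ CUSTOMER_NAME (DEPOSITOR) and ∏ CUSTOMER_NAME (BORROW), so duplicates such as Johnson and Smith appear only once.

```python
from itertools import product

# Projections on CUSTOMER_NAME; a set drops duplicate names automatically.
DEPOSITOR_NAMES = {"Johnson", "Smith", "Mayes", "Turner", "Jones", "Lindsay"}
BORROW_NAMES = {"Jones", "Smith", "Hayes", "Jackson", "Curry", "Williams"}

union = BORROW_NAMES | DEPOSITOR_NAMES         # in either relation
intersection = BORROW_NAMES & DEPOSITOR_NAMES  # in both relations
difference = BORROW_NAMES - DEPOSITOR_NAMES    # borrowers with no deposit

# Tuples follow the column order used in the slides.
EMPLOYEE = [(1, "Smith", "A"), (2, "Harry", "C"), (3, "John", "B")]
DEPARTMENT = [("A", "Marketing"), ("B", "Sales"), ("C", "Legal")]

# Cartesian product: every EMPLOYEE tuple paired with every DEPARTMENT tuple.
cross = [emp + dept for emp, dept in product(EMPLOYEE, DEPARTMENT)]

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
for row in cross:
    print(row)
```

The product has 3 × 3 = 9 rows regardless of whether EMP_DEPT matches DEPT_NO; keeping only the matching rows would require a further select on the result.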
  • 45. 7. Rename Operation: The rename operation is used to rename the output relation. It is denoted by rho (ρ). Example: We can use the rename operator to rename STUDENT relation to STUDENT1. 1. ρ(STUDENT1, STUDENT)
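The rename operation above changes only the name of a relation, not its contents. A minimal sketch follows, assuming a relation is modelled as a small immutable record; the `Relation` class, the `rename` helper, and the STUDENT attributes and rows are all invented for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Relation:
    """A named relation: attribute names plus a set of tuples."""
    name: str
    attributes: tuple
    rows: frozenset


def rename(new_name, relation):
    """Rho(new_name, relation): same attributes and tuples under a new name."""
    return replace(relation, name=new_name)


# Hypothetical STUDENT relation; attributes and rows are made up.
STUDENT = Relation(
    name="STUDENT",
    attributes=("ROLL_NO", "NAME"),
    rows=frozenset({(1, "Asha"), (2, "Ravi")}),
)

STUDENT1 = rename("STUDENT1", STUDENT)

print(STUDENT1.name)
print(STUDENT1.rows == STUDENT.rows)
```

Because the record is immutable, `rename` returns a new relation and leaves STUDENT itself unchanged, which mirrors how relational algebra operators produce new relations rather than modifying their inputs.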