2. Data Independence
Data independence is the ability to change the schema of a database
at one level (e.g., the internal or conceptual level) without
having to change the schema at the next higher level.
There are two main types of data independence: logical
data independence and physical data independence.
3. Importance of Data Independence:
Scalability: Making it easier to scale the database system by
abstracting changes in physical storage and logical structure.
Maintenance: Reducing the need for reworking applications when
database changes are made.
Adaptability: Easing the migration to newer database technologies
or hardware without affecting user access.
Data Redundancy: Storing the same data in multiple places.
Redundancy makes schema changes harder, because every copy must
be changed consistently. With proper data independence, a
restructuring that removes redundancy can be made at one level
without disturbing the levels above it, which helps keep the
data consistent.
4. Data Independence in Distributed Databases
Data independence becomes even more
important in distributed databases, where data is
stored across multiple locations.
The goal is to ensure that changes in one part of the database
don't affect the global view or application logic.
5. Types of Data Independence
Physical Data Independence:
The ability to change the physical storage of the data
without impacting the logical schema.
Examples: Changing file formats, storage devices, or
indexing mechanisms.
Key challenges: Storage and performance
optimization.
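A minimal sketch of physical data independence, assuming SQLite (via Python's standard sqlite3 module) as the DBMS. The table contents and the index name idx_student_last_name are made up for illustration. The index changes how the data is accessed, while the query text and its result stay the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Student_ID INTEGER PRIMARY KEY, Last_Name TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [(1, "Doe"), (2, "Roe"), (3, "Doe")],
)

# The application only knows this query; it never mentions storage details.
QUERY = "SELECT Student_ID FROM Student WHERE Last_Name = ? ORDER BY Student_ID"

# Result before any physical change.
before = conn.execute(QUERY, ("Doe",)).fetchall()

# Physical change: add an index (hypothetical name). The logical schema,
# i.e. the table and its columns, is untouched.
conn.execute("CREATE INDEX idx_student_last_name ON Student (Last_Name)")

# The same query text still works and returns the same rows.
after = conn.execute(QUERY, ("Doe",)).fetchall()
print(before == after)  # True
```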
6. Logical Data Independence:
The ability to change the logical schema without
impacting the external schema or application
programs.
Examples: Adding new fields, changing data types, or
re-organizing relationships between entities.
Key challenges: Ensuring the change does not affect
user views or applications relying on the data.
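One common way to obtain logical data independence is to let applications read from a view rather than from base tables. The sketch below assumes SQLite through Python's sqlite3 module. The names Student_View, Student_New and Full_Name are hypothetical. The logical schema is reorganized, the view is redefined, and the application's query is left unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Original logical schema, plus an external view that applications read from.
conn.execute("CREATE TABLE Student (Student_ID INTEGER PRIMARY KEY, Full_Name TEXT)")
conn.execute("INSERT INTO Student VALUES (123, 'John Doe')")
conn.execute("CREATE VIEW Student_View AS SELECT Student_ID, Full_Name FROM Student")

# The only query the application uses.
APP_QUERY = "SELECT Student_ID, Full_Name FROM Student_View"
before = conn.execute(APP_QUERY).fetchall()

# Logical change: split Full_Name into First_Name and Last_Name.
conn.execute("DROP VIEW Student_View")
conn.execute(
    "CREATE TABLE Student_New ("
    "Student_ID INTEGER PRIMARY KEY, First_Name TEXT, Last_Name TEXT)"
)
conn.execute("INSERT INTO Student_New VALUES (123, 'John', 'Doe')")
conn.execute("DROP TABLE Student")

# Redefine the view so it presents the old shape on top of the new schema.
conn.execute(
    "CREATE VIEW Student_View AS "
    "SELECT Student_ID, First_Name || ' ' || Last_Name AS Full_Name "
    "FROM Student_New"
)

after = conn.execute(APP_QUERY).fetchall()
print(before == after)  # True
```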
8. Relational Data Model
The Relational Data Model is one of the most
widely used and well-known data models for
structuring and managing data in database
management systems (DBMS). It is based on the
concept of representing data in terms of tables
(relations), which are related to each other
through keys and constraints.
9. Key Concepts of the Relational Data Model
Relation (Table):
A relation is essentially a table with rows and columns.
A relation consists of a set of tuples (rows) and a set of attributes
(columns).
Each table represents an entity or a relationship between entities.
Attribute:
An attribute is a column in the table, which represents a
property or characteristic of the entity that the table
models.
Example: In a "Student" table, attributes might include
Student_ID, First_Name, Last_Name, DOB (Date of Birth), etc.
10. Tuple:
A tuple is a row in the table, which contains a specific value for each
attribute.
Example: A tuple in the "Student" table could be a record like: (123,
"John", "Doe", "2000-05-15").
Domain:
The domain of an attribute defines the set of permissible values for
that attribute.
Example: The domain of the attribute DOB might be a set of valid
date values.
Primary Key:
A primary key is a set of one or more attributes that uniquely
identifies each tuple in a relation (table).
Example: In a "Student" table, Student_ID could be the primary key,
as it uniquely identifies each student.
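The tuple, domain and primary key concepts can be tried out with a small sketch, assuming SQLite through Python's sqlite3 module. Storing DOB as ISO text with a CHECK constraint is one possible way to approximate a date domain in SQLite, not the only one. The helper try_insert and the extra rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Student (
        Student_ID INTEGER PRIMARY KEY,
        First_Name TEXT NOT NULL,
        Last_Name  TEXT NOT NULL,
        -- Domain of DOB: text that SQLite's date() accepts unchanged.
        DOB        TEXT NOT NULL CHECK (DOB IS date(DOB))
    )
    """
)

# The tuple from the example above.
conn.execute("INSERT INTO Student VALUES (123, 'John', 'Doe', '2000-05-15')")


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Student VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Same Student_ID as an existing tuple: violates the primary key.
duplicate_key_ok = try_insert((123, "Jane", "Roe", "2001-01-01"))
# Value outside the DOB domain: violates the CHECK constraint.
bad_domain_ok = try_insert((124, "Jane", "Roe", "not a date"))
# New key and a valid date: accepted.
valid_ok = try_insert((125, "Jane", "Roe", "2001-01-01"))

print(duplicate_key_ok, bad_domain_ok, valid_ok)  # False False True
```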
11. Foreign Key:
A foreign key is an attribute or set of attributes in one table that
refers to the primary key in another table.
It establishes a relationship between two tables.
Example: In a "Course_Enrollment" table, Student_ID might be a
foreign key that references the Student_ID in the "Student"
table.
Referential Integrity:
Referential integrity ensures that a foreign key value in a table
must either be null or match a primary key value in the
referenced table.
This prevents orphaned records and ensures the integrity of
relationships between tables.
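A minimal sketch of a foreign key with referential integrity, assuming SQLite through Python's sqlite3 module. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON. The columns Enrollment_ID and Course_Code and the helper enroll are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite

conn.execute("CREATE TABLE Student (Student_ID INTEGER PRIMARY KEY, First_Name TEXT)")
conn.execute(
    """
    CREATE TABLE Course_Enrollment (
        Enrollment_ID INTEGER PRIMARY KEY,
        Course_Code   TEXT NOT NULL,
        Student_ID    INTEGER REFERENCES Student (Student_ID)
    )
    """
)
conn.execute("INSERT INTO Student VALUES (123, 'John')")


def enroll(enrollment_id, course_code, student_id):
    """Return True if the enrollment row is accepted, False if it is rejected."""
    try:
        conn.execute(
            "INSERT INTO Course_Enrollment VALUES (?, ?, ?)",
            (enrollment_id, course_code, student_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


matching = enroll(1, "DB101", 123)  # matches an existing primary key value
null_fk = enroll(2, "DB101", None)  # a null foreign key is allowed
orphan = enroll(3, "DB101", 999)    # no such student: rejected

print(matching, null_fk, orphan)  # True True False
```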
12. Relations and Keys:
Superkey: A set of one or more attributes that can
uniquely identify a tuple in a relation. Every
relation must have at least one superkey.
Candidate Key: A minimal superkey, i.e., a
superkey where no attribute can be removed
without losing the ability to uniquely identify a
tuple.
13. Primary Key: A candidate key chosen to uniquely
identify tuples in the table.
Alternate Keys: Other candidate keys that are
not selected as the primary key.
Composite Key: A key (often the primary key) that
consists of two or more attributes.
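The difference between superkeys and candidate keys can be sketched in plain Python. Keys are properties of the schema, so checking a single sample of rows, as done here, can only show that an attribute set is not a key. The sample relation, the Email attribute and the function names are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical sample relation: attribute names and a few tuples.
ATTRIBUTES = ("Student_ID", "Email", "First_Name")
ROWS = [
    (123, "john@example.edu", "John"),
    (124, "jane@example.edu", "Jane"),
    (125, "j.doe@example.edu", "John"),
]


def is_superkey(attrs, attributes=ATTRIBUTES, rows=ROWS):
    """True if no two rows agree on every attribute in attrs."""
    idx = [attributes.index(a) for a in attrs]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return len(set(projected)) == len(projected)


def candidate_keys(attributes=ATTRIBUTES, rows=ROWS):
    """Minimal superkeys: superkeys that contain no smaller superkey."""
    keys = []
    # Visit smaller attribute sets first, so any superkey that contains
    # an already-found key is known not to be minimal.
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            contains_key = any(set(k) <= set(combo) for k in keys)
            if not contains_key and is_superkey(combo, attributes, rows):
                keys.append(combo)
    return keys


print(is_superkey(("Student_ID", "First_Name")))  # True, but not minimal
print(is_superkey(("First_Name",)))               # False, "John" repeats
print(candidate_keys())  # [('Student_ID',), ('Email',)]
```

If Student_ID is then chosen as the primary key, Email would be an alternate key in this sample.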
15. Advantages of the Relational Model
Simplicity: The relational model is easy to understand and
use, with tables (relations) being a familiar concept.
Flexibility: It is flexible in handling data, as it allows easy
modifications such as adding new attributes or tables.
Data Integrity: The relational model enforces constraints
(e.g., primary key, foreign key) that help maintain data
integrity.
Support for Query Languages: SQL, a powerful and widely
adopted query language, is built around the relational model.
Normalization: The relational model supports
normalization techniques to reduce redundancy and improve
data integrity.
16. Anomalies in Relational Model
Insertion Anomalies: An insertion anomaly is the
inability to insert data into the database because
other data is absent.
For example: Suppose we are dividing the whole
class into groups for a project and
the GroupNumber attribute is defined so that null
values are not allowed. If a new student is
admitted to the class but not immediately
assigned to a group then this student can't be
inserted into the database.
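The GroupNumber example can be reproduced with a short sketch, assuming SQLite through Python's sqlite3 module. The table name Class_Project and the two-table alternative (Student and Group_Assignment) are assumptions added for illustration. Splitting the table is one common remedy, not something the example above prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Single table: every student row must carry a group number.
conn.execute(
    "CREATE TABLE Class_Project ("
    "Student_ID INTEGER PRIMARY KEY, Name TEXT NOT NULL, "
    "GroupNumber INTEGER NOT NULL)"
)
conn.execute("INSERT INTO Class_Project VALUES (1, 'John', 1)")

# A new student without a group cannot be inserted.
try:
    conn.execute("INSERT INTO Class_Project VALUES (2, 'Jane', NULL)")
    inserted = True
except sqlite3.IntegrityError:
    inserted = False

# Possible remedy: keep student facts and group assignments in separate tables.
conn.execute("CREATE TABLE Student (Student_ID INTEGER PRIMARY KEY, Name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE Group_Assignment ("
    "Student_ID INTEGER PRIMARY KEY REFERENCES Student (Student_ID), "
    "GroupNumber INTEGER NOT NULL)"
)
conn.execute("INSERT INTO Student VALUES (2, 'Jane')")  # no group needed yet
student_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]

print(inserted, student_count)  # False 1
```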
17. Modification/Update Anomalies –
It is the data inconsistency that arises from data
redundancy and partial updates of data in the
database.
For example: Suppose the same data is stored in
several rows of a table. If a user updates one
copy without realizing that the data is stored
redundantly, the copies no longer agree and the
database becomes inconsistent.
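A small sketch of an update anomaly, assuming SQLite through Python's sqlite3 module. The Employee table, its columns and the sample values are made up for illustration. The department phone number is stored once per employee, so updating only one row leaves two conflicting values for the same department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee ("
    "Emp_ID INTEGER PRIMARY KEY, Name TEXT, Dept_Name TEXT, Dept_Phone TEXT)"
)
# The phone number of the Sales department is stored redundantly.
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [
        (1, "Asha", "Sales", "555-0100"),
        (2, "Ben", "Sales", "555-0100"),
    ],
)

# Partial update: only one of the redundant copies is changed.
conn.execute("UPDATE Employee SET Dept_Phone = '555-0199' WHERE Emp_ID = 1")

phones = conn.execute(
    "SELECT DISTINCT Dept_Phone FROM Employee "
    "WHERE Dept_Name = 'Sales' ORDER BY Dept_Phone"
).fetchall()

# One department now has two different phone numbers.
print(phones)  # [('555-0100',), ('555-0199',)]
```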
18. Deletion Anomalies - It is the accidental loss of
data in the database upon deletion of any other
data element.
For example: Suppose, we have an employee
relation that contains the details of the employee
along with the department they are working in.
Now, if a department has one employee working
in it and we remove the information of this
employee from the table, there will be a loss of
data related to the department also. The
department's details are lost even though only the
employee was meant to be removed.
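The employee and department example can be sketched as follows, assuming SQLite through Python's sqlite3 module. The column names and sample rows are hypothetical. Department details live only in employee rows, so deleting the last employee of a department removes the department as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employee ("
    "Emp_ID INTEGER PRIMARY KEY, Name TEXT, Dept_Name TEXT, Dept_Location TEXT)"
)
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?, ?)",
    [
        (1, "Asha", "Sales", "Building A"),
        (2, "Ben", "Research", "Building B"),  # the only Research employee
    ],
)

# Remove the employee; the intent is not to remove the department.
conn.execute("DELETE FROM Employee WHERE Emp_ID = 2")

remaining_departments = [
    row[0] for row in conn.execute("SELECT DISTINCT Dept_Name FROM Employee")
]

# The Research department and its location are gone as well.
print(remaining_departments)  # ['Sales']
```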