1. @ SMU
Fundamentals of Database System
Chapter One
1 | P a g e
CHAPTER ONE
INTRODUCTION
Database System
Database systems are designed to manage large data sets in an organization. Data
management involves both the definition and the manipulation of data, ranging
from the simple representation of data to the design of structures for the
storage of information. Data management also covers the provision of
mechanisms for manipulating that information.
Today, databases are essential to every business. They are used to maintain internal
records, to present data to customers and clients on the World Wide Web (WWW), and to
support many other commercial processes of modern organizations.
A database management system (DBMS) is a powerful software tool for creating and
managing large amounts of data efficiently and for allowing that data to persist safely over long periods of
time. In essence, a database is nothing more than a collection of shared
information that exists over a long period of time, often many years. Technically, the
term database refers to a collection of data that is managed by a DBMS.
Data management has passed through different levels of development along with
developments in technology and services. These can best be described as
three levels. Even though each new level brings an
advantage and overcomes a problem of the previous one, all three methods of data handling are still in
use to some extent. The three major levels are:
1. Manual Approach
2. Traditional File Based Approach
3. Database Approach
1. Manual Approach
In the manual approach, data storage and retrieval follow the primitive and traditional
way of information handling, where cards and paper are used for the purpose. Data
storage and retrieval are performed using human labor.
Files are kept for as many events and objects as the organization has, in order to store
information.
Each file, containing a particular kind of information, is labeled and stored in
one or more cabinets.
The cabinets can be kept in safe places for security purposes, based on the
sensitivity of the information they contain.
Insertion and retrieval are done by searching first for the right cabinet, then for the
right file, and then for the information.
One could have an indexing system to facilitate access to the data.
Limitations of the Manual approach
Prone to error
Not suitable for updating, retrieving, or integrating data
The data exists, but it is difficult to compile information from it
Limited to small volumes of information
Cross referencing is difficult
An alternative approach to data handling is a computerized way of dealing with the
information. The computerized approach can be either decentralized or centralized,
based on where the data resides in the system.
2. Traditional File Based Approach
After the introduction of computers for data processing to the business community, the
need to use them for data storage and processing increased. There were, and still are,
several computer applications that use file based processing for data
handling. Even though the approach has evolved over time, the basic structure is still similar,
if not identical.
File based systems were an early attempt to computerize the manual filing
system.
This approach is the decentralized computerized data handling method.
A collection of application programs performs services for the end users. In such
systems, every application program that provides a service to end users defines and
manages its own data.
Such systems have a number of programs for each of the different applications in
the organization.
Since every application defines and manages its own data, the system is
subject to a serious data duplication problem.
A file, in the traditional file based approach, is a collection of records that contain
logically related data.
Limitations of the Traditional File Based approach
As business applications became more complex and demanded more flexible and reliable data
handling methods, the shortcomings of the file based system became evident. These
shortcomings include, but are not limited to:
Separation or isolation of data: information available in one application may not be
known to other applications.
Limited data sharing
Lengthy development and maintenance time
Duplication or redundancy of data
Data dependency on the application
Incompatible file formats between different applications and programs creating
inconsistency.
Fixed query processing which is defined during application development.
The limitations of the traditional file based data handling approach arise from two basic
reasons.
1. The definition of the data is embedded in the application program, which makes it difficult to
modify the data definition.
2. There is no control over the access to and manipulation of the data beyond that imposed by the
application programs.
The most significant problem experienced in the traditional file based approach to data handling
is that of “update anomalies”. There are three types of update anomalies:
1. Modification anomalies: a problem experienced when one or more data values are modified in
one application program but not in others containing the same data set.
2. Deletion anomalies: a problem encountered when a record is deleted from one
application but remains untouched in other application programs.
3. Insertion anomalies: a problem experienced whenever a new data item is to be recorded
but the recording is not made in all the applications. In addition, when the same data item is inserted in
different applications, encoding errors can cause the new data item to be
treated as a totally different object.
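The three update anomalies can be illustrated with a minimal Python sketch. It assumes two hypothetical application programs (billing and shipping) that each keep a private copy of the same customer data; the dictionaries, the customer values and the helper `find_mismatches` are invented for illustration only.

```python
# Hypothetical illustration: two application programs, each keeping its own
# private copy of customer data, as in the file based approach.
billing_file = {"C001": {"name": "Customer A", "phone": "555-0100"}}
shipping_file = {"C001": {"name": "Customer A", "phone": "555-0100"}}


def find_mismatches(file_a, file_b):
    """Return keys whose records differ or exist in only one file."""
    keys = set(file_a) | set(file_b)
    return sorted(k for k in keys if file_a.get(k) != file_b.get(k))


# Modification anomaly: the phone number changes in billing only.
billing_file["C001"]["phone"] = "555-0199"
after_modification = find_mismatches(billing_file, shipping_file)

# Insertion anomaly: the same new customer is keyed in twice,
# with an encoding error (a misspelled name) in one copy.
billing_file["C002"] = {"name": "Customer B", "phone": "555-0111"}
shipping_file["C002"] = {"name": "Custmer B", "phone": "555-0111"}
after_insertion = find_mismatches(billing_file, shipping_file)

# Deletion anomaly: the customer is removed from billing but stays in shipping.
del billing_file["C001"]
after_deletion = find_mismatches(billing_file, shipping_file)

print(after_modification)
print(after_insertion)
print(after_deletion)
```

Because no component other than the application programs themselves controls the data, nothing detects these mismatches unless someone compares the files explicitly, as `find_mismatches` does here.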
3. Database Approach
Following a famous paper written by Ted Codd in 1970, database systems changed significantly.
Codd proposed that database systems should present the user with a view of data organized as
tables called relations. Behind the scenes, there might be a complex data structure that allowed
rapid response to a variety of queries. But, unlike the user of earlier database systems, the user of
a relational system would not be concerned with the storage structure. Queries could be
expressed in a very high-level language, which greatly increased the efficiency of database
programmers. The database approach emphasizes the integration and sharing of data throughout
the organization.
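As a rough sketch of what a high-level query over a relation looks like, the following uses Python's standard `sqlite3` module with an in-memory database. The `student` table, its columns and its rows are assumptions made up for this example; SQLite is used only because it is readily available, not because the text prescribes it.

```python
import sqlite3

# In-memory SQLite database; table and column names are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO student (id, name, dept) VALUES (?, ?, ?)",
    [(1, "Student A", "CS"), (2, "Student B", "IS"), (3, "Student C", "CS")],
)

QUERY = "SELECT name FROM student WHERE dept = ? ORDER BY name"

# The query states WHAT is wanted, not HOW the rows are stored or located.
cs_students = conn.execute(QUERY, ("CS",)).fetchall()

# Adding an index changes the storage structure behind the scenes,
# but the query text and its result stay the same.
conn.execute("CREATE INDEX idx_student_dept ON student (dept)")
cs_students_indexed = conn.execute(QUERY, ("CS",)).fetchall()

print(cs_students)
conn.close()
```

The user of the relation never mentions the index; whether the DBMS uses it is a decision made behind the scenes.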
Thus, in the database approach:
A database is just a computerized record keeping system, or a kind of electronic filing
cabinet.
A database is a repository for a collection of computerized data files.
A database is a shared collection of logically related data designed to meet the information
needs of an organization. Since it is a shared corporate resource, the database is
integrated with minimal or no duplication.
A database is a collection of logically related data, where the logically related data
comprise the entities, attributes, relationships, and business rules of an organization's
information.
In addition to containing the data required by an organization, a database also contains a
description of the data, which is called “Metadata”, the “Data Dictionary”, the “Systems
Catalogue”, or “Data about Data”.
Since a database contains information about the data (metadata), it is called a
self-descriptive collection of integrated records.
The purpose of a database is to store information and to allow users to retrieve and
update that information on demand.
A database is designed once and used simultaneously by many users.
Unlike the traditional file based approach, the database approach provides program-data
independence, that is, the separation of the data definition from the application. Thus the
application is not affected by changes made to the data structure and file organization.
Each database application performs some combination of creating the database, reading,
updating, and deleting data.
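Program-data independence can be sketched as follows, again with Python's `sqlite3` module as an assumed stand-in for a DBMS. The `employee` table and the function `employee_names` are hypothetical; the point is only that the application function is not edited when the data definition changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO employee (id, name) VALUES (1, 'Employee A')")


def employee_names(connection):
    """Application code: it names only the columns it needs."""
    rows = connection.execute("SELECT name FROM employee ORDER BY id")
    return [row[0] for row in rows]


before = employee_names(conn)

# The data definition changes: a new column is added to the table.
conn.execute("ALTER TABLE employee ADD COLUMN salary REAL")
conn.execute("INSERT INTO employee (id, name, salary) VALUES (2, 'Employee B', 5000.0)")

# The same application function runs without modification.
after = employee_names(conn)

print(before, after)
conn.close()
```

In a file based system, by contrast, the record layout would typically be written into each program, so a new field would mean changing every program that reads the file.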
Benefits of the database approach
Data can be shared: two or more users can access and use the same data instead of storing
the data redundantly for each user.
Improved accessibility of data: by using structured query languages, users can easily
access data without programming experience.
Redundancy can be reduced: isolated data is integrated in the database to decrease the
redundant data stored by different applications.
Quality data can be maintained: the integrity constraints in the database
approach maintain data quality, leading to better decision making.
Inconsistency can be avoided: controlled data redundancy avoids inconsistency of
the data in the database to some extent.
Transaction support can be provided: the basic demands of any transaction support system
are implemented in a full scale DBMS.
Integrity can be maintained: data from different applications is integrated
with additional constraints to facilitate a shared data resource.
Security measures can be enforced: the shared data can be secured by having different
levels of clearance and other data security mechanisms.
Improved decision support: the database provides information useful for decision
making.
Standards can be enforced: the different ways of using and dealing with data by
different units of an organization can be balanced and standardized by using the database
approach.
Compactness: since it is an electronic data handling method, the data is stored
compactly (no voluminous papers).
Speed: data storage and retrieval are fast because they use modern, fast computer
systems.
Less labor: unlike the other data handling methods, data maintenance does not demand
many resources.
Centralized information control: since the relevant data in the organization is stored in
one repository, it can be controlled and managed at the central level.
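The benefit of integrity constraints listed above can be sketched with a small example. It assumes SQLite (through Python's `sqlite3` module) as the DBMS and a hypothetical `account` table; the three rejected rows are invented to trigger one constraint each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE account (
        id      INTEGER PRIMARY KEY,
        owner   TEXT NOT NULL,
        balance REAL NOT NULL CHECK (balance >= 0)
    )
    """
)
conn.execute("INSERT INTO account (id, owner, balance) VALUES (1, 'Owner A', 100.0)")

bad_rows = [
    (1, "Duplicate", 10.0),  # violates PRIMARY KEY uniqueness
    (2, None, 10.0),         # violates NOT NULL on owner
    (3, "Negative", -5.0),   # violates CHECK (balance >= 0)
]

rejected = []
for row in bad_rows:
    try:
        conn.execute("INSERT INTO account (id, owner, balance) VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        # The DBMS, not the application, refuses the invalid row.
        rejected.append(row[0])

row_count = conn.execute("SELECT COUNT(*) FROM account").fetchone()[0]
print(rejected, row_count)
conn.close()
```

The rules are declared once with the data definition, so every application that shares the database is held to them.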
Limitations and risks of the Database Approach
Need for new professional and specialized personnel
Complexity in designing and managing data
The cost and risk during conversion from the old to the new system
High cost of developing and maintaining the system
Complex backup and recovery services from the user's perspective
Reduced performance due to centralization and data independence
High impact on the system when the central system fails
Database Management System (DBMS)
A Database Management System (DBMS) is a software package that provides
efficient, convenient, and safe multi-user (many people or programs accessing the same
database, or even the same data, simultaneously) storage of and access to massive amounts of
persistent (data outlives the programs that operate on it) data.
A DBMS also provides a systematic method for creating, updating, storing, and retrieving
data in a database. It also provides services for controlling data access, enforcing
data integrity, managing concurrency, and recovering from failures. With this in mind, a full
scale DBMS should provide at least the following services to the user.
1. Data storage, retrieval and update in the database
2. A user accessible catalogue
3. Transaction support service: ALL or NONE transactions, which minimize data
inconsistency.
4. Concurrency Control Services: access and update on the database by different users
simultaneously should be implemented correctly.
5. Recovery Services: a mechanism for recovering the database after a failure must be
available.
6. Authorization Services (Security): must support the implementation of access and
authorization services for the database administrator and users.
7. Support for Data Communication: should provide the facility to integrate with data
transfer software or data communication managers.
8. Integrity Services: rules about the data and the changes made to the data, covering the
correctness and consistency of stored data and the quality of data based on business
constraints.
9. Services to promote data independence between the data and the application
10. Utility services: sets of utility service facilities such as
Importing/exporting of data
Statistical analysis support
Index reorganization
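The ALL or NONE behaviour of the transaction support service in the list above can be sketched with Python's `sqlite3` module. The `account` table and the `transfer` and `balances` helpers are hypothetical; the sketch relies on the documented behaviour that using a `sqlite3` connection as a context manager commits on success and rolls back when an exception is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO account (id, balance) VALUES (?, ?)", [(1, 100.0), (2, 50.0)]
)
conn.commit()


def transfer(connection, source, target, amount):
    """Move `amount` between accounts; both updates succeed or neither does."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(connection):
    """Return {account id: balance} for all accounts."""
    return dict(connection.execute("SELECT id, balance FROM account ORDER BY id"))


first_ok = transfer(conn, 1, 2, 30.0)
after_first = balances(conn)

# The second transfer credits account 2, then fails the CHECK on account 1,
# so the credit is rolled back as well.
second_ok = transfer(conn, 1, 2, 500.0)
after_second = balances(conn)

print(first_ok, after_first)
print(second_ok, after_second)
conn.close()
```

The credit is deliberately executed before the debit so that the failed transfer leaves a partial change to undo; without the rollback, account 2 would keep money that never left account 1.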
DBMS and Components of DBMS Environment
The DBMS is a software package that helps to design, manage, and use data using the database
approach. Taking a DBMS as a system, one can describe it with respect to its environment or
the other systems interacting with it. The DBMS environment has five components: designing
and using a database involves the interaction or integration of Hardware, Software,
Data, Procedure and People.
1. Hardware: the components that one can touch and feel. These comprise
various types of personal computers, mainframes or server computers used in multi-user
systems, network infrastructure, and other peripherals required in the system.
2. Software: the collection of commands and programs used to make the hardware
perform a function. This includes components like the DBMS software, application programs,
operating systems, network software, language software and other relevant software.
3. Data: since the goal of any database system is to have better control of the data and to make
data useful, data is the most important component to the user of the database. There are two
categories of data in any database system: operational data and metadata. Operational data
is the data actually stored in the system to be used by the user. Metadata is the data that is used
to store information about the database itself. The structure of the data in the database is called
the schema, which is composed of the entities, the properties of entities, and the relationships between
entities, which will be discussed in the upcoming chapters.
4. Procedure: the rules and regulations on how to design and use a database. This includes
procedures such as how to log on to the DBMS, how to use facilities, how to start and stop
transactions, how to make backups, how to handle hardware and software failures, and how to change the
structure of the database.
5. People: the people in the organization who are responsible for or
play a role in designing, implementing, managing, administering and using the resources in the
database. This component ranges from people with a high level of knowledge about the
database and the design technology to others with no knowledge of the system beyond using the
data in the database.
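The distinction between operational data and metadata made under the Data component can be seen in a small sketch. It assumes SQLite as the DBMS, where the catalog is exposed through the `sqlite_master` table and the `PRAGMA table_info` statement; other DBMSs expose their catalogs differently. The `course` table and its row are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT NOT NULL)")
conn.execute(
    "INSERT INTO course (code, title) VALUES ('DB101', 'Fundamentals of Database System')"
)

# Operational data: the rows that users store and query.
operational = conn.execute("SELECT code, title FROM course").fetchall()

# Metadata: SQLite records table definitions in its own catalog table.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]

print(operational)
print(tables)
print(columns)
conn.close()
```

The catalog is queried with the same language as the operational data, which is what makes the database self-descriptive.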
Roles in Database Design and Use
As people are one of the components of the DBMS environment, there is a group of roles played by
different stakeholders in the design and operation of a database system.
1. Database Administrator (DBA)
Responsible for overseeing, controlling and managing the database resources (the database
itself, the DBMS and other related software)
Authorizes users’ access to the database
Coordinates and monitors the use of the database
Responsible for determining and acquiring hardware and software resources
Accountable for problems such as poor security and poor performance of the system
Involved in all steps of database development
This role can be classified further in big organizations that have huge amounts of data
and user requirements.
Data Administrator (DA): is responsible for the management of data resources,
which involves database planning and the development and maintenance of standards,
policies and procedures at the conceptual and logical design phases.
Database Administrator (DBA): is a more technically oriented role, responsible
for the physical realization of the database, which involves physical design,
implementation, and security and integrity control of the database.
2. Database Designer (DBD)
Identifies the data to be stored and chooses the appropriate structures to represent and
store the data.
Should understand the user requirements and should choose how the user views the
database.
Is involved in the design phase, before the implementation of the database system. There are
two kinds of database designers: one involved in the logical and conceptual
design and another involved in the physical design.
1. Logical and Conceptual DBD
Identifies data (entities, attributes and relationships) relevant to the organization
Identifies constraints on each data item
Understands data and business rules in the organization
Sees the database independently of any data model at the conceptual level and considers
one specific data model at the logical design phase.
2. Physical DBD
Takes the logical design specification as input and decides how it should be physically
realized.
o Maps the logical data model onto the specified DBMS with respect to tables
and integrity constraints (DBMS dependent design).
o Selects specific storage structures and access paths to the database
o Designs security measures required on the database
3. Application Programmer and System Analyst
o Determines the user requirements and how the user wants to view the
database.
o The application programmer implements these specifications as programs,
that is, codes, tests, debugs, documents and maintains the application program.
o Determines the interface for retrieving, inserting, updating and deleting data
in the database.
o The application could use any high level programming language, according
to availability, facilities and the required service.
4. End Users
Workers whose jobs require accessing the database frequently for various purposes.
There are different groups of users in this category.
1. Naive Users:
Sizable proportion of users
Unaware of the DBMS
Only access the database based on their access level and demand
Use standard and pre-specified types of queries.
2. Sophisticated Users
Are users familiar with the structure of the Database and facilities of
the DBMS.
Have complex requirements
Have higher level queries
Are usually engineers, scientists, business analysts, etc.
3. Casual Users
Users who access the database occasionally.
Need different information from the database each time.
Use sophisticated database queries to satisfy their needs.
Are usually middle to high level managers.
The people involved with a database can in turn be classified as “Actors on the Scene” and “Workers Behind the Scene”.
Actors on the Scene:
Data Administrator
Database Administrator
Database Designer
End Users
Workers Behind the Scene:
DBMS designers and implementers: people who design and implement the DBMS
software itself.
Tool Developers: experts who develop software packages that facilitate database system
design and use. Developers of prototyping, simulation, and code generation tools are
examples. Independent software vendors could also be categorized in this group.
Operators and Maintenance Personnel: system administrators who are responsible for
actually running and maintaining the hardware and software of the database system and
the information technology facilities.