This chapter deals with: Basic Concepts of Databases, Sources of data, Evolution of Database,
Database Benefits, Types of Database, Database Design, Characteristics of Database, and Jobs with Database.
1. COURSE CODE : ECEg 4181
BY : EYOB S.
EMAIL : eyobce@gmail.com
2. § Basic Concepts of Databases
§ Sources of data
§ Evolution of Database
§ Database Benefits
§ Types of Database
§ Database Design
§ Characteristics of Database
§ Jobs with Database
3/5/2024
ECEg 4181 - By: Eyob S. 2
3. § Data: raw facts
§ It must be formatted for processing and storage
§ Big data: refers to huge amounts of data
§ Can be structured, unstructured or semistructured
§ Database: a collection of data organized to enable the
creation, reading, updating, and deletion (CRUD) of data
§ Information: generated from processed data
§ It requires context to determine meaning
§ It can become knowledge used for decision making
§ Metadata: data about data, or description of the data
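The four CRUD operations named above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the original slides; the table `student` and its columns are assumed names.

```python
import sqlite3

# In-memory database; the table and column names are made up for illustration.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")

# Create: insert a new record
con.execute("INSERT INTO student (id, name, dept) VALUES (?, ?, ?)", (1, "Abebe", "ECE"))

# Read: fetch the record back
row = con.execute("SELECT name, dept FROM student WHERE id = ?", (1,)).fetchone()

# Update: change one field of the record
con.execute("UPDATE student SET dept = ? WHERE id = ?", ("CS", 1))
updated = con.execute("SELECT dept FROM student WHERE id = ?", (1,)).fetchone()

# Delete: remove the record
con.execute("DELETE FROM student WHERE id = ?", (1,))
remaining = con.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(row, updated, remaining)
```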
4. § Data can be collected from different sources
§ You can collect from
§ Internet searching: e.g. Google Dataset, Kaggle, & Earthdata
§ Web scraping: the process of using bots to extract content
and data from a website
§ Transactions: from finances, banks, etc.
§ Querying servers through APIs: e.g. music data from the Spotify API
§ Querying a database, etc.
§ Nowadays, nearly every company uses data
6. § Traditional
§ Manual System: Files, folders, file cabinets
§ Computerized: Apple Numbers, Google Sheets, MS Excel
– Poor structure: changing a structure breaks the system
– Data dependency, redundancy, and inconsistency
– Data insecurity, integrity issues, decentralized data, etc.
§ Hierarchical and Network Models were introduced in the mid-1960s
§ In 1970, Edgar F. Codd introduced the Relational Model, which
provided a more flexible and intuitive way to organize and access
data using tables
7. § Scale
§ Spreadsheets can hold thousands of records, whereas
databases can hold millions or even billions of records
§ Frequency
§ Databases are designed to manage and process frequent
data operations efficiently (real-time updates and queries)
§ Speed
§ Databases can perform queries, updates, and other
transactions much faster
8. § Centralized data
§ The data is centrally located; it is a collection of persistent
data that can be shared and interrelated
§ Persistent – the data resides on stable storage since the
data is repeatedly used
§ Shared – the database can have multiple users
§ Interrelated – data stored as a separate unit can be
connected to provide a whole picture
§ A database contains a wealth of data about many aspects, which
is useful for decision making
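The "interrelated" property above can be sketched as two separately stored tables connected by a join. This is a minimal illustration using Python's sqlite3 module; the tables `department` and `student` and the sample rows are assumptions, not part of the slides.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT,
                      dept_id INTEGER REFERENCES department(id));
INSERT INTO department VALUES (1, 'ECE'), (2, 'CS');
INSERT INTO student VALUES (1, 'Abebe', 1), (2, 'Sara', 2), (3, 'Kebede', 1);
""")

# Each table is stored as a separate unit; the join connects them
# through dept_id to give the whole picture.
rows = con.execute("""
    SELECT s.name, d.name
    FROM student AS s
    JOIN department AS d ON s.dept_id = d.id
    ORDER BY s.id
""").fetchall()

print(rows)
```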
9. § Single-user (PCs) or Multi-user (Workspaces, Enterprises)
§ Centralized or Distributed (multi-location)
§ Cloud database (MS Azure, Amazon AWS, IBM, Oracle, etc.)
§ General purpose / Discipline-specific
§ Operational (OLTP, Transactional, Production) or Analytical
§ Structured or Unstructured (or Semi-structured)
§ eXtensible Markup Language (XML) databases
§ Not only SQL (NoSQL)
10. § Logical data format
§ How you visualize your data
§ Draw the entity relationship diagram (ERD)
§ Physical data format
§ Actual database
§ Database management system (DBMS)
§ Database system environments
§ Hardware: electronic devices (server side, client side,…)
§ Software: OS (operating systems), DBMS, applications, etc.
§ Information: information lives in a database.
§ Procedure: how data get into a database or come out from it.
§ People: database experts, programmers, end users, etc.
11. § DBMS (Database Management System)
§ Software through which you can interact with a database
§ A DBMS is a software system that enables users to define, create,
maintain, and control access to the database.
§ Examples:
§ MS Access
§ Oracle
§ MySQL
§ Ingres
§ MariaDB
§ PostgreSQL
§ Snowflake
§ SQLite,…etc.
§ MongoDB
§ DynamoDB
§ ScyllaDB
§ Redis
§ Neo4j
§ ArangoDB
§ HBase
§ Cassandra,…etc.
12. § The most common query language is the Structured Query
Language (SQL, pronounced “S-Q-L”, or sometimes “See-Quel”),
which is now both the formal and de facto standard
language for relational DBMSs.
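As a small illustration of what an SQL query looks like, the sketch below runs a declarative SELECT with grouping through Python's sqlite3 module. The `grade` table and its sample rows are assumptions made for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE grade (student TEXT, course TEXT, score INTEGER)")
con.executemany(
    "INSERT INTO grade VALUES (?, ?, ?)",
    [
        ("Abebe", "Databases", 85),
        ("Abebe", "Networks", 70),
        ("Sara", "Databases", 92),
        ("Sara", "Networks", 88),
    ],
)

# The query states WHAT is wanted (average score per student);
# the DBMS decides HOW to compute it.
averages = con.execute(
    "SELECT student, AVG(score) FROM grade GROUP BY student ORDER BY student"
).fetchall()

print(averages)
```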
13. § JSON (JavaScript Object Notation) plays a crucial role in NoSQL
databases, especially document-oriented databases like
MongoDB, Couchbase, and Firebase.
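A document in such a database is typically shaped like a JSON object with nested fields. The sketch below uses only Python's standard json module to show that shape; it does not talk to MongoDB or any other product, and the field names are assumed.

```python
import json

# A hypothetical student document with a nested list and a nested object.
doc = {
    "_id": 1,
    "name": "Abebe",
    "courses": [{"title": "Databases", "score": 85}],
    "contact": {"email": "abebe@example.com"},
}

text = json.dumps(doc)       # serialize to a JSON string for storage or transfer
restored = json.loads(text)  # parse the JSON string back into Python objects

# Nested fields are reached by walking the structure.
first_course = restored["courses"][0]["title"]
print(first_course)
```

Unlike a row in a relational table, the document carries its own structure, so two documents in the same collection need not have the same fields.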
14. § Self-describing nature of the database system
§ The database system contains not only the data itself but also metadata
§ Metadata is a complete definition (description) of the database
structure and its constraints
§ Insulation between data and program
§ which is also called program data independence
§ Metadata is stored in the DBMS catalog separately from the
access program
§ The characteristic that allows program-data independence is
called data abstraction
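The self-describing nature can be observed in SQLite, where the catalog is itself queryable. The sketch below is one DBMS's way of exposing metadata (the `sqlite_master` table and the `table_info` pragma); other systems use different catalog names. The `student` table is an assumed example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# The catalog lists the tables the database contains.
tables = [r[0] for r in con.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# table_info rows are (cid, name, type, notnull, dflt_value, pk);
# keep only the column name and declared type.
columns = [(r[1], r[2]) for r in con.execute("PRAGMA table_info(student)")]

print(tables, columns)
```

A program can discover the structure at run time from the catalog instead of hard-coding it, which is what makes program-data independence possible.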
15. § Support for multiple user views of the data
§ A view may be a subset of the database, or it may contain virtual data
that is derived from the database files, but not explicitly stored
§ A database has different users and each of them may require a
different perspective (view) of the database
§ Sharing of data and multiple user transaction processing
§ Concurrency control software ensures that multiple users
trying to update the same data do so in a controlled manner, so
that the result of the updates is correct
§ Isolation property ensures that each transaction appears to
execute in isolation from other transactions, even though
hundreds of transactions may be executing concurrently
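The two kinds of view described above (a subset of the data, and virtual data derived from it) can be sketched in SQLite as follows. The `employee` table, the view names, and the salary figures are assumptions for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER);
INSERT INTO employee VALUES (1, 'Abebe', 5000), (2, 'Sara', 7000);

-- A view that exposes only a subset of the columns.
CREATE VIEW employee_public AS
    SELECT id, name FROM employee;

-- A view with virtual data: annual_salary is derived, not stored.
CREATE VIEW employee_annual AS
    SELECT name, salary * 12 AS annual_salary FROM employee;
""")

public_rows = con.execute("SELECT * FROM employee_public ORDER BY id").fetchall()
annual_rows = con.execute("SELECT * FROM employee_annual ORDER BY name").fetchall()

print(public_rows, annual_rows)
```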
16. § DBMS has built-in facilities to support concurrent or
parallel execution of database programs
§ A sequence of read/write operations is considered an
atomic unit, in the sense that either all operations are
executed or none at all
§ Read/write operations can be executed at the same time
by the DBMS
§ DBMS should avoid any inconsistencies
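The all-or-nothing behaviour can be sketched with a money transfer in SQLite. In Python's sqlite3 module, using the connection as a context manager commits when the block succeeds and rolls back when it raises. The `account` table, the CHECK constraint, and the `transfer` helper are assumptions made for this example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
con.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
con.commit()

def transfer(con, src, dst, amount):
    """Move amount from src to dst as one atomic unit (hypothetical helper)."""
    with con:  # commit if both updates succeed, roll back if either fails
        con.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                    (amount, dst))
        con.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                    (amount, src))

# This transfer overdraws account 1: the second update violates the CHECK
# constraint, so the first update (the credit) is rolled back as well.
try:
    transfer(con, 1, 2, 500)
except sqlite3.IntegrityError:
    pass
after_failed = con.execute("SELECT balance FROM account ORDER BY id").fetchall()

# A valid transfer applies both updates together.
transfer(con, 1, 2, 30)
after_ok = con.execute("SELECT balance FROM account ORDER BY id").fetchall()

print(after_failed, after_ok)
```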
17. BACKUP AND RECOVERY FACILITIES
§ Backup and recovery facilities can be used to deal with
the effect of loss of data due to hardware or network
errors, or bugs in system or application software
§ Backup facilities can perform either a full or an incremental
backup
§ Recovery facilities allow restoration of the data to a
previous state after loss or damage occurs
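A full backup followed by a recovery can be sketched with the `Connection.backup` method of Python's sqlite3 module. This shows a full copy only, not an incremental backup; the `note` table and the simulated data loss are assumptions for illustration.

```python
import sqlite3

# The "production" database.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE note (id INTEGER PRIMARY KEY, body TEXT)")
source.execute("INSERT INTO note VALUES (1, 'important')")
source.commit()

# Full backup: copy every page of the source into another database.
backup = sqlite3.connect(":memory:")
source.backup(backup)

# Simulate data loss in the production database.
source.execute("DELETE FROM note")
source.commit()
lost_count = source.execute("SELECT COUNT(*) FROM note").fetchall()[0][0]

# Recovery: restore the saved state into a fresh database.
recovered = sqlite3.connect(":memory:")
backup.backup(recovered)
restored_rows = recovered.execute("SELECT id, body FROM note").fetchall()

print(lost_count, restored_rows)
```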
18. DATA SECURITY
§ Data security can be enforced by the DBMS
§ Some users have read access, while others have write
access to the data (role-based functionality)
§ Sophisticated granularity is possible
§ Data access can be managed via logins and passwords
assigned to users or user accounts
§ Each account has its own authorization rules that can be
stored in the catalog
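The idea of per-account authorization rules can be sketched in plain Python. This is a toy model only: the `catalog` dictionary and the `is_allowed` function are hypothetical names, and a real DBMS enforces such rules internally (for example through SQL GRANT statements in systems that support them).

```python
# Hypothetical authorization "catalog": account -> table -> allowed actions.
catalog = {
    "analyst": {"grades": {"read"}},
    "registrar": {"grades": {"read", "write"}},
}

def is_allowed(account, table, action):
    """Return True if the account holds the given privilege on the table."""
    return action in catalog.get(account, {}).get(table, set())

print(is_allowed("analyst", "grades", "read"))     # read access granted
print(is_allowed("analyst", "grades", "write"))    # write access denied
print(is_allowed("registrar", "grades", "write"))  # write access granted
```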
19. PERFORMANCE UTILITIES
§ There are three key performance indicators (KPIs) of a DBMS
§ Response time, denoting the time elapsed between issuing a
database request and the successful termination thereof
§ Throughput rate, representing the transactions a DBMS can
process per unit of time
§ Space utilization, referring to the space utilized by the DBMS to
store both raw data and metadata
§ DBMSs come with various types of utilities aimed at
improving these KPIs
§ e.g., utilities to distribute and optimize data storage, to tune
indexes for faster query execution, to tune queries to
improve application performance, or to optimize buffer
management
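Index tuning, one of the utilities mentioned above, can be sketched in SQLite by comparing the query plan before and after an index is created. The `student` table, the index name `idx_student_dept`, and the `plan` helper are assumptions; the exact plan wording differs between SQLite versions, so the example only checks whether the index is mentioned.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
con.executemany(
    "INSERT INTO student (name, dept) VALUES (?, ?)",
    [(f"s{i}", "ECE" if i % 2 else "CS") for i in range(1000)],
)

query = "SELECT name FROM student WHERE dept = ?"

def plan(con, sql, params):
    """Return the plan text; the fourth column of each row is the description."""
    rows = con.execute("EXPLAIN QUERY PLAN " + sql, params)
    return " ".join(row[3] for row in rows)

before = plan(con, query, ("ECE",))  # no index yet: the whole table is scanned
con.execute("CREATE INDEX idx_student_dept ON student (dept)")
after = plan(con, query, ("ECE",))   # the planner can now search via the index

print(before)
print(after)
```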
20. § System administrator
§ Database designer
§ Database developer
§ Database manager
§ Data architect (e.g., in cloud computing)
§ Data security officer
§ Database consultant
§ Data modeler
§ Data analyst
§ Computer system analyst
§ Data security analyst
§ Data scientist
§ Computer scientist, etc.