2. Overview
The management of data is ultimately about representing, capturing and
processing information about some aspect of reality.
An organization might be interested in managing data about any number
of domains, including, but not limited to:
• Employee information
• Customer information
• Documents
• Part inventories
• Product orders
• Service orders
• Geographic information
• Environmental conditions
• Information systems logs
3. Logs
A log is a record, composed of log entries; each entry
contains information related to a specific event that has occurred within
a system or network. Many logs within an organization contain records
related to computer security.
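As a rough illustration of the definition above, a log can be sketched as an ordered list of entries, each describing one event. The field names (`timestamp`, `source`, `event`, `security_related`) and the sample events are assumptions made for this sketch, not a standard log format.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LogEntry:
    """One entry: information about a single event (hypothetical fields)."""
    timestamp: datetime
    source: str                     # system or network component reporting the event
    event: str                      # short description of what happened
    security_related: bool = False  # whether the event concerns computer security


# A log is the ordered collection of its entries.
log = [
    LogEntry(datetime(2024, 1, 5, 9, 0), "web-server", "service started"),
    LogEntry(datetime(2024, 1, 5, 9, 2), "auth", "failed login", True),
    LogEntry(datetime(2024, 1, 5, 9, 3), "auth", "successful login", True),
]

# Select only the entries related to computer security.
security_entries = [entry for entry in log if entry.security_related]
print(len(security_entries))
```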
TASK:
What is a system log?
What is log management?
Explain the importance of log management.
Explain logging and monitoring.
What is a geographic information system?
4. The key elements of effective data
management for any organization are:
1. Data Models, an accurate and flexible representation of the
concepts to be managed within the organization.
2. Information systems, a technical implementation and
arrangement of data, software and hardware that provides
for efficient processing of the data specified in the data
model.
3. Social processes, an appropriate organization of humans
which allows the information system to be used in a safe and
effective manner.
5. The purpose is to provide a set of tools for understanding data
management from conceptual, technical, and social perspectives.
The various management roles in an organization include:
• General clients: those who depend on a data management system to
carry out daily tasks, but who are not involved directly in the technical
aspects of the system's design or operation
• Managers: those who manage people and operations through the use of a
data management system and who may or may not be directly involved in the
technical aspects of the system's design or operation
• Technical clients: those who are involved directly in the operation and
maintenance of a data management system
• Systems analysts: those who are directly involved in the planning,
design, implementation, and maintenance of data management systems.
6. The basic responsibilities of data
management technologies include:
1. Data models: define how the logical structure of a database is modeled,
how data items are connected to each other, and how they are processed and
stored inside the system. OR
Data models: mechanisms that allow clients to specify what data are to be
managed, including the logical relationships amongst them and the constraints
which must hold.
TASK:
Define logical relationship and its types.
2. Storage Management: providing mechanisms for storing data in a
logically coherent and space efficient manner.
7. 3. Access methods: providing mechanisms for locating desired data
amongst a very large collection of data and retrieving them efficiently
4. Query processing and data manipulation: providing mechanisms that
allow clients—people or software—to create, examine, change, and delete
data in a convenient manner
5. Security: providing mechanisms for making data secure from unwanted
access.
6. Application program interface: providing a mechanism by which other
software systems can make use of the data management system.
Correctly designed and operated data management technologies allow
systems such as those described above to run reliably and efficiently.
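Several of the responsibilities listed above can be seen together in a small sketch using Python's built-in `sqlite3` module. The `part` table, its columns, and the sample rows are hypothetical. The sketch only illustrates a data model with a constraint plus the create, examine, change, and delete operations. It does not cover security or storage management.

```python
import sqlite3

# In-memory database so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")

# Data model: what is stored, plus constraints that must hold.
conn.execute(
    """
    CREATE TABLE part (
        part_no  INTEGER PRIMARY KEY,
        name     TEXT NOT NULL,
        quantity INTEGER NOT NULL CHECK (quantity >= 0)
    )
    """
)

# Create
conn.execute("INSERT INTO part VALUES (?, ?, ?)", (1, "bolt", 100))
conn.execute("INSERT INTO part VALUES (?, ?, ?)", (2, "nut", 250))

# Change
conn.execute("UPDATE part SET quantity = quantity - 10 WHERE part_no = ?", (1,))

# Delete
conn.execute("DELETE FROM part WHERE part_no = ?", (2,))

# Examine
rows = conn.execute(
    "SELECT part_no, name, quantity FROM part ORDER BY part_no"
).fetchall()
print(rows)

# The constraint rejects data that violates the model.
try:
    conn.execute("INSERT INTO part VALUES (?, ?, ?)", (3, "washer", -5))
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True

conn.close()
```

Here the Python program plays the role of "other software" using the data management system through its application program interface.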
8. Representing reality through data
management
Data management is ultimately concerned with representing some
aspect of reality that must be recorded, analyzed, or communicated.
The aspect of reality that is chosen to be represented depends upon two
main determinants.
The first determinant is the nature of the domain for which the data are
to be managed.
Common generic data management domains are:
i. Objects: These can be physical or conceptual entities such as
automobiles, train or airplane reservations, or health records.
Data management in this domain involves recording and tracking objects.
9. ii. Events: These can be any type of occurrence for which a record is
desired, such as a business transaction or a bank deposit.
Data management in this domain involves recording events in a highly
reliable manner.
iii. Organizations: These can be individual businesses, communities,
governmental agencies, or departments within larger entities.
Organizational data management in this domain involves recording and
tracking: objects within the organization, including people; events and
processes within the organization; and relationships between processes
and entities.
10. iv. Physical phenomena: These can be any observable occurrences such as:
geological conditions, weather conditions, or astronomical processes. Data
management in this domain involves recording any necessary
measurements.
Note: One requirement that is common to most data management domains is
the need to support queries. A query in this context is a question that is posed
to and answered by a data management system using the data it is
managing.
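A query can be pictured as a function that answers a question from the managed data. The sketch below assumes a handful of made-up weather measurements. The record layout and the function name `days_warmer_than` are illustrative only.

```python
# Hypothetical weather measurements managed by the system.
measurements = [
    {"day": "Mon", "temperature_c": 18.5},
    {"day": "Tue", "temperature_c": 24.0},
    {"day": "Wed", "temperature_c": 26.5},
]


def days_warmer_than(records, threshold_c):
    """Answer the question: on which days was it warmer than threshold_c?"""
    return [r["day"] for r in records if r["temperature_c"] > threshold_c]


answer = days_warmer_than(measurements, 20.0)
print(answer)
```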
11. The second determinant of what aspects of reality are to be represented is
the set of tasks to be performed with the data that are to be managed.
Suppose we are designing a data management scheme for a grocery
store, and that one of the tasks the store needs to perform is
tracking its inventory so that it knows when certain items must be
replenished (filled up again).
The scheme must then represent not only product numbers and their
quantities in the inventory, but also information about which of those
products have been purchased by customers.
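The grocery store task can be sketched as follows. The product numbers, quantities, and the reorder level of 5 are invented for illustration. A real store would choose its own identifiers and thresholds.

```python
# Hypothetical product numbers mapped to the quantity currently in stock.
inventory = {1001: 12, 1002: 3, 1003: 40}

# Hypothetical threshold at or below which a product should be reordered.
REORDER_LEVEL = 5


def record_purchase(stock, purchases, product_no, quantity):
    """Record a customer purchase and reduce the stock accordingly."""
    if stock[product_no] < quantity:
        raise ValueError(f"not enough stock for product {product_no}")
    stock[product_no] -= quantity
    purchases.append((product_no, quantity))


def needs_replenishing(stock, level=REORDER_LEVEL):
    """Return product numbers whose stock is at or below the reorder level."""
    return sorted(p for p, qty in stock.items() if qty <= level)


purchases = []
record_purchase(inventory, purchases, 1001, 8)  # stock of 1001 drops to 4
record_purchase(inventory, purchases, 1003, 5)  # stock of 1003 drops to 35

print(needs_replenishing(inventory))
```

Tracking the purchases alongside the quantities is what lets the store know when an item has dropped low enough to be replenished.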
12. Common Terms:
What are data?
Datum is the singular of data. Facts that are represented by data may be
natural objects or phenomena, human-derived concepts, or some
combination of the two.
Information and meaning?
The concept of data is usually distinguished from the concepts of
information and knowledge. If a datum is a fact about the real world, the
meaning that we derive from it is information.
Knowledge
Knowledge constitutes an additional level of meaning that we derive from
information through some process. Sometimes this process is
observational.
Or simply: facts, information, and skills acquired through experience or
education.
13. What must be represented?
The fundamental problems of data modeling include:
deciding what entities must be represented within a chosen aspect of
reality,
what characteristics of those entities must be represented, and
how best to balance the solutions to both of the above problems.
TASK:
Differentiate data & reality.
14. Granularity
Granularity in the context of data modeling refers to the level of specificity
or detail at which something is represented. Almost all data modeling
problems offer the opportunity to represent some aspect of reality at ever
greater levels of detail. Representations that contain a lot of detail are said
to be fine grained with respect to granularity.
OR
The level of detail considered in a model or decision making process. The
greater the granularity, the deeper the level of detail.
OR
Highly detailed; having many small and distinct parts, as in "data analysis
on a granular level".
15. Example of Granularity
Granular data, as the name suggests, is data that is in pieces, as small as
possible, in order to be more defined and detailed. The advantage of granular data
is that it can be molded in any way that the data scientist or analyst requires.
If data is not granular, for example when a name or address is saved as a single
whole field, it is very difficult for analysts to mine and analyze because it comes
in large chunks.
A good example of data granularity is how a name is stored: whether it is
contained in a single field or subdivided into its constituents such as first name,
middle name and last name. As the data becomes more subdivided and specific, it
is also considered more granular.
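The name example can be sketched by comparing a coarse-grained and a fine-grained record. The class names and the sample names are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class CoarseCustomer:
    """Coarse-grained: the whole name is stored in one field."""
    name: str


@dataclass
class FineCustomer:
    """Fine-grained: the name is stored as its constituents."""
    first_name: str
    middle_name: str
    last_name: str


coarse = [CoarseCustomer("Maria Elena Santos"), CoarseCustomer("John Paul Rivera")]
fine = [
    FineCustomer("Maria", "Elena", "Santos"),
    FineCustomer("John", "Paul", "Rivera"),
]

# With fine-grained data, sorting by last name is a direct field access.
by_last_name = sorted(fine, key=lambda c: c.last_name)

# With coarse-grained data, the analyst must guess how to split the string,
# which breaks for names that do not follow a "first middle last" pattern.
guessed_last_names = [c.name.split()[-1] for c in coarse]

print([c.last_name for c in by_last_name])
print(guessed_last_names)
```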
https://www.youtube.com/watch?v=uwi4EvRXtc0
16. Identity: In creating a data model, it is almost always the case that multiple
entities must be represented. If entities within a collection of data lack
identity it will be difficult, if not impossible, for users to find data they need.
It is also important to be able to distinguish between entities of different
types.
Uniqueness: The assignment of unique identifiers to entities is often the
ideal way of distinguishing one entity from another.
Ex: It is common in transportation networks to assign a unique number to
each route. “If you want to go to Ottawa, take the 35, 37, or 39,” a
Montréal resident might say. VIA Rail assigns a unique number to
each combination of route and departure time, so the Montréal to Ottawa
route has a different number for each time of day.
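Unique identifiers behave like dictionary keys: each key picks out exactly one entity, and a key cannot be assigned twice. In the sketch below the numbers 35, 37, and 39 come from the example above, while the departure labels and the `add_train` helper are assumptions made for illustration.

```python
# Train numbers are taken from the example; the departure labels are
# placeholders invented for this sketch.
trains = {
    35: {"origin": "Montréal", "destination": "Ottawa", "departs": "morning"},
    37: {"origin": "Montréal", "destination": "Ottawa", "departs": "afternoon"},
    39: {"origin": "Montréal", "destination": "Ottawa", "departs": "evening"},
}


def add_train(registry, number, origin, destination, departs):
    """Register a train, refusing to reuse an identifier already taken."""
    if number in registry:
        raise KeyError(f"train number {number} is already assigned")
    registry[number] = {
        "origin": origin,
        "destination": destination,
        "departs": departs,
    }


# The identifier alone is enough to pick out exactly one entity.
print(trains[37]["departs"])

# Reusing an existing number is rejected, so identifiers stay unique.
try:
    add_train(trains, 35, "Montréal", "Ottawa", "night")
    duplicate_rejected = False
except KeyError:
    duplicate_rejected = True
```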
Montréal is the largest city in Canada's Québec province.
Ottawa is Canada’s capital, in the east of southern Ontario, near the city of
Montréal and the U.S. border.
18. Assigning values:
Another level of reality is represented by the values we choose to assign to
the attributes of the entities in a schema. Typically, data management
software, particularly a DBMS, can support the storage of various basic data
types.
These include types that represent the well-known numerical and logical
domains: integer, real, and Boolean.
Such systems also allow the storage of character or string data. Strings are
sequences of characters.
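The basic types named above map naturally onto typed attributes. The `Product` entity and its attribute names are hypothetical. Python's `int`, `float`, `bool`, and `str` stand in for the integer, real, Boolean, and string types a DBMS would offer.

```python
from dataclasses import dataclass


@dataclass
class Product:
    """Hypothetical entity whose attributes use the basic types named above."""
    product_no: int    # integer
    unit_price: float  # real
    in_stock: bool     # Boolean
    name: str          # string: a sequence of characters


item = Product(product_no=1001, unit_price=2.49, in_stock=True, name="milk")

# A string can be taken apart into its individual characters.
characters = list(item.name)
print(characters)
```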
19. Relationships:
Another aspect of reality that must often be captured is the set of
relationships that exist between entities in the real world.
Relationships can be viewed in a variety of ways. From one perspective,
relationships can be used to define the logical structure of a set of entities.
Another important perspective is that which defines what various entities
mean to each other.
one-to-one
We depict a 1:1 relationship between type A and type B.
20. one-to-many
We depict a 1:m relationship between
type A and type B.
many-to-many
We depict an m:m relationship between
type A and type B.
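The three kinds of relationship can be sketched with plain Python data structures. The entities used here (employees, badges, departments, customers, products) are invented stand-ins for type A and type B.

```python
# 1:1 -- each employee has exactly one badge, and each badge one employee.
badge_of = {"alice": "B-1", "bob": "B-2"}

# 1:m -- one department has many employees; each employee has one department.
employees_in = {"sales": ["alice", "bob"], "support": ["carol"]}

# m:m -- a customer may order many products, and a product may be ordered
# by many customers, so the relationship is stored as a set of pairs.
orders = {("dana", "milk"), ("dana", "bread"), ("erin", "milk")}


def products_ordered_by(customer):
    """Follow the m:m relationship from a customer to their products."""
    return sorted(p for c, p in orders if c == customer)


def customers_who_ordered(product):
    """Follow the m:m relationship from a product to its customers."""
    return sorted(c for c, p in orders if p == product)


print(products_ordered_by("dana"))
print(customers_who_ordered("milk"))
```

Storing the m:m case as pairs lets it be traversed from either side, which a single dictionary keyed on one type would not do directly.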
21. A Has-A relationship is also known as composition. It is used for code
reusability in Java. In Java, a Has-A relationship simply means that an
instance of one class has a reference to an instance of another class or
to another instance of the same class.
For example, a car has an engine, a dog has a tail, and so on.
An Is-A relationship depends on inheritance. Inheritance is of two
types, class inheritance and interface inheritance. It is also used for code
reusability in Java.
For example, a potato is a vegetable, a bus is a vehicle, a bulb is an
electronic device, and so on. One property of inheritance is that
it is unidirectional: we can say that a house is a building, but not all
buildings are houses.
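The text describes these relationships in Java terms. The same two ideas can be sketched in Python, used here only for brevity. The classes `Engine`, `Vehicle`, and `Car` are illustrative and follow the car and vehicle examples above.

```python
class Engine:
    def start(self):
        return "engine started"


class Vehicle:
    def describe(self):
        return "a vehicle"


class Car(Vehicle):             # Is-A: a Car is a Vehicle (inheritance)
    def __init__(self):
        self.engine = Engine()  # Has-A: a Car holds a reference to an Engine

    def start(self):
        # Reuse the Engine's behaviour by delegating to it.
        return self.engine.start()


car = Car()
print(car.start())
print(car.describe())                 # behaviour inherited from Vehicle
print(isinstance(car, Vehicle))       # a Car is a Vehicle ...
print(isinstance(Vehicle(), Car))     # ... but not every Vehicle is a Car
```

The last two lines show the unidirectional nature of inheritance: the Is-A relationship holds from `Car` to `Vehicle` but not the other way round.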
#6:Coherent: having the quality of holding together or cohering especially
#8:Determinant: a factor which decisively affects the nature or outcome of something
Object: a material thing that can be seen and touched
#21: Composite: made up of several parts or elements.
A composite entity is an entity that contains other entities; it is one that defines structure.
composition: ingredients or constituents; the way in which a whole or mixture is made up.