Foundations of Business Intelligence: Databases and Information Management
Terminologies Used
• Analytic platform
• Attribute
• Big data
• Bit
• Byte
• Data administration
• Data cleansing
• Data definition
• Data dictionary
• Data governance
• Data inconsistency
• Data manipulation language
• Data mart
• Data mining
• Data quality audit
• Data redundancy
• Data warehouse
• Database
• Database administration
• Database management system (DBMS)
• Database server
• Entity
• Entity-relationship diagram
• Field
• File
• Foreign key
• Hadoop
• In-memory computing
• Information policy
• Key field
• Non-relational database management systems
• Normalization
• Online analytical processing (OLAP)
• Primary key
• Program-data dependence
• Record
• Referential integrity
• Relational DBMS
• Sentiment analysis
• Structured Query Language (SQL)
• Text mining
• Tuple
• Web mining
Learning Objectives
• 1 What are the problems of managing data resources in a traditional
file environment?
• 2 What are the major capabilities of database management systems
(DBMS), and why is a relational DBMS so powerful?
• 3 What are the principal tools and technologies for accessing
information from databases to improve business performance and
decision making?
• 4 Why are information policy, data administration, and data quality
assurance essential for managing the firm’s data resources?
Video Cases
• Case 1: Dubuque Uses Cloud Computing and Sensors to Build a
Smarter City
• Case 2: Brooks Brothers Closes In on Omnichannel Retail
• Case 3: Maruti Suzuki Business Intelligence and Enterprise Databases
File Organization Terms and Concepts
• Database: Group of related files
• File: Group of records of same type
• Record: Group of related fields
• Field: Group of characters as word(s) or number(s)
• Entity: Person, place, thing on which we store information
• Attribute: Each characteristic, or quality, describing entity
The Data Hierarchy
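The hierarchy above can be sketched in a few lines of Python. This is only an illustration: the `StudentRecord` type, its fields, and the sample values are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    """A record: a group of related fields describing one entity."""
    student_id: int  # field (attribute of the entity)
    name: str        # field
    course: str      # field


# A file: a group of records of the same type.
student_file = [
    StudentRecord(1, "Ana Ruiz", "MIS 101"),
    StudentRecord(2, "Ben Cole", "MIS 101"),
]

# A database: a group of related files.
database = {
    "students": student_file,
    "courses": [{"course": "MIS 101", "credits": 3}],
}

# A field is a group of characters; each ASCII character takes one byte (8 bits).
name_bytes = student_file[0].name.encode("utf-8")
print(len(name_bytes), "bytes =", len(name_bytes) * 8, "bits")
```

Here each student is an entity, and `student_id`, `name`, and `course` are its attributes.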
Problems with the Traditional File Environment
• Files maintained separately by different departments
• Data redundancy
• Data inconsistency
• Program-data dependence
• Lack of flexibility
• Poor security
• Lack of data sharing and availability
Traditional File Processing
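As a rough illustration of how redundancy leads to inconsistency, the sketch below compares two departmental files that each hold their own copy of the same customer. The file names, the customer, and the helper `find_inconsistencies` are hypothetical.

```python
# Two departments keep their own copy of the same customer (data redundancy).
sales_file = {"C100": {"name": "Dana Lee", "address": "12 Oak St"}}
# The address was updated only in billing, so the copies now disagree.
billing_file = {"C100": {"name": "Dana Lee", "address": "98 Elm Ave"}}


def find_inconsistencies(file_a, file_b):
    """Return (key, field, value_a, value_b) for shared records whose fields differ."""
    problems = []
    for key in sorted(file_a.keys() & file_b.keys()):
        shared_fields = file_a[key].keys() & file_b[key].keys()
        for field in sorted(shared_fields):
            if file_a[key][field] != file_b[key][field]:
                problems.append((key, field, file_a[key][field], file_b[key][field]))
    return problems


print(find_inconsistencies(sales_file, billing_file))
```

With a single shared copy of the customer record there would be nothing to reconcile, which is the motivation for the database approach that follows.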
Database Management Systems
• Database
– Serves many applications by centralizing data and controlling redundant data
• Database management system (DBMS)
– Interfaces between applications and physical data files
– Separates logical and physical views of data
– Solves problems of traditional file environment
• Controls redundancy
• Eliminates inconsistency
• Uncouples programs and data
• Enables organization to centrally manage data and data security
Human Resources Database with Multiple Views
Relational DBMS
• Represent data as two-dimensional tables
• Each table contains data on entity and attributes
• Table: grid of columns and rows
• Rows (tuples): Records for different entities
• Fields (columns): Represent attributes of the entity
• Key field: Field used to uniquely identify each record
• Primary key: Key field that uniquely identifies every record in its own table
• Foreign key: Primary key used in second table as look-up field to identify
records from original table
Relational Database Tables
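A minimal sketch of primary and foreign keys, using Python's built-in `sqlite3` module. The `supplier` and `part` tables and their rows are assumptions made for illustration, not the tables from the slide's figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (
    supplier_id INTEGER PRIMARY KEY,  -- primary key: unique per supplier
    name        TEXT NOT NULL
);
CREATE TABLE part (
    part_id     INTEGER PRIMARY KEY,  -- primary key: unique per part
    name        TEXT NOT NULL,
    supplier_id INTEGER REFERENCES supplier(supplier_id)  -- foreign key
);
INSERT INTO supplier VALUES (1, 'Acme Parts'), (2, 'Bolt Works');
INSERT INTO part VALUES (137, 'Door latch', 1), (145, 'Side mirror', 2);
""")

# Follow the foreign key from a part to its supplier record.
(supplier_id,) = conn.execute(
    "SELECT supplier_id FROM part WHERE part_id = ?", (137,)
).fetchone()
(supplier_name,) = conn.execute(
    "SELECT name FROM supplier WHERE supplier_id = ?", (supplier_id,)
).fetchone()
print(supplier_name)

# A primary key must be unique, so a second part 137 is rejected.
try:
    conn.execute("INSERT INTO part VALUES (137, 'Duplicate latch', 2)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)
```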
Operations of a Relational DBMS
• Three basic operations used to develop useful sets of data
• SELECT
• Creates subset of data of all records that meet stated criteria
• JOIN
• Combines relational tables to provide user with more information than available in
individual tables
• PROJECT
• Creates subset of columns in table, creating tables with only the information specified
The Three Basic Operations of a Relational DBMS
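The three operations can be tried out with `sqlite3`. The tables and rows below are hypothetical. Note that in SQL the relational select is expressed by the `WHERE` clause and project by the column list, so all three statements start with the SQL keyword `SELECT`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE part (part_id INTEGER PRIMARY KEY, name TEXT, supplier_id INTEGER);
INSERT INTO supplier VALUES (1, 'Acme Parts'), (2, 'Bolt Works');
INSERT INTO part VALUES (137, 'Door latch', 1),
                        (145, 'Side mirror', 2),
                        (150, 'Door seal', 1);
""")

# SELECT: the subset of rows that meet the stated criteria.
selected = conn.execute(
    "SELECT * FROM part WHERE part_id IN (137, 150) ORDER BY part_id"
).fetchall()

# JOIN: combine the two tables on the shared supplier_id column.
joined = conn.execute(
    "SELECT p.part_id, p.name, s.name "
    "FROM part AS p JOIN supplier AS s ON s.supplier_id = p.supplier_id "
    "ORDER BY p.part_id"
).fetchall()

# PROJECT: keep only the columns of interest.
projected = conn.execute(
    "SELECT part_id, name FROM part ORDER BY part_id"
).fetchall()

for label, rows in (("select", selected), ("join", joined), ("project", projected)):
    print(label, rows)
```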
Capabilities of Database Management Systems
• Data definition capability
• Data dictionary
• Querying and reporting
• Data manipulation language
• Structured Query Language (SQL)
• Many DBMS have report generation capabilities for creating polished
reports (for example, Microsoft Access)
Access Data Dictionary Features
Example of an SQL Query
An Access Query
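The capabilities listed above can be sketched with SQLite rather than Access (a substitution made here so the example runs anywhere Python does). The `employee` table is hypothetical. `PRAGMA table_info` is SQLite's way of reporting column definitions and stands in for a data dictionary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition: describe the structure of the table.
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, name TEXT NOT NULL, salary REAL)"
)

# Data dictionary: table_info rows are (cid, name, type, notnull, dflt_value, pk).
dictionary = [
    (name, col_type, bool(notnull), bool(pk))
    for _cid, name, col_type, notnull, _default, pk
    in conn.execute("PRAGMA table_info(employee)")
]

# Data manipulation language: add, change, and query rows with SQL.
conn.executemany(
    "INSERT INTO employee (name, salary) VALUES (?, ?)",
    [("Ana", 52000.0), ("Ben", 48000.0)],
)
conn.execute("UPDATE employee SET salary = salary + 1000 WHERE name = ?", ("Ben",))
rows = conn.execute(
    "SELECT name, salary FROM employee WHERE salary > ? ORDER BY name", (48500,)
).fetchall()

print(dictionary)
print(rows)
```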
Designing Databases
• Conceptual design vs. physical design
• Normalization
• Streamlining complex groupings of data to minimize redundant data elements
and awkward many-to-many relationships
• Referential integrity
• Rules used by RDBMS to ensure relationships between tables remain
consistent
• Entity-relationship diagram
• A correct data model is essential for a system serving the business
well
An Unnormalized Relation for Order
Normalized Tables Created from Order
An Entity-Relationship Diagram
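A rough sketch of normalization and referential integrity, assuming a hypothetical order relation in which every line repeats the part and supplier details. The table and column names are illustrative only. SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

# Unnormalized: each order line repeats part and supplier details.
unnormalized = [
    # (order_no, part_no, part_name, supplier_no, supplier_name)
    (1001, 137, "Door latch", 8259, "Acme Parts"),
    (1001, 145, "Side mirror", 8259, "Acme Parts"),
    (1002, 137, "Door latch", 8259, "Acme Parts"),
]

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable referential integrity checks
conn.executescript("""
CREATE TABLE supplier (supplier_no INTEGER PRIMARY KEY, supplier_name TEXT);
CREATE TABLE part (
    part_no     INTEGER PRIMARY KEY,
    part_name   TEXT,
    supplier_no INTEGER NOT NULL REFERENCES supplier(supplier_no)
);
CREATE TABLE line_item (
    order_no INTEGER,
    part_no  INTEGER NOT NULL REFERENCES part(part_no),
    PRIMARY KEY (order_no, part_no)
);
""")

# Normalize: store each supplier and part once, and keep only keys per line.
for order_no, part_no, part_name, supplier_no, supplier_name in unnormalized:
    conn.execute("INSERT OR IGNORE INTO supplier VALUES (?, ?)", (supplier_no, supplier_name))
    conn.execute("INSERT OR IGNORE INTO part VALUES (?, ?, ?)", (part_no, part_name, supplier_no))
    conn.execute("INSERT INTO line_item VALUES (?, ?)", (order_no, part_no))

counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("supplier", "part", "line_item")
}
print(counts)

# Referential integrity: a part may not point at a supplier that does not exist.
try:
    conn.execute("INSERT INTO part VALUES (?, ?, ?)", (150, "Door seal", 9999))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print(orphan_rejected)
```

The supplier name is now stored once instead of three times, so it can be changed in one place.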
Non-relational Databases and Databases in the Cloud
• Non-relational databases: “NoSQL”
• More flexible data model
• Data sets stored across distributed machines
• Easier to scale
• Handle large volumes of unstructured and structured data
• Databases in the cloud
• Appeal to start-ups, smaller businesses
• Amazon Relational Database Service, Microsoft SQL Azure
• Private clouds
The Challenge of Big Data
• Big data
• Massive sets of unstructured/semi-structured data from web traffic, social
media, sensors, and so on
• Volumes too great for typical DBMS
• Petabytes, exabytes of data
• Can reveal more patterns, relationships and anomalies
• Requires new tools and technologies to manage and analyze
Business Intelligence Infrastructure (1 of 3)
• Array of tools for obtaining information from separate systems and
from big data
• Data warehouse
– Stores current and historical data from many core operational transaction
systems
– Consolidates and standardizes information for use across enterprise, but data
cannot be altered
– Provides analysis and reporting tools
Business Intelligence Infrastructure (2 of 3)
• Data marts
– Subset of data warehouse
– Typically focus on single subject or line of business
• Hadoop
• Enables distributed parallel processing of big data across inexpensive
computers
• Key services
• Hadoop Distributed File System (HDFS): data storage
• MapReduce: breaks processing of large data sets into tasks and assigns them to nodes in a cluster
• HBase: NoSQL database
• Used by Yahoo, NextBio
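The MapReduce idea can be illustrated with a single-process word count in plain Python. This is not the Hadoop API: the function names `map_phase`, `shuffle`, and `reduce_phase` are hypothetical, and a real cluster would run the map and reduce steps on many machines in parallel.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values emitted for one key."""
    return key, sum(values)


# Each chunk stands in for a block of data stored on a different node.
chunks = ["big data big tools", "data tools data"]
mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```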
Business Intelligence Infrastructure (3 of 3)
• In-memory computing
• Used in big data analysis
• Uses the computer's main memory (RAM) for data storage to avoid delays in
retrieving data from disk storage
• Can reduce hours/days of processing to seconds
• Requires optimized hardware
• Analytic platforms
• High-speed platforms using both relational and non-relational tools optimized
for large datasets
Contemporary Business Intelligence Infrastructure
Analytical Tools: Relationships, Patterns, Trends
• Tools for consolidating, analyzing, and providing access to vast
amounts of data to help users make better business decisions
• Multidimensional data analysis (OLAP - Online Analytical Processing)
• Data mining
• Text mining
• Web mining
Online Analytical Processing (OLAP)
• Supports multidimensional data analysis
• Viewing data using multiple dimensions
• Each aspect of information (product, pricing, cost, region, time period) is
different dimension
• Example: How many washers sold in the East in June compared with other
regions?
• OLAP enables rapid, online answers to ad hoc queries
Multidimensional Data Model
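The washer question can be sketched as a small roll-up over product, region, and month. The sales figures and the helper `roll_up` are invented for illustration. A real OLAP server would precompute or index such aggregates rather than scan a list.

```python
from collections import defaultdict

PRODUCT, REGION, MONTH, UNITS = 0, 1, 2, 3

sales = [
    # (product, region, month, units)
    ("washer", "East", "June", 120),
    ("washer", "West", "June", 95),
    ("washer", "East", "May", 80),
    ("dryer", "East", "June", 60),
    ("washer", "Central", "June", 70),
]


def roll_up(facts, dims):
    """Sum units over the requested dimensions (indexes into each fact tuple)."""
    cube = defaultdict(int)
    for fact in facts:
        cube[tuple(fact[d] for d in dims)] += fact[UNITS]
    return dict(cube)


# View the data along three dimensions, then slice on product and month.
by_product_region_month = roll_up(sales, (PRODUCT, REGION, MONTH))
june_washers_by_region = {
    region: units
    for (product, region, month), units in by_product_region_month.items()
    if product == "washer" and month == "June"
}
print(june_washers_by_region)

# The same facts viewed along a single dimension.
by_region = roll_up(sales, (REGION,))
print(by_region)
```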
Data Mining
• Finds hidden patterns, relationships in datasets
• Example: customer buying patterns
• Infers rules to predict future behavior
• Types of information obtainable from data mining:
• Associations
• Sequences
• Classification
• Clustering
• Forecasting
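As one example of the list above, an association can be measured by counting how often items are bought together. The baskets below are made up, and `confidence` is a hypothetical helper. Production data mining tools use more scalable algorithms than this direct count.

```python
from collections import Counter
from itertools import combinations

baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola", "bread"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so that each pair is always counted under the same key.
    pair_counts.update(combinations(sorted(basket), 2))


def confidence(antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    pair = tuple(sorted((antecedent, consequent)))
    return pair_counts[pair] / item_counts[antecedent]


support = pair_counts[("chips", "cola")] / len(baskets)
print(f"support(chips, cola) = {support:.2f}")
print(f"confidence(chips -> cola) = {confidence('chips', 'cola'):.2f}")
```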
Text Mining and Web Mining
• Text mining
• Extracts key elements from large unstructured data sets
• Sentiment analysis software
• Web mining
• Discovery and analysis of useful patterns and information from web
• Web content mining
• Web structure mining
• Web usage mining
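A toy sketch of sentiment analysis: score a piece of text by counting words from small positive and negative word lists. The word lists and the `sentiment` function are assumptions for illustration. Commercial sentiment analysis software relies on far richer language models.

```python
import re

# Hypothetical word lists; a real lexicon would be much larger.
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"slow", "broken", "refund"}


def sentiment(text):
    """Label text as positive, negative, or neutral from simple word counts."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for comment in (
    "Love the fast delivery",
    "Arrived broken, want a refund",
    "It arrived on Tuesday",
):
    print(comment, "->", sentiment(comment))
```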
Databases and the Web
– Many companies use the web to make some internal databases
available to customers or partners
– Typical configuration includes:
• Web server
• Application server/middleware/CGI scripts
• Database server (hosting DBMS)
– Advantages of using the web for database access:
• Ease of use of browser software
• Web interface requires few or no changes to database
• Inexpensive to add web interface to system
Linking Internal Databases to the Web
Establishing an Information Policy
• Firm’s rules, procedures, roles for sharing, managing, standardizing
data
• Data administration
• Establishes policies and procedures to manage data
• Data governance
• Deals with policies and processes for managing availability, usability, integrity,
and security of data, especially regarding government regulations
• Database administration
• Creating and maintaining database
Ensuring Data Quality
• More than 25 percent of critical data in Fortune 1000 company
databases are inaccurate or incomplete
• Before new database is in place, a firm must:
• Identify and correct faulty data
• Establish better routines for editing data once database in operation
• Data quality audit
• Data cleansing
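A minimal sketch of a cleansing pass, assuming customer records keyed by e-mail address. The field names and the `cleanse` helper are hypothetical. The rejected rows are the kind of finding a data quality audit would report.

```python
def cleanse(records):
    """Standardize formatting, set aside incomplete rows, and merge duplicates by e-mail."""
    cleaned = {}
    rejected = []
    for rec in records:
        email = (rec.get("email") or "").strip().lower()
        name = " ".join((rec.get("name") or "").split()).title()
        if not email or not name:
            rejected.append(rec)  # incomplete: needs manual correction
            continue
        # Keep the first occurrence of each e-mail address.
        cleaned.setdefault(email, {"name": name, "email": email})
    return list(cleaned.values()), rejected


raw = [
    {"name": "  dana   lee ", "email": "Dana@Example.com"},
    {"name": "Dana Lee", "email": "dana@example.com "},
    {"name": "Sam Roy", "email": None},
]
clean_rows, rejected_rows = cleanse(raw)
print(clean_rows)
print(rejected_rows)
```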