Business Intelligence and Big Data in Cloud
Data-Driven, a Survival Must?
2BI: Goal, Mindset
Industry Winner Loser
Online Retail
DVD Rental
Social Network
The same story will play out in other industries soon, if it has not already.
BI is a major tool to turn a company into a data-driven company.
What Can a BI System Help With?
3BI: Goal, Mindset
• Business Environment (Pressures, Opportunities)
– Globalization
– Customer Demand
– Market Conditions
– Competition
– Technology Advances
– Regulations
– …
• Organization Responses
– Strategic Planning
– New Business Models
– Restructure Business Processes
– Supply Chain Optimization
– Improve Partnership Relationships
– Improve Information Systems
– Encourage Innovation
– Improve Customer Service
– Improve Communication
– Improve Data Access
– Automate Tasks
– Real-time Response
– …
• Decision and Support
– Analysis
– Predictions
– Decisions
• Business Intelligence Support
Turban, E., Sharda, R., Delen, D., and King, D. (2010). Business Intelligence: a managerial approach (2nd ed.). Upper Saddle River, NJ: Prentice Hall.
Without a Good BI System – Information Everywhere, but Hard to Access
4BI: Goal, Mindset
• Executives, Managers: Why can't I have the right data at the right time?
• Analysts: Why should I waste most of my time getting the data?
• Data Engineers: Ad hoc queries all day? The data job is so boring.
• Operators: Can I get all relevant data in one place?
Without a Good BI System – Inefficiency due to Information Silos
5BI: Goal, Mindset
O9 Solutions, Inc. Funny Business: Sales and Operations, accessed January 28, 2016, https://www.linkedin.com/hp/update/6098306831355047936
How Can a BI System Help?
6BI: Goal, Mindset
• Goal: Better, Quicker Business Decision-Making
• Views of Business Operations and Performance: Historical, Current, Predictive
• Capabilities: Benchmarking, Performance Management, Reporting, Analytics, Data Mining, Predictive Analytics
• Internal Data: Finance, Sales, Returns, Supply, Production, Web, Email, User Usage
• External Data: Industry Analysis, Competitive Analysis, Social Analysis, Product Ranking, Technology Analysis
How BI Delivers Core Value to Customers
7
• Business Intelligence: Tableau, QlikView, Amazon QuickSight
• Web/Mobile Analysis: Google Analytics, Adobe Analytics, SEO
• Social Analysis: Facebook, Twitter, Pinterest, WeChat
• Cloud Big Data Warehouse: Amazon Redshift
• Machine Learning: R, Python, Amazon ML
BI: Goal, Mindset
Web/Mobile/APP Analysis – Audience (Demographics, Interest, User Type)
8
Davis, J. (2015). Google Analytics Demystified: A Hands-On Approach (2nd edition). CreateSpace.
BI: Web/Mobile Analysis
Web/Mobile/APP Analysis – Audience (Location)
9BI: Web/Mobile Analysis
Web/Mobile/APP Analysis – Audience (Operating System)
10BI: Web/Mobile Analysis
Adobe Discover: Navigation Flow Among Top Pages/Content
11
Adobe training video. Retrieved from https://outv.omniture.com/.
BI: Web/Mobile Analysis
Adobe Discover: Navigation Flow from a Page
12
Adobe training video. Retrieved from https://outv.omniture.com/.
BI: Web/Mobile Analysis
Web/Mobile/APP Analysis – Acquisition
13
Campaign
Keyword
BI: Web/Mobile Analysis
Web/Mobile/APP Analysis – SEO
14BI: Web/Mobile Analysis
Visitor Segmentation Study via Adobe SiteCatalyst
15BI: Web/Mobile Analysis
• What/how are they viewing?
• Why did they leave?
• How to engage them more?
• How to connect them?
Visitor segments: New Visitors, Casual Visitors, Loyal Visitors, Lapsed Visitors
• Growing the loyal visitors is essential to keep the site thriving.
• So it is important to understand their navigation patterns and what they like and dislike.
Visits from Social Channels
16
Facebook
Pinterest
Twitter
BI: Social Analysis
Top Facebook Posts – Facebook Insights
17BI: Social Analysis
Talking about this:
Engaged Users:
Reach:
Engagement Rate:
Users’ Response to Content – Twitter Analytics
18BI: Social Analysis
Trace Pin in Pinterest – Curalate.com
19BI: Social Analysis
• ~1000 visits from this pin.
• The pinboard YUM by another account has 655 Pins and 168,699 Followers.
• Keywords used to find the pin: should the pin be tagged this way?
Tao of Social Media
20BI: Data Visualization
Schaefer, M. (2012). The Tao of Twitter: Changing Your Life and Business 140 Characters at a Time. NY: McGraw-Hill Education.
• Tao 1: Making Targeted Connections
• Tao 2: Providing Meaningful Content
• Tao 3: Offering Authentic Helpfulness
Tao in Chinese
21BI: Data Visualization
圣人无常心,以百姓心为心。
-- 老子 道德经 四十九章
The sage has no mind of his own
He is aware of the needs of others.
-- Lao Tsu, Tao Te Ching
Lao Tsu (1997). Tao Te Ching, 25th Anniversary Edition. Translated by Gia-fu Feng and Jane English, Chapter 49. NYC: Vintage Books / Random House.
What is Big Data
22
• Forrester: Big Data is the frontier of a firm's ability to store,
process, and access (SPA) all the data it needs to operate
effectively, make decisions, reduce risks, and serve customers
• IBM: Big data is the data characterized by 3 attributes:
volume, variety, and velocity
Walker, R. (2015). From Big Data to Big Profits: Success with Data and Analytics, chapter 1. NYC: Oxford.
BI: Data Flow Architecture
Big Data Market is Growing Fast
23BI: Data Flow Architecture
Kelly, J. (2015). Executive Summary: Big Data Vendor Revenue and Market Forecast, 2011-2026, accessed January 21, 2016, http://wikibon.com/executive-summary-big-data-vendor-revenue-and-market-forecast-2011-2026/
Big Data in Cloud
24BI: Data Flow Architecture
Wikipedia. Cloud Computing, accessed January 28, 2016, https://en.wikipedia.org/wiki/Cloud_computing
• Big data is moving to the cloud fast.
• Applications in the cloud are generating more big data in the cloud.
BI Data Flow Architecture With ETL
25BI: Data Flow Architecture
• Sources: Relational Database, NoSQL Store, Excel File, Text File, Web, Social
• Extract
• Transform: Standardize Primary Keys, Cleaning, Transform Format, Translate Embedded Logic, Referential Integrity Check, Indexing, Merge, Sort, Integration, Aggregation, Summarization, Derivation
• Load: BI Data Warehouse
• BI System
Moss, L. T. & Atre, S. (2003). Business intelligence roadmap: the complete project lifecycle for decision-support applications. Boston, MA: Addison-Wesley.
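The extract, transform, and load stages on this slide can be sketched in a few lines. This is a minimal illustration, not the architecture from the cited book: the CSV text, the column names (`store_id`, `item`, `amount`), the cleaning rules, and the in-memory SQLite database standing in for the BI data warehouse are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical raw extract: inconsistent key casing, stray whitespace, one bad row.
RAW_CSV = """store_id,item,amount
 s01 ,Phone,199.99
S02,phone ,249.50
S03,Tablet,not-a-number
"""

def extract(text):
    """Extract: read rows from a text source (a CSV string here)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: standardize keys, clean values, drop rows that fail validation."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # cleaning: skip rows whose amount is not numeric
        clean.append((row["store_id"].strip().upper(),  # standardize the key
                      row["item"].strip().title(),      # standardize the format
                      amount))
    return clean

def load(rows, conn):
    """Load: write cleaned rows into a warehouse table and index it."""
    conn.execute("CREATE TABLE sales (store_id TEXT, item TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.execute("CREATE INDEX idx_sales_item ON sales (item)")
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for the BI data warehouse
load(transform(extract(RAW_CSV)), conn)

# Aggregation step: total sales per item, as a BI report would request it.
totals = conn.execute(
    "SELECT item, ROUND(SUM(amount), 2) FROM sales GROUP BY item ORDER BY item"
).fetchall()
print(totals)
```

Keeping each stage as its own function mirrors the slide's flow and makes it possible to test cleaning rules separately from loading.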
Facebook’s Data Space Management with Open Source Tools
26BI: Data Flow Architecture
• Sources: Transactional Databases, Application Logs, Web Crawls (Post)
• Structured Data and Unstructured Data are stored in the Hadoop Distributed File System (HDFS)
• Parallelized Data Processing at Massive Scale
• 15 terabytes of new data per day in 2009
• Hive: Data Warehousing Framework, with Query Language and Query UI (HiPal)
• Tools
– Argus: Portal for Sharing Charts and Graphs
– Databee: Workflow Management System
– PyHive: Python Script Framework for MapReduce
– Cassandra: Storage System for Serving Data to End Users
Hammerbacher, J. (2009). Information platforms and the rise of the data scientist. In Segaran, T. & Hammerbacher, J. (Eds.).
Beautiful Data, chapter 5. Sebastopol, CA: O’Reilly Media.
Teradata Unified Data Architecture
27BI: Data Flow Architecture
“Teradata Unified Data Architecture in Action,” Teradata Corporation, accessed April 19, 2014, http://www.teradata.com/white-papers/Teradata-Unified-Data-Architecture-in-Action/
Amazon Big Data Portfolio
28BI: Data Flow Architecture
“Introduction to Amazon Redshift,” Pavan Pothukuchi, accessed January 15, 2016, http://www.slideshare.net/AmazonWebServices/dat201-introduction-to-amazon-redshift
Amazon Redshift Benefits
29BI: Data Flow Architecture
“Introduction to Amazon Redshift,” Pavan Pothukuchi, accessed January 15, 2016, http://www.slideshare.net/AmazonWebServices/dat201-introduction-to-amazon-redshift
Amazon Redshift Architecture
30BI: Data Flow Architecture
“Introduction to Amazon Redshift,” Pavan Pothukuchi, accessed January 15, 2016, http://www.slideshare.net/AmazonWebServices/dat201-introduction-to-amazon-redshift
Machine Learning
31BI: Machine Learning
“Machine Learning,” Andrew Ng, accessed January 20, 2016, https://www.coursera.org/learn/machine-learning
• Definition
– Field of study that gives computers the ability to learn without being explicitly programmed. -- Arthur Samuel (1959).
• Examples:
– Database mining
• Large datasets from growth of automation/web.
• E.g., Web click data, medical records, biology, engineering
– Applications that can’t be programmed by hand.
• E.g., Autonomous helicopter, handwriting recognition, most of Natural Language Processing (NLP), Computer Vision.
– Self-customizing programs
• E.g., Amazon, Netflix product recommendations
– Understanding human learning (brain, real AI).
Neuron & Neural Networks
32BI: Machine Learning
“Machine Learning,” Andrew Ng, accessed January 20, 2016, https://www.coursera.org/learn/machine-learning
Classes: Pedestrian, Car, Motorcycle, Truck
Want h(x) ≈ [1, 0, 0, 0] when pedestrian, [0, 1, 0, 0] when car, [0, 0, 1, 0] when motorcycle, etc.
Neuron: Input → Output. Neural network: Input (Image Pixel) → Output (Judgement)
Use Amazon ML for Filtering Actionable Tweets
33BI: Machine Learning
Alex Ingerman (2015). “Real-World Smart Applications with Amazon Machine Learning,” accessed January 29, 2016, https://www.youtube.com/watch?v=sHJx1KJf8p0
Tweets mentioning AWS → Human Label (Customer Service Actionable / Customer Service Not Actionable) → Training → ML Model
A tweet analysis model developed by Amazon is trained to automatically find the tweets that are actionable for customer service.
Beautiful Data Visualization
34BI: Data Visualization
Iliinsky, N. (2010). On beauty. In Steele, J. & Iliinsky, N. (Eds.). Beautiful visualization, Chapter 1. Sebastopol, CA: O’Reilly Media.
• Informative
– Reveal intended message clearly with enough data
– With different perspectives to facilitate discovery
• Efficient
– Visually emphasize what matters and reveal relationship
– Use axes, color and size to convey meaning
• Novel
– Break the limit of default format, choose best format to suit data
– A fresh look at the data
– A new level of understanding
• Aesthetic
– Appropriate usage of graphical construction to offer visual appeal.
Napoleon’s March to Russia in 1812 - 1813
35BI: Data Visualization
Tufte, E. (2001). The Visual Display of Quantitative Information (2nd ed.). (Original by Charles Joseph Minard.) Connecticut, US: Graphics Press.
•Army size
•Geo Location
•Move direction
•Temperature
•Date
•Event
When Relevant Information Is Put Together…
36BI: Data Visualization
Tufte, E. (2001). The Visual Display of Quantitative Information (2nd ed.). Connecticut, US: Graphics Press.
A cholera epidemic took the lives of 600 Londoners in September 1854. Nobody knew the cause. Dr. John Snow mapped the locations of the cases and linked them to a particular pump site. It was later verified that the Broad Street pump was the cause of the epidemic. Sanitation measures started and the epidemic was then stopped.
When We Do Literally What the User Asked …
37BI: Data Visualization
User: Could I have the top 10 stores in BI?
Developer: No problem. Here you are!
User: I see. Thanks.
Developer: I thought you would be very interested.
User: I was… but only for 10 seconds…
If We Add a Cyclic Group of Category and Brand …
38BI: Data Visualization
Change
Dimension
Drill-down to Phone Drill-down to Apple
Can Business Intelligence Match Human Intelligence?
39
How were the six tech companies organized? (Manu Cornet, 2011) http://www.bonkersworld.net/organizational-charts/
Can a BI system bring insights this straightforward and drive users to think deeply?
BI: Data Visualization
Information in Well-designed Dashboard
40BI: Dashboard Design
• Exceptionally well organized
– All important data in one page
• Condensed, primarily in the form of summaries and exceptions
– Single numbers from sums or averages.
– Something falls outside the realm of normality, which needs attention.
• Specific to and customized for the dashboard’s audience and objectives
– Information should be narrowed to address the objective(s).
– Use audience’s vocabulary.
• Displayed using concise and often small media that communicate the data
and its message in the clearest and most direct way possible.
– Reduce the non-data pixels.
– Enhance the data pixels.
Few, S. (2006). Information Dashboard Design. Sebastopol, CA: O’Reilly Media.
Define Key Performance Indicators (KPIs)
41BI: Dashboard Design
Category: Measures
• Sales: Bookings, Billings, Sales pipeline, Number of orders, Order amounts, Selling prices
• Marketing: Market share, Campaign success, Customer demographics
• Finance: Revenues, Expenses, Profits
• Web Services: Number of visitors, Number of page hits, Visit durations
Comparative Measure: Example
• The same measure at the same point in time in the past: The same day last year
• The same measure at some other point in time in the past: The end of last year
• The current target for the measure: A budgeted amount for the current period
• A prior prediction of the measure: Forecast of where we expected to be today
• An extrapolation of the current measure: Projection out into the future, e.g. year end
• Some measure of the norm for this measure: Average, normal range, or a benchmark
Few, S. (2006). Information Dashboard Design. Sebastopol, CA: O’Reilly Media.
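The comparative measures on this slide amount to simple arithmetic against a baseline. The sketch below is a hedged illustration only: the revenue figures, the dates, and the `TARGET` value are invented, and `comparative_measures` is a hypothetical helper, not something taken from the cited book.

```python
from datetime import date

# Hypothetical daily revenue figures; every number here is made up.
revenue = {
    date(2015, 3, 2): 1000.0,    # the same day last year
    date(2015, 12, 31): 1500.0,  # the end of last year
    date(2016, 3, 2): 1200.0,    # the current value
}
TARGET = 1250.0  # assumed budgeted amount for the current period

def pct_change(current, baseline):
    """Percentage change of current relative to baseline."""
    return (current - baseline) / baseline * 100.0

def comparative_measures(today):
    """Compare today's value with three of the baselines listed on the slide."""
    current = revenue[today]
    # Note: replace() would fail for Feb 29; a real system needs a calendar rule.
    same_day_last_year = revenue[today.replace(year=today.year - 1)]
    end_of_last_year = revenue[date(today.year - 1, 12, 31)]
    return {
        "vs_same_day_last_year_pct": pct_change(current, same_day_last_year),
        "vs_end_of_last_year_pct": pct_change(current, end_of_last_year),
        "vs_target_pct": pct_change(current, TARGET),
    }

kpis = comparative_measures(date(2016, 3, 2))
for name, value in kpis.items():
    print(f"{name}: {value:+.1f}%")
```

A dashboard would typically show only the current number plus one or two of these deltas, chosen for the audience.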
Effective Dashboard Display Media
42BI: Dashboard Design
Few, S. (2006). Information Dashboard Design. Sebastopol, CA: O’Reilly Media.
• Line chart: easier to spot trends
• Clean display of related data
• Simple symbol or number
Utilize Short-Term Memory
43BI: Dashboard Design
Few, S. (2006). Information Dashboard Design. Sebastopol, CA: O’Reilly Media.
• Memory comes in three fundamental types:
– Iconic memory (a.k.a. the visual sensory register)
– Short-term memory (a.k.a. working memory)
– Long-term memory
• Only 3-9 chunks of information can be stored in short-term memory.
• Graphs over text.
– Individual numbers are stored in discrete chunks.
– One or more lines in a line graph can represent a great deal of information as a single chunk.
• Relevant information on the same screen.
– Once the information is no longer visible, unless it is one of the few chunks of information stored in short-term memory, it is no longer available.
– If everything remains within eye span, users can exchange information in and out of short-term memory at lightning speed.
Sample Sales Dashboard
44BI: Dashboard Design
Few, S. (2006). Information Dashboard Design. Sebastopol, CA: O’Reilly Media.
When Dashboard is not Enough -> Self Service BI
45BI: Dashboard Design
• As soon as a dashboard shows abnormalities, users will want to know more details.
• The responsible individual will be called. That person will query the database or ask IT staff to run the query… The process is long and resource-consuming.
• Layered reports in self-service BI can put top-down views at users' fingertips:
– Layer 1: One page overview
– Layer 2: Categorical reports such as regional/product reports
– Layer 3: Data tables down to most granular levels.
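The three report layers above can be pictured as three views over the same granular records. The following is a minimal sketch under stated assumptions: the `records` data, the field names, and the helper functions `overview`, `by_category`, and `detail` are all hypothetical and exist only to show the drill-down idea.

```python
from collections import defaultdict

# Hypothetical granular sales records (the Layer 3 data).
records = [
    {"region": "East", "product": "Phone", "amount": 300},
    {"region": "East", "product": "Tablet", "amount": 200},
    {"region": "West", "product": "Phone", "amount": 150},
    {"region": "West", "product": "Phone", "amount": 50},
]

def overview(rows):
    """Layer 1: a single headline total for the one-page overview."""
    return sum(r["amount"] for r in rows)

def by_category(rows, field):
    """Layer 2: totals grouped by one dimension, e.g. region or product."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[field]] += r["amount"]
    return dict(totals)

def detail(rows, **filters):
    """Layer 3: the granular rows behind a selected category."""
    return [r for r in rows if all(r[k] == v for k, v in filters.items())]

print(overview(records))               # headline number
print(by_category(records, "region"))  # regional report
print(detail(records, region="West"))  # rows behind the West total
```

Because every layer is computed from the same rows, a user who spots an odd regional total can reach the underlying records without asking anyone to run a query.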
Data Modeling – Understand Data Connection
46BI: Data Modeling
• Given a system, first study how the data are linked, then model the linkage in the BI system.
Figure: Data Linkage on STM Publishing
• Faculty: R. Arlen Price (Author)
• Publication: "An obesity-related locus in chromosome region 12q23-24", in Diabetes, published by the American Diabetes Association
• Funding: National Institutes of Health
• Faculty Profile
– Research Interest: Genetics of Complex Traits, Genetics of Obesity, Behavioral Genetics, Genetic Epidemiology
– Research Techniques: Linkage mapping, linkage disequilibrium association analyses, and gene expression profiling
• Student: Ding Li (Author)
• Links shown: Author, Subscribe, Read, Attend Events, Proposal Review, Profile, Research Strength
Data Modeling – Natural Linear & Star Structure
47BI: Data Modeling
• Data connection is the key to revealing the insights hidden in data.
• In simple situations, a central table or a central key field can link the tables together.
Data Modeling – Construct Star Structure
48BI: Data Modeling
• A link table can be constructed to link tables on multiple
common fields.
• In this example, Sale, Return and Target tables need to be
linked on (Item, Store, Date).
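One way to picture the link table is as a map from each distinct (Item, Store, Date) combination to a single surrogate id that all three tables then share. The sketch below assumes that design; the sample rows, the `link_id` name, and both helper functions are hypothetical and are not taken from any particular BI tool.

```python
# Hypothetical fact tables that share the fields (item, store, date).
sales = [{"item": "Phone", "store": "S01", "date": "2014-04-01", "sold": 10}]
returns = [
    {"item": "Phone", "store": "S01", "date": "2014-04-01", "returned": 1},
    {"item": "Tablet", "store": "S02", "date": "2014-04-02", "returned": 2},
]
targets = [{"item": "Phone", "store": "S01", "date": "2014-04-01", "target": 12}]

KEY_FIELDS = ("item", "store", "date")

def composite_key(row):
    """The (item, store, date) tuple that identifies a row's link entry."""
    return tuple(row[f] for f in KEY_FIELDS)

def build_link_table(*tables):
    """Assign one surrogate id to every distinct (item, store, date) combination."""
    link = {}
    for table in tables:
        for row in table:
            link.setdefault(composite_key(row), len(link) + 1)
    return link

def attach_link_id(table, link):
    """Return copies of the rows with the composite fields replaced by link_id."""
    out = []
    for row in table:
        slim = {k: v for k, v in row.items() if k not in KEY_FIELDS}
        slim["link_id"] = link[composite_key(row)]
        out.append(slim)
    return out

link_table = build_link_table(sales, returns, targets)
sales_linked = attach_link_id(sales, link_table)
returns_linked = attach_link_id(returns, link_table)
targets_linked = attach_link_id(targets, link_table)
print(link_table)
```

After this step each fact table joins to the link table on one field, which yields the star structure named in the slide title.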
Data Modeling – Time Series
49BI: Data Modeling
• In a time series model, each event keeps its own timestamp, so it is easy to track the time gap between steps.
• Typical questions:
– For all the articles submitted in Jan. 2013, how long did it take to get reviewed, receive a final decision, and be published online if accepted? Compare with articles submitted in Jan. 2012.
– For all the articles published in Apr. 2014, when were they submitted, reviewed, and given a final decision? Compare with the articles published in Apr. 2013.
Submit Date → Review Date → Decision Date → Online Date → Download Date
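Because each event keeps its own timestamp, the gap between steps is a date subtraction per article. The sketch below is illustrative only: the `articles` records, the field names, and the helpers `step_gaps` and `submitted_in` are assumptions made for the example.

```python
from datetime import date

# Hypothetical article records; each event keeps its own timestamp.
articles = [
    {"id": "A1", "submit": date(2013, 1, 5), "review": date(2013, 2, 4),
     "decision": date(2013, 3, 1), "online": date(2013, 4, 10)},
    {"id": "A2", "submit": date(2013, 1, 20), "review": date(2013, 3, 1),
     "decision": date(2013, 3, 21), "online": None},  # not published online
]

STEPS = [("submit", "review"), ("review", "decision"), ("decision", "online")]

def step_gaps(article):
    """Days spent in each step; steps with a missing date are skipped."""
    gaps = {}
    for start, end in STEPS:
        if article[start] is not None and article[end] is not None:
            gaps[f"{start}_to_{end}"] = (article[end] - article[start]).days
    return gaps

def submitted_in(rows, year, month):
    """Select the cohort of articles submitted in the given month."""
    return [a for a in rows
            if a["submit"].year == year and a["submit"].month == month]

jan_2013 = submitted_in(articles, 2013, 1)
gaps = {a["id"]: step_gaps(a) for a in jan_2013}
print(gaps)
```

The cohort is picked by one date (submission here), while the other dates stay attached to the same record, which is what separates this model from the universal time model.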
Data Modeling – Universal Time
50BI: Data Modeling
• In this model, users want to view all activities within the same period.
• Typical questions:
– In Apr. 2014, how many articles were submitted, reviewed, and published? If a user changes to another period, all the numbers change to the new period simultaneously.
Event date: Submit, Editor Review, Peer Review, Production, Online, Usage
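In the universal time model, every activity is filtered by one shared event date, so switching the period updates all counts together. A minimal sketch, assuming a flat event log; the `events` rows and the `counts_for_period` helper are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical event log: one row per (article, event type, event date).
events = [
    ("A1", "Submit", date(2014, 4, 2)),
    ("A2", "Submit", date(2014, 4, 15)),
    ("A0", "Peer Review", date(2014, 4, 20)),
    ("A0", "Online", date(2014, 5, 3)),
    ("A1", "Peer Review", date(2014, 5, 9)),
]

def counts_for_period(rows, year, month):
    """Count every event type whose event date falls in the selected month."""
    return Counter(kind for _, kind, day in rows
                   if (day.year, day.month) == (year, month))

april = counts_for_period(events, 2014, 4)
may = counts_for_period(events, 2014, 5)
print(dict(april))
print(dict(may))
```

Here a single period selection drives all event types at once, at the cost of losing the per-article sequence that the time series model keeps.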
Challenges of BI Development Management
51BI: Development Management
• A BI project involves communication across multiple departments. Winning cooperative support is the key to its success.
• BI development often encounters unexpected issues in data availability, data quality, data linkage, and business logic transfer. Forcing a deadline may produce low-quality reports; over-relaxing the due date may stall a project. An agile process is pivotal to moving the project forward.
• A BI system is very efficient at exposing data abnormalities. A cleaner data system is only possible if source data problems are addressed jointly by BI developers and data owners/suppliers.
Heavyweight Development Process – Thorough but High Risk
52BI: Development Management
Moss, L. T. & Atre, S. (2003). Business intelligence roadmap: the complete project lifecycle for decision-support applications. Boston, MA: Addison-Wesley.
Agile Development Process
53BI: Development Management
• Plan: Business Goals, KPIs
• Analysis: Data Sources, Calculation Logic
• Data ETL: Extraction, Transform, Loading
• Design: Report Layout, Data Visualization
• Validation: Data, Logic
• Feedback: New Requirements
• Phased Release
– Important KPIs first.
– Well connected data first.
• Fast Development, Quick Feedback
– Design
– Data
– Logic
BI Platforms – 2015 Gartner Magic Quadrant
54BI: Platform, Tool
Rita L. Sallam, Joao Tapadinhas, Josh Parenteau, Daniel Yuen, Bill Hostmann (2014). Magic Quadrant for Business Intelligence and Analytics Platforms, last accessed on Apr. 22, 2014, http://www.gartner.com/technology/reprints.do?id=1-1QLGACN&ct=140210&st=sb
• Agile Platform
– Tableau
– QlikView
– Tibco Spotfire
• Large Platform
– Microsoft
– IBM (Cognos)
– SAS
– SAP (BusinessObjects)
– Oracle (OBIEE)
– MicroStrategy
– Information Builders
BI Platform Example – QlikView
55BI: Platform, Tool
• Pros
– Click-driven, visually interactive interface is simple to learn and use.
– Based on in-memory associative technology, which is fast.
– Flexible data sources (Oracle, SQL, Excel, text files).
– Quicker to build compared with traditional BI systems.
• Cons
– Needs straightforward relationships among tables, which requires clean data to link tables.
– Its underlying calculation logic, set analysis, is hard to use for complicated logic.
– Its script language is not complete enough to accomplish comprehensive tasks.
– Most data need to be in memory.
BI Platform Example – Tableau vs QlikView
56BI: Platform, Tool
• Pros
– More innovative visualization, including geo mapping.
– Uses the UI to select data sets instead of expressions in code.
– Free Tableau Public makes it very popular.
• Cons
– Weak ETL capability.
• Sample Projects
– Payment difference to medical providers for 100 common inpatient services
Tableau Public – Free Hosting of Data Visualization
57BI: Platform, Tool
https://public.tableau.com/s/gallery/new-yorks-citi-bikes
Thank You
58BI: Thank You
Analyzing data is worth the cost…
The price of light is less than the cost of darkness.
--Arthur C. Nielsen, Founder of ACNielsen Company
Please send your comment or suggestion to ding.li@smartdatanet.com
Appendix: Services from Smart Data Net Inc.
59BI: Smart Data Net Inc.
Data: Web Clicks, Social Posts, User Demographics, Sale, Supply, Competitors
BI Solutions
1. Provide
2. Analyze
3. Develop
4. Get Insights
Business Client
Smart Data Net
Communicate all the time
Forecast, Demand, Return, Profit, Cost, Marketing, User Feedback, Email Open/Click, User Referral, R & D
Appendix: How BI Can Help Small Business
60BI: How BI Can Help Small Business
• Web Analysis
– What do users want to see?
– Can users find the right content?
– Is the website search engine friendly?
– Tool: Google Analytics
• Social Analysis
– What content is engaging users?
– How to make content far-reaching?
– How to foster a supportive social group?
– Tool: Facebook Insights, Twitter Analytics, HootSuite, Curalate
• Sale Analysis
– Near real-time revenue/cost analysis
– Find problems/opportunities quickly
– Service level analysis
– Tool: Tableau, QlikView
• Marketing/User Analysis
– Which marketing method brings the most valuable users?
– How to target the right users based on their previous behavior?
– User segmentation analysis
– Tool: Tableau, QlikView, R, Python
Appendix: How to Become a BI Developer/Data Scientist
61BI: How to Become a BI Developer/Data Scientist
• Visualization Track
(programming experience not required)
– Proficient on a Visualization Tool
• Tableau, QlikView
– Study Visualization Best Practices
• Books from Edward Tufte, Stephen Few
– Understand Business Analysis Flow
• Discuss with business users
• Data Management Track
– Data Warehouse & BI Platform
• Amazon Redshift, Cognos, SAS, SAP, SQL,
Oracle
– Big Data Store
• Hadoop, Teradata, AWS, Azure
– No-SQL Store
• MongoDB
• Data Mining Track
(for programmer or statistician)
– Data Manipulation
• Python
– Statistics
• R
– Machine Learning
• Octave, Java, R, Python
• Resources
– Free Online Classes
• Coursera.org
– Seminars
• Meetup.com
– Tool Online Training
• www.tableausoftware.com/learn/training
Appendix: How to Become a Tableau Developer from Scratch
62BI: How to Become a Tableau Developer
• Tableau has Powerful Visualization, Great
Usability and Short Learning Curve
– Efficient for geo and trending analysis
– Takes a couple of weeks to learn and a few
months to master
– Can be the first step to enter the data science
world
• Step 1: Using Free Tableau Public
– Download Tableau Public
• http://www.tableausoftware.com/public/
– Take Online Training
• http://www.tableausoftware.com/public/training
– Apply to Open Public Data
• https://www.data.gov/open-gov/
• https://data.ny.gov/
• https://nycopendata.socrata.com/
– Save and Publish Your Work Online
• With the free version, users cannot save the result on a local machine
• This is all you need if you can publish all your work publicly (the server is hosted by Tableau for free)
• Step 2: Using Tableau Desktop
– Download Tableau Desktop
• http://www.tableausoftware.com/products/desktop
– Use the 14-day free trial to do as much training and development as possible
• http://www.tableausoftware.com/learn/training
– Purchase the product if it is the right tool for you
• Personal edition: $1000, no database connection
• Professional edition: $2000, can connect to databases
• Next Steps: Enjoy Data Visualization and
Analysis; Learn More Theory, Best Practices and
Tools.

More Related Content

PPTX
Generative adversarial networks
PDF
Machine Learning in the Cloud with GraphLab
PDF
MIT Deep Learning Basics: Introduction and Overview by Lex Fridman
PDF
PDF
Introduction to Data Science
PDF
How Graph Databases used in Police Department?
PDF
Loops of humans and bots in Wikidata
PDF
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning
Generative adversarial networks
Machine Learning in the Cloud with GraphLab
MIT Deep Learning Basics: Introduction and Overview by Lex Fridman
Introduction to Data Science
How Graph Databases used in Police Department?
Loops of humans and bots in Wikidata
CATS4ML Data Challenge: Crowdsourcing Adverse Test Sets for Machine Learning

What's hot (20)

PPTX
Linked Data Entity Summarization (PhD defense)
PPTX
Extracting, Aligning, and Linking Data to Build Knowledge Graphs
PDF
Application Modeling with Graph Databases - Relationships are cool
PDF
PPTX
Machine Learning using Big data
PDF
Building better knowledge graphs through social computing
PDF
Graph Neural Networks for Recommendations
PDF
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...
PDF
TUW-ASE Summer 2015 - Quality of Result-aware data analytics
PDF
姜俊宇/從資料到知識:從零開始的資料探勘
PDF
Building and Using a Knowledge Graph to Combat Human Trafficking
PDF
NoSQL (Not Only SQL)
PDF
Social networks protection against fake profiles and social bots attacks
PDF
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
PDF
陸永祥/全球網路攝影機帶來的機會與挑戰
PDF
Introduction to Data Science
PDF
Introduction to Data Science and Analytics
PPTX
Machine Learning Introduction for Digital Business Leaders
PDF
Linear Regression With R
PDF
AI @ Wholi - Bucharest.AI Meetup #5
Linked Data Entity Summarization (PhD defense)
Extracting, Aligning, and Linking Data to Build Knowledge Graphs
Application Modeling with Graph Databases - Relationships are cool
Machine Learning using Big data
Building better knowledge graphs through social computing
Graph Neural Networks for Recommendations
Web Intelligence 2013 - Characterizing concepts of interest leveraging Linked...
TUW-ASE Summer 2015 - Quality of Result-aware data analytics
姜俊宇/從資料到知識:從零開始的資料探勘
Building and Using a Knowledge Graph to Combat Human Trafficking
NoSQL (Not Only SQL)
Social networks protection against fake profiles and social bots attacks
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
陸永祥/全球網路攝影機帶來的機會與挑戰
Introduction to Data Science
Introduction to Data Science and Analytics
Machine Learning Introduction for Digital Business Leaders
Linear Regression With R
AI @ Wholi - Bucharest.AI Meetup #5
Ad

Viewers also liked (18)

PDF
Eu ppt presentation pq projektledning allmän
PDF
Pavel Dabrytski - Agile Africa 2015 - Agile Economics - budgets, contacts, ca...
PPTX
Poster analysis
PDF
poster_final
DOC
Silabus pai-kelas-ix
DOCX
Grameen Final Paper 303
PPTX
Interactive Video for Teaching and Learning
PPTX
Great neck school budget 2016-2017 analysis
PPTX
Facebook And Pinterest Take Big Data (BI, Analytics and Machine Learning) To ...
PPT
Presentacion algoritmos
PPT
Curso de Algoritmos - Presentación 1
PPTX
Big Data in the Cloud
PDF
Big Data Analytics on the Cloud
PDF
Digitization in supply chain management
PPTX
Renascimento Cultural
PPTX
Introduction to Machine Learning
PPTX
Introduction to Big Data/Machine Learning
PPTX
cloud computing ppt
Eu ppt presentation pq projektledning allmän
Pavel Dabrytski - Agile Africa 2015 - Agile Economics - budgets, contacts, ca...
Poster analysis
poster_final
Silabus pai-kelas-ix
Grameen Final Paper 303
Interactive Video for Teaching and Learning
Great neck school budget 2016-2017 analysis
Facebook And Pinterest Take Big Data (BI, Analytics and Machine Learning) To ...
Presentacion algoritmos
Curso de Algoritmos - Presentación 1
Big Data in the Cloud
Big Data Analytics on the Cloud
Digitization in supply chain management
Renascimento Cultural
Introduction to Machine Learning
Introduction to Big Data/Machine Learning
cloud computing ppt
Ad

Similar to Business Intelligence and Big Data in Cloud (20)

PDF
Data Analytics PowerPoint Presentation Slides
PDF
Big Data Analytics PowerPoint Presentation Slides
PDF
Business with Big data
PPTX
Conference Presenation Predictive Analytics ITC-AP 2013 , Prof Lili Saghafi
PDF
Business Data Analytics Powerpoint Presentation Slides
PPTX
Big data? No. Big Decisions are What You Want
PDF
BIGDATA-DIGITAL TRANSFORMATION AND STRATEGY
PDF
Big Data Analysis and Business Intelligence
PPTX
Finance and Accounting BPM
PDF
Big Data Analytics Powerpoint Presentation Slides
PDF
Big Data, Little Data, and Everything in Between
PPTX
Technology Trends and Big Data in 2013-2014
PPTX
Kaushal Amin & Big 5 IT trends in the world
PPTX
000 introduction to big data analytics 2021
PDF
Data, Interconnectedness & The Internet of Things
PDF
Lecture3 business intelligence
PDF
Analytics big data ibm
PDF
IBM-Infoworld Big Data deep dive
PDF
Agile data science
PDF
3 джозеп курто превращаем вашу организацию в big data компанию
Data Analytics PowerPoint Presentation Slides
Big Data Analytics PowerPoint Presentation Slides
Business with Big data
Conference Presenation Predictive Analytics ITC-AP 2013 , Prof Lili Saghafi
Business Data Analytics Powerpoint Presentation Slides
Big data? No. Big Decisions are What You Want
BIGDATA-DIGITAL TRANSFORMATION AND STRATEGY
Big Data Analysis and Business Intelligence
Finance and Accounting BPM
Big Data Analytics Powerpoint Presentation Slides
Big Data, Little Data, and Everything in Between
Technology Trends and Big Data in 2013-2014
Kaushal Amin & Big 5 IT trends in the world
000 introduction to big data analytics 2021
Data, Interconnectedness & The Internet of Things
Lecture3 business intelligence
Analytics big data ibm
IBM-Infoworld Big Data deep dive
Agile data science
3 джозеп курто превращаем вашу организацию в big data компанию

More from Ding Li (11)

PPTX
Software architecture for data applications
PPTX
Seismic data analysis with u net
PPTX
Titanic survivor prediction by machine learning
PPTX
Find nuclei in images with U-net
PPTX
Digit recognizer by convolutional neural network
PPTX
Reinforcement learning
PPTX
Recommendation system
PPTX
Practical data science
PPTX
AI to advance science research
PPTX
Machine learning with graph
PPTX
Natural language processing and transformer models
Software architecture for data applications
Seismic data analysis with u net
Titanic survivor prediction by machine learning
Find nuclei in images with U-net
Digit recognizer by convolutional neural network
Reinforcement learning
Business Intelligence and Big Data in Cloud

  • 2. Data-Driven, a Survival Must? 2BI: Goal, Mindset Industry Winner Loser Online Retail DVD Rental Social Network The same story will play out in other industries soon, if it has not already. BI is a major tool for turning a company into a data-driven company.
  • 3. What Can BI System Help? 3BI: Goal, Mindset Globalization Customer Demand Market Conditions Competition Technology Advance Regulations … Business Environment Organization Responses Strategic Planning New Business Models Restructure Business Processes Supply Chain Optimization Improve Partnership Relationships Improve Information Systems Encourage Innovation Improve Customer Service Improve Communication Improve Data Access Automate tasks Real-time Response … Pressures Opportunities Decision and Support Analysis Predictions Decisions Business Intelligence Support Turban, E., Sharda, R., Delen, D., and King, D. (2010). Business Intelligence: a managerial approach (2nd ed.). Upper Saddle River, NJ: Prentice Hall.
  • 4. Without Good BI System – Information Everywhere, but Hard to Access 4BI: Goal, Mindset Executives Managers Why can't I have the right data at the right time? Analysts Why should I waste most of my time getting the data? Data Engineers Ad hoc queries all day? The data job is so boring. Operators Can I get all relevant data in one place?
  • 5. Without Good BI System – Inefficiency due to Information Silo 5BI: Goal, Mindset O9 Solutions, Inc. Funny Business: Sales and Operations, accessed January 28, 2016, https://www.linkedin.com/hp/update/6098306831355047936
  • 6. How Can BI System Help? 6BI: Goal, Mindset Benchmarking Historical Current Predictive Views of Business Operations and Performance Better, Quicker Business Decision-Making Performance Management Reporting Analytics Data Mining Predictive Analytics Finance Sales Returns Supply Production Web Email User Usage Industry Analysis Competitive Analysis Social Analysis Product Ranking Technology Analysis Internal Data External Data
  • 7. How BI Deliver Core Values to Customers 7 Business Intelligence Tableau QlikView Amazon QuickSight Web/Mobile Analysis Google Analytics Adobe Analytics SEO Social Analysis Facebook Twitter Pinterest WeChat Cloud Big Data Warehouse Amazon Redshift Machine Learning R Python Amazon ML BI: Goal, Mindset
  • 8. Web/Mobile/APP Analysis – Audience(demo, interest, user type) 8 Davis, J. (2015). Google Analytics Demystified: A Hand-On Approach (2nd edition). CreateSpace. BI: Web/Mobile Analysis
  • 9. Web/Mobile/APP Analysis – Audience (Location) 9BI: Web/Mobile Analysis
  • 10. Web/Mobile/APP Analysis – Audience (Operating System) 10BI: Web/Mobile Analysis
  • 11. Adobe Discover: Navigation Flow Among Top Pages/Content 11Adobe training video. Retrieved from https://outv.omniture.com/. 11BI: Web/Mobile Analysis
  • 12. Adobe Discover: Navigation Flow from a Page 12 Adobe training video. Retrieved from https://outv.omniture.com/. BI: Web/Mobile Analysis
  • 13. Web/Mobile/APP Analysis – Acquisition 13 Campaign Keyword BI: Web/Mobile Analysis
  • 14. Web/Mobile/APP Analysis – SEO 14BI: Web/Mobile Analysis
  • 15. Visitor Segmentation Study via Adobe SiteCatalyst 15BI: Web/Mobile Analysis • What/how are they viewing? • Why did they leave? • How to engage them more? • How to connect them? New Visitors Casual Visitors Loyal Visitors Lapsed Visitors • Growing the number of loyal visitors is essential to keep the site thriving. • So it is important to understand their navigation patterns and what they like and dislike.
  • 16. Visits from Social Channels 16 Facebook Pinterest Twitter BI: Social Analysis
  • 17. Top Facebook Posts – Facebook Insights 17BI: Social Analysis Talking about this: Engaged Users: Reach: Engagement Rate:
  • 18. Users’ Response to Content – Twitter Analytics 18BI: Social Analysis
  • 19. Trace Pin in Pinterest – Curalate.com 19BI: Social Analysis • ~1000 visits from this pin. • The pinboard YUM by another account has 655 Pins and 168,699 Followers. • Keywords used to find the pin: should the pin be tagged this way?
  • 20. Tao of Social Media 20BI: Data Visualization Schaefer, M. (2012). The Tao of Twitter: Changing Your Life and Business 140 Characters at a Time. NY: McGraw-Hill Education. • Tao 1: Making Targeted Connections • Tao 2: Providing Meaningful Content • Tao 3: Offering Authentic Helpfulness
  • 21. Tao in Chinese 21BI: Data Visualization 圣人无常心,以百姓心为心。 -- 老子 道德经 四十九章 The sage has no mind of his own He is aware of the needs of others. -- Lao Tsu, Tao Te Ching Lao Tsu, (1997). Tao Te Ching - 25th Anniversary Edition. Translated by Gia-fu Feng and Jane English, Chapter 49. NYC, Vintage Books / Random House.
  • 22. What is Big Data 22 • Forrester: Big Data is the frontier of a firm's ability to store, process, and access (SPA) all the data it needs to operate effectively, make decisions, reduce risks, and serve customers • IBM: Big data is the data characterized by 3 attributes: volume, variety, and velocity Walker, R. (2015). From Big Data to Big Profits: Success with Data and Analytics, chapter 1. NYC: Oxford. BI: Data Flow Architecture
  • 23. Big Data Market is Growing Fast 23BI: Data Flow Architecture Kelly, J. (2015). Executive Summary: Big Data Vendor Revenue and Market Forecast, 2011-2026, accessed January 21, 2016, http://wikibon.com/executive-summary-big-data-vendor-revenue-and-market-forecast-2011-2026/
  • 24. Big Data in Cloud 24BI: Data Flow Architecture Wikipedia. Cloud Computing, accessed January 28, 2016, https://en.wikipedia.org/wiki/Cloud_computing • Big data is moving to the cloud fast. • Applications in the cloud are generating more big data in the cloud.
  • 25. BI Data Flow Architecture With ETL 25BI: Data Flow Architecture Relational Database NoSQL Store Excel File Text File Web Extract Standardize Primary Keys Cleaning Transform Transform Format Translate Embedded Logic Referential Integrity Check Indexing Load BI Data Warehouse Summarization Derivation Merge Sort Integration Aggregation BI System Social Moss, L. T. & Atre, S. (2003). Business intelligence roadmap: the complete project lifecycle for decision-support applications. Boston, MA: Addison-Wesley.
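The extract, transform, and load stages named on slide 25 can be sketched in a few lines of Python. This is only an illustrative sketch, not the architecture from the cited book: the CSV content, the `sales` table, and every function name below are assumptions made up for the example, and an in-memory SQLite database stands in for the BI data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; columns and values are invented for illustration.
RAW_CSV = """order_id,store,amount,order_date
1, NYC ,19.99,2016-01-05
2,nyc,5.00,2016-01-05
3,Boston,,2016-01-06
"""


def extract(text):
    """Extract: read rows from a text source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean and standardize rows, dropping incomplete ones."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # cleaning: skip rows with a missing amount
        cleaned.append((
            int(row["order_id"]),          # standardize the primary key type
            row["store"].strip().upper(),  # standardize store codes
            float(row["amount"]),
            row["order_date"].strip(),
        ))
    return cleaned


def load(rows, conn):
    """Load: write cleaned rows into a warehouse table and index it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales ("
        "order_id INTEGER PRIMARY KEY, store TEXT, amount REAL, order_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.execute("CREATE INDEX IF NOT EXISTS idx_sales_store ON sales (store)")
    conn.commit()


def run_pipeline(text):
    """Run extract -> transform -> load, then aggregate sales per store."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn.execute(
        "SELECT store, ROUND(SUM(amount), 2) FROM sales "
        "GROUP BY store ORDER BY store"
    ).fetchall()


totals = run_pipeline(RAW_CSV)
```

Keeping each stage as a separate function makes it easy to validate the data between stages, which is where most data-quality problems tend to surface.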
  • 26. Facebook’s Data Space Management with Open Source Tools 26BI: Data Flow Architecture Transactional Databases Application Logs Web Crawls (Post) Structured Data Unstructured Data Hadoop Distributed File System (HDFS) Query language Query UI (HiPal) Hive 15 terabytes new data per day in 2009 Data Warehousing Framework Argus Portal for Sharing Charts and Graphs Databee Workflow Management System PyHive Python Script Framework for MapReduce Cassandra Storage System for Serving Data to End Users Tools Parallelized Data Processing at Massive Scale Hammerbacher, J. (2009). Information platforms and the rise of the data scientist. In Segaran, T. & Hammerbacher, J. (Eds.). Beautiful Data, chapter 5. Sebastopol, CA: O’Reilly Media.
  • 27. Teradata Unified Data Architecture 27BI: Data Flow Architecture “Teradata Unified Data Architecture in Action,” Teradata Corporation, accessed April 19, 2014, http://www.teradata.com/white-papers/Teradata-Unified-Data-Architecture-in-Action/
  • 28. Amazon Big Data Portfolio 28BI: Data Flow Architecture “Introduction to Amazon Redshift,” Pavan Pothukuchi, accessed January 15, 2016, http://www.slideshare.net/AmazonWebServices/dat201-introduction-to-amazon-redshift
  • 29. Amazon Redshift Benefits 29BI: Data Flow Architecture “Introduction to Amazon Redshift,” Pavan Pothukuchi, accessed January 15, 2016, http://www.slideshare.net/AmazonWebServices/dat201-introduction-to-amazon-redshift
  • 30. Amazon Redshift Architecture 30BI: Data Flow Architecture “Introduction to Amazon Redshift,” Pavan Pothukuchi, accessed January 15, 2016, http://www.slideshare.net/AmazonWebServices/dat201-introduction-to-amazon-redshift
  • 31. Machine Learning 31BI: Machine Learning “Machine Learning,” Andrew Ng, accessed January 20, 2016, https://www.coursera.org/learn/machine-learning • Definition – Field of study that gives computers the ability to learn without being explicitly programmed. -- Arthur Samuel (1959). • Examples: – Database mining • Large datasets from growth of automation/web. • E.g., Web click data, medical records, biology, engineering – Applications that can’t be programmed by hand. • E.g., Autonomous helicopter, handwriting recognition, most of Natural Language Processing (NLP), Computer Vision. – Self-customizing programs • E.g., Amazon, Netflix product recommendations – Understanding human learning (brain, real AI).
  • 32. Neuron & Neural Networks 32BI: Machine Learning “Machine Learning,” Andrew Ng, accessed January 20, 2016, https://www.coursera.org/learn/machine-learning Pedestrian Car Motorcycle Truck Want , , , etc. when pedestrian when car when motorcycle Input Output Input (Image Pixel) Output (Judgement)
  • 33. Use Amazon ML for Filtering Actionable Tweets 33BI: Machine Learning Alex Ingerman (2015) “Real-World Smart Applications with Amazon Machine Learning,” accessed January 29, 2016, https://www.youtube.com/watch?v=sHJx1KJf8p0 Customer Service Actionable Customer Service Not Actionable Human Label Tweets mentioning AWS Training ML Model Training tweet analysis model developed by Amazon to automatically find the tweets that are actionable for customer service
  • 34. Beautiful Data Visualization 34BI: Data Visualization Iliinsky, N. (2010). On beauty. In Steele, J. & Iliinsky, N. (Eds.). Beautiful visualization, Chapter 1. Sebastopol, CA: O’Reilly Media. • Informative – Reveal intended message clearly with enough data – With different perspectives to facilitate discovery • Efficient – Visually emphasize what matters and reveal relationships – Use axes, color and size to convey meaning • Novel – Break the limit of default format, choose best format to suit data – A fresh look at the data – A new level of understanding • Aesthetic – Appropriate usage of graphical construction to offer visual appeal.
  • 35. Napoleon’s March to Russia in 1812–1813 35BI: Data Visualization Tufte, E. (2001). The Visual Display of Quantitative Information (2nd ed.). (Original by Charles Joseph Minard.) Connecticut, US: Graphics Press. • Army size • Geo Location • Move direction • Temperature • Date • Event
  • 36. When Relevant Information Is Put Together… 36BI: Data Visualization Tufte, E. (2001). The Visual Display of Quantitative Information (2nd ed.). Connecticut, US: Graphics Press. A cholera epidemic took the lives of 600 Londoners in September 1854. Nobody knew the cause. Dr. John Snow mapped the locations of the incidents and linked them to a particular pump site. It was later verified that the Broad Street pump was the cause of the epidemic. Sanitation measures started and then the epidemic was stopped.
  • 37. When We Do Literally What the User Asked … 37BI: Data Visualization Could I have top 10 stores in BI? No problem. Here you are! I see. Thanks. I thought you would be very interested. I was… but only for 10 seconds…
  • 38. If We Add a Cyclic Group of Category and Brand … 38BI: Data Visualization Change Dimension Drill-down to Phone Drill-down to Apple
  • 39. Can Business Intelligence Match Human Intelligence? 39 How are six tech companies organized? (Manu Cornet, 2011) http://www.bonkersworld.net/organizational-charts/ Can a BI system deliver insights this straightforward and drive users to think deeply? BI: Data Visualization
  • 40. Information in Well-designed Dashboard 40BI: Dashboard Design • Exceptionally well organized – All important data in one page • Condensed, primarily in the form of summaries and exceptions – Single numbers from sums or averages. – Something falls outside the realm of normality, which needs attention. • Specific to and customized for the dashboard’s audience and objectives – Information should be narrowed to address the objective(s). – Use audience’s vocabulary. • Displayed using concise and often small media that communicate the data and its message in the clearest and most direct way possible. – Reduce the non-data pixels. – Enhance the data pixels. Few, S. (2006). Information Dashboard Design. Sebastopol, CA: O’Reilly Media.
  • 41. Define Key Performance Indicators (KPIs) 41BI: Dashboard Design Category Measures Sales Bookings Billings Sales pipeline Number of orders Order amounts Selling prices Marketing Market share Campaign success Customer demographics Finance Revenues Expenses Profits Web Services Number of visitors Number of page hits Visit durations Comparative Measure Example The same measure at the same point in time in the past The same day last year The same measure at some other point in time in the past The end of last year The current target for the measure A budgeted amount for the current period A prior prediction of the measure Forecast of where we expected to be today An extrapolation of the current measure Projection out into the future, e.g. year end. Some measure of the norm for this measure Average, normal range or a bench mark. Few, S. (2006). Information Dashboard Design. Sebastopol, CA: O’Reilly Media.
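The comparative measures on slide 41 (the same measure in the past, a target, a forecast) all reduce to comparing the current KPI value against a reference value. A minimal Python sketch, assuming made-up revenue figures and a hypothetical `compare_kpi` helper that is not part of any BI tool:

```python
def compare_kpi(current, reference):
    """Return absolute and percentage difference of a KPI versus a reference."""
    diff = current - reference
    # Percentage is undefined when the reference value is zero.
    pct = diff * 100 / reference if reference else None
    return {"diff": diff, "pct": pct}


# Hypothetical figures, invented for illustration only.
revenue_today = 120_000.0
references = {
    "same day last year": 100_000.0,
    "budget target": 150_000.0,
    "prior forecast": 120_000.0,
}

# One comparison per comparative measure, keyed by its label.
report = {
    label: compare_kpi(revenue_today, ref) for label, ref in references.items()
}

for label, result in report.items():
    print(f"{label}: {result['diff']:+.0f} ({result['pct']:+.1f}%)")
```

Showing the difference next to the KPI, rather than the raw number alone, is what lets a dashboard flag a value as an exception.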
  • 42. Effective Dashboard Display Media 42BI: Dashboard Design Few, S. (2006). Information Dashboard Design. Sebastopol, CA: O’Reilly Media. Easier to spot trend with line chart Clean display of related data Simple symbol or number
  • 43. Utilize Short-Term Memory 43BI: Dashboard Design Few, S. (2006). Information Dashboard Design. Sebastopol, CA: O’Reilly Media. • Memory comes in three fundamental types: – Iconic memory (a.k.a. the visual sensory register) – Short-term memory (a.k.a. working memory) – Long-term memory • Only 3-9 chunks of information can be stored in short-term memory. • Graphs over text. – Individual numbers are stored in discrete chunks. – One or more lines in a line graph can represent a great deal of information as a single chunk. • Relevant information on the same screen. – Once the information is no longer visible, unless it is one of the few chunks of information stored in short-term memory, it is no longer available. – If everything remains within eye span, users can exchange information in and out of short-term memory at lightning speed.
  • 44. Sample Sales Dashboard 44BI: Dashboard Design Few, S. (2006). Information Dashboard Design. Sebastopol, CA: O’Reilly Media.
  • 45. When Dashboard is not Enough -> Self Service BI 45BI: Dashboard Design • As soon as a dashboard shows abnormalities, users will want to know more details. • The responsible individual will be called. He will query the database or ask IT staff to run the query… The process is long and resource-consuming. • Layered reports in self-service BI can put top-down views at users' fingertips: – Layer 1: One-page overview – Layer 2: Categorical reports such as regional/product reports – Layer 3: Data tables down to the most granular levels.
  • 46. Data Modeling – Understand Data Connection 46BI: Data Modeling • Given a system, first study how the data are linked, then model the linkage in BI system. R. Arlen Price Faculty An obesity-related locus in chromosome region 12q23-24 Diabetes Author Subscribe Read American Diabetes Association Publication National Institutes of Health Funding Research Interest Genetics of Complex Traits, Genetics of Obesity, Behavioral Genetics, Genetic Epidemiology Faculty Profile Research Techniques Linkage mapping, linkage disequilibrium association analyses, and gene expression profiling Profile Research Strength Ding Li Author Student Attend Events Proposal Review Data Linkage on STM Publishing
  • 47. Data Modeling – Natural Linear & Star Structure 47BI: Data Modeling • Data connection is the key to revealing the insights hidden in data. • In simple situations, a central table or a central key field can link the tables together.
  • 48. Data Modeling – Construct Star Structure 48BI: Data Modeling • A link table can be constructed to link tables on multiple common fields. • In this example, Sale, Return and Target tables need to be linked on (Item, Store, Date).
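The link table described on slide 48 can be sketched as follows: collect every distinct (Item, Store, Date) combination found in the Sale, Return, and Target tables, assign each one a surrogate key, and let every fact table point at that key. The rows, the `qty` field, and the function names are assumptions invented for this illustration; a BI tool would build the same structure in its own load script.

```python
KEY_FIELDS = ("item", "store", "date")

# Made-up rows for illustration only.
sales = [
    {"item": "A", "store": "S1", "date": "2016-01-01", "qty": 10},
    {"item": "B", "store": "S1", "date": "2016-01-01", "qty": 4},
]
returns = [
    {"item": "A", "store": "S1", "date": "2016-01-01", "qty": 1},
]
targets = [
    {"item": "A", "store": "S1", "date": "2016-01-01", "qty": 12},
    {"item": "B", "store": "S2", "date": "2016-01-01", "qty": 5},
]


def composite_key(row):
    """Build the (item, store, date) key shared by all fact tables."""
    return tuple(row[field] for field in KEY_FIELDS)


def build_link_table(*tables):
    """Map every distinct composite key to a surrogate link id."""
    keys = sorted({composite_key(row) for table in tables for row in table})
    return {key: link_id for link_id, key in enumerate(keys, start=1)}


def attach_link_id(table, link_table):
    """Return copies of the rows with the link id added."""
    return [dict(row, link_id=link_table[composite_key(row)]) for row in table]


def combined_view(link_table, **fact_tables):
    """Sum each fact table's quantity per link id into one star-shaped view."""
    view = {link_id: {"key": key} for key, link_id in link_table.items()}
    for name, table in fact_tables.items():
        for row in attach_link_id(table, link_table):
            entry = view[row["link_id"]]
            entry[name] = entry.get(name, 0) + row["qty"]
    return view


link_table = build_link_table(sales, returns, targets)
view = combined_view(link_table, sale=sales, returned=returns, target=targets)
```

Because the link table holds the union of keys, a target with no matching sale still appears in the view instead of silently dropping out.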
  • 49. Data Modeling – Time Series 49BI: Data Modeling • In the time series model, each event keeps its own timestamp, so it is easy to track the time gap between steps. • Typical questions: – For all the articles submitted in Jan. 2013, how long does it take to get reviewed, receive a final decision, and publish online if accepted? Compare with articles submitted in Jan. 2012. – For all the articles published in Apr. 2014, when were they submitted and reviewed, and when did they receive a final decision? Compare with the articles published in Apr. 2013. Submit Date Review Date Decision Date Online Date Download Date
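A minimal Python sketch of the time series model on slide 49: each article row carries one date per step, so the gap between consecutive steps is a simple date subtraction, and a cohort is selected by the month of one chosen step. The article records and helper names are hypothetical and exist only for this example.

```python
from datetime import date

STEPS = ("submit", "review", "decision", "online")

# Hypothetical article records; the second was never published online.
articles = [
    {"id": "a1", "submit": date(2013, 1, 7), "review": date(2013, 2, 6),
     "decision": date(2013, 3, 8), "online": date(2013, 4, 7)},
    {"id": "a2", "submit": date(2013, 1, 20), "review": date(2013, 2, 9),
     "decision": date(2013, 2, 19), "online": None},
]


def step_gaps(article):
    """Days elapsed between each pair of consecutive steps that both occurred."""
    gaps = {}
    for start, end in zip(STEPS, STEPS[1:]):
        if article.get(start) and article.get(end):
            gaps[f"{start}->{end}"] = (article[end] - article[start]).days
    return gaps


def cohort(rows, step, year, month):
    """Select articles whose given step happened in the given month."""
    return [
        row for row in rows
        if row.get(step) and (row[step].year, row[step].month) == (year, month)
    ]


def average_gap(rows, gap_name):
    """Average a named gap over the articles that have it, or None if none do."""
    values = [step_gaps(r)[gap_name] for r in rows if gap_name in step_gaps(r)]
    return sum(values) / len(values) if values else None


jan_2013 = cohort(articles, "submit", 2013, 1)
avg_submit_to_review = average_gap(jan_2013, "submit->review")
```

The same `cohort` call with `step="online"` answers the second type of question on the slide, which starts from the publication month and looks backward.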
  • 50. Data Modeling – Universal Time 50BI: Data Modeling • In this model, users want to view all activities within the same period. • Typical questions: – In Apr. 2014, how many articles were submitted, reviewed, and published? If a user changes to another period, all the numbers change to reflect the new period simultaneously. Event date Submit Editor Review Peer Review Production Online Usage
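The universal time model on slide 50 can be sketched by flattening every step into an event row that shares a single `event_date` field, so one period selection filters all activity types at once. The records and function names below are assumptions made up for illustration.

```python
from collections import Counter
from datetime import date

EVENT_TYPES = ("submit", "review", "online")

# Hypothetical article records; None means the step has not happened.
articles = [
    {"id": "a1", "submit": date(2014, 2, 10), "review": date(2014, 3, 15),
     "online": date(2014, 4, 20)},
    {"id": "a2", "submit": date(2014, 4, 2), "review": date(2014, 4, 25),
     "online": None},
    {"id": "a3", "submit": date(2014, 4, 11), "review": None, "online": None},
]


def to_events(rows):
    """Flatten per-step date columns into one row per event with a shared date."""
    events = []
    for row in rows:
        for event_type in EVENT_TYPES:
            event_date = row.get(event_type)
            if event_date is not None:
                events.append(
                    {"article": row["id"], "type": event_type,
                     "event_date": event_date}
                )
    return events


def activity_in_period(events, year, month):
    """Count events of every type that fall inside the selected month."""
    return Counter(
        e["type"] for e in events
        if (e["event_date"].year, e["event_date"].month) == (year, month)
    )


events = to_events(articles)
april_2014 = activity_in_period(events, 2014, 4)
march_2014 = activity_in_period(events, 2014, 3)
```

Changing the period is then a single parameter change, and every count updates together, which is the behavior the slide describes.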
  • 51. Challenges of BI Development Management 51BI: Development Management • A BI project involves cross talk between multiple departments. Winning cooperative support is the key to its success. • BI development often encounters unexpected issues in data availability, data quality, data linkage, and business logic transfer. Forcing a deadline may produce low-quality reports; over-relaxing the due date may stall a project. An agile process is pivotal to moving the project forward. • A BI system is very efficient at exposing data abnormalities. A cleaner data system is only possible if source data problems are addressed between the BI developer and the data owners/suppliers.
  • 52. Heavyweight Development Process – Thorough but High Risk 52BI: Development Management Moss, L. T. & Atre, S. (2003). Business intelligence roadmap: the complete project lifecycle for decision-support applications. Boston, MA: Addison-Wesley.
  • 53. Agile Development Process 53BI: Development Management Plan • Business Goals • KPIs Analysis • Data Sources • Calculation Logic Data ETL • Extraction • Transform • Loading Design • Report Layout • Data Visualization Validation • Data • Logic Feedback • New Requirements • Phased Release – Important KPIs first – Well-connected data first • Fast Development, Quick Feedback – Design – Data – Logic
  • 54. BI Platforms – 2015 Gartner Magic Quadrant 54BI: Platform, Tool Rita L. Sallam, Joao Tapadinhas, Josh Parenteau, Daniel Yuen, Bill Hostmann (2014). Magic Quadrant for Business Intelligence and Analytics Platforms, last accessed on Apr. 22, 2014, http://www.gartner.com/technology/reprints.do?id=1-1QLGACN&ct=140210&st=sb • Agile Platform – Tableau – QlikView – Tibco Spotfire • Large Platform – Microsoft – IBM (Cognos) – SAS – SAP (BusinessObjects) – Oracle (OBIEE) – MicroStrategy – Information Builders
  • 55. BI Platform Example – QlikView 55BI: Platform, Tool • Pros – Click-driven, visually interactive interface is simple to learn and use. – Based on in-memory associative technology, which is fast. – Flexible data sources (Oracle, SQL, Excel, text file). – Quicker to build compared with traditional BI systems. • Cons – Needs straightforward relationships among tables, which requires clean data to link tables. – Its underlying calculation logic, set analysis, is hard to use for complicated logic. – Its script language is not complete enough to accomplish comprehensive tasks. – Most data needs to be in memory.
  • 56. BI Platform Example – Tableau vs QlikView 56BI: Platform, Tool • Pros – More innovative visualization, including geo mapping. – Uses the UI to select data sets instead of expressions in code. – Free Tableau Public makes it very popular. • Cons – Weak ETL capability. • Sample Projects – Payment difference to medical providers for 100 common inpatient services
  • 57. Tableau Public – Free Hosting of Data Visualization 57BI: Platform, Tool https://public.tableau.com/s/gallery/new-yorks-citi-bikes
  • 58. Thank You 58BI: Thank You Analyzing data is worth the cost… The price of light is less than the cost of darkness. --Arthur C. Nielsen, Founder of ACNielsen Company Please send your comments or suggestions to ding.li@smartdatanet.com
  • 59. Appendix: Services from Smart Data Net Inc. 59BI: Smart Data Net Inc. Data Web Clicks Social Posts User Demographics Sale Supply Competitors BI Solutions 1. Provide 2. Analyze 3. Develop 4. Get Insights Business Client Smart Data Net Communicate all the time Forecast Demand Return Profit Cost Marketing User Feedback Email Open/Click User Referral R & D
  • 60. Appendix: How BI Can Help Small Business 60BI: How BI Can Help Small Business • Web Analysis – What do users want to see? – Can users find the right content? – Is the website search-engine friendly? – Tool: Google Analytics • Social Analysis – What content is engaging users? – How to make content far-reaching? – How to foster a supportive social group? – Tool: Facebook Insights, Twitter Analytics, HootSuite, Curalate • Sale Analysis – Near real-time revenue/cost analysis – Find problems/opportunities quickly – Service level analysis – Tool: Tableau, QlikView • Marketing/User Analysis – Which marketing method brings the most valuable users? – How to target the right users based on their previous behavior? – User segmentation analysis – Tool: Tableau, QlikView, R, Python
  • 61. Appendix: How to Become a BI Developer/Data Scientist 61BI: How to Become a BI Developer/Data Scientist • Visualization Track (programming experience not required) – Proficiency in a Visualization Tool • Tableau, QlikView – Study Visualization Best Practices • Books from Edward Tufte, Stephen Few – Understand Business Analysis Flow • Discuss with business users • Data Management Track – Data Warehouse & BI Platform • Amazon Redshift, Cognos, SAS, SAP, SQL, Oracle – Big Data Store • Hadoop, Teradata, AWS, Azure – NoSQL Store • MongoDB • Data Mining Track (for programmers or statisticians) – Data Manipulation • Python – Statistics • R – Machine Learning • Octave, Java, R, Python • Resources – Free Online Classes • Coursera.org – Seminars • Meetup.com – Tool Online Training • www.tableausoftware.com/learn/training
  • 62. Appendix: How to Become a Tableau Developer from Scratch 62BI: How to Become a Tableau Developer • Tableau has Powerful Visualization, Great Usability and a Short Learning Curve – Efficient for geo and trend analysis – Takes a couple of weeks to learn and a few months to master – Can be the first step into the data science world • Step 1: Using Free Tableau Public – Download Tableau Public • http://www.tableausoftware.com/public/ – Take Online Training • http://www.tableausoftware.com/public/training – Apply to Open Public Data • https://www.data.gov/open-gov/ • https://data.ny.gov/ • https://nycopendata.socrata.com/ – Save and Publish Your Work Online • With the free version, users cannot save the result on a local machine • This is all you need if you can publish all your work publicly (the server is hosted by Tableau for free) • Step 2: Using Tableau Desktop – Download Tableau Desktop • http://www.tableausoftware.com/products/desktop – Use the 14-day free trial to do as much training and development as possible • http://www.tableausoftware.com/learn/training – Purchase the product if it is the right tool for you • Personal edition: $1000, no database connection • Professional edition: $2000, can open database • Next Steps: Enjoy Data Visualization and Analysis; Learn More Theory, Best Practices and Tools.