ABC Bookstore Analysis
The Three Data Musketeers
NIDA Business Analytics and Data Sciences Contest
September 2, 2016
Thearasak Phaladisailoed
Team Position: Senior Data Scientist
Faculty of Information Technology
KMITL
Experienced with Python
Machine Learning Enthusiast
James Rakratchatakul
Team Position: Analyst
Faculty of Engineering
Chulalongkorn University
- Big Data & Statistical Analysis
- Experienced with RapidMiner
- Tableau Analyst
Korkrid Akepanidtaworn
Team Position: Data Scientist
- Economics and Statistics
- London School of Economics
- Big Data & Statistical Analysis
- Experienced with R Programming, STATA,
SAS, SPSS, PSPP, Python, BI Tools, and
Big Data Services
Our Team
Morning Agenda
1. Understanding our Dataset
2. Problem Statement
3. Data Preprocessing
4. Descriptive Analytics
5. Predictive Analytics
6. Data Insight Recap
Analytics Life Cycle
“Our team uses many analytics tools to build a
predictive model, communicate results,
operationalize data, and drive discovery”
Dataset Snapshot
The dataset contains 395 observations of 64 variables and comes with a data dictionary that describes each variable.
Problem Statement
ABC Bookstore is experiencing a steady decline
in profits, customer loyalty, and customer satisfaction.
Our team uses data analytics to investigate ways
to recover customer satisfaction, the book
re-purchasing rate, and bookstore subscriptions.
Data Preprocessing
The ABC Bookstore dataset needs cleansing prior to
data analysis and visualization. The key challenges are:
• Outlier Detection (delete or impute extreme values)
• Missing Value Treatment
• Binning or Discretization
• Central Tendency Imputation
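As a rough illustration of the outlier and binning steps listed above, the sketch below flags extreme values with the 1.5 × IQR rule and discretizes a numeric variable into labelled bins. The rule, the made-up `ages` values, the bin edges, and the function names are assumptions for illustration; the slides do not say which method or thresholds the team used.

```python
import statistics

def iqr_bounds(values, k=1.5):
    """Return (low, high) fences using the interquartile-range rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def flag_outliers(values, k=1.5):
    """Return the values that fall outside the IQR fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if v < low or v > high]

def bin_value(value, edges, labels):
    """Return the label of the first bin whose upper edge is >= value.

    `labels` has one more entry than `edges`; the last label is the open-ended top bin.
    """
    for edge, label in zip(edges, labels):
        if value <= edge:
            return label
    return labels[-1]

# Hypothetical customer ages; 95 is an implausible entry for this sample.
ages = [18, 21, 22, 25, 27, 30, 31, 35, 38, 95]
outliers = flag_outliers(ages)
age_groups = [bin_value(a, edges=[24, 34], labels=["<=24", "25-34", "35+"]) for a in ages]
print(outliers)
print(age_groups)
```

Whether a flagged value is deleted or imputed is a separate decision that depends on the variable and on domain knowledge.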
Missing Values Replacement Policies:
• Ignore the records with missing values.
• Replace them with a global constant (e.g., “?”).
• Fill in missing values manually based on your
domain knowledge.
• Replace them with the variable mean (if numerical)
or the most frequent value (if categorical).
• Use modeling techniques such as nearest neighbors,
Bayes’ rule, decision tree, or EM algorithm.
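The mean / most-frequent-value policy above can be sketched in a few lines. This is a minimal example that assumes missing entries are represented as `None`; the column names and values are invented, and it is not the team's actual preprocessing code.

```python
import statistics

def impute(column):
    """Fill None entries with the mean (numeric column) or the mode (categorical column)."""
    present = [v for v in column if v is not None]
    if all(isinstance(v, (int, float)) for v in present):
        fill = statistics.mean(present)   # central tendency for numeric data
    else:
        fill = statistics.mode(present)   # most frequent value for categorical data
    return [fill if v is None else v for v in column]

# Hypothetical survey columns with gaps.
satisfaction = [4, None, 5, 3, None]
gender = ["F", "M", None, "F"]

print(impute(satisfaction))
print(impute(gender))
```

Mean imputation keeps the column average unchanged but shrinks its variance, which is one reason the model-based techniques in the last bullet are sometimes preferred.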
Descriptive Analytics
What Happened When?
Explaining the past
Demographic Analysis
What will happen?
Predicting the future
Customer Satisfaction Model
Algorithm: Stepwise Linear Regression
How does the algorithm work? It predicts the value of a numerical target variable by building a model based on
one or more predictors (numerical or categorical variables).
Business Objective: regain customer satisfaction and maximize customer utility
Goal: predict the level of customer satisfaction from purchasing books from our store
Approach:
• Cleanse the data and select relevant features
• Interpret the linear regression model
• Examine the coefficient of determination and p-values
• Tune parameters
• Select and evaluate the model
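To make the stepwise idea concrete, here is a small forward-selection sketch: starting from an intercept-only model, it repeatedly adds the predictor that raises R² the most and stops when the gain falls below a threshold. The R²-gain stopping rule, the `min_gain` value, the toy data, and all function names are assumptions; the slides do not state which selection criterion (p-values, AIC, or another) the team's tool used.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def r_squared(rows, y, features):
    """Fit ordinary least squares with an intercept on the given columns; return R^2."""
    design = [[1.0] + [row[j] for j in features] for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, d)) for d in design]
    mean_y = sum(y) / len(y)
    sse = sum((t - p) ** 2 for t, p in zip(y, pred))
    sst = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - sse / sst

def forward_stepwise(rows, y, min_gain=0.01):
    """Greedily add the predictor with the largest R^2 gain until the gain is too small."""
    selected, best = [], 0.0
    remaining = list(range(len(rows[0])))
    while remaining:
        score, j = max((r_squared(rows, y, selected + [j]), j) for j in remaining)
        if score - best < min_gain:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best

# Toy data: the target depends on columns 0 and 2 only (y = 2*x0 + x2).
rows = [[1, 5, 2], [2, 3, 1], [3, 6, 4], [4, 2, 3], [5, 4, 6], [6, 1, 5]]
y = [2 * r[0] + r[2] for r in rows]
selected, best = forward_stepwise(rows, y)
print(selected, round(best, 4))
```

On this toy data the irrelevant column 1 is never added, because it does not improve the fit once columns 0 and 2 are in the model.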
Regression Model Result
Model Evaluation – R Squared
The coefficient of determination (R²) summarizes the explanatory power of the regression model and is computed
from the sums-of-squares terms: R² = 1 − SSE/SST, where SSE is the sum of squared errors and SST is the total sum of squares.
R² describes the proportion of variance of the dependent variable explained by the regression model. If the
regression model is “perfect”, SSE is zero and R² is 1. If the regression model is a total failure, SSE is equal to SST,
no variance is explained by the regression, and R² is zero.
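The two boundary cases described above can be checked directly. The sketch below computes R² from SSE and SST on made-up numbers; the function name and the data are illustrative only.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_actual = sum(actual) / len(actual)
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # unexplained variation
    sst = sum((a - mean_actual) ** 2 for a in actual)           # total variation
    return 1.0 - sse / sst

actual = [3, 4, 5, 4]                      # hypothetical satisfaction scores

perfect = r_squared(actual, actual)        # SSE = 0, so R^2 = 1
baseline = r_squared(actual, [4, 4, 4, 4]) # always predicting the mean: SSE = SST, so R^2 = 0
print(perfect, baseline)
```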
Re-Purchasing Algorithms
Algorithm: Linear Regression
How does the algorithm work? It predicts the value of a numerical target variable by building a model based on
one or more predictors (numerical or categorical variables).
Business Objective: identify the rate at which a customer wants to re-purchase our books
Goal: predict the rate at which a customer will re-purchase books from our store
Approach:
• Cleanse the data and select relevant features
• Interpret the linear regression model
• Examine the coefficient of determination and p-values
• Tune parameters
• Select and evaluate the model
Re-Purchasing Regression
Model Evaluation - RMSE
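RMSE (root mean squared error) is a widely used measure of a regression model's prediction error. It is expressed in the
units of the target variable, so it can only be compared between models whose errors are measured in the same units.
RMSE = 7.29605. Reported accuracy = 92.70395, which equals 100 − RMSE.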
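For reference, RMSE is the square root of the mean squared difference between actual and predicted values. The sketch below uses invented numbers, not the contest data.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target variable."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

actual = [3, 5, 4, 4]      # hypothetical observed scores
predicted = [2, 5, 4, 6]   # hypothetical model output

error = rmse(actual, predicted)  # errors are 1, 0, 0, -2, so the mean squared error is 1.25
print(round(error, 4))
```

Because the squared errors are averaged before the root is taken, a few large misses raise RMSE more than many small ones.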
Subscription Model
Algorithm: Decision Tree
How does the algorithm work? A decision tree builds classification or regression models in the form of a tree
structure. It breaks a dataset down into smaller and smaller subsets while, at the same time, an associated
decision tree is incrementally developed. The final result is a tree with decision nodes and leaf nodes. Decision
trees can handle both categorical and numerical data.
Business Objective: identify customers' preference to subscribe to ABC Bookstore
Goal: classify customers as apply vs. not apply using a classification algorithm
Approach:
• Cleanse the data and select relevant features
• Interpret the decision tree model
• Tune parameters: entropy, information gain, and decision rules
• Select and evaluate the model
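Entropy and information gain, named in the tuning step above, are the quantities a decision tree commonly uses to choose a split. The sketch below computes both for the apply / not apply target; the label counts and the two candidate splits are invented for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(labels, groups):
    """Entropy of the parent node minus the size-weighted entropy of its child groups."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - remainder

# Hypothetical node with 4 subscribers and 4 non-subscribers.
labels = ["apply"] * 4 + ["not apply"] * 4

# A split that separates the classes completely, and one that only partly does.
pure_split = [["apply"] * 4, ["not apply"] * 4]
mixed_split = [["apply"] * 3 + ["not apply"], ["apply"] + ["not apply"] * 3]

print(entropy(labels))
print(information_gain(labels, pure_split))
print(round(information_gain(labels, mixed_split), 4))
```

The tree-building procedure picks, at each node, the candidate split with the highest gain, which is why the first split would be preferred here.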
Model Result
Significant variables: CS5, CS8, CS9 + age, CS14, CS30
Model Evaluation
A confusion matrix shows the number of correct and incorrect predictions made by the classification model compared
to the actual outcomes (target value) in the data
• Accuracy : the proportion of the total number of predictions that were correct.
• Positive Predictive Value or Precision : the proportion of predicted positive cases that are actually positive.
• Negative Predictive Value : the proportion of predicted negative cases that are actually negative.
• Sensitivity or Recall : the proportion of actual positive cases which are correctly identified.
• Specificity : the proportion of actual negative cases which are correctly identified.
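The five measures above all derive from the four cells of a binary confusion matrix. The sketch below computes them from counts of true positives (tp), false positives (fp), false negatives (fn), and true negatives (tn); the counts are made up and are not the contest results.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return the standard summary measures of a 2x2 confusion matrix."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),  # correct predictions / all predictions
        "precision": tp / (tp + fp),                  # positive predictive value
        "npv": tn / (tn + fn),                        # negative predictive value
        "recall": tp / (tp + fn),                     # sensitivity
        "specificity": tn / (tn + fp),
    }

# Hypothetical counts for the "apply" (positive) vs. "not apply" (negative) classes.
metrics = confusion_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Precision and NPV are conditioned on what the model predicted, while recall and specificity are conditioned on the actual class.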
What’s Next?
Data-driven business strategy in the era of Thailand 4.0
• Better Campaign Performance
• Better Market Share
• Better Product Development
• Lasting Revenue
• Better customer satisfaction
• Better book re-purchasing rate
• Better bookstore subscription
Business Analytics: Data-Driven ABC Bookstore
The Three Data Musketeers
NIDA Business Analytics and Data Sciences Contest
September 2, 2016
Business Plan:
Table of Contents
1. Business Challenges and Expectations
2. Industry 4.0: Data-Driven Business
3. Marketing Strategy
4. Management
5. Service Recommendation
6. Project Timeline
7. Financial Analysis
Key Challenges & Outcomes
Challenges:
• Constant decline in profits
• Customer utility drops
• Low re-purchase
• Low bookstore subscription
Outcomes:
• Sustainable business growth
• Maximized customer utility
• Higher re-purchase
• High incentive to subscribe
Industry 4.0
Data-Driven Marketing
If ABC Bookstore wants to succeed in the era of Thailand 4.0, the company
needs to deliver innovation, speed, cost reduction, and creative management
the right message to the right person
at the right time for the right price
Strategy to Maximize Utility
Our earlier analysis found that “the employee can answer questions”, “activities/discussions”, “a
variety of books”, and “book wrapping” are statistically significant predictors of higher customer satisfaction.
Management Challenge
• Lack of employee training
• Lack of store activities
Service Challenge
• A variety of books offered
• Add-on for book wrapping?
Strategy to Re-purchase
Our earlier analysis found that the top 10 factors statistically associated with re-purchasing
are a variety of books, bag claiming, respect for customers, good service from employees, quality of book
categorization, a feeling of security or freedom while reading, a place to read books, ease of searching for books,
and discounts or promotions.
Strategy to Increase Subscription
Freedom for customers
Well-mannered
There’s a book that I want!
Book Wrapping Service
Well-Organized
Quality in book categorization
Creative Management 4.0
The data indicate that the following three attributes are the most important:
• Well-Mannered
• Service-minded
• No pressure on customers
• Freedom for customers
• Respect for customers
• Balance of Can Do Capability and Will Do Motivation
  - Appropriate training for employees
  - Adequate skills and competencies, especially in data-driven decisions
  - Updated knowledge: always keep up to date on social media
  - Motivate employees through recognition, love of work, career structure, and social respect
ABC Serenade
Novel Business
Academic Magazine
ABC Serenade
Novel Top10
ABC Serenade
Novel
ABC Counter
Timeline
Months shown: 2016/9, 2016/10, 2016/11, 2016/12, 2016/13
Milestones:
• Set up Impression Box
• Train Employees 1st
• App Development for Serenade
• Purchase Computers and materials for Serenade Zone
• Serenade Zone Full Operation
• Social Media Marketing
• Result Check-up
Financial Analysis
Service Costs
Impression Box 200 B
Training Venue 10,000 B
Computer 20,000 B
App Development 10,000 B
Serenade Bar 10,000 B
Total 50,200 B
Thank You
Time for Q & A

Data analysis results of the first-prize winning team, The First NIDA Business Analytics and Data Sciences Contest
