Discover the Potential of
Your Data with Machine
Learning
Housekeeping
• Webinar recordings and slides will be shared
with all attendees
• Type in your questions and comments using
the question pane on the right-hand side
© Harbinger Systems | www.harbinger-systems.com
Presenters
Lalit Kumar
Business Analyst
Harbinger Systems
Gautam Mainkar
Data Analyst
Harbinger Systems
Agenda
• A practical definition
• Why it's important
• Using machine learning on enterprise data
– Types of business problems machine learning can solve
– How to categorize a problem: regression, clustering and
classification
• Overview of key algorithms, tools and technologies
• Walk-through of real-world use cases
Machine Learning (ML) – A Practical Definition
A type of artificial intelligence
that provides computers with the ability to learn
without being explicitly programmed.
• Computer can infer rules inherent in data
• Computer adapts when exposed to new data
Why Do We Need It?
Comic by XKCD
Enterprise Data Hides Information
“There are things we know we know,
There are things we know we don't know.
But there are also things we don't know we don't
know”
- Donald Rumsfeld
What Constitutes a Machine Learning Problem?
• The emphasis of machine learning is on automatic methods
• Devise learning algorithms that learn automatically, without human intervention
• Program by example: we don't care how the machine does it, as long as it gets it right
• Result-oriented rather than process-oriented
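Programming by example can be illustrated with a minimal sketch: no fraud rule is written by hand, and the prediction comes entirely from labeled examples. The example data, the feature choice (purchase amount, hour of day) and the function names are assumptions made for illustration. The technique shown is a plain 1-nearest-neighbour lookup, not a method taken from the webinar.

```python
# Hypothetical labeled examples: (purchase amount, hour of day) -> label.
# All values are invented for illustration only.
EXAMPLES = [
    ((12.0, 14), "legit"),
    ((25.0, 10), "legit"),
    ((18.0, 19), "legit"),
    ((950.0, 3), "fraud"),
    ((1200.0, 2), "fraud"),
    ((870.0, 4), "fraud"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(features, examples=EXAMPLES):
    """Return the label of the closest known example (1-nearest neighbour)."""
    _, label = min(examples, key=lambda ex: squared_distance(ex[0], features))
    return label
```

Calling `predict((15.0, 13))` returns the label of the nearest stored example. Adding new examples changes the behaviour without touching the code, which is the sense in which the system is result-oriented rather than process-oriented.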
How can Machine Learning Add Value?
ML is a data-driven approach
• Rules are learned from the data rather than hand-coded from business knowledge
ML is domain-independent
• The same algorithms can be used across domains and in different use cases
ML creates flexible decision systems
• Creates robust systems that can adjust to changing conditions without
human intervention
ML and Big Data
ML thrives with big data!
– Accuracy of algorithms generally improves as the amount of data grows
– Statistical approaches can handle big datasets much better than
traditional paradigms
– Decision making using ML can adapt to transactional data much better
Machine Learning Applications: Some Examples
Fraud Detection: Did the user really perform this login or make this purchase?
Product Recommendation: Will the user like this product?
Stock Trading: Will the stock go up or down?
Medical Diagnosis: Given some symptoms, what is the patient
suffering from?
How to Categorize the Problem?
Generally, machine learning problems look to:
Identify a value (regression)
Assign data points to a category (classification)
Discover similarities between data points (clustering)
Flowchart
Start
Sufficient data? If no: Get more!
Sort into category? With labeled data: Classification. Without labeled data: Clustering.
Predict a value? If yes: Regression.
Otherwise: Refine Problem!
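One plausible reading of the flowchart can be written as a small function. The function name, the parameter names and the exact branch order are assumptions, since the slide text preserves only the node labels and not the arrows between them.

```python
def categorize_problem(sufficient_data, sort_into_category, labeled_data, predict_value):
    """Map yes/no answers to the flowchart questions onto a recommendation.

    The branch order is an assumed reconstruction of the slide's flowchart.
    """
    if not sufficient_data:
        return "Get more!"
    if sort_into_category:
        # Labeled examples allow supervised classification;
        # without labels, only unsupervised clustering is possible.
        return "Classification" if labeled_data else "Clustering"
    if predict_value:
        return "Regression"
    return "Refine Problem!"
```

For example, a price-quote task with enough history answers "no" to sorting into a category and "yes" to predicting a value, so the function returns "Regression".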
Machine Learning Algorithms
What to look for in algorithms:
Flexible across many use cases
Able to handle several input types
Accurate
Resistant to overfitting, noise and errors
Random Forest
Used for classification and regression
Trains many decision trees on random subsets of the data and combines their outputs into a single estimate
XGBoost
Used for classification and regression
Starts with a weak learner and adds new ones over successive iterations, each correcting the errors of the ensemble so far
K-Means
Used for clustering
Groups data points around k cluster centers, assigning each point to its nearest center
Tools and Technologies
Emphasis on tools which:
Can integrate with existing data architecture
Have a smooth learning curve
Simplify the process of analysis and prediction
Have an active community
Popular Machine Learning Tools
Python
Free, open-source, widely popular
Brings together many important libraries written in Python and C
Has an active community
Disclaimer: Brand names, logos and trademarks used herein remain the property of their respective owners.
Popular Machine Learning Tools
R
Statistical computing language that simplifies complex
statistical operations
Large number of libraries available for extending
functionality (DB connectors, algorithms, visualization)
Price Prediction: Regression Problem
Scenario
An industrial MNC buys part assemblies from various suppliers
The supplier selection workflow is cumbersome and inflexible
Create a system to predict supplier price quotes and simplify the selection process
Data Available
Technical specifications and pricing parameters of past supplier quotes
Problem Type
Predict a numerical value (price quoted by supplier) based on:
Numerical data (specs, prices, etc.)
Categorical data (part composition, payment options, etc.)
Algorithm Chosen: Random Forest
We are working with a mix of numerical and categorical variables
Large number of records with relatively low feature dimensionality, so overfitting is not a big risk
We expect a complex relationship between features
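The core idea behind Random Forest, training many trees on random resamples of the data and averaging them, can be sketched with the standard library alone. This is a simplified illustration and not the model used in the case study: it bags one-split regression trees ("stumps") on a single numeric feature, and it omits the per-split random feature selection that a real Random Forest adds. The quote data and all function names are hypothetical.

```python
import random
from statistics import mean

# Hypothetical past quotes: (part weight in kg, quoted price). Invented values.
QUOTES = [(1, 10.0), (2, 11.0), (3, 12.0), (4, 13.0),
          (7, 50.0), (8, 51.0), (9, 52.0), (10, 53.0)]


def fit_stump(rows):
    """Fit a one-split regression tree by choosing the threshold with the lowest squared error."""
    xs = sorted({x for x, _ in rows})
    overall = mean(y for _, y in rows)
    # (error, threshold, left mean, right mean); threshold None means "no split".
    best = (float("inf"), None, overall, overall)
    for lo, hi in zip(xs, xs[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in rows if x <= threshold]
        right = [y for x, y in rows if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    """Predict with a single stump."""
    threshold, left_mean, right_mean = stump
    if threshold is None:  # the sample had one distinct x value, so no split was possible
        return left_mean
    return left_mean if x <= threshold else right_mean


def fit_bagged_stumps(rows, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]


def predict(forest, x):
    """Average the predictions of all stumps."""
    return mean(predict_stump(stump, x) for stump in forest)
```

A usage sketch: `forest = fit_bagged_stumps(QUOTES)` followed by `predict(forest, 2)` and `predict(forest, 9)` gives a lower estimate for the light part than for the heavy one. Averaging over resampled trees is what makes the combined estimate less sensitive to any single record.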
Result
Predicted the quote price given by the supplier with a relatively low error rate
Simplified the supplier selection workflow and opened avenues for complete
automation in the future
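The slides do not say how the error rate was measured. As a hedged illustration, two common choices for regression are mean absolute error (MAE) and root mean squared error (RMSE), computed on quotes the model did not see during training. The function names and the sample numbers below are assumptions.

```python
from math import sqrt


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared difference; penalises large misses more."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))
```

Both metrics are expressed in the same unit as the price, so they can be compared directly against typical quote values to judge whether the error is "relatively low".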
Targeted Marketing: Classification Problem
Scenario
An eLearning product is sold to universities, corporations and institutions across
the globe
There is a need to improve the conversion rate through targeted marketing
Create a system to sort prospects into a specific segment
Data Available
Historical purchase data and customer data from the CRM
Problem Type
Predict a category (customer segment) based on:
Numerical data (payment records)
Categorical data (customer profile data, product purchase data)
Algorithm Chosen: Gradient Boosting Machine (XGBoost)
A mix of numerical and categorical values
Extremely high dimensionality and size of data
Parallel processing capabilities could be useful
Overfitting could be a problem, which XGBoost's built-in regularization helps control
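The boosting idea, starting from a weak learner and adding new ones that correct the remaining error, can be sketched as follows. This is plain gradient boosting with a squared-error loss on 0/1 labels, which is far simpler than XGBoost itself (XGBoost adds regularization, second-order gradient information and parallelised tree construction). The prospect data, the single feature and all function names are hypothetical.

```python
from statistics import mean

# Hypothetical prospects: (number of past purchases, 1 if "enterprise" segment else 0).
PROSPECTS = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]


def fit_stump(xs, targets):
    """One-split regression tree minimising squared error on the targets."""
    best = None
    points = sorted(set(xs))
    for lo, hi in zip(points, points[1:]):
        threshold = (lo + hi) / 2
        left = [t for x, t in zip(xs, targets) if x <= threshold]
        right = [t for x, t in zip(xs, targets) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        error = (sum((t - left_mean) ** 2 for t in left)
                 + sum((t - right_mean) ** 2 for t in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def stump_output(stump, x):
    """Evaluate a single stump at x."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted_stumps(rows, n_rounds=10, learning_rate=0.5):
    """Each round fits a stump to the current residuals and adds it to the ensemble."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    base = mean(ys)                      # initial weak prediction: the label average
    scores = [base] * len(rows)
    stumps, losses = [], []
    for _ in range(n_rounds):
        residuals = [y - s for y, s in zip(ys, scores)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_output(stump, x) for s, x in zip(scores, xs)]
        losses.append(mean((y - s) ** 2 for y, s in zip(ys, scores)))
    return base, learning_rate, stumps, losses


def predict_segment(model, x):
    """Classify by thresholding the boosted score at 0.5."""
    base, learning_rate, stumps, _ = model
    score = base + learning_rate * sum(stump_output(s, x) for s in stumps)
    return 1 if score >= 0.5 else 0
```

The `losses` list returned by `fit_boosted_stumps(PROSPECTS)` records the training error after each round, which makes the "improves over successive iterations" behaviour visible. On real data, the number of rounds and the learning rate are the usual levers for limiting overfitting.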
Result
Created customer segments; new prospects entering the CRM are sorted into a
segment, and marketing campaigns are targeted at each segment
Salespeople are better equipped with insights
Personalized News Feed: Clustering Problem
Scenario
A news feed engine publishes varied news content for users
Some level of categorization is done by humans
There is a need to personalize and recommend articles
Create a system to discover similar articles based on content
Data Available
Text content of the news articles
User’s reading history
Algorithm Chosen: K-Means Clustering
We are interested in sorting data points into a chosen number of clusters
Without labels there is no ground truth for verifying the 'correctness' of a cluster, so results must be
reviewed by a human
We expect the sorting to become more accurate with more data
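A minimal sketch of K-Means on article text, using only the standard library, might look like this. The headlines are invented, word counts stand in for whatever text features the real system used, and seeding the centroids with the first k vectors is an assumption made to keep the example deterministic (library implementations typically use randomised seeding).

```python
# Hypothetical article headlines, invented for illustration.
ARTICLES = [
    "stocks rally as markets rise",
    "team wins the football final",
    "markets fall as stocks slide",
    "football team signs new striker",
]


def vectorize(documents):
    """Turn each document into a word-count vector over a shared vocabulary."""
    vocabulary = sorted({word for doc in documents for word in doc.lower().split()})
    return [[doc.lower().split().count(word) for word in vocabulary] for doc in documents]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, k, max_iterations=100):
    """Alternate between assigning points to the nearest centroid and recomputing centroids."""
    # Assumption: the first k vectors seed the centroids (deterministic, not robust).
    centroids = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iterations):
        new_labels = [min(range(k), key=lambda c: squared_distance(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(column) / len(members) for column in zip(*members)]
    return labels
```

Running `k_means(vectorize(ARTICLES), 2)` groups the two market headlines together and the two football headlines together. The cluster numbers themselves carry no meaning, which is why a human still has to review the groups and attach a label to each.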
Result
Sorted articles into clusters, each nominally identified by a label
Conclusion
• The amount of data available to enterprises is exploding
• To remain competitive, enterprises will need to
master their data
• Machine learning provides a powerful framework for
extracting meaning and actions from data
Q&A
Thank You!
Visit us at: www.harbinger-systems.com
Write to us at: hsinfo@harbingergroup.com
Blog: blog.harbinger-systems.com
Twitter: twitter.com/HarbingerSys (@HarbingerSys)
Slideshare: slideshare.net/hsplmkting
Facebook: facebook.com/harbingersys
LinkedIn: linkedin.com/company/382306
Instagram: instagram.com/harbingersystems

