Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad
Campaign Performance
Presented by : Aishwarya Shetty
Introduction
 An advertising campaign is a set of advertisements that work together to promote a
product or service. An ad campaign is designed around a specific and unique theme to
create brand awareness about the company’s product or service.
 An advertising campaign can be a series of different individual ads or the same ad
across mediums used to create awareness and interest in a product or service.
 This is achieved through different forms of media, including radio, television, print
advertising, direct mail, or the internet.
 The objective of this project is to develop a robust supervised machine
learning model that accurately forecasts key performance indicators (KPIs) for
future product ad campaigns. The aim is to provide the company with
valuable insights that can inform strategic decision-making, optimize resource
allocation, and enhance overall marketing effectiveness.
Dataset Information
Here are the key details about the dataset used in this project:
 Our dataset has 731 entries and 11 columns, which are described below.
 There are three columns with float data types and eight with integer data types.
 The dataset has five categorical variables (limit_infor, campaign_type, campaign_level, product_level, and
resource_amount), whose values are encoded as single-digit integers. The target variable is
"orders".
limit_infor: limits or restrictions associated with the marketing campaign or product. Values: 0, 1.
campaign_type: type of marketing campaign, such as email, social media, or print advertising. Values: 0, 1, 2, 3, 4, 5, 6.
campaign_level: level or scale of the marketing campaign, for example national, regional, or local. Values: 0, 1.
product_level: level or category of the product being marketed, such as high-end, mid-range, or budget. Values: 1, 2, 3.
resource_amount: resources (e.g., budget, personnel, or materials) allocated for the marketing campaign. Values: 1, 2, 3, 4, 5, 6, 7, 8, 9.
email_rate: email delivery rate or open rate.
price: selling price of the product.
discount_rate: discounts or promotional offers associated with the product.
hour_resources: the number of labor hours or human resources dedicated to the marketing campaign or product sales efforts.
campaign_fee: fees or costs associated with running the marketing campaign.
orders: number of orders or sales generated for the product during the marketing campaign.
Exploratory Data Analysis (EDA)
 EDA provides a better understanding of the dataset's variables and
the relationships between them.
 The dataset had no duplicates and 2 missing values in "price", which is 0.27% of
the 731 entries. The rows with missing values were therefore dropped.
 When the relationship between the numeric variables and "orders" was examined, campaign_fee
showed one outlier, which was removed to give a cleaner dataset.
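The missing-value and duplicate checks above can be sketched in plain Python. This is a minimal illustration, not the author's code: only the figures 731 and 2 come from the slides, the small `rows` list is made-up data, and `drop_missing` and `drop_duplicates` are hypothetical helper names. Outlier removal is left out because the slides do not state the criterion that was used.

```python
# Share of rows affected by missing "price" values, using the slide's figures.
total_rows = 731          # entries reported for the dataset
rows_missing_price = 2    # missing values reported in "price"
missing_pct = rows_missing_price / total_rows * 100
print(f"{missing_pct:.2f}% of rows have a missing price")

# Hypothetical miniature dataset (made-up values) to show the cleaning logic.
rows = [
    {"price": 10.5, "campaign_fee": 300, "orders": 12},
    {"price": None, "campaign_fee": 280, "orders": 11},
    {"price": 9.0, "campaign_fee": 310, "orders": 13},
    {"price": 9.0, "campaign_fee": 310, "orders": 13},  # exact duplicate
]

def drop_missing(records, column):
    """Keep only the records where `column` has a value."""
    return [r for r in records if r[column] is not None]

def drop_duplicates(records):
    """Remove exact duplicate records, keeping the first occurrence."""
    seen, unique = set(), []
    for r in records:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

cleaned = drop_duplicates(drop_missing(rows, "price"))
print(len(cleaned), "rows remain after cleaning")
```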
Exploratory Data Analysis (EDA)
 The correlation coefficients show that 'discount_rate' (0.232), 'email_rate'
(0.628), 'hour_resources' (0.664), and 'campaign_fee' (0.929) are positively correlated
with 'orders', while 'price' (-0.103) has a weak negative correlation with 'orders'.
 The ANOVA results provide insights into the relationship between each categorical
variable and the numerical variable 'orders'. 'product_level' and 'resource_amount'
appear to have a significant relationship with 'orders', while the other categorical
variables do not.
 To ensure consistent scales for numerical features, MinMax Scaler was
employed during preprocessing.
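For reference, the Pearson correlation coefficient and min-max scaling mentioned above can be written out directly. This is a minimal sketch in plain Python: the sample values are invented for illustration, and `pearson` and `min_max_scale` are hypothetical helper names rather than functions from the project.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to spread out
    return [(v - lo) / (hi - lo) for v in values]

# Invented sample values, for illustration only.
campaign_fee = [100, 200, 300, 400]
orders = [11, 19, 32, 41]

r = pearson(campaign_fee, orders)
scaled_fee = min_max_scale(campaign_fee)
print(round(r, 3), scaled_fee)
```

A coefficient near +1 or -1 indicates a strong linear association, while min-max scaling leaves the ordering of values unchanged and only maps them onto the range 0 to 1.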
Visualizations
• The independent variables "campaign_fee", "hour_resources" and "email_rate" each have a linear relationship
with the target variable "orders".
Visualizations
• There is a non-linear relationship between
price and the number of orders.
• The 'discount_rate' has a positive
relationship with 'orders'.
Splitting the data into X and y
 In this step, the dataset was divided into two parts: X and y.
 X contains all the independent variables, which are the features used to
make predictions.
 y is the dependent (target) variable, which is the outcome
we want to predict.
Train-Test Split
 The dataset was split into training and testing sets.
 An 80:20 ratio was used (test size of 0.2), with 80% of the data allocated to training and 20% to testing.
 A random state of 42 was specified so that the split is reproducible across
different runs.
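The split described on this slide can be sketched as follows. The slides do not show the code or name the library used, so this is an illustration in plain Python under stated assumptions: the helper `split_indices` and the row count of 728 are hypothetical, and a different shuffling routine would select different rows even with the same seed.

```python
import math
import random

def split_indices(n_rows, test_size=0.2, random_state=42):
    """Shuffle row indices with a fixed seed and split them into train and test."""
    indices = list(range(n_rows))
    random.Random(random_state).shuffle(indices)  # fixed seed -> same split every run
    n_test = math.ceil(n_rows * test_size)        # rounding up is a choice made here
    return indices[n_test:], indices[:n_test]

n_rows = 728  # hypothetical row count after cleaning
train_idx, test_idx = split_indices(n_rows)
print(len(train_idx), len(test_idx))
```

Fixing the seed means that repeated runs evaluate every model on the same held-out rows, which keeps the RMSE comparisons between models fair.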
Model Selection
Predicting product ad campaign performance is a regression problem. Hence, the
following models were used:
 Linear Regression is best for simple, linear relationships and offers high
interpretability.
 Support vector machine is versatile for both linear and non-linear relationships but
can be computationally expensive.
 Random Forest is powerful for complex, non-linear relationships and provides robust
performance but is less interpretable and more computationally intensive.
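To illustrate why linear regression is considered interpretable, here is a one-feature least-squares fit written out in plain Python. The data are invented and `fit_line` is a hypothetical helper. The project's actual model used several features and its code is not shown in the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with one feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented, already-scaled values for illustration only.
campaign_fee = [0.0, 0.25, 0.5, 0.75, 1.0]
orders = [0.1, 0.3, 0.5, 0.7, 0.9]

slope, intercept = fit_line(campaign_fee, orders)
print(f"orders ~ {slope:.2f} * campaign_fee + {intercept:.2f}")
```

The fitted slope reads directly as the change in predicted orders per unit change in the feature, which is the kind of interpretability that support vector machines and random forests do not offer as readily.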
Predictions:
Model                    RMSE (train)   RMSE (test)   Difference (train vs. test)
Linear Regression        0.04030        0.04055       0.00025
Support Vector Machine   0.04123        0.04047       0.00076
Observation: Linear regression shows slightly better results than SVM, with a smaller gap between train and test RMSE.
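RMSE and the train-test gap reported above follow directly from their definitions. A minimal sketch: `rmse` and `rmse_gap` are hypothetical helper names, and the inputs to `rmse_gap` are the RMSE values reported on the slide.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def rmse_gap(train_rmse, test_rmse):
    """Absolute difference between train and test RMSE, rounded to 5 decimals."""
    return round(abs(test_rmse - train_rmse), 5)

# RMSE values as reported on the slide.
linear_gap = rmse_gap(0.04030, 0.04055)
svm_gap = rmse_gap(0.04123, 0.04047)
print(linear_gap, svm_gap)
```

A small gap suggests that a model performs about as well on unseen data as on the data it was trained on.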
Predictions:
Random Forest Regression
                 RMSE (train)   RMSE (test)   Difference (train vs. test)
Before tuning    0.0211         0.0445        0.0233
After tuning     0.0312         0.0455        0.0233
Observation: There is no significant improvement in RMSE values after
hyperparameter tuning.
Feature Importance (Random Forest
Regression)
• 'campaign_fee' has the highest importance, 'hour_resources' has moderate importance, and 'price' has
minimal importance compared to the other two.
• The remaining features (email_rate, discount_rate, campaign_type, resource_amount, product_level,
campaign_level, limit_infor) have negligible importance in the model.
Conclusions
 The analysis and predictions provide valuable insights that can significantly enhance the ad
campaign performance for the company.
 Campaign Fee and Hour Resources: Increasing campaign fees and allocating more
resources positively influence the number of orders, suggesting that investments in these
areas are likely to yield higher returns.
 Pricing Strategy: Higher prices tend to reduce the number of orders, although the correlation is weak (-0.103). Maintaining
competitive, low prices can therefore attract more customers and boost sales.
 Discount Rates: While higher discount rates can slightly increase the number of orders, their
impact is minimal. This indicates that focusing primarily on pricing and resource allocation
may be more effective than relying heavily on discounts.
 Model Performance: Linear regression outperforms SVM and random forest regression due
to the linear relationship between features and the target variable. This finding underscores
the importance of using a simpler, well-suited model to avoid overfitting and ensure accurate
predictions.
By leveraging these insights, the company can strategically allocate resources, optimize pricing,
and fine-tune their ad campaigns to maximize effectiveness and improve the overall return on
investment (ROI).
