Project Presentation - Machine
Learning Foundations For
Product Managers Course
– Amit J Bhattacharyya
Project Objective & Input Data
Characteristics
Objective is to build a model to predict the electrical energy output of a
Combined Cycle Power Plant, which uses a combination of gas turbines, steam
turbines, and heat recovery steam generators to generate power.
Input Data:
• A set of 9568 hourly average data points from sensors at the power plant,
which can be used to build the model.
• The sensor readings provide data for the following factors:
o Ambient Temperature (AT)
o Ambient Pressure (AP)
o Relative Humidity (RH)
o Exhaust Vacuum (V)
o Net hourly electrical energy output (PE) (Prediction Target)
Decisions – Approach, Dataset Divisions.
Model Finalizations (Feature Groupings)
• The dataset provides clearly defined independent variables (AT, AP, RH, V) and one dependent (output) variable, PE, the energy
output of the combined cycle power plant, so Multiple Linear Regression was chosen as the approach for building the model.
• There are 9568 observations of the 4 features: Ambient Temperature (AT), Ambient Pressure (AP), Relative Humidity (RH) and
Exhaust Vacuum (V). The last 20% (1913 rows) was reserved up front as the final Test Set. The remaining 7655 rows were divided
into 5 lots of 1531 rows each.
• The first model (MD1) includes all features (AT, AP, V, RH). The second model (MD2) was chosen from a functional angle: it
leaves out exhaust vacuum and considers only the atmospheric parameters of temperature, pressure and relative humidity. The
third model (MD3) uses just the AT and V features, selected from scatter plots on the first slice of data, which helped clarify
the relationship of each independent variable to the target variable (PE).
• Excel's Data Analysis Regression tool was run on the first 3 lots (4593 of the 7655 rows) to calculate the bias and
coefficients for the 3 models. The next 2 lots (3062 rows) were used as a validation set to compare the R² of the 3 models.
• SSE was calculated for the predicted values of the target variable from the 3 models on the Validation Set, and SST was also
calculated. SSE and SST were then used to calculate R² = 1 - SSE/SST for each of the 3 models, to see which model fits most
closely and best explains the variance. Model MD1 was found to be the best.
• On the final Test Set we calculated the Mean Square Error (MSE) and Mean Absolute Error (MAE) for all 3 models. The results
reinforced the selection of MD1 made on the validation set: MD1 had the lowest MSE and MAE.
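The dataset division described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the Excel workflow actually used; the function name `split_dataset` and its parameters are hypothetical, and it assumes rows are kept in their original order.

```python
def split_dataset(rows, test_fraction=0.2, n_lots=5, n_train_lots=3):
    """Split rows in order: the last test_fraction becomes the test set,
    the rest is cut into n_lots equal lots (train lots first, then validation).

    Assumption: any remainder rows that do not fill a whole lot are dropped.
    """
    n_test = int(len(rows) * test_fraction)      # 9568 * 0.2 -> 1913 rows
    cut = len(rows) - n_test
    working, test = rows[:cut], rows[cut:]       # 7655 working rows, 1913 test rows

    lot_size = len(working) // n_lots            # 7655 // 5 -> 1531 rows per lot
    lots = [working[i * lot_size:(i + 1) * lot_size] for i in range(n_lots)]

    train = [row for lot in lots[:n_train_lots] for row in lot]
    validation = [row for lot in lots[n_train_lots:] for row in lot]
    return train, validation, test


if __name__ == "__main__":
    # Row indices stand in for the 9568 sensor readings.
    train, validation, test = split_dataset(list(range(9568)))
    print(len(train), len(validation), len(test))
```

With 9568 rows this gives 4593 training rows, 3062 validation rows and 1913 test rows, matching the 3 + 2 lot split and the 20% hold-out.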
Training Set Analysis – Bias &
Coefficients calculation for Models
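The slide's own figures are not available in this text. As a rough stand-in, the sketch below shows how a bias and coefficients can be obtained by ordinary least squares, which is the fit that a regression tool such as Excel's performs. It is not the author's spreadsheet: the names `MODEL_FEATURES`, `select_features`, `fit_linear_regression` and `predict` are hypothetical, and the column keys AT, V, AP, RH are assumed from the feature list.

```python
# Feature groupings for the three models (column keys are assumptions).
MODEL_FEATURES = {
    "MD1": ["AT", "V", "AP", "RH"],
    "MD2": ["AT", "AP", "RH"],
    "MD3": ["AT", "V"],
}


def select_features(row, names):
    """Pick the named feature values out of a row given as a dict."""
    return [row[name] for name in names]


def fit_linear_regression(X, y):
    """Return [bias, w1, ..., wk] minimising squared error (normal equations)."""
    rows = [[1.0] + [float(v) for v in x] for x in X]   # leading 1.0 = bias column
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]


def predict(coeffs, x):
    """Apply bias + sum(w_i * x_i) to one feature vector."""
    return coeffs[0] + sum(w * v for w, v in zip(coeffs[1:], x))
```

Each model would be fitted on the training lots using only its own feature columns, for example `fit_linear_regression([select_features(r, MODEL_FEATURES["MD3"]) for r in train], targets)`, where `train` and `targets` are placeholders for the training rows and their PE values.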
Validation Set Runs and Analysis –
Calculation of R² for Model Selection
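As an illustration of the calculation named in this heading, R² can be computed from SSE and SST as below. This is a minimal sketch, not the author's spreadsheet. The function name `r_squared` is hypothetical, and it assumes SST is measured around the mean of the validation-set actuals, which the slides do not state.

```python
def r_squared(actual, predicted):
    """R^2 = 1 - SSE/SST for paired actual and predicted target values."""
    mean_actual = sum(actual) / len(actual)
    # SSE: squared error of the model's predictions.
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # SST: squared error of always predicting the mean (assumed baseline).
    sst = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - sse / sst
```

A model whose predictions match the actuals exactly scores 1.0, and a model no better than predicting the mean scores 0.0, so the model with the highest validation R² is the one preferred.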
Final Test Set Run and Vindication of
Model Chosen – MSE & MAE
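The two error measures named in this heading can be sketched as follows. The function names are hypothetical, and the code only illustrates the standard definitions; it does not reproduce the author's results.

```python
def mean_squared_error(actual, predicted):
    """MSE: average of squared differences between actual and predicted."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_error(actual, predicted):
    """MAE: average of absolute differences between actual and predicted."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

MSE penalises large misses more heavily than MAE does, so a model that comes out lowest on both, as MD1 did here, is ahead on typical errors as well as on outliers.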

