Fenghua Yang, P.E. BCEE
Senior Environment Research Scientist | MWRD
Outline
• Data-Driven ML vs. Mechanistic Models
• Case Studies Using Data-Driven ML Models
• Data Management and Security Considerations
• Challenges and Opportunities
Mechanistic vs. Data Driven Models in Digital Twin
Physical System/Process
Mechanistic Models
- Biochemical process modeling (GPS-X, BioWin, etc.)
- Hydraulic modeling (Visual Hydraulics, etc.)
Data Management
- Data storage, data integration, data analytics, etc.
Visualization
- Data & trends, KPIs, prediction, etc.
Data-Driven Models (AI/ML)
- Classical machine learning (RF, SVM, etc.)
- Neural networks (RNN, etc.)
Digital Twin
Plant
Mechanistic vs. Data Driven Models in Digital Twin
Mechanistic Models
- Biochemical process modeling (GPS-X, BioWin, etc.)
- Hydraulic modeling (Visual Hydraulics, etc.)
Data Management
- Data storage, data integration, data management
Visualization
- Data & trends, KPIs, prediction
Data-Driven Models (AI/ML)
- Data & trends
- Process optimization
- Prediction
Data + Knowledge
Data-Driven AI/ML in Wastewater Treatment
1. Advanced Control for Swing Zone Operation to Balance and Maximize Enhanced Biological Phosphorus Removal (EBPR) and Ammonia Removal - Proof-of-Concept Study (2018)
2. Optimize Disinfection using AI/ML – Project Planning
3. AI/ML in ABAC – Project Planning
4. Machine Learning Application for Headworks Odor and Corrosion Control – Full Scale Test and Implementation (2019-2022)
Case Study 1 - Advanced Control for Swing Zone Operation (Proof-of-Concept)
• Teamed up with inCTRL Solutions
• Project goal: advanced control for swing zone operation to balance and maximize BioP and ammonia removal
• Data: 2.5 years of routine data: online instrument data (15-min or 1-hr interval) and lab data (daily)
• Method: multivariate statistical methods were used to develop a data-driven predictive model (soft sensor)
• Status: proof of concept
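The soft-sensor idea above can be sketched in a few lines. This is a minimal illustration only: the slides do not say which multivariate statistical method was used, so ordinary least squares stands in for it here, and the function names and the choice of inputs (for example two online signals predicting a daily lab value) are assumptions.

```python
def fit_linear_soft_sensor(X, y):
    """Least-squares fit of y ~ b0 + b1*x1 + ... + bk*xk (normal equations).

    X: list of feature tuples (e.g. online instrument readings).
    y: list of target values (e.g. a lab measurement).
    Plain least squares is an assumption standing in for the unspecified
    multivariate method mentioned on the slide.
    """
    rows = [[1.0] + list(r) for r in X]          # prepend intercept column
    n = len(rows[0])
    # Build A = X^T X and b = X^T y
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for k in range(n):
            if k != col:
                f = A[k][col] / A[col][col]
                A[k] = [a - f * c for a, c in zip(A[k], A[col])]
                b[k] -= f * b[col]
    return [b[i] / A[i][i] for i in range(n)]


def predict(coeffs, x):
    """Soft-sensor estimate for one set of feature values."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], x))
```

Once fitted on historical paired data, `predict` can be called at the online-instrument interval to estimate a value that the lab reports only daily.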
Case Study 1 - Multivariate Statistical Methods
Case Study 2 – Optimizing Disinfection using AI/ML Models (Planning)
• Current challenges:
• Flow can vary from 20 to 120 mgd
• Flow monitoring is at the headworks; ‘real-time’ flow is difficult to utilize
• Residual analyzers/testing method limitations
• Chlorine analyzer detection limit of 0.02 mg/L and no in-stream fecal monitoring
DISINFECTION AT KIRIE WRP
• 270k gallons of NaOCl and 16k gallons of sodium bisulfite used annually
POTENTIAL EXPANSION TO OTHER DISTRICT WRPS
• District uses approx. 1.8 MG of sodium hypochlorite ($1.3M) per year
• Potential Benefits:
IWS Potential Implementations in WRP – Optimization of Ammonia-Based Aeration Control (ABAC)
Current feedback cascade ABAC at Stickney provides potential air savings of 36%
• Effluent ammonia spikes were observed due to the delay and limitation of the air supply system – risk of permit violation
• Significant amount of time for sensor maintenance
• Significant amount of time for data QA/QC
• ABAC operation is interrupted when sensors are out of service
Benefits of adding ML to the feedback cascade ABAC
• Add data-based feedforward control to reduce effluent ammonia spikes – performance improvements
• Soft sensor to reduce sensor maintenance, reduce time for data QA/QC, and provide continued ABAC control when physical sensors are out of service
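A minimal sketch of how a data-based feedforward term might be added to a feedback loop follows. It is not the Stickney control logic: the PI form, the gains, the air-flow limits, and the idea of feeding in a predicted ammonia load from an ML model are all assumptions chosen for illustration.

```python
def abac_air_setpoint(nh4_effluent, nh4_setpoint, predicted_nh4_load,
                      integral, dt_min,
                      kp=50.0, ki=2.0, kff=0.8,
                      air_min=500.0, air_max=5000.0):
    """Return (air_flow_setpoint, updated_integral).

    Feedback: PI action on the measured effluent ammonia error.
    Feedforward: proportional to a predicted ammonia load, so air is
    raised before the spike reaches the effluent sensor.
    All gains and limits are hypothetical placeholder values.
    """
    error = nh4_effluent - nh4_setpoint
    new_integral = integral + error * dt_min
    feedback = kp * error + ki * new_integral
    feedforward = kff * predicted_nh4_load
    raw = feedback + feedforward
    air = min(max(raw, air_min), air_max)
    if air != raw:
        # Simple anti-windup: do not integrate while the output is saturated
        new_integral = integral
    return air, new_integral
```

The feedforward term acts on the prediction, so it can respond before the delay of the air supply system shows up in the effluent measurement; the feedback term still corrects any prediction error.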
IWS Potential Implementations – Soft Sensors
• Physical sensor/soft sensor cross validation
• Provide automatic fault detection of physical sensors for outliers and erroneous data (for sensors out in the field, or sensors that tend to drift)
• Fill in missing or bad data, and serve as a backup signal for control purposes when physical sensors go offline
• Identify changes in process or operational issues
• Estimate parameters for which online instruments are currently unavailable or costly to buy and maintain (VFA, Ortho-P, etc.)
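The cross-validation and backup-signal uses listed above could look roughly like the sketch below. The function name, the fixed deviation threshold, and the use of `None` for an offline sensor are assumptions made for illustration.

```python
def validate_signal(physical, soft, max_abs_diff):
    """Cross-check a physical sensor against a soft sensor, sample by sample.

    physical: list of readings, with None where the sensor is offline.
    soft:     list of soft-sensor estimates for the same timestamps.
    Returns a list of (value_to_use, source) tuples. The physical reading
    is used unless it is missing or deviates from the soft sensor by more
    than max_abs_diff (a hypothetical, process-specific threshold).
    """
    result = []
    for p, s in zip(physical, soft):
        if p is None or abs(p - s) > max_abs_diff:
            result.append((s, "soft"))       # fault or gap: fall back
        else:
            result.append((p, "physical"))
    return result
```

A run of consecutive "soft" results on the same sensor is one simple signal that the physical instrument may be drifting and needs maintenance.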
IWS potential Implementations in Wastewater Industry –
Process optimizer
• Can be implemented in any process that has sufficient data and
needs operational control (screen cleaning cycle, RAS control, WAS
control, aeration control, chemical dose, filter backwash, UV
disinfection, polymer dose…)
• Use domain knowledge to identify and prioritize implementation to
achieve optimal chemical/energy savings and performance
improvements
Summary
• When should I consider ML
• When a mathematical (mechanistic) model is not available and you have sufficient data
• When you want to do prediction,
process optimization, real time decision
making
• What expertise is needed
• Process engineer/operation staff
• Data analytics
[Figure: Modeling Unit Process. Data streams around a unit process: input, output, energy, chemicals, and others.]
Case Study - Background
• NaOCl dosing for headworks odor/corrosion control
• EBPR implementation in 2015
• Suspension of NaOCl dosing
• Resume NaOCl dosing? How?
• 87% overdosing, 5% underdosing
• Teamed up with data analytics from ISU for the 2019 IWS Challenge
Using ORP for Controlling NaOCl dosing?
[Figure: time series of H2S (ppm) and ORP (mV), with a 0 mV reference line, 3/7/2019 to 8/22/2019]
85% overdosing 3% underdosing
Using Online H2S Meter for Controlling NaOCl dosing?
• Permanent online H2S meter to provide a 4-20 mA signal to control the dosing
• Target H2S in flow distribution box headspace < 5ppm
• Typical feedback control
• Concerns
• Online H2S sensor reliability
• Impact to VFA cannot be evaluated
• Potential overdosing – Impact downstream treatment (nitrification and bioP)
• Reactive response – lag time
• No prediction
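For readers unfamiliar with 4-20 mA loops, the signal mentioned above maps linearly onto the sensor's measuring range. The sketch below assumes a hypothetical 0 to 50 ppm sensor span; the actual span of the meter considered for Kirie is not given in the slides.

```python
def ma_to_ppm(current_ma, span_ppm=50.0):
    """Convert a 4-20 mA loop current to an H2S reading in ppm.

    4 mA corresponds to 0 ppm and 20 mA to span_ppm. The 50 ppm default
    span is an assumed value for illustration. Currents outside 4-20 mA
    usually indicate a loop or sensor fault, so they raise an error.
    """
    if not 4.0 <= current_ma <= 20.0:
        raise ValueError(f"loop current {current_ma} mA is outside 4-20 mA")
    return (current_ma - 4.0) / 16.0 * span_ppm
```

With this assumed span, the 5 ppm headspace target sits at about 5.6 mA, near the bottom of the signal range, which is one reason sensor reliability and resolution matter for a feedback loop built on it.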
Some Facts of using NaOCl to Remove H2S
Fundamentals
• H2S (water phase) level is impacted by many factors: wastewater characteristics, ORP, other organics, etc.
• NaOCl reacts readily with H2S, but also reacts with any other oxidizable material, including VFAs. Reaction rates depend on the chemical energy level of each reaction and other factors (pH, temperature, concentration, etc.) and are hard to determine
• H2S (air phase) is problematic. It is impacted by many factors: H2S level in the water phase, pH, temperature, turbulence, etc.
• Lack of mechanistic modeling approaches
• Excess chlorine is often used to force the reaction for H2S (water phase) removal.
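To give a sense of scale for "excess chlorine", the theoretical NaOCl demand can be bracketed from textbook stoichiometry. This is an illustration, not part of the slides: it assumes the two commonly cited idealized reactions, H2S + NaOCl → S + NaCl + H2O (1:1) and H2S + 4 NaOCl → H2SO4 + 4 NaCl (4:1), and ignores the competing oxidant demand that the bullets above describe.

```python
# Molar masses in g/mol from standard atomic weights (rounded)
M_H2S = 2 * 1.008 + 32.06            # hydrogen sulfide
M_NAOCL = 22.990 + 15.999 + 35.45    # sodium hypochlorite


def naocl_demand_mg_per_l(h2s_mg_per_l, mol_ratio):
    """Theoretical NaOCl demand (mg/L) for a dissolved H2S concentration.

    mol_ratio is moles of NaOCl per mole of H2S: 1 for oxidation to
    elemental sulfur, 4 for oxidation to sulfate. Real demand is higher
    because NaOCl also reacts with other oxidizable material.
    """
    return h2s_mg_per_l * mol_ratio * M_NAOCL / M_H2S
```

Under these assumptions, 1 mg/L of H2S needs roughly 2.2 to 8.7 mg/L of NaOCl depending on the end product, before any allowance for VFAs and other organics.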
Module 1
• Inputs: water quality analysis data (17 years of influent historical data: BOD, ammonia, TKN, TP, SO4, TSS, TS) and precipitation data
• Processing: data QA/QC and AI models
• Output: predicted water quality analysis data
Module 2
• Inputs: predicted water quality analysis data and online instrument data (6 months of ORP, temperature, pH, flow, TARP elevation, and CUP pumping)
• Processing: data QA/QC and AI models
• Outputs: predicted H2S class and predicted VFAs class
Module 3
• Inputs: predicted H2S and VFAs classes, wastewater flow, target H2S, and target VFAs (10 weeks of NaOCl dose-response data)
• Processing: data QA/QC and AI models
• Output: predicted optimal NaOCl dosing
Module 1
• Module goal: predict influent wastewater characteristics to use in Module 2 (TS, SS, TP, NH3-N, SO4, BOD5, Org-N, and TKN)
• Available dataset: 17 years of influent historical data, which had good quality
• Feature variables: TS, SS, TP, NH3-N, SO4, BOD5, Org-N, TKN, and precipitation
• Data preprocessing method: data filling
• Models tested: RNN with LSTM, ARIMA, RF, XGBoost, SVM
• Model chosen: regression using deep learning model (RNN with LSTM)
Module 2
• Module goal: predict H2S and VFAs to use in Module 3
• Available dataset: six months of online instrument data, which were insufficient and imbalanced
• Feature variables: Model 1 output data (predicted TS, SS, TP, NH3-N, SO4, BOD5, Org-N, and TKN); online instrument data (flow, ORP, pH, wastewater temperature, tunnel pumping, and tunnel elevation)
• Data preprocessing method: quantization, oversampling
• Models tested: RF, SVM
• Model chosen: classification using classical machine learning model. RF was the best at predicting VFAs while SVM was the best at predicting …
Module 3
• Module goal: predict chemical dosage of sodium hypochlorite
• Available dataset: 10 weeks of dose-response data, which were insufficient and lacking in variation
• Feature variables: flow, Model 2 output (predicted H2S and VFAs), target H2S, and target VFAs
• Data preprocessing method: quantization, oversampling, data filling
• Models tested: RF, SVM
• Model chosen: classification using classical machine learning model (RF)
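The quantization and oversampling steps named in the table can be sketched as follows. The slides do not give the class boundaries or the oversampling technique, so the bin edges, class labels, and the use of simple random oversampling (rather than, say, a synthetic-sample method) are assumptions.

```python
import random


def quantize(value, edges, labels):
    """Map a continuous value to a class label.

    edges:  ascending upper bounds, one fewer than labels.
    labels: class names; the last label covers values >= the last edge.
    """
    for edge, label in zip(edges, labels):
        if value < edge:
            return label
    return labels[-1]


def random_oversample(samples, labels, seed=0):
    """Balance classes by resampling minority classes with replacement.

    Random oversampling is an assumed stand-in for the unspecified
    oversampling method in the slides.
    """
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels
```

Quantization turns the regression targets (H2S, VFAs, dose) into classes that an RF or SVM classifier can learn from a small dataset, and oversampling keeps the rare high-H2S cases from being swamped by the common low-H2S ones.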
Module 1 Selection:
• RNN (LSTM)
• RF
• ARIMA
[Figure: MAE of RNN, ARIMA, and RF for NH3, BOD5, TS, Org-N, TP, SS, TKN, and SO4]
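The MAE comparison in the figure rests on a simple metric, sketched here together with a helper that picks the model with the lowest error. The helper name and the dictionary layout are assumptions for illustration.

```python
def mean_absolute_error(observed, predicted):
    """Mean absolute error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def best_model(observed, predictions_by_model):
    """Return the name of the model with the lowest MAE.

    predictions_by_model: dict mapping a model name (e.g. "RNN",
    "ARIMA", "RF") to its list of predictions on the same test set.
    """
    return min(predictions_by_model,
               key=lambda name: mean_absolute_error(
                   observed, predictions_by_model[name]))
```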
Module 2 Selection:
• RF Classifier
• SVM Classifier
Module 3 Selection:
• RF Classifier
• SVM Classifier
Accuracy in testing set (%), by output, method, and data interval:
• H2S, RF, 15-minute: 87.6
• H2S, SVM, 15-minute: 86.3
• H2S, RF, daily: 85.7
• H2S, SVM, daily: 97.6
• VFAs, RF, daily: 93.4
• VFAs, SVM, daily: 91.0
RF
SVM 68.75%
89.75%
Performance: Kirie WRP Treatment Optimization
• 1,000+ predictions of influent
water quality since August,
2021
• Augment LIMS dataset
• Extrapolation up to 7 days
• Data Health Feedback
• H2S, VFA prediction
• NaOCl recommendation
[Figure: NH3 predicted vs. observed concentration (mg/L)]
Data Preparation
From IWA Meta Data Management Project.
RAW DATA → DATA FIT FOR PURPOSE
INITIAL REVIEW
• DATA VISUALIZATION: time series, histograms, scatterplots…
• DATA GROUPING: dry/wet weather, cyclic patterns...
• DESCRIPTIVE STATISTICS: mean, median, min, max, standard deviation, skewness coefficient, correlations among variables...
ERROR IDENTIFICATION AND CORRECTION
• SANITY CHECKS
• OUTLIERS
• MASS BALANCES
DATA PRE-PROCESSING STEPS SELECTION AND APPLICATION
• TRANSFORMATION
• SCALING
• MISSING DATA IMPUTATION
• FILTERING AND SMOOTHING
• FEATURE GENERATION
• DATA STRUCTURING
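A few of the steps above (outlier identification, missing data imputation, scaling) can be sketched with the standard library. The specific techniques shown, a z-score rule, linear interpolation, and min-max scaling, are common choices picked for illustration and are not stated in the slides.

```python
import statistics


def flag_outliers(values, z_max=3.0):
    """Flag values whose z-score exceeds z_max; None entries are not flagged."""
    present = [v for v in values if v is not None]
    mean = statistics.mean(present)
    sd = statistics.stdev(present)
    return [v is not None and sd > 0 and abs(v - mean) / sd > z_max
            for v in values]


def interpolate_missing(values):
    """Fill None gaps by linear interpolation between known neighbours.

    Gaps at the start or end are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    filled = list(values)
    for i, v in enumerate(values):
        if v is None:
            before = [k for k in known if k < i]
            after = [k for k in known if k > i]
            if before and after:
                a, b = before[-1], after[0]
                filled[i] = values[a] + (values[b] - values[a]) * (i - a) / (b - a)
            elif before:
                filled[i] = values[before[-1]]
            elif after:
                filled[i] = values[after[0]]
    return filled


def min_max_scale(values):
    """Scale values linearly to the range 0..1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]
```

In practice the order matters: flagged outliers are usually set to missing first, then imputed, and scaling is applied last so that a single bad reading does not distort the scale.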
Influent Water Quality
[Table: Predictions. Columns: datetime, Total Solids, SS, BOD5, NH3, Org-N, P-TOT, SO4, TKN. Rows: Jan 1 to Jan 6.]
• Typical challenge when working
within a real-time framework.
• The table represents a set of
timeseries for several water
quality parameters which are
required for Model 2 and Model 3
to function.
• A recursive function is used to make predictions at points in time when any water quality parameter is missing. The function takes the trained models from Module 1 and an input dataframe of WQ values, with a column for each of the models, along a datetime index at 24-hour resolution.
• The following animation
illustrates how the missing data is
resolved with neural networks.
Real-Time Data Availability Matrix: Set of timeseries for influent water quality predictions
Module 1 WQ Predictions
• 1st date with missing WQ
values:
• Jan 3
• Which signals missing
• TKN
• Predict TKN on Jan 3
Module 1 WQ Predictions
• 1st date with missing WQ
values:
• Jan 4
• Which signals missing
• Total Solids, TKN
• Action:
• Predict TKN
• Predict Total Solids
Module 1 WQ Predictions
• 1st date with missing WQ
values:
• Jan 5
• Which signals missing
• Total Solids
• SS
• P-TOT
• TKN
• Action:
• Predict Total Solids
• Predict SS
• Predict P-TOT
• Predict TKN
Module 1 WQ Predictions
• 1st date with missing WQ
values:
• Jan 6
• Which signals missing
• Total Solids
• SS
• BOD5
• NH3
• Org-N
• P-TOT
• TKN
• Action:
• Predict Total Solids
• Predict SS
• Predict BOD5
• Predict NH3
• Predict Org-N
• Predict P-TOT
• Predict TKN
Module 1 WQ Predictions
• Last Step
• No missing WQ
values
• Return the
dataframe
• Note:
• Error from preceding
predictions
accumulates in
subsequent
predictions
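The step-by-step walkthrough above can be written as a short recursive function. This is a sketch under assumptions, not the project's code: rows are plain dictionaries rather than a dataframe, each model is a hypothetical callable that predicts one parameter from the previous day's row (the real Module 1 models are LSTMs whose inputs are not described), and the first row is assumed to be complete.

```python
def fill_missing(rows, models, start=1):
    """Recursively fill missing water-quality values, day by day.

    rows:   list of dicts ordered by date, one dict per day, with None
            marking a missing value (row 0 is assumed complete).
    models: dict mapping a parameter name to a callable that takes the
            previous day's row and returns a prediction (hypothetical
            interface for illustration).
    Returns a new list; the input rows are not modified.
    """
    for i in range(start, len(rows)):
        missing = [name for name, value in rows[i].items() if value is None]
        if missing:
            previous = rows[i - 1]           # may itself hold predictions
            filled = dict(rows[i])
            for name in missing:
                filled[name] = models[name](previous)
            updated = rows[:i] + [filled] + rows[i + 1:]
            # Recurse from the next day, now that day i is complete
            return fill_missing(updated, models, start=i + 1)
    return rows                              # no missing values remain
```

Because each filled day becomes the input for the next, a prediction made on Jan 3 feeds the predictions for Jan 4 onward, which is exactly how the error accumulation noted above arises.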
Kirie WRP Treatment Optimization, Project Objective
Objective: To integrate existing algorithms developed by Illinois State University (ISU) faculty and
MWRD under previous work, into a process optimization platform for the Kirie WRP.
Goal: The goal is to optimize dosing of sodium hypochlorite (NaOCl) to prevent the formation of
hydrogen sulfide (H2S) without disrupting the formation of volatile fatty acids (VFAs).
Figure 1: Flow process of BLU-X Treatment, data-driven
process optimization.
Option 1: Signal Splitter
Option 2: DIODE
Integration of data into Real Time Decision Support System
Major Takeaways from Pilot Project:
• Direct, quantifiable recommendation for hypo dosing in real time through optimization; however, this warrants more LIMS data
• Improve data governance and data management
across the District
• Gained more knowledge of the feasibility of
integrating research NN models into real time
using Xylem’s approach
Future Additional Benefits:
• Early warnings of upcoming flow events, influent characteristics, H2S/VFAs levels, and hypo dosing recommendations to the Kirie WRP.
• Using this pilot study as a demonstration of the
possibilities for using AI in WRPs.
Challenges
• Data
• Integrate with traditional mechanistic model
• Transfer learning in AI/ML
Opportunities to learn and get involved
• LIFT Intelligent Water System (IWS) Challenge- 2022
• Registration deadline (Raise hand)
• Plan submission
• Final solution submission
• WEF RICE Group - Soft sensors and machine learning possibilities and
applications
• IWEA/IWS Committee
• Webinars
• Student IWS Challenge
• Funded AI/ML projects
• WRF
• DOE
Contact Information
Fenghua Yang, P.E. BCEE
yangf@mwrd.org
Acknowledgements
inCTRL Solutions
Mr. Ivan Miletic and Dr. Alex Rosenthal
Illinois State University
Dr. Xin Fang, Dr. Yongning Tang
MWRD
Ms. Thais Pluth, Mr. Matt Jurjovec, and Kirie WRP Operational Staff
Xylem teams
• Nick Mills, Adam Erispaha, Patrick Henthorn, Xylem’s Hydroinformatic Engineers
Q&A
8-Yang-Wastewater-Treatment-Optimization-Using-Data-Driven-AI-ML-Models.pdf