2. Outline
• Data-Driven ML vs. Mechanistic Models
• Case Studies Using Data-Driven ML Models
• Data Management and Security Considerations
• Challenges and Opportunities
3. Mechanistic vs. Data Driven Models in Digital Twin
Physical System/Process
Mechanistic Models
- Biochemical process modeling (GPS-X, BioWin, etc.)
- Hydraulic modeling (Visual Hydraulics, etc.)
Data Management
- Data storage, data integration, data analytics, etc.
Visualization
- Data & trends, KPIs, prediction, etc.
Data-Driven Models (AI/ML):
- Classical machine learning (RF, SVM, etc.)
- Neural networks (RNN, etc.)
Digital Twin
Plant
4. Mechanistic vs. Data Driven Models in Digital Twin
Mechanistic Models
- Biochemical process modeling (GPS-X, BioWin, etc.)
- Hydraulic modeling (Visual Hydraulics, etc.)
Data Management
- Data storage, data integration, data management
Visualization
- Data & trends, KPIs, prediction
Data Driven Models (AI/ML):
- Data & Trend
- Process optimization
- Prediction
Data + Knowledge
5. Data-Driven AI/ML in Wastewater Treatment
1. Advanced Control for Swing Zone Operation to Balance and
Maximize Enhanced Biological Phosphorus Removal (EBPR) and
Ammonia Removal - Proof-of-Concept Study (2018)
2. Optimize Disinfection using AI/ML – Project Planning
3. AI/ML in ABAC – Project Planning
4. Machine Learning Application for Headworks Odor and
Corrosion Control – Full Scale Test and Implementation (2019-2022)
6. Case Study 1 - Advanced Control for Swing Zone
Operation (Proof-of-Concept)
• Teamed up with inCTRL Solutions
• Project goal: Advanced control for swing zone operation to balance and maximize BioP and ammonia removal
• Data: 2.5 years of routine data, including online instrument data (15-min or 1-hr intervals) and lab data (daily)
• Method: Multivariate statistical methods were used to develop a data-driven predictive model (soft sensor)
• Status: Proof of Concept
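A minimal sketch of the soft-sensor idea, assuming ordinary least squares as a simple stand-in for the multivariate statistical methods (the slides do not name the specific method). The two online signals, the lab values, and all function names are hypothetical and synthetic; they only illustrate fitting a model on paired online/lab data and then estimating the lab value from online signals alone.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_soft_sensor(features: Sequence[Sequence[float]],
                    targets: Sequence[float]) -> List[float]:
    """Least-squares fit; returns [intercept, coefficient per feature]."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    # Normal equations: (X^T X) coef = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coef: Sequence[float], feature_row: Sequence[float]) -> float:
    """Estimate the lab value from online signals only."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], feature_row))


# Synthetic data (assumption): two online signals paired with a daily lab value.
online = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0)]
lab = [0.5, 0.8, 0.6, 1.0, 0.7]
coef = fit_soft_sensor(online, lab)
estimate = predict(coef, (2.5, 4.0))
```

In practice the 15-min or 1-hr online data would first have to be aligned with the daily lab samples before a fit like this is meaningful.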
8. Case Study 2 – Optimizing Disinfection using AI/ML
Models (Planning)
• Current Challenges:
• Flow can vary from 20 to 120 mgd
• Flow is monitored at the headworks, so ‘real-time’ flow is difficult to use
• Residual analyzers/testing method limitations
• Chlorine analyzer detection limit of 0.02 mg/L and no in-stream fecal monitoring
DISINFECTION AT KIRIE WRP
• 270k gallons of NaOCl and 16k gallons of sodium bisulfite used annually
POTENTIAL EXPANSION TO OTHER DISTRICT WRPS
• District uses approx. 1.8 MG of sodium hypochlorite ($1.3M) per year
• Potential Benefits:
9. IWS potential Implementations in WRP – Optimization
of Ammonia Based Aeration Control (ABAC)
Current feedback cascade ABAC at Stickney provides potential air savings of 36%
• Effluent ammonia spikes were observed due to the delay and limitation of the air supply system – risk of permit violation
• Significant amount of time needed for sensor maintenance
• Significant amount of time needed for data QA/QC
• ABAC operation is interrupted when sensors are out of service
Benefits of adding ML to the feedback cascade ABAC
• Add data-based feedforward control to reduce effluent ammonia spikes – performance improvements
• Soft sensor to reduce sensor maintenance, reduce time for data QA/QC, and provide continued ABAC control when physical sensors are out of service
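A rough sketch of how a data-based feedforward term could sit on top of the feedback loop. This is not the Stickney control logic: the proportional feedback, the gains, the air limits, and the idea that an ML model supplies `predicted_load` are all assumptions made for illustration.

```python
def feedback_air_setpoint(nh4_measured: float, nh4_target: float,
                          base_air: float, gain: float) -> float:
    """Proportional feedback: more air when effluent ammonia exceeds target."""
    return base_air + gain * (nh4_measured - nh4_target)


def feedforward_air_trim(predicted_load: float, typical_load: float,
                         ff_gain: float) -> float:
    """Trim based on a predicted ammonia load (hypothetical ML model output)."""
    return ff_gain * (predicted_load - typical_load)


def combined_air_setpoint(nh4_measured: float, nh4_target: float,
                          base_air: float, gain: float,
                          predicted_load: float, typical_load: float,
                          ff_gain: float,
                          air_min: float, air_max: float) -> float:
    """Feedback plus feedforward, clamped to the air supply limits."""
    raw = (feedback_air_setpoint(nh4_measured, nh4_target, base_air, gain)
           + feedforward_air_trim(predicted_load, typical_load, ff_gain))
    return min(max(raw, air_min), air_max)


# Illustrative numbers only (arbitrary units).
fb_only = feedback_air_setpoint(1.5, 1.0, base_air=1000.0, gain=200.0)
with_ff = combined_air_setpoint(1.5, 1.0, 1000.0, 200.0,
                                predicted_load=600.0, typical_load=500.0,
                                ff_gain=0.5, air_min=0.0, air_max=5000.0)
```

The feedforward term raises air ahead of a predicted load increase instead of waiting for the effluent ammonia signal to rise, which is the delay the spikes were attributed to.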
10. IWS potential Implementations – Soft Sensors
• Physical sensor/soft sensor cross validation
• Provide automatic fault detection of physical sensors for outliers and erroneous data (for sensors out in the field, or sensors that tend to drift)
• Fill in missing or bad data; can be used as a backup signal for control purposes when physical sensors go offline
• Identify changes in process or operational issues
• Estimate parameters for which an online instrument is currently not available or is costly to buy and maintain (VFA, Ortho-P, etc.)
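The cross-validation and backup-signal ideas above can be sketched as a small signal selector. The function name, the status labels, and the fixed tolerance are hypothetical; a real implementation would likely use a tolerance tuned per sensor and some persistence logic before declaring a fault.

```python
from typing import Optional, Tuple


def select_signal(physical: Optional[float], soft: float,
                  tolerance: float) -> Tuple[float, str]:
    """Return the value to use for control and a status label.

    physical: reading from the field sensor, or None when it is offline.
    soft: soft-sensor estimate of the same parameter.
    tolerance: largest accepted disagreement between the two (assumed).
    """
    if physical is None:
        return soft, "physical_missing"   # backup signal
    if abs(physical - soft) > tolerance:
        return soft, "physical_suspect"   # possible drift or outlier
    return physical, "ok"


# Illustrative readings only.
normal = select_signal(2.1, 2.0, tolerance=0.5)
drifted = select_signal(4.0, 2.0, tolerance=0.5)
offline = select_signal(None, 2.0, tolerance=0.5)
```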
11. IWS potential Implementations in Wastewater Industry –
Process optimizer
• Can be implemented in any process that has sufficient data and
needs operational control (screen cleaning cycle, RAS control, WAS
control, aeration control, chemical dose, filter backwash, UV
disinfection, polymer dose…)
• Use domain knowledge to identify and prioritize implementation to
achieve optimal chemical/energy savings and performance
improvements
12. Summary
• When should I consider ML?
• When a mathematical (mechanistic) model is not available and you have sufficient data
• When you want to do prediction, process optimization, or real-time decision making
• What expertise is needed?
• Process engineers/operations staff
• Data analytics
[Figure: Modeling Unit Process. A unit process with data collected on input, output, energy, chemicals, and others.]
13. Case Study - Background
NaOCl dosing for headworks odor/corrosion control
EBPR implementation in 2015
Suspension of NaOCl dosing
Resume NaOCl dosing? How?
87% overdosing, 5% underdosing
Teamed up with Data Analytics from ISU for the 2019 IWS Challenge
15. Using Online H2S Meter for Controlling NaOCl dosing?
• Permanent online H2S meter to provide a 4-20 mA signal to control the dosing
• Target H2S in flow distribution box headspace < 5 ppm
• Typical feedback control
• Concerns
• Online H2S sensor reliability
• Impact on VFAs cannot be evaluated
• Potential overdosing – impact on downstream treatment (nitrification and bioP)
• Responsiveness – lag time
• No prediction
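For reference, a typical feedback loop of this kind could look like the sketch below. The 5 ppm target comes from the slide; the 0-50 ppm sensor span, the proportional gain, and the function names are assumptions. Note that the loop only reacts to the current reading, which is the "no prediction" concern listed above.

```python
def ma_to_ppm(current_ma: float, span_ppm: float) -> float:
    """Linearly scale a 4-20 mA signal to 0..span_ppm."""
    if not 4.0 <= current_ma <= 20.0:
        raise ValueError("signal outside 4-20 mA, possible sensor fault")
    return (current_ma - 4.0) / 16.0 * span_ppm


def dose_increase(h2s_ppm: float, target_ppm: float, gain: float) -> float:
    """Proportional feedback: raise the dose only when H2S is above target."""
    return max(0.0, gain * (h2s_ppm - target_ppm))


# Assumed sensor span of 50 ppm; gain in arbitrary dose units per ppm.
reading = ma_to_ppm(12.0, span_ppm=50.0)
extra_dose = dose_increase(reading, target_ppm=5.0, gain=2.0)
```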
16. Some Facts of using NaOCl to Remove H2S
Fundamentals
• H2S (water phase) level is impacted by many factors: wastewater characteristics, ORP, other organics, etc.
• NaOCl reacts readily with H2S, but also reacts with any other oxidizable material, including VFAs. Reaction rates depend on the chemical energy level of each reaction and other factors (pH, temp, concentration, etc.) and are hard to determine
• H2S (air phase) is problematic. It is impacted by many factors: H2S level in the water phase, pH, temp, turbulence, etc.
• Lack of mechanistic modeling approaches
• Excess chlorine is often used to force the reaction for H2S (water phase) removal.
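17. [Figure: three-module modeling workflow]
Module 1
• Inputs: water quality analysis data, precipitation data
• Processing: data QA/QC and AI models
• Output: predicted water quality analysis data
• Dataset: 17 years of influent historical data: BOD, ammonia, TKN, TP, SO4, TSS, TS
Module 2
• Inputs: predicted water quality analysis data, online instrument data
• Processing: data QA/QC and AI models
• Output: predicted H2S class, predicted VFAs class
• Dataset: 6 months of online instrument data (ORP, temp, pH, flow, TARP elevation, and CUP pumping)
Module 3
• Inputs: predicted H2S class, predicted VFAs class, wastewater flow, target H2S, and target VFAs
• Processing: data QA/QC and AI models
• Output: predicted optimal NaOCl dosing
• Dataset: 10 weeks of NaOCl dose-response data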
18. Module 1, Module 2, and Module 3
Module 1
• Module goal: Predict influent wastewater characteristics to use in Module 2 (TS, SS, TP, NH3-N, SO4, BOD5, Org-N, and TKN)
• Available dataset: 17 years of influent historical data, which had good quality
• Feature variables: TS, SS, TP, NH3-N, SO4, BOD5, Org-N, TKN, and precipitation
• Data preprocessing method: data filling
• Models tested: RNN with LSTM, ARIMA, RF, XGBoost, SVM
• Model chosen: Regression using deep learning model (RNN with LSTM)
Module 2
• Module goal: Predict H2S and VFAs to use in Module 3
• Available dataset: Six months of online instrument data, which were insufficient and imbalanced
• Feature variables: Module 1 output data (predicted TS, SS, TP, NH3-N, SO4, BOD5, Org-N, and TKN); online instrument data (flow, ORP, pH, wastewater temperature, tunnel pumping, and tunnel elevation)
• Data preprocessing method: quantization, oversampling
• Models tested: RF, SVM
• Model chosen: Classification using classical machine learning model. RF was the best at predicting VFAs while SVM was the best at predicting H2S
Module 3
• Module goal: Predict chemical dosage of sodium hypochlorite
• Available dataset: 10 weeks of dose-response data, which were insufficient and lacking in variation
• Feature variables: flow, Module 2 output (predicted H2S and VFAs), target H2S, and target VFAs
• Data preprocessing method: quantization, oversampling, data filling
• Models tested: RF, SVM
• Model chosen: Classification using classical machine learning model (RF)
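The quantization and oversampling steps listed for Modules 2 and 3 can be illustrated as follows. This is a sketch under assumptions, not the project's preprocessing code: the bin edges (5 and 20 ppm), the feature values, and the choice of simple random oversampling are all hypothetical.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


def quantize(value: float, edges: Sequence[float]) -> int:
    """Turn a continuous value into a class index using ascending bin edges."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def oversample(rows: List[Tuple[float, ...]], labels: List[int],
               seed: int = 0) -> Tuple[List[Tuple[float, ...]], List[int]]:
    """Randomly duplicate minority-class rows until all classes are equal."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical H2S readings (ppm) quantized into low / medium / high classes.
h2s = [1.0, 2.5, 3.0, 30.0]
classes = [quantize(v, edges=[5.0, 20.0]) for v in h2s]   # [0, 0, 0, 2]
features = [(7.1,), (7.0,), (7.2,), (6.5,)]               # e.g. pH, made up
balanced_rows, balanced_labels = oversample(features, classes)
```

Quantization turns the regression problem into classification, and oversampling offsets the class imbalance noted for the six-month dataset; duplicated rows add no new information, so evaluation should use data held out before oversampling.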
20. Performance: Kirie WRP Treatment Optimization
• 1,000+ predictions of influent water quality since August 2021
• Augment LIMS dataset
• Extrapolation up to 7 days
• Data Health Feedback
• H2S, VFA prediction
• NaOCl recommendation
[Figure: NH3 predicted vs. observed concentration (mg/L)]
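A predicted-vs-observed comparison like the one plotted is often summarized with error metrics. The sketch below computes mean absolute error and root-mean-square error; the slides do not state which metrics were used, and the three NH3 values are made up purely to show the calculation.

```python
import math
from typing import Sequence


def mae(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Mean absolute error, in the same units as the data (mg/L here)."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error; penalizes large misses more than MAE."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


# Made-up NH3 values (mg/L), not project results.
observed_nh3 = [20.0, 22.0, 18.0]
predicted_nh3 = [21.0, 20.0, 18.0]
nh3_mae = mae(predicted_nh3, observed_nh3)
nh3_rmse = rmse(predicted_nh3, observed_nh3)
```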
21. Data Preparation
From IWA Meta Data Management Project.
RAW DATA
INITIAL REVIEW
• DATA VISUALIZATION: time series, histograms, scatterplots...
• DATA GROUPING: dry/wet weather, cyclic patterns...
• DESCRIPTIVE STATISTICS: mean, median, min, max, standard deviation, skewness coefficient, correlations among variables...
ERROR IDENTIFICATION AND CORRECTION
• SANITY CHECKS
• OUTLIERS
• MASS BALANCES
DATA PRE-PROCESSING STEPS SELECTION AND APPLICATION
• TRANSFORMATION
• SCALING
• MISSING DATA IMPUTATION
• FILTERING AND SMOOTHING
• FEATURE GENERATION
• DATA STRUCTURING
DATA FIT FOR PURPOSE
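A few of these steps can be sketched on a single signal: a sanity check against a physical range, missing data imputation, and scaling. The choice of forward fill and min-max scaling, the pH range, and the sample readings are assumptions for illustration; the slide does not prescribe specific techniques.

```python
from typing import List, Optional


def mask_out_of_range(values: List[Optional[float]],
                      low: float, high: float) -> List[Optional[float]]:
    """Sanity check: treat physically impossible readings as missing."""
    return [v if v is not None and low <= v <= high else None for v in values]


def forward_fill(values: List[Optional[float]]) -> List[Optional[float]]:
    """Imputation: carry the last valid reading forward.

    Leading gaps stay None because there is nothing to carry forward.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def min_max_scale(values: List[float]) -> List[float]:
    """Scaling: map values to the 0..1 range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up pH readings with one impossible value and one gap.
raw_ph = [7.0, 7.2, 19.0, None, 7.4]
checked = mask_out_of_range(raw_ph, low=0.0, high=14.0)
imputed = forward_fill(checked)
scaled = min_max_scale([v for v in imputed if v is not None])
```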
22. Influent Water Quality Predictions
[Table: data availability matrix. Columns: datetime, Total Solids, SS, BOD5, NH3, Org-N, P-TOT, SO4, TKN. Rows: Jan 1 to Jan 6.]
• Typical challenge when working within a real-time framework.
• The table represents a set of time series for several water quality parameters that are required for Module 2 and Module 3 to function.
• A recursive function is used to make predictions at points in time when data is missing for any water quality parameter. The function takes the trained models from Module 1 and an input dataframe of WQ values, with a column for each of the models, along a datetime index at 24-hour time resolution.
• The following animation illustrates how the missing data is resolved with neural networks.
Real-Time Data Availability Matrix: Set of timeseries for influent water quality predictions
23. Module 1 WQ Predictions
[Table: WQ availability matrix, Jan 1 to Jan 6, same columns as slide 22]
• 1st date with missing WQ values:
• Jan 3
• Which signals are missing:
• TKN
• Predict TKN on Jan 3
24. Module 1 WQ Predictions
[Table: WQ availability matrix, Jan 1 to Jan 6, same columns as slide 22]
• 1st date with missing WQ values:
• Jan 4
• Which signals are missing:
• Total Solids, TKN
• Action:
• Predict TKN
• Predict Total Solids
25. Module 1 WQ Predictions
[Table: WQ availability matrix, Jan 1 to Jan 6, same columns as slide 22]
• 1st date with missing WQ values:
• Jan 5
• Which signals are missing:
• Total Solids
• SS
• P-TOT
• TKN
• Action:
• Predict Total Solids
• Predict SS
• Predict P-TOT
• Predict TKN
26. Module 1 WQ Predictions
[Table: WQ availability matrix, Jan 1 to Jan 6, same columns as slide 22]
• 1st date with missing WQ values:
• Jan 6
• Which signals are missing:
• Total Solids
• SS
• BOD5
• NH3
• Org-N
• P-TOT
• TKN
• Action:
• Predict Total Solids
• Predict SS
• Predict BOD5
• Predict NH3
• Predict Org-N
• Predict P-TOT
• Predict TKN
27. Module 1 WQ Predictions
[Table: WQ availability matrix, Jan 1 to Jan 6, same columns as slide 22]
• Last step
• No missing WQ values
• Return the dataframe
• Note:
• Error from preceding predictions accumulates in subsequent predictions
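The walk-through above can be sketched as a recursive fill. This is an illustration under assumptions, not the project code: the table is a plain list of per-day dictionaries rather than a dataframe, the first day is assumed complete, and a simple persistence model (repeat the last value) stands in for the trained Module 1 neural networks.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[float]]
Model = Callable[[List[Row]], float]


def fill_missing(rows: List[Row], models: Dict[str, Model]) -> List[Row]:
    """Fill the first day with gaps, then recurse until no gaps remain."""
    for i, row in enumerate(rows):
        missing = [name for name, value in row.items() if value is None]
        if missing:
            break
    else:
        return rows  # last step: no missing values, return the table
    history = rows[:i]            # all earlier days are complete by now
    filled = dict(rows[i])        # copy so the input is not modified
    for name in missing:
        filled[name] = models[name](history)
    return fill_missing(rows[:i] + [filled] + rows[i + 1:], models)


def persistence(name: str) -> Model:
    """Stand-in model (assumption): predict the most recent value."""
    return lambda history: history[-1][name]


# Made-up values; None marks a parameter not yet available for that day.
table: List[Row] = [
    {"Total Solids": 800.0, "TKN": 30.0},   # Jan 1, complete
    {"Total Solids": 810.0, "TKN": None},   # Jan 2, TKN missing
    {"Total Solids": None, "TKN": None},    # Jan 3, both missing
]
models = {"Total Solids": persistence("Total Solids"),
          "TKN": persistence("TKN")}
completed = fill_missing(table, models)
```

Because each filled day becomes part of the history for the next one, a prediction made on one day feeds later predictions, which is how the error accumulation noted on the slide arises.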
28. Kirie WRP Treatment Optimization, Project Objective
Objective: Integrate existing algorithms, developed by Illinois State University (ISU) faculty and MWRD under previous work, into a process optimization platform for the Kirie WRP.
Goal: Optimize dosing of sodium hypochlorite (NaOCl) to prevent the formation of hydrogen sulfide (H2S) without disrupting the formation of volatile fatty acids (VFAs).
Figure 1: Flow process of BLU-X Treatment, data-driven
process optimization.
31. Major Takeaways from Pilot Project:
• Direct, quantifiable recommendation for hypo dosing in real time through optimization; however, this warrants more LIMS data
• Improve data governance and data management across the District
• Gained more knowledge of the feasibility of integrating research NN models into real time using Xylem’s approach
Future Additional Benefits:
• Early warnings of upcoming flow events, influent characteristics, H2S/VFA levels, and hypo dosing recommendations for the Kirie WRP.
• Using this pilot study as a demonstration of the
possibilities for using AI in WRPs.
33. Opportunities to learn and get involved
• LIFT Intelligent Water System (IWS) Challenge- 2022
• Registration deadline (Raise hand)
• Plan submission
• Final solution submission
• WEF RICE Group - Soft sensors and machine learning possibilities and
applications
• IWEA/IWS Committee
• Webinars
• Student IWS Challenge
• Funded AI/ML projects
• WRF
• DOE
Contact Information
Fenghua Yang, P.E. BCEE
yangf@mwrd.org
34. Acknowledgements
inCTRL Solutions
Mr. Ivan Miletic and Dr. Alex Rosenthal
Illinois State University
Dr. Xin Fang, Dr. Yongning Tang
MWRD
Ms. Thais Pluth, Mr. Matt Jurjovec, and Kirie WRP Operational Staff
Xylem teams
• Nick Mills, Adam Erispaha, Patrick Henthorn, Xylem’s Hydroinformatic Engineers
Q&A