Statistical Process
Monitoring using
Advanced Data-Driven
and Deep Learning
Approaches
Theory and Practical Applications
Fouzi Harrou
King Abdullah University of Science and Technology
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Thuwal, Saudi Arabia
Ying Sun
King Abdullah University of Science and Technology
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Thuwal, Saudi Arabia
Amanda S. Hering
Baylor University, Dept of Statistical Science
Waco, TX, United States
Muddu Madakyaru
Department of Chemical Engineering, Manipal Institute of Technology
Manipal Academy of Higher Education
Manipal, India
Abdelkader Dairi
University of Science and Technology of Oran-Mohamed Boudiaf
Computer Science Department, Signal, Image and Speech Laboratory
Oran, Algeria
Elsevier
Radarweg 29, PO Box 211, 1000 AE Amsterdam, Netherlands
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
Copyright © 2021 Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means,
electronic or mechanical, including photocopying, recording, or any information storage and
retrieval system, without permission in writing from the publisher. Details on how to seek
permission, further information about the Publisher’s permissions policies and our arrangements
with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency,
can be found at our website: www.elsevier.com/permissions.
This book and the individual contributions contained in it are protected under copyright by the
Publisher (other than as may be noted herein).
Notices
Knowledge and best practice in this field are constantly changing. As new research and
experience broaden our understanding, changes in research methods, professional practices, or
medical treatment may become necessary.
Practitioners and researchers must always rely on their own experience and knowledge in
evaluating and using any information, methods, compounds, or experiments described herein. In
using such information or methods they should be mindful of their own safety and the safety of
others, including parties for whom they have a professional responsibility.
To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors,
assume any liability for any injury and/or damage to persons or property as a matter of products
liability, negligence or otherwise, or from any use or operation of any methods, products,
instructions, or ideas contained in the material herein.
Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library
ISBN: 978-0-12-819365-5
For information on all Elsevier publications
visit our website at https://www.elsevier.com/books-and-journals
Publisher: Susan Dennis
Acquisitions Editor: Anita A Koch
Editorial Project Manager: Lena Sparks
Production Project Manager: Kumar Anbazhagan
Designer: Miles Hitchen
Typeset by VTeX
Contents
Preface ix
Acknowledgments xi
1. Introduction
1.1 Introduction 1
1.1.1 Motivation: why process monitoring 1
1.1.2 Types of faults 2
1.1.3 Process monitoring 4
1.1.4 Physical redundancy vs analytical redundancy 5
1.2 Process monitoring methods 6
1.2.1 Model-based methods 7
1.2.2 Knowledge-based methods 9
1.2.3 Data-based monitoring methods 9
1.3 Fault detection metrics 13
1.4 Conclusion 14
References 15
2. Linear latent variable regression (LVR)-based process
monitoring
2.1 Introduction 19
2.2 Development of linear LVR models 20
2.2.1 Full rank methods 21
2.2.2 Latent variable regression (LVR) models 22
2.3 Dynamic LVR models 30
2.4 Process monitoring methods 32
2.4.1 Univariate chart for process monitoring 32
2.4.2 Distribution-based process monitoring schemes 39
2.4.3 Multivariate process monitoring schemes with parametric
and nonparametric thresholds 44
2.5 Linear LVR-based process monitoring strategies 47
2.5.1 Conventional LVR monitoring statistics 47
2.5.2 Fault isolation 50
2.6 Case studies 53
2.6.1 Simulated example 53
2.6.2 Monitoring influent measurements at water resource
recovery facilities 55
2.7 Discussion 63
References 63
3. Fault isolation
3.1 Introduction 71
3.1.1 Pitfalls of standardizing data 72
3.1.2 Shortcomings of contribution plots/scores 77
3.2 Fault isolation 79
3.2.1 Variable thinning 79
3.2.2 Iterative traditional isolation 80
3.2.3 Variable selection methods 83
3.3 Fault classification 99
3.4 Fault isolation metrics 100
3.4.1 Fault isolation errors 101
3.4.2 Precision and recall 102
3.4.3 Phase I FI metrics 102
3.4.4 Discussion 103
3.5 Case studies 103
3.5.1 Retrospective fault isolation 104
3.5.2 Real-time fault isolation 108
3.6 Further reading 111
References 112
4. Nonlinear latent variable regression methods
4.1 Introduction 119
4.2 Limitations of linear LVR methods for process monitoring 121
4.3 Developing nonlinear LVR methods for process monitoring 123
4.3.1 Nonlinear partial least squares 123
4.3.2 ANFIS-PLS modeling framework 127
4.3.3 Kernel PCA 131
4.3.4 Kernel principal components analysis (KPCA) model 131
4.3.5 KPCA-based fault detection procedures 135
4.4 Case study: monitoring WWTP 138
4.4.1 Anomaly detection using KPCA-OCSVM method 139
4.5 Simulated synthetic data 142
4.5.1 Application of plug flow reactor 143
4.6 Discussion 149
References 151
5. Multiscale latent variable regression-based process
monitoring methods
5.1 Introduction 155
5.2 Theoretical background of wavelet-based data representation 158
5.2.1 Wavelet transform 159
5.2.2 Multiscale representation of data using wavelets 159
5.2.3 Advantages of multiscale representation 164
5.3 Multiscale filtering using wavelets 167
5.3.1 Single scale filter method 167
5.3.2 Multiscale filtering methods 168
5.3.3 Advantages of multiscale denoising 169
5.4 Wavelet-based multiscale univariate monitoring techniques 170
5.4.1 An illustrative example 172
5.5 Multiscale LVR modeling 176
5.5.1 Benefits of multiscale denoising in LVR modeling 176
5.6 Multiscale LVR modeling 177
5.7 Results and discussions 180
5.7.1 Application with synthetic data 180
5.7.2 Application of monitoring distillation column 183
5.8 Discussion 186
References 188
6. Unsupervised deep learning-based process monitoring
methods
6.1 Introduction 193
6.2 Clustering 195
6.2.1 Partition-based clustering techniques 196
6.2.2 Hierarchy-based clustering techniques 197
6.2.3 Density-based approach 198
6.2.4 Expectation maximization 201
6.3 One-class classification 202
6.3.1 One-class SVM 202
6.3.2 Support vector data description (SVDD) 203
6.4 Deep learning models 206
6.4.1 Autoencoders 206
6.4.2 Probabilistic models 210
6.4.3 Deep neural networks 213
6.4.4 Deep Boltzmann machine 215
6.5 Deep learning-based clustering schemes for process monitoring 217
6.6 Discussion 218
References 219
7. Unsupervised recurrent deep learning scheme for
process monitoring
7.1 Introduction 225
7.2 Recurrent neural networks approach 227
7.2.1 Basics of recurrent neural networks 227
7.2.2 Long short-term memory 229
7.2.3 Gated recurrent neural networks 234
7.3 Hybrid deep models 235
7.3.1 RNN-RBM 236
7.3.2 RNN-RBM method 237
7.3.3 LSTM-RBM model 238
7.3.4 LSTM-DBN 239
7.4 Recurrent deep learning-based process monitoring 241
7.4.1 Residuals-based process monitoring approaches 242
7.4.2 Recurrent deep learning-based clustering schemes for
process monitoring 243
7.5 Applications: monitoring influent conditions at WWTP 244
7.6 Discussion 250
References 251
8. Case studies
8.1 Introduction 255
8.2 Stereovision 258
8.2.1 Deep stacked autoencoder-based KNN approach 261
8.2.2 Data description 266
8.2.3 Results and discussion 266
8.2.4 Model trained using data with no obstacles 267
8.2.5 Evaluation of performance for busy scenes 269
8.2.6 Obstacle detection using the Bahnhof dataset 271
8.3 Detecting abnormal ozone measurements using deep learning 274
8.3.1 Introduction 274
8.3.2 Data description 276
8.3.3 Ozone monitoring based on deep learning approaches 278
8.3.4 Detection results 284
8.4 Monitoring of a wastewater treatment plant using deep learning 288
8.4.1 Introduction 288
8.4.2 Proposed DBN-based kNN, OCSVM, and k-means
algorithms 290
8.4.3 Real data application: monitoring a decentralized
wastewater treatment plant in Golden, CO, USA 291
8.4.4 Conclusion 297
References 297
9. Conclusion and further research directions
References 308
Index 311
Preface
Anomaly detection and isolation have a vital role in modern industrial processes
to enhance productivity, efficiency, and safety, as well as to avoid expensive
maintenance. Therefore, it is important to be able to detect and identify any
possible anomalies or failures in the system as early as possible. Generally,
anomalies in modern automatic processes are difficult to avoid and may result
in serious process degradations. The role of detection is to identify any anomalous
event and to indicate how far the system has departed from its nominal behavior.
Anomaly isolation then determines the probable source of the
detected anomaly. To illustrate, an accidental or even deliberate contamination
of a drinking water distribution network can lead to financial losses, as well as to
serious health risks. Therefore, early detection of anomalies is crucial not only to
maintain proper process operation but also for the sake of people’s health. Today,
engineered and environmental processes have become far more complex due to
advances in technology. Multiple key variables need to be monitored simulta-
neously, and data may have both temporal and spatial aspects. New features of
these processes require new and better statistical tools for process monitoring.
Early detection and isolation of potential faults in complex engineering
and environmental processes have proven to be particularly challenging. In the
absence of a physics-based process model, data-driven statistical techniques
for process monitoring have proved themselves in practice over the past four
decades. These approaches use information derived directly from input data and
require no explicit models for which development is usually costly or time-
consuming. This book is intended to report recent developments in statistical
process monitoring using advanced data-driven and deep learning techniques.
The book is divided into nine chapters, and they are grouped into two parts.
The objective of the first part is to tackle multivariate challenges in process
monitoring by merging the advantages of univariate and traditional multivariate
techniques to enhance their performance and widen their practical applicabil-
ity. The second part aims to merge the desirable properties of shallow learning
approaches, such as a one-class support vector machine, k-nearest neighbors,
and unsupervised deep learning approaches to develop more sophisticated and
efficient monitoring techniques. Throughout the book, the presented approaches
are demonstrated using experimental data from many processes including waste-
water treatment plants at KAUST and Golden, CO, USA, ozone air quality data,
and stereovision data for obstacle detection in driving environments. Thus, the
reader will find illustrative examples from a range of environmental and engi-
neering processes.
The book should be of interest to engineering and academic readers working in
process chemometrics and data analytics, process monitoring and control, and
applied statistics, as well as to data scientists and industrial statisticians. In fact, this book can be
assimilated by advanced undergraduates and graduate students having knowl-
edge of basic multivariate statistical analysis and machine learning.
Acknowledgments
Addressing anomaly detection and isolation is essential to promptly detect ab-
normalities and to help operators make better decisions about optimizing, taking
corrective actions, and maintaining downstream processes. This book
is primarily based on data-driven approaches for anomaly detection and
isolation. The reader of this book will gain an in-depth understanding of fault
detection and isolation in complex and multivariate systems, becoming familiar with
the most suitable data-driven techniques, including multivariate statistical
techniques and deep learning-based methods. It gives the reader several real en-
gineering and environmental applications to clearly show the implementation of
anomaly detection and isolation approaches.
Ying Sun and Fouzi Harrou would like to gratefully acknowledge the fi-
nancial support by funding from King Abdullah University of Science and
Technology (KAUST), Office of Sponsored Research (OSR) under Award No:
OSR-2019-CRG7-3800 and OSR-2015-CRG4-2582. They would also like to
express their sincere gratitude to the team of Publication Services and Re-
searcher Support at KAUST for their support. In addition, we would also like
to thank Professor Tzahi Cath of Colorado School of Mines who provided the
decentralized wastewater treatment data.
Amanda S. Hering would like to thank Professor Tzahi Cath of Colorado
School of Mines who has been instrumental in introducing her to fault iso-
lation problems and who has shared data from his facilities with her. She
would also like to thank her graduate students, Molly Klanderman and Kathryn
Newhart; their expertise and insight accumulated over the course of working
together for the past few years has been invaluable. Her work on this project
has been supported by King Abdullah University of Science and Technology
(KAUST) Office of Sponsored Research (OSR), Grant/Award Number: OSR-
2015-CRG4-2582; Partnerships for Innovation: Building Innovation Capacity,
National Science Foundation, Grant/Award Number: 1632227; the National
Science Foundation Engineering Research Center program under cooperative
agreement EEC-1028968 (ReNUWIt); and Baylor University through a research
leave sabbatical.
Muddu Madakyaru would like to thank the Manipal Institute of Technology,
Manipal Academy of Higher Education, Manipal, India, for continuous support
during the preparation of this book.
Finally, we would like to thank Lena Sparks, Author Service Manager, for
her continuous assistance during the preparation of this book.
Chapter 1
Introduction
1.1 Introduction
1.1.1 Motivation: why process monitoring
Recent decades have witnessed a huge growth in new technologies and advance-
ments in instrumentation, industrial systems, and environmental processes,
which are becoming increasingly complex. Diagnostic operation has become
an essential element of these processes and systems to ensure their operational
reliability and availability. In an environment where productivity and safety are
paramount, failing to detect anomalies in a process can lead to harmful effects
to a plant’s productivity, profitability, and safety. Several serious accidents have
happened in the past few decades in various industrial plants across the world,
including the Bhopal gas tragedy [1,2], the Piper Alpha explosion [3,4], the acci-
dents at the Mina al-Ahmadi Kuwait refinery [5], and the fires at two photovoltaic plants in
the US in 2009 and 2011 (a 383 kWp PV array in Bakersfield, CA, and a
1.208 MWp power plant in Mount Holly, NC, respectively) [6]. The Bhopal ac-
cident, also referred to as the Bhopal gas disaster, was a gas leak accident at the
Union Carbide pesticide plant in India in 1984 that resulted in over 3000 deaths
and left over 400,000 others in the local area around the plant gravely injured [1,2].
The explosion of the Piper Alpha oil production platform, which is located in
the North Sea and managed by Occidental Petroleum, caused 167 deaths and
a financial loss of around $3.4 billion [3,4]. In 2000, an explosion occurred in
the Mina Al-Ahmadi oil refinery in Kuwait, killing five people and causing seri-
ous damage to the plant. The explosion was caused by a defect in a condensate
line in the refinery. Nimmo [7] has estimated that the petrochemical industry in
the USA can avoid losing up to $20 billion per year if anomalies in inspected
processes could be discovered in time. In safety-critical systems such as nu-
clear reactors and aircrafts, undetected faults may lead to catastrophic accidents.
For example, the pilot of the American Airlines DC-10 that crashed at Chicago
O’Hare International Airport was notified of a fault only 15 seconds before the
accident happened, giving the pilot too little time to react; this crash could easily
have been avoided according to [8]. Recently, the Fukushima accident of 2011
in Japan highlighted the importance of developing accurate and efficient moni-
toring systems for nuclear plants. Essentially, monitoring of industrial processes
represents the backbone for ensuring the safe operation of these processes and
that they are always functioning properly.
1.1.2 Types of faults
Generally speaking, three main subsystems are merged to form a plant or sys-
tem: sensors, actuators, and the main process itself. These components
are permanently exposed to faults caused by many factors, such as aging, man-
ufacturing, and severe operating conditions. A fault or anomaly is a tolerable
deviation of a characteristic property of a variable from its acceptable behavior
that could lead to a failure in the system if it is not detected early enough so
that the necessary correction can be performed [9]. Conventionally, a fault, if
it is not detected in time, could progress to produce a failure or malfunction.
Note that there is a distinction between failure and malfunction; this distinc-
tion is important. A malfunction can be defined as an intermittent deviation of
the accomplishment of a process’s intended function [10], whereas failure is a
persistent suspension of a process’s capability to perform a demanded function
within indicated operating conditions [10].
In industrial processes, a fault or an abnormal event is defined as the depar-
ture of a calculated process variable from its acceptable region of operation. The
underlying causes of a fault can be malfunctions or changes in sensor, actuator,
or process components:
• Process faults or structural changes. Structural change usually takes place
within the process itself due to a hard failure of the equipment. The informa-
tion flow between the different variables is affected because of these changes.
Failure of a central controller, a broken or leaking pipe, and a stuck valve
are a few examples of process faults. These faults are distinguished by slow
changes across various variables in the process.
• Faults in sensors and actuators. Sensors and actuators play a very important
role in the functioning of any industrial process since they provide feedback
signals that are crucial for the control of the plant. Actuators are essential for
transforming control inputs into appropriate actuation signals (e.g., forces and
torques needed for system operation). Generally, actuator faults may lead to
higher power consumption or even a total loss of control [11]. Faults in pumps
and motors are examples of actuator faults. On the other hand, sensor-based
errors include positive or negative bias, out-of-range errors, precision
degradation, and drift. Sensor faults are generally charac-
terized by quick deviations in a small number of process variables. Fig. 1.1
shows examples of the most commonly occurring sensor faults: bias, drift,
degradation, and sensor freezing.
The literature also describes another type of anomaly, called gross pa-
rameter changes in a model. A parameter fault occurs when a
disturbance enters the monitored process from the environment through one or
more variables. Some common examples of such malfunctions include a change
in the heat transfer coefficient, a change in the temperature coefficient in a heat
exchanger, a change in the liquid flow rate, or a change in the concentration of
a reactant.
FIGURE 1.1 Commonly occurring sensor faults. (A) Bias sensor fault. (B) Drift sensor fault.
(C) Degradation sensor fault. (D) Freezing sensor fault.
FIGURE 1.2 Fault types. (A) Abrupt anomaly. (B) Gradual anomaly. (C) Intermittent anomaly.
Thus, sensor or process faults can affect the normal functioning of a process
plant. In today’s highly competitive industrial environment, improved moni-
toring of processes is an important step towards increasing the efficiency of
production facilities.
In practice, there is a tendency to classify anomalies according to their
time-variant behavior. Fig. 1.2 illustrates three commonly occurring types of
anomalies that can be distinguished by their time-variant form: abrupt, incipi-
ent, and intermittent faults. Abrupt anomalies happen regularly in real systems
and are generally typified by a sudden change in a variable from its normal op-
erating range (Fig. 1.2A). The faulty measurement can be formally expressed as
\[
m(t) =
\begin{cases}
r(t), & t < t_a, \\
r(t) + F, & t \ge t_a,
\end{cases}
\tag{1.1}
\]
where $F$ is a bias that occurs at the time instant $t_a$.
The drift anomaly type can be caused by the aging or degradation of a sensor
and can be viewed as a linear change of the magnitude of fault in time. Here,
the measurement corrupted with a drift fault is modeled as
\[
m(t) =
\begin{cases}
r(t), & t < t_a, \\
r(t) + \theta\,(t - t_a), & t \ge t_a,
\end{cases}
\tag{1.2}
\]
where $\theta$ is the slope of the slow drift and $t_a$ is the onset time of the fault.
Finally, intermittent faults are faults characterized by discontinuous occurrence
in time; they occur and disappear repeatedly (Fig. 1.2C).
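To make these fault models concrete, the following minimal Python sketch injects an abrupt bias fault, Eq. (1.1), and a drift fault, Eq. (1.2), into a simulated fault-free signal; the signal, fault magnitudes, and onset time are illustrative assumptions rather than values from a real process.

```python
import numpy as np

def add_bias_fault(r, t_a, F):
    # Abrupt (bias) fault, Eq. (1.1): m(t) = r(t) + F for t >= t_a.
    m = r.copy()
    m[t_a:] += F
    return m

def add_drift_fault(r, t_a, theta):
    # Drift fault, Eq. (1.2): m(t) = r(t) + theta*(t - t_a) for t >= t_a.
    m = r.copy()
    t = np.arange(len(r))
    m[t_a:] += theta * (t[t_a:] - t_a)
    return m

# Fault-free sensor reading: constant level plus Gaussian noise (illustrative).
rng = np.random.default_rng(0)
r = 10.0 + 0.2 * rng.standard_normal(500)

m_bias = add_bias_fault(r, t_a=250, F=1.5)         # sudden jump at t = 250
m_drift = add_drift_fault(r, t_a=250, theta=0.01)  # slow ramp starting at t = 250
```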
Generally, industrial and environmental processes are exposed to various
types of faults that negatively affect their productivity and efficiency. According
to the form in which the fault is introduced, faults can be further classified as
additive and multiplicative faults. Additive faults often appear as offsets of sen-
sors or as additive bias, while multiplicative faults influence process parameters.
Specifically, in an additive fault, the measured variable $Y_t$ is corrupted by
an additive fault term $\theta_t$, i.e., $Y_t = Y_t^{0} + \theta_t$, where $Y_t^{0}$ is the fault-free value. On
the other hand, a multiplicative fault influences the measured variable through the
product of the input variable $U_t$ with a changed parameter, i.e., $Y_t = (a + f)\,U_t$, where $f$ is the parameter change.
1.1.3 Process monitoring
Before automation became commonplace in the field of process monitoring, hu-
man operators carried out important control tasks in managing process plants.
However, the complete reliance on human operators to cope with abnormal
events and emergencies has become increasingly difficult because of the com-
plexity and a large number of variables in modern process plants. Considering
such difficult conditions, it is understandable that human operators tend to make
mistakes that can lead to significant economic, safety, and environmental prob-
lems. Thanks to advancements in technology over recent years, automation of
process fault detection and isolation has been a major milestone in automatic
process monitoring. Automatic process monitoring has been shown to respond
very well to abnormal events in a process plant, with far fewer mistakes than
fault management by human operators.
The demand for a monitoring system that is capable of appropriately detect-
ing abnormal changes (sensor or process faults) has attracted the attention of
researchers from different fields. The detection and isolation of anomalies that
may occur in a monitored system are the two main elements of process monitor-
ing (Fig. 1.3). The purpose of the detection step is to detect abnormal changes
that affect the behavior of the monitored system. Once the anomaly is detected,
effective system operation also requires evaluation of the risk of a system shut-
down, followed by fault isolation or correction before the anomaly contaminates
the process performance [12,13]. The purpose of fault isolation is to determine
the source responsible for the occurring anomalies, i.e., to determine which sen-
sor or process component is faulty. In practice, sometimes it is also essential to
assess the severity of the detected fault, which is done in the fault identification
step. Here, we focus only on fault detection and isolation.
FIGURE 1.3 Steps of process monitoring.
There are two types of anomaly detection:
• Online fault detection. The objective of online anomaly detection is to set
up a decision rule capable of detecting, as quickly as possible, the transition
from a normal operating state to an abnormal operating state. Online detec-
tion is based on the idea that system evolution is considered a succession of
stationary modes separated by fast transitions.
• Offline fault detection. The purpose of offline fault detection is to detect the
presence of a possible anomaly outside the use of the monitored system. The
system is observed for a finite period (the system is in stationary mode), and
then, based on these observations, a decision is made on the state of the mon-
itored system. Offline detection methods rely on a number of observations fixed
a priori, where the observations are assumed to come from the same distribution.
1.1.4 Physical redundancy vs analytical redundancy
Process monitoring is essentially based on the exploitation of redundant sources
of information. There are two types of redundancy in the process: physical re-
dundancy and analytical redundancy (Fig. 1.4A–B). The essence of hardware
or physical redundancy, which is a traditional method in process monitoring,
consists of measuring a particular process variable using several sensors (e.g.,
two or more sensors). To detect and isolate simple faults, the number of sensors
to use should be doubled. Specifically, under normal conditions, one sensor is
sufficient to monitor a particular variable, but adding at least two extra sensors
is generally needed to guarantee reliable measurements and monitoring under
faulty conditions.
FIGURE 1.4 Conceptual representation of (A) physical and (B) analytical redundancy.
Typically, fault detection and isolation are achieved by a ma-
jority vote between all the redundant sensors. This strategy has been widely
used in the industry because of its reliability and simplicity of implementation.
In practice, the main disadvantage of hardware redundancy is the additional
cost of equipment and maintenance, as well as the space needed to install the
equipment, which considerably increases the complexity of already very complex
systems. In addition, this method is limited in practice to sensor faults and can-
not detect faults in variables that are not measured directly. This approach is
mainly justified only for critical systems, such as nuclear reactors and aero-
nautic systems. Unlike physical redundancy, which is performed by adding
more sensors (hardware) to measure a specific process variable, analytical
redundancy does not require additional hardware because it is based on using
the existing relations between the dependent measured variables that are or are
not of the same nature. Analytical redundancy is a more accessible strategy that
compares the measured variable with the predicted values from a mathemat-
ical model of the monitored system. It thereby exploits redundant analytical
relationships among various measured variables of the monitored process and
avoids replicating every hardware separately.
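As a minimal sketch of the majority-vote logic described above, the following Python snippet isolates a faulty sensor among three redundant ones by comparing each reading with the median of the set; the tolerance value is a user-chosen assumption.

```python
import numpy as np

def majority_vote_isolation(readings, tol):
    # Compare each redundant sensor with the median (robust consensus);
    # a sensor deviating from it by more than `tol` is isolated as faulty.
    readings = np.asarray(readings, dtype=float)
    consensus = np.median(readings)
    faulty = np.abs(readings - consensus) > tol
    return consensus, faulty

# Three redundant sensors measuring the same variable; the third one is biased.
consensus, faulty = majority_vote_isolation([10.02, 9.97, 11.40], tol=0.5)
print(consensus)  # 10.02 -> consensus value passed to the control system
print(faulty)     # [False False  True] -> third sensor isolated as faulty
```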
1.2 Process monitoring methods
Today, engineering and environmental processes have become far more complex
due to advances in technology. Anomaly detection and isolation have become
necessary to monitor the continuity and proper functioning of modern industrial
systems and environmental processes. Depending on the field of application, the
repercussions of anomalies become binding and harmful if it concerns human
safety, such as in aeronautical systems and nuclear reactors. Advancements in
Introduction Chapter | 1 7
the field of process control and automation over the last few years have yielded
various methods for successful diagnosis and detection of abnormal events. To
meet safety and productivity requirements, extensive theoretical and practical
monitoring methods have been developed. These methods are generally divided
into three families of approaches, depending on the nature of the knowledge
available on the system: model-, knowledge-, and data-based methods. A thor-
ough overview of process fault detection and diagnosis can be found in [5].
Fig. 1.5 shows a summary of various monitoring methods; this section presents
a brief overview of these monitoring techniques.
FIGURE 1.5 A summary of various fault detection approaches.
1.2.1 Model-based methods
Over the past three decades, numerous monitoring methods to improve the
safety and productivity of several environmental and engineering processes
have emerged. Model-based methods have proven especially useful in indus-
trial applications where keeping the desired performance is highly required.
A model-based method involves comparing the process’s measured variables
with the prediction from the mathematical model of the process. The concep-
tual schematic of model-based fault detection is illustrated in Fig. 1.6.
FIGURE 1.6 Conceptual schematics of model-based process monitoring.
The
backbone of the model-based method is the generation of residuals by compar-
ing the measurement data with their predictions from the analytical model of
the monitored process. Indeed, the residuals play the role of a fault indicator.
Ideally, in the absence of modeling uncertainties and errors, the residual will be
zero and the model will perfectly fit the measurements. Thus, any departure of
the residual from zero indicates the presence of faults. However, in practice, we
cannot avoid the presence of modeling uncertainties and noise measurements.
In other words, a perfectly precise analytical model of an inspected process is
never available. Notice that some deviation between the real measurement and
its prediction from a reference model exists even under no-fault
conditions. Hence, instead of using the departure of residuals from zero as a
fault indicator, detection can be done by constructing a detection threshold that
distinguishes between fault-free residuals and anomalies. The detection perfor-
mance is mainly related to the selected detection threshold. If the threshold
is too small, errors and uncertainties make the residuals exceed the threshold
repeatedly and be falsely flagged as faults, producing frequent false alarms;
this scenario obviously must be avoided. The
detection threshold should thus be computed so that the frequency of correct
detection is maximized for a given small number of false alarms (e.g., 5% or
1%). To address this concern, several statistical schemes have been proposed
to monitor the residuals vector, including the generalized likelihood ratio ap-
proach, cumulative sum (CUSUM) type schemes, and EWMA schemes. In the
case of multivariate data, when the residuals matrix is generated, multivariate
extensions of CUSUM and EWMA and T 2 are usually used to detect faults in
the mean/variance of process.
In summary, fault detection and isolation using model-based methods usu-
ally take place in two distinct steps:
• The first step consists of residual generation. Ideally, these residuals must be
zero in normal operation and nonzero in the presence of an anomaly. How-
ever, the presence of noise and modeling errors makes the residuals fluctuate
around zero. A significant divergence of the residual from zero is an indica-
tion of faults.
• The second step concerns the evaluations of the residuals based on a deci-
sion procedure for detecting and isolating faults. This is done using statistical
detection techniques such as EWMA, CUSUM, and generalized likelihood
ratio (GLR) test [12].
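As a minimal illustration of this two-step procedure, the following sketch generates residuals from model predictions and evaluates them against a simple 3-sigma threshold estimated from fault-free data; a Shewhart-type rule stands in here for the EWMA, CUSUM, or GLR evaluators named above, and the training-window length is an assumption.

```python
import numpy as np

def detect_faults(y_measured, y_predicted, n_train=200, alpha=3.0):
    # Step 1: residual generation (measurement minus model prediction).
    residuals = y_measured - y_predicted
    # Step 2: residual evaluation against a threshold estimated from
    # the first n_train fault-free samples (alpha-sigma decision rule).
    mu = residuals[:n_train].mean()
    sigma = residuals[:n_train].std(ddof=1)
    alarms = np.abs(residuals - mu) > alpha * sigma
    return residuals, alarms
```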
A substantial amount of research work has been carried out on model-
based monitoring methods. Methods that fall into the model-based monitoring
category include parity space approaches [14–17], observer-based approaches
[18,19], and interval approaches [20]. A related discussion and a comprehen-
sive survey on model-based fault detection methods can be found in [21–23].
Essentially, the detection performance of model-based approaches is closely
related to the accuracy of the reference model. The availability of an accurate
model that mimics the nominal behavior of the monitored process is very help-
ful for facilitating the detection of faulty measurements. However, for complex
processes, such as those of many industrial and environmental processes with
a large number of variables, deriving and developing accurate models is not al-
ways easy and can be time-consuming, which makes such methods inapplicable
in many settings. For instance, modeling the inflow measurements of wastew-
ater treatment plants is very challenging because of the presence of a large
number of variables that are nonlinearly dependent and autocorrelated. Addi-
tionally, modeling modern industrial and environmental processes is challenging
because of the complexity and the absence of a precise understanding of these
processes. Fault detection using model-based approaches can, therefore,
be challenging, and such approaches are often unsuitable. Alternatively,
data-based methods are more commonly used.
1.2.2 Knowledge-based methods
The success of modern industrial systems relies on their proper and safe op-
eration. Early detection of anomalies as they emerge in the inspected process
is essential for avoiding extensive damage and reducing the downtime needed
for reparation [24]. As discussed above, when the information available to un-
derstand the process under fault-free operation is insufficient to construct an
accurate analytical model, analytical monitoring methods are no longer effec-
tive. Knowledge-based methods present an alternative solution to bypass this
difficulty. In the following, we use artificial intelligence methods and available
historical measurements, which inherently represent the correlation of the pro-
cess variables, to extract the underlying knowledge and system characteristics.
In other words, we utilize process characteristic values, such as variance, mag-
nitude, and state variables, for extracting features under fault-free and faulty
conditions based on heuristic and analytical knowledge. Fault detection is then
performed in a heuristic manner. Specifically, the actual features from the on-
line data are compared with the obtained underlying knowledge. Methods that
fall in this category include expert systems [25], fuzzy logic, Hazop-digraph
(HDG) [5], possible cause and effect graphs (PCEG) [26], neuro-fuzzy based
causal analysis, failure mode and effect analysis (FMEA) [27], and Bayesian
networks [28]. The major drawback of these techniques is that they are more
appropriate for small-scale systems and thus may not be suited to inspect mod-
ern systems.
1.2.3 Data-based monitoring methods
Engineering and environmental processes have undoubtedly become far more
complex due to advances in technology. Consequently, designing an accurate
model for complex, high dimensional and nonlinear systems has also become
very challenging, expensive, and time-consuming to develop. Setting simplifi-
cations and assumptions on models leads to limits in their capacity to capture
certain relevant features and operation modes, and induces a modeling bias that
significantly degrades the efficiency of the monitoring system. In the absence
of a physics-based process model, data-driven statistical techniques for process
monitoring have proved themselves in practice over the past four decades. In-
deed, data-based implicit models only require an available process-data resource
for process monitoring [5]. Data-based monitoring techniques are mainly based
on statistical control charts and machine-learning methods.
Essentially, these monitoring techniques rely on historical data collected
from the monitoring system. The system is modeled as a black box with in-
put and output data (Fig. 1.7). At first, a reference empirical model that mimics
the nominal behavior of the inspected process is constructed using the fault-free
data, and then this model is used for detecting faults in new data. In contrast to
model-based methods, only historic process data is required to be available in
the data-based fault detection methods, and they are classified into two classes:
qualitative and quantitative methods.
FIGURE 1.7 Data-based methods.
Unsupervised data-based techniques for fault detection and isolation do
not use any prior information on faults affecting the process. They cover a set
of methods for monitoring industrial pro-
cesses through tools such as statistical control charts (see Fig. 1.8). Univariate
techniques, such as a Shewhart chart, exponentially weighted moving average
(EWMA) [29], and cumulative sum (CUSUM), are used for monitoring only a
single process variable at a given time instant. Monitoring charts have been ex-
tensively exploited in most industrial processes. CUSUM and EWMA schemes
show good capacity in sensing small changes compared to the Shewhart chart.
In [30], a spectral monitoring scheme is designed based on the information em-
bedded in the Fourier coefficients of the signal. However, these conventional
schemes are designed based on the hypotheses that the data are Gaussian and un-
correlated. To escape these basic assumptions, multiscale monitoring schemes
using wavelets have been developed [31]. Furthermore, the above-discussed
schemes use static thresholds computed using the fault-free data. Recently, sev-
eral adaptive monitoring methods have been developed. These schemes are, in
practice, more flexible and efficient than conventional schemes with fixed pa-
rameters. For more details, see [32–35]. These univariate monitoring schemes
examine a particular variable at a time instant by assuming independence be-
tween variables. When monitoring multivariate data using several univariate
charts, even when the false alarm rate of each scheme is small, the collective
rate can be very large [36–38]. In addition, measurements from modern in-
dustrial processes are rarely independent and present a large number of process
variables. Since univariate schemes ignore the cross-correlation between vari-
ables, they consequently suffer from an inflated number of missed detections
and false alarms, which makes them unsuitable in this setting [36–38].
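A minimal sketch of the EWMA chart referenced above, following the standard formulation of [29] and assuming the in-control mean and standard deviation have been estimated from fault-free data, is given below.

```python
import numpy as np

def ewma_chart(x, lam=0.2, L=3.0, mu0=0.0, sigma0=1.0):
    # EWMA statistic: z_t = lam*x_t + (1 - lam)*z_{t-1}, with z_0 = mu0.
    x = np.asarray(x, dtype=float)
    z = np.empty_like(x)
    z_prev = mu0
    for t in range(len(x)):
        z_prev = lam * x[t] + (1 - lam) * z_prev
        z[t] = z_prev
    # Time-varying limits: mu0 +/- L*sigma0*sqrt(lam/(2-lam)*(1-(1-lam)^(2t))).
    t = np.arange(1, len(x) + 1)
    half_width = L * sigma0 * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
    alarms = np.abs(z - mu0) > half_width
    return z, mu0 + half_width, mu0 - half_width, alarms
```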
FIGURE 1.8 Data-based monitoring techniques.
To alleviate this difficulty and to handle high dimensional data effectively,
multivariate monitoring schemes have been developed to take into account the
correlations between the variables, and thus monitor processes with several
variables. These schemes include Hotelling $T^2$ [39], multivariate EWMA [40],
and multivariate CUSUM [41]. However, the performance of these multivariate
schemes degrades as the number of variables monitored increases, which makes
them unsuitable for high dimensional data.
Multivariate monitoring methods have been
designed to directly tackle the above limitations. Multivariate statistical meth-
ods are useful for compressing data and retaining relevant information, which
is more appropriate to analyze than the original data. Moreover, these methods
are efficient at handling noise and interactions between variables to effectively
extract pertinent information. The most common multivariate methods for fault
detection are principal component analysis (PCA) [22,42], partial least squares
(PLS), principal component regression (PCR), canonical variate analysis (CVA),
and independent component analysis (ICA) [43]. The essential element of multi-
variate statistical methods, such as PCA, is their ability to transform multivariate
correlated variables to a reduced set of uncorrelated variables. In the past two
decades, these techniques have been extensively used to monitor industrial pro-
cesses. For fault detection purposes, the original data is first projected into a
latent subspace, where latent variables and residuals are monitored. PCA and
PLS are the two most popular multivariate statistical methods that use latent
variable methods for monitoring because they have a strong mathematical foun-
dation that is available in the literature. Indeed, the PCA or PLS model is
constructed based on historical normal process operations. This empirical model
could be used to monitor the future behavior of the process. Any departure from
the model should be flagged as a potential anomaly, such as sensor fault or pro-
cess drift. PCA is used to reduce dimensionality in the process data and to retain
important features of the data. PCA projects the observations from a higher
dimension on to a lower-dimensional subspace and is optimal in terms of cap-
turing the data variability. The PCA procedure is applied to a single data matrix
only, whereas PLS models the relationship between two data matrices while
compressing them simultaneously. The PCA technique is used to monitor and
detect the faults in a multivariate process, along with the two fault detection in-
dices, the $T^2$ and squared prediction error (SPE) statistics. The major advantage
of latent variable approaches (i.e., PCA and PLS) is that only a small number of
monitoring statistics, namely $T^2$ and SPE, are needed to monitor multivariate data.
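A minimal sketch of PCA-based monitoring with the $T^2$ and SPE statistics is given below; detection thresholds (parametric or data-driven) are omitted for brevity, and the number of retained components is assumed to be chosen beforehand.

```python
import numpy as np

def pca_monitoring_stats(X_train, X_test, n_pc):
    # Autoscale both sets with the fault-free (training) mean and std.
    mu, sd = X_train.mean(axis=0), X_train.std(axis=0, ddof=1)
    Z_train = (X_train - mu) / sd
    Z_test = (X_test - mu) / sd
    # PCA via SVD of the scaled training data.
    U, S, Vt = np.linalg.svd(Z_train, full_matrices=False)
    P = Vt[:n_pc].T                               # retained loadings
    eigvals = S[:n_pc] ** 2 / (len(X_train) - 1)  # variances of the scores
    T = Z_test @ P                                # scores in the latent subspace
    T2 = np.sum(T ** 2 / eigvals, axis=1)         # Hotelling T^2 statistic
    E = Z_test - T @ P.T                          # residual (unmodeled) part
    SPE = np.sum(E ** 2, axis=1)                  # squared prediction error (Q)
    return T2, SPE
```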
However, data from modern industrial processes are time-dependent, non-
stationary, nonlinear, non-Gaussian, and multiscale [44–47]. Most process mon-
itoring methods assume that the process measurements at a given time are
independent of the observations at a past sampling instant. Industrial pro-
cesses are operated under dynamic conditions and variables have strong auto-
correlation properties. Augmenting observations at a previous sampling time
with observations at the present sampling time is referred to as Dynamic PCA
(DPCA) [48,49]. For high-dimensional and time-dependent industrial data,
using a fixed model monitoring approach could lead to poor diagnostic re-
sults [50]. However, process monitoring for such processes could be improved
by updating the model using a recursive PCA and a moving window PCA tech-
nique [50]. Recursive PCA updates the model continuously online; similarly,
online adaptive PCA updates the model using EWMA [50,51]. For nonlinear
processes, a nonlinear version of data-based methods has been used, such as
kernel PCA, kernel PLS, polynomial PLS and quadratic or fuzzy PLS, to reveal
nonlinear relationships between variables [46]. In practice, most of the data need
not be Gaussian in nature; to handle the non-Gaussian nature of the data, inde-
pendent component analysis (ICA), the Gaussian mixture model (GMM), and
its nonlinear variant have been used [47]. Other extensions have been developed,
such as multiway PCA [45] that permits analyzing data from batch processes,
and multiscale PCA, which monitors processes at different frequency bands,
denoises the data, and reduces autocorrelation. Overall, these extensions are in-
troduced based on an understanding of the nature of the data gathered from the
inspected process. Accordingly, understanding the process characteristics is a
central factor to meet practical expectations and construct an effective statistical
monitoring system.
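As a concrete example of the dynamic extension mentioned above, the following sketch builds the time-lagged data matrix used by DPCA, whose rows stack the current observation with lagged copies; the number of lags is an assumption to be tuned per application.

```python
import numpy as np

def lagged_matrix(X, n_lags):
    # Each row stacks x(t), x(t-1), ..., x(t-n_lags); applying PCA to this
    # augmented matrix yields dynamic PCA (DPCA).
    n, m = X.shape
    cols = [X[n_lags - l : n - l] for l in range(n_lags + 1)]
    return np.hstack(cols)  # shape: (n - n_lags, m * (n_lags + 1))

# Example: 500 samples of 4 variables, augmented with 2 lags.
X = np.random.default_rng(1).standard_normal((500, 4))
X_aug = lagged_matrix(X, n_lags=2)  # shape (498, 12); feed to a PCA model
```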
Other approaches that fall into this category are based on machine- and
deep-learning methods, which have recently gained considerable attention from
researchers due to their ability to learn from large and complex datasets. Under
a machine-learning framework, support vector machines (SVM) [52–54] and
artificial neural networks (ANNs) have become an important tool in fault de-
tection literature. Recently, increasing process complexity has resulted in the
development of several monitoring methods based on deep learning that can
account for features such as time dependency, nonlinearity, and nonnormality.
A major strategy has been to extract features from the data using deep-learning
models, such as Restricted Boltzmann Machine (RBM), Deep Belief Network
(DBN), Deep Boltzmann Machine (DBM), Long Short-Term Memory (LSTM),
and recurrent neural network (RNN), and to monitor the extracted features using
binary clustering schemes or traditional monitoring charts. For instance, [55]
introduced an approach that integrated an RNN-RBM model with clustering
algorithms including k-means, spectral clustering, and OCSVM, for anomaly
detection in WWTPs. In [56], several deep learning-based monitoring methods,
such as DBN, deep-stacked auto-encoders, and restricted Boltzmann machines-
based clustering procedures, were applied to detect abnormal ozone pollution.
Deep-learning methods are appealing because they do not impose restrictive
assumptions on the underlying data. Applications of deep learning also cover
detection in complex data such as multivariate time-series data [57],
images, and videos [58,59].
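The following sketch illustrates the "extract features, then monitor them with a one-class classifier" strategy described above, using PCA scores as a simple stand-in for deep-learned features and scikit-learn's one-class SVM; the data and parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(2)
X_train = rng.standard_normal((400, 10))                   # fault-free data
X_test = np.vstack([rng.standard_normal((80, 10)),         # normal samples
                    rng.standard_normal((20, 10)) + 3.0])  # shifted (faulty) samples

# Feature extraction (stand-in for an autoencoder/RBM feature map).
feat = PCA(n_components=3).fit(X_train)

# One-class SVM learns the boundary of the nominal feature region.
ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale")
ocsvm.fit(feat.transform(X_train))

labels = ocsvm.predict(feat.transform(X_test))  # +1 = normal, -1 = anomaly
print((labels == -1).sum(), "samples flagged as abnormal")
```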
1.3 Fault detection metrics
To verify the performance of fault detection methods, several well-known metrics
are commonly employed in the context of binary detection problems. Basically,
many detection performance metrics are computed based on the 2 × 2 confusion
matrix that reports the number of true positives (TP), false positives (FP), false
negatives (FN), and true negatives (TN) [60]. The detection quality of the fault
detection methods can be assessed using a false positive rate (FPR) (i.e., false
alarm rate), a true positive rate (TPR) (i.e., detection rate), precision, accuracy,
F-measure, recall, and the area under the curve (AUC). Fig. 1.9 displays a confu-
sion matrix and recapitulates the equations of the well-known related metrics that
are frequently used to assess the performance of a binary decision method [60,61].
Also, another metric called the average run length (ARL), which is able to char-
acterize both types of error, I and II, is often used to evaluate detection quality.
Specifically, there are two kinds of ARL: ARL0 and ARL1. ARL0 is the average
number of data points a fault detection method takes to flag an alarm when
the process is under control (i.e., the average time to a false alarm). ARL1 refers
to the average number of data points it takes a monitoring method to uncover a
fault under faulty conditions (i.e., the speed of detection).
FIGURE 1.9 Fault detection metrics.
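A minimal sketch computing the confusion-matrix metrics of Fig. 1.9, together with an empirical ARL1 (the number of samples from fault onset to the first alarm), for binary alarm and ground-truth sequences is shown below.

```python
import numpy as np

def detection_metrics(alarms, truth):
    # `alarms` and `truth` are boolean arrays (True = alarm raised / fault present).
    alarms, truth = np.asarray(alarms, bool), np.asarray(truth, bool)
    TP = np.sum(alarms & truth)
    FP = np.sum(alarms & ~truth)
    FN = np.sum(~alarms & truth)
    TN = np.sum(~alarms & ~truth)
    TPR = TP / (TP + FN)                        # detection rate (recall)
    FPR = FP / (FP + TN)                        # false alarm rate
    precision = TP / (TP + FP)
    accuracy = (TP + TN) / (TP + FP + FN + TN)
    f1 = 2 * precision * TPR / (precision + TPR)
    return dict(TPR=TPR, FPR=FPR, precision=precision, accuracy=accuracy, F1=f1)

def empirical_arl1(alarms, fault_onset):
    # Samples from fault onset to the first alarm (detection delay).
    post = np.flatnonzero(np.asarray(alarms, bool)[fault_onset:])
    return post[0] + 1 if post.size else np.inf
```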
1.4 Conclusion
In summary, accurately detecting and isolating faults that can occur in industrial
and environmental processes is essential to minimize downtime, increase safety,
reduce maintenance costs, and extend equipment lifetime. Process monitoring is
required to successfully detect, isolate, and remove the faults before they affect
the process performance. Several aspects should be considered when designing
or using a particular fault detection approach, including the type of fault, process
dynamics, measured variables, available data, and complexity. The simplest and
most common practice is to directly check the limit of a measurable variable.
However, these techniques are limited when monitoring large-scale processes.
This has led to the development of reliable techniques that incorporate informa-
tion from not just one process variable, but that include more knowledge about
the process such as process state and parameters. Some approaches rely on accu-
rate process models whereas others use available historical process data. Process
model-based monitoring that incorporates dynamics information is easy to im-
plement for well-defined systems; however, process model-based monitoring
needs accurate models that are not always easy to obtain, in particular for com-
plex processes. On the other hand, when information on the relationship between
faults and symptoms is available, knowledge-based approaches are preferable; how-
ever, these approaches are limited to small and simple processes. An alternative
approach is to use data-based monitoring techniques, which are flexible and
assumption-free. Of course, when a large amount of process data is available,
and the process is too complex to be explicitly modeled, data-based techniques
are more appropriate because of their flexibility to handle large, noisy, and non-
linear data.
References
[1] V.R. Dhara, R. Dhara, The union carbide disaster in Bhopal: a review of health effects, Archives
of Environmental Health: An International Journal 57 (5) (2002) 391–404.
[2] B. Bowonder, The Bhopal accident, Technological Forecasting & Social Change 32 (2) (1987)
169–182.
[3] M.E. Paté-Cornell, Learning from the Piper Alpha accident: a postmortem analysis of technical
and organizational factors, Risk Analysis 13 (2) (1993) 215–232.
[4] L.W.D. Cullen, The public inquiry into the Piper Alpha disaster, Drilling Contractor (United
States) 49 (4) (1993).
[5] V. Venkatasubramanian, R. Rengaswamy, S.N. Kavuri, K. Yin, A review of process fault
detection and diagnosis: part III: process history based methods, Computers & Chemical En-
gineering 27 (3) (2003) 327–346.
[6] B. Brooks, The Bakersfield fire: a lesson in ground-fault protection, SolarPro Magazine 62
(2011).
[7] I. Nimmo, Adequately address abnormal operations, Chemical Engineering Progress 91 (9)
(1995).
[8] R.J. Patton, Fault-tolerant control: the 1997 situation, IFAC Proceedings Volumes 30 (18)
(1997) 1029–1051.
[9] O. Büyüköztürk, M.A. Taşdemir, Nondestructive Testing of Materials and Structures, vol. 6,
Springer Science & Business Media, 2012.
[10] R. Isermann, Fault-Diagnosis Systems: an Introduction from Fault Detection to Fault Toler-
ance, Springer Science & Business Media, 2006.
[11] G.J. Ducard, Fault-Tolerant Flight Control and Guidance Systems: Practical Methods for Small
Unmanned Aerial Vehicles, Springer Science & Business Media, 2009.
[12] M. Basseville, I.V. Nikiforov, et al., Detection of Abrupt Changes: Theory and Application,
vol. 104, Prentice Hall, Englewood Cliffs, 1993.
[13] F. Harrou, L. Fillatre, I. Nikiforov, Anomaly detection/detectability for a linear model with a
bounded nuisance parameter, Annual Reviews in Control 38 (1) (2014) 32–44.
[14] E. Chow, A. Willsky, Analytical redundancy and the design of robust failure detection systems,
IEEE Transactions on Automatic Control 29 (7) (1984) 603–614.
[15] P.M. Frank, Fault diagnosis in dynamic systems using analytical and knowledge-based redun-
dancy: a survey and some new results, Automatica 26 (3) (1990) 459–474.
[16] R.J. Patton, J. Chen, A review of parity space approaches to fault diagnosis, IFAC Proceedings
Volumes 24 (6) (1991) 65–81.
[17] J. Ragot, D. Maquin, F. Kratz, Analytical redundancy for systems with unknown inputs. Ap-
plication to faults detection, Control Theory and Advanced Technology 9 (3) (1993) 775–788.
[18] R.N. Clark, D.C. Fosth, V.M. Walton, Detecting instrument malfunctions in control systems,
IEEE Transactions on Aerospace and Electronic Systems 4 (1975) 465–473.
[19] R.J. Patton, P.M. Frank, R.N. Clarke, Fault Diagnosis in Dynamic Systems: Theory and Appli-
cation, Prentice-Hall, Inc., 1989.
[20] K. Benothman, D. Maquin, J. Ragot, M. Benrejeb, Diagnosis of uncertain linear systems: an
interval approach, International Journal of Sciences and Techniques of Automatic Control &
Computer Engineering 1 (2) (2007) 136–154.
[21] P.M. Frank, Analytical and qualitative model-based fault diagnosis–a survey and some new
results, European Journal of Control 2 (1) (1996) 6–28.
[22] L.H. Chiang, E.L. Russell, R.D. Braatz, Fault Detection and Diagnosis in Industrial Systems,
Springer Science & Business Media, 2000.
[23] N. Martin, Advanced signal processing and condition monitoring, Insight-Non-Destructive
Testing and Condition Monitoring 49 (8) (2007) 459–464.
[24] Z. Gao, C. Cecati, S. Ding, A survey of fault diagnosis and fault-tolerant techniques—part II:
fault diagnosis with knowledge-based and hybrid/active-based approaches, IEEE Transactions
on Industrial Electronics 62 (6) (2015) 3768–3774.
[25] S. Kim, S. jin Ahn, J. Chung, I. Hwang, S. Kim, M. No, S. Sin, A rule based approach to
network fault and security diagnosis with agent collaboration, in: International Conference on
AI, Simulation, and Planning in High Autonomy Systems, Springer, 2004, pp. 597–606.
[26] N. Wilcox, D. Himmelblau, The possible cause and effect graphs (PCEG) model for fault
diagnosis—I. Methodology, Computers & Chemical Engineering 18 (2) (1994) 103–116.
[27] R. Wirth, B. Berthold, A. Krämer, G. Peter, Knowledge-based support of system analysis for
the analysis of failure modes and effects, Engineering Applications of Artificial Intelligence
9 (3) (1996) 219–229.
[28] V. Sylvain, T. Teodor, K. Abdessamad, Fault detection with Bayesian network, in: Frontiers in
Robotics, Automation and Control, IntechOpen, 2008.
[29] J.M. Lucas, M.S. Saccucci, Exponentially weighted moving average control schemes: proper-
ties and enhancements, Technometrics 32 (1) (1990) 1–12.
[30] T. Tiplica, A. Kobi, A. Barreau, Spectral control chart, Quality Engineering 17 (4) (2005)
695–702.
[31] R. Ganesan, T.K. Das, V. Venkataraman, Wavelet-based multiscale statistical process monitor-
ing: a literature review, IIE Transactions 36 (9) (2004) 787–806.
[32] M.S. De Magalhães, E.K. Epprecht, A.F. Costa, Economic design of a Vp X chart, International
Journal of Production Economics 74 (1–3) (2001) 191–200.
[33] R.B. Kazemzadeh, M. Karbasian, M.A. Babakhani, An EWMA t chart with variable sampling
intervals for monitoring the process mean, The International Journal of Advanced Manufactur-
ing Technology 66 (1–4) (2013) 125–139.
[34] D.S. Bai, K. Lee, An economic design of variable sampling interval X control charts, Interna-
tional Journal of Production Economics 54 (1) (1998) 57–64.
[35] Y. Su, L. Shu, K.-L. Tsui, Adaptive EWMA procedures for monitoring processes subject to
linear drifts, Computational Statistics & Data Analysis 55 (10) (2011) 2819–2829.
[36] J.F. MacGregor, T. Kourti, Statistical process control of multivariate processes, Control Engi-
neering Practice 3 (3) (1995) 403–414.
[37] P. Nomikos, J.F. MacGregor, Multivariate SPC charts for monitoring batch processes, Techno-
metrics 37 (1) (1995) 41–59.
[38] U. Kruger, L. Xie, Advances in Statistical Monitoring of Complex Multivariate Processes: With
Applications in Industrial Process Control, Wiley, 2012.
[39] H. Hotelling, Multivariate quality control, illustrated by the air testing of sample bombsights,
in: C. Eisenhart, M.W. Hastay, W.A. Wallis (Eds.), Selected Techniques of Statistical Analysis,
McGraw-Hill, New York, NY, USA, 1947.
[40] C.A. Lowry, W.H. Woodall, C.W. Champ, S.E. Rigdon, A multivariate exponentially weighted
moving average control chart, Technometrics 34 (1) (1992) 46–53.
[41] R.B. Crosier, Multivariate generalizations of cumulative sum quality-control schemes, Tech-
nometrics 30 (3) (1988) 291–303.
[42] S. Wold, K. Esbensen, P. Geladi, Principal component analysis, Chemometrics and Intelligent
Laboratory Systems 2 (1–3) (1987) 37–52.
[43] A. Hyvärinen, E. Oja, Independent component analysis: algorithms and applications, Neural
Networks 13 (4–5) (2000) 411–430.
[44] S.W. Choi, E.B. Martin, A.J. Morris, I.-B. Lee, Adaptive multivariate statistical process control
for monitoring time-varying processes, Industrial & Engineering Chemistry Research 45 (9)
(2006) 3108–3118.
[45] P. Nomikos, J.F. MacGregor, Monitoring batch processes using multiway principal component
analysis, AIChE Journal 40 (8) (1994) 1361–1375.
[46] J.-H. Cho, J.-M. Lee, S.W. Choi, D. Lee, I.-B. Lee, Fault identification for process monitor-
ing using kernel principal component analysis, Chemical Engineering Science 60 (1) (2005)
279–288.
[47] J.-M. Lee, C. Yoo, I.-B. Lee, Statistical process monitoring with independent component anal-
ysis, Journal of Process Control 14 (5) (2004) 467–485.
[48] W. Ku, R.H. Storer, C. Georgakis, Disturbance detection and isolation by dynamic principal
component analysis, Chemometrics and Intelligent Laboratory Systems 30 (1) (1995) 179–196.
[49] K. Chow, K. Tan, H. Tabe, J. Zhang, N. Thornhill, Dynamic principal component analysis
using integral transforms, in: AIChE Annual Meeting, Miami Beach, vol. 13, 1999.
[50] M. Kano, K. Nagao, S. Hasebe, I. Hashimoto, H. Ohno, R. Strauss, B.R. Bakshi, Comparison of
multivariate statistical process monitoring methods with applications to the Eastman challenge
problem, Computers & Chemical Engineering 26 (2) (2002) 161–174.
[51] W. Li, H.H. Yue, S. Valle-Cervantes, S.J. Qin, Recursive PCA for adaptive process monitoring,
Journal of Process Control 10 (5) (2000) 471–486.
[52] S. Yin, X. Gao, H.R. Karimi, X. Zhu, Study on support vector machine-based fault detection
in Tennessee Eastman process, in: Abstract and Applied Analysis, vol. 2014, Hindawi, 2014.
[53] M. Namdari, H. Jazayeri-Rad, S.-J. Hashemi, Process fault diagnosis using support vector
machines with a genetic algorithm based parameter tuning, Journal of Automation and Control
2 (1) (2014) 1–7.
[54] Z.B. Sahri, R.B. Yusof, Support vector machine-based fault diagnosis of power transformer
using k nearest-neighbor imputed DGA dataset, Journal of Computer and Communications
2 (09) (2014) 22.
[55] A. Dairi, T. Cheng, F. Harrou, Y. Sun, T. Leiknes, Deep learning approach for sustainable
WWTP operation: a case study on data-driven influent conditions monitoring, Sustainable
Cities and Society 50 (2019) 101670.
[56] F. Harrou, A. Dairi, Y. Sun, F. Kadri, Detecting abnormal ozone measurements with a deep
learning-based strategy, IEEE Sensors Journal 18 (17) (2018) 7222–7232.
[57] P. Malhotra, L. Vig, G. Shroff, P. Agarwal, Long short term memory networks for anomaly
detection in time series, in: Proceedings, Presses universitaires de Louvain, 2015, p. 89.
[58] A. Dairi, F. Harrou, M. Senouci, Y. Sun, Unsupervised obstacle detection in driving environ-
ments using deep-learning-based stereovision, Robotics and Autonomous Systems 100 (2018)
287–301.
[59] A. Dairi, F. Harrou, Y. Sun, M. Senouci, Obstacle detection for intelligent transportation sys-
tems using deep stacked autoencoder and k-nearest neighbor scheme, IEEE Sensors Journal
18 (12) (2018) 5122–5132.
[60] D.M. Powers, Evaluation: from precision, recall and F-measure to ROC, informedness,
markedness and correlation, Journal of Machine Learning Technology 2 (2011) 37–63.
[61] D.L. Olson, D. Delen, Advanced Data Mining Techniques, Springer Science & Business Me-
dia, 2008.
Chapter 2
Linear latent variable regression
(LVR)-based process monitoring
2.1 Introduction
With the advancement in instrumentation, data acquisition, and rapid develop-
ment in the “Internet-of-Things” technology, which connects a large number of
digital devices, enormous amounts of information have become available from
anywhere at any time, from a multitude of smart devices. Indeed, large datasets
are produced by the collection of a large number of measurements from modern
engineering and environmental processes. Exploiting these measurements with
a certain level of redundancy, it becomes feasible to detect abnormal change
and locate its sources in the inspected process. However, in the absence of ef-
fective tools, the information in these datasets cannot be suitably extracted and
exploited for inference and process monitoring.
Over the past decade, the necessity for prediction and fault-detection tools
has resulted in the design of several fault-detection mechanisms, which belong
to either model-based (or analytical) or data-driven methods [1,2]. Analytical
models, based on ideal hypotheses that utilize first principles, could theoreti-
cally explain a system’s behavior; however, they need prior calibration of model
parameters, which is challenging and costly in high-dimensional cases and may
result in ill-conditioning problems [3]. Data-driven approaches can perform sys-
tematic and objective exploration, visualization, and interpretation of data, can
identify essential factors, features, or patterns, and can endorse and optimize
data-supported decision-making [4]. Data-based techniques carry information
on faults by extracting relevant features from data. Data-driven approaches
are now commonly applied in engineering and petrochemical pro-
cesses [5]. For instance, in the petrochemical industry where soft-sensors are
widely used, billions of dollars were once lost annually because of the oc-
currence of faults [6]. Environmental data have been exploited by data-driven
approaches for anomaly detection in, for example, meteorological signals [7]
or monitoring of sludge bulking in wastewater treatment plants (WWTPs) [8].
For instance, fault detection in chemical process industries is challenging due
to the large number of variables involved, the dynamic characteristics and noisy
measurements that occur in these processes. Indeed, a large number of variables
leads to collinearity, which increases the uncertainty about the model parame-
ter estimates. The latent variable regression (LVR) model is a commonly used
modeling framework to remedy such problems. The LVR model can deal with
collinearity among variables, by constructing a model from a reduced number of
variables (which are a linear combination of the original variables) called latent
variables or principal components. This approach results in well-conditioned
models [9,10]. LVR model estimation techniques include principal component
regression (PCR) [11,9] and partial least squares (PLS) [12,13].
The organization of this chapter is as follows. In Sect. 2.2, we present a brief
introduction to inferential modeling methods, including full rank models and
latent variable regression (LVR) techniques. The presented full rank modeling
techniques include ordinary least squares (OLS) regression and ridge regression
(RR), while the latent variable regression techniques include PCR and PLS.
Since the conventional LVR models are static and more appropriate for han-
dling steady-state processes, the dynamic version of the LVR models is also
briefly presented. Section 2.3 is devoted to an overview of some common statis-
tical techniques that are applied in statistical process monitoring. Specifically,
this section presents the basic univariate monitoring schemes, namely Shewhart,
exponentially-weighted moving average (EWMA), cumulative sum (CUSUM),
generalized likelihood ratio (GLR), and distribution-based algorithms, and we
discuss their limitations. Section 2.4 presents the general framework of fault de-
tection based on LVR approaches. In Sect. 2.5, we discuss one of the commonly
used fault isolation approaches, namely contribution plots. We also present an
innovative method that uses the radial visualization RadViz to perform root
cause diagnosis in Sect. 2.5. The main objective of this chapter is to investi-
gate these multivariate monitoring schemes (PCA and PLS) and their practical
applications. In Sect. 2.6, we assess the performances of the developed inferen-
tial modeling technique using simulated and practical examples. In addition, we
evaluate the method of using PCA-based anomaly detection by importing seven
years of influent characteristics (ICs) data from a coastal municipal WWTP
where multiple abnormal events occurred. The chapter concludes with a dis-
cussion and remarks in Sect. 2.7.
2.2 Development of linear LVR models
Measurements from engineering and industrial processes are usually massive
and include a large number of (high-dimensional) variables because of the com-
plexity of the processes involved. Using traditional regression models like least
squares are unsuited to provide reliable predictions due to high colinearity and
ill-conditioning issues. There are a large variety of estimation techniques to ad-
dress this modeling problem, including full-rank methods and latent variable
regression methods. In this section, we present the basic theoretical perspective
of some commonly used linear regression models that are used to design pro-
cess monitoring algorithms, namely, OLS, RR, PCR, and PLS. In this section,
we review the traditional linear correlation models for multivariate data that are
the basis for designing fault detection methods. The basic concepts of each ap-
proach and discussion on their advantages and weaknesses are presented.
2.2.1 Full rank methods
2.2.1.1 Ordinary least squares regression
We regress the data matrix y ∈ R^n (the measured output) to X ∈ R^{n×m} (a
selected group of process variables whose values are known precisely) as

    y = Xβ + ε,                                                    (2.1)

where β ∈ R^m is a vector of unknown constants to be estimated, and
ε ∼ N(0, σ²I_n) is zero-mean Gaussian noise with known variance. The essence
of ordinary least squares (OLS) regression is to estimate the model parameters
by minimizing the following objective function [14,11]:

    min_β { ‖Xβ − y‖²₂ }.                                          (2.2)

The unbiased maximum likelihood estimate of β, if the matrix XᵀX is nonsin-
gular and the elements of the noise ε are uncorrelated [15,16], is

    β̂_OLS = (XᵀX)⁻¹ Xᵀ y.                                          (2.3)
When the input process variables are highly correlated, the variances of the
OLS regression coefficients become very high, and the estimates may be inac-
curate. In other words, the determinant of the matrix XT X is then very close to
zero, hence giving unstable values for the variance of the estimated regression
parameters (V(β̂) = σ²(XᵀX)⁻¹). Moreover, the parameter estimates change
considerably if elements of y are changed slightly and thus y is poorly predicted
when utilized with new X measurements.
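To make the effect of collinearity concrete, the following minimal Python sketch
(an illustration added here, not code from the original text; the simulated data and
variable names are assumptions) builds a nearly collinear design matrix and computes
the OLS estimate of Eq. (2.3):

import numpy as np

rng = np.random.default_rng(0)
n, m = 100, 3
X = rng.normal(size=(n, m))
X[:, 2] = X[:, 0] + 1e-6 * rng.normal(size=n)   # nearly collinear column
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=n)

# OLS via the normal equations, Eq. (2.3); with a near-singular X^T X
# the estimate becomes numerically unstable and highly variable.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
print("condition number of X^T X:", np.linalg.cond(X.T @ X))
print("OLS estimate:", beta_ols)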
In summary, when (XT X) is close to being singular, the variance of β̂ is
inflated, which also results in increasing the uncertainty about its estimation.
Even if numerical issues can be overcome via methods such as the pseudo-inverse,
the statistical properties of the model still suffer from the inflated variance. One way
to cope with this collinearity problem and the ill-conditioning of X is through
regularization methods, such as ridge regression (RR), which is presented in the
following.
2.2.1.2 Ridge regression (RR)
As discussed above, in cases when the input process variables are highly cross-
correlated, the OLS method can result in a poor estimate of the regression
coefficients. One way to mitigate this problem is to relax the condition that

βOLS should be an unbiased estimator. There are several methods in the liter-
ature to obtain biased estimators of regression coefficients. The RR approach,
which was originally introduced by Hoerl and Kennard [17], is commonly used
to alleviate the collinearity problem and tuned to obtain good prediction models
by trading off bias and variance. The RR estimator is computed by minimizing
the following objective function [17]:

    min_β { ‖Xβ − y‖²₂ + λ‖β‖²₂ },                                 (2.4)

which yields

    β̂_RR = (XᵀX + λI)⁻¹ Xᵀ y,                                      (2.5)

where λ is a positive constant and I ∈ R^{m×m} is the identity matrix. Note from
Eq. (2.5) that the term λI added to (XᵀX) enhances the conditioning of the
estimation problem. Of course, the RR estimator, β̂_RR, is basically a linear trans-
formation of the OLS estimator β̂_OLS. Eq. (2.5) can be rewritten as

    β̂_RR = (XᵀX + λI)⁻¹ (XᵀX) β̂_OLS = Z_λ β̂_OLS.                   (2.6)
Thus, the RR estimator is a biased estimator since
    E(β̂_RR) = E(Z_λ β̂_OLS) = Z_λ β.                                (2.7)
The covariance matrix of β̂_RR is expressed as

    V(β̂_RR) = σ² (XᵀX + λI)⁻¹ (XᵀX) (XᵀX + λI)⁻¹.                  (2.8)
The basic concept when using RR is to select a value of λ that guarantees
a greater decrease in the variance term than an increase in the squared bias. If
this is accomplished, the MSE of β̂RR will be less than the variance of β̂OLS.
In [18], it has been demonstrated that there is a positive constant λ for which the
MSE of β̂_RR is less than the variance of β̂_OLS.
In practice, various procedures have been developed to choose the value of λ.
For instance, in [18] the authors proposed to determine a suitable value of λ by
inspecting the ridge trace, which is a plot of the elements of β̂_RR versus λ, where
λ ∈ [0, 1]. The aim is to determine a reasonably small value of λ for which
the ridge estimates are stable. In [19], an appropriate selection of λ is given as
κ = m σ̂² / (β̂ᵀ_OLS β̂_OLS), where β̂_OLS and σ̂² are determined from a least
squares solution.
Of course, these models can be used as an alternative to mitigate the ill-
conditioning problem. However, they are not easily interpretable, whereas an
important purpose of data modeling is interpretability; see [15,16,18].
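As a hedged illustration of Eq. (2.5) (a sketch assuming autoscaled data, not the
book's own code), the ridge estimator can be computed as follows, with λ left as
a free tuning constant:

import numpy as np

def ridge(X, y, lam):
    """Ridge estimator of Eq. (2.5): (X^T X + lam*I)^{-1} X^T y."""
    m = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ y)

# Larger lam shrinks the coefficients and improves the conditioning of
# the normal equations, at the cost of introducing bias (Eq. (2.7)).

Sweeping lam over a grid and plotting the resulting coefficients reproduces the
ridge trace used in [18] to pick a stable value.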
2.2.2 Latent variable regression (LVR) models
Multivariate statistical projection methods such as PCA, PCR, and PLS are com-
monly utilized to handle a high number of highly correlated process variables by
conducting regression on a smaller number of transformed variables (i.e., latent
or principal components), which are linear combinations of the raw measure-
ments. After computing the latent variables in the process being investigated,
these fewer number of variables are then used instead of using the raw data. This
latent variables regression (LVR) approach generally results in well-conditioned
parameter estimates and reliable model predictions [20]. In this section, these
LVR methods are briefly presented. For more details, refer to [21–23]. Before
presenting PCR and PLS regression methods, we present PCA, which is a pop-
ular multivariate dimensionality-reduction approach.
2.2.2.1 Principal component analysis
Feature extraction with PCA
PCA, a dimensionality-reduction approach, is an increasingly popular model-
ing framework for discovering relevant and crucial features from multivariate
data. The foundation of PCA can be traced back to Pearson (1901) [24] and
Hotelling (1933) [25]. By projecting process variables into a lower-dimensional
subspace, PCA reveals the inherent cross-correlation among process vari-
ables [26]. In this regard, PCA latent variables or principal components (PCs)
(also called scores), which consist of linear combinations of physical variables,
can efficiently describe a process in a reduced subspace. PCA-based methods
are now widely applied in data compression [27], pattern recog-
nition, data smoothing, classification [28], and fault detection [29].
PCA does not differentiate between input data X and output data Y. It is ap-
plied to one data set that contains all the process variables involved in the prob-
lem. Here, X is used to represent the whole data set. Let X = [x₁ᵀ, ..., xₙᵀ]ᵀ ∈
R^{n×m} be a dataset gathered from a process having n observations and m vari-
ables.
Let us first discuss an important point before going into any further in de-
tail. When performing PCA on multivariate data, it is assumed that all the data
are on a comparable scale. If scaling of the data is omitted, then certain vari-
ables in the data have to be adjusted to avoid the occurrence of misleading
dominance. Scaling of data changes the covariance matrix and consequently
affects the principal components. Scaling is important for both the variance and
mean adjustments [30]. When the process variables are measured with different
units, the purpose of the usual scaling is to make the variance the same (i.e., to
give standard units), which gives a correlation matrix. Other variance-stabilizing
transformations, such as log transformation, are used in the literature. The most
commonly used scaling converts the variables to zero mean and unit variance.
Each variable xj ∈ Rn, j = 1,...,m, should be scaled to have zero mean and
unit variance prior to using PCA:
    x_{j,s} = (x_j − μ_{x_j}) / σ_{x_j}.                           (2.9)
From now on, we consider that autoscaled data is zero-mean centered with
unit variance,
    X = ⎡ x_{1,1}  x_{1,2}  ...  x_{1,m} ⎤
        ⎢ x_{2,1}  x_{2,2}  ...  x_{2,m} ⎥
        ⎢    ⋮        ⋮      ⋱      ⋮    ⎥
        ⎣ x_{n,1}  x_{n,2}  ...  x_{n,m} ⎦ ∈ R^{n×m}.
The scaled data X can be expressed using singular value decomposition (SVD)
as a product of two factors:
    X = t₁w₁ᵀ + t₂w₂ᵀ + ··· + t_m w_mᵀ = T Wᵀ,                     (2.10)
where T ∈ Rn×m represents a matrix of the principal components (PCs) and
W ∈ Rm×m is the loading matrix. The PCs are linear combinations of the orig-
inal data, and each PC is not correlated with the others. The loading matrix is
frequently calculated through SVD of the covariance matrix S of the data X:
    S = (1/(n − 1)) XᵀX = W Λ Wᵀ,  with  W Wᵀ = Wᵀ W = I_m,        (2.11)

where Λ = diag(σ₁², ..., σ_m²) is a matrix comprising the eigenvalues of S ar-
ranged diagonally in decreasing magnitude. The eigenvalues λ_i are equal to the
variances of the PCs t_i, σ_i² (i.e., var(w_iᵀ X) = λ_i).
In the presence of cross-correlated multivariate data, X, the first l PCs (where
l ≪ m) are sufficient for preserving the relevant information in the original data.
One important step in PCA model development is to select the number of PCs.
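A minimal Python sketch of this procedure is given below (an illustrative
assumption added for this discussion, not code from the book): the data are
autoscaled as in Eq. (2.9), and the loadings and score variances are obtained from
the eigendecomposition of the covariance matrix of Eq. (2.11):

import numpy as np

def pca(X):
    """Return scores T, loadings W, and eigenvalues (score variances)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # autoscaling, Eq. (2.9)
    S = (Xs.T @ Xs) / (Xs.shape[0] - 1)                # covariance, Eq. (2.11)
    eigvals, W = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]                  # decreasing variance
    eigvals, W = eigvals[order], W[:, order]
    T = Xs @ W                                         # scores, Eq. (2.10)
    return T, W, eigvals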
Criteria for selecting the number of principal components to use
A core step in designing LVR approaches is selecting the number of LVs,
l, to appropriately extract relevant information from the received data. In other
words, the prediction performance of the designed LVR model is influenced by
the choice of the number of LVs, l. Accordingly, an appropriate estimation of
the number of LVs is necessary to avoid the problem of the model underfitting
or overfitting the data. Some of these techniques are briefly described below:
• The scree test. The scree plot displays the variance caught by every PC against
the number of the PCs [31]. Then, the number of PCs to retain are obtained
by finding the value of the eigenvalue λ corresponding to the profile with an
elbow shape (i.e., the profile is no longer linear). This identification procedure
is easy to visualize but difficult to automate.
• Parallel analysis. Parallel analysis compares the variance profile to that ob-
tained by assuming independent variables, to determine the number of PCs.
Specifically, l is determined at the point where the two profiles cross [31,32].
• The cumulative percentage variance (CPV) procedure. The CPV procedure
has been commonly employed to find the number of PCs explaining a certain
percentage of the total variance (e.g., 90%) [31]:

    CPV(l) = ( Σ_{i=1}^{l} λ_i / Σ_{i=1}^{m} λ_i ) × 100.          (2.12)

This procedure is attractive since it is intuitive and easy to implement [31];
a short code sketch is given after this list.
• Cross-validation. Basically, the key concept of the cross-validation mech-
anism is splitting the data in training datasets for model construction and
testing datasets for model validation [33]. The model is verified using the
test data, and residuals are generated by comparing the estimated values to
the measured values. In the CV approach, the optimum number of PCs is
determined by using Predictive Sum of Squares (PRESS) statistics [33],
    PRESS_l = Σ_{i=1}^{n} (X_i − X̂_i^{(l)})²,                      (2.13)

where l is the number of PC vectors retained to calculate X̂, i.e., the dimen-
sion of the PCs. The dimensionality is determined by finding the number of
PCs corresponding to the minimum of the PRESS [33].
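For instance, the CPV criterion of Eq. (2.12) can be sketched in a few lines of
Python (the 90% threshold below is one common choice, assumed here for
illustration):

import numpy as np

def select_l_cpv(eigvals, threshold=90.0):
    """Smallest l whose cumulative explained variance reaches threshold (%)."""
    cpv = 100.0 * np.cumsum(eigvals) / np.sum(eigvals)
    return int(np.searchsorted(cpv, threshold) + 1)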
Based on the PCA model, after selecting the appropriate number of PCs
to include in the model, the data matrix X can be expressed as a sum of the
approximated matrix, X̂, and the residual data, E (Fig. 2.1),

    X = T Wᵀ = Σ_{i=1}^{l} t_i w_iᵀ + Σ_{i=l+1}^{m} t_i w_iᵀ = X̂ + E,   (2.14)
where T ∈ Rn×m represents a matrix of the principal components (PCs) and
W ∈ Rm×m is the loading matrix.
FIGURE 2.1 Schematic representation of PCA model.
As described above, the orthogonal eigenvectors of the covariance matrix form
the loading matrix W = (w₁, w₂, ..., w_m), and the eigenvalue λ_i is the variance
of the score t_i. The loading matrix can be partitioned into two parts, Ŵ and W̃,
i.e., W = [Ŵ  W̃]. Here Ŵ = (w₁, w₂, ..., w_l) represents the first l principal
loading vectors (PCs), and W̃ = (w_{l+1}, w_{l+2}, ..., w_m) represents the re-
maining m − l loading vectors. The partition is shown below:
    S = (1/(n − 1)) XᵀX = [Ŵ  W̃] · diag(Λ̂, Λ̃) · [Ŵ  W̃]ᵀ.          (2.15)
The data matrix X can be factorized as

    X = [T̂ | T̃] [Ŵ | W̃]ᵀ = T̂ Ŵᵀ + T̃ W̃ᵀ
      = X Ŵ Ŵᵀ + X (I_m − Ŵ Ŵᵀ) = X̂ + E.                          (2.16)
Here T̂ ∈ R^{n×l} is the PC matrix, which describes the values of the variables
in the transformed l-dimensional space spanned by Ŵ, where l is chosen to cap-
ture most of the variability in the data so that no relevant information is lost in E.
The matrices Ŵ Ŵᵀ and (I_m − Ŵ Ŵᵀ) span the principal component and resid-
ual subspaces, respectively. The row vectors in X̂ and E are orthogonal, i.e.,
X̂ᵀ E = 0.
The residual matrix plays a core role in uncovering abnormal features in
process monitoring. For the purpose of anomaly detection, we will evaluate the
generated residuals based on the developed PCA reference model by univariate
or multivariate statistical monitoring schemes. More details on process monitor-
ing are given in the subsequent sections.
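As a hedged sketch of the decomposition in Eqs. (2.14) and (2.16) (function and
variable names are assumptions for illustration), the modeled and residual parts
can be computed by projecting the scaled data onto the retained loadings:

import numpy as np

def pca_decompose(Xs, W, l):
    """Split autoscaled data into modeled part X_hat and residuals E."""
    W1 = W[:, :l]              # retained loadings spanning the PC subspace
    X_hat = Xs @ W1 @ W1.T     # X * W_hat * W_hat^T, Eq. (2.16)
    E = Xs - X_hat             # residual part, evaluated for fault detection
    return X_hat, E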
2.2.2.2 Principal component regression
PCR is an alternative to OLS regression for addressing the issue of ill-
conditioning or collinearity in multivariate linear regression, which results in
a poor estimation of the model parameters. PCR is a linear regression approach
that can handle highly correlated process variables by using latent variables as
regressors. It can be implemented in two steps. The first step in PCR
consists of projecting the input variables via PCA to account for collinearity and
reduce their dimensions. To this end, SVD is frequently employed to compute
the PCs. In the second step, OLS regression is conducted between the retained
PCs and the response [14,11] (Fig. 2.2).
To sum up, the key idea of PCR is to use uncorrelated l score vectors from
the PCA instead of the l columns in X. Specifically, the multicollinearity among
the predictor variables can be eliminated by using a subset of orthogonal PCs
from the input data X via PCA. Then, OLS is performed between the response
variable Y and the retained l PCs of X. From the PCA model, the matrix X can
FIGURE 2.2 Schematic representation of (A) MLR and (B) PCR models.
be decomposed as follows:

    X = T Wᵀ = Σ_{i=1}^{l} t_i w_iᵀ + Σ_{i=l+1}^{m} t_i w_iᵀ = X̂ + E,   (2.17)
where T ∈ R^{n×m} represents the matrix of PCs and W ∈ R^{m×m} is the loading
matrix. Then, a subset of these PCs (those with the largest variance) is utilized to
build a linear model relating these PCs to the response variable, y, using OLS
regression,

    y = T̂ β̂,                                                      (2.18)

where T̂ = [t₁ ... t_l] contains the retained PCs (those with the largest eigenval-
ues) used to construct the model, with l ≤ m; l is selected such that no important
process information is lost in the residuals. The regression matrix β̂ is obtained
by solving the following minimization problem:

    min_β { ‖T̂β − y‖²₂ },                                          (2.19)

    β̂ = (T̂ᵀ T̂)⁻¹ T̂ᵀ y.                                            (2.20)
Note that PCR is equivalent to OLS if all PCs are used in designing the PCR
model (i.e., l = m). In the case of uncorrelated input variables, OLS would be
the first option in regression. Note that all PCs in PCR are determined without
taking the model response into consideration. Next we present another approach
to cope with the multicollinearity problem, which takes into account the input–
output relationship when determining the PCs, called partial least squares (PLS).
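The two PCR steps can be sketched as follows (a minimal illustration, assuming
autoscaled X and a centered response y, not the book's implementation):

import numpy as np

def pcr(X, y, l):
    """PCA on X, then OLS of y on the l retained scores (Eqs. (2.17)-(2.20))."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    eigvals, W = np.linalg.eigh((Xs.T @ Xs) / (Xs.shape[0] - 1))
    W = W[:, np.argsort(eigvals)[::-1]][:, :l]    # top-l loadings
    T = Xs @ W                                    # retained scores, T_hat
    beta = np.linalg.solve(T.T @ T, T.T @ y)      # Eq. (2.20)
    return W, beta   # prediction for new scaled data Xn: (Xn @ W) @ beta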
2.2.2.3 Partial least squares
This section introduces the PLS regression modeling (also known as the pro-
jection on latent structures), which was first proposed in [34] in the field of
econometrics. Later in [35] a detailed PLS algorithm was provided. In [36], the
geometry of two procedures to perform PLS has been illustrated. This technique
is used in chemometrics and chemical engineering for soft sensor develop-
ment [37], process monitoring, and fault diagnosis.
The capacity of PLS to deal with multivariate input–output data with
collinearity is one of its desirable characteristics [38]. When the matrix XT X
is singular or ill-conditioned, PLS determines an optimum pair of LVs in the
input and output data (X and Y) so that these transformed variables have the
largest covariance. Unlike PCR, PLSR exploits the information in input and out-
put variables by using the covariance between them and reducing the impact of
irrelevant variations of input variables. In other words, PLSR is designed using
both PCs of X and Y. Basically, the PLS model is performed by searching a set
of PCs that explains the maximum cross-correlation between X and Y (Fig. 2.3).
FIGURE 2.3 Schematic representation of PLS model.
Consider an input with n samples and m variables, X ∈ Rn×m, and output
with n samples and p variables, Y ∈ Rn×p. PLS extracts the principal com-
ponents iteratively by maximizing the covariance of the extracted principal
components. PLS model development has two components, one is to develop
inner models and the other is to develop outer models [39,40]. Outer models
have a relationship with the inner model such that

    X = Σ_{i=1}^{l} t_i p_iᵀ + G = T Pᵀ + G,
    Y = Σ_{i=1}^{l} u_i q_iᵀ + F = U Qᵀ + F,                       (2.21)
where T ∈ Rn×l and U ∈ Rn×q are matrices of the transformed uncorrelated
variables. The loading matrices of input and output space are P ∈ Rm×l and
Q ∈ Rp×q, respectively. The model residuals are G and F. The number of PCs,
l, is determined by cross-validation.
The retained latent variables of the input and output space are related by the
linear inner model as
U = TB + H, (2.22)
where B is a regression matrix linking the input and response PCs, and H is a
residual matrix. The regression coefficients of B can be obtained by minimiza-
tion of residuals H. The response Y is given as
    Y = T B Qᵀ + F*.                                               (2.23)
Notice that each pair of latent variables in the PLS model (i.e., tj and uj
(j = 1,...,l)) is estimated iteratively [35,41]. Various procedures are developed
in the literature to obtain PLS estimators, including nonlinear iterative partial
least squares (NIPALS) and SIMPLS methods. For more details, refer to [35,36,
34,12].
The first pair of latent variable vectors is calculated so that the covariance

    argmax_{p₁,q₁} cov(t₁, u₁) = t₁ᵀ u₁ = p₁ᵀ Xᵀ Y q₁              (2.24)

is maximized, subject to the constraints p_iᵀ p_i = 1 and q_iᵀ q_i = 1.
The first pair (p1,q1) of loading vectors, which represents the dominant di-
rection, is computed so that the covariance between X and Y is maximized.
Then, the first set of latent variable vectors (t1 = Xp1;u1 = Yq1) is obtained by
projecting the X data on p₁ and the Y data on q₁ (the outer model). After that,
the inner model can be established between t₁ and u₁ (û₁ = t₁ b₁).
After the first set of scores and loadings are computed, the residuals of the
input and output variables are calculated as

    E₁ = X − t₁ p₁ᵀ,
    F₁ = Y − u₁ q₁ᵀ = Y − t₁ b₁ q₁ᵀ.                               (2.25)
Overall, PLS iteratively estimates both LVs for X and Y, so that they have
maximal covariance. These pairs of LVs are estimated and added to the model in
an iterative way. The input and output residuals are generated and the procedure
is iterated based on the residual until cross-validation error is minimized [11,35,
34,14]. Fig. 2.4 illustrates the recursive process of determining the LVs in PLS.
FIGURE 2.4 Schematic representation of the recursive procedure to determine the PCs in PLS.
The NIPALS algorithm, which is commonly used to derive PLS models, is
summarized below [42]:
Step 1. Scale X and Y to have zero mean and unit variance
Step 2. Set u equal to a column of Y
Step 3. Compute the input weights: w = Xᵀu / (uᵀu)
Step 4. Normalize w to have unit length
Step 5. Evaluate the scores: t = Xw / (wᵀw)
Step 6. Compute the output loadings, q = Yᵀt / (tᵀt), and evaluate the new u
vector: u = Yq / (qᵀq)
Step 7. Check convergence on u: if YES go to Step 8, if NO go to Step 3
Step 8. Evaluate the X loadings: p = Xᵀt / (tᵀt)
Step 9. Evaluate the residual matrices: E = X − tpᵀ and F = Y − tqᵀ
Step 10. If additional PLS dimensions are necessary, then replace X and Y by
E and F, respectively, and repeat Steps 2 through 9.
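A compact Python sketch of these steps is given below (one of several possible
NIPALS variants; the convergence tolerance and deflation details are assumptions
for illustration). X has shape (n, m) and Y has shape (n, p), both autoscaled:

import numpy as np

def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """Extract PLS components pairwise by iterating Steps 2-9 above."""
    X, Y = X.copy(), Y.copy()
    T, P, Q = [], [], []
    for _ in range(n_components):
        u = Y[:, [0]]                          # Step 2: a column of Y
        for _ in range(max_iter):
            w = X.T @ u / (u.T @ u)            # Step 3: input weights
            w /= np.linalg.norm(w)             # Step 4: normalize w
            t = X @ w                          # Step 5: input scores
            q = Y.T @ t / (t.T @ t)            # Step 6: output loadings
            u_new = Y @ q / (q.T @ q)          #         new output scores
            if np.linalg.norm(u_new - u) < tol:  # Step 7: convergence on u
                u = u_new
                break
            u = u_new
        p = X.T @ t / (t.T @ t)                # Step 8: X loadings
        X = X - t @ p.T                        # Step 9: deflate, E = X - t p^T
        Y = Y - t @ q.T                        #          F = Y - t q^T
        T.append(t); P.append(p); Q.append(q)
    return np.hstack(T), np.hstack(P), np.hstack(Q)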
Since PLS is using a covariance objective function, it frequently needs mul-
tiple LVs even in the case of a single output in Y. However, sometimes an
important part of the LV subspace is orthogonal or irrelevant to the output, de-
spite the fact that the subspace includes large variability of the input data [43].
Thus, to further improve PLS, numerous extensions have been developed such
as orthogonalized PLS [44] and concurrent PLS approaches [45].
Note that the above described LVR methods all exploit the latent structured
relationships between the process variables that are linear and static. They es-
tablish the basic framework for further enhancements to nonlinear or dynamic
LVR modeling.
2.3 Dynamic LVR models
From the above discussion, we have shown that LVR models such as PLS and
PCR can be used to handle multivariate data with collinearity among the vari-
ables by designing a model from a reduced number of variables (which are
a linear combination of the original variables) termed latent variables. These
methods result in well-conditioned models. However, LVR models are static
and ignore process dynamics, which makes them unsuitable for capturing the tem-
poral evolution of the data. In other words, the use of such methods to select the key
variables is performed by assuming that the variables are uncorrelated in time.
Since many practical data produced from engineering and environmental pro-
cesses are correlated in time, it is necessary to have a model incorporating such
information to deal with process dynamics.
For dynamic processes such as engineering and chemical processes, fre-
quently the actual observations of the process variable depend on past observa-
tions. The application of static LVR approaches (e.g., PLS and PCA) to dynamic
data will not give accurate modeling of the relations among the variables, but
just a linear static approximation. To remedy this limitation and consider the
dynamic information, an augmented process dataset, including previously au-
tocorrelated measurements, should be created. A commonly used approach to
bypass such limitations is dynamic PCA (DPCA), which has been introduced
in [46]. Basically, DPCA is the conventional PCA applied to augmented data
including time-lagged measurements of process variables. Specifically, to de-
scribe the temporal dynamics, the Hankel matrix of the original data, which is
usually employed in time series modeling, is used in [46]. The augmented data
that includes time-lagged variables at time instance k is

    X_z = [X(k)  X(k − 1)  ...  X(k − z)]

        = ⎡ xᵀ(0)       xᵀ(1)         ...  xᵀ(z)     ⎤
          ⎢ xᵀ(1)       xᵀ(2)         ...  xᵀ(z + 1) ⎥
          ⎢    ⋮           ⋮           ⋱      ⋮      ⎥
          ⎣ xᵀ(n − z)   xᵀ(n − z + 1)  ...  xᵀ(n)    ⎦,            (2.26)

where z is the number of time lags, which determines how much past memory
enters the variables.
The DPCA is applied to the augmented process data matrix in a similar way
to conventional PCA [46]. Indeed, this is basically the same as the static PCA
except that the input data is augmented to include past measurements. The se-
lection of the appropriate number of lagged data plays a key role in DPCA to
appropriately model the process dynamics. For highly nonlinear data, the num-
ber of lags, z, to incorporate in the data may take a higher value to achieve better
linear approximation. DPCA modeling can be outlined in the following steps:
(1) Start with z = 0
(2) Compute the augmented data matrix Xz
(3) Design PCA model using the augmented data
(4) Select the optimal PCs to be kept in the model using some known criteria
such as Cumulative Percent Variance (CPV) approach
(5) Check the autocorrelation function (ACF) of the residuals of the PCA model
(6) If ACF is within the threshold, i.e., the residuals are not correlated, go to
Step (8), otherwise, proceed
(7) Increment the number of lags z = z + 1 and go to Step (2)
(8) End
The essence of DPCA is to apply PCA using time-lagged data, thus both
the linear static and dynamic relationships among process variables are cap-
tured. To sum up, DPCA exploits both the desirable characteristics of PCA to
high-dimensional data and the flexibility of the time series model, Autoregres-
sive Integrated Moving Average (ARIMA), to capture the time dependency in
data [47,48].
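Constructing the augmented matrix of Eq. (2.26) is straightforward; the sketch
below (an illustrative helper, not from the book) stacks z lagged copies of the
data column-wise:

import numpy as np

def lagged_matrix(X, z):
    """Rows are [x(k), x(k-1), ..., x(k-z)] for k = z, ..., n-1."""
    n = X.shape[0]
    blocks = [X[z - j : n - j] for j in range(z + 1)]  # lag-j block
    return np.hstack(blocks)

# DPCA then applies an ordinary PCA routine to lagged_matrix(X, z),
# increasing z until the model residuals are serially uncorrelated.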
On the other hand, several approaches are designed in the literature to handle
dynamics in multivariate input–output processes based on dynamic PLS. One
approach consists of incorporating a large number of time-lagged input mea-
surements in the input data matrix X, which leads to a PLS-Finite Impulse
Response (FIR) model [49]. Analogous to DPCA, both the time-lagged data of
the input and output process variables are included in the input data matrix X,
which results in the PLS-Autoregressive Moving Average (ARMA) model. Both
PLS-FIR and PLS-ARMA models need a large augmentation in the dimension
of the input data matrix X, which may be cumbersome to handle. To remedy
this difficulty, in [50] a simple and flexible method is presented that permits the
inclusion of the process dynamics as part of the inner PLS model and avoids the
need to include many time-lagged input and output variables in the input data.
The key benefit of this approach is that no lagged variables are used in
the PLS outer model. In [51], a dynamic Autoregressive with Exogenous Terms
(ARX) or Hammerstein model is used to account for process dynamics in PLS,
for inner relation between ti and ui instead of a static model.
The aforementioned LVR methods are all extensively used for multivariate
process monitoring. To do so, these LVR methods are combined with fault detec-
tion indices such as the Hotelling T 2 and the squared prediction error schemes.
The general framework of LVR-based process monitoring strategies is presented
in Sect. 2.5.
2.4 Process monitoring methods
Detecting anomalies in industrial processes plays a core role in developing
efficient production systems that have acceptable performance and meet the de-
sired requirements and specifications. Without an efficient detection procedure,
chemical processes such as distillation columns could be damaged by unex-
pected faults, resulting in financial losses and serious damage. Univariate
statistical monitoring schemes are widely applied in numerous production pro-
cesses as tools for checking product quality when the inspected variable is
univariate. The goal of statistical process monitoring schemes is to uncover any
deviation of the supervised process from the desired performance. For many
decades, these univariate schemes were frequently applied in quality control
applications, and now they have been extended to many other fields, such as
air quality [29], cybersecurity [52], healthcare systems [53,54], and economics
[53]. In this section, we describe the essence of some basic univariate monitor-
ing schemes, such as Shewhart, CUSUM, EWMA, and GLR charts.
2.4.1 Univariate chart for process monitoring
In this subsection, we summarize univariate process monitoring charts including
Shewhart, CUSUM, EWMA, and GLR.
2.4.1.1 Shewhart-based monitoring scheme
Shewhart introduced the Shewhart monitoring scheme in 1931 to supervise the
product quality at different phases of a manufacturing process [55]. In practice,
this monitoring chart is one of the most frequently applied statistical quality con-
trol schemes [55]. Instead of waiting to examine the quality of the final product,
early inspection and monitoring would enable companies to save costs with re-
gard to inspection and rejection of the finished product [55]. This would help
ensure that uniform quality of products is maintained, thus leading to increased
economic benefits and improved time efficiency. Statistical decisions in She-
whart schemes are based on current observations and no memory about the past
is considered. Thus, they are suitable for detecting relatively large faults. The
Shewhart chart is used online to evaluate the process performance based on the
current measured data.
Consider that (x1,x2,...,xn) are individual observations received from the
supervised process. Shewhart schemes are designed under the assumption that
the measurements are uncorrelated and the data under normal operating condi-
tions are normally distributed [55]. If these two assumptions are verified, the
control limits of the Shewhart chart are defined as [55,56]
    UCL, LCL = μ₀ ± z_{1−α/2} σ₀,                                  (2.27)

where UCL and LCL denote the upper control limit and the lower control limit,
and z_{1−α/2} is the (1 − α/2)th quantile of the Gaussian distribution N(0, 1).
Also, μ₀ and σ₀ represent the mean and standard deviation of the measurements
without anomalies. The term z_{1−α/2} is usually called the width of the control
limits and is generally fixed at 3, which is equivalent to a false alarm rate
of 0.27%. The Shewhart scheme flags a fault if
    x_t < LCL  or  x_t > UCL.                                      (2.28)
In summary, the performance of the Shewhart charts is limited when utilized
to sense small changes in the process mean. They consider only the current
measurement of the process, thus they are classified as detection charts with-
out memory. To tackle this deficiency, improved mechanisms with increased
process memory would be very helpful. Memory-type monitoring approaches,
such as CUSUM, moving average, and EWMA charts, are designed to detect
small changes.
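A minimal sketch of the Shewhart rule of Eqs. (2.27)-(2.28) follows (illustrative;
mu0 and sigma0 are assumed to be estimated from fault-free data, and the usual
3σ width is taken):

import numpy as np

def shewhart_alarms(x, mu0, sigma0, L=3.0):
    """Flag samples outside the UCL/LCL of Eq. (2.27)."""
    x = np.asarray(x)
    ucl, lcl = mu0 + L * sigma0, mu0 - L * sigma0
    return (x > ucl) | (x < lcl)    # True where Eq. (2.28) flags a fault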
2.4.1.2 Cumulative sum (CUSUM)-based monitoring schemes
Cumulative sum (CUSUM) monitoring schemes are well reputed in fault de-
tection and were first introduced by Page [57]. Compared to Shewhart-type
approaches, the CUSUM schemes are a suitable alternative for detecting small
changes, which are often a major concern in process monitoring [57]. Instead of
using only the current measurement, the CUSUM scheme exploits all the avail-
able information from previous and current measurements to uncover faults. The
CUSUM statistic (Si) is determined as [58]
    S_i = Σ_{j=1}^{i} (x_j − μ₀),                                  (2.29)
where Si is the cumulative sum of all available measurements including the
current and previously received measurements, and μ0 is the fault-free process
mean. The CUSUM decision function is obtained in a recursive manner as [58]
    S_i = (x_i − μ₀) + S_{i−1}.                                    (2.30)
One-sided CUSUM statistic is calculated as follows [58]:
    S_i = Σ_{j=1}^{i} [ x_j − (μ₀ + k) ],                          (2.31)
where k is a parameter that is employed as a reference for detecting a change
in the process mean. If S_i becomes negative, the CUSUM decision
statistic is automatically reset to zero. A fault is flagged when the CUSUM
statistic S_i exceeds the decision threshold, H. In practice, a threshold H
of 4σ or 5σ, which results in good detection of a deviation of about 1σ in the
process mean, is recommended [59]. Here σ is the standard deviation of the
monitored variable.
Numerous variations of the CUSUM exist; one of the most common forms
is the two-sided CUSUM (tabular) [56]. The recursive formula for high and low
side shifts are:

    S_t⁺ = max[ 0, x_t − (μ₀ + k) + S_{t−1}⁺ ],                    (2.32)
    S_t⁻ = max[ 0, (μ₀ − k) − x_t + S_{t−1}⁻ ],                    (2.33)
where the statistics S⁺ and S⁻ are respectively the upper and lower one-sided
CUSUMs, initialized with S₀⁺ = S₀⁻ = 0. A fault is declared if either S_t⁻ or
S_t⁺ exceeds the decision threshold H = hσ, where h depends on the shift to be
detected.
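The two-sided recursion of Eqs. (2.32)-(2.33) can be sketched as follows (an
illustration; k = 0.5σ and h = 5 follow the typical recommendations above and
are assumptions, not fixed rules):

import numpy as np

def cusum_alarms(x, mu0, sigma, k=0.5, h=5.0):
    """Tabular CUSUM; k and h are expressed in units of sigma."""
    s_plus, s_minus = 0.0, 0.0
    alarms = np.zeros(len(x), dtype=bool)
    for i, xi in enumerate(x):
        s_plus = max(0.0, xi - (mu0 + k * sigma) + s_plus)    # Eq. (2.32)
        s_minus = max(0.0, (mu0 - k * sigma) - xi + s_minus)  # Eq. (2.33)
        alarms[i] = (s_plus > h * sigma) or (s_minus > h * sigma)
    return alarms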
2.4.1.3 Exponentially weighted moving average (EWMA) schemes
While CUSUM schemes consider all available measurements with equal weight
in process monitoring, EWMA schemes exponentially weight the measure-
ments based on their importance in characterizing the process [60]. The EWMA
shows suitable performance in detecting small changes in the process mean. The
EWMA scheme was first designed by Roberts [61], and was frequently applied
in quality control and process monitoring [56]. The EWMA monitoring statistic
is defined as follows:

    z₀ = μ₀,
    z_t = γ x_t + (1 − γ) z_{t−1},                                 (2.34)

where z₀ = μ₀ is the mean of the fault-free data, and γ is a weighting factor with
range 0 < γ ≤ 1, which defines the temporal memory of the EWMA scheme.
Eq. (2.34) indicates that the EWMA statistic utilizes all the available informa-
tion to sense small anomalies. To highlight this point, the EWMA statistic, z_t,
can be expressed recursively as

    z_t = γ x_t + (1 − γ) z_{t−1}
        = γ x_t + (1 − γ) [ γ x_{t−1} + (1 − γ) z_{t−2} ]
        = γ x_t + γ(1 − γ) x_{t−1} + (1 − γ)² z_{t−2}
        ⋮
    z_n = γ x_n + γ(1 − γ) x_{n−1} + γ(1 − γ)² x_{n−2} + ···
          + γ(1 − γ)^{n−1} x₁ + (1 − γ)ⁿ z₀.                       (2.35)

The EWMA decision function in (2.35) can be expressed in compact form as

    z_n = γ Σ_{t=1}^{n} (1 − γ)^{n−t} x_t + (1 − γ)ⁿ z₀,           (2.36)

where γ(1 − γ)^{n−t} denotes the weight for x_t, which decreases exponentially for
previous observations. The parameters L and γ play an important role in design-
ing the EWMA scheme [56,54]. The value of L is frequently fixed in practice to
be 3, which implies a false alarm rate of 0.27%. Generally, a choice of small val-
ues of γ (i.e., where less importance is placed on the newer observations) is used
in order to extend the sensitivity to small deviations, while the use of large val-
ues of γ (i.e., EWMA with short memory) is suited for detecting larger changes
in the process mean [56,62,56]. For the purpose of detecting small changes, in
practice the value of γ is usually selected in the interval [0.1,0.3] [62,56].
In the absence of anomalies, the distribution of the EWMA statistic is
z_t ∼ N(μ₀, σ_{z_t}²), where σ_{z_t} = σ₀ √( [γ/(2 − γ)] [1 − (1 − γ)^{2t}] )
and σ₀ represents the standard deviation of the fault-free measurements. However,
when a mean shift occurs at the time 1 ≤ τ ≤ n, the distribution of the EWMA
statistic is computed as z_t ∼ N( μ₀ + [1 − (1 − γ)^{n−τ+1}](μ₁ − μ₀), σ_{z_t}² ).
Accordingly, when faults happen, the mean of the EWMA decision function, z_t,
is a weighted average of μ₀ and μ₁, and the weight related to μ₁ becomes large
when n is large. This clearly highlights that the statistic z_t provides pertinent
information about the mean shift. The EWMA scheme flags faults when the
monitoring statistic z_t, as given in (2.34), exceeds the upper or lower control
limit defined as

    UCL = μ₀ + L σ_{z_t},
    LCL = μ₀ − L σ_{z_t},                                          (2.37)

where μ₀ is the targeted mean, L is the width of the control limits, and σ₀ is the
standard deviation of the fault-free or preliminary data set.
From σ_{z_t}, it can be seen that as t becomes large, the term [1 − (1 − γ)^{2t}]
is asymptotically equivalent to unity. In other words, the control limits attain
their steady-state values [56]:

    UCL = μ₀ + L σ₀ √( γ/(2 − γ) ),
    LCL = μ₀ − L σ₀ √( γ/(2 − γ) ).                                (2.38)
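The EWMA recursion of Eq. (2.34) with the steady-state limits of Eq. (2.38) can
be sketched as follows (illustrative; γ = 0.2 and L = 3 are the typical choices
mentioned above, assumed here):

import numpy as np

def ewma_alarms(x, mu0, sigma0, gamma=0.2, L=3.0):
    """EWMA chart with steady-state control limits, Eqs. (2.34) and (2.38)."""
    half_width = L * sigma0 * np.sqrt(gamma / (2.0 - gamma))
    z = mu0
    alarms = np.zeros(len(x), dtype=bool)
    for i, xi in enumerate(x):
        z = gamma * xi + (1.0 - gamma) * z      # Eq. (2.34)
        alarms[i] = abs(z - mu0) > half_width   # outside UCL/LCL of Eq. (2.38)
    return alarms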
As described previously, in the Shewhart schemes, anomaly detection is
based only on the current measurement and all past measurements are ignored
(Fig. 2.5). Accordingly, these schemes provide unsatisfactory monitoring re-
sults when used for sensing small changes in the process mean. This limitation
can be mitigated by incorporating the information from the actual and the past
measurements in the decision process such as in EWMA and CUSUM schemes
(Fig. 2.5). In the CUSUM scheme, information from all available measurements
are exploited and the same weight is assigned to all observations (Fig. 2.5). On
the other hand, the EWMA scheme, which is designed by using an exponentially
weighted average of all available measurements, is also sensitive in detecting
small changes in the process mean.
FIGURE 2.5 Univariate process monitoring charts.
In EWMA schemes, a larger value of the smoothing parameter is suited
to rapidly detect faults with a large amplitude, while a smaller value can ef-
ficiently detect small faults in the mean of the process [60]. Therefore, by
using a unique value for the smoothing parameter, monitoring-based EWMA
schemes cannot reach a good detection capacity for both small and large faults
simultaneously [60]. Moreover, the univariate EWMA control schemes assume
fixed thresholds, which may not be suitable for dealing with nonstationary (or
time-varying) data. Therefore, several adaptive EWMA and CUSUM methods have
been designed in the literature by allowing the thresholds of these methods to
vary online to account for the changing nature of the data [63,64]. The idea
behind the adaptive EWMA is to adapt the weight of the past observations, ac-
cording to the magnitude of the error (et = xt − zt−1, see (2.39)), and to detect
in a more balanced way faults with different sizes:

    z_t = γ x_t + (1 − γ) z_{t−1} = γ (x_t − z_{t−1}) + z_{t−1} = γ e_t + z_{t−1}.   (2.39)
Also, several adaptive CUSUM (ACUSUM) schemes have been developed in
the literature to achieve suitable detection performance covering a range of mean
change magnitudes [64,65]. For instance, the basic idea behind the ACUSUM
proposed in [64] is to update the reference value (K) in CUSUM based on the
EWMA estimate.
2.4.1.4 Generalized likelihood ratio (GLR) hypothesis testing
approach
The above-described monitoring schemes (i.e., Shewhart, CUSUM, and EWMA)
are more or less suited to some specific range of fault amplitudes. For instance,
Shewhart-type approaches provide satisfactory detection of large faults, but they
are insensitive to small changes in the process mean [54,59]. While CUSUM
and EWMA schemes are effective in detecting small changes, they fail to detect
large faults. However, in practice, the magnitude of occurring faults is unknown.
Accordingly, it is desirable to automatically detect a large range of faults and
thus reduce the rate of missed detection. To this end, one approach to achieve
a reliable detection of different sizes of process anomalies is to base the moni-
toring scheme on a generalized likelihood ratio test (usually called GLR charts)
[66]. The benefits of the GLR approach are its efficiency in separating com-
posite hypotheses, simplicity, and absence of complex computations. Extensive
literature has been dedicated to studying GLR properties. Significant efforts have
been devoted to establishing different asymptotic optimality properties of this
hypothesis testing approach, which can be found in [67–71]. The GLR detector is
widely used in several applications including air quality monitoring [29] and
train safety [66].
Here, we consider problems related to binary composite hypothesis test-
ing. When testing two composite hypotheses in which their corresponding data
probability density functions (PDFs) comprise unknown parameters, the GLR
approach is commonly utilized for separating the two possibilities. The null hy-
pothesis generally defines the nominal operating situation, while the alternatives
characterize departures whose presence should be either confirmed or discarded.
The essence of the GLR approach is to maximize the likelihood ratio statistic
over all possible faults to decide between two composite hypotheses [68–71].
In other words, the aim of the GLR approach is to separate two composite hy-
potheses, H0 and H1, based on the observed data.
For the purpose of anomaly detection, let’s consider an observation vector
Y = [y1,y2,...,yn] ∈ Rn being generated by one of these Gaussian distribu-
tions:

    H₀ : Y ∼ N(0, σ²Iₙ),
    H₁ : Y ∼ N(θ ≠ 0, σ²Iₙ),                                       (2.40)
where θ is the value of the anomaly and σ² > 0 is the variance. In this chapter,
the null hypothesis, H0, represents the fault-free situation, and the alternative
hypothesis, H1, represents the situation with potential faults. Generally speak-
ing, to decide between the two hypotheses, the GLR approach compares the
decision statistic, L(Y), to the control limit, h(α):
    δ(Y) = { H₀  if L(Y) = 2 log sup_{θ∈Rⁿ} [ f_θ(Y) / f_{θ=0}(Y) ] < h(α),
           { H₁  otherwise.                                        (2.41)
The GLR charting statistic, L(Y), is given as

    L(Y) = 2 log sup_θ { exp( −‖Y − θ‖²₂ / (2σ²) ) / exp( −‖Y‖²₂ / (2σ²) ) },   (2.42)
where ‖·‖₂ is the Euclidean norm and
f_θ(Y) = (2π)^{−n/2} σ^{−n} exp{ −‖Y − θ‖²₂ / (2σ²) } is the pdf of Y. Then,
(2.42) can be expressed as
    L(Y) = (1/σ²) ( ‖Y‖²₂ − min_θ ‖Y − θ‖²₂ ) = (1/σ²) ( ‖Y‖²₂ − ‖Y − θ̂‖²₂ ).   (2.43)
After estimating θ as θ̂ = argmin_θ ‖Y − θ‖²₂ = Y, L(Y) can be expressed as

    L(Y) = (1/σ²) ‖Y‖²₂.                                           (2.44)
The control limit, h(α), is defined to achieve the desired probability of false
alarms, selected a priori:
P0 (L(Y) ≥ h(α)) =
* ∞
h
f0(y)dy = 1 − Fχ2
1
(h) = α. (2.45)
The power function of the GLR approach is determined as

    β_{δ*}(θ) = P_θ( δ*(Y) = H₁ ) = 1 − F_{1,γ(θ)}(h),             (2.46)

where F_{1,γ(θ)} is the noncentral χ²(1, γ) distribution with one degree of free-
dom, and the noncentrality parameter is γ(θ) = (1/σ²) ‖P_H^⊥ θ‖²₂.
In summary, a fault is flagged by the GLR approach when the decision statis-
tic, L(Y), exceeds the control limit, h(α). Otherwise, the supervised process is
performing normally.
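A hedged sketch of this decision rule is given below (an illustration, not the
book's code): the statistic of Eq. (2.44) is compared with a chi-square quantile,
where the degrees of freedom are set here to the dimension of Y, reducing to the
χ²₁ threshold of Eq. (2.45) in the scalar case:

import numpy as np
from scipy.stats import chi2

def glr_decision(Y, sigma, alpha=0.01):
    """Return (flag, statistic, threshold) for the GLR test of Eq. (2.44)."""
    Y = np.asarray(Y, dtype=float)
    stat = float(Y @ Y) / sigma**2          # L(Y) = ||Y||^2 / sigma^2
    h = chi2.ppf(1.0 - alpha, df=Y.size)    # threshold h(alpha), cf. Eq. (2.45)
    return stat > h, stat, h                # True -> decide H1 (fault)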
Literary Archive Foundation are tax deductible to the full extent
permitted by U.S. federal laws and your state’s laws.
The Foundation’s business office is located at 809 North 1500 West,
Salt Lake City, UT 84116, (801) 596-1887. Email contact links and up
to date contact information can be found at the Foundation’s website
and official page at www.gutenberg.org/contact
Section 4. Information about Donations to
the Project Gutenberg Literary Archive
Foundation
Project Gutenberg™ depends upon and cannot survive without
widespread public support and donations to carry out its mission of
increasing the number of public domain and licensed works that can
be freely distributed in machine-readable form accessible by the
widest array of equipment including outdated equipment. Many
small donations ($1 to $5,000) are particularly important to
maintaining tax exempt status with the IRS.
The Foundation is committed to complying with the laws regulating
charities and charitable donations in all 50 states of the United
States. Compliance requirements are not uniform and it takes a
considerable effort, much paperwork and many fees to meet and
keep up with these requirements. We do not solicit donations in
locations where we have not received written confirmation of
compliance. To SEND DONATIONS or determine the status of
compliance for any particular state visit www.gutenberg.org/donate.
While we cannot and do not solicit contributions from states where
we have not met the solicitation requirements, we know of no
prohibition against accepting unsolicited donations from donors in
such states who approach us with offers to donate.
International donations are gratefully accepted, but we cannot make
any statements concerning tax treatment of donations received from
outside the United States. U.S. laws alone swamp our small staff.
Please check the Project Gutenberg web pages for current donation
methods and addresses. Donations are accepted in a number of
other ways including checks, online payments and credit card
donations. To donate, please visit: www.gutenberg.org/donate.
Section 5. General Information About
Project Gutenberg™ electronic works
Professor Michael S. Hart was the originator of the Project
Gutenberg™ concept of a library of electronic works that could be
freely shared with anyone. For forty years, he produced and
distributed Project Gutenberg™ eBooks with only a loose network of
volunteer support.
Project Gutenberg™ eBooks are often created from several printed
editions, all of which are confirmed as not protected by copyright in
the U.S. unless a copyright notice is included. Thus, we do not
necessarily keep eBooks in compliance with any particular paper
edition.
Most people start at our website which has the main PG search
facility: www.gutenberg.org.
This website includes information about Project Gutenberg™,
including how to make donations to the Project Gutenberg Literary
Archive Foundation, how to help produce our new eBooks, and how
to subscribe to our email newsletter to hear about new eBooks.
Statistical Process Monitoring using Advanced Data-Driven and Deep Learning Approaches
Theory and Practical Applications

Fouzi Harrou
King Abdullah University of Science and Technology, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal, Saudi Arabia

Ying Sun
King Abdullah University of Science and Technology, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal, Saudi Arabia

Amanda S. Hering
Baylor University, Dept. of Statistical Science, Waco, TX, United States

Muddu Madakyaru
Department of Chemical Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India

Abdelkader Dairi
University of Science and Technology of Oran-Mohamed Boudiaf, Computer Science Department, Signal, Image and Speech Laboratory, Oran, Algeria
Elsevier
Radarweg 29, PO Box 211, 1000 AE Amsterdam, Netherlands
The Boulevard, Langford Lane, Kidlington, Oxford OX5 1GB, United Kingdom
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

Copyright © 2021 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility.

To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

ISBN: 978-0-12-819365-5

For information on all Elsevier publications visit our website at https://www.elsevier.com/books-and-journals

Publisher: Susan Dennis
Acquisitions Editor: Anita A Koch
Editorial Project Manager: Lena Sparks
Production Project Manager: Kumar Anbazhagan
Designer: Miles Hitchen
Typeset by VTeX
Contents

Preface ix
Acknowledgments xi

1. Introduction
   1.1 Introduction 1
       1.1.1 Motivation: why process monitoring 1
       1.1.2 Types of faults 2
       1.1.3 Process monitoring 4
       1.1.4 Physical redundancy vs analytical redundancy 5
   1.2 Process monitoring methods 6
       1.2.1 Model-based methods 7
       1.2.2 Knowledge-based methods 9
       1.2.3 Data-based monitoring methods 9
   1.3 Fault detection metrics 13
   1.4 Conclusion 14
   References 15

2. Linear latent variable regression (LVR)-based process monitoring
   2.1 Introduction 19
   2.2 Development of linear LVR models 20
       2.2.1 Full rank methods 21
       2.2.2 Latent variable regression (LVR) models 22
   2.3 Dynamic LVR models 30
   2.4 Process monitoring methods 32
       2.4.1 Univariate chart for process monitoring 32
       2.4.2 Distribution-based process monitoring schemes 39
       2.4.3 Multivariate process monitoring schemes with parametric and nonparametric thresholds 44
   2.5 Linear LVR-based process monitoring strategies 47
       2.5.1 Conventional LVR monitoring statistics 47
       2.5.2 Fault isolation 50
   2.6 Case studies 53
       2.6.1 Simulated example 53
       2.6.2 Monitoring influent measurements at water resource recovery facilities 55
   2.7 Discussion 63
   References 63

3. Fault isolation
   3.1 Introduction 71
       3.1.1 Pitfalls of standardizing data 72
       3.1.2 Shortcomings of contribution plots/scores 77
   3.2 Fault isolation 79
       3.2.1 Variable thinning 79
       3.2.2 Iterative traditional isolation 80
       3.2.3 Variable selection methods 83
   3.3 Fault classification 99
   3.4 Fault isolation metrics 100
       3.4.1 Fault isolation errors 101
       3.4.2 Precision and recall 102
       3.4.3 Phase I FI metrics 102
       3.4.4 Discussion 103
   3.5 Case studies 103
       3.5.1 Retrospective fault isolation 104
       3.5.2 Real-time fault isolation 108
   3.6 Further reading 111
   References 112

4. Nonlinear latent variable regression methods
   4.1 Introduction 119
   4.2 Limitations of linear LVR methods for process monitoring 121
   4.3 Developing nonlinear LVR methods for process monitoring 123
       4.3.1 Nonlinear partial least squares 123
       4.3.2 ANFIS-PLS modeling framework 127
       4.3.3 Kernel PCA 131
       4.3.4 Kernel principal components analysis (KPCA) model 131
       4.3.5 KPCA-based fault detection procedures 135
   4.4 Case study: monitoring WWTP 138
       4.4.1 Anomaly detection using KPCA-OCSVM method 139
   4.5 Simulated synthetic data 142
       4.5.1 Application of plug flow reactor 143
   4.6 Discussion 149
   References 151

5. Multiscale latent variable regression-based process monitoring methods
   5.1 Introduction 155
   5.2 Theoretical background of wavelet-based data representation 158
       5.2.1 Wavelet transform 159
       5.2.2 Multiscale representation of data using wavelets 159
       5.2.3 Advantages of multiscale representation 164
   5.3 Multiscale filtering using wavelets 167
       5.3.1 Single scale filter method 167
       5.3.2 Multiscale filtering methods 168
       5.3.3 Advantages of multiscale denoising 169
   5.4 Wavelet-based multiscale univariate monitoring techniques 170
       5.4.1 An illustrative example 172
   5.5 Multiscale LVR modeling 176
       5.5.1 Benefits of multiscale denoising in LVR modeling 176
   5.6 Multiscale LVR modeling 177
   5.7 Results and discussions 180
       5.7.1 Application with synthetic data 180
       5.7.2 Application of monitoring distillation column 183
   5.8 Discussion 186
   References 188

6. Unsupervised deep learning-based process monitoring methods
   6.1 Introduction 193
   6.2 Clustering 195
       6.2.1 Partition-based clustering techniques 196
       6.2.2 Hierarchy-based clustering techniques 197
       6.2.3 Density-based approach 198
       6.2.4 Expectation maximization 201
   6.3 One-class classification 202
       6.3.1 One-class SVM 202
       6.3.2 Support vector data description (SVDD) 203
   6.4 Deep learning models 206
       6.4.1 Autoencoders 206
       6.4.2 Probabilistic models 210
       6.4.3 Deep neural networks 213
       6.4.4 Deep Boltzmann machine 215
   6.5 Deep learning-based clustering schemes for process monitoring 217
   6.6 Discussion 218
   References 219

7. Unsupervised recurrent deep learning scheme for process monitoring
   7.1 Introduction 225
   7.2 Recurrent neural networks approach 227
       7.2.1 Basics of recurrent neural networks 227
       7.2.2 Long short-term memory 229
       7.2.3 Gated recurrent neural networks 234
   7.3 Hybrid deep models 235
       7.3.1 RNN-RBM 236
       7.3.2 RNN-RBM method 237
       7.3.3 LSTM-RBM model 238
       7.3.4 LSTM-DBN 239
   7.4 Recurrent deep learning-based process monitoring 241
       7.4.1 Residuals-based process monitoring approaches 242
       7.4.2 Recurrent deep learning-based clustering schemes for process monitoring 243
   7.5 Applications: monitoring influent conditions at WWTP 244
   7.6 Discussion 250
   References 251

8. Case studies
   8.1 Introduction 255
   8.2 Stereovision 258
       8.2.1 Deep stacked autoencoder-based KNN approach 261
       8.2.2 Data description 266
       8.2.3 Results and discussion 266
       8.2.4 Model trained using data with no obstacles 267
       8.2.5 Evaluation of performance for busy scenes 269
       8.2.6 Obstacle detection using the Bahnhof dataset 271
   8.3 Detecting abnormal ozone measurements using deep learning 274
       8.3.1 Introduction 274
       8.3.2 Data description 276
       8.3.3 Ozone monitoring based on deep learning approaches 278
       8.3.4 Detection results 284
   8.4 Monitoring of a wastewater treatment plant using deep learning 288
       8.4.1 Introduction 288
       8.4.2 Proposed DBN-based kNN, OCSVM, and k-means algorithms 290
       8.4.3 Real data application: monitoring a decentralized wastewater treatment plant in Golden, CO, USA 291
       8.4.4 Conclusion 297
   References 297

9. Conclusion and further research directions
   References 308

Index 311
Preface

Anomaly detection and isolation have a vital role in modern industrial processes to enhance productivity, efficiency, and safety, as well as to avoid expensive maintenance. Therefore, it is important to be able to detect and identify any possible anomalies or failures in the system as early as possible. Generally, anomalies in modern automatic processes are difficult to avoid and may result in serious process degradations. The role of detection is to identify any anomalous event and to indicate how far the system's behavior has departed from its nominal behavior. Furthermore, anomaly isolation determines the probable source of the detected anomaly. To illustrate, an accidental or even deliberate contamination of a drinking water distribution network can lead to financial losses, as well as to serious health risks. Therefore, early detection of anomalies is crucial not only to maintain proper process operation but also for the sake of people's health.

Today, engineered and environmental processes have become far more complex due to advances in technology. Multiple key variables need to be monitored simultaneously, and data may have both temporal and spatial aspects. New features of these processes require new and better statistical tools for process monitoring. Early detection and isolation of potential faults in complex engineering and environmental processes have proven to be particularly challenging. In the absence of a physics-based process model, data-driven statistical techniques for process monitoring have proved themselves in practice over the past four decades. These approaches use information derived directly from input data and require no explicit models, whose development is usually costly or time-consuming.

This book is intended to report recent developments in statistical process monitoring using advanced data-driven and deep learning techniques. The book is divided into nine chapters, which are grouped into two parts. The objective of the first part is to tackle multivariate challenges in process monitoring by merging the advantages of univariate and traditional multivariate techniques to enhance their performance and widen their practical applicability. The second part aims to merge the desirable properties of shallow learning approaches, such as the one-class support vector machine and k-nearest neighbors, with unsupervised deep learning approaches to develop more sophisticated and efficient monitoring techniques. Throughout the book, the presented approaches are demonstrated using experimental data from many processes including wastewater treatment plants at KAUST and Golden, CO, USA, ozone air quality data,
and stereovision data for obstacle detection in driving environments. Thus, the reader will find illustrative examples from a range of environmental and engineering processes. The book should be of interest to engineering and academic readers from process chemometrics and data analytics, process monitoring and control, data science, applied statistics, and industrial statistics. This book can also be assimilated by advanced undergraduates and graduate students having knowledge of basic multivariate statistical analysis and machine learning.
Acknowledgments

Addressing anomaly detection and isolation is essential for promptly detecting abnormalities, and it supports the decision making of operators seeking to better optimize, take corrective actions, and maintain downstream processes. This book is primarily based on data-driven approaches for anomaly detection and isolation. The reader of this book will gain an in-depth understanding of fault detection and isolation in complex and multivariate systems, becoming familiar with the most suitable data-driven techniques, including multivariate statistical techniques and deep learning-based methods. It gives the reader several real engineering and environmental applications to clearly show the implementation of anomaly detection and isolation approaches.

Ying Sun and Fouzi Harrou would like to gratefully acknowledge the financial support by funding from King Abdullah University of Science and Technology (KAUST), Office of Sponsored Research (OSR), under Award No: OSR-2019-CRG7-3800 and OSR-2015-CRG4-2582. They would also like to express their sincere gratitude to the team of Publication Services and Researcher Support at KAUST for their support. In addition, we would also like to thank Professor Tzahi Cath of Colorado School of Mines, who provided the decentralized wastewater treatment data.

Amanda S. Hering would like to thank Professor Tzahi Cath of Colorado School of Mines, who has been instrumental in introducing her to fault isolation problems and who has shared data from his facilities with her. She would also like to thank her graduate students, Molly Klanderman and Kathryn Newhart; their expertise and insight accumulated over the course of working together for the past few years have been invaluable. Her work on this project has been supported by King Abdullah University of Science and Technology (KAUST) Office of Sponsored Research (OSR), Grant/Award Number: OSR-2015-CRG4-2582; Partnerships for Innovation: Building Innovation Capacity, National Science Foundation, Grant/Award Number: 1632227; the National Science Foundation Engineering Research Center program under cooperative agreement EEC-1028968 (ReNUWIt); and Baylor University through a research leave sabbatical.
Muddu Madakyaru would like to thank the Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, India, for continuous support during the preparation of this book.

Finally, we would like to thank Lena Sparks, Author Service Manager, for her continuous assistance during the preparation of this book.
Chapter 1
Introduction

1.1 Introduction

1.1.1 Motivation: why process monitoring

Recent decades have witnessed a huge growth in new technologies and advancements in instrumentation, industrial systems, and environmental processes, which are becoming increasingly complex. Diagnostic operation has become an essential element of these processes and systems to ensure their operational reliability and availability. In an environment where productivity and safety are paramount, failing to detect anomalies in a process can lead to harmful effects on a plant's productivity, profitability, and safety. Several serious accidents have happened in the past few decades in various industrial plants across the world, including the Bhopal gas tragedy [1,2], the Piper Alpha explosion [3,4], the accidents at the Mina al-Ahmadi Kuwait refinery [5], and two photovoltaic plants in the US that burned in 2009 and 2011 (a 383 kWp PV array in Bakersfield, CA, and a 1.208 MWp power plant in Mount Holly, NC, respectively) [6]. The Bhopal accident, also referred to as the Bhopal gas disaster, was a gas leak accident at the Union Carbide pesticide plant in India in 1984 that resulted in over 3000 deaths and over 400,000 others gravely injured in the local area around the plant [1,2]. The explosion of the Piper Alpha oil production platform, which was located in the North Sea and managed by Occidental Petroleum, caused 167 deaths and a financial loss of around $3.4 billion [3,4]. In 2000, an explosion occurred in the Mina Al-Ahmadi oil refinery in Kuwait, killing five people and causing serious damage to the plant; the explosion was caused by a defect in a condensate line in the refinery. Nimmo [7] has estimated that the petrochemical industry in the USA could avoid losing up to $20 billion per year if anomalies in inspected processes were discovered in time. In safety-critical systems such as nuclear reactors and aircraft, undetected faults may lead to catastrophic accidents. For example, the pilot of the American Airlines DC10 that crashed at Chicago O'Hare International Airport was notified of a fault only 15 seconds before the accident happened, giving the pilot too little time to react; this crash could easily have been avoided according to [8]. More recently, the Fukushima accident of 2011 in Japan highlighted the importance of developing accurate and efficient monitoring systems for nuclear plants. Essentially, monitoring of industrial processes represents the backbone for ensuring the safe operation of these processes and for ensuring that the process is always functioning properly.
1.1.2 Types of faults

Generally speaking, three main subsystems are merged to form a plant or system: sensors, actuators, and the main process itself. These components are permanently exposed to faults caused by many factors, such as aging, manufacturing defects, and severe operating conditions. A fault or anomaly is a deviation of a characteristic property of a variable from its acceptable behavior that could lead to a failure in the system if it is not detected early enough for the necessary correction to be performed [9]. Conventionally, a fault, if it is not detected in time, can progress to produce a failure or malfunction. Note that the distinction between failure and malfunction is important: a malfunction can be defined as an intermittent deviation of the accomplishment of a process's intended function [10], whereas a failure is a persistent suspension of a process's capability to perform a demanded function within specified operating conditions [10].

In industrial processes, a fault or an abnormal event is defined as the departure of a measured process variable from its acceptable region of operation. The underlying causes of a fault can be malfunctions or changes in sensor, actuator, or process components:

• Process faults or structural changes. A structural change usually takes place within the process itself due to a hard failure of the equipment. The information flow between the different variables is affected because of these changes. Failure of a central controller, a broken or leaking pipe, and a stuck valve are a few examples of process faults. These faults are distinguished by slow changes across various variables in the process.

• Faults in sensors and actuators. Sensors and actuators play a very important role in the functioning of any industrial process since they provide the feedback signals that are crucial for the control of the plant. Actuators are essential for transforming control inputs into appropriate actuation signals (e.g., forces and torques needed for system operation). Generally, actuator faults may lead to higher power consumption or even a total loss of control [11]. Faults in pumps and motors are examples of actuator faults. Sensor faults, on the other hand, include positive or negative bias errors, out-of-range errors, precision degradation errors, and drift errors. Sensor faults are generally characterized by quick deviations in a small number of process variables. Fig. 1.1 shows examples of the most commonly occurring sensor faults: bias, drift, degradation, and sensor freezing.

The literature also describes another type of anomaly: gross parameter changes in a model. A parameter failure occurs when a disturbance enters the monitored process from the environment through one or more variables. Some common examples of such malfunctions include a change in the heat transfer coefficient, a change in the temperature coefficient in a heat exchanger, a change in the liquid flow rate, or a change in the concentration of a reactant.
FIGURE 1.1 Commonly occurring sensor faults. (A) Bias sensor fault. (B) Drift sensor fault. (C) Degradation sensor fault. (D) Freezing sensor fault.

FIGURE 1.2 Fault types. (A) Abrupt anomaly. (B) Gradual anomaly. (C) Intermittent anomaly.

Thus, sensor or process faults can affect the normal functioning of a process plant. In today's highly competitive industrial environment, improved monitoring of processes is an important step towards increasing the efficiency of production facilities.

In practice, there is a tendency to classify anomalies according to their time-variant behavior. Fig. 1.2 illustrates three commonly occurring types of anomalies that can be distinguished by their time-variant form: abrupt, incipient (gradual), and intermittent faults. Abrupt anomalies happen regularly in real systems and are generally typified by a sudden change in a variable from its normal operating range (Fig. 1.2A). The faulty measurement m(t) can be formally expressed as

m(t) = r(t) for t < t_a, and m(t) = r(t) + F for t ≥ t_a,   (1.1)

where r(t) is the fault-free measurement and F is a bias that appears at the time instant t_a.

The drift anomaly type can be caused by the aging or degradation of a sensor and can be viewed as a linear change in the magnitude of the fault over time. Here, the measurement corrupted by a drift fault is modeled as

m(t) = r(t) for t < t_a, and m(t) = r(t) + θ(t − t_a) for t ≥ t_a,   (1.2)

where θ is the slope of the slow drift and t_a is the start time of the fault. Finally, intermittent faults are characterized by discontinuous occurrence in time; they occur and disappear repeatedly (Fig. 1.2C).

Generally, industrial and environmental processes are exposed to various types of faults that negatively affect their productivity and efficiency. According to the form in which the fault is introduced, faults can be further classified as additive and multiplicative faults. Additive faults often appear as sensor offsets or additive bias, while multiplicative faults influence process parameters. Specifically, in an additive fault, the measured variable is corrupted by an additive fault term θ(t), i.e., Y(t) = Y*(t) + θ(t), where Y*(t) denotes the fault-free value. A multiplicative fault, on the other hand, acts on the measured variable through a faulty parameter multiplying another variable, i.e., Y(t) = (a + f)U(t), where U(t) is the input variable, a is the nominal parameter, and f is the parameter fault.
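Equations (1.1) and (1.2) translate directly into a short simulation. The following minimal sketch in Python (not from the book; it assumes NumPy is available) generates a noisy fault-free signal r(t) and corrupts it with the abrupt, drift, intermittent, and multiplicative fault types described above. All signal levels, onset times, and fault magnitudes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

n = 500                                  # number of samples (assumed)
t = np.arange(n)
r = 20.0 + rng.normal(0.0, 0.5, n)       # fault-free signal r(t): level 20 plus noise
t_a = 250                                # fault onset time t_a (assumed)

# Abrupt (bias) fault, Eq. (1.1): m(t) = r(t) + F for t >= t_a
F = 3.0
m_bias = r + F * (t >= t_a)

# Drift fault, Eq. (1.2): m(t) = r(t) + theta * (t - t_a) for t >= t_a
theta = 0.02
m_drift = r + theta * np.clip(t - t_a, 0, None)

# Intermittent fault: the bias appears and disappears repeatedly
active = (t >= t_a) & ((t // 40) % 2 == 0)
m_intermittent = r + F * active

# Multiplicative fault: Y = (a + f) * U instead of Y = a * U after t_a
u = rng.normal(1.0, 0.1, n)              # input variable U(t)
a, f = 2.0, 0.4
y_mult = np.where(t >= t_a, (a + f) * u, a * u)
```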
1.1.3 Process monitoring

Before automation became commonplace in the field of process monitoring, human operators carried out important control tasks in managing process plants. However, complete reliance on human operators to cope with abnormal events and emergencies has become increasingly difficult because of the complexity and large number of variables in modern process plants. Under such difficult conditions, it is understandable that human operators tend to make mistakes that can lead to significant economic, safety, and environmental problems.
FIGURE 1.3 Steps of process monitoring.

Thanks to advancements in technology over recent years, the automation of process fault detection and isolation has been a major milestone in automatic process monitoring. Automatic process monitoring has been shown to respond very well to abnormal events in a process plant, with far fewer mistakes compared to fault management by human operators.

The demand for a monitoring system that is capable of appropriately detecting abnormal changes (sensor or process faults) has attracted the attention of researchers from different fields. The detection and isolation of anomalies that may occur in a monitored system are the two main elements of process monitoring (Fig. 1.3). The purpose of the detection step is to detect abnormal changes that affect the behavior of the monitored system. Once an anomaly is detected, effective system operation also requires evaluation of the risk of a system shutdown, followed by fault isolation or correction before the anomaly contaminates the process performance [12,13]. The purpose of fault isolation is to determine the source responsible for the occurring anomalies, i.e., to determine which sensor or process component is faulty. In practice, it is sometimes also essential to assess the severity of the occurring fault, which is done in the fault identification step. Here, we will focus only on fault detection and isolation.

There are two types of anomaly detection:

• Online fault detection. The objective of online anomaly detection is to set up a decision rule capable of detecting, as quickly as possible, the transition from a normal operating state to an abnormal operating state. Online detection is based on the idea that system evolution can be considered a succession of stationary modes separated by fast transitions.

• Offline fault detection. The purpose of offline fault detection is to detect the presence of a possible anomaly outside the operational use of the monitored system. The system is observed for a finite period (the system is in stationary mode), and then, based on these observations, a decision is made on the state of the monitored system. Offline detection methods rely on a number of observations fixed a priori, where all the observations come from the same distribution.

1.1.4 Physical redundancy vs analytical redundancy

Process monitoring is essentially based on the exploitation of redundant sources of information. There are two types of redundancy in a process: physical redundancy and analytical redundancy (Fig. 1.4A–B). The essence of hardware or physical redundancy, which is a traditional method in process monitoring, consists of measuring a particular process variable using several sensors (e.g., two or more sensors). Duplicated sensors suffice to detect a simple fault, but isolating it as well requires at least three sensors. Specifically, under normal conditions, one sensor is sufficient to monitor a particular variable, but adding at least two extra sensors is generally needed to guarantee reliable measurements and monitoring under faulty conditions.
FIGURE 1.4 Conceptual representation of (A) physical and (B) analytical redundancy.

Typically, fault detection and isolation are achieved by a majority vote among all the redundant sensors. This strategy has been widely used in industry because of its reliability and simplicity of implementation. In practice, the main disadvantage of hardware redundancy is the additional cost of equipment and maintenance, as well as the space needed to install the equipment, which increases complexity considerably in already very complex systems. In addition, this method is limited in practice to sensor faults and cannot detect faults in variables that are not measured directly. This approach is mainly justified only for critical systems, such as nuclear reactors and aeronautic systems.

Unlike physical redundancy, which is achieved by adding more sensors (hardware) to measure a specific process variable, analytical redundancy does not require additional hardware because it is based on the existing relations between the dependent measured variables, which may or may not be of the same nature. Analytical redundancy is a more accessible strategy that compares the measured variable with the predicted values from a mathematical model of the monitored system. It thereby exploits redundant analytical relationships among various measured variables of the monitored process and avoids replicating every sensor in hardware.
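To make the two redundancy concepts concrete, here is a minimal sketch (illustrative, not from the book; NumPy assumed) contrasting a physical-redundancy majority vote over three replicated sensors with an analytical-redundancy residual computed from a known relation between two different variables. The tolerance values and the assumed linear relation y ≈ 2x are placeholders.

```python
import numpy as np

def majority_vote(readings, tol=0.5):
    """Physical redundancy: flag any sensor whose reading disagrees
    with the median of the replicated sensors by more than tol."""
    readings = np.asarray(readings, dtype=float)
    ref = np.median(readings)                 # majority opinion
    return ref, np.abs(readings - ref) > tol

# Three sensors measure the same variable; sensor 2 carries a bias.
ref, faulty = majority_vote([20.1, 23.4, 19.9])
print(ref, faulty)                            # 20.1 [False  True False]

def analytical_residual(x, y, tol=0.5):
    """Analytical redundancy: no extra hardware; instead exploit a
    known model relation between two variables, here y = 2x (assumed)."""
    residual = y - 2.0 * x                    # fluctuates around zero when fault-free
    return residual, abs(residual) > tol

res, alarm = analytical_residual(x=10.0, y=23.1)
print(res, alarm)                             # 3.1 True -> inconsistency detected
```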
1.2 Process monitoring methods

Today, engineering and environmental processes have become far more complex due to advances in technology. Anomaly detection and isolation have become necessary to monitor the continuity and proper functioning of modern industrial systems and environmental processes. Depending on the field of application, the repercussions of anomalies are especially severe when human safety is at stake, such as in aeronautical systems and nuclear reactors. Advancements in the field of process control and automation over the last few years have yielded various methods for successful diagnosis and detection of abnormal events. To meet safety and productivity requirements, extensive theoretical and practical monitoring methods have been developed. These methods are generally divided into three families of approaches, depending on the nature of the knowledge available about the system: model-based, knowledge-based, and data-based methods. A thorough overview of process fault detection and diagnosis can be found in [5]. Fig. 1.5 shows a summary of various monitoring methods; this section presents a brief overview of these monitoring techniques.

FIGURE 1.5 A summary of various fault detection approaches.

1.2.1 Model-based methods

Over the past three decades, numerous monitoring methods to improve the safety and productivity of several environmental and engineering processes have emerged. Model-based methods have proven especially useful in industrial applications where maintaining the desired performance is essential. A model-based method involves comparing the process's measured variables with the predictions from a mathematical model of the process. The conceptual schematic of model-based fault detection is illustrated in Fig. 1.6.

FIGURE 1.6 Conceptual schematics of model-based process monitoring.
The backbone of a model-based method is the generation of residuals by comparing the measurement data with their predictions from the analytical model of the monitored process. Indeed, the residuals play the role of a fault indicator. Ideally, in the absence of modeling uncertainties and errors, the residual will be zero and the model will perfectly fit the measurements; thus, any departure of the residual from zero indicates the presence of faults. In practice, however, we cannot avoid modeling uncertainties and measurement noise. In other words, a perfectly precise analytical model of an inspected process is never available, so the real measurement deviates from its prediction by a reference model even under no-fault conditions. Hence, instead of using any departure of the residuals from zero as a fault indicator, detection can be done by constructing a detection threshold that distinguishes fault-free residuals from anomalies. The detection performance is mainly determined by the selected detection threshold. If the threshold is too small, we get repeated false alarms when errors and uncertainties push the residuals over the threshold and they are consequently flagged as faults; this scenario obviously must be avoided. The detection threshold should thus be computed so that the frequency of correct detection is maximized for a given small false alarm rate (e.g., 5% or 1%). To address this concern, several statistical schemes have been proposed to monitor the residual vector, including the generalized likelihood ratio approach, cumulative sum (CUSUM)-type schemes, and exponentially weighted moving average (EWMA) schemes. In the case of multivariate data, once the residual matrix is generated, multivariate extensions of CUSUM and EWMA, as well as the T² statistic, are usually used to detect faults in the mean or variance of the process.

In summary, fault detection and isolation using model-based methods usually take place in two distinct steps (a minimal numerical sketch follows this list):

• The first step consists of residual generation. Ideally, these residuals should be zero in normal operation and nonzero in the presence of an anomaly. However, the presence of noise and modeling errors makes the residuals fluctuate around zero. A significant divergence of the residual from zero is an indication of faults.

• The second step concerns the evaluation of the residuals based on a decision procedure for detecting and isolating faults. This is done using statistical detection techniques such as EWMA, CUSUM, and the generalized likelihood ratio (GLR) test [12].
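The following minimal sketch (illustrative, not from the book; NumPy assumed) implements the two steps above: residuals are generated against an assumed model prediction, and a two-sided CUSUM evaluates them. The drift allowance k and decision threshold h are hand-picked here; in practice they would be tuned to a target false alarm rate.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Step 1: residual generation against a model prediction.
# Here the "model" is simply the assumed nominal level of the variable.
n, t_a = 400, 300
y_pred = np.full(n, 50.0)                     # model prediction (assumed)
y_meas = y_pred + rng.normal(0.0, 1.0, n)     # measurements with unit-variance noise
y_meas[t_a:] += 2.5                           # additive fault injected at t_a
residual = y_meas - y_pred

# Step 2: residual evaluation with a two-sided tabular CUSUM.
k, h = 0.5, 5.0          # drift allowance and threshold, in sigma units (assumed)
s_pos = s_neg = 0.0
alarm_time = None
for i, r in enumerate(residual):
    s_pos = max(0.0, s_pos + r - k)   # accumulates evidence of positive shifts
    s_neg = max(0.0, s_neg - r - k)   # accumulates evidence of negative shifts
    if s_pos > h or s_neg > h:
        alarm_time = i
        break

print("fault injected at", t_a, "- first alarm at", alarm_time)
```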
A substantial amount of research work has been carried out on model-based monitoring methods. Methods that fall into the model-based monitoring category include parity space approaches [14–17], observer-based approaches [18,19], and interval approaches [20]. A related discussion and a comprehensive survey of model-based fault detection methods can be found in [21–23]. Essentially, the detection performance of model-based approaches is closely related to the accuracy of the reference model. The availability of an accurate model that mimics the nominal behavior of the monitored process is very helpful for facilitating the detection of faulty measurements. However, for complex processes, such as many industrial and environmental processes with a large number of variables, deriving and developing accurate models is not always easy and can be time-consuming, which makes such methods inapplicable in many situations. For instance, modeling the inflow measurements of wastewater treatment plants is very challenging because of the presence of a large number of variables that are nonlinearly dependent and autocorrelated. Additionally, modeling modern industrial and environmental processes is challenging because of their complexity and the absence of a precise understanding of these processes. Fault detection using model-based approaches can therefore be challenging and often unsuitable. Alternatively, data-based methods are more commonly used.

1.2.2 Knowledge-based methods

The success of modern industrial systems relies on their proper and safe operation. Early detection of anomalies as they emerge in the inspected process is essential for avoiding extensive damage and reducing the downtime needed for repair [24]. As discussed above, when the information available about the process under fault-free operation is insufficient to construct an accurate analytical model, analytical monitoring methods are no longer effective. Knowledge-based methods present an alternative solution that bypasses this difficulty. These methods use artificial intelligence techniques and available historical measurements, which inherently represent the correlation of the process variables, to extract the underlying knowledge and system characteristics. In other words, process characteristic values, such as variance, magnitude, and state variables, are used to extract features under fault-free and faulty conditions based on heuristic and analytical knowledge. Fault detection is then performed in a heuristic manner: the actual features from the online data are compared with the obtained underlying knowledge. Methods that fall into this category include expert systems [25], fuzzy logic, Hazop-digraphs (HDG) [5], possible cause and effect graphs (PCEG) [26], neuro-fuzzy based causal analysis, failure mode and effect analysis (FMEA) [27], and Bayesian networks [28]. The major drawback of these techniques is that they are more appropriate for small-scale systems and thus may not be suited to inspecting modern, large-scale systems.
1.2.3 Data-based monitoring methods

Engineering and environmental processes have undoubtedly become far more complex due to advances in technology. Consequently, designing an accurate model for complex, high-dimensional, and nonlinear systems has also become very challenging, expensive, and time-consuming. Simplifications and assumptions imposed on models limit their capacity to capture certain relevant features and operation modes, and they induce a modeling bias that significantly degrades the efficiency of the monitoring system. In the absence of a physics-based process model, data-driven statistical techniques for process monitoring have proved themselves in practice over the past four decades. Indeed, data-based implicit models only require an available process-data resource for process monitoring [5]. Data-based monitoring techniques are mainly based on statistical control charts and machine-learning methods.

Essentially, these monitoring techniques rely on historical data collected from the monitored system. The system is modeled as a black box with input and output data (Fig. 1.7). First, a reference empirical model that mimics the nominal behavior of the inspected process is constructed using the fault-free data, and then this model is used for detecting faults in new data. In contrast to model-based methods, only historical process data are required in data-based fault detection methods, and these methods are classified into two classes: qualitative and quantitative methods.

FIGURE 1.7 Data-based methods.

Unsupervised data-based techniques for fault detection and isolation do not use any prior information about the faults affecting the process. They cover a set of methods for monitoring industrial processes through tools such as statistical control charts (see Fig. 1.8). Univariate techniques, such as the Shewhart chart, the exponentially weighted moving average (EWMA) [29], and the cumulative sum (CUSUM), are used for monitoring a single process variable at a given time instant. Monitoring charts have been extensively exploited in most industrial processes. CUSUM and EWMA schemes show good capacity for sensing small changes compared to the Shewhart chart. In [30], a spectral monitoring scheme is designed based on the information embedded in the Fourier coefficients of the signal. However, these conventional schemes are designed under the hypotheses that the data are Gaussian and uncorrelated. To escape these basic assumptions, multiscale monitoring schemes using wavelets have been developed [31]. Furthermore, the above-discussed schemes use static thresholds computed using the fault-free data. Recently, several adaptive monitoring methods have been developed; these schemes are, in practice, more flexible and efficient than conventional schemes with fixed parameters. For more details, see [32–35].
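As a concrete illustration of a univariate chart, the following minimal sketch (not from the book; NumPy assumed) applies an EWMA chart to a single variable. The smoothing parameter λ = 0.2 and the control-limit width L = 3 are common textbook choices, and the in-control mean and standard deviation are assumed to have been estimated from fault-free data.

```python
import numpy as np

def ewma_chart(x, mu0, sigma0, lam=0.2, L=3.0):
    """EWMA monitoring of a single variable:
    z_t = lam * x_t + (1 - lam) * z_{t-1}, with z_0 = mu0.
    Returns the EWMA statistic and a boolean alarm sequence."""
    z = np.empty(len(x), dtype=float)
    prev = mu0
    for t, xt in enumerate(x):
        prev = lam * xt + (1.0 - lam) * prev
        z[t] = prev
    # Time-varying control limits that converge to the asymptotic limits.
    i = np.arange(1, len(x) + 1)
    width = L * sigma0 * np.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * i)))
    return z, np.abs(z - mu0) > width

rng = np.random.default_rng(seed=2)
x = rng.normal(0.0, 1.0, 300)
x[200:] += 0.8                        # small mean shift, hard for a Shewhart chart
z, alarms = ewma_chart(x, mu0=0.0, sigma0=1.0)
print("first alarm at sample", int(np.argmax(alarms)) if alarms.any() else None)
```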
These univariate monitoring schemes examine one particular variable at a time by assuming independence between variables. When monitoring multivariate data using several univariate charts, even when the false alarm rate of each individual chart is small, the collective rate can be very large [36–38]. In addition, measurements from modern industrial processes are rarely independent and involve a large number of process variables. Since univariate schemes ignore the cross-correlation between variables, they suffer from an inflated number of missed detections and false alarms, which makes this monitoring strategy unsuitable [36–38].

FIGURE 1.8 Data-based monitoring techniques.

To alleviate this difficulty and to handle high-dimensional data effectively, multivariate monitoring schemes have been developed that take into account the correlations between the variables and thus monitor processes with several variables. These schemes include Hotelling's T² [39], the multivariate EWMA [40], and the multivariate CUSUM [41]. However, the performance of these multivariate schemes degrades as the number of monitored variables increases, which makes them unsuitable for high-dimensional data.

Multivariate statistical methods have been designed to directly tackle these limitations. They are useful for compressing data while retaining the relevant information, which is more appropriate to analyze than the original data. Moreover, these methods are efficient at handling noise and interactions between variables in order to extract pertinent information effectively. The most common multivariate methods for fault detection are principal component analysis (PCA) [22,42], partial least squares (PLS), principal component regression (PCR), canonical variate analysis (CVA), and independent component analysis (ICA) [43]. The essential feature of multivariate statistical methods, such as PCA, is their ability to transform multivariate correlated variables into a reduced set of uncorrelated variables. In the past two decades, these techniques have been extensively used to monitor industrial processes.
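A minimal sketch of a Hotelling T² chart on raw multivariate data follows (illustrative, not from the book; NumPy and SciPy assumed). The in-control mean vector and covariance matrix are estimated from a fault-free training set, and the control limit is taken from the chi-squared approximation; the dimensions, covariance structure, and fault vector are placeholders.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=3)

# Phase I: estimate the in-control mean and covariance from fault-free data.
p = 4
cov_true = 0.5 * np.ones((p, p)) + 0.5 * np.eye(p)    # correlated variables
train = rng.multivariate_normal(np.zeros(p), cov_true, size=1000)
mu = train.mean(axis=0)
S_inv = np.linalg.inv(np.cov(train, rowvar=False))

# Phase II: monitor new observations with T^2 = (x - mu)' S^{-1} (x - mu).
test = rng.multivariate_normal(np.zeros(p), cov_true, size=200)
test[150:] += np.array([1.5, 0.0, -1.0, 0.5])         # mean-shift fault
d = test - mu
t2 = np.einsum('ij,jk,ik->i', d, S_inv, d)            # quadratic form per sample

ucl = stats.chi2.ppf(0.99, df=p)     # approximate 1% false-alarm control limit
print("alarm samples:", np.flatnonzero(t2 > ucl)[:5])
```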
  • 29. 12 Statistical Process Monitoring latent subspace, where latent variables and residuals are monitored. PCA and PLS are the two most popular multivariate statistical methods that use latent variable methods for monitoring because they have a strong mathematical foun- dation that is available in the literature. Indeed, the PCA or PLS model is constructed based on historical normal process operations. This empirical model could be used to monitor the future behavior of the process. Any departure from the model should be flagged as a potential anomaly, such as sensor fault or pro- cess drift. PCA is used to reduce dimensionality in the process data and to retain important features of the data. PCA projects the observations from a higher dimension on to a lower-dimensional subspace and is optimal in terms of cap- turing the data variability. The PCA procedure is applied to a single data matrix only, whereas PLS models the relationship between two data matrices while compressing them simultaneously. The PCA technique is used to monitor and detect the faults in a multivariate process, along with the two fault detection in- dices, T 2 and the squared prediction error (SPE) statistics. The major advantage of latent variable approaches (i.e., PCA and PLS) is that a limited number of monitoring schemes are needed for monitoring multivariate data using monitor- ing indices of T 2 and SPE. However, data from modern industrial processes are time-dependent, non- stationary, nonlinear, non-Gaussian, and multiscale [44–47]. Most process mon- itoring methods assume that the process measurements at a given time are independent of the observations at a past sampling instant. Industrial pro- cesses are operated under dynamic conditions and variables have strong auto- correlation properties. Augmenting observations at a previous sampling time with observations at the present sampling time is referred to as Dynamic PCA (DPCA) [48,49]. For high-dimensional and time-dependent industrial data, using a fixed model monitoring approach could lead to poor diagnostic re- sults [50]. However, process monitoring for such processes could be improved by updating the model using a recursive PCA and a moving window PCA tech- nique [50]. Recursive PCA updates the model continuously online; similarly, online adaptive PCA updates the model using EWMA [50,51]. For nonlinear processes, a nonlinear version of data-based methods has been used, such as kernel PCA, kernel PLS, polynomial PLS and quadratic or fuzzy PLS, to reveal nonlinear relationships between variables [46]. In practice, most of the data need not be Gaussian in nature; to handle the non-Gaussian nature of the data, inde- pendent component analysis (ICA), the Gaussian mixture model (GMM), and its nonlinear variant have been used [47]. Other extensions have been developed, such as multiway PCA [45] that permits analyzing data from batch processes, and multiscale PCA that monitors processes at different frequency bands and denoises the data and reduces autocorrelation. Overall, these extensions are in- troduced based on an understanding of the nature of the data gathered from the inspected process. Accordingly, understanding the process characteristics is a central factor to meet practical expectations and construct an effective statistical monitoring system.
Other approaches that fall into this category are based on machine- and deep-learning methods, which have recently gained considerable attention from researchers due to their ability to learn from large and complex datasets. Under a machine-learning framework, support vector machines (SVM) [52–54] and artificial neural networks (ANNs) have become important tools in the fault detection literature. Recently, increasing process complexity has resulted in the development of several monitoring methods based on deep learning that can account for features such as time dependency, nonlinearity, and nonnormality. A major strategy has been to extract features from the data using deep-learning models, such as the Restricted Boltzmann Machine (RBM), Deep Belief Network (DBN), Deep Boltzmann Machine (DBM), Long Short-Term Memory (LSTM), and recurrent neural network (RNN), and to monitor the extracted features using binary clustering schemes or traditional monitoring charts. For instance, [55] introduced an approach that integrated an RNN-RBM model with clustering algorithms, including k-means, spectral clustering, and OCSVM, for anomaly detection in WWTPs. In [56], several deep learning-based monitoring methods, such as DBN, deep-stacked auto-encoders, and restricted Boltzmann machine-based clustering procedures, were applied to detect abnormal ozone pollution. Deep-learning methods are appealing because of their flexibility in not making restrictive assumptions on the underlying data. Applications using deep learning also cover detection in complex data such as multivariate time-series data [57], images, and videos [58,59]. 1.3 Fault detection metrics To verify the performance of fault detection methods, several well-known metrics are commonly employed in the context of binary detection problems. Many detection performance metrics are computed from the 2 × 2 confusion matrix that reports the number of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN) [60]. The detection quality of fault detection methods can be assessed using the false positive rate (FPR) (i.e., false alarm rate), the true positive rate (TPR) (i.e., detection rate), precision, accuracy, F-measure, recall, and the area under the curve (AUC). Fig. 1.9 displays a confusion matrix and recapitulates the equations of the well-known related metrics that are frequently used to assess the performance of a binary decision method [60,61]. Another metric, the average run length (ARL), which characterizes both type I and type II errors, is also commonly used to evaluate detection quality. Specifically, there are two kinds of ARL: ARL0 and ARL1. ARL0 is the average number of data points a fault detection method takes to raise a (false) alarm when the process is in control. ARL1 is the average number of data points a monitoring method takes to uncover a fault under faulty conditions (i.e., the speed of detection).
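To make these definitions concrete, the following minimal Python sketch (NumPy only; all function and variable names are ours, not from the book) computes the confusion-matrix-based metrics and an empirical run length from binary alarm sequences.

```python
import numpy as np

def detection_metrics(y_true, y_alarm):
    """Confusion-matrix metrics for binary fault detection.
    y_true, y_alarm: arrays of 0 (normal) / 1 (fault/alarm)."""
    tp = np.sum((y_true == 1) & (y_alarm == 1))
    fp = np.sum((y_true == 0) & (y_alarm == 1))
    fn = np.sum((y_true == 1) & (y_alarm == 0))
    tn = np.sum((y_true == 0) & (y_alarm == 0))
    tpr = tp / (tp + fn)          # detection rate (recall)
    fpr = fp / (fp + tn)          # false alarm rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * tpr / (precision + tpr)
    return dict(TPR=tpr, FPR=fpr, precision=precision,
                accuracy=accuracy, F1=f1)

def empirical_arl(runs):
    """Average number of samples until the first alarm, averaged over
    independent runs (each run is a boolean alarm sequence). On
    fault-free runs this estimates ARL0; on runs where the fault is
    present from the first sample it estimates ARL1."""
    run_lengths = [np.argmax(r) + 1 if r.any() else len(r) for r in runs]
    return np.mean(run_lengths)
```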
  • 31. 14 Statistical Process Monitoring FIGURE 1.9 Fault detection metrics. 1.4 Conclusion In summary, accurately detecting and isolating faults that can occur in industrial and environmental processes is essential to minimize downtime, increase safety, reduce maintenance costs, and extend equipment lifetime. Process monitoring is required to successfully detect, isolate, and remove the faults before they affect the process performance. Several aspects should be considered when designing or using a particular fault detection approach, including the type of fault, process dynamics, measured variables, available data, and complexity. The simplest and most common practice is to directly check the limit of a measurable variable. However, these techniques are limited when monitoring large-scale processes. This has led to the development of reliable techniques that incorporate informa- tion from not just one process variable, but that include more knowledge about the process such as process state and parameters. Some approaches rely on accu- rate process models whereas others use available historical process data. Process model-based monitoring that incorporates dynamics information is easy to im- plement for well-defined systems; however, process model-based monitoring needs accurate models that are not always easy to obtain, in particular for com- plex processes. On the other hand, when information on the reliance of faults and symptoms is available, knowledge-based approaches are preferable; how- ever, these approaches are limited to small and simple processes. An alternative approach is to use data-based monitoring techniques, which are flexible and assumption-free. Of course, when a large amount of process data is available, and the process is too complex to be explicitly modeled, data-based techniques are more appropriate because of their flexibility to handle large, noisy, and non- linear data.
  • 32. Introduction Chapter | 1 15 References [1] V.R. Dhara, R. Dhara, The union carbide disaster in Bhopal: a review of health effects, Archives of Environmental Health: An International Journal 57 (5) (2002) 391–404. [2] B. Bowonder, The Bhopal accident, Technological Forecasting Social Change 32 (2) (1987) 169–182. [3] M.E. Paté-Cornell, Learning from the Piper Alpha accident: a postmortem analysis of technical and organizational factors, Risk Analysis 13 (2) (1993) 215–232. [4] L.W.D. Cullen, The public inquiry into the Piper Alpha disaster, Drilling Contractor; (United States) 49 (4) (1993). [5] V. Venkatasubramanian, R. Rengaswamy, S.N. Kavuri, K. Yin, A review of process fault detection and diagnosis: part III: process history based methods, Computers Chemical En- gineering 27 (3) (2003) 327–346. [6] B. Brooks, The Bakersfield fire: a lesson in ground-fault protection, SolarPro Magazine 62 (2011). [7] I. Nimmo, Adequately address abnormal operations, Chemical Engineering Progress 91 (9) (1995). [8] R.J. Patton, Fault-tolerant control: the 1997 situation, IFAC Proceedings Volumes 30 (18) (1997) 1029–1051. [9] O. Büyüköztürk, M.A. Taşdemir, Nondestructive Testing of Materials and Structures, vol. 6, Springer Science Business Media, 2012. [10] R. Isermann, Fault-Diagnosis Systems: an Introduction from Fault Detection to Fault Toler- ance, Springer Science Business Media, 2006. [11] G.J. Ducard, Fault-Tolerant Flight Control and Guidance Systems: Practical Methods for Small Unmanned Aerial Vehicles, Springer Science Business Media, 2009. [12] M. Basseville, I.V. Nikiforov, et al., Detection of Abrupt Changes: Theory and Application, vol. 104, Prentice Hall, Englewood Cliffs, 1993. [13] F. Harrou, L. Fillatre, I. Nikiforov, Anomaly detection/detectability for a linear model with a bounded nuisance parameter, Annual Reviews in Control 38 (1) (2014) 32–44. [14] E. Chow, A. Willsky, Analytical redundancy and the design of robust failure detection systems, IEEE Transactions on Automatic Control 29 (7) (1984) 603–614. [15] P.M. Frank, Fault diagnosis in dynamic systems using analytical and knowledge-based redun- dancy: a survey and some new results, Automatica 26 (3) (1990) 459–474. [16] R.J. Patton, J. Chen, A review of parity space approaches to fault diagnosis, IFAC Proceedings Volumes 24 (6) (1991) 65–81. [17] J. Ragot, D. Maquin, F. Kratz, Analytical redundancy for systems with unknown inputs. Ap- plication to faults detection, Control Theory and Advanced Technology 9 (3) (1993) 775–788. [18] R.N. Clark, D.C. Fosth, V.M. Walton, Detecting instrument malfunctions in control systems, IEEE Transactions on Aerospace and Electronic Systems 4 (1975) 465–473. [19] R.J. Patton, P.M. Frank, R.N. Clarke, Fault Diagnosis in Dynamic Systems: Theory and Appli- cation, Prentice-Hall, Inc., 1989. [20] K. Benothman, D. Maquin, J. Ragot, M. Benrejeb, Diagnosis of uncertain linear systems: an interval approach, International Journal of Sciences and Techniques of Automatic control computer engineering 1 (2) (2007) 136–154. [21] P.M. Frank, Analytical and qualitative model-based fault diagnosis–a survey and some new results, European Journal of Control 2 (1) (1996) 6–28. [22] L.H. Chiang, E.L. Russell, R.D. Braatz, Fault Detection and Diagnosis in Industrial Systems, Springer Science Business Media, 2000. [23] N. Martin, Advanced signal processing and condition monitoring, Insight-Non-Destructive Testing and Condition Monitoring 49 (8) (2007) 459–464. [24] Z. Gao, C. Cecati, S. 
Ding, A survey of fault diagnosis and fault-tolerant techniques—part II: fault diagnosis with knowledge-based and hybrid/active-based approaches, IEEE Transactions on Industrial Electronics 62 (6) (2015) 3768–3774.
  • 33. 16 Statistical Process Monitoring [25] S. Kim, S. jin Ahn, J. Chung, I. Hwang, S. Kim, M. No, S. Sin, A rule based approach to network fault and security diagnosis with agent collaboration, in: International Conference on AI, Simulation, and Planning in High Autonomy Systems, Springer, 2004, pp. 597–606. [26] N. Wilcox, D. Himmelblau, The possible cause and effect graphs (PCEG) model for fault diagnosis—I. Methodology, Computers Chemical Engineering 18 (2) (1994) 103–116. [27] R. Wirth, B. Berthold, A. Krämer, G. Peter, Knowledge-based support of system analysis for the analysis of failure modes and effects, Engineering Applications of Artificial Intelligence 9 (3) (1996) 219–229. [28] V. Sylvain, T. Teodor, K. Abdessamad, Fault detection with Bayesian network, in: Frontiers in Robotics, Automation and Control, IntechOpen, 2008. [29] J.M. Lucas, M.S. Saccucci, Exponentially weighted moving average control schemes: proper- ties and enhancements, Technometrics 32 (1) (1990) 1–12. [30] T. Tiplica, A. Kobi, A. Barreau, Spectral control chart, Quality Engineering 17 (4) (2005) 695–702. [31] R. Ganesan, T.K. Das, V. Venkataraman, Wavelet-based multiscale statistical process monitor- ing: a literature review, IIE Transactions 36 (9) (2004) 787–806. [32] M.S. De Magalhães, E.K. Epprecht, A.F. Costa, Economic design of a Vp X chart, International Journal of Production Economics 74 (1–3) (2001) 191–200. [33] R.B. Kazemzadeh, M. Karbasian, M.A. Babakhani, An EWMA t chart with variable sampling intervals for monitoring the process mean, The International Journal of Advanced Manufactur- ing Technology 66 (1–4) (2013) 125–139. [34] D.S. Bai, K. Lee, An economic design of variable sampling interval X control charts, Interna- tional Journal of Production Economics 54 (1) (1998) 57–64. [35] Y. Su, L. Shu, K.-L. Tsui, Adaptive EWMA procedures for monitoring processes subject to linear drifts, Computational Statistics Data Analysis 55 (10) (2011) 2819–2829. [36] J.F. MacGregor, T. Kourti, Statistical process control of multivariate processes, Control Engi- neering Practice 3 (3) (1995) 403–414. [37] P. Nomikos, J.F. MacGregor, Multivariate SPC charts for monitoring batch processes, Techno- metrics 37 (1) (1995) 41–59. [38] U. Kruger, L. Xie, Advances in Statistical Monitoring of Complex Multivariate Processes: With Applications in Industrial Process Control, Wiley, 2012. [39] H. Hotteling, Multivariate quality control, illustrated by the air testing of sample bombsights, in: M.W.H.C. Eisenhart, W.A. Wallis (Eds.), Selected Techniques of Statistical Analysis, McGraw-Hill, New York, NY, USA, 1947. [40] C.A. Lowry, W.H. Woodall, C.W. Champ, S.E. Rigdon, A multivariate exponentially weighted moving average control chart, Technometrics 34 (1) (1992) 46–53. [41] R.B. Crosier, Multivariate generalizations of cumulative sum quality-control schemes, Tech- nometrics 30 (3) (1988) 291–303. [42] S. Wold, K. Esbensen, P. Geladi, Principal component analysis, Chemometrics and Intelligent Laboratory Systems 2 (1–3) (1987) 37–52. [43] A. Hyvärinen, E. Oja, Independent component analysis: algorithms and applications, Neural Networks 13 (4–5) (2000) 411–430. [44] S.W. Choi, E.B. Martin, A.J. Morris, I.-B. Lee, Adaptive multivariate statistical process control for monitoring time-varying processes, Industrial Engineering Chemistry Research 45 (9) (2006) 3108–3118. [45] P. Nomikos, J.F. MacGregor, Monitoring batch processes using multiway principal component analysis, AIChE Journal 40 (8) (1994) 1361–1375. [46] J.-H. 
Cho, J.-M. Lee, S.W. Choi, D. Lee, I.-B. Lee, Fault identification for process monitor- ing using kernel principal component analysis, Chemical Engineering Science 60 (1) (2005) 279–288. [47] J.-M. Lee, C. Yoo, I.-B. Lee, Statistical process monitoring with independent component anal- ysis, Journal of Process Control 14 (5) (2004) 467–485.
  • 34. Introduction Chapter | 1 17 [48] W. Ku, R.H. Storer, C. Georgakis, Disturbance detection and isolation by dynamic principal component analysis, Chemometrics and Intelligent Laboratory Systems 30 (1) (1995) 179–196. [49] K. Chow, K. Tan, H. Tabe, J. Zhang, N. Thornhill, Dynamic principal component analysis using integral transforms, in: AIChE Annual Meeting, Miami Beach, vol. 13, 1999. [50] M. Kano, K. Nagao, S. Hasebe, I. Hashimoto, H. Ohno, R. Strauss, B.R. Bakshi, Comparison of multivariate statistical process monitoring methods with applications to the Eastman challenge problem, Computers Chemical Engineering 26 (2) (2002) 161–174. [51] W. Li, H.H. Yue, S. Valle-Cervantes, S.J. Qin, Recursive PCA for adaptive process monitoring, Journal of Process Control 10 (5) (2000) 471–486. [52] S. Yin, X. Gao, H.R. Karimi, X. Zhu, Study on support vector machine-based fault detection in Tennessee Eastman process, in: Abstract and Applied Analysis, vol. 2014, Hindawi, 2014. [53] M. Namdari, H. Jazayeri-Rad, S.-J. Hashemi, Process fault diagnosis using support vector machines with a genetic algorithm based parameter tuning, Journal of Automation and Control 2 (1) (2014) 1–7. [54] Z.B. Sahri, R.B. Yusof, Support vector machine-based fault diagnosis of power transformer using k nearest-neighbor imputed DGA dataset, Journal of Computer and Communications 2 (09) (2014) 22. [55] A. Dairi, T. Cheng, F. Harrou, Y. Sun, T. Leiknes, Deep learning approach for sustainable WWTP operation: a case study on data-driven influent conditions monitoring, Sustainable Cities and Society 50 (2019) 101670. [56] F. Harrou, A. Dairi, Y. Sun, F. Kadri, Detecting abnormal ozone measurements with a deep learning-based strategy, IEEE Sensors Journal 18 (17) (2018) 7222–7232. [57] P. Malhotra, L. Vig, G. Shroff, P. Agarwal, Long short term memory networks for anomaly detection in time series, in: Proceedings, Presses universitaires de Louvain, 2015, p. 89. [58] A. Dairi, F. Harrou, M. Senouci, Y. Sun, Unsupervised obstacle detection in driving environ- ments using deep-learning-based stereovision, Robotics and Autonomous Systems 100 (2018) 287–301. [59] A. Dairi, F. Harrou, Y. Sun, M. Senouci, Obstacle detection for intelligent transportation sys- tems using deep stacked autoencoder and k-nearest neighbor scheme, IEEE Sensors Journal 18 (12) (2018) 5122–5132. [60] D.M. Powers, Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation, Journal of Machine Learning Technology 2 (2011) 37–63. [61] D.L. Olson, D. Delen, Advanced Data Mining Techniques, Springer Science Business Me- dia, 2008.
Chapter 2
Linear latent variable regression (LVR)-based process monitoring

2.1 Introduction

With the advancement in instrumentation and data acquisition, and the rapid development of "Internet-of-Things" technology, which connects a large number of digital devices, enormous amounts of information have become available anywhere at any time from a multitude of smart devices. Indeed, large datasets are produced by the collection of large numbers of measurements from modern engineering and environmental processes. By exploiting these measurements, which carry a certain level of redundancy, it becomes feasible to detect an abnormal change and locate its source in the inspected process. However, in the absence of effective tools, the information in these datasets cannot be suitably extracted and exploited for inference and process monitoring. Over the past decade, the necessity for prediction and fault-detection tools has resulted in the design of several fault-detection mechanisms, which belong to either model-based (or analytical) or data-driven methods [1,2]. Analytical models, based on ideal hypotheses that utilize first principles, can theoretically explain a system's behavior; however, they need prior calibration of model parameters, which is challenging and costly in high-dimensional cases and may result in ill-conditioning problems [3]. Data-driven approaches can perform systematic and objective exploration, visualization, and interpretation of data, can identify essential factors, features, or patterns, and can endorse and optimize data-supported decision-making [4]. Data-based techniques carry information on faults by extracting relevant features from data. Data-driven approaches are now commonly applied in engineering and petrochemical processes [5]. For instance, in the petrochemical industry, where soft sensors are widely used, billions of dollars were once lost annually because of the occurrence of faults [6]. Environmental data have also been exploited by data-driven approaches for anomaly detection in, for example, meteorological signals [7] and the monitoring of sludge bulking in wastewater treatment plants (WWTPs) [8]. Fault detection in chemical process industries is particularly challenging due to the large number of variables involved, the dynamic characteristics, and the noisy measurements that occur in these processes. Indeed, a large number of variables leads to collinearity, which increases the uncertainty about the model parameter estimates. The latent variable regression (LVR) model is a commonly used
modeling framework to remedy such problems. The LVR model can deal with collinearity among variables by constructing a model from a reduced number of variables (which are linear combinations of the original variables) called latent variables or principal components. This approach results in well-conditioned models [9,10]. LVR model estimation techniques include principal component regression (PCR) [11,9] and partial least squares (PLS) [12,13]. The organization of this chapter is as follows. In Sect. 2.2, we present a brief introduction to inferential modeling methods, including full rank models and latent variable regression (LVR) techniques. The presented full rank modeling techniques include ordinary least squares (OLS) regression and ridge regression (RR), while the latent variable regression techniques include PCR and PLS. Since the conventional LVR models are static and more appropriate for handling steady-state processes, the dynamic versions of the LVR models are also briefly presented. Section 2.3 is devoted to an overview of some common statistical techniques that are applied in statistical process monitoring. Specifically, this section presents the basic univariate monitoring schemes, namely Shewhart, exponentially weighted moving average (EWMA), cumulative sum (CUSUM), generalized likelihood ratio (GLR), and distribution-based algorithms, and we discuss their limitations. Section 2.4 presents the general framework of fault detection based on LVR approaches. In Sect. 2.5, we discuss one of the commonly used fault isolation approaches, namely contribution plots; we also present an innovative method that uses radial visualization (RadViz) to perform root cause diagnosis. The main objective of this chapter is to investigate these multivariate monitoring schemes (PCA and PLS) and their practical applications. In Sect. 2.6, we assess the performance of the developed inferential modeling techniques using simulated and practical examples. In addition, we evaluate PCA-based anomaly detection using seven years of influent characteristics (ICs) data from a coastal municipal WWTP where multiple abnormal events occurred. The chapter concludes with a discussion and remarks in Sect. 2.7.

2.2 Development of linear LVR models

Measurements from engineering and industrial processes are usually massive and include a large number of (high-dimensional) variables because of the complexity of the processes involved. Traditional regression models like least squares are unsuited to providing reliable predictions in this setting due to high collinearity and ill-conditioning issues. A large variety of estimation techniques exist to address this modeling problem, including full-rank methods and latent variable regression methods. In this section, we present the basic theoretical perspective of some commonly used linear regression models that are used to design process monitoring algorithms, namely OLS, RR, PCR, and PLS; these traditional linear correlation models for multivariate data form the basis for designing fault detection methods. The basic concepts of each approach and a discussion of their advantages and weaknesses are presented.
2.2.1 Full rank methods

2.2.1.1 Ordinary least squares regression
We regress the measured output $y \in \mathbb{R}^n$ on $X \in \mathbb{R}^{n\times m}$ (a selected group of process variables whose values are known precisely) as
$$ y = X\beta + \epsilon, \qquad (2.1) $$
where $\beta \in \mathbb{R}^m$ is a vector of unknown constants to be estimated, and $\epsilon \sim \mathcal{N}(0, \sigma^2 I_n)$ is zero-mean Gaussian noise with known variance. The essence of ordinary least squares (OLS) regression is to estimate the model parameters by minimizing the following objective function [14,11]:
$$ \min_{\beta} \|X\beta - y\|_2^2. \qquad (2.2) $$
The unbiased maximum likelihood estimate of $\beta$, if the matrix $X^\top X$ is nonsingular and the elements of the noise are uncorrelated [15,16], is
$$ \hat{\beta}_{\mathrm{OLS}} = (X^\top X)^{-1} X^\top y. \qquad (2.3) $$
When the input process variables are highly correlated, the variances of the OLS regression coefficients become very high, and the estimates may be inaccurate. In other words, the determinant of the matrix $X^\top X$ is then very close to zero, giving unstable values for the variance of the estimated regression parameters, $V(\hat{\beta}) = \sigma^2 (X^\top X)^{-1}$. Moreover, the parameter estimates change considerably if elements of $y$ are changed slightly, and thus $y$ is poorly predicted when the model is utilized with new $X$ measurements. In summary, when $X^\top X$ is close to being singular, the variance of $\hat{\beta}$ is inflated, which also increases the uncertainty about its estimation. Even if the numerical issues can be surmounted via methods such as the pseudo-inverse, the statistical features of the model still suffer from the inflated variance. One way to cope with this collinearity problem and the ill-conditioning of $X$ is through regularization methods, such as ridge regression (RR), which is presented next (a brief numerical illustration of the collinearity problem is given below).
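The following short sketch computes the OLS estimate of Eq. (2.3) and illustrates the instability caused by near-collinear inputs; the synthetic data and names are ours, for illustration only.

```python
import numpy as np

def ols_fit(X, y):
    """OLS estimate (Eq. (2.3)); lstsq is numerically preferable to
    forming (X^T X)^{-1} explicitly."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Collinearity demo: two nearly identical regressors.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
X = np.column_stack([x1, x1 + 1e-6 * rng.normal(size=200)])
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.normal(size=200)
print(ols_fit(X, y))   # coefficients are wildly inflated and unstable
```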
2.2.1.2 Ridge regression (RR)
As discussed above, when the input process variables are highly cross-correlated, the OLS method can result in a poor estimate of the regression coefficients. One way to mitigate this problem is to relax the condition that $\hat{\beta}_{\mathrm{OLS}}$ should be an unbiased estimator. There are several methods in the literature to obtain biased estimators of regression coefficients. The RR approach, originally introduced by Hoerl and Kennard [17], is commonly used to alleviate the collinearity problem and is tuned to obtain good prediction models by trading off bias and variance. The RR estimator is computed by minimizing the following objective function [17]:
$$ \min_{\beta} \|X\beta - y\|_2^2 + \lambda \|\beta\|_2^2, \qquad (2.4) $$
$$ \hat{\beta}_{RR} = (X^\top X + \lambda I)^{-1} X^\top y, \qquad (2.5) $$
where $\lambda$ is a positive constant and $I \in \mathbb{R}^{m\times m}$ is the identity matrix. Note that in Eq. (2.5), the term $\lambda I$ added to $X^\top X$ enhances the conditioning of the estimation problem. Of course, the RR estimator $\hat{\beta}_{RR}$ is basically a linear transformation of the OLS estimator $\hat{\beta}_{\mathrm{OLS}}$: Eq. (2.5) can be rewritten as
$$ \hat{\beta}_{RR} = (X^\top X + \lambda I)^{-1} (X^\top X)\, \hat{\beta}_{\mathrm{OLS}} = Z_\lambda \hat{\beta}_{\mathrm{OLS}}. \qquad (2.6) $$
Thus, the RR estimator is a biased estimator since
$$ E(\hat{\beta}_{RR}) = E(Z_\lambda \hat{\beta}_{\mathrm{OLS}}) = Z_\lambda \beta. \qquad (2.7) $$
The covariance matrix of $\hat{\beta}_{RR}$ is expressed as
$$ V(\hat{\beta}_{RR}) = \sigma^2 (X^\top X + \lambda I)^{-1} X^\top X (X^\top X + \lambda I)^{-1}. \qquad (2.8) $$
The basic concept when using RR is to select a value of $\lambda$ that guarantees a greater decrease in the variance term than the increase in the squared bias. If this is accomplished, the MSE of $\hat{\beta}_{RR}$ will be less than the variance of $\hat{\beta}_{\mathrm{OLS}}$. In [18], it has been demonstrated that there is a positive constant $\lambda$ for which the MSE of $\hat{\beta}_{RR}$ is less than the variance of $\hat{\beta}_{\mathrm{OLS}}$. In practice, various procedures have been developed to choose the value of $\lambda$. For instance, in [18] the authors proposed determining a suitable value of $\lambda$ by inspecting the ridge trace, which is a plot of the elements of $\hat{\beta}_{RR}$ versus $\lambda$, with $\lambda \in [0, 1]$; the aim is to determine a reasonably small value of $\lambda$ for which the ridge estimates are stable. In [19], an appropriate selection of $\lambda$ is given as $\kappa = \frac{m \hat{\sigma}^2}{\hat{\beta}_{\mathrm{OLS}}^\top \hat{\beta}_{\mathrm{OLS}}}$, where $\hat{\beta}_{\mathrm{OLS}}$ and $\hat{\sigma}^2$ are determined from a least squares solution. Of course, these models can be used as an alternative to mitigate the ill-conditioning problem. However, they are not easily interpretable, whereas an important purpose of data modeling is interpretability; see [15,16,18].
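A minimal sketch of the ridge estimator of Eq. (2.5) and a crude ridge trace in the spirit of [18] (the grid of $\lambda$ values and variable names are our illustrative choices):

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge estimator, Eq. (2.5): (X^T X + lambda I)^{-1} X^T y."""
    m = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ y)

# Ridge trace: coefficient paths versus lambda; picking the smallest
# lambda where the trace stabilizes mimics the procedure of [18].
lams = np.logspace(-6, 0, 25)
trace = np.array([ridge_fit(X, y, lam) for lam in lams])  # X, y as above
```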
2.2.2 Latent variable regression (LVR) models
Multivariate statistical projection methods such as PCA, PCR, and PLS are commonly utilized to handle a high number of highly correlated process variables by conducting regression on a smaller number of transformed variables (i.e., latent variables or principal components), which are linear combinations of the raw measurements. After computing the latent variables of the process being investigated, this smaller set of variables is used in place of the raw data. This latent variable regression (LVR) approach generally results in well-conditioned parameter estimates and reliable model predictions [20]. In this section, these LVR methods are briefly presented; for more details, refer to [21–23]. Before presenting the PCR and PLS regression methods, we present PCA, which is a popular multivariate dimensionality-reduction approach.

2.2.2.1 Principal component analysis

Feature extraction with PCA
PCA, a dimensionality-reduction approach, is an increasingly popular modeling framework for discovering relevant and crucial features in multivariate data. The foundations of PCA can be traced back to Pearson (1901) [24] and Hotelling (1933) [25]. By projecting process variables into a lower-dimensional subspace, PCA reveals the inherent cross-correlation among process variables [26]. In this regard, the PCA latent variables or principal components (PCs) (also called scores), which consist of linear combinations of physical variables, can efficiently describe a process in a reduced subspace. PCA-based methods are now commonly applied in data compression [27], pattern recognition, data smoothing, classification [28], and fault detection [29]. PCA does not differentiate between input data X and output data Y; it is applied to one data set that contains all the process variables involved in the problem. Here, X is used to represent the whole data set. Let $X = [x_1^\top, \dots, x_n^\top]^\top \in \mathbb{R}^{n\times m}$ be a dataset gathered from a process having n observations and m variables. Let us first discuss an important point before going into further detail. When performing PCA on multivariate data, it is assumed that all the data are on a comparable scale. If scaling of the data is omitted, then certain variables in the data have to be adjusted to avoid misleading dominance. Scaling of the data changes the covariance matrix and consequently affects the principal components; scaling is important for both the variance and mean adjustments [30]. When the process variables are measured in different units, the purpose of the usual scaling is to make the variances the same (i.e., to give standard units), which yields a correlation matrix. Other variance-stabilizing transformations, such as the log transformation, are also used in the literature. The most commonly used scaling converts the variables to zero mean and unit variance. Each variable $x_j \in \mathbb{R}^n$, $j = 1, \dots, m$, should be scaled to have zero mean and unit variance prior to using PCA:
$$ x_{j,s} = \frac{x_j - \mu_{x_j}}{\sigma_{x_j}}. \qquad (2.9) $$
From now on, we consider that the autoscaled data are zero-mean centered with unit variance,
$$ X = \begin{pmatrix} x_{1,1} & x_{1,2} & \dots & x_{1,m} \\ x_{2,1} & x_{2,2} & \dots & x_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \dots & x_{n,m} \end{pmatrix} \in \mathbb{R}^{n\times m}. $$
The scaled data X can be expressed using singular value decomposition (SVD) as a product of two factors:
$$ X = t_1 w_1^\top + t_2 w_2^\top + \dots + t_m w_m^\top = T W^\top, \qquad (2.10) $$
where $T \in \mathbb{R}^{n\times m}$ represents the matrix of principal components (PCs) and $W \in \mathbb{R}^{m\times m}$ is the loading matrix. The PCs are linear combinations of the original data, and each PC is uncorrelated with the others. The loading matrix is frequently calculated through the eigendecomposition of the covariance matrix S of the data X:
$$ S = \frac{1}{n-1} X^\top X = W \Lambda W^\top \quad \text{with} \quad W W^\top = W^\top W = I_m, \qquad (2.11) $$
where $\Lambda = \mathrm{diag}(\sigma_1^2, \dots, \sigma_m^2)$ is a matrix comprising the eigenvalues of S arranged diagonally in decreasing magnitude. The eigenvalue $\lambda_i$ is equal to the variance of the PC $t_i$, $\sigma_i^2$ (i.e., $\mathrm{var}(w_i^\top x) = \lambda_i$). In the presence of cross-correlated multivariate data X, the first l PCs (where $l \ll m$) are sufficient for preserving the relevant information in the original data. One important step in PCA model development is to select the number of PCs.
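Before turning to the selection criteria, here is a minimal sketch of the autoscaling of Eq. (2.9) and an SVD-based PCA (Eqs. (2.10)–(2.11)); the function names are ours.

```python
import numpy as np

def autoscale(X):
    """Eq. (2.9): column-wise zero mean and unit variance."""
    mu, sigma = X.mean(axis=0), X.std(axis=0, ddof=1)
    return (X - mu) / sigma, mu, sigma

def pca_svd(Xs):
    """PCA of autoscaled data via SVD (Eq. (2.10)): Xs = T W^T with
    scores T = U*S and loadings W = V; the eigenvalues of the
    covariance matrix (Eq. (2.11)) follow from the singular values."""
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    W = Vt.T                                # loading matrix
    T = U * s                               # principal component scores
    eigvals = s**2 / (Xs.shape[0] - 1)      # variances of the PCs
    return T, W, eigvals
```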
Criteria for selecting the number of principal components to use
A core step in designing LVR approaches is selecting the number of LVs, l, so as to appropriately extract the relevant information from the received data. In other words, the prediction performance of the designed LVR model is influenced by the choice of the number of LVs, l. Accordingly, an appropriate estimation of the number of LVs is necessary to avoid the model underfitting or overfitting the data. Some of these techniques are briefly described below:
• The scree test. The scree plot displays the variance captured by every PC against the number of PCs [31]. The number of PCs to retain is obtained by finding the eigenvalue λ at which the profile shows an elbow (i.e., the profile is no longer linear). This identification procedure is easy to visualize, but it may not be easy to implement automatically.
• Parallel analysis. Parallel analysis compares the variance profile to that obtained by assuming independent variables, to determine the number of PCs. Specifically, l is determined at the point where the two profiles cross [31,32].
• The cumulative percentage variance (CPV) procedure. The CPV procedure has been commonly employed to find the number of PCs explaining a certain percentage of the total variance (e.g., 90%) [31]:
$$ \mathrm{CPV}(l) = \frac{\sum_{i=1}^{l} \lambda_i}{\sum_{i=1}^{m} \lambda_i} \times 100. \qquad (2.12) $$
This procedure is attractive since it is intuitive and easy to implement [31] (a code sketch is given after Fig. 2.1).
• Cross-validation. The key concept of the cross-validation mechanism is splitting the data into training datasets for model construction and testing datasets for model validation [33]. The model is verified using the test data, and residuals are generated by comparing the estimated values to the measured values. In the CV approach, the optimum number of PCs is determined using the Predictive Sum of Squares (PRESS) statistic [33],
$$ \mathrm{PRESS}_l = \sum_{i=1}^{n} (X_i - \hat{X}_i^l)^2, \qquad (2.13) $$
where l is the number of PC vectors retained to calculate $\hat{X}$, i.e., the dimension of the PCs. The dimensionality is determined by finding the number of PCs corresponding to the minimum of the PRESS [33].

Based on the PCA model, after selecting the appropriate number of PCs to include in the model, the data matrix X can be expressed as a sum of the approximated matrix, $\hat{X}$, and residual data, E (Fig. 2.1):
$$ X = T W^\top = \sum_{i=1}^{l} t_i w_i^\top + \sum_{i=l+1}^{m} t_i w_i^\top = \hat{X} + E, \qquad (2.14) $$
where $T \in \mathbb{R}^{n\times m}$ represents the matrix of principal components (PCs) and $W \in \mathbb{R}^{m\times m}$ is the loading matrix.

FIGURE 2.1 Schematic representation of PCA model.
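A possible implementation of the CPV choice of Eq. (2.12) and the model/residual split of Eq. (2.14), continuing the `pca_svd` sketch above:

```python
import numpy as np

def select_l_cpv(eigvals, threshold=90.0):
    """Eq. (2.12): smallest l whose cumulative percent variance
    exceeds the threshold (assumes the threshold is reachable)."""
    cpv = 100.0 * np.cumsum(eigvals) / np.sum(eigvals)
    return int(np.argmax(cpv >= threshold)) + 1

def pca_split(T, W, l):
    """Eq. (2.14): split X into the PCA approximation and residuals."""
    X_hat = T[:, :l] @ W[:, :l].T    # modeled part (principal subspace)
    E = T[:, l:] @ W[:, l:].T        # residual part, X = X_hat + E
    return X_hat, E
```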
As described above, the orthogonal eigenvectors of the covariance matrix form the loading matrix $W = (w_1, w_2, \dots, w_m)$, and the eigenvalue $\lambda_i$ is the variance of the score $t_i$. The loading matrix can be partitioned into two parts, $\hat{W}$ and $\tilde{W}$, i.e., $W = [\hat{W} \ \tilde{W}]$. Here $\hat{W} = (w_1, w_2, \dots, w_l)$ contains the first l principal loading vectors (PCs) and $\tilde{W} = (w_{l+1}, w_{l+2}, \dots, w_m)$ contains the remaining m − l loading vectors. The partition is shown below:
$$ S = \frac{1}{n-1} X^\top X = [\hat{W} \ \tilde{W}] \begin{bmatrix} \hat{\Lambda} & 0 \\ 0 & \tilde{\Lambda} \end{bmatrix} \begin{bmatrix} \hat{W}^\top \\ \tilde{W}^\top \end{bmatrix}. \qquad (2.15) $$
The data matrix X can be factorized as
$$ X = [\hat{T} \mid \tilde{T}]\,[\hat{W} \mid \tilde{W}]^\top = \hat{T}\hat{W}^\top + \tilde{T}\tilde{W}^\top = \underbrace{X \hat{W} \hat{W}^\top}_{\hat{X}} + \underbrace{X (I_m - \hat{W}\hat{W}^\top)}_{E}. \qquad (2.16) $$
Here $\hat{T} \in \mathbb{R}^{n\times l}$ is the PC (score) matrix, which describes the values of the variables in the transformed basis spanned by $\hat{W}$, while l is chosen to capture most of the variability in the data so that no relevant information is lost in E. The matrices $\hat{W}\hat{W}^\top$ and $(I_m - \hat{W}\hat{W}^\top)$ span the principal component and residual subspaces, respectively. The vectors in $\hat{X}$ and E are orthogonal, i.e., $\hat{X}^\top E = 0$. The residual matrix plays a core role in uncovering abnormal features in process monitoring. For the purpose of anomaly detection, the residuals generated from the developed PCA reference model are evaluated by univariate or multivariate statistical monitoring schemes; more details on process monitoring are given in the subsequent sections.

2.2.2.2 Principal component regression
PCR is an alternative to OLS regression for addressing the issue of ill-conditioning or collinearity in multivariate linear regression, which results in a poor estimation of the model parameters. PCR is a linear regression approach that can handle highly correlated process variables by using latent variables as regressors in the regression. It can be implemented in two steps. The first step of PCR consists of projecting the input variables via PCA to account for collinearity and reduce their dimension; to this end, SVD is frequently employed to compute the PCs. In the second step, OLS regression is conducted between the retained PCs and the response [14,11] (Fig. 2.2). To sum up, the key idea of PCR is to use l uncorrelated score vectors from the PCA instead of the original columns of X. Specifically, the multicollinearity among the predictor variables is eliminated by using a subset of orthogonal PCs of the input data X obtained via PCA. Then, OLS is performed between the response variable y and the retained l PCs of X.
FIGURE 2.2 Schematic representation of (A) MLR and (B) PCR models.

From the PCA model, the matrix X can be decomposed as follows:
$$ X = T W^\top = \sum_{i=1}^{l} t_i w_i^\top + \sum_{i=l+1}^{m} t_i w_i^\top = \hat{X} + E, \qquad (2.17) $$
where $T \in \mathbb{R}^{n\times m}$ represents the matrix of PCs and $W \in \mathbb{R}^{m\times m}$ is the loading matrix. Then, a subset of these PCs (those with the largest variance) is utilized to build a linear model relating the PCs to the response variable, y, using OLS regression,
$$ y = \hat{T} \hat{\beta}, \qquad (2.18) $$
where $\hat{T} = [t_1 \dots t_l]$ contains the retained PCs (with the largest eigenvalues) used to construct the model, with $l \le m$; l is selected such that there is no important loss of process information in the residuals. The regression vector $\hat{\beta}$ is obtained by solving the following minimization problem:
$$ \min_{\beta} \|\hat{T}\beta - y\|_2^2, \qquad (2.19) $$
$$ \hat{\beta} = (\hat{T}^\top \hat{T})^{-1} \hat{T}^\top y. \qquad (2.20) $$
Note that PCR is equivalent to OLS if all PCs are used in designing the PCR model (i.e., l = m). In the case of uncorrelated input variables, OLS would be the first option for regression. Note also that the PCs in PCR are determined without taking the model response into consideration. Next we present another approach to cope with the multicollinearity problem, which takes the input–output relationship into account when determining the PCs: partial least squares (PLS).
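A minimal PCR sketch following Eqs. (2.17)–(2.20) (the helper names and the choice to rescale new data with the training statistics are ours):

```python
import numpy as np

def pcr_fit(X, y, l):
    """PCR: regress y on the first l PCA scores of the autoscaled X."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Xs, full_matrices=False)
    T_hat = (U * s)[:, :l]                             # retained scores
    beta, *_ = np.linalg.lstsq(T_hat, y, rcond=None)   # Eq. (2.20)
    return beta, Vt[:l].T                              # coefficients, loadings

def pcr_predict(Xnew, X, beta, W_hat):
    """Scale new data like the training set, project, and predict."""
    Xs = (Xnew - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    return Xs @ W_hat @ beta
```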
2.2.2.3 Partial least squares
This section introduces PLS regression modeling (also known as projection on latent structures), which was first proposed in [34] in the field of econometrics. Later, a detailed PLS algorithm was provided in [35], and in [36] the geometry of two procedures for performing PLS is illustrated. This technique is used in chemometrics and chemical engineering for soft sensor development [37], process monitoring, and fault diagnosis. The capacity of PLS to deal with multivariate input–output data with collinearity is one of its desirable characteristics [38]. When the matrix $X^\top X$ is singular or ill-conditioned, PLS determines an optimum pair of LVs in the input and output data (X and Y) so that these transformed variables have the largest covariance. Unlike PCR, PLS exploits the information in the input and output variables by using the covariance between them and reducing the impact of irrelevant variations of the input variables. In other words, PLS is designed using the PCs of both X and Y. Basically, the PLS model is obtained by searching for a set of PCs that explains the maximum cross-correlation between X and Y (Fig. 2.3).

FIGURE 2.3 Schematic representation of PLS model.

Consider an input with n samples and m variables, $X \in \mathbb{R}^{n\times m}$, and an output with n samples and p variables, $Y \in \mathbb{R}^{n\times p}$. PLS extracts the principal components iteratively by maximizing the covariance of the extracted principal components. PLS model development has two components: one is to develop the inner model and the other is to develop the outer models [39,40]. The outer models are related to the inner model such that
$$ X = \sum_{i=1}^{l} t_i p_i^\top + G = T P^\top + G, \qquad Y = \sum_{i=1}^{l} u_i q_i^\top + F = U Q^\top + F, \qquad (2.21) $$
where $T \in \mathbb{R}^{n\times l}$ and $U \in \mathbb{R}^{n\times l}$ are matrices of the transformed uncorrelated variables, and the loading matrices of the input and output spaces are $P \in \mathbb{R}^{m\times l}$ and $Q \in \mathbb{R}^{p\times l}$, respectively. The model residuals are G and F. The number of PCs, l, is determined by cross-validation. The retained latent variables of the input and output spaces are related by the linear inner model
$$ U = T B + H, \qquad (2.22) $$
where B is a regression matrix linking the input and response PCs, and H is a residual matrix. The regression coefficients in B can be obtained by minimization of the residuals H. The response Y is given as
$$ Y = T B Q^\top + F^{*}. \qquad (2.23) $$
Notice that each pair of latent variables in the PLS model (i.e., $t_j$ and $u_j$, $j = 1, \dots, l$) is estimated iteratively [35,41]. Various procedures have been developed in the literature to obtain PLS estimators, including the nonlinear iterative partial least squares (NIPALS) and SIMPLS methods; for more details, refer to [35,36,34,12]. The first pair of latent variable vectors is calculated so that the covariance
$$ \operatorname*{arg\,max}_{p_1, q_1} \ \mathrm{cov}(t_1, u_1) = t_1^\top u_1 = p_1^\top X^\top Y q_1 \qquad (2.24) $$
is maximized under the constraints $p_i^\top p_i = 1$ and $q_i^\top q_i = 1$. The first pair of loading vectors $(p_1, q_1)$, which represents the dominant direction, is computed so that the covariance between X and Y is maximized. Then, the first pair of latent variable vectors ($t_1 = X p_1$; $u_1 = Y q_1$) is obtained by projecting the X data on $p_1$ and the Y data on $q_1$ (the outer model). After that, the inner model can be established between $t_1$ and $u_1$ ($\hat{u}_1 = t_1 b_1$). After the first set of scores and loadings is computed, the residuals of the input and output variables are calculated as
$$ E_1 = X - t_1 p_1^\top, \qquad F_1 = Y - \hat{u}_1 q_1^\top = Y - t_1 b_1 q_1^\top. \qquad (2.25) $$
Overall, PLS iteratively estimates the LVs of both X and Y so that they have maximal covariance. These pairs of LVs are estimated and added to the model one at a time; the input and output residuals are generated, and the procedure is iterated on the residuals until the cross-validation error is minimized [11,35,34,14]. Fig. 2.4 illustrates the recursive process of determining the LVs in PLS.

FIGURE 2.4 Schematic representation of the recursive procedure to determine the PCs in PLS.

The NIPALS algorithm, which is commonly used to derive PLS models, is summarized below [42]:
Step 1. Scale the data X and Y to have zero mean and unit variance
Step 2. Set u equal to a column of Y
Step 3. Let $w = \frac{X^\top u}{u^\top u}$
Step 4. Normalize w to have unit length
Step 5. Evaluate the scores, $t = \frac{X w}{w^\top w}$
Step 6. Evaluate the Y loadings, $q = \frac{Y^\top t}{t^\top t}$ (normalized to unit length), and the new u vector, $u = \frac{Y q}{q^\top q}$
Step 7. Check convergence on u: if YES go to Step 8, if NO go to Step 2
Step 8. Evaluate the X loadings, $p = \frac{X^\top t}{t^\top t}$
Step 9. Evaluate the residual matrices, $E = X - t p^\top$ and $F = Y - t q^\top$
Step 10. If additional PLS dimensions are necessary, replace X and Y by E and F, respectively, and repeat Steps 1 through 9 (a code sketch of this iteration is given below)
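The following NumPy sketch implements the NIPALS iteration of Steps 1–10 with deflation; it is illustrative rather than a production PLS implementation, and it assumes X and Y are already autoscaled.

```python
import numpy as np

def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """NIPALS PLS sketch: X (n x m), Y (n x p), both autoscaled."""
    E, F = X.copy(), Y.copy()
    T, P, Q = [], [], []
    for _ in range(n_components):
        u = F[:, 0]                                  # Step 2
        for _ in range(max_iter):
            w = E.T @ u / (u @ u)                    # Step 3
            w /= np.linalg.norm(w)                   # Step 4
            t = E @ w / (w @ w)                      # Step 5
            q = F.T @ t / (t @ t)                    # Step 6: Y loadings
            q /= np.linalg.norm(q)
            u_new = F @ q / (q @ q)
            if np.linalg.norm(u_new - u) < tol:      # Step 7
                u = u_new
                break
            u = u_new
        p = E.T @ t / (t @ t)                        # Step 8
        b = (t @ u) / (t @ t)                        # inner-model coefficient
        E = E - np.outer(t, p)                       # Step 9: deflate X
        F = F - b * np.outer(t, q)                   # Step 9: deflate Y
        T.append(t); P.append(p); Q.append(q)        # Step 10: repeat on E, F
    return np.array(T).T, np.array(P).T, np.array(Q).T
```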
Since PLS uses a covariance objective function, it frequently needs multiple LVs even in the case of a single output in Y. However, sometimes an important part of the LV subspace is orthogonal or irrelevant to the output, despite the fact that this subspace carries a large share of the variability of the input data [43]. Thus, to further improve PLS, numerous extensions have been developed, such as orthogonalized PLS [44] and concurrent PLS approaches [45].

Note that the LVR methods described above all exploit latent structured relationships between the process variables that are linear and static. They establish the basic framework for further enhancements toward nonlinear or dynamic LVR modeling.

2.3 Dynamic LVR models

From the above discussion, we have shown that LVR models such as PLS and PCR can handle multivariate data with collinearity among the variables by designing a model from a reduced number of variables (linear combinations of the original variables) termed latent variables; these methods result in well-conditioned models. However, such LVR models are static and ignore process dynamics, which makes them unsuitable for catching the temporal evolution of the data. In other words, using these methods to select the key variables assumes that the variables are uncorrelated in time. Since much practical data produced by engineering and environmental processes is correlated in time, it is necessary to have a model incorporating such information to deal with process dynamics. For dynamic processes such as engineering and chemical processes, the actual observations of a process variable frequently depend on past observations. The application of static LVR approaches (e.g., PLS and PCA) to dynamic data will not give an accurate model of the relations among the variables, but just a linear static approximation. To remedy this limitation and capture the dynamic information, an augmented process dataset, including previous autocorrelated measurements, should be created. A commonly used approach is dynamic PCA (DPCA), which was introduced in [46]. Basically, DPCA is conventional PCA applied to data augmented with time-lagged measurements of the process variables. Specifically, to describe the temporal dynamics, the Hankel matrix of the original data, which is usually employed in time-series modeling, is used in [46]. The augmented data matrix that includes time-lagged variables at time instant k is
$$ X_z = [X(k) \ X(k-1) \ \dots \ X(k-z)] = \begin{bmatrix} x^\top(0) & x^\top(1) & \dots & x^\top(z) \\ x^\top(1) & x^\top(2) & \dots & x^\top(z+1) \\ \vdots & \vdots & \ddots & \vdots \\ x^\top(n-z) & x^\top(n-z+1) & \dots & x^\top(n) \end{bmatrix}, \qquad (2.26) $$
where z is the number of time lags, whose length reflects the past memory entered into the variables. DPCA is applied to the augmented process data matrix in the same way as conventional PCA [46]; indeed, it is basically static PCA except that the input data are augmented to include past measurements. The selection of the appropriate number of lagged variables plays a key role in DPCA in appropriately modeling the process dynamics. For highly nonlinear data, the number of lags, z, may need to take a higher value to achieve a better linear approximation. DPCA modeling can be outlined in the following steps:
(1) Start with z = 0
(2) Compute the augmented data matrix X_z
(3) Design the PCA model using the augmented data
(4) Select the optimal PCs to be kept in the model using a known criterion such as the cumulative percent variance (CPV) approach
(5) Check the autocorrelation function (ACF) of the residuals of the PCA model
(6) If the ACF is within the threshold, i.e., the residuals are not correlated, go to Step (8); otherwise, proceed
(7) Increment the number of lags, z = z + 1, and go to Step (2)
(8) End

The essence of DPCA is to apply PCA to time-lagged data, so that both the linear static and the dynamic relationships among the process variables are captured. To sum up, DPCA exploits both the desirable characteristics of PCA for high-dimensional data and the flexibility of the Autoregressive Integrated Moving Average (ARIMA) time-series model for capturing the time dependency in the data [47,48].
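A minimal sketch of the lag augmentation of Eq. (2.26) (the function name is ours); DPCA is then ordinary PCA on the augmented matrix, with z increased until the PCA residuals show no significant autocorrelation.

```python
import numpy as np

def lag_augment(X, z):
    """Build the time-lagged (Hankel-style) matrix of Eq. (2.26):
    row i stacks [x(i), x(i+1), ..., x(i+z)] side by side, so an
    (n x m) matrix becomes ((n - z) x m(z + 1))."""
    n = X.shape[0]
    return np.hstack([X[i:n - z + i] for i in range(z + 1)])
```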
On the other hand, several approaches have been designed in the literature to handle dynamics in multivariate input–output processes based on dynamic PLS. One approach consists of incorporating a large number of time-lagged input measurements in the input data matrix X, which leads to a PLS-Finite Impulse Response (FIR) model [49]. Analogously to DPCA, when the time-lagged data of both the input and output process variables are included in the input data matrix X, the result is a PLS-Autoregressive Moving Average (ARMA) model. Both PLS-FIR and PLS-ARMA models need a large augmentation of the dimension of the input data matrix X, which may be cumbersome to handle. To remedy this difficulty, a simple and flexible method is presented in [50] that permits the inclusion of the process dynamics as part of the inner PLS model, avoiding the inclusion of many time-lagged input and output variables in the input data; the key benefit of this approach is that no lagged variables are used in the PLS outer model. In [51], a dynamic Autoregressive with Exogenous Terms (ARX) or Hammerstein model is used to account for process dynamics in PLS, as the inner relation between $t_i$ and $u_i$, instead of a static model. The aforementioned LVR methods are all extensively used for multivariate process monitoring. To do so, these LVR methods are combined with fault detection indices such as the Hotelling $T^2$ and the squared prediction error statistics. The general framework of LVR-based process monitoring strategies is presented in Sect. 2.5.

2.4 Process monitoring methods

Detecting anomalies in industrial processes plays a core role in developing efficient production systems that achieve acceptable performance and meet the desired requirements and specifications. Without an efficient detection procedure, chemical processes such as distillation columns can be damaged by unexpected faults, resulting in financial losses and serious damage. Univariate statistical monitoring schemes are widely applied in numerous production processes as tools for checking product quality when the inspected variable is univariate. The goal of statistical process monitoring schemes is to uncover any deviation of the supervised process from the desired performance. For many decades, these univariate schemes were frequently applied in quality control applications, and they have now been extended to many other fields, such as air quality [29], cybersecurity [52], healthcare systems [53,54], and economics [53]. In this section, we describe the essence of some basic univariate monitoring schemes, namely the Shewhart, CUSUM, EWMA, and GLR charts.

2.4.1 Univariate chart for process monitoring

In this subsection, we summarize univariate process monitoring charts including Shewhart, CUSUM, EWMA, and GLR.

2.4.1.1 Shewhart-based monitoring scheme

Shewhart introduced the Shewhart monitoring scheme in 1931 to supervise product quality at different phases of a manufacturing process [55]. In practice, this monitoring chart is one of the most frequently applied statistical quality control schemes [55]. Instead of waiting to examine the quality of the final product, early inspection and monitoring enable companies to save costs related to inspection and rejection of the finished product [55].
This would help ensure that a uniform quality of products is maintained, thus leading to increased economic benefits and improved time efficiency. Statistical decisions in Shewhart schemes are based on the current observation only; no memory of the past is considered. Thus, they are suitable for detecting relatively large faults. The Shewhart chart is used online to evaluate process performance based on the currently measured data. Consider that $(x_1, x_2, \dots, x_n)$ are individual observations received from the supervised process. Shewhart schemes are designed under the assumptions that the measurements are uncorrelated and that the data under normal operating conditions are normally distributed [55]. If these two assumptions are verified, the control limits of the Shewhart chart are defined as [55,56]
$$ \mathrm{UCL}, \mathrm{LCL} = \mu_0 \pm z_{1-\frac{\alpha}{2}} \, \sigma_0, \qquad (2.27) $$
where UCL and LCL denote the upper control limit and the lower control limit, and $z_{1-\frac{\alpha}{2}}$ is the $(1-\frac{\alpha}{2})$th quantile of the standard Gaussian distribution $\mathcal{N}(0,1)$. Also, $\mu_0$ and $\sigma_0$ represent the mean and standard deviation of the anomaly-free measurements. The term $z_{1-\frac{\alpha}{2}}$ is usually called the width of the control limits and is generally fixed at 3, which is equivalent to a false alarm rate of 0.27%. The Shewhart scheme flags a fault if
$$ x_t < \mathrm{LCL} \quad \text{or} \quad x_t > \mathrm{UCL}. \qquad (2.28) $$
In summary, the performance of Shewhart charts is limited when they are used to sense small changes in the process mean. They consider only the current measurement of the process, and thus they are classified as detection charts without memory. To tackle this deficiency, improved mechanisms with increased process memory are very helpful. Memory-type monitoring approaches, such as the CUSUM, moving average, and EWMA charts, are designed to detect small changes; a minimal Shewhart sketch is given below.
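A compact implementation of Eqs. (2.27)–(2.28) (function name ours):

```python
import numpy as np

def shewhart_alarms(x, mu0, sigma0, L=3.0):
    """Flag samples outside mu0 +/- L*sigma0 (Eqs. (2.27)-(2.28))."""
    ucl, lcl = mu0 + L * sigma0, mu0 - L * sigma0
    return (x > ucl) | (x < lcl)   # boolean alarm sequence
```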
2.4.1.2 Cumulative sum (CUSUM)-based monitoring schemes

Cumulative sum (CUSUM) monitoring schemes are well reputed in fault detection and were first introduced by Page [57]. Compared to Shewhart-type approaches, CUSUM schemes are a suitable alternative for detecting small changes, which are often a major concern in process monitoring [57]. Instead of using only the current measurement, the CUSUM scheme exploits all the available information from previous and current measurements to uncover faults. The CUSUM statistic $S_i$ is determined as [58]
$$ S_i = \sum_{j=1}^{i} (x_j - \mu_0), \qquad (2.29) $$
where $S_i$ is the cumulative sum over all available measurements, including the current and previously received ones, and $\mu_0$ is the fault-free process mean. The CUSUM decision function can be obtained recursively as [58]
$$ S_i = (x_i - \mu_0) + S_{i-1}. \qquad (2.30) $$
A one-sided CUSUM statistic is calculated as follows [58]:
$$ S_i = \sum_{j=1}^{i} \left[ x_j - (\mu_0 + k) \right], \qquad (2.31) $$
where k is a parameter employed as a reference for detecting a change in the process mean. If $S_i$ becomes negative, the CUSUM decision statistic is automatically reset to zero. A fault is flagged when the CUSUM statistic $S_i$ exceeds the decision threshold H. In practice, a threshold H of $4\sigma$ or $5\sigma$, which results in good detection of a deviation of about $1\sigma$ in the process mean, is recommended [59]; here $\sigma$ is the standard deviation of the monitored variable. Numerous variations of the CUSUM exist; one of the most common forms is the two-sided (tabular) CUSUM [56]. The recursive formulas for high- and low-side shifts are
$$ S_t^{+} = \max\left[0, \; x_t - (\mu_0 + k) + S_{t-1}^{+}\right], \qquad (2.32) $$
$$ S_t^{-} = \max\left[0, \; (\mu_0 - k) - x_t + S_{t-1}^{-}\right], \qquad (2.33) $$
where the statistics $S^{+}$ and $S^{-}$ are, respectively, the upper and lower one-sided CUSUMs, and $S_0^{+} = S_0^{-} = 0$. A fault is declared if either $S_t^{-}$ or $S_t^{+}$ exceeds the decision threshold $H = h\sigma$, where h depends on the shift to be detected (a code sketch follows).
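A possible implementation of the tabular CUSUM of Eqs. (2.32)–(2.33) (names ours):

```python
def tabular_cusum(x, mu0, k, h, sigma):
    """Two-sided CUSUM; returns the indices of alarmed samples."""
    s_plus = s_minus = 0.0
    H = h * sigma
    alarms = []
    for t, xt in enumerate(x):
        s_plus = max(0.0, xt - (mu0 + k) + s_plus)     # Eq. (2.32)
        s_minus = max(0.0, (mu0 - k) - xt + s_minus)   # Eq. (2.33)
        if s_plus > H or s_minus > H:
            alarms.append(t)
    return alarms
```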
2.4.1.3 Exponentially weighted moving average (EWMA) schemes

While CUSUM schemes consider all available measurements with equal weight, EWMA schemes weight the measurements exponentially according to their importance in characterizing the process [60]. The EWMA shows suitable performance in detecting small changes in the process mean. The EWMA scheme was first designed by Roberts [61] and has frequently been applied in quality control and process monitoring [56]. The EWMA monitoring statistic is defined as
$$ z_0 = \mu_0, \qquad z_t = \gamma x_t + (1-\gamma)\, z_{t-1}, \qquad (2.34) $$
where $z_0 = \mu_0$ is the mean of the fault-free data and $\gamma$ is a weighting factor with the range $0 < \gamma \le 1$, which defines the temporal memory of the EWMA scheme. Eq. (2.34) indicates that the EWMA statistic utilizes all the available information to sense small anomalies. To highlight this point, the EWMA statistic $z_t$ can be expanded recursively as
$$ z_t = \gamma x_t + (1-\gamma)\left[ \gamma x_{t-1} + (1-\gamma) z_{t-2} \right] = \gamma x_t + \gamma(1-\gamma) x_{t-1} + (1-\gamma)^2 z_{t-2} = \dots = \gamma x_n + \gamma(1-\gamma) x_{n-1} + \dots + \gamma(1-\gamma)^{n-1} x_1 + (1-\gamma)^n z_0. \qquad (2.35) $$
The EWMA decision function in (2.35) can be expressed in compact form as
$$ z_n = \gamma \sum_{t=1}^{n} (1-\gamma)^{n-t} x_t + (1-\gamma)^n z_0, \qquad (2.36) $$
where $\gamma(1-\gamma)^{n-t}$ denotes the weight of $x_t$, which decreases exponentially for older observations. The parameters L and $\gamma$ play an important role in designing the EWMA scheme [56,54]. The value of L is frequently fixed in practice at 3, which implies a false alarm rate of 0.27%. Generally, small values of $\gamma$ (i.e., less importance placed on the newer observations) are chosen to extend the sensitivity to small deviations, while large values of $\gamma$ (i.e., an EWMA with short memory) are suited for detecting larger changes in the process mean [56,62]. For the purpose of detecting small changes, the value of $\gamma$ is usually selected in the interval [0.1, 0.3] in practice [62,56]. In the absence of anomalies, the distribution of the EWMA statistic is $z_t \sim \mathcal{N}(\mu_0, \sigma_{z_t}^2)$, where $\sigma_{z_t} = \sigma_0 \sqrt{\frac{\gamma}{2-\gamma}\left[1-(1-\gamma)^{2t}\right]}$ and $\sigma_0$ represents the standard deviation of the fault-free measurements. However, when a mean shift occurs at time $1 \le \tau \le n$, the distribution of the EWMA statistic becomes $z_t \sim \mathcal{N}\!\left(\mu_0 + \left[1-(1-\gamma)^{n-\tau+1}\right](\mu_1 - \mu_0), \; \sigma_{z_t}^2\right)$. Accordingly, when faults happen, the mean of the EWMA decision function $z_t$ is a weighted average of $\mu_0$ and $\mu_1$, and the weight related to $\mu_1$ becomes large when n is large. This clearly highlights that the statistic $z_t$ provides pertinent information about the mean shift. The EWMA scheme flags faults when the monitoring statistic $z_t$, as given in (2.34), exceeds the upper or lower control limits defined as
$$ \mathrm{UCL} = \mu_0 + L \sigma_{z_t}, \qquad \mathrm{LCL} = \mu_0 - L \sigma_{z_t}, \qquad (2.37) $$
where $\mu_0$ is the target mean, L is the width of the control limits, and $\sigma_{z_t}$ is computed from the standard deviation of the fault-free or preliminary data set. From $\sigma_{z_t}$, it can be seen that as t becomes larger, the term $\left[1-(1-\gamma)^{2t}\right]$ is asymptotically equivalent to unity; in other words, the control limits attain their steady-state values [56]:
$$ \mathrm{UCL} = \mu_0 + L \sigma_0 \sqrt{\frac{\gamma}{2-\gamma}}, \qquad \mathrm{LCL} = \mu_0 - L \sigma_0 \sqrt{\frac{\gamma}{2-\gamma}}. \qquad (2.38) $$
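A minimal EWMA chart with the time-varying limits of Eq. (2.37) (function name and defaults are ours):

```python
import numpy as np

def ewma_chart(x, mu0, sigma0, gamma=0.2, L=3.0):
    """EWMA statistic (Eq. (2.34)) with exact limits (Eq. (2.37));
    returns the statistic and a boolean alarm sequence."""
    z = np.empty(len(x))
    alarms = np.zeros(len(x), dtype=bool)
    z_prev = mu0
    for t, xt in enumerate(x):
        z_prev = gamma * xt + (1 - gamma) * z_prev
        z[t] = z_prev
        sig = sigma0 * np.sqrt(gamma / (2 - gamma)
                               * (1 - (1 - gamma) ** (2 * (t + 1))))
        alarms[t] = abs(z_prev - mu0) > L * sig
    return z, alarms
```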
their steady-state values [56]:

$$ \mathrm{UCL} = \mu_0 + L\sigma_0\sqrt{\frac{\gamma}{2-\gamma}}, \qquad \mathrm{LCL} = \mu_0 - L\sigma_0\sqrt{\frac{\gamma}{2-\gamma}}. \quad (2.38) $$

A code sketch of an EWMA chart using these steady-state limits follows.
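The following minimal Python sketch implements the EWMA recursion (2.34) with the steady-state limits (2.38). The defaults $\gamma = 0.2$ and $L = 3$ follow the guidelines quoted above; the function name and the synthetic data are assumptions made for the example.

```python
import numpy as np

def ewma_chart(x, mu0, sigma0, gamma=0.2, L=3.0):
    """EWMA monitoring statistic (2.34) with steady-state limits (2.38)."""
    z = mu0                                              # z_0 = mu_0
    half_width = L * sigma0 * np.sqrt(gamma / (2.0 - gamma))
    ucl, lcl = mu0 + half_width, mu0 - half_width        # Eq. (2.38)
    stats, alarms = [], []
    for t, xt in enumerate(x):
        z = gamma * xt + (1.0 - gamma) * z               # recursion (2.34)
        stats.append(z)
        if z > ucl or z < lcl:
            alarms.append(t)                             # fault flagged outside (LCL, UCL)
    return np.array(stats), alarms

# Example: a small 0.5-sigma mean shift injected at t = 150
rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 300)
x[150:] += 0.5
_, alarms = ewma_chart(x, mu0=0.0, sigma0=1.0)
print(alarms[:5])  # first alarm indices after the shift
```

Using the steady-state limits keeps the sketch short; the exact time-varying limits (2.37) could be substituted by multiplying the half-width by $\sqrt{1-(1-\gamma)^{2t}}$ at each step.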
As described previously, anomaly detection in Shewhart schemes is based only on the current measurement, and all past measurements are ignored (Fig. 2.5). Accordingly, these schemes provide unsatisfactory monitoring results when used to sense small changes in the process mean. This limitation can be mitigated by incorporating information from both the current and past measurements into the decision, as in the EWMA and CUSUM schemes (Fig. 2.5). In the CUSUM scheme, information from all available measurements is exploited and the same weight is assigned to every observation (Fig. 2.5). The EWMA scheme, on the other hand, is designed as an exponentially weighted average of all available measurements and is likewise sensitive to small changes in the process mean.

FIGURE 2.5 Univariate process monitoring charts.

In EWMA schemes, a larger value of the smoothing parameter is suited to rapidly detecting faults of large amplitude, while a smaller value can efficiently detect small faults in the process mean [60]. Therefore, with a single fixed smoothing parameter, EWMA-based monitoring cannot achieve good detection capacity for both small and large faults simultaneously [60]. Univariate EWMA control schemes also assume fixed thresholds, which may not be suitable for nonstationary (time-varying) data. Several adaptive EWMA and CUSUM methods have therefore been designed in the literature that allow the thresholds to vary online, accounting for the changing nature of the data [63,64]. The idea behind the adaptive EWMA is to adapt the weight given to past observations according to the magnitude of the prediction error $e_t = x_t - z_{t-1}$, so that faults of different sizes are detected in a more balanced way:

$$ z_t = \gamma x_t + (1-\gamma) z_{t-1} = \gamma e_t + z_{t-1}. \quad (2.39) $$

Also, several adaptive CUSUM (ACUSUM) schemes have been developed in the literature to achieve suitable detection performance over a range of mean-change magnitudes [64,65]. For instance, the basic idea behind the ACUSUM proposed in [64] is to update the reference value $K$ of the CUSUM based on an EWMA estimate. A sketch of the adaptive-EWMA recursion is given below.
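The adaptive-EWMA idea around Eq. (2.39) can be sketched as follows. The specific weighting function used here, which grows smoothly with $|e_t|$ between a small and a large smoothing value, is an illustrative assumption and not the particular design proposed in [63,64].

```python
import numpy as np

def adaptive_ewma(x, mu0, gamma_small=0.1, gamma_large=0.9, c=2.0):
    """Adaptive EWMA sketch: the weight on the current error e_t = x_t - z_{t-1}
    grows with |e_t| (cf. Eq. (2.39)), so small errors are smoothed heavily
    (long memory) while large errors update the statistic quickly.

    The Gaussian-shaped transition below is an illustrative choice only.
    """
    z = mu0
    stats = []
    for xt in x:
        e = xt - z                                    # prediction error e_t
        # weight rises from gamma_small toward gamma_large as |e| grows past c
        w = gamma_small + (gamma_large - gamma_small) * (1.0 - np.exp(-(e / c) ** 2))
        z = z + w * e                                 # z_t = z_{t-1} + gamma(e_t) e_t
        stats.append(z)
    return np.array(stats)
```

With this design the scheme behaves like a small-$\gamma$ EWMA for small shifts and approaches a Shewhart-like fast response for large shifts, which is exactly the balance the adaptive variants aim for.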
2.4.1.4 Generalized likelihood ratio (GLR) hypothesis testing approach

The above-described monitoring schemes (i.e., Shewhart, CUSUM, and EWMA) are each suited to a specific range of fault amplitudes. For instance, Shewhart-type approaches provide satisfactory detection of large faults but are insensitive to small changes in the process mean [54,59], while CUSUM and EWMA schemes are effective in detecting small changes but slower to detect large faults. In practice, however, the magnitude of an occurring fault is unknown. Accordingly, it is desirable to detect a large range of faults automatically and thus reduce the rate of missed detection. To this end, one approach to achieving reliable detection of process anomalies of different sizes is to base the monitoring scheme on a generalized likelihood ratio test (usually called a GLR chart) [66]. The benefits of the GLR approach are its efficiency in separating composite hypotheses, its simplicity, and the absence of complex computations. Extensive literature has been dedicated to studying GLR properties; significant efforts have been devoted to establishing different asymptotic optimality properties of this hypothesis testing approach and can be found in [67–71]. The GLR detector is widely used in several applications, including air quality monitoring [29] and train safety [66].

Here we consider problems related to binary composite hypothesis testing. When testing two composite hypotheses whose data probability density functions (PDFs) contain unknown parameters, the GLR approach is commonly utilized for separating the two possibilities. The null hypothesis generally defines the nominal operating situation, while the alternative characterizes departures whose presence should be either confirmed or discarded. The essence of the GLR approach is to maximize the likelihood ratio statistic over all possible faults in order to decide between two composite hypotheses, $H_0$ and $H_1$, based on the observed data [68–71].

For the purpose of anomaly detection, consider an observation vector $Y = [y_1, y_2, \ldots, y_n]^\top \in \mathbb{R}^n$ generated by one of these Gaussian distributions:

$$ H_0: Y \sim N(0, \sigma^2 I_n), \qquad H_1: Y \sim N(\theta \neq 0, \sigma^2 I_n), \quad (2.40) $$

where $\theta$ is the value of the anomaly and $\sigma^2 > 0$ is the variance. In this chapter, the null hypothesis $H_0$ represents the fault-free situation, and the alternative hypothesis $H_1$ represents the situation with potential faults. Generally speaking, to decide between the two hypotheses, the GLR approach compares the decision statistic $L(Y)$ with the control limit $h(\alpha)$:

$$ \delta(Y) = \begin{cases} H_0 & \text{if } L(Y) = 2\log \sup_{\theta \in \mathbb{R}^n} \dfrac{f_\theta(Y)}{f_{\theta=0}(Y)} < h(\alpha), \\[1ex] H_1 & \text{otherwise}. \end{cases} \quad (2.41) $$

The GLR charting statistic $L(Y)$ is given by

$$ L(Y) = 2\log \left[ \sup_{\theta} \exp\left(-\frac{\|Y-\theta\|_2^2}{2\sigma^2}\right) \Big/ \exp\left(-\frac{\|Y\|_2^2}{2\sigma^2}\right) \right], \quad (2.42) $$

where $\|\cdot\|_2$ is the Euclidean norm and $f_\theta(Y) = \frac{1}{(2\pi)^{n/2}\sigma^n}\exp\left(-\frac{1}{2\sigma^2}\|Y-\theta\|_2^2\right)$ is the PDF of $Y$. Then (2.42) can be expressed as

$$ L(Y) = \frac{1}{\sigma^2}\left[ -\min_{\theta} \|Y-\theta\|_2^2 + \|Y\|_2^2 \right] = \frac{1}{\sigma^2}\left[ -\|Y-\widehat{\theta}\|_2^2 + \|Y\|_2^2 \right]. \quad (2.43) $$

After estimating $\theta$ as $\widehat{\theta} = \arg\min_{\theta} \|Y-\theta\|_2^2 = Y$, $L(Y)$ reduces to

$$ L(Y) = \frac{1}{\sigma^2}\|Y\|_2^2. \quad (2.44) $$

The control limit $h(\alpha)$ is defined to achieve the desired probability of false alarm, selected a priori:

$$ P_0\big(L(Y) \ge h(\alpha)\big) = \int_h^{\infty} f_0(y)\,dy = 1 - F_{\chi_1^2}(h) = \alpha. \quad (2.45) $$

The power function of the GLR approach is determined as

$$ \beta_{\delta^*}(\theta) = P_\theta\big(\delta^*(Y) = H_1\big) = 1 - F_{1,\gamma(\theta)}(h), \quad (2.46) $$

where $F_{1,\gamma(\theta)}$ is the noncentral $\chi^2(1,\gamma)$ distribution with one degree of freedom and noncentrality parameter $\gamma(\theta) = \frac{1}{\sigma^2}\|P_H^{\perp}\theta\|_2^2$.

In summary, a fault is flagged by the GLR approach when the decision statistic $L(Y)$ exceeds the control limit $h(\alpha)$. Otherwise, the supervised process is performing normally.
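A minimal sketch of the GLR decision rule (2.41), with the statistic (2.44) and a $\chi^2$ threshold in the spirit of (2.45), might look as follows. The use of scipy for the $\chi^2$ quantile and the function interface are assumptions; for a vector observation the statistic is compared against a $\chi^2$ quantile whose degrees of freedom equal the dimension of $Y$, which reduces to the $\chi^2_1$ threshold of Eq. (2.45) in the scalar case.

```python
import numpy as np
from scipy.stats import chi2

def glr_test(y, sigma, alpha=0.01):
    """GLR test for H0: zero mean vs H1: unknown mean theta (Gaussian model (2.40)).

    Decision statistic L(Y) = ||Y||^2 / sigma^2, Eq. (2.44); the threshold
    h(alpha) is the (1 - alpha) quantile of the chi-square distribution,
    matching the chi2(1) threshold of Eq. (2.45) when y is scalar.
    """
    y = np.atleast_1d(np.asarray(y, dtype=float))
    stat = np.sum(y ** 2) / sigma ** 2          # L(Y), Eq. (2.44)
    h = chi2.ppf(1.0 - alpha, df=y.size)        # h(alpha), cf. Eq. (2.45)
    return stat > h, stat, h                    # True -> flag a fault (decide H1)

# Example: a single residual of size 3*sigma
print(glr_test(3.0, sigma=1.0))  # statistic 9.0 exceeds the chi2(1) 99% quantile (~6.63)
```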