International journal of Chemistry, Mathematics and Physics (IJCMP) [Vol-3, Issue-3, May-Jun, 2019]
https://dx.doi.org/10.22161/ijcmp.3.3.2 ISSN: 2456-866X
http://www.aipublications.com/ijcmp/ Page | 56
A Method for Detection of Outliers in Time Series
Data
Evan Abdulmajeed Hasan
Erbil, Kurdistan region of Iraq
Evan.hasah@outlook.com
Abstract— An outlier is a data value that is unusually small or large, or that deviates from the pattern of the rest of the data. Outliers are usually either removed from the data set before a forecasting model is fitted, or retained while the forecasting model is adjusted for their presence. There are four types of outliers: additive outlier (AO), innovational outlier (IO), level shift (LS) and temporary change (TC). There is more than one method for detecting outliers; this study considers detection in two cases: first, when the model parameters are known, and second, when they are unknown. There are several reasons for outlier detection and adjustment in time series analysis and forecasting, which are discussed in this study. The study uses the volume of water inflow into the reservoir of Dokan dam in Sulaymaniah city as its time series. The study reaches three conclusions: first, each increase in the critical value increased the residual standard error (with outlier adjustment); second, each increase in the critical value decreased the number of detected outliers; third, when outliers are present, forecasts with outlier adjustment are better than forecasts without it.
Keywords— ARMA model, Innovational Outlier,
Temporary Change, Time Series.
I. INTRODUCTION
The study of outliers is not new. It has a long history dating back to the earliest statistical analysis, and outlier methods have developed hand in hand with other statistical methods. In time series analysis, however, the expansion of outlier methods has not been as rapid or widespread. One reason may be that outliers in time series were first considered explicitly only comparatively late. Since then, the number of papers dealing with the issue has grown steadily (Rousseeuw & Bossche, 2018).
Outliers and structure changes are commonly encountered
in time series data analysis. The presence of those
extraordinary events could easily mislead the conventional
time series analysis procedure resulting in erroneous
conclusions. The impact of those events is often
overlooked, however, for the lack of simple yet useful
methods available to deal with the dynamic behaviour of
those events in the underlying series. The primary goal of
this paper, therefore, is to consider unified methods for
detecting and handling outliers and structure changes in a
univariate time series. The outliers treated are the additive outlier (AO) and the innovational outlier (IO). The structure changes allowed for are level shift (LS) and variance change (VC). Level shift is further classified as permanent level change (LC) and transient level change (TC) (Rousseeuw, et al. 2019).
The literature study forms the first stage of a research
project aiming to establish the applicability of time series
and other techniques in estimating missing values and
outlier detection/replacement in a variety of transport data.
Missing data and outliers can occur for a variety of reasons,
for example the breakdown of automatic counters (Cabrieto,
et al. 2017). Initial enquiries suggest that methods for
patching such data can be crude. Local authorities are to be
approached individually using a short questionnaire enquiry
form to attempt to ascertain their current practices. Having
reviewed current practices, the project aims to transfer
recently developed methods for dealing with outliers in
general time series into a transport context. It is anticipated
that comparisons between possible methods could highlight
an alternative and more analytical approach to current
practices (Staal, et al. 2019).
Several approaches have been considered in the literature
for handling outliers in a time series. Abraham and Box
(1979) used a Bayesian method, Martin and Yohai (1986)
treated outliers as contamination generated from a given
probability distribution, and Fox (1972) proposed two
parametric models for studying outliers. Chang (1982)
adopted Fox’s models and proposed an iterative procedure
to detect multiple outliers. In recent years, this iterative
procedure has been widely used with encouraging results
(Liu, et al. 2018). The methods mentioned above may be
regarded as batch-type procedures for detecting outliers,
because the full data set is used in detecting the existence of
outliers. On the other hand, Harrison and Stevens (1976),
Smith and West (1983), West, Harrison and Migon (1985)
and West (1986) have considered sequential detecting
methods for handling outliers. These sequential methods assume probabilistic models for the outliers. A robust estimation approach was proposed by Denby and Martin (1979) and is summarized in Martin and Yohai (1985). However, the study of Chang and Tiao (1983) shows that Denby and Martin’s robust procedure is not powerful in handling innovational outliers. (Note that the effect of a single IO on estimation is usually negligible provided that the IO is not close to the end of the observational period. The effect of multiple IOs, however, could be serious.) There is no comparison available between the batch-type and the sequential procedures in handling outliers. The probabilistic treatment has its appeal but may not be easy to implement, as it requires prior information about the underlying model to begin with. Since level shifts and variance changes are also considered, the approach of Chang and Tiao (1983) and Tsay (1986a) is adopted and generalized in this study (Arumugam & Saranya, 2018).
Outliers can take several forms in time series. There are
additive and innovational outliers. An additive outlier
affects a single observation, which is smaller or larger in
value than expected. In contrast an innovational outlier
affects several observations. Three other types of outliers
can be defined, namely level shifts, transient changes and
variance changes (Aminikhanghahi & Cook, 2017). A level
shift simply changes the level or mean of the series by a
certain magnitude from a certain observation onwards. A
transient change is a generalization of the additive outlier
and level shift in the sense that it causes an initial impact
like an additive outlier, but the effect is passed on to the
observations that come after it. A variance change simply
changes the variance of the observed data by a certain
magnitude (Wang & Mao, 2018).
Outliers have some effects on the forecasts from ARMA
models, and especially outliers near the beginning of the
forecast period can have serious consequences. Point
forecasts may suffer only a little from additive outliers, but
the prediction intervals can become severely misleading, as
outliers can inflate the estimated variance of the series.
Level shifts and transient changes can have more serious
effects also on point forecasts even when outliers are not
close to the forecast region. Attempts have been made to
construct forecasting intervals in the presence of outliers
(Liu, et al. 2018).
II. LITERATURE REVIEW
Types of outliers in a time series
Temporary Change (TC):
An additive outlier (AO) and a level shift (LS) represent
two distinct patterns in which an event affects a series. For
LS, the level of the underlying process is affected for all
future time, while an AO affects the series for only one time
period. It is useful to consider an event that has some initial
impacts on a series, and then the impact eventually
disappears (Hermosilla, et al. 2015). A temporary (or
transient) change (TC) is an event having such an initial
impact and whose effect decays exponentially according to
some dampening factor, say δ. We can represent the
observed series as:
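In the usual intervention-model notation, the TC model is commonly written as below. This is a sketch under assumptions: the symbol $W_C$ for the initial impact is chosen here by analogy with $W_A$, $W_S$ and $W_I$ used elsewhere in the paper, $P_t^{(T)}$ is the pulse function (1 at $t = T$, 0 otherwise), and $Z_t$ is the outlier-free series.

```latex
\[
  Y_t = Z_t + \frac{W_C}{1 - \delta B}\, P_t^{(T)}, \qquad 0 < \delta < 1 ,
\]
% Expanding 1/(1 - \delta B) = 1 + \delta B + \delta^2 B^2 + \cdots gives
\[
  Y_t - Z_t =
  \begin{cases}
    0, & t < T, \\
    \delta^{\,t-T}\, W_C, & t \ge T .
  \end{cases}
\]
```

Setting $\delta = 0$ reduces this form to an additive outlier and $\delta = 1$ gives a level shift, which is why the TC is described as a generalization of both.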
Innovational Outlier (IO):
An innovational outlier is characterized by an initial impact
with effects lingering over subsequent observations. The
influence of the outliers may increase as time proceeds.
We consider integer-valued autoregressive models of order
one contaminated with innovational outliers. Assuming that
the time points of the outliers are known but their sizes are
unknown, we prove that Conditional Least Squares (CLS)
estimators of the offspring and innovation means are
strongly consistent. In contrast, CLS estimators of the
outliers' sizes are not strongly consistent. We also prove that
the joint CLS estimator of the offspring and innovation
means is asymptotically normal. Conditionally on the
values of the process at time points preceding the outliers'
occurrences, the joint CLS estimator of the sizes of the
outliers is asymptotically normal (Capozzoli, et al. 2015).
An IO is the type of outlier that affects the subsequent
observations starting from its position; in other words, it
arises from a shock to the random (innovation) term of the
process. The model, referred to as a “randomness outlier” in
the literature, is as follows:
Thus, the AO case may be called a gross error model, since
only the level of the T’th observation is affected. On the
other hand, an IO represents an extraordinary shock at time
point $T$ influencing $z_T, z_{T+1}, \ldots$ through the dynamic
system described by $\psi(B) = \theta(B)/\phi(B)$ (Chang, Tiao and
Chen, 1988).
Unlike an additive outlier, an innovational outlier (IO) is an
event whose effect is propagated according to the ARIMA
model of the process. In this manner, an IO affects all
values observed after its occurrence. In practice, an IO often
represents the onset of an external cause. The model for the
observed series can be expressed as
The above model can also be written as
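Using the symbols that appear in the surrounding text ($W_I$, the innovation $a_t$, and $\psi(B) = \theta(B)/\phi(B)$), the two forms of the IO model are commonly written as follows. This is a reconstruction in standard notation, assuming $Z_t = \psi(B)\, a_t$ for the outlier-free series and $P_t^{(T)}$ for the pulse function (1 at $t = T$, 0 otherwise).

```latex
\[
  Y_t = Z_t + \frac{\theta(B)}{\phi(B)}\, W_I\, P_t^{(T)}
\]
% Substituting Z_t = \frac{\theta(B)}{\phi(B)} a_t gives the second form:
\[
  Y_t = \frac{\theta(B)}{\phi(B)} \left( a_t + W_I\, P_t^{(T)} \right)
\]
```

The second form makes the name clear: the outlier enters as an extra amount $W_I$ added to the innovation at time $T$, and the model dynamics then carry it forward.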
As a result, an AO only affects one observation, $Y_T$, while an IO affects all values of $Y_t$ for $t \ge T$ according to the ψ-weights of the model, where $\psi(B) = \theta(B)/\phi(B)$. The terminology IO arises because of the representation given in (2.6), as $a_t$ is also referred to as the innovation. The contaminated series $Y_t$ is identical to the original series $Z_t$ until $t = T$; then $Y_t$ shifts up (if $W_I > 0$) or down (if $W_I < 0$) by $W_I$ units at $t = T$; after $t = T$, this effect fades exponentially at a rate determined by the AR coefficient $\phi$.
For $t \ge T$, $Y_t$ differs from $Z_t$ by $\phi^{t-T} W_I$ units. The effect of the IO fades until eventually the contaminated series $Y_t$ is indistinguishable from the original series $Z_t$.
Level Shift (LS):
A level shift (LS) (sometimes known as a level change, LC)
is an event that affects a series at a given time and whose
effect becomes permanent. A level shift could reflect a
change in the process mechanism, a change in a recording
device, or a change in the definition of the variable itself.
The model for the observed series may be
represented by
The above representation can also be written as
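In standard notation the two LS representations are usually written as below. The symbols are partly assumptions: $W_S$ is the shift size (the subscript matches the hypothesis $H_S$ used later in the paper), $P_t^{(T)}$ is the pulse function, and $S_t^{(T)}$ is a step function introduced here for illustration.

```latex
\[
  Y_t = Z_t + \frac{W_S}{1 - B}\, P_t^{(T)}
\]
% Since P_t^{(T)} / (1 - B) accumulates the pulse into a step:
\[
  Y_t = Z_t + W_S\, S_t^{(T)}, \qquad
  S_t^{(T)} =
  \begin{cases}
    0, & t < T, \\
    1, & t \ge T .
  \end{cases}
\]
```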
Auto Regressive Moving Average
An ARMA (autoregressive moving average) model is used to describe weakly stationary stochastic time series in terms of two polynomials: the first for the autoregression, the second for the moving average (Chen, et al. 2017). The ARMA process is the basic model for analyzing a stationary time series. First, though, stationarity has to be defined formally in terms of the behavior of the autocorrelation function (ACF) through Wold’s decomposition. Several simple cases of the ARMA model are then introduced and analyzed, with the partial autocorrelation function (PACF) also being defined, before the general model is introduced. ARMA model building and estimation may then be developed via a sequence of examples designed to demonstrate some of the intricacies of selecting an appropriate model to explain the evolution of an observed time series (Johansen & Nielsen, 2016).
This model is often referred to as the ARMA(p,q) model,
where:
• p is the order of the autoregressive polynomial,
• q is the order of the moving average polynomial.
The equation is given by:
Where:
• φ = the autoregressive model’s parameters,
• θ = the moving average model’s parameters,
• c = a constant,
• ε = error terms (white noise).
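With the symbols just listed, the ARMA(p,q) equation is usually written in the following form. This is the common textbook statement, given here as a sketch; the series name $X_t$ and the plus sign in front of the $\theta_j$ terms are assumptions, and some texts (including the operator form $\theta(B) = 1 - \theta_1 B - \cdots$) use the opposite sign.

```latex
\[
  X_t = c + \varepsilon_t
        + \sum_{i=1}^{p} \varphi_i\, X_{t-i}
        + \sum_{j=1}^{q} \theta_j\, \varepsilon_{t-j}
\]
% Special case p = 1, q = 0 (the AR(1) model):
\[
  X_t = c + \varphi_1 X_{t-1} + \varepsilon_t
\]
```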
As we have remarked, dependence is very common in time series observations. To model this dependence, we start with univariate ARMA models. To motivate the model, we can follow two lines of thinking. First, for a series $x_t$, we can suppose that the level of the current observation depends on the level of its lagged observations (Li, et al. 2015). For example, if we observe a high GDP realization this quarter, we would expect GDP in the next few quarters to be high as well. This way of thinking can be represented by an AR model. The AR(1) (autoregressive of order one) model can be written as:
We introduced the ARMA model that may be written as:
The model above can be directly extended to include differencing operators to induce stationarity and to encompass seasonal terms (as multiplicative AR or MA operators). To simplify the discussion of outliers, we concentrate on nonseasonal models. Moreover, we assume $C = 0$, so that the model may be rewritten as:
In the above model, $z_t$ represents a series that is not contaminated with outliers. We use $Y_t$ to represent the values observed for $z_t$ in the presence of an outlier. As we will see, our representation of an outlier takes the form of the intervention model. The AR operator (and the differencing operator, if present) is placed in the denominator of the ARIMA model. Therefore the effect of an outlier is relative to $Y_t$, rather than relative to the AR-filtered $Y_t$ (Reiche, et al. 2015).
We now define and illustrate the types of outliers: additive outlier (AO), innovational outlier (IO), level shift (LS), and temporary (or transient) change (TC). To illustrate the effect of each type on the values of a time series, we assume an AR(1) model, and the following simple AR process is employed:
(2-1-1) Additive Outlier (AO)
An additive outlier (AO) is an event that affects a series for
one time period only. One illustration of an AO is a
recording error. For this reason, an additive outlier is
sometimes called a gross error. If we assume that an outlier
occurs at time $t = T$, we can represent the series we observe by the model
$$Y_t = Z_t + W_A P_t^{(T)}$$
where $P_t^{(T)}$ is a pulse function (that is, it takes the value 1 when $t = T$ and 0 otherwise). The value $W_A$ represents the amount of deviation from the “true” value of $Z_T$. Such additive outliers (AOs) affect observations in isolation due to some nonrepetitive event, and may occur as a result of measurement errors or of economic, political and financial events such as oil shocks, wars, financial crashes and changes in policy regimes.
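The four outlier types can be compared by looking at the deterministic effect each one adds to an outlier-free AR(1) series. The sketch below is illustrative only: the function names, the AR coefficient `phi = 0.6`, the TC damping factor `delta = 0.7`, and the use of NumPy are assumptions, not part of the paper.

```python
import numpy as np

def outlier_effect(kind, n, T, w, phi=0.6, delta=0.7):
    """Effect added to a clean AR(1) series by one outlier of size w at index T.

    Hypothetical helper: kind is one of "AO", "IO", "LS", "TC".
    """
    effect = np.zeros(n)
    k = np.arange(n - T)              # 0, 1, 2, ... steps after the event
    if kind == "AO":                  # pulse: a single observation is hit
        effect[T] = w
    elif kind == "IO":                # propagated by psi-weights; psi_k = phi**k for AR(1)
        effect[T:] = w * phi ** k
    elif kind == "LS":                # step: permanent change of level
        effect[T:] = w
    elif kind == "TC":                # decays geometrically at rate delta
        effect[T:] = w * delta ** k
    else:
        raise ValueError(f"unknown outlier type: {kind}")
    return effect

def simulate_ar1(n, phi, rng):
    """Outlier-free AR(1) series z_t = phi * z_{t-1} + a_t with N(0, 1) innovations."""
    a = rng.standard_normal(n)
    z = np.zeros(n)
    z[0] = a[0]
    for t in range(1, n):
        z[t] = phi * z[t - 1] + a[t]
    return z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = simulate_ar1(100, 0.6, rng)
    for kind in ("AO", "IO", "LS", "TC"):
        y = z + outlier_effect(kind, len(z), T=50, w=5.0)   # contaminated series
        print(kind, np.round((y - z)[48:55], 3))
```

Only the AO leaves every observation after $T$ untouched; the IO and TC effects die out, while the LS effect is permanent.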
Outliers detection in time series
1-Likelihood ratio tests:
In practice we do not know whether an AO, LS or IO event has occurred at any time $t$. We use a hypothesis testing procedure to decide whether such events have occurred. Let $H_A$ denote the alternative hypothesis $W_A \ne 0$; let $H_S$ denote the alternative $W_S \ne 0$; and let $H_I$ denote the alternative $W_I \ne 0$. Tests may be performed with likelihood ratio statistics (denoted $L$): in each case we simply divide the estimated $w^{*}$ coefficient by its corresponding standard error (the square root of its variance).
Under the null hypothesis $H_0$, and assuming that both the time $i$ and the parameters of the ARIMA model are known, the statistics $L_A$, $L_S$ and $L_I$ are normally distributed with mean zero and variance one. In practice, we do not know the parameters of the ARIMA model.
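For a concrete case, the statistics can be worked out for an AR(1) model with known parameters. The sketch below is an assumption-laden illustration rather than the paper's own formulas: it takes $e_t = Y_t - \phi Y_{t-1}$ as the filtered series, notes that a unit AO at time $T$ adds 1 to $e_T$ and $-\phi$ to $e_{T+1}$ while a unit IO adds 1 to $e_T$ only, and divides each least squares estimate by its standard error. The function name and return layout are hypothetical.

```python
import numpy as np

def ar1_outlier_statistics(y, phi, sigma):
    """AO and IO estimates and test statistics for an AR(1) with known phi, sigma.

    Returns (w_ao, lam_ao, w_io, lam_io), each of length len(y).
    Index 0 is a placeholder: no residual exists for the first observation,
    so it is set to 0 (an assumption of this sketch).
    """
    y = np.asarray(y, dtype=float)
    e = np.zeros(len(y))
    e[1:] = y[1:] - phi * y[:-1]          # filtered series e_t = (1 - phi B) y_t

    # IO: only e_T is affected, so w_hat = e_T with standard error sigma.
    w_io = e.copy()
    lam_io = w_io / sigma

    # AO: regress e on the pattern (1, -phi) starting at T.
    w_ao = e.copy()                       # last point: no e_{T+1}, same as IO
    w_ao[:-1] = (e[:-1] - phi * e[1:]) / (1.0 + phi**2)
    se_ao = np.full(len(y), float(sigma))
    se_ao[:-1] = sigma / np.sqrt(1.0 + phi**2)
    lam_ao = w_ao / se_ao
    return w_ao, lam_ao, w_io, lam_io

if __name__ == "__main__":
    y = np.zeros(10)
    y[4] = 5.0                            # a pure additive outlier on a flat series
    w_ao, lam_ao, w_io, lam_io = ar1_outlier_statistics(y, phi=0.5, sigma=1.0)
    print("AO estimate at t=4:", w_ao[4], "statistic:", lam_ao[4])
    print("IO estimate at t=4:", w_io[4], "statistic:", lam_io[4])
```

In this toy input the AO statistic at the contaminated point exceeds the IO statistic, so the larger statistic points to the correct type.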
Methods of Outlier Detection
Outlier detection when ARMA parameters are
It is natural to consider the residuals of a fitted model for
use in detecting outliers in a time series, since most
diagnostic checks of a model are based on residuals.
However, outliers in a time series can affect both the model
we may identify for the series as well as the parameter
estimates of the identified model. As a result, it is unclear
how useful the residuals may be for outlier detection in
certain situations. To better understand how a single outlier manifests itself in the residual series, consider the filtered series (Zhang, et al. 2016)
$$e_t = \pi(B) Y_t$$
where $\pi(B)$ is the polynomial operator in the π-weights of the ARIMA model. The weights in $\pi(B)$ may be obtained by equating coefficients of the backshift operator in an expression involving $\pi(B)$ and the polynomial operators of the model; in the case of the nonseasonal stationary model, $\pi(B) = \phi(B)/\theta(B)$. The values of $e_t$ become the residuals of the fitted model if the π-weights are computed from the estimated parameters of the ARIMA model rather than from the known parameters of the “true” ARIMA model.
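Equating coefficients can be done with a short recursion. The sketch below assumes the nonseasonal stationary case with $\phi(B) = 1 - \phi_1 B - \cdots$, $\theta(B) = 1 - \theta_1 B - \cdots$ and $\pi(B)\,\theta(B) = \phi(B)$; the function name and the sign conventions are assumptions made for illustration.

```python
def pi_weights(phi, theta, n_weights):
    """Coefficients of B^0 .. B^n_weights in pi(B) = phi(B) / theta(B).

    phi, theta: lists of AR and MA parameters (phi_1, ...), (theta_1, ...),
    with phi(B) = 1 - phi_1 B - ... and theta(B) = 1 - theta_1 B - ...
    """
    phi_poly = [1.0] + [-p for p in phi]
    theta_poly = [1.0] + [-t for t in theta]
    pi_poly = []
    for j in range(n_weights + 1):
        # Coefficient of B^j on the right-hand side, phi(B).
        c = phi_poly[j] if j < len(phi_poly) else 0.0
        # Subtract the contributions of earlier pi coefficients times theta.
        for i in range(1, min(j, len(theta_poly) - 1) + 1):
            c -= theta_poly[i] * pi_poly[j - i]
        pi_poly.append(c)
    return pi_poly

if __name__ == "__main__":
    # ARMA(1,1) with phi_1 = 0.5 and theta_1 = 0.3
    print(pi_weights([0.5], [0.3], 3))
```

For a pure AR model the recursion simply returns the AR polynomial padded with zeros, which is why the residuals of an AR(1) fit are $Y_t - \phi Y_{t-1}$.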
We may be able to use the analytic representation of $e_t$ to
test for the effect of an outlier. If only one outlier occurs in
a time series, then a least squares estimate of the effect of
the outlier at time $t = T$, $\hat{W}_i$ ($i = 1, 2, 3, 4$), and the statistic
that may be used for testing its significance can be easily
derived. An adjusted series (i.e., one with the outlier effect
removed) can also be obtained. However, some problems
remain since:
1. In the event there is an outlier, we do not know its
type;
2. We do not know whether an outlier occurs, and if
it occurs, the time of its occurrence;
3. There may be more than one outlier present in the
series; and
4. We do not know precisely what the “true”
underlying model is, nor are we sure of the
accuracy of the estimates of a correct model.
Procedures to account for (1) - (3) above have been
developed during the past few years. Most of these outlier
detection procedures are based on the residuals from fitted
models. In this way, we can diagnostically check a fitted
model for the presence of outliers.
An iterative detection procedure
Suppose there is an unknown number of AO, LS and IO events in a time series $Y_t$, occurring at unknown times $t = i_1, i_2, \ldots$. A detection procedure is as follows:
1. Identify and estimate an ARIMA model (or DR model) for $Y_t$, assuming that no AO, LS, or IO events are present.
2. Compute the model residuals ($\hat{e}_t$) and estimate $\sigma_a^2$ as
$$\hat{\sigma}_a^2 = \frac{1}{m} \sum_{t=n_1+1}^{n} \hat{e}_t^2$$
where $m$ is the number of residuals available ($m = n - n_1$ and $n_1 = p + sP + d + sD$).
3. Compute the likelihood ratios. Set $\hat{L}_{0,t}$ equal to the largest of these statistics; that is, $\hat{L}_{0,t} = \max\{\hat{L}_{A,t}, \hat{L}_{S,t}, \hat{L}_{I,t}\}$ for the $m$ time periods $t = 1 + n_1, 2 + n_1, \ldots, n$.
4. Find $\hat{L} = \max_t \{\hat{L}_{0,t}\}$. Compare $\hat{L}$ with a predetermined critical value $c_d$ (discussed later). If $\hat{L} \le c_d$, stop the procedure. If $\hat{L} > c_d$, then a possible AO, LS, or IO is detected. The time ($t = i$), type (AO, LS, or IO), and estimated $w$ coefficient of the identified possible event are those associated with $\hat{L}$.
a- If a possible LS is detected, its size is estimated by $\hat{W}_S$ in $W_S^{*} = k_S\, c(F)\, e_i$. Remove this LS effect from the residual series by replacing each $\hat{e}_t$ with $\hat{e}_t - \hat{W}_S\, \hat{c}(B) X_t$ for $t \ge i$. Re-estimate $\sigma_a^2$ using the new $\hat{e}_t$ series; use this new estimate to recompute $\hat{L}_{S,t}$.
c- If a possible IO is detected, its effect is estimated by $\hat{W}_I$ according to $W_I^{*} = e_i$. Remove this IO effect from the residual series by replacing $\hat{e}_t$ at time $t = i$ with $\hat{e}_t - \hat{W}_I = 0$. Re-estimate $\sigma_a^2$ using the new $\hat{e}_t$ series; use this new estimate to recompute $\hat{L}_{I,t}$.
5. Suppose $T$ possible AO, LS or IO effects are found at times $i_1, i_2, \ldots, i_T$. Treat these times as known and estimate the $w$ coefficients for each effect simultaneously within a DR model. For example, suppose we find $T = 3$ effects, with a possible AO detected at time $t = i_3$. Then we estimate the model
where $X_{1,t} = 1$ at $t = i_1$ and $X_{1,t} = 0$ otherwise; $X_{2,t} = 0$ for $t < i_2$ and $X_{2,t} = 1$ for $t \ge i_2$; $X_{3,t} = 1$ at $t = i_3$ and $X_{3,t} = 0$ otherwise. The model may also call for a constant term. Diagnostic checking may lead us to modify the model.
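The detection loop in steps 2 to 4 can be sketched for the simplest case, an AR(1) model whose coefficient is held fixed. This is a simplified illustration, not the SCA implementation: it skips the joint re-estimation of step 5, it compares absolute values of the statistics, and the function names and the residual "signature" formulation (the pattern a unit outlier leaves in the residuals) are assumptions.

```python
import numpy as np

def signature(kind, T, n, phi):
    """Pattern left in the AR(1) residuals by a unit outlier at residual index T."""
    x = np.zeros(n)
    x[T] = 1.0
    if kind == "AO" and T + 1 < n:
        x[T + 1] = -phi               # AO: pulse filtered by (1 - phi B)
    elif kind == "LS":
        x[T + 1:] = 1.0 - phi         # LS: step filtered by (1 - phi B)
    return x                          # IO: residual at T only

def detect_outliers(y, phi, crit=3.0, max_iter=20):
    """Iteratively flag AO, IO and LS events; returns [(index in y, type, size)]."""
    y = np.asarray(y, dtype=float)
    e = y[1:] - phi * y[:-1]          # residuals for observations 1 .. n-1
    n = len(e)
    found = []
    for _ in range(max_iter):
        sigma = np.sqrt(np.mean(e**2))            # step 2: residual scale
        best = None
        for T in range(n):                        # step 3: all times and types
            for kind in ("AO", "IO", "LS"):
                x = signature(kind, T, n, phi)
                xx = x @ x
                w = (x @ e) / xx                  # least squares effect estimate
                lam = w * np.sqrt(xx) / sigma     # estimate / standard error
                if best is None or abs(lam) > abs(best[0]):
                    best = (lam, kind, T, w, x)
        lam, kind, T, w, x = best
        if abs(lam) <= crit:                      # step 4: stop if nothing exceeds c_d
            break
        found.append((T + 1, kind, float(w)))     # residual T belongs to y[T + 1]
        e = e - w * x                             # remove the effect, then repeat
    return found

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = rng.standard_normal(200)
    z = np.zeros(200)
    for t in range(1, 200):
        z[t] = 0.5 * z[t - 1] + a[t]
    y = z.copy()
    y[100] += 10.0                                # inject an additive outlier
    print(detect_outliers(y, phi=0.5, crit=3.5))
```

Raising `crit` makes the loop stop earlier, which matches the paper's observation that fewer outliers are reported at higher critical values.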
III. METHODS AND FINDINGS
Collection of data
The researcher gathered the data for this application from the Dokan dam in Sulaimaniah city. The data are the volume of water entering the reservoir of Dokan dam (daily rates, cubic meters) for late 2018 and early 2019; the very large volume of daily data has been converted to a time series of monthly averages (cubic meters).
Building ARIMA model
Model identification
A time series plot of volume shows that the series does not have a fixed mean level and is not stable in variance. First, to stabilize the variance, we transform the data using the natural logarithm. We store the transformed data under the name Lvolume, using an SCA paragraph. A time series plot of Lvolume shows that the new series still exhibits a trend and seasonality, but we seem to have stabilized the variability over the length of the series (as seen in Figures 1 and 2).
Fig.1-Plot of water volume series of Dokan dam
Fig.2-Plot of log of water volume (Lvolume) of Dokan dam
We expect that the Lvolume is not stationary. This is
confirmed when we compute and display the sample ACF
of the series, by using ACF paragraph of SCA system.
Fig.3-Estimate of ACF for the Lvolume .
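The sample ACF that SCA displays can be reproduced with a few lines of NumPy. The sketch below uses the usual estimator (lag-$k$ autocovariance divided by the lag-0 autocovariance, both about the overall mean); the function name is an assumption and the input in the example is made up.

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelations r_0 .. r_nlags of a series x."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()                      # deviations from the overall mean
    c0 = np.sum(d * d)                    # lag-0 sum of squares
    n = len(x)
    return np.array([np.sum(d[k:] * d[:n - k]) / c0 for k in range(nlags + 1)])

if __name__ == "__main__":
    trend = np.arange(1.0, 49.0)          # made-up trending series
    print(np.round(sample_acf(trend, 6), 3))
```

A trending series like the one in the example gives autocorrelations that decline slowly from 1, the same slow die-out pattern that signals nonstationarity.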
The ACF has a slow die-out pattern that is indicative of a nonstationary series. Differencing is required. However, because the data are seasonal, we may wonder whether the “proper” differencing operator is $(1 - B)$ or $(1 - B^{12})$. We can examine the sample ACF using both of these differencing operators. The output is edited for presentation purposes, as shown below.
-->ACF LVOLUME. DFORDERS 1 12.
Fig.4-Estimate of ACF for differenced Lvolume (d=1,D=1).
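The log transform followed by regular and seasonal differencing, $(1 - B)(1 - B^{12}) \ln(\text{volume})$, can be sketched as follows. The function names are assumptions, and the input series in the example is synthetic, not the Dokan data.

```python
import numpy as np

def difference(x, lag=1):
    """Apply (1 - B^lag): subtract the value lag steps earlier."""
    x = np.asarray(x, dtype=float)
    return x[lag:] - x[:-lag]

def log_difference(volume, d=1, D=1, season=12):
    """Natural log, then d regular and D seasonal differences."""
    w = np.log(np.asarray(volume, dtype=float))
    for _ in range(d):
        w = difference(w, 1)
    for _ in range(D):
        w = difference(w, season)
    return w

if __name__ == "__main__":
    months = np.arange(48)
    # Synthetic monthly series with growth and a 12-month cycle.
    volume = np.exp(0.01 * months + 0.3 * np.sin(2 * np.pi * months / 12))
    w = log_difference(volume)
    print(len(volume), "->", len(w))
```

Each regular difference shortens the series by one observation and each seasonal difference by twelve, which is why fewer residuals than observations are available later on.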
Model estimation
The study estimates the volume model by using ESTIM
paragraph as:
Table 1-Summary of estimate time series for Lvolume
The parameter estimates are significant based on their t-values.
Diagnostic check of model
A time plot of the residual series does not reveal any gross
abnormalities, although some unusual points appear to be
present. We can compute and display 24 lags of the sample
ACF of the residuals. We see the sample ACF of the
residuals is “clean”. The output is edited for presentation
purposes.
Fig.5-The ACF plot for the residuals of suggested model.
The detection of outliers when the parameters are unknown
To demonstrate outlier detection, the study used the OUTLIER paragraph of the SCA system on the Lvolume time series. The study obtained the following estimates of the model parameters and outliers at different critical values (2.5, 3.0, 3.5, 4.0) for outlier detection, as seen in Table 2.
Table 2-The estimates of outliers and their types with different critical values
The results show that the number of outliers decreases as the critical value increases. Alternatively, we could have estimated the Lvolume model using the OESTIM paragraph. In this way the SCA System simultaneously detects outliers and jointly estimates their effects with the model parameters. The results for a critical value of 2.5 are shown in Table 3.
Table 3-Summary of estimate time series model for volume (cd=2.5)
The OFORECAST paragraph extends the outlier detection
and adjustment capabilities of the SCA System to the
forecasting of a time series in the presence of outliers.
Unlike other forecasting capabilities that simply utilize the
current parameter estimates and the data on hand to
compute forecasts, the OFORECAST paragraph also
performs its own outlier detection and adjustment. As a
result, it provides us with:
Table 4-Forecasts for Volume after adjusting the outliers
(Cd=2.5).
We can convert the forecast values of Lvolume back to the original scale used before taking the natural logarithm, and then compare these values with the forecasts from the fitted ARIMA model that assumes no outliers are present. The results, using a critical value of 4.0, are shown in Table 5 below.
Table 5-Forecasts for the volume data with and without adjusting the outliers
We note that the MSE of the forecasts without outlier adjustment (0.052905) is greater than the MSE of the forecasts with outlier adjustment (0.043100). This means that when analyzing time series data, we should first detect and adjust for outliers.
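The comparison rests on two small computations: exponentiating the log-scale forecasts and computing the mean squared error of each forecast set. The sketch below shows both; the function names are assumptions and the numbers in the example are invented placeholders, not values from the paper's tables.

```python
import numpy as np

def back_transform(log_forecast):
    """Return forecasts on the original scale from forecasts of ln(volume)."""
    return np.exp(np.asarray(log_forecast, dtype=float))

def mse(actual, forecast):
    """Mean squared forecast error."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    return float(np.mean((a - f) ** 2))

if __name__ == "__main__":
    # Hypothetical log-scale values, for illustration only.
    actual_log = [5.10, 5.30, 5.25]
    fc_unadjusted = [5.40, 5.05, 5.50]
    fc_adjusted = [5.20, 5.25, 5.35]
    print("MSE without adjustment:", mse(actual_log, fc_unadjusted))
    print("MSE with adjustment:   ", mse(actual_log, fc_adjusted))
    print("Back-transformed:", np.round(back_transform(fc_adjusted), 1))
```

Whether the MSE is computed on the log scale or on the original scale changes its magnitude, so both forecast sets should be compared on the same scale.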
IV. CONCLUSIONS
The study reaches the following conclusions. First, each increase in the critical value increased the residual standard error (with outlier adjustment). Second, each increase in the critical value decreased the number of detected outliers. Third, when outliers are present, forecasts with outlier adjustment are better than forecasts without it.
Since the procedures are based on simple techniques, they
are widely applicable. For instance, they can be used as a data
screening device in spectral density estimation and in robust
time series analysis. They can also be used in biological
studies where exogenous disturbances are unavoidable. For
example, Greenhouse, Kass and Tsay (1987) analysed body
temperature of an individual involved in a psychiatric study
where the observations clearly depended on the individual
physical activities. A variance change from day to night
seems highly plausible. A third application of the
procedures is that they can be used to identify the time point
of an intervention in the intervention analysis of Box and
Tiao (1975). In the traditional intervention analysis, the
time point of an intervention is assumed to be known.
Finally, two remarks are made on the procedures. First, in
Section 4 the adjusted series was used in the detection
process to demonstrate the usefulness of the suggested
procedure. This, however, does not imply that one can rely
on the adjusted series to make inferences. A more
appropriate strategy would be (a) to search for the causes of
the identified outliers, level changes and variance changes,
(b) to specify a general model in the form of (2) based on
causes of the exogenous disturbances, and (c) to estimate
jointly the impact of disturbances and the time series
parameters. This strategy allows for the use of prior
information of the disturbances. It can also reduce the
possibility of over parameterization that arises from the
abuse of the detection procedure. Readers are referred to
Tsay (1986a) for further discussion. Second, to detect the transient level change, δ = 0.8 was used in Section 4. In fact, other values of δ can also be used. As an example, δ = 0.6 was applied to the air-passenger-miles data of Example 1. The procedure still identified the same time points as significant disturbances, even though some of the classifications between permanent and transient level changes are different. Similarly, to detect the variance changes, h = 30 was used to compute residual variances at both ends of a series. The choice of h is not critical as long as it is reasonable. For instance, the same detection results were obtained in Example 2 when h = 20 was used. In general, an h between 20 and 30 appears to be useful.
The study supports the claim that outliers result in model misspecification, as they affect the autocorrelation structure of a time series. In our case this is illustrated by the fact that initially $\text{ARIMA}(1,1,0)\times(0,0,1)_{12}$ was the best model that could be fitted to our data. Testing the residuals for normality and constant variance showed that both assumptions were violated, although the parameters in the model were significant. Using this model for forecasts would have given misleading figures to a decision maker. This is possibly attributable to the presence of outliers. After correcting the series for outliers, the best model was found to be $\text{ARIMA}(1,1,2)\times(0,0,1)_{12}$, and all of its parameters were significant. Diagnostic checks also showed that the assumptions of normality and constant variance were not violated. This therefore demonstrates that the procedure is useful in detecting and correcting for outliers. It can be applied to all invertible ARIMA models.
Moreover, it is flexible and easy to interpret. The procedure
must be used with other diagnostic tools for time series to
produce even better results. Further study is needed to
investigate the variances and other sampling properties of
the resulting parameter estimates. The message from this
study is that when examining economic time series data any
potential outliers should be taken seriously, no matter what
the ultimate aim or the model used may be. Outliers have
already been shown to be potentially harmful, and there is
also increasing evidence that the dangers are not only
theoretical. Other models that might be useful for modelling time series, such as ARCH and GARCH models, should also be explored. These are non-linear time series models that might be used for data with many fluctuations. Non-linearity tests are normally carried out on the data before such models are applied. The study suggests the following topics for future work: methods for detecting outliers in multivariate time series and their application; the detection of outliers that occur at the end of the series; and the detection of outliers when the series contains missing data.
REFERENCES
[1] Ahmad, S., & Purdy, S. (2016). Real-time anomaly
detection for streaming analytics. arXiv preprint
arXiv:1607.02480.
[2] Aminikhanghahi, S., & Cook, D. J. (2017). A survey of
methods for time series change point
detection. Knowledge and information systems, 51(2),
339-367.
[3] Arumugam, P., & Saranya, R. (2018). Outlier Detection
and Missing Value in Seasonal ARIMA Model Using
Rainfall Data. Materials Today: Proceedings, 5(1),
1791-1799.
[4] Cabrieto, J., Tuerlinckx, F., Kuppens, P., Grassmann,
M., & Ceulemans, E. (2017). Detecting correlation
changes in multivariate time series: A comparison of
four non-parametric change point detection
methods. Behavior
[5] Capozzoli, A., Lauro, F., & Khan, I. (2015). Fault
detection analysis using data mining techniques for a
cluster of smart office buildings. Expert Systems with
Applications, 42(9), 4324-4338.
[6] Chen, W., Zhou, K., Yang, S., & Wu, C. (2017). Data
quality of electricity consumption data in a smart grid
environment. Renewable and Sustainable Energy
Reviews, 75, 98-105.
[7] Filonov, P., Lavrentyev, A., & Vorontsov, A. (2016).
Multivariate industrial time series with cyber-attack
simulation: Fault detection using an LSTM-based
predictive data model. arXiv preprint arXiv:1612.06676.
[8] Frantz, D., Röder, A., Udelhoven, T., & Schmidt, M.
(2015). Enhancing the detectability of clouds and their
shadows in multitemporal dryland Landsat imagery:
Extending Fmask. IEEE Geoscience and Remote
Sensing Letters, 12(6), 1242-1246.
[9] Ganz, F., Puschmann, D., Barnaghi, P., & Carrez, F.
(2015). A practical evaluation of information processing
and abstraction techniques for the internet of
things. IEEE Internet of Things journal, 2(4), 340-354.
[10] Hermosilla, T., Wulder, M. A., White, J. C., Coops, N.
C., & Hobart, G. W. (2015). An integrated Landsat time
series protocol for change detection and generation of
annual gap-free surface reflectance composites. Remote
Sensing of Environment, 158, 220-234.
[11] Johansen, S., & Nielsen, B. (2016). Asymptotic theory
of outlier detection algorithms for linear time series
regression models. Scandinavian Journal of
Statistics, 43(2), 321-348.
[12] Kontaki, M., Gounaris, A., Papadopoulos, A. N.,
Tsichlas, K., & Manolopoulos, Y. (2016). Efficient and
flexible algorithms for monitoring distance-based
outliers over data streams. Information systems, 55,
37-53.
[13] Li, L., Das, S., John Hansman, R., Palacios, R., &
Srivastava, A. N. (2015). Analysis of flight data using
clustering techniques for detecting abnormal
operations. Journal of Aerospace information
systems, 12(9), 587-598.
[14] Liu, M., Shi, J., Cao, K., Zhu, J., & Liu, S. (2018).
Analyzing the training processes of deep generative
models. IEEE transactions on visualization and
computer graphics, 24(1), 77-87.
[15] Liu, S., Wright, A., & Hauskrecht, M. (2018).
Change-point detection method for clinical decision support
system rule monitoring. Artificial intelligence in
medicine, 91, 49-56.
[16] Liu, Z., Verstraete, M. M., & de Jager, G. (2018).
Handling outliers in model inversion studies: a remote
sensing case study using MISR-HR data in South
Africa. South African Geographical Journal, 100(1),
122-139.
[17] Loureiro, D., Amado, C., Martins, A., Vitorino, D.,
Mamade, A., & Coelho, S. T. (2016). Water distribution
systems flow monitoring and anomalous event
detection: A practical approach. Urban Water
Journal, 13(3), 242-252.
[18] Martí, L., Sanchez-Pi, N., Molina, J., & Garcia, A.
(2015). Anomaly detection based on sensor data in
petroleum industry applications. Sensors, 15(2),
2774-2797.
[19] Reiche, J., Verbesselt, J., Hoekman, D., & Herold, M.
(2015). Fusing Landsat and SAR time series to detect
deforestation in the tropics. Remote Sensing of
Environment, 156, 276-293.
[20] Rousseeuw, P. J., & Bossche, W. V. D. (2018).
Detecting deviating data cells. Technometrics, 60(2),
135-145.
[21] Rousseeuw, P., Perrotta, D., Riani, M., & Hubert, M.
(2019). Robust monitoring of time series with
application to fraud detection. Econometrics and
statistics, 9, 108-121.
[22] Sprint, G., Cook, D. J., & Schmitter-Edgecombe, M.
(2016). Unsupervised detection and analysis of changes
in everyday physical activity data. Journal of
biomedical informatics, 63, 54-65.
[23] Sprint, G., Cook, D. J., Fritz, R., & Schmitter-Edgecombe, M.
(2016). Using smart homes to detect
and analyze health events. Computer, 49(11), 29-37.
[24] Staal, O. M., Sælid, S., Fougner, A., & Stavdahl, Ø.
(2019). Kalman smoothing for objective and automatic
preprocessing of glucose data. IEEE journal of
biomedical and health informatics, 23(1), 218-226.
[25] Stumpf, A., Malet, J. P., & Delacourt, C. (2017).
Correlation of satellite image time-series for the
detection and monitoring of slow-moving
landslides. Remote sensing of environment, 189, 40-55.
[26] Wang, B., & Mao, Z. (2018). Detecting Outliers in
Electric Arc Furnace under the Condition of Unlabeled,
Imbalanced, Non-stationary and Noisy
Data. Measurement and Control, 51(3-4), 83-93.
[27] Zhang, Q., Pandey, B., & Seto, K. C. (2016). A robust
method to generate a consistent time series from
DMSP/OLS nighttime light data. IEEE Transactions on
Geoscience and Remote Sensing, 54(10), 5821-5831.


A Method for Detection of Outliers in Time Series Data

  • 1. International journal of Chemistry, Mathematics and Physics (IJCMP) [Vol-3, Issue-3, May-Jun, 2019] https://dx.doi.org/10.22161/ijcmp.3.3.2 ISSN: 2456-866X http://www.aipublications.com/ijcmp/ Page | 56 A Method for Detection of Outliers in Time Series Data Evan Abdulmajeed Hasan Erbil, Kurdistan region of Iraq Evan.hasah@outlook.com Abstract— An outlier is a data value that is an unusually small or large, or that deviates fromthe pattern of the rest of the data. Outliers are usually removed from the data set before fitting a forecasting model, or not removed but the forecasting model adjusted in presence of outliers. There are four types of OUTLIERS are as follows: Additive outlier (AO), Innovational outlier (IO), Level shift (LS) and Temporary change (TC). There is more than one method for the detection of outlier; the study considers the detection of outlier in two cases: first, at the time when the parameters are known. Second, when the parameters are unknown. There are several reasons for outlier detection and adjustment in time series analysis and forecasting which are mentioned in this study. The study has used the volume of water inflow in the reservoir of Dokan dam in Sulaymaniah city as a time series for the purpose of the study. The study came to conclude that throughout the research, the following conclusions: first, every time increasing the critical value, the value of residual standard error (with outlier adjustment) increased. Second, every time increasing the critical value, the number of outlier values decreased. Third, in the case of presence of outliers the forecasts with adjustment of outliers better than the forecasts without adjusting outliers. Keywords— ARMA model, Innovational Outlier, Temporary Change, Time Series. I. INTRODUCTION The study of outliers is not a new phenomenon. It has in fact a long history dating back to the earliest statistical analysis. 
Outlier methods have developed hand in hand with other statistical methods. Unfortunately, in time series analysis this expansion of outlier methods has not been as rapid and widespread. One reason for this must be that methods of time series outliers were first considered explicitly. However, since then the amount of papers dealing with the issue has grown steadily (Rousseeuw & Bossche, 2018). Outliers and structure changes are commonly encountered in time series data analysis. The presence of those extraordinary events could easily mislead the conventional time series analysis procedure resulting in erroneous conclusions. The impact of those events is often overlooked, however, for the lack of simple yet useful methods available to deal with the dynamic behaviour of those events in the underlying series. The primary goal of this paper, therefore, is to consider unified methods for detecting and handling outliers and structure changes in a univariate time series. The outliers treated are the additive outlier (AO) and the innovational outlier. The structure changes allowed for are level shift (LS) and variance change (VC). Level shift is further classified as permanent level change (LC) and transient level change (TC) (Rousseeuw, et al. 2019). The literature study forms the first stage of a research project aiming to establish the applicability of time series and other techniques in estimating missing values and outlier detection/replacement in a variety of transport data. Missing data and outliers can occur for a variety of reasons, for example the breakdown of automatic counters (Cabrieto, et al. 2017). Initial enquiries suggest that methods for patching such data can be crude. Local authorities are to be approached individually using a short questionnaire enquiry form to attempt to ascertain their current practices. 
Having reviewed current practices, the project aims to transfer recently developed methods for dealing with outliers in general time series into a transport context. It is anticipated that comparisons between possible methods could highlight an alternative and more analytical approach to current practices (Staal, et al. 2019). Several approaches have been considered in the literature for handling outliers in a time series. Abraham and Box (1979) used a Bayesian method, Martin and Yohai (1986) treated outliers as contamination generated from a given probability distribution, and Fox (1972) proposed two parametric models for studying outliers. Chang (1982)
  • 2. International journal of Chemistry, Mathematics and Physics (IJCMP) [Vol-3, Issue-3, May-Jun, 2019] https://dx.doi.org/10.22161/ijcmp.3.3.2 ISSN: 2456-866X http://www.aipublications.com/ijcmp/ Page | 57 adopted Fox’s models and proposed an iterative procedure to detect multiple outliers. In recent years, this iterative procedure has been widely used with encouraging results (Liu, et al. 2018). The methods mentioned above may be regarded as batch-type procedures for detecting outliers, because the full data set is used in detecting the existence of outliers. On the other hand, Harrison and Stevens (1976), Smith and West (1983), West, Harrison and Migon (1985) and West (1986) have considered sequential detecting methods for handling outliers. These sequential methods assume probabilistic models forbyDenby and Martin (1979). This approach is summarized in Martin and Yohai (1985). However, the study of Chang and Tiao (1983) shows that Denby and Martin’s robust procedure is not powerful in handling innovational outliers. (Note that the effect of a single I0 on estimation is usually negligible provided that the I0 is not close to the end of the observational period. The effect of multiple IOs, however, could be serious. There is no comparison available between the batch-type and the sequential procedures in handling outliers. The probabilistic treatment has its appeal but may not be easy to implement as it requires prior information of the underlying model to begin with. Since level shifts and variance changes are also considered, the approach of Chang and Tiao (1983) and Tsay (1986a) is adopted and generalized in this study (Arumugam& Saranya, 2018). Outliers can take several forms in time series. There are additive and innovational outliers. An additive outlier affects a single observation, which is smaller or larger in value than expected. In contrast an innovational outlier affects several observations. 
Three other types of outliers can be defined, namely level shifts, transient changes and variance changes (Aminikhanghahi& Cook, 2017). A level shift simply changes the level or mean of the series by a certain magnitude from a certain observation onwards. A transient change is a generalization of the additive outlier and level shift in the sense that it causes an initial impact like an additive outlier, but the effect is passed on to the observations that come after it. A variance change simply changes the variance of the observed data by a certain magnitude (Wang & Mao, 2018). Outliers have some effects on the forecasts from ARMA models, and especially outliers near the beginning of the forecast period can have serious consequences. Point forecasts may suffer only a little from additive outliers, but the prediction intervals can become severely misleading, as outliers can inflate the estimated variance of the series. Level shifts and transient changes can have more serious effects also on point forecasts even when outliers are not close to the forecast region. Attempts have been made to construct forecasting intervals in the presence of outliers (Liu, et al. 2018). II. LITERATURE REVIEW Types of outliers in a time series Temporary Change (TC): An additive outlier (AO) and a level shift (LS) represent two distinct patterns in which an event affects a series. For LS, the level of the underlying process is affected for all future time, while an AO affects the series for only one time period. It is useful to consider an event that has some initial impacts on a series, and then the impact eventually disappears (Hermosilla, et al. 2015). A temporary (or transient change) (TC) is an event having such an initial impact and whose effect decays exponentially according to some dampening factor, say δ. 
We can represent the observed series as: Innovational Outlier (IO): An innovational outlier is characterized by an initial impact with effects lingering over subsequent observations. The influence of the outliers may increase as time proceeds. We consider integer-valued autoregressive models of order one contaminated with innovational outliers. Assuming that the time points of the outliers are known but their sizes are unknown, we prove that Conditional Least Squares (CLS) estimators of the offspring and innovation means are strongly consistent. In contrast, CLS estimators of the outliers' sizes are not strongly consistent.We also prove that the joint CLS estimator of the offspring and innovation means is asymptotically normal. Conditionally on the values of the process at time points preceding the outliers' occurrences, the joint CLS estimator of the sizes of the outliers is asymptotically normal (Capozzoli, et al. 2015). It is the type of outliers that affects the subsequent observations starting from its position, in other words that occurs as a result of natural randomness. The model, defined as “randomness outlier” in the literature, is shown as follows: Thus, the AO case may be called a gross error model, since only the level of the T’th observation is affected. On the
  • 3. International journal of Chemistry, Mathematics and Physics (IJCMP) [Vol-3, Issue-3, May-Jun, 2019] https://dx.doi.org/10.22161/ijcmp.3.3.2 ISSN: 2456-866X http://www.aipublications.com/ijcmp/ Page | 58 other hand, an IO represents an extraordinary shock at time point T influencing, ,... T T1 z z through the dynamic system described by (B) (B)/(B) (Chang, Tiao and Chen, 1988). Unlike an additive outlier, an innovational outlier (IO) is an event whose effect is propagated according to the ARIMA model of the process. In this manner, an IO affects all values observed after its occurrence. In practice, an IO often represents the onset of an external cause. The model for the observed series can be expressed as The above model can also be written as As a result, an AO only affects one observation, T Y, while an IO affects all values of T Y for t ≥ T according to the ψ- weights {where ψ(B)= () () B B } of the model. The terminology IO arises because of the representation given in (2.6) as ta is also referred to as innovation. The contaminated series t Y is identical to the original series t Z until t=T ; then t Y , shift up (if I W >0) or down (if I W < 0 ) by I W units at t=T; after t=T ,this effect fades exponentially at a rate determined by the decay coefficient φ(B). For t ≥ T, t Y is higher than t Z by tT IW units .The effect of the IO fades until eventually the contaminated series t Y is indistinguishable from the original series t Z . Level Shift (LS): A level shift (LS) (sometime known as a level change LC) is an event that affects a series at a given time, and whose effect becomes permanent. A level shift could reflect the change of a process mechanism, the change in a recording device, or a change in the definition of the variable itself. 
The model for the series the study observes may be represented by The above representation can also be written as Auto Regressive Moving Average An ARMA model, or Autoregressive Moving Average model, is used to describe weakly stationary stochastic time series in terms of two polynomials. The first of these polynomials is for autoregression, the second for the moving average (Chen, et al. 2017). The autoregressive- moving average (ARMA) process is the basic model for analyzing a stationary time series. First, though, stationarity has to be defined formally in terms of the behavior of the autocorrelation function (ACF) through World’s decomposition. Several simple cases of the ARMA model are then introduced and analyzed, with the partial autocorrelation function (PACF) also being defined, before the general model is introduced. ARMA modelbuilding and estimation may then be developed, and this is done via a sequence of examples designed to demonstrate some of the intricacies of selecting an appropriate model to explain the evolution of an observed time series (Johansen & Nielsen, 2016). Often this model is referred to as the ARMA(p,q) model; where:  p is the order of the autoregressive polynomial,  q is the order of the moving average polynomial. The equation is given by: Where:  φ = the autoregressive model’s parameters,  θ = the moving average model’s parameters.  c = a constant,  ε = error terms (white noise). As we have remarked, dependence is very common in time series observations. To model this time series dependence, we start with univariate ARMA models. To motivate the model, basically we can track two lines of thinking. First, for a series xt , we can model that the level of its current observations depends on the level of its lagged observations (Li, et al. 2015). For example, if we observe a high GDP realization this quarter, we would expect that the GDP in the next few quarters are good as well. 
This way of thinking can be represented by an AR model. The AR(1) (autoregressive of order one) can be written as: We introduced the ARMA model that may be written as:
  • 4. International journal of Chemistry, Mathematics and Physics (IJCMP) [Vol-3, Issue-3, May-Jun, 2019] https://dx.doi.org/10.22161/ijcmp.3.3.2 ISSN: 2456-866X http://www.aipublications.com/ijcmp/ Page | 59 The model of equation above can be directly extended to include differencing operators to induce stationarity and to encompass seasonal terms (as multiplicative AR or MA operators). To facilitate our understanding of outliers, we will concentrate our discussions to nonseasonal models. Moreover, we will assume C=0 so that we may re-write as: In the above model, t z represents a series that is not contaminated with outliers. We will use t Y to represent the values observed for t z in the presence of an outlier. As we will see, our representation for an outlier will take the form of the intervention model. The AR operator (and the differencing operator if exists) is placed in the denominator of the ARIMA model. Therefore the effect of an outlier is relative to t Y , rather than relative to the AR filtered t Y (Reiche, et al. 2015). We now define and illustrate the types of outliers. These are additive outlier (AO), innovational outlier (IO), level shift (LS), and temporary (or transient) change (TC), and to illustrate the effect of each type of outlier, and how it affects the values of a time series, we assume that we have AR(1), then the following simple AR process is employed: (2-1-1) Additive Outlier (AO) An additive outlier (AO) is an event that affects a series for one time period only. One illustration of an AO is a recording error. For this reason, an additive outlier is sometimes called a gross error. If we assume that an outlier occurs at time t=T, we can represent the series we observe by the model where () T tp is a pulse function (that is, assumes the value 1 when t=T and is 0 otherwise). The value A W represents the amount of deviation from the “true” value of T Z. 
Such additive outlier (AO's) affect observations in isolation due to some nonrepetitive events and may occur as a result of measurement errors of economic, political and financial events such as oil shocks, wars, financial crashes and changes in policy regimes. Outliers detection in time series 1-Likelihood ratio tests: In practice we don’t know if an AO, LS or IO event has occurred at any time t. We use a hypo study testing procedure to decide if such events have occurred. Let HAdenote the alternate hypostudy, A W ≠ 0; Let HS denote the alternate, S W ≠0; and let HI denote the alternate, I W ≠ 0. Tests may be performed with the following likelihood ratio statistical (denoted as L): we are just dividing each estimated * w coefficient by its corresponding standard error [the square root of the variance] given by: Under the null hypostudy H0, and assuming that both time i and the parameters of the ARIMA model, the statistics LA, LS, and LIare normally distribution with mean zero and variance. In practice, we don’t know the parameters of the ARIMA model in Methods of Outlier Detection Outlier detection when ARMA parameters are
It is natural to consider the residuals of a fitted model for detecting outliers in a time series, since most diagnostic checks of a model are based on residuals. However, outliers in a time series can affect both the model we identify for the series and the parameter estimates of the identified model. As a result, it is unclear how useful the residuals may be for outlier detection in certain situations. To better understand how a single outlier manifests itself in the residual series, consider the filtered series (Zhang et al., 2016)
$e_t = \pi(B) Y_t$,
where $\pi(B)$ is the polynomial operator in the π-weights of the ARIMA model. The weights in $\pi(B)$ may be obtained by equating coefficients of the backshift operator in an expression involving $\pi(B)$ and the polynomial operators of the model. In the case of the non-seasonal stationary model, $\pi(B) = \phi(B)/\theta(B)$, where $\phi(B)$ and $\theta(B)$ are the autoregressive and moving-average operators. The values of $e_t$ become the residuals of the fitted model if the π-weights are computed from the estimated parameters of the ARIMA model rather than from the known parameters of the "true" ARIMA model.
We may be able to use the analytic representation of $e_t$ to test for the effect of an outlier. If only one outlier occurs in a time series, then a least squares estimate of the effect of the outlier at time t = T, $\hat{\omega}_i$ (i = 1, 2, 3, 4), and the statistic for testing its significance can be easily derived. An adjusted series (i.e., one with the outlier effect removed) can also be obtained. However, some problems remain:
1. In the event there is an outlier, we do not know its type;
2. We do not know whether an outlier occurs and, if it occurs, the time of its occurrence;
3. There may be more than one outlier present in the series; and
4. We do not know precisely what the "true" underlying model is, nor are we sure of the accuracy of the estimates of a correct model.
Procedures to account for (1)-(3) above have been developed during the past few years. Most of these outlier detection procedures are based on the residuals from fitted models. In this way, we can diagnostically check a fitted model for the presence of outliers.
An iterative detection procedure
Suppose there is an unknown number of AO, LS and IO events in a time series $Y_t$, occurring at unknown times $t = i_1, i_2, \ldots$. A detection procedure is as follows:
1. Identify and estimate an ARIMA model (or DR model) for $Y_t$, assuming that no AO, LS, or IO events are present.
2. Compute the model residuals ($\hat{e}_t$) and estimate $\sigma_a^2$ as $\hat{\sigma}_a^2 = \frac{1}{m}\sum_{t=n_1+1}^{n} \hat{e}_t^2$, where m is the number of residuals available ($m = n - n_1$ and $n_1 = p + sP + d + sD$).
3. Compute the likelihood ratios. Set $\hat{L}_{0,t}$ equal to the largest of these statistics; that is, $\hat{L}_{0,t} = \max\{\hat{L}_{A,t}, \hat{L}_{S,t}, \hat{L}_{I,t}\}$ for the m time periods $t = 1 + n_1, 2 + n_1, \ldots, n$.
4. Find $\hat{L} = \max_t\{\hat{L}_{0,t}\}$. Compare $\hat{L}$ with a predetermined critical value $c_d$ (discussed later). If $\hat{L} \le c_d$, stop the procedure. If $\hat{L} > c_d$, then a possible AO, LS, or IO is detected. The time (t = i), type (AO, LS, or IO), and estimated ω coefficient of the identified possible event are those associated with $\hat{L}$.
a- If a possible LS is detected, its size is estimated by $\hat{\omega}_S$ in $\hat{\omega}_S^* = k_S\, c(F)\, \hat{e}_i$. Remove this LS effect from the residual series by replacing each $\hat{e}_t$ with $\hat{e}_t - \hat{\omega}_S\, \hat{c}(B) X_t$ for $t \ge i$. Re-estimate $\sigma_a^2$ using the new $\hat{e}_t$ series; use this new estimate to recompute $\hat{L}_{S,t}$.
c- If a possible IO is detected, its effect is estimated by $\hat{\omega}_I$ according to $\hat{\omega}_I^* = \hat{e}_i$. Remove this IO effect from the residual series by replacing $\hat{e}_t$ at time t = i with $\hat{e}_t - \hat{\omega}_I = 0$. Re-estimate $\sigma_a^2$ using the new $\hat{e}_t$ series; use this new estimate to recompute $\hat{L}_{I,t}$.
5. Suppose T possible AO, LS or IO effects are found at times $i_1, i_2, \ldots, i_T$. Treat these times as known and estimate the ω coefficients for each effect simultaneously within a DR model. For example, suppose we find T = 3 effects, with a possible AO detected at time $t = i_3$. Then we estimate the model
where $X_{1,t} = 1$ at $t = i_1$ and $X_{1,t} = 0$ otherwise; $X_{2,t} = 0$ for $t < i_2$ and $X_{2,t} = 1$ for $t \ge i_2$; $X_{3,t} = 1$ at $t = i_3$ and $X_{3,t} = 0$ otherwise. The model may also call for a constant term. Diagnostic checking may lead us to modify the model.
III. METHODS AND FINDINGS
Collection of data
The researcher gathered the data for this study from the Dokan dam in Sulaimaniah city. The data are the volume of water entering the reservoir of the Dokan dam (daily rates, cubic meters) in late 2018 and early 2019. This very large volume of data was converted to a time series of monthly averages (cubic meters).
Building ARIMA model
Model identification
A time series plot of volume shows that the series does not have a fixed mean level and is not stable in variance. First, to stabilize the variance, we transform the data using the natural logarithm. We store the transformed data under the name Lvolume, using an SCA paragraph. A time series plot of Lvolume shows that the new series still exhibits trend and seasonality, but we seem to have stabilized the variability over the length of the series (as seen in figure 1).
Fig.1-Plot of water volume series of Dokan dam
Fig.2-Plot of log of water volume (Lvolume) of Dokan dam
We expect that Lvolume is not stationary. This is confirmed when we compute and display the sample ACF of the series, using the ACF paragraph of the SCA System.
Fig.3-Estimate of ACF for the Lvolume.
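As a minimal sketch of the identification step described above, the following Python code log-transforms a series and computes its sample ACF. The series is synthetic (a hypothetical trend-plus-seasonal series, not the Dokan data), and the helper name sample_acf is an assumption introduced for illustration; the study itself used the ACF paragraph of the SCA System.

```python
import math


def sample_acf(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]


# Hypothetical monthly series: exponential trend with a 12-month cycle,
# so that the variability grows with the level (not the Dokan data).
volume = [
    100.0 * math.exp(0.02 * t + 0.5 * math.sin(2 * math.pi * t / 12))
    for t in range(72)
]

# The natural logarithm turns the multiplicative pattern into an additive one.
lvolume = [math.log(v) for v in volume]

# A slowly decaying ACF is the usual sign that differencing is needed.
acf = sample_acf(lvolume, 24)
print([round(r, 2) for r in acf[:12]])
```

In this sketch the first autocorrelations stay large and positive, which is the slow die-out pattern associated with a nonstationary series.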
The ACF has a slow die-out pattern that is indicative of a nonstationary series. Differencing is required. However, because the data are seasonal, we may wonder whether the "proper" differencing operator is $(1-B)$ or $(1-B^{12})$. We can examine the sample ACF using both of these differencing operators. The output is edited for presentation purposes as shown below.
-->ACF LVOLUME. DFORDERS 1 12.
Fig.4-Estimate of ACF for differenced Lvolume (d=1, D=1).
Model estimation
The study estimates the volume model using the ESTIM paragraph:
Table 1-Summary of estimate time series for Lvolume
The parameter estimates are significant based on their t-values.
Diagnostic check of model
A time plot of the residual series does not reveal any gross abnormalities, although some unusual points appear to be present. We can compute and display 24 lags of the sample ACF of the residuals. We see that the sample ACF of the residuals is "clean". The output is edited for presentation purposes.
Fig.5-The ACF plot for the residuals of suggested model.
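Residuals such as these are the input to the iterative detection procedure of Section II. The following Python code is a minimal sketch of a single pass of that procedure under strong simplifying assumptions that are not taken from the study: an AR(1) model with known coefficient phi = 0.5, so that $\pi(B) = 1 - \phi B$; simulated data with bounded uniform innovations; and only AO and IO effects (the LS case is omitted). It also compares absolute values of the statistics. The function names are hypothetical and do not correspond to the SCA System.

```python
import math
import random


def ar1_residuals(y, phi):
    """Filtered series e_t = y_t - phi * y_{t-1}, i.e. pi(B) = 1 - phi*B."""
    return [y[t] - phi * y[t - 1] for t in range(1, len(y))]


def largest_outlier_statistic(e, phi):
    """Return (|statistic|, residual index, type) for the largest AO/IO statistic."""
    m = len(e)
    sigma = math.sqrt(sum(v * v for v in e) / m)  # residual standard error
    best = (0.0, -1, "")
    for t in range(m):
        # IO: the effect appears in e_t only, so the estimate is e_t itself.
        lam_io = e[t] / sigma
        # AO: least squares fit of (e_t, e_{t+1}) on the pattern (1, -phi).
        x = [1.0, -phi] if t + 1 < m else [1.0]
        sxx = sum(v * v for v in x)
        w_ao = sum(x[j] * e[t + j] for j in range(len(x))) / sxx
        lam_ao = w_ao * math.sqrt(sxx) / sigma
        for lam, kind in ((lam_ao, "AO"), (lam_io, "IO")):
            if abs(lam) > best[0]:
                best = (abs(lam), t, kind)
    return best


# Simulate a hypothetical AR(1) series and contaminate one observation.
rng = random.Random(0)
phi, n = 0.5, 200
y = [0.0]
for _ in range(n - 1):
    y.append(phi * y[-1] + rng.uniform(-1.0, 1.0))
y[100] += 10.0  # additive outlier: only this observation is shifted

e = ar1_residuals(y, phi)
stat, idx, kind = largest_outlier_statistic(e, phi)
time_index = idx + 1      # residual index t corresponds to observation t + 1
critical_value = 3.5      # one of the critical values considered in the study
detected = stat > critical_value
print(time_index, kind, round(stat, 2), detected)
```

In a full implementation the detected effect would be removed from the residuals, the residual variance re-estimated, and the search repeated until no statistic exceeds the critical value.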
The detection of outliers when the parameters are unknown
To demonstrate outlier detection, the study used the OUTLIER paragraph of the SCA System for the Lvolume time series. The study obtained the following estimates of the model parameters and outliers at different critical values (2.5, 3.0, 3.5, 4.0) for outlier detection, as seen in table (2).
Table 2-The estimates of outliers and their types with different critical values
The results show that the number of detected outliers decreases as the critical value increases. Alternatively, we could have estimated the Lvolume model using the OESTIM paragraph. In this way the SCA System simultaneously detects outliers and estimates their effects jointly with the model parameters. The results for a critical value of 2.5 are seen in table (3).
Table 3-Summary of estimate time series model for volume (cd=2.5)
The OFORECAST paragraph extends the outlier detection and adjustment capabilities of the SCA System to the forecasting of a time series in the presence of outliers. Unlike other forecasting capabilities that simply use the current parameter estimates and the data on hand to compute forecasts, the OFORECAST paragraph also performs its own outlier detection and adjustment. As a result, it provides us with:
Table 4-Forecasts for Volume after adjusting the outliers (Cd=2.5).
We can convert the forecast values of Lvolume back to the original scale used before taking the natural logarithm. We then compare these values with the forecasts from the fitted ARIMA model that assumes no outliers are present. The results, using critical value (4.0), are shown in table (5) below.
Table 5-Forecasts for the volume data with and without adjusting the outliers
We note that the MSE (0.052905) of the forecasts without outlier adjustment is greater than the MSE (0.043100) of the forecasts with outlier adjustment. This means that, when analyzing time series data, we should first detect and adjust for outliers.
IV. CONCLUSIONS
The study reached the following conclusions. First, each time the critical value was increased, the residual standard error (with outlier adjustment) increased. Second, each time the critical value was increased, the number of detected outliers decreased. Third, in the presence of outliers, the forecasts with outlier adjustment are better than the forecasts without it.
Since the procedures are based on simple techniques, they are widely applicable. For instance, they can be used as a data screening device in spectral density estimation and in robust time series analysis. They can also be used in biological studies where exogenous disturbances are unavoidable. For example, Greenhouse, Kass and Tsay (1987) analysed body
temperature of an individual involved in a psychiatric study where the observations clearly depended on the individual's physical activities. A variance change from day to night seems highly plausible. A third application of the procedures is that they can be used to identify the time point of an intervention in the intervention analysis of Box and Tiao (1975). In traditional intervention analysis, the time point of an intervention is assumed to be known.
Finally, two remarks are made on the procedures. First, in Section 4 the adjusted series was used in the detection process to demonstrate the usefulness of the suggested procedure. This, however, does not imply that one can rely on the adjusted series to make inferences. A more appropriate strategy would be (a) to search for the causes of the identified outliers, level changes and variance changes, (b) to specify a general model in the form of (2) based on the causes of the exogenous disturbances, and (c) to estimate jointly the impact of the disturbances and the time series parameters. This strategy allows for the use of prior information about the disturbances. It can also reduce the possibility of overparameterization that arises from abuse of the detection procedure. Readers are referred to Tsay (1986a) for further discussion.
Second, to detect the transient level change, δ = 0.8 was used in Section 4. In fact, other values of δ can also be used. As an example, δ = 0.6 was applied to the air-passenger-miles data of Example 1. The procedure still identified the same time points as significant disturbances, even though some of the classifications between permanent and transient level changes are different.
Similarly, to detect the variance changes, h = 30 was used to compute residual variances at both ends of a series. The choice of h is not critical as long as it is reasonable. For instance, the same detection results were obtained in Example 2 when h = 20 was used. In general, an h between 20 and 30 appears to be useful.
The study supports the claim that outliers do result in model misspecification, as they affect the autocorrelation structure of a time series. In our case this is illustrated by the fact that initially we had ARIMA(1,1,0)×(0,0,1)_12 as the best model that could be fitted to our data. Testing the residuals for normality and constant variance showed that both assumptions were violated, although the parameters in the model were significant. Using this model for forecasts would have given misleading figures to a decision maker. This is possibly attributable to the presence of outliers. After correcting the series for outliers, the best model was found to be ARIMA(1,1,2)×(0,0,1)_12, and all the parameters in the model were significant. Diagnostic checks also showed that the assumptions of normality and constant variance were not violated. This demonstrates that the procedure is useful in detecting and correcting for outliers. It can be applied to all invertible ARIMA models. Moreover, it is flexible and easy to interpret. The procedure should be used together with other diagnostic tools for time series to produce even better results. Further study is needed to investigate the variances and other sampling properties of the resulting parameter estimates.
The message from this study is that, when examining economic time series data, any potential outliers should be taken seriously, no matter what the ultimate aim or the model used may be. Outliers have already been shown to be potentially harmful, and there is also increasing evidence that the dangers are not only theoretical.
Other models that might be useful for modelling time series, such as GARCH and ARCH models, should also be explored. These are non-linear time series models that may be used for data with large fluctuations. Non-linearity tests are normally carried out on the data before such models are applied.
The study suggests the following for future work: studying methods for detecting outliers in multivariate time series and their application; studying the detection of outliers that occur at the end of the series; and studying the detection of outliers when the series contains missing data.
REFERENCES
[1] Ahmad, S., & Purdy, S. (2016). Real-time anomaly detection for streaming analytics. arXiv preprint arXiv:1607.02480.
[2] Aminikhanghahi, S., & Cook, D. J. (2017). A survey of methods for time series change point detection. Knowledge and Information Systems, 51(2), 339-367.
[3] Arumugam, P., & Saranya, R. (2018). Outlier Detection and Missing Value in Seasonal ARIMA Model Using Rainfall Data. Materials Today: Proceedings, 5(1), 1791-1799.
[4] Cabrieto, J., Tuerlinckx, F., Kuppens, P., Grassmann, M., & Ceulemans, E. (2017). Detecting correlation changes in multivariate time series: A comparison of four non-parametric change point detection methods. Behavior
[5] Capozzoli, A., Lauro, F., & Khan, I. (2015). Fault detection analysis using data mining techniques for a cluster of smart office buildings. Expert Systems with Applications, 42(9), 4324-4338.
[6] Chen, W., Zhou, K., Yang, S., & Wu, C. (2017). Data quality of electricity consumption data in a smart grid environment. Renewable and Sustainable Energy Reviews, 75, 98-105.
[7] Filonov, P., Lavrentyev, A., & Vorontsov, A. (2016). Multivariate industrial time series with cyber-attack simulation: Fault detection using an LSTM-based predictive data model. arXiv preprint arXiv:1612.06676.
[8] Frantz, D., Röder, A., Udelhoven, T., & Schmidt, M. (2015). Enhancing the detectability of clouds and their shadows in multitemporal dryland Landsat imagery: Extending Fmask. IEEE Geoscience and Remote Sensing Letters, 12(6), 1242-1246.
[9] Ganz, F., Puschmann, D., Barnaghi, P., & Carrez, F. (2015). A practical evaluation of information processing and abstraction techniques for the internet of things. IEEE Internet of Things Journal, 2(4), 340-354.
[10] Hermosilla, T., Wulder, M. A., White, J. C., Coops, N. C., & Hobart, G. W. (2015). An integrated Landsat time series protocol for change detection and generation of annual gap-free surface reflectance composites. Remote Sensing of Environment, 158, 220-234.
[11] Johansen, S., & Nielsen, B. (2016). Asymptotic theory of outlier detection algorithms for linear time series regression models. Scandinavian Journal of Statistics, 43(2), 321-348.
[12] Kontaki, M., Gounaris, A., Papadopoulos, A. N., Tsichlas, K., & Manolopoulos, Y. (2016). Efficient and flexible algorithms for monitoring distance-based outliers over data streams. Information Systems, 55, 37-53.
[13] Li, L., Das, S., John Hansman, R., Palacios, R., & Srivastava, A. N. (2015). Analysis of flight data using clustering techniques for detecting abnormal operations. Journal of Aerospace Information Systems, 12(9), 587-598.
[14] Liu, M., Shi, J., Cao, K., Zhu, J., & Liu, S. (2018). Analyzing the training processes of deep generative models. IEEE Transactions on Visualization and Computer Graphics, 24(1), 77-87.
[15] Liu, S., Wright, A., & Hauskrecht, M. (2018). Change-point detection method for clinical decision support system rule monitoring. Artificial Intelligence in Medicine, 91, 49-56.
[16] Liu, Z., Verstraete, M. M., & de Jager, G. (2018). Handling outliers in model inversion studies: a remote sensing case study using MISR-HR data in South Africa. South African Geographical Journal, 100(1), 122-139.
[17] Loureiro, D., Amado, C., Martins, A., Vitorino, D., Mamade, A., & Coelho, S. T. (2016). Water distribution systems flow monitoring and anomalous event detection: A practical approach. Urban Water Journal, 13(3), 242-252.
[18] Martí, L., Sanchez-Pi, N., Molina, J., & Garcia, A. (2015). Anomaly detection based on sensor data in petroleum industry applications. Sensors, 15(2), 2774-2797.
[19] Reiche, J., Verbesselt, J., Hoekman, D., & Herold, M. (2015). Fusing Landsat and SAR time series to detect deforestation in the tropics. Remote Sensing of Environment, 156, 276-293.
[20] Rousseeuw, P. J., & Bossche, W. V. D. (2018). Detecting deviating data cells. Technometrics, 60(2), 135-145.
[21] Rousseeuw, P., Perrotta, D., Riani, M., & Hubert, M. (2019). Robust monitoring of time series with application to fraud detection. Econometrics and Statistics, 9, 108-121.
[22] Sprint, G., Cook, D. J., & Schmitter-Edgecombe, M. (2016). Unsupervised detection and analysis of changes in everyday physical activity data. Journal of Biomedical Informatics, 63, 54-65.
[23] Sprint, G., Cook, D. J., Fritz, R., & Schmitter-Edgecombe, M. (2016). Using smart homes to detect and analyze health events. Computer, 49(11), 29-37.
[24] Staal, O. M., Sælid, S., Fougner, A., & Stavdahl, Ø. (2019). Kalman smoothing for objective and automatic preprocessing of glucose data. IEEE Journal of Biomedical and Health Informatics, 23(1), 218-226.
[25] Stumpf, A., Malet, J. P., & Delacourt, C. (2017). Correlation of satellite image time-series for the detection and monitoring of slow-moving landslides. Remote Sensing of Environment, 189, 40-55.
[26] Wang, B., & Mao, Z. (2018). Detecting Outliers in Electric Arc Furnace under the Condition of Unlabeled, Imbalanced, Non-stationary and Noisy Data. Measurement and Control, 51(3-4), 83-93.
[27] Zhang, Q., Pandey, B., & Seto, K. C. (2016). A robust method to generate a consistent time series from DMSP/OLS nighttime light data. IEEE Transactions on Geoscience and Remote Sensing, 54(10), 5821-5831.