Signal & Image Processing : An International Journal (SIPIJ) Vol.8, No.4, August 2017
DOI: 10.5121/sipij.2017.8402
OPTIMAL GLOBAL THRESHOLD ESTIMATION
USING STATISTICAL CHANGE-POINT
DETECTION
Rohit Kamal Chatterjee¹ and Avijit Kar²
¹Department of Computer Science & Engineering, Birla Institute of Technology, Mesra, Ranchi, India.
²Department of Computer Science & Engineering, Jadavpur University, Kolkata, India.
ABSTRACT
The aim of this paper is to reformulate the global image thresholding problem as a well-founded
statistical problem known as change-point detection (CPD). Our proposed CPD thresholding algorithm does
not assume any prior statistical distribution of background and object grey levels. Further, this method is
less influenced by outliers because its criterion function is derived from the robust
Kullback-Leibler (KL) divergence measure. Experimental results show the efficacy of the proposed method
compared to other popular methods for global image thresholding. In this paper we also propose
a performance criterion for comparing thresholding algorithms. This performance criterion does not
depend on any ground truth image. We have used it to compare the results of the
proposed thresholding algorithm with the most cited global thresholding algorithms in the literature.
KEYWORDS
Global image thresholding, Change-point detection, Kullback-Leibler divergence, robust statistical
measure, thresholding performance criteria.
1. INTRODUCTION
A grey-level digital image is a two-dimensional signal I: ℤ×ℤ → L, where L={li, i=1,2,…,M}
is the set of M grey-levels. The problem of automatic thresholding is to estimate an
optimal threshold t0 which segments the image into two meaningful sets, viz. background
B={bb(x,y)=1| I(x,y)<t0 } and foreground F={bf(x,y)=1| I(x,y)≥ t0}, or the opposite. The function
I(x,y) can take any random value li∈L, so the sampling distribution of grey levels becomes an
important deciding factor for t0. In many image processing applications, automating the process
of optimal thresholding is extremely important for low-level segmentation or even final
segmentation of object and background.
In general, automatic thresholding algorithms are divided into two groups, viz. global and local
methods. Global methods estimate a single threshold for the entire image; local methods find an
adaptive threshold for each pixel depending on the characteristics of its neighborhood. Global
methods are used if the image is considered as a mixture of two or more statistical distributions.
In this paper, we address global thresholding methods guided by the image histogram. In most
cases, global thresholding methods try to estimate the threshold (t0) iteratively by optimizing a
criterion function [1]. Other methods estimate the optimal t0 from the histogram
shape [2, 3], image attributes such as topology [5], or clustering techniques [4, 8, 20].
Comprehensive surveys discussing various aspects of thresholding methods can be found in the
references [1, 6, 7].
Many of these classical and recent schemes perform remarkably well for images that match their
underlying assumptions but fail to yield the desired results otherwise. Some of the explicit or implicit
reasons for their failure could be: (i) the assumption of some standard distribution (e.g. Gaussian)
[19], whereas in reality the foreground and background classes can have arbitrary asymmetric
distributions; (ii) the use of non-robust measures for computing criterion functions, which are
influenced by outliers. Further, the effectiveness of these algorithms decreases greatly when the
areas under the two classes are highly unbalanced. Some methods depend on a user-specified
constant (e.g. Renyi or Tsallis entropy based methods) [17, 18], and their performance is greatly
compromised when this constant is not chosen appropriately.
This paper proposes an algorithm for addressing these drawbacks using a statistical technique
known as change-point detection (CPD). For the last few decades, models of change-point
detection have been successfully applied by researchers in statistics and control theory for detecting
abrupt changes in the statistical behavior of an observed signal or time series [9]. The general
principle of change-point detection considers an observed sequence of independent random
variables {Yk}1…n with a probability density function (pdf) pθ(y) depending on a parameter θ. If
any change occurs in the sequence then it is assumed that parameter θ takes a value θ0 before any
change and at some unknown time t0 alters to θ1 (≠θ0). The main problem of statistical change-
point detection is to decide the change in parameter and also the time of change. The theory of
CPD is used in this paper to decide the global threshold in an image depending on the change in
the histogram.
Further, in section 4 of this paper, we propose a new performance index for the evaluation of
thresholding algorithms. It depends on the structural difference between the shapes of background
and foreground. The advantage of this performance index is that it does not depend on any ground
truth image. We use this performance index to compare different thresholding algorithms
including ours.
The rest of the paper is organized as follows: Section 2 provides a short introduction to the problem
of statistical change-point detection, Section 3 formulates global thresholding as a
change-point detection problem, Section 4 describes our proposed thresholding performance
criterion, Section 5 presents the experimental results and compares them with several often
cited global thresholding algorithms, and finally Section 6 summarizes the main ideas of this paper.
2. THE CHANGE-POINT DETECTION (CPD) PROBLEM
The change-point detection (CPD) problem can be classified into two broad categories: real-time
or online, and retrospective or offline change-point detection. The first targets applications where
an instantaneous response is desired, such as robot control; retrospective or
offline change-point detection, on the other hand, is used when longer reaction periods are allowed,
e.g. in image processing problems [10]. The latter technique is likely to give more accurate detection since the
entire sample distribution is accessible. Since the image and the corresponding histogram are
available to us, we concentrate on offline change-point detection in this paper. We also assume
that there is only one change point throughout the given observations {yk}1…n. When required, this
assumption can easily be relaxed and extended to multiple change point detection that can be
applied in multi-level threshold detection problems.
2.1. Problem Statement
Taking an offline point of view, let the observations y1, y2, …, yn have corresponding
probability distribution functions F1, F2, …, Fn belonging to a common parametric family F(φ),
where φ ∈ ℝ^p, p>0. The change-point problem is then to test the null hypothesis (H0) about the
population parameter φj, j = 1,2, …, n:
$$H_0:\ \varphi_j = \theta_0 \quad \text{for } 1 \le j \le n$$
versus an alternative hypothesis
$$H_1:\ \varphi_j = \begin{cases} \theta_0, & \text{for } 1 \le j \le k \\ \theta_1, & \text{for } k < j \le n \end{cases} \tag{1}$$
where θ0 ≠ θ1 and k is an unknown time of change.
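These hypotheses together disclose the characteristics of change-point inference: determining whether
any change point exists in the process and estimating the time of change t0 = k. The likelihood
ratio corresponding to the hypotheses H0 and H1 is given by
$$\Lambda_1^n(k) = \frac{\prod_{j=1}^{k} p_{\theta_0}(y_j)\,\prod_{j=k+1}^{n} p_{\theta_1}(y_j)}{\prod_{j=1}^{n} p_{\theta}(y_j)} \tag{2}$$
where $p_{\theta_0}$ and $p_{\theta_1}$ are the pdfs before and after the change occurs and $p_{\theta}$ is the overall probability
density. When the only unknown parameter is t0, its maximum likelihood estimate (MLE) is given
by the following statistic
$$\hat{t}_0 = \arg\max_{1 \le k \le n} \Lambda_1^n(k) \tag{3}$$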
2.2. Offline Estimation of the Change Time
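When the problem is to estimate the change time (t0) in the sequence of observations {yj}1...n,
we assume the existence of a change point under the same presumptions as in the last section.
Considering equations (2) and (3) and the fact that $\prod_{j=1}^{n} p_{\theta}(y_j)$ is a constant for given data,
the corresponding MLE is
$$\hat{t}_0 = \arg\max_{1 \le k \le n} \left[ \sum_{j=1}^{k} \ln p_{\theta_0}(y_j) + \sum_{j=k+1}^{n} \ln p_{\theta_1}(y_j) \right] \tag{4}$$
where $\hat{t}_0$ is the maximum log-likelihood estimate of t0. Rewriting equation (4) as
$$\hat{t}_0 = \arg\max_{1 \le k \le n} \left[ \sum_{j=k+1}^{n} \ln \frac{p_{\theta_1}(y_j)}{p_{\theta_0}(y_j)} + \sum_{j=1}^{n} \ln p_{\theta_0}(y_j) \right] \tag{5}$$
As $\sum_{j=1}^{n} \ln p_{\theta_0}(y_j)$ remains constant for a given observation, estimation of $\hat{t}_0$ is simplified as
$$\hat{t}_0 = \arg\max_{1 \le k \le n} \sum_{j=k+1}^{n} \ln \frac{p_{\theta_1}(y_j)}{p_{\theta_0}(y_j)} \tag{6}$$
Therefore, the MLE of the change time t0 is the value of k which maximizes the sum of log-likelihood
ratios given by equation (6).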
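The estimate of the change time can be illustrated with a short sketch. It assumes the usual offline form of the statistic, in which the estimate is the index k that maximizes the sum of log-likelihood ratios ln(p_θ1(y_j)/p_θ0(y_j)) over the observations after k, and it assumes that both densities are known. The Gaussian densities and the names `gaussian_logpdf` and `estimate_change_time` are illustrative choices, not part of the paper.

```python
import math


def gaussian_logpdf(y, mean, sigma):
    """Log-density of a normal distribution N(mean, sigma^2) at y."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (y - mean) ** 2 / (2.0 * sigma ** 2)


def estimate_change_time(ys, logpdf0, logpdf1):
    """Return k in 1..n maximising sum_{j=k+1}^{n} [logpdf1(y_j) - logpdf0(y_j)].

    k is the 1-based index of the last observation governed by the
    pre-change density; k == n means that no observation follows the change.
    """
    n = len(ys)
    best_k, best_score = n, 0.0  # the tail sum is empty for k = n
    tail = 0.0
    # Walk backwards so that the tail sum is updated in O(1) per step.
    for k in range(n - 1, 0, -1):
        y = ys[k]  # 0-based ys[k] is observation number k + 1
        tail += logpdf1(y) - logpdf0(y)
        if tail > best_score:
            best_k, best_score = k, tail
    return best_k


# Illustrative data: four samples near 0 followed by four samples near 5.
samples = [0.1, -0.2, 0.05, 0.0, 5.1, 4.9, 5.2, 5.0]
k_hat = estimate_change_time(
    samples,
    lambda y: gaussian_logpdf(y, 0.0, 1.0),  # assumed pre-change density
    lambda y: gaussian_logpdf(y, 5.0, 1.0),  # assumed post-change density
)
print(k_hat)
```

The backward scan evaluates all n candidate change times in a single pass, which is only possible offline, when the whole sequence is available.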
3. CHANGE-POINT DETECTION FORMULATION OF GLOBAL
THRESHOLDING
3.1. Assumptions
Let (χ, βχ, Pθ)θ∈Θ be the statistical space of discrete grey-levels associated with a random variable
Y: ℤ×ℤ → ℤ, where βχ is the σ-field of Borel subsets A ⊂ χ and {Pθ}θ∈Θ is a family of probability
distributions defined on the measurable space (χ, βχ) with parameter space Θ, an open subset of
ℝ^q, q>0. We consider a finite population Π of all gray-level images with N elements that can be
classified into M categories or classes L={l1, ..., lM}, i.e. each sample point in the sample image
can take a random gray-level value from the set L.
3.2. Change-Point Detection Formulation
Since we are mainly interested in discrete gray-level data, we consider the multinomial
distribution model. Let ℘={Ei}, i=1,...,M be a partition of χ. The formula Prθ(Ei) = pi(θ), i = 1, ..., M,
defines the probability of the li-th gray-level in the discrete statistical model. Further, we
assume {y1, ..., yN} to be a random sample from the population described by the random variable
Y, representing the gray-level of a pixel. Let $N_i = \sum_{j=1}^{N} I_{E_i}(y_j)$, where $I_E$ is the indicator function.
Then we can approximate pi(θ)≈Ni/N, i=1,…, M. Estimating θ by the maximum likelihood method
consists of maximizing the joint probability distribution for fixed n1, ..., nM,
$$\Pr\nolimits_\theta(N_1 = n_1, \dots, N_M = n_M) = \frac{N!}{n_1! \cdots n_M!}\, p_1(\theta)^{n_1} \cdots p_M(\theta)^{n_M} \tag{7}$$
or equivalently maximizing the log-likelihood function
$$\ln \Pr\nolimits_\theta(N_1 = n_1, \dots, N_M = n_M) = \ln \frac{N!}{n_1! \cdots n_M!} + \sum_{i=1}^{M} n_i \ln p_i(\theta) \tag{8}$$
Therefore, referring to equation (4), the problem of estimating the threshold by MLE can be stated as
where the unknown parameter θ = θ0 before the change and θ = θ1 after the change. Now, equation (9)
can be expanded as
The first term within the bracket on the right side of equation (10) is a constant and the last term
is independent of j, i.e. neither can influence the MLE. So, eliminating these terms from equation
(10) and simplifying, we get
Multiplying and dividing the right side of equation (11) by N, we get
Assuming pi(θ)≈ni/N, equation (12) can be written as
The expression in (13) under the summation denotes Kullback-Leibler (KL) divergence between
the density and , where and denotes the pdfs above and below the
threshold location j; therefore equation (13) can be written as
Since total sum is independent of j, i.e. a constant for a given observation, a
sample image, therefore equation (14) can be rewritten as
Hence, equation (15) provides the maximum likelihood estimation of the threshold t0. Equation
(15) can be restated as the following proposition:
Proposition 1: In a mixture of distributions, the maximum likelihood estimate of change-point is
found by minimizing the Kullback-Leibler divergence of the probability mass across successive
thresholds.
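In spite of this striking property, KL divergence is not a ‘metric’ since it is not symmetric. An
alternative symmetric formula obtained by “averaging” the two KL divergences is given as [11]
$$D_{sym}(p \,\|\, q) = \tfrac{1}{2}\left[ D_{KL}(p \,\|\, q) + D_{KL}(q \,\|\, p) \right]$$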
An attractive property of KL divergence is its robustness i.e. KL divergence is little influenced
even when one component of mixture distribution is considerably skewed. A proof of robustness
can be found for generalized divergence measures in [11, 12].
This method can be easily extended to find multiple thresholds for several mixture distributions
by identifying multiple change-points simultaneously.
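The asymmetry of the KL divergence and the effect of averaging its two directions can be checked numerically. The sketch below assumes two discrete distributions given as equal-length lists of probabilities. The function names and the small floor `eps`, which guards against division by zero when a bin of the second distribution is empty, are illustrative assumptions and not part of the paper.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(p || q) in nats.

    Bins with p_i == 0 contribute nothing; q_i is floored at eps
    (an assumption of this sketch) so that the logarithm stays finite.
    """
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0.0:
            total += p_i * math.log(p_i / max(q_i, eps))
    return total


def symmetric_kl(p, q, eps=1e-12):
    """Average of the two directed KL divergences between p and q."""
    return 0.5 * (kl_divergence(p, q, eps) + kl_divergence(q, p, eps))


p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, q), kl_divergence(q, p))  # the two directions differ
print(symmetric_kl(p, q), symmetric_kl(q, p))    # the averaged form does not
```

The averaged form is symmetric in its arguments, but it still does not satisfy the triangle inequality in general, so it is a symmetric divergence rather than a metric.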
3.3. Implementation
Let us consider an image I: ℤ×ℤ → L, whose pixels assume M gray-levels in the set L={l1,
l2,…, lM }. The empirical distribution of the image can be represented by a normalized histogram
p(li)=ni/N, where ni is the number of pixels in the i-th
gray-level and N is the total number of pixels in the
image.
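As a minimal sketch of the normalized histogram p(li)=ni/N, assume the image is given as a list of rows of integer gray-levels in the range 0 to M-1. The function name and the `levels` parameter are illustrative and do not come from the paper.

```python
def normalized_histogram(image, levels=256):
    """Return p with p[i] = n_i / N for gray-levels i = 0 .. levels - 1."""
    counts = [0] * levels
    total = 0
    for row in image:
        for gray in row:
            counts[gray] += 1
            total += 1
    if total == 0:
        raise ValueError("image contains no pixels")
    return [count / total for count in counts]


# Illustrative 2 x 2 image with four possible gray-levels.
tiny_image = [[0, 1],
              [1, 3]]
print(normalized_histogram(tiny_image, levels=4))
```

The result sums to one, so it can be read as an empirical probability mass function over the gray-levels.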
Now, suppose we are grouping the pixels into two classes B and F (background and object) by
thresholding at the level k. Histogram of gray-levels can be found for the classes B and F; let us
denote them as pB(li) and pF(li). The following statistics are calculated for the level k.
and finally,
The minimum value of CPD(k) for all values of k in the range [1,…, M] gives an optimal
estimate of threshold t0.
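The search over k can be sketched as a scan that splits the normalized histogram into the class histograms pB and pF at each level and keeps the level with the smallest criterion value. The per-level statistics and the exact form of CPD(k) are not recoverable from this text, so the criterion is passed in as a function. The stand-in used here is the weighted within-class variance (an Otsu-style criterion), chosen only to make the sketch runnable; it is not the paper's CPD(k). All function names are hypothetical, and gray-levels are indexed from 0 rather than from 1.

```python
import math


def class_histograms(p, k):
    """Split a normalized histogram at level k: B holds bins < k, F holds bins >= k.

    Returns the class weights and the renormalized class histograms.
    """
    w_b = sum(p[:k])
    w_f = sum(p[k:])
    p_b = [x / w_b for x in p[:k]] if w_b > 0 else []
    p_f = [x / w_f for x in p[k:]] if w_f > 0 else []
    return w_b, p_b, w_f, p_f


def within_class_variance(k, w_b, p_b, w_f, p_f):
    """Stand-in criterion (Otsu-style); NOT the CPD(k) statistic of the paper."""
    def variance(dist, offset):
        mean = sum((offset + i) * x for i, x in enumerate(dist))
        return sum((offset + i - mean) ** 2 * x for i, x in enumerate(dist))
    return w_b * variance(p_b, 0) + w_f * variance(p_f, k)


def scan_threshold(p, criterion):
    """Return the level k that minimizes criterion, skipping empty classes."""
    best_k, best_value = None, math.inf
    for k in range(1, len(p)):
        w_b, p_b, w_f, p_f = class_histograms(p, k)
        if w_b == 0 or w_f == 0:
            continue
        value = criterion(k, w_b, p_b, w_f, p_f)
        if value < best_value:
            best_k, best_value = k, value
    return best_k


# Illustrative bimodal histogram with an empty valley in the middle.
hist = [0.4, 0.1, 0.0, 0.0, 0.1, 0.4]
print(scan_threshold(hist, within_class_variance))
```

Passing the criterion as a function keeps the scan independent of the particular statistic, so a divergence-based criterion can be substituted without changing the search loop.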
4. THRESHOLDING PERFORMANCE CRITERIA
The objective of the global thresholding algorithm is to divide the image into two binary images
generally called background and foreground (object). Most of the histogram-based thresholding
algorithms try to devise a criterion function which produces a threshold to separate the shapes and
patterns of the foreground and background as much as possible. A good thresholding algorithm
can be judged by how well it sets apart the object and the background binary images, i.e. how
much dissimilarity exists between the foreground and the background. Since the background and
foreground images are binary images, the dissimilarity between them can be measured by any binary
distance measure. Based on this observation, we propose a threshold evaluation criterion, which
tries to find the dissimilarity between the patterns and shapes in foreground and background.
A number of binary similarity and distance measures have been proposed in different areas; a
comprehensive survey of them can be found in Choi et al. [13]. In order to understand the
distance measure used in our work, it is helpful to refer to the following contingency table (Table
1):
Table 1. Binary contingency table

                    Background = 1    Background = 0
Foreground = 1            a                 b
Foreground = 0            c                 d
The cell entries in Table 1 are the number of pixel locations for which the two binary images
agree or differ. For example, cell entry ‘a’ is the total count of pixel locations where both binary
images take the value one. Hence, b + c denotes the total count where foreground and background
pixels differ (Hamming distance) and a + d is the total count where they agree.
In order to extract the shapes and patterns present in the foreground (F) and background (B)
images, we use the binary morphological gradient. The binary morphological gradient is the
difference between the dilated and eroded images. Any other edge or texture detection
algorithm for binary images can be also used to extract the objects present in foreground and
background.
In this paper, we use a simple binary distance measure known as Normalised Manhattan distance
(DNM) given by
where Fg and Bg denote the binary morphological gradients of the foreground (F) and background (B),
respectively. The range of this distance measure is the interval [0, 1]. It is expected that a well-segmented
image will have DNM close to 1, while in the worst case DNM = 0. The advantage of this
criterion is that it does not require any ground truth image.
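The two building blocks of this criterion can be sketched as follows. The formula for DNM is not recoverable from this text, so the sketch assumes the Hamming count b + c divided by the total number of pixel locations a + b + c + d, which is consistent with Table 1 and with the stated range [0, 1]. The morphological gradient assumes a 3×3 square structuring element and ignores neighbours that fall outside the image; both choices, and all function names, are illustrative.

```python
def morphological_gradient(img):
    """Binary morphological gradient: dilation minus erosion (3 x 3 square).

    img is a list of rows of 0/1 values; neighbours that fall outside
    the image are ignored (an assumption of this sketch).
    """
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                img[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
            ]
            out_row.append(max(window) - min(window))
        out.append(out_row)
    return out


def normalized_manhattan(x, y):
    """Assumed form (b + c) / (a + b + c + d) for two equal-size binary images."""
    differ = 0
    total = 0
    for row_x, row_y in zip(x, y):
        for value_x, value_y in zip(row_x, row_y):
            differ += int(value_x != value_y)
            total += 1
    return differ / total


# Illustrative inputs: a single foreground pixel and two small binary images.
dot = [[0, 0, 0],
       [0, 1, 0],
       [0, 0, 0]]
print(morphological_gradient(dot))
print(normalized_manhattan([[1, 0], [0, 1]], [[1, 1], [0, 0]]))
```

The distance is 0 when the two gradient images coincide at every pixel and 1 when they differ at every pixel.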
5. EXPERIMENTAL RESULTS WITH DISCUSSION
To validate the applicability of the proposed Change-Point Detection (CPD) thresholding algorithm,
we provide experimental results and compare the results with existing algorithms. The first row of
Figure 1 shows test images that are labeled from left to right as Dice, Rice, Object, Denise, Train,
and Lena respectively.
TABLE 2: Threshold evaluation criterion (DNM) for the test images (A) Dice, (B) Rice, (C) Object, (D)
Denise, (E) Train, and (F) Lena
The images have been deliberately selected so that the areas of foreground
and background are hugely disproportionate. This gives us an opportunity to test the robustness of
the CPD algorithm. To compare the results, we selected five popular thresholding algorithms,
namely, Kittler-Illingworth [14], Otsu [15], Kurita [16], Sahoo [17] and Entropy [18].
In Figure 1, the third row onwards shows the outputs of the different thresholding algorithms. The last
row shows the output of the proposed CPD thresholding algorithm. Due to substantial skewness
in the distributions of gray-levels in object or background, most of the algorithms confused
foreground with background. The results in the last row, however, clearly show that CPD works
significantly better in all cases.
Table 2 evaluates the thresholds of the five selected algorithms and the CPD algorithm using our
proposed performance criterion. It is clear that CPD performs reasonably well. For example,
for the Denise and Train images, Kittler-Illingworth thresholding fails to
distinguish the object from the background due to its assumption of a Gaussian distribution for both
foreground and background [19]. Otsu’s and Kurita’s methods yield almost the same output due to
their common assumptions. The corresponding histograms are reproduced in Figure 2, marked
with the threshold locations of all six algorithms for reference. The threshold locations
show that the CPD algorithm is very little influenced by the asymmetry of the object or background
distributions.
6. CONCLUSIONS
In this paper we propose a novel global image thresholding algorithm based on Statistical
Change-Point detection (CPD), which is derived based on a symmetric version of Kullback-
Leibler divergence measure. The experimental results clearly show this algorithm is largely
unaffected by disproportionate dispersal of object and background scene and also very little
influenced by the skewness of the distributions of object and background compared to other well-known
algorithms. We also propose a thresholding performance criterion using the dissimilarity
between foreground and background binary images. The advantage of this performance criterion is
that it does not require any ground truth image.
Figure 1. Results of thresholding algorithms on the test images: Row 1: original images; Row 2: shapes of
histograms; Row 3: Kittler; Row 4: Otsu; Row 5: Kurita; Row 6: Sahoo; Row 7: Entropy; Row 8: CPD
threshold.
Figure 2: Histogram of (a) Denise and (b) Train image with threshold locations
REFERENCES
[1] M. Sezgin and B. Sankur (2004) “Survey over image thresholding techniques and quantitative
performance evaluation”, J. of Electronic Imaging, Vol. 13, No. 1, pp.146–165.
[2] A. Rosenfeld and P. De la Torre, (1983) “Histogram concavity analysis as an aid in threshold
selection”, IEEE Trans. Syst. Man Cybernetics. SMC-13, pp. 231–235.
[3] M. I. Sezan, (1985) “A peak detection algorithm and its application to histogram-based image data
reduction”, Graph. Models Image Process. Vol. 29, pp.47–59.
[4] D. M. Tsai, (1995) “A fast thresholding selection procedure for multimodal and unimodal
histograms”, Pattern Recogn. Lett. Vol. 16, pp. 653–666.
[5] A. Pikaz and A. Averbuch, (1996) “Digital image thresholding based on topological stable state”,
Pattern Recogn. Vol. 29, pp.829–843.
[6] N. R. Pal and S. K. Pal, (1993) “A review on image segmentation techniques,” Pattern Recog., vol.
26, no. 9, pp. 1277–1294.
[7] P. K. Sahoo, S. Soltani, A. K. C. Wong, and Y. C. Chen, (1988) “A survey of thresholding
techniques,” Computer Vision, Graphics, and Image Process., vol. 41, no. 2, pp.233–260.
[8] C. V. Jawahar, P. K. Biswas, and A. K. Ray, (1997) ‘‘Investigations on fuzzy thresholding based on
fuzzy clustering,’’ Pattern Recogn., vol. 30, no. 10, pp. 1605–1613.
[9] H. V. Poor and O. Hadjiliadis, (2009) Quickest Detection, Cambridge University Press, New York.
[10] J. Chen, A. K. Gupta, (2012) Parametric statistical change point analysis, with applications to
genetics, medicine and finance, 2nd Ed., Birkhäuser, Boston.
[11] Y. Wang, (2011) “Generalized Information Theory: A Review and Outlook”, J. of Inform. Tech., Vol.
10, No. 3, pp. 461-469.
[12] L. Pardo, (2006) Statistical Inference Based on Divergence Measures, Chapman & Hall/CRC, pp.
233.
[13] S. Choi, S. Cha, C. C. Tappert, (2010) “A Survey of Binary Similarity and Distance Measures,”
Systemics, Cybernetics and Informatics Vol. 8, No. 1, pp. 43-48.
[14] J. Kittler and J. Illingworth (1986) “Minimum error thresholding”, Pattern Recognition, Vol. 19, pp.
41–47.
[15] N. Otsu, (1979) “A threshold selection method from gray level histograms”, IEEE Trans. Syst. Man
Cybern. SMC-9, pp. 62–66.
[16] T. Kurita, N. Otsu, and N. Abdelmalek, (1992) “Maximum likelihood thresholding based on
population mixture models”. Pattern Recognition, Vol. 25, pp. 1231-1240.
[17] P. Sahoo, C. Wilkins, and J. Yeager, (1997) “Threshold selection using Renyi’s entropy”, Pattern
Recogn. Vol. 30, pp. 71–84.
[18] P. K. Sahoo and G. Arora, (2006) “Image thresholding using two-dimensional Tsallis–Havrda–Charvat entropy”,
Pattern Recognition Letters, Vol. 27, pp. 520–528.
[19] J. Xue and D. M. Titterington (2011) “t-tests, F-tests and Otsu’s Methods for Image Thresholding,”
IEEE Trans. Image Processing, vol. 20, no. 8, pp. 2392-2396.
[20] H. Tizhoosh, (2005) “Image thresholding using type II fuzzy sets”, Pattern Recognition, vol. 38, pp.
2363 – 2372.

More Related Content

PDF
A046010107
PDF
Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...
PDF
Applications and Analysis of Bio-Inspired Eagle Strategy for Engineering Opti...
PDF
AROPUB-IJPGE-14-30
PDF
One dimensional vector based pattern
PDF
Steganalysis of LSB Embedded Images Using Gray Level Co-Occurrence Matrix
PDF
Medical diagnosis classification
PDF
50120130405020
A046010107
Comparison of Cost Estimation Methods using Hybrid Artificial Intelligence on...
Applications and Analysis of Bio-Inspired Eagle Strategy for Engineering Opti...
AROPUB-IJPGE-14-30
One dimensional vector based pattern
Steganalysis of LSB Embedded Images Using Gray Level Co-Occurrence Matrix
Medical diagnosis classification
50120130405020

What's hot (19)

PDF
A PSO-Based Subtractive Data Clustering Algorithm
PDF
Long-Term Robust Tracking Whith on Failure Recovery
PDF
Control chart pattern recognition using k mica clustering and neural networks
PDF
An Experiment with Sparse Field and Localized Region Based Active Contour Int...
PDF
Comparing between maximum
PDF
Fault diagnosis using genetic algorithms and principal curves
PPTX
A hybrid sine cosine optimization algorithm for solving global optimization p...
PDF
Enhanced Spectral Reflectance Reconstruction Using Pseudo-Inverse Estimation ...
PPTX
Optimization problems and algorithms
PDF
E41033336
PDF
A Combined Approach for Feature Subset Selection and Size Reduction for High ...
PDF
Sca a sine cosine algorithm for solving optimization problems
PDF
MULTI-OBJECTIVE ENERGY EFFICIENT OPTIMIZATION ALGORITHM FOR COVERAGE CONTROL ...
PDF
Sensitivity analysis in a lidar camera calibration
PDF
Feature selection using modified particle swarm optimisation for face recogni...
PDF
The Application Of Bayes Ying-Yang Harmony Based Gmms In On-Line Signature Ve...
PDF
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
PDF
Estimating project development effort using clustered regression approach
PDF
Training and Inference for Deep Gaussian Processes
A PSO-Based Subtractive Data Clustering Algorithm
Long-Term Robust Tracking Whith on Failure Recovery
Control chart pattern recognition using k mica clustering and neural networks
An Experiment with Sparse Field and Localized Region Based Active Contour Int...
Comparing between maximum
Fault diagnosis using genetic algorithms and principal curves
A hybrid sine cosine optimization algorithm for solving global optimization p...
Enhanced Spectral Reflectance Reconstruction Using Pseudo-Inverse Estimation ...
Optimization problems and algorithms
E41033336
A Combined Approach for Feature Subset Selection and Size Reduction for High ...
Sca a sine cosine algorithm for solving optimization problems
MULTI-OBJECTIVE ENERGY EFFICIENT OPTIMIZATION ALGORITHM FOR COVERAGE CONTROL ...
Sensitivity analysis in a lidar camera calibration
Feature selection using modified particle swarm optimisation for face recogni...
The Application Of Bayes Ying-Yang Harmony Based Gmms In On-Line Signature Ve...
ESTIMATING PROJECT DEVELOPMENT EFFORT USING CLUSTERED REGRESSION APPROACH
Estimating project development effort using clustered regression approach
Training and Inference for Deep Gaussian Processes
Ad

Similar to OPTIMAL GLOBAL THRESHOLD ESTIMATION USING STATISTICAL CHANGE-POINT DETECTION (20)

PDF
Incorporating Index of Fuzziness and Adaptive Thresholding for Image Segmenta...
PDF
AUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGES
PDF
AUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGES
PDF
Lecture 9&10 computer vision segmentation-no_task
PDF
FUZZY SET THEORETIC APPROACH TO IMAGE THRESHOLDING
PPTX
PDF
On Tracking Behavior of Streaming Data: An Unsupervised Approach
PDF
MRI IMAGES THRESHOLDING FOR ALZHEIMER DETECTION
PDF
MRI IMAGES THRESHOLDING FOR ALZHEIMER DETECTION
PPTX
Digital Image Processing
PDF
AUTOMATIC THRESHOLDING TECHNIQUES FOR OPTICAL IMAGES
PDF
IMAGE SEGMENTATION BY USING THRESHOLDING TECHNIQUES FOR MEDICAL IMAGES
PDF
Comparative between global threshold and adaptative threshold concepts in ima...
PDF
Synthetic aperture radar images
PDF
Fuzzy Entropy Based Optimal Thresholding Technique for Image Enhancement
PPTX
Change Point | Statistics
PDF
GRAY SCALE IMAGE SEGMENTATION USING OTSU THRESHOLDING OPTIMAL APPROACH
PDF
Bay's marko chain
PDF
SEQUENTIAL CLUSTERING-BASED EVENT DETECTION FOR NONINTRUSIVE LOAD MONITORING
PDF
SEQUENTIAL CLUSTERING-BASED EVENT DETECTION FOR NONINTRUSIVE LOAD MONITORING
Incorporating Index of Fuzziness and Adaptive Thresholding for Image Segmenta...
AUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGES
AUTOMATIC THRESHOLDING TECHNIQUES FOR SAR IMAGES
Lecture 9&10 computer vision segmentation-no_task
FUZZY SET THEORETIC APPROACH TO IMAGE THRESHOLDING
On Tracking Behavior of Streaming Data: An Unsupervised Approach
MRI IMAGES THRESHOLDING FOR ALZHEIMER DETECTION
MRI IMAGES THRESHOLDING FOR ALZHEIMER DETECTION
Digital Image Processing
AUTOMATIC THRESHOLDING TECHNIQUES FOR OPTICAL IMAGES
IMAGE SEGMENTATION BY USING THRESHOLDING TECHNIQUES FOR MEDICAL IMAGES
Comparative between global threshold and adaptative threshold concepts in ima...
Synthetic aperture radar images
Fuzzy Entropy Based Optimal Thresholding Technique for Image Enhancement
Change Point | Statistics
GRAY SCALE IMAGE SEGMENTATION USING OTSU THRESHOLDING OPTIMAL APPROACH
Bay's marko chain
SEQUENTIAL CLUSTERING-BASED EVENT DETECTION FOR NONINTRUSIVE LOAD MONITORING
SEQUENTIAL CLUSTERING-BASED EVENT DETECTION FOR NONINTRUSIVE LOAD MONITORING
Ad

Recently uploaded (20)

PDF
Microbial disease of the cardiovascular and lymphatic systems
PPTX
Cell Types and Its function , kingdom of life
PPTX
Lesson notes of climatology university.
PDF
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
PDF
102 student loan defaulters named and shamed – Is someone you know on the list?
PDF
O7-L3 Supply Chain Operations - ICLT Program
PPTX
GDM (1) (1).pptx small presentation for students
PDF
FourierSeries-QuestionsWithAnswers(Part-A).pdf
PPTX
Final Presentation General Medicine 03-08-2024.pptx
PDF
Sports Quiz easy sports quiz sports quiz
PDF
Complications of Minimal Access Surgery at WLH
PPTX
Institutional Correction lecture only . . .
PPTX
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
PDF
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
PDF
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
PPTX
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...
PDF
Insiders guide to clinical Medicine.pdf
PDF
Classroom Observation Tools for Teachers
PPTX
school management -TNTEU- B.Ed., Semester II Unit 1.pptx
Microbial disease of the cardiovascular and lymphatic systems
Cell Types and Its function , kingdom of life
Lesson notes of climatology university.
ANTIBIOTICS.pptx.pdf………………… xxxxxxxxxxxxx
102 student loan defaulters named and shamed – Is someone you know on the list?
O7-L3 Supply Chain Operations - ICLT Program
GDM (1) (1).pptx small presentation for students
FourierSeries-QuestionsWithAnswers(Part-A).pdf
Final Presentation General Medicine 03-08-2024.pptx
Sports Quiz easy sports quiz sports quiz
Complications of Minimal Access Surgery at WLH
Institutional Correction lecture only . . .
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...
Insiders guide to clinical Medicine.pdf
Classroom Observation Tools for Teachers
school management -TNTEU- B.Ed., Semester II Unit 1.pptx

OPTIMAL GLOBAL THRESHOLD ESTIMATION USING STATISTICAL CHANGE-POINT DETECTION

  • 1. Signal & Image Processing : An International Journal (SIPIJ) Vol.8, No.4, August 2017 DOI : 10.5121/sipij.2017.8402 15 OPTIMAL GLOBAL THRESHOLD ESTIMATION USING STATISTICAL CHANGE-POINT DETECTION Rohit Kamal Chatterjee1 and Avijit Kar2 1 Department of Computer Science & Engineering, Birla Institute of Technology, Mesra, Ranchi, India. 2 Department of Computer Science & Engineering, Jadavpur University, Kolkata, India. ABSTRACT Aim of this paper is reformulation of global image thresholding problem as a well-founded statistical method known as change-point detection (CPD) problem. Our proposed CPD thresholding algorithm does not assume any prior statistical distribution of background and object grey levels. Further, this method is less influenced by an outlier due to our judicious derivation of a robust criterion function depending on Kullback-Leibler (KL) divergence measure. Experimental result shows efficacy of proposed method compared to other popular methods available for global image thresholding. In this paper we also propose a performance criterion for comparison of thresholding algorithms. This performance criteria does not depend on any ground truth image. We have used this performance criterion to compare the results of proposed thresholding algorithm with most cited global thresholding algorithms in the literature. KEYWORDS Global image thresholding, Change-point detection, Kullback-Leibler divergence, robust statistical measure, thresholding performance criteria. 1. INTRODUCTION A grey-level digital image is a two dimensional signal LI →→→→ΖΖΖΖ××××ΖΖΖΖ: , where L={ li∈ and i=1,2,…,M} is the set of M grey-levels. The problem of automatic thresholding is to estimate an optimal threshold t0 which segments the image into two meaningful sets, viz. background B={bb(x,y)=1| I(x,y)<t0 } and foreground F={bf(x,y)=1| I(x,y)≥ t0} or the opposite. 
The function I(x,y) can take any random value li∈L; so, sampling distribution of grey levels becomes an important deciding factor for t0. In many image processing applications, automating the process of optimal thresholding is extremely important for low-level segmentation or even final segmentation of object and background. In general, automatic thresholding algorithms are divided into two groups, viz. global and local methods. Global methods estimate a single threshold for the entire image; local methods find an adaptive threshold for each pixel depending on the characteristics of its neighborhood. Global methods are used if the image is considered as a mixture of two or more statistical distributions. In this paper, we address the global thresholding methods guided by the image histogram. Most of the cases global thresholding methods try to estimate the threshold (t0) iteratively by optimizing a criterion function [1]. Some other methods attempt to estimate optimal t0 depending on histogram shape [2, 3], image attribute such as topology [5] or some clustering techniques [4, 8, 20]. Comprehensive surveys discussing various aspects of thresholding methods can be found in the references [1, 6, 7].
Many of these classical and recent schemes perform remarkably well for images that match their underlying assumptions but fail to yield the desired results otherwise. Some of the explicit or implicit reasons for their failure could be: (i) the assumption of some standard distribution (e.g. Gaussian) [19], whereas in reality foreground and background classes can have arbitrary asymmetric distributions; (ii) the use of non-robust measures for computing criterion functions, which are influenced by outliers. Further, the effectiveness of these algorithms decreases greatly when the areas under the two classes are highly unbalanced. Some methods depend on a user-specified constant (e.g. Renyi or Tsallis entropy based methods) [17, 18], and their performance is greatly compromised without an appropriate value for it. This paper proposes an algorithm for addressing these drawbacks using a statistical technique known as change-point detection (CPD). Over the last few decades, change-point detection models have been applied successfully by researchers in statistics and control theory for detecting abrupt changes in the statistical behavior of an observed signal or time series [9]. The general principle of change-point detection considers an observed sequence of independent random variables {Yk}1…n with a probability density function (pdf) pθ(y) depending on a parameter θ. If a change occurs in the sequence, it is assumed that the parameter θ takes the value θ0 before the change and at some unknown time t0 alters to θ1 (≠ θ0). The main problem of statistical change-point detection is to decide whether the parameter has changed and to estimate the time of change. The theory of CPD is used in this paper to decide the global threshold in an image depending on the change in the histogram. Further, in section 4 of this paper, we propose a new performance index for the evaluation of thresholding algorithms.
It depends on the structural difference between the shapes of background and foreground. The advantage of this performance index is that it does not depend on any ground truth image. We use this performance index to compare different thresholding algorithms including ours. The rest of the paper is organized as follows: section 2 provides a short introduction to the problem of statistical change-point detection, section 3 formulates and derives global thresholding as a change-point detection problem, section 4 describes our proposal for a thresholding performance criterion, section 5 presents the experimental results and compares them with various often-cited global thresholding algorithms, and finally section 6 summarizes the main ideas of this paper.

2. THE CHANGE-POINT DETECTION (CPD) PROBLEM
The change-point detection (CPD) problem can be classified into two broad categories: real-time or online, and retrospective or offline change-point detection. The first targets applications where an instantaneous response is desired, such as robot control; retrospective or offline change-point detection, on the other hand, is used when longer reaction periods are allowed, e.g. in image processing problems [10]. The latter technique is likely to give more accurate detection since the entire sample distribution is accessible. Since the image and the corresponding histogram are available to us, we concentrate on offline change-point detection in this paper. We also assume that there is only one change point throughout the given observations {yk}1…n. When required, this assumption can easily be relaxed and extended to multiple change-point detection, which can be applied to multi-level threshold detection problems.

2.1. Problem Statement
From the offline point of view, suppose the observations y1, y2, …, yn have probability distribution functions F1, F2, …, Fn belonging to a common parametric family F(φ), where φ ∈ ℝp, p > 0.
Then the change point problem is to test the null hypothesis (H0) about the population parameter φj, j = 1,2, …, n:
\[ H_0:\ \varphi_j = \theta_0 \quad \text{for } 1 \le j \le n \]
versus an alternative hypothesis
\[ H_1:\ \varphi_j = \begin{cases} \theta_0, & \text{for } 1 \le j \le k \\ \theta_1, & \text{for } k < j \le n \end{cases} \qquad (1) \]
where θ0 ≠ θ1 and k is an unknown time of change. Together these hypotheses capture the two aspects of change-point inference: determining whether any change point exists in the process and estimating the time of change t0 = k. The likelihood ratio corresponding to the hypotheses H0 and H1 is given by equation (2), where pθ0 and pθ1 are the pdfs before and after the change occurs and pθ is the overall probability density. When the only unknown parameter is t0, its maximum likelihood estimate (MLE) is given by the statistic in equation (3).

2.2. Offline Estimation of the Change Time
Suppose the problem is to estimate the change time t0 in the sequence of observations {yj}1...n, and assume that a change point exists under the same presumptions as in the last section. Considering equations (2) and (3), and noting that a term which is constant for the given data can be dropped, the corresponding MLE is given by equation (4), where t̂0 is the maximum log-likelihood estimate of t0. Equation (4) can be rewritten as equation (5). Since a further term remains constant for a given observation, the estimate t̂0 simplifies to equation (6). Therefore, the MLE of the change time t0 is the value which maximizes the sum of log-likelihood ratios over all possible values of k, as given by equation (6).

3. CHANGE-POINT DETECTION FORMULATION OF GLOBAL THRESHOLDING
3.1. Assumptions
Let (χ, βχ, Pθ)θ∈Θ be the statistical space of discrete grey levels associated with a random variable Y: ℤ×ℤ → ℤ, where βχ is the σ-field of Borel subsets A ⊂ χ and {Pθ}θ∈Θ is a family of probability distributions defined on the measurable space (χ, βχ) with parameter space Θ, an open subset of
ℝq, q > 0. We consider a finite population Π of all gray-level images with N elements that can be classified into M categories or classes L = {l1, ..., lM}, i.e. each sample point in the sample image can take any random gray-level value from the set L.

3.2. Change-Point Detection Formulation
Since we are mainly interested in discrete gray-level data, we consider the multinomial distribution model. Let ℘ = {Ei}, i = 1, ..., M, be a partition of χ. The formula Prθ(Ei) = pi(θ), i = 1, ..., M, defines the probability of the gray level li in the discrete statistical model. Further, we assume {y1, ..., yN} to be a random sample from the population described by the random variable Y, representing the gray level of a pixel. Let
\[ N_i = \sum_{j=1}^{N} I_{E_i}(y_j), \]
where IE is the indicator function. Then we can approximate pi(θ) ≈ Ni/N, i = 1, …, M. Estimating θ by the maximum likelihood method consists of maximizing the joint probability distribution for fixed n1, ..., nM, or equivalently maximizing the corresponding log-likelihood function. Therefore, referring to equation (4), the problem of estimating the threshold by MLE can be stated as equation (9), where the unknown parameter θ = θ0 before the change and θ = θ1 after the change. Equation (9) can be expanded as equation (10). The first term within the bracket on the right side of equation (10) is a constant and the last term is independent of j, i.e. neither can influence the MLE. Eliminating these terms from equation (10) and simplifying gives equation (11). Multiplying and dividing the right side of equation (11) by N gives equation (12). Assuming pi(θ) ≈ ni/N, equation (12) can be written as
The expression under the summation in (13) denotes the Kullback-Leibler (KL) divergence between the two densities on either side of the threshold location j, i.e. the pdfs above and below j; therefore equation (13) can be written as equation (14). Since the total sum is independent of j, i.e. a constant for a given observation (a sample image), equation (14) can be rewritten as equation (15). Hence, equation (15) provides the maximum likelihood estimate of the threshold t0. Equation (15) can be restated as the following proposition:

Proposition 1: In a mixture of distributions, the maximum likelihood estimate of the change-point is found by minimizing the Kullback-Leibler divergence of the probability mass across successive thresholds.

In spite of this striking property, KL divergence is not a 'metric' since it is not symmetric. An alternative symmetric formula, obtained by "averaging" the two KL divergences, is given in [11] as equation (16). An attractive property of KL divergence is its robustness, i.e. KL divergence is little influenced even when one component of the mixture distribution is considerably skewed. A proof of robustness for generalized divergence measures can be found in [11, 12]. This method can easily be extended to find multiple thresholds for several mixture distributions by identifying multiple change-points simultaneously.

3.3. Implementation
Let us consider an image I: ℤ×ℤ → L, whose pixels assume M gray levels in the set L = {l1, l2, …, lM}. The empirical distribution of the image can be represented by a normalized histogram p(li) = ni/N, where ni is the number of pixels at the ith gray level and N is the total number of pixels in the image. Now, suppose we group the pixels into two classes B and F (background and object) by thresholding at the level k. Histograms of gray levels can be found for the classes B and F; let us denote them as pB(li) and pF(li). The following statistics are calculated for the level k.
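The building blocks named in this subsection can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's exact statistics: the class histograms are assumed to be renormalised to sum to one, a small `eps` guards against division by zero, and the function names are hypothetical. The sketch stops short of CPD(k) itself, since how these pieces are combined into the criterion is not assumed here.

```python
import math


def normalized_histogram(counts):
    """Empirical distribution p(l_i) = n_i / N from raw grey-level counts."""
    total = sum(counts)
    return [n / total for n in counts]


def class_histograms(p, k):
    """Histograms of the classes B (levels below k) and F (levels k and above).

    Assumption: each class histogram is renormalised to sum to one.
    """
    mass_b, mass_f = sum(p[:k]), sum(p[k:])
    if mass_b == 0 or mass_f == 0:
        raise ValueError("threshold k leaves one class empty")
    p_b = [pi / mass_b for pi in p[:k]]
    p_f = [pi / mass_f for pi in p[k:]]
    return p_b, p_f


def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(p || q) for discrete distributions of equal length."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def symmetric_kl(p, q, eps=1e-12):
    """Symmetric variant: the average of D(p || q) and D(q || p)."""
    return 0.5 * (kl_divergence(p, q, eps) + kl_divergence(q, p, eps))


# Toy histogram with 6 grey levels and a candidate threshold k = 3.
p = normalized_histogram([4, 3, 1, 1, 5, 6])
p_b, p_f = class_histograms(p, k=3)
d = symmetric_kl([0.5, 0.5], [0.25, 0.75])
```

Averaging the two directed divergences makes the measure independent of the order of its arguments, which is the property the symmetric formula is introduced for.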
Finally, the criterion CPD(k) is evaluated. The minimum value of CPD(k) over all values of k in the range [1, …, M] gives an optimal estimate of the threshold t0.

4. THRESHOLDING PERFORMANCE CRITERIA
The objective of a global thresholding algorithm is to divide the image into two binary images, generally called background and foreground (object). Most histogram-based thresholding algorithms try to devise a criterion function which produces a threshold that separates the shapes and patterns of the foreground and background as much as possible. A good thresholding algorithm can be judged by how well it sets apart the object and the background binary images, i.e. how much dissimilarity exists between the foreground and the background. Since the background and foreground images are binary images, the dissimilarity between them can be measured by any binary distance measure. Based on this observation, we propose a threshold evaluation criterion which measures the dissimilarity between the patterns and shapes in foreground and background. A number of binary similarity and distance measures have been proposed in different areas; a comprehensive survey of them can be found in Choi et al. [13]. In order to understand the distance measure used in our work, it is helpful to refer to the following contingency table (Table 1):

Table 1. Binary contingency table

                   Background = 1   Background = 0
  Foreground = 1         a                b
  Foreground = 0         c                d

The cell entries in Table 1 are the numbers of pixel locations for which the two binary images agree or differ. For example, cell entry 'a' is the total count of pixel locations where both binary images take the value one. Hence, b + c denotes the total count where foreground and background pixels differ (Hamming distance) and a + d is the total count where they agree. In order to extract the shapes and patterns present in the foreground (F) and background (B) images, we use the binary morphological gradient.
The binary morphological gradient is the difference between the dilated and eroded images. Any other edge or texture detection algorithm for binary images can also be used to extract the objects present in foreground and background. In this paper, we use a simple binary distance measure known as the Normalised Manhattan distance (DNM), computed between Fg and Bg, where Fg and Bg denote the binary morphological gradients of the foreground (F) and background (B) respectively. The range of this distance measure is the interval [0, 1]. It is expected that a well-segmented image will have DNM close to 1, while in the worst case DNM = 0. The advantage of this criterion is that it does not require any ground truth image.
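The contingency counts of Table 1 and a distance built from them can be sketched as follows. The normalisation used here, (b + c) / (a + b + c + d), is an assumption chosen so that the value lies in [0, 1]; the paper's exact DNM formula may differ. The function names are hypothetical, and the inputs stand for any two binary images of equal size (for instance the gradient images Fg and Bg).

```python
def contingency_counts(x, y):
    """Count the cells a, b, c, d of the binary contingency table.

    a: both images are 1, b: only x is 1, c: only y is 1, d: both are 0.
    `x` and `y` are lists of rows of 0/1 values with identical shape.
    """
    a = b = c = d = 0
    for row_x, row_y in zip(x, y):
        for px, py in zip(row_x, row_y):
            if px and py:
                a += 1
            elif px:
                b += 1
            elif py:
                c += 1
            else:
                d += 1
    return a, b, c, d


def normalised_manhattan(x, y):
    """Assumed form of the distance: disagreeing pixels over all pixels."""
    a, b, c, d = contingency_counts(x, y)
    return (b + c) / (a + b + c + d)


# Two toy 2x3 binary images.
x = [[1, 1, 0],
     [0, 0, 0]]
y = [[1, 0, 0],
     [0, 1, 1]]
counts = contingency_counts(x, y)
distance = normalised_manhattan(x, y)
```

For these toy images the counts are a = 1, b = 1, c = 2, d = 2, so three of the six pixel locations disagree and the distance is 0.5.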
5. EXPERIMENTAL RESULTS WITH DISCUSSION
To validate the applicability of the proposed Change-Point Detection (CPD) thresholding algorithm, we provide experimental results and compare them with existing algorithms. The first row of Figure 1 shows the test images, labeled from left to right as Dice, Rice, Object, Denise, Train, and Lena respectively.

TABLE 2: Threshold evaluation criterion (DNM) for the test images (A) Dice, (B) Rice, (C) Object, (D) Denise, (E) Train, and (F) Lena

The images have deliberately been selected so that the areas of foreground and background are hugely disproportionate. This gives us an opportunity to test the robustness of the CPD algorithm. To compare the results, we selected five of the most popular thresholding algorithms, namely Kittler-Illingworth [14], Otsu [15], Kurita [16], Sahoo [17] and Entropy [18]. In Figure 1, the third row onwards shows the outputs of the different thresholding algorithms. The last row shows the output of the proposed CPD thresholding algorithm. Due to substantial skewness in the distributions of gray levels in object or background, most of the algorithms confused foreground with background, but the results in the last row clearly show that CPD works significantly better in all cases. Table 2 shows the optimal thresholds of the five selected algorithms and the CPD algorithm, evaluated using our proposed performance criterion. It is clear that CPD performs reasonably well. For example, for the Denise and Train images, Kittler-Illingworth thresholding totally fails to distinguish the object from the background due to its assumption of a Gaussian distribution for both foreground and background [19]. Otsu's and Kurita's methods yield almost the same output due to their common assumptions. The corresponding histograms are reproduced in Figure 2, marked with the threshold locations of all six algorithms for reference.
The threshold locations show that the CPD algorithm is very little influenced by the asymmetry of the object or background distributions.

6. CONCLUSIONS
In this paper we propose a novel global image thresholding algorithm based on statistical change-point detection (CPD), derived from a symmetric version of the Kullback-Leibler divergence measure. The experimental results clearly show that this algorithm is largely unaffected by a disproportionate dispersal of object and background scene, and is also very little influenced by the skewness of the distributions of object and background, compared to other well-known algorithms. We also propose a thresholding performance criterion using the dissimilarity between foreground and background binary images. The advantage of this performance criterion is that it does not require any ground truth image.

Figure 1. Result of thresholding algorithms on tested images: Row-1: Original Images; Row-2: Shapes of histograms; Row-3: Kittler; Row-4: Otsu; Row-5: Kurita; Row-6: Sahoo; Row-7: Entropy; Row-8: CPD Threshold.
Figure 2: Histogram of (a) Denise and (b) Train image with threshold locations

REFERENCES
[1] M. Sezgin and B. Sankur, (2004) "Survey over image thresholding techniques and quantitative performance evaluation", J. of Electronic Imaging, Vol. 13, No. 1, pp. 146–165.
[2] A. Rosenfeld and P. De la Torre, (1983) "Histogram concavity analysis as an aid in threshold selection", IEEE Trans. Syst. Man Cybernetics, SMC-13, pp. 231–235.
[3] M. I. Sezan, (1985) "A peak detection algorithm and its application to histogram-based image data reduction", Graph. Models Image Process., Vol. 29, pp. 47–59.
[4] D. M. Tsai, (1995) "A fast thresholding selection procedure for multimodal and unimodal histograms", Pattern Recogn. Lett., Vol. 16, pp. 653–666.
[5] A. Pikaz and A. Averbuch, (1996) "Digital image thresholding based on topological stable state", Pattern Recogn., Vol. 29, pp. 829–843.
[6] N. R. Pal and S. K. Pal, (1993) "A review on image segmentation techniques", Pattern Recog., Vol. 26, No. 9, pp. 1277–1294.
[7] P. K. Sahoo, S. Soltani, A. K. C. Wong, and Y. C. Chen, (1988) "A survey of thresholding techniques", Computer Vision, Graphics, and Image Process., Vol. 41, No. 2, pp. 233–260.
[8] C. V. Jawahar, P. K. Biswas, and A. K. Ray, (1997) "Investigations on fuzzy thresholding based on fuzzy clustering", Pattern Recogn., Vol. 30, No. 10, pp. 1605–1613.
[9] H. V. Poor and O. Hadjiliadis, (2009) Quickest Detection, Cambridge University Press, New York.
[10] J. Chen and A. K. Gupta, (2012) Parametric statistical change point analysis, with applications to genetics, medicine and finance, 2nd Ed., Birkhäuser, Boston.
[11] Y. Wang, (2011) "Generalized Information Theory: A Review and Outlook", J. of Inform. Tech., Vol. 10, No. 3, pp. 461–469.
[12] L. Pardo, (2006) Statistical Inference Based on Divergence Measures, Chapman & Hall/CRC, pp. 233.
[13] S. Choi, S. Cha, and C. C. Tappert, (2010) "A Survey of Binary Similarity and Distance Measures", Systemics, Cybernetics and Informatics, Vol. 8, No. 1, pp. 43–48.
[14] J. Kittler and J. Illingworth, (1986) "Minimum error thresholding", Pattern Recognition, Vol. 19, pp. 41–47.
[15] N. Otsu, (1979) "A threshold selection method from gray level histograms", IEEE Trans. Syst. Man Cybern., SMC-9, pp. 62–66.
[16] T. Kurita, N. Otsu, and N. Abdelmalek, (1992) "Maximum likelihood thresholding based on population mixture models", Pattern Recognition, Vol. 25, pp. 1231–1240.
[17] P. Sahoo, C. Wilkins, and J. Yeager, (1997) "Threshold selection using Renyi's entropy", Pattern Recogn., Vol. 30, pp. 71–84.
[18] P. K. Sahoo and G. Arora, (2006) "Image thresholding using two-dimensional Tsallis–Havrda–Charvat entropy", Pattern Recognition Letters, Vol. 27, pp. 520–528.
[19] J. Xue and D. M. Titterington, (2011) "t-tests, F-tests and Otsu's Methods for Image Thresholding", IEEE Trans. Image Processing, Vol. 20, No. 8, pp. 2392–2396.
[20] H. Tizhoosh, (2005) "Image thresholding using type II fuzzy sets", Pattern Recognition, Vol. 38, pp. 2363–2372.