Group Testing with Test Errors Made Easier

Nyongesa L.Kennedy & Syaywa J.Paul
International Journal of Scientific and Statistical Computing (IJSSC), Volume (1): Issue (1) 1
Nyongesa L. Kennedy knyongesa@hotmail.com
Department of Mathematics,
Masinde Muliro University of Science and Technology,
190 Kakamega, Kenya
Paul J. Syaywa syaywa@yahoo.com
Department of Mathematics
Masinde Muliro University of Science and Technology,
190 Kakamega, Kenya
Research Partially Supported by MMUST-URF
Abstract
Group testing is a cost-effective procedure for identifying defective items in a
large population. It also improves the efficiency of the testing procedure when
imperfect tests are employed. This study develops a computational group-testing
strategy based on the testing strategy of [5]. Statistical moments based on this applied
design have been generated. With the advent of digital computers in the 1980s, the
group-testing strategy under discussion is handled in the context of computational
statistics.
Keywords: False-negative; False-positive; Group; Imperfect-tests; Pool.
1. INTRODUCTION
Sequential testing of a population in the form of grouped samples dates back to the
Second World War, when [2] introduced it as a cost-effective method for screening US soldiers
returning from abroad for syphilis. The idea in [2] entails putting individuals together to form a
group and then testing the group, rather than testing each individual, for presence or absence of
the characteristic of interest. Epidemiological studies that use group testing have one of two
objectives. The first objective is to screen a large population with a view to identifying those
individuals with a trait (cf. [2]). The second objective is to estimate the rate of the trait (cf. [10]
and [9]). For either objective, group testing is more cost-effective than individual testing,
especially when the rate of the trait is low, because if a group tests negative it implies that none
of the individuals that constitute the group have the trait, and thus it is not necessary to test each
individual in the group.
In recent years, there has been renewed interest in group testing strategies for biological
specimens because of their application in HIV/AIDS epidemiology (cf. [5]). The procedure has
potential in HIV/AIDS testing because disease prevalence is estimated without
necessarily identifying the subject (cf. [3]). [12] studied the cost-effectiveness of pooling algorithms
for the first objective of identifying individuals with the trait. In their procedure, each
group that tests positive is divided into two equal groups, which are then tested. Groups that test
positive are further subdivided and tested, and so on. [13] extended this work by considering
pooling algorithms when there are errors and showed that some of these algorithms can reduce
the error rates of the screening procedures (the false positives and false negatives) compared to
individual testing. [7] examined group testing with re-testing and observed that re-testing improves
the sensitivity and specificity of the group-testing algorithm.
Recent studies have focused on the second objective of estimating the rate of the trait using
a group-testing strategy. [11] discussed the procedure as a potential method for use by
pharmaceutical companies in the early stages of drug discovery. [8] has proposed an
estimator in the pool testing strategy that benefits from re-testing the pools. He observed that
re-testing improves the efficiency of the estimator.
In this study, we discuss the computation of statistical measures in a pool testing strategy with
imperfect tests, using the computer package MATLAB and the pool testing design of [5]. To the
authors' knowledge, no article in the group-testing literature championed by [2] has discussed
the procedure from a computational perspective. The rest of the paper is arranged as
follows: the group testing strategy with imperfect tests, that is, in the presence of test errors, is
introduced in Section 2. Various statistical moments are generated in Section 3. Misclassification
in the proposed algorithm as a result of test errors is discussed in Section 4. Section 5 provides
the conclusion to the present study.
2. THE TESTING STRATEGY
In this study, we generalize the group testing strategy by introducing an error component in the
testing scheme, so that the earlier strategies, as proposed by [5], become special cases.
The strategy proposed in this study is as follows. Initially, group the population under investigation
into a single group of size n and carry out a test on the group. If the test result is negative, further
testing is discontinued. If the test result is positive, the group is divided into groups of equal sizes
(nk), and each group is subjected to group testing. If a group tests positive, individual testing is
carried out on its members. A diagrammatic description of the procedure is presented in Figure 1.
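The three stages described above can be sketched in Python as follows. This is a minimal illustration and not the authors' MATLAB implementation: the function names `imperfect_test` and `block_test`, the subgroup size argument `k`, and the assumption that every test (pooled or individual) shares the same sensitivity and specificity are all hypothetical choices made for the example.

```python
import random

def imperfect_test(contains_defective, sensitivity, specificity, rng):
    """Return True (positive) or False (negative) for one error-prone test."""
    if contains_defective:
        # A truly positive sample is detected with probability `sensitivity`.
        return rng.random() < sensitivity
    # A truly negative sample is wrongly flagged with probability 1 - specificity.
    return rng.random() >= specificity

def block_test(statuses, k, sensitivity, specificity, rng):
    """Run the three-stage scheme; return (declared positive indices, tests used)."""
    tests = 1  # stage 1: one test on the whole population as a single group
    if not imperfect_test(any(statuses), sensitivity, specificity, rng):
        return set(), tests  # negative result: testing is discontinued
    declared = set()
    for start in range(0, len(statuses), k):
        group = range(start, min(start + k, len(statuses)))
        tests += 1  # stage 2: one test per subgroup (assumed size k)
        if imperfect_test(any(statuses[i] for i in group),
                          sensitivity, specificity, rng):
            for i in group:
                tests += 1  # stage 3: individual tests in positive subgroups
                if imperfect_test(statuses[i], sensitivity, specificity, rng):
                    declared.add(i)
    return declared, tests

# Hypothetical usage: 100 items, incidence 2%, subgroups of 10, 99% accurate tests.
rng = random.Random(42)
statuses = [rng.random() < 0.02 for _ in range(100)]
declared, tests = block_test(statuses, 10, 0.99, 0.99, rng)
print(sorted(declared), tests)
```

Setting both accuracy parameters to 1 recovers the error-free special case, in which a population with no defectives is cleared with a single test.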
FIGURE 1: Block Testing Strategy
[Figure: diagram of the block testing strategy. Legend: "+" denotes a positive result on the test; "-" denotes a negative result on the test.]

3. MOMENTS IN THE GROUP TESTING STRATEGY
The generation of random numbers from probability distributions forms the basis for the
generation of moments in this section. Notice that, in order to use a computer to initiate a
simulation study, we must be able to generate values of a uniform (0, 1) random variable; such
variates are called random numbers, and most computers have an in-built subroutine, called a
random number generator, for producing them. For further discussion on this subject see [6].
With the above in mind, we are in a position to generate moment measures in our proposed group
testing strategy. In our testing strategy, we shall assume that the tests in use are imperfect, so
that the case of perfect tests is a special case. Before the generation of moments, we shall
require the composite probability of classifying a group as positive, denoted by $\pi$ and given by
$$\pi = \eta\left[1 - (1 - p)^{k}\right] + (1 - \beta)(1 - p)^{k}, \tag{1}$$
where $k$ is the group size, $p$ is the probability of incidence, and $\eta$ and $\beta$ are the
sensitivity and specificity of the test in use, respectively. Equation (1) is easily derived by the
law of total probability. In our study, (1) is the probability of success. Therefore, we shall
generate random numbers from a binomial distribution with probability of success $\pi$. Now, with
(1) at hand, to convert the data set $\{x_i\}$ generated from U(0, 1) into zeros and ones, we use
the indicator function
$$I(x_i) = \begin{cases} 1, & x_i \le \pi, \\ 0, & \text{otherwise.} \end{cases}$$
But from (1), it is clear that $\pi \in (0, 1)$ since $p \in (0, 1)$ and $0 < \eta \le 1$ and
$0 < \beta \le 1$. Also, notice that in situations where the test kits are perfect,
$\eta = \beta = 1$. Then (1) reduces to
$$\pi = 1 - (1 - p)^{k}, \tag{2}$$
and if the group size is one, i.e., $k = 1$, (1) reduces to
$$\pi = \eta p + (1 - \beta)(1 - p). \tag{3}$$
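As a rough check on Equation (1) and its two special cases, the composite probability can be computed directly. The sketch below assumes the standard law-of-total-probability form (a group with at least one defective tests positive with probability equal to the sensitivity, and a defective-free group tests positive with probability one minus the specificity); the function names are hypothetical and are not taken from the paper.

```python
import random

def group_positive_probability(p, k, sensitivity, specificity):
    """Probability that a group of size k is classified as positive."""
    q = (1.0 - p) ** k  # probability that the group contains no defective item
    return sensitivity * (1.0 - q) + (1.0 - specificity) * q

def to_indicators(uniform_draws, pi):
    """Map U(0, 1) draws to ones (group positive) and zeros (group negative)."""
    return [1 if x <= pi else 0 for x in uniform_draws]

# Hypothetical usage: incidence 5%, groups of 10, 99% sensitivity and specificity.
pi = group_positive_probability(0.05, 10, 0.99, 0.99)
rng = random.Random(1)
draws = [rng.random() for _ in range(100)]
indicators = to_indicators(draws, pi)
print(pi, sum(indicators))
```

With perfect tests the function returns the probability that a group contains at least one defective, and with a group size of one it returns the probability that a single item tests positive.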
Let X denote the number of defective groups (groups that test positive on the test); then X ~
binomial(n, π). Hence, various statistical measures, namely the mean, standard deviation,
kurtosis and skewness, have been computed with the aid of statistical packages. In addition, the
total number of tests, the cost, and the relative savings have been computed. We utilize
Equations (1) and (3) to generate these moments as presented in Tables 1a(i), 1a(ii), 1a(iii),
through 2(b). Graphical presentations are provided in Figures 2(a) through 2(c) in the Appendix.
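Because X follows a binomial distribution, the four moment measures named above have closed forms that simulated values can be checked against. The sketch below uses the textbook binomial formulas; the function name `binomial_moments` is hypothetical, and kurtosis is reported as excess kurtosis, which is an assumption about the convention rather than something the paper states.

```python
import math

def binomial_moments(n, pi):
    """Closed-form moment measures of X ~ binomial(n, pi), for 0 < pi < 1."""
    variance = n * pi * (1.0 - pi)
    return {
        "mean": n * pi,
        "sd": math.sqrt(variance),
        "skewness": (1.0 - 2.0 * pi) / math.sqrt(variance),
        # Excess kurtosis (a normal distribution scores 0); assumed convention.
        "excess_kurtosis": (1.0 - 6.0 * pi * (1.0 - pi)) / variance,
    }

# Hypothetical usage: 10 groups, each positive with probability 0.1.
print(binomial_moments(10, 0.1))
```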
The simulation at population size 100 with groups of size 10 when the sensitivity and specificity of
the tests in use are 99% is provided in Table 1(a)(i). It can be observed from the simulated results
that:

• The number of defectives increases with increase in the incidence probability p,
• The number of tests increases with increase in p,
• Relative savings decrease with increase in p.
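The trends listed above can be reproduced with a small calculation. The sketch below is a simplification and not the paper's own formula for tests or savings: it assumes n groups of size k, one test per group, and k individual tests for every group that tests positive, so that the expected number of tests is n(1 + kπ), and it defines relative savings as one minus the ratio of expected tests to the n·k tests that individual testing would need. All function names are hypothetical.

```python
def group_positive_probability(p, k, sensitivity, specificity):
    """Probability that a group of size k is classified as positive."""
    q = (1.0 - p) ** k
    return sensitivity * (1.0 - q) + (1.0 - specificity) * q

def expected_tests(n_groups, k, p, sensitivity, specificity):
    """Expected tests: one per group plus k per positive group (assumed scheme)."""
    pi = group_positive_probability(p, k, sensitivity, specificity)
    return n_groups * (1.0 + k * pi)

def relative_savings(n_groups, k, p, sensitivity, specificity):
    """Fraction of tests saved relative to testing all n_groups * k items singly."""
    individual_tests = n_groups * k
    return 1.0 - expected_tests(n_groups, k, p, sensitivity, specificity) / individual_tests

# Hypothetical usage: population of 100 in 10 groups of 10, 99% accurate tests.
for p in (0.01, 0.05, 0.10, 0.20):
    print(p, expected_tests(10, 10, p, 0.99, 0.99),
          relative_savings(10, 10, p, 0.99, 0.99))
```

Under these assumptions the savings turn negative once π exceeds 1 − 1/k, which is the point at which individual testing becomes cheaper.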
If the population size is increased to 500 or 1000 from 100 with group size 20, as presented in
Tables 1a(ii) and 1a(iii), respectively, similar observations are made as noted above. Further,
notice that when the population size is fixed but the group size is increased, more defectives are
realized but there is no significant difference in relative savings. This is noted when we compare
Table 1a(ii) and Table 1a(iii).
Now, varying the sensitivity and specificity from 99% to 95%, as provided by Tables 1b(i) to
1b(iii), we draw similar conclusions. Graphical evidence of the observations on the average
number of tests required to identify all defective items in the group is provided by Figures 2(a)
through 2(c). Clearly, the observations made are true in practice. The group testing strategy is
only viable when the incidence probability is small [2]; otherwise individual testing is preferred.
Thus, the tables provide empirical evidence for the group testing scheme.
4. MISCLASSIFICATIONS IN GROUP TESTING STRATEGY
Our main assumption in this study is that tests act independently and that errors
are part of the design, as is the case in practice ([7], [1]). Thus, misclassifications are bound to
arise in the testing scheme. Two possible misclassifications are recognized in the group-testing
literature, namely:
• A defective item classified as non-defective, termed a false negative,
• A non-defective item classified as defective, termed a false positive.
The probabilities of interest here are the probability of a false positive, Equation (4),
and the probability of a false negative, Equation (5).
We now utilize (4) and (5) to compute misclassifications as presented in Tables 3(a), 3(b), 3(c),
and 3(d). The simulated results are also presented graphically in Figures 3(a), 3(b), 3(c) and 3(d).
Computed values of false positives for population sizes of 100, 500 and 1000 with group sizes of
10, 20 and 50 are presented in Tables 3(a) and 3(b), when tests with equal sensitivity and
specificity are employed. It can be observed that:
• The number of false positives increases with increase in p,
• More false positives are realized with increase in group size. In fact, when the group size is
doubled, false positives increase by at least two-fold,
• An increase in the efficiency of the test kits results in a reduction in false positives.
From Tables 3(c) and 3(d), we observe that:
• The number of false negatives increases at a slow rate with increase in the incidence probability,
• The number of false negatives approximately doubles when the group size is doubled,
• If the efficiency of the tests is increased, fewer false negatives are realized.
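Misclassification counts of the kind summarized above can also be explored by direct Monte Carlo simulation. The sketch below does not implement Equations (4) and (5); it assumes a simplified two-stage scheme in which each group is tested once, members of positive groups are tested individually, members of negative groups are declared non-defective, and all tests act independently with the same sensitivity and specificity. The function name and its parameters are hypothetical.

```python
import random

def simulate_misclassification(n_groups, k, p, sensitivity, specificity, seed=0):
    """Count (false positives, false negatives) in one simulated screening run."""
    rng = random.Random(seed)
    false_pos = 0
    false_neg = 0
    for _ in range(n_groups):
        group = [rng.random() < p for _ in range(k)]  # True marks a defective item
        hit_rate = sensitivity if any(group) else 1.0 - specificity
        group_positive = rng.random() < hit_rate
        for defective in group:
            if group_positive:
                # Individual re-test, with the same error rates as the group test.
                item_rate = sensitivity if defective else 1.0 - specificity
                declared = rng.random() < item_rate
            else:
                declared = False  # members of a negative group are cleared untested
            if declared and not defective:
                false_pos += 1
            if defective and not declared:
                false_neg += 1
    return false_pos, false_neg

# Hypothetical usage: 100 groups of 10, incidence 5%, 95% accurate tests.
print(simulate_misclassification(100, 10, 0.05, 0.95, 0.95))
```

Repeating the run over a grid of incidence probabilities, group sizes and test accuracies gives an empirical counterpart to the tabulated trends, under the stated assumptions only.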
5. CONCLUSION
We have presented a computational pool testing strategy with test errors based on the design
of [5]. It is evident from the computed results, Tables 1a(i) to 1a(iii), that groups should be
relatively small to obtain the desired results, as relative savings decrease with increase in pool
size. This observation is applicable in situations where the dilution effect can affect the results
(cf. [4]). Also notice that relative savings are prominent when the efficiency of the test kits is
high. Furthermore, the computed results support the idea that the procedure is only feasible
when the prevalence rate is small; otherwise individual testing is preferable, i.e., relative savings
decrease with increase in prevalence rate. Misclassifications are prominent when the efficiency
of the test kits is low and the incidence probability is high, calling for re-testing ([7] and [8]).
6. REFERENCES
1. R. Brookmeyer. "Analysis of Multistage Pooling Studies of Biological Specimens for
Estimating Disease Incidence and Prevalence". Biometrics 55, 608-612, 1999.
2. R. Dorfman. "The Detection of Defective Members of Large Populations". Annals of
Mathematical Statistics 14, 436-440, 1943.
3. J.L. Gastwirth, P.A. Hammick. "Estimation of the Prevalence of a Rare Disease, Preserving
the Anonymity of the Subject by Group-Testing; Application to Estimating the Prevalence
of AIDS Antibodies in Blood Donor". Journal of Statistical Planning and Inference 22,
15-27, 1989.
4. F.K. Hwang. "Group Testing with a Dilution Effect". Biometrika 63, 611-613, 1975.
5. R.L. Kline, T. Bothus, R. Brookmeyer, S. Zeyer, T. Quinn. "Evaluation of Human
Immunodeficiency Virus Sera Prevalence in Population Surveys Using Pooled Sera". Journal
of Clinical Microbiology 27, 1449-145, 1989.
6. W.L. Martinez, A.R. Martinez. "Computational Statistics Handbook with MATLAB".
Chapman & Hall/CRC, 2002.
7. L.K. Nyongesa. "Multistage Group Testing Procedure (Group Screening)".
Communications in Statistics-Simulation and Computation 33(3), 621-637, 2004.
8. L.K. Nyongesa. "Dual Estimation of Prevalence and Disease Incidence in Pool-Testing
Strategy". Communication in Statistics Theory and Method (in press), 2010.
9. M. Sobel, R.M. Elashoff. "Group-Testing with a New Goal, Estimation".
Biometrika 62, 181-193, 1975.
10. K.H. Thompson. "Estimation of the Population of Vectors in a Natural Population of Insects".
Biometrics 18, 568-578, 1962.
11. M. Xie, K. Tatsuoka, J. Sacks, S. Young. "Group Testing with Blockers and Synergism".
Journal of American Statistical Association 96, 92-101, 2001.
12. N.I. Johnson, S. Kotz, X. Wu. "Inspection Errors for Attributes in Quality Control".
London: Chapman and Hall, 1991.
13. E. Litvak, X.M. Tu, M. Pagano. "Screening for the Presence of Disease by Pooling Sera
Samples". Journal of the American Statistical Association 89, 424-434, 1994.
