Nyongesa L.Kennedy & Syaywa J.Paul
International Journal of Scientific and Statistical Computing (IJSSC), Volume (1): Issue (1) 1
Group Testing With Test Errors Made Easier
Nyongesa L. Kennedy knyongesa@hotmail.com
Department of Mathematics,
Masinde Muliro University of Science and Technology,
190 Kakamega, Kenya
Paul J. Syaywa syaywa@yahoo.com
Department of Mathematics
Masinde Muliro University of Science and Technology,
190 Kakamega, Kenya
Research Partially Supported by MMUST-URF
Abstract
Group testing is a cost-effective procedure for identifying defective items in a
large population. It also improves the efficiency of the testing procedure when
imperfect tests are employed. This study develops a computational group-testing
strategy based on the testing strategy of [5]. Statistical moments based on this
design have been generated. With the advent of digital computers in the 1980s, the
group-testing strategy under discussion is handled in the context of computational
statistics.
Keywords: False-negative; False-positive; Group; Imperfect-tests; Pool.
1. INTRODUCTION
Sequential testing of a population in the form of grouped samples dates back to the
Second World War, when it was proposed by [2] as a cost-effective method for screening syphilis
in US soldiers returning from abroad. The idea in [2] entails putting together individuals to form
a group, and then testing the group rather than testing each individual for presence or absence
of the characteristic of interest. Epidemiological studies that use group testing have one of two
objectives. The first objective is to screen a large population with a view to identifying those
individuals with a trait (cf. [2]). The second objective is to estimate the rate of the trait
(cf. [10] and [9]). For either objective, group testing is more cost-effective than individual
testing, especially when the rate of the trait is low: if a group tests negative, it implies that
none of the individuals that constitute the group have the trait, and thus it is not necessary to
test each individual in the group.
In recent years, there has been renewed interest in group testing strategies for biological
specimens because of their application in HIV/AIDS epidemiology (cf. [5]). The procedure has
potential in HIV/AIDS testing because disease prevalence is estimated without
necessarily identifying the subject (cf. [3]). [12] studied the cost-effectiveness of pooling algorithms
for the first objective of identifying individuals with the trait. In their procedure, each
group that tests positive is divided into two equal groups, which are then tested. Groups that test
positive are further subdivided and tested, and so on. [13] extended this work by considering
pooling algorithms in the presence of test errors and showed that some of these algorithms can reduce
the error rates of the screening procedure (the false positives and false negatives) compared to
individual testing. [7] examined group testing with re-testing and observed that re-testing improves
the sensitivity and specificity of the group-testing algorithm.
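The halving procedure attributed to [12] above can be sketched in a few lines. This is only an illustration under assumptions that are ours, not the cited authors': the test is perfect, a pool is positive when it contains at least one individual with the trait, and an odd-sized pool is split as evenly as possible. The function name `halving_tests` is hypothetical.

```python
def halving_tests(status):
    """Repeated halving with a perfect test (illustrative assumption).

    status: list of booleans, True = individual has the trait.
    Returns (indices identified as positive, number of tests used).
    """
    def recurse(lo, hi):
        tests = 1                          # one test on the pool status[lo:hi]
        if not any(status[lo:hi]):
            return [], tests               # negative pool: nobody is tested further
        if hi - lo == 1:
            return [lo], tests             # a positive pool of one is an identified individual
        mid = (lo + hi) // 2               # split the positive pool into two halves
        left, t_left = recurse(lo, mid)
        right, t_right = recurse(mid, hi)
        return left + right, tests + t_left + t_right

    return recurse(0, len(status))


# One individual with the trait among eight.
status = [False] * 8
status[5] = True
print(halving_tests(status))
```

With a low rate of the trait most pools test negative early, which is where the reduction in the number of tests comes from.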
Recent studies have focused on the second objective of estimating the rate of the trait using a
group-testing strategy. [11] discussed the procedure as a potential method for use by
pharmaceutical companies in the early stages of drug discovery. [8] proposed an
estimator in the pool testing strategy that benefits from re-testing the pools, and observed that
re-testing improves the efficiency of the estimator.
In this study, we discuss the computation of statistical measures in a pool testing strategy with
imperfect tests using the computer package MATLAB, based on the pool testing design of [5]. To the
authors' knowledge, no article in the group-testing literature, as championed by [2],
has discussed the procedure from a computational perspective. The rest of the paper is arranged as
follows: the group testing strategy with imperfect tests, that is, in the presence of test errors, is
introduced in Section 2. Various statistical moments are generated in Section 3. Misclassification
in the proposed algorithm as a result of test errors is discussed in Section 4. Section 5 provides
the conclusion to the present study.
2. THE TESTING STRATEGY
In this study, we generalize the group testing strategy by introducing an error component in the
testing scheme, so that the earlier strategies, as proposed by [5], become special cases.
The strategy proposed in this study is as follows. Initially, pool the population under investigation
into a single group of size n and carry out a test on the group. If the test result is negative, further
testing is discontinued. If the test result is positive, the group is divided into groups of equal sizes
(nk), and each group is subjected to group testing. If a group tests positive, individual testing is
carried out on its members. A diagrammatic description of the procedure is presented in Figure 1.
FIGURE 1: Block Testing Strategy (diagram not reproduced; legend: + positive result on the test, - negative result on the test)
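The three stages described in this section can be sketched as a small simulation. This is a minimal illustration under our own assumptions, not the authors' MATLAB code: tests act independently, a pool is truly positive when it contains at least one defective item, there is no dilution effect, and the names `imperfect_test`, `block_testing`, `sens`, and `spec` are hypothetical.

```python
import random


def imperfect_test(contains_defective, sens, spec, rng):
    """Return True (positive) or False (negative) for one imperfect test."""
    if contains_defective:
        return rng.random() < sens     # true positive with probability sens
    return rng.random() >= spec        # false positive with probability 1 - spec


def block_testing(status, k, sens, spec, rng):
    """status: list of booleans (True = defective).

    Returns (indices classified as defective, number of tests used).
    """
    tests = 1
    flagged = []
    # Stage 1: one test on the whole population pooled together.
    if not imperfect_test(any(status), sens, spec, rng):
        return flagged, tests
    # Stage 2: split into groups of size k and test each group.
    for start in range(0, len(status), k):
        group = range(start, min(start + k, len(status)))
        tests += 1
        if imperfect_test(any(status[i] for i in group), sens, spec, rng):
            # Stage 3: test every member of a positive group individually.
            for i in group:
                tests += 1
                if imperfect_test(status[i], sens, spec, rng):
                    flagged.append(i)
    return flagged, tests


rng = random.Random(1)
status = [rng.random() < 0.05 for _ in range(100)]   # incidence probability 0.05
flagged, tests = block_testing(status, k=10, sens=0.99, spec=0.99, rng=rng)
print(len(flagged), tests)
```

Setting `sens = spec = 1.0` recovers the perfect-test special case, in which every defective item is found and no non-defective item is flagged.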
3. MOMENTS IN THE GROUP TESTING STRATEGY
The generation of random numbers from probability distributions forms the basis for the generation
of moments in this section. Notice that, in order to use a computer to initiate a simulation study,
we must be able to generate values of a uniform (0, 1) random variable; such variates are called
random numbers. Most computers have an in-built subroutine, called a random number generator, for
this purpose. For further discussion on this subject see [6]. With the above in mind, we are in a
position to generate moment measures for our proposed group testing strategy. In our testing
strategy, we shall assume that the tests in use are imperfect, so that the situation with perfect
tests is a special case. Before generating the moments, we require the composite probability of
classifying a group as positive, denoted by π and given by
$$\pi = \eta\left[1 - (1 - p)^{k}\right] + (1 - \phi)(1 - p)^{k} \qquad (1)$$
where k is the group size, p is the probability of incidence, and $\eta$ and $\phi$ are the
sensitivity and specificity of the test in use, respectively. Equation (1) is easily derived by the
law of total probability. In our study, (1) is the probability of success. Therefore, we shall
generate random numbers from a binomial distribution with probability of success $\pi$. Now, with (1)
at hand, to convert the data set {xi} generated from U (0, 1) into zeros and ones, we use the
indicator function
$$I(x_i) = \begin{cases} 1, & x_i \le \pi \\ 0, & \text{otherwise.} \end{cases}$$
But from (1), it is clear that $\pi \in (0, 1)$ since $p \in (0, 1)$ and $\eta \in (0, 1]$ and $\phi \in (0, 1]$.
Also, notice that in situations where the test kits are perfect, $\eta = \phi = 1$. Then (1)
reduces to
$$\pi = 1 - (1 - p)^{k} \qquad (2)$$
and if the group size is one, i.e., k = 1, (1) reduces to
$$\pi = \eta p + (1 - \phi)(1 - p) \qquad (3)$$
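As a numerical check on the composite probability and its two special cases, the law-of-total-probability argument can be written out directly: a pool tests positive either because it contains a defective item and the test detects it, or because it is clean and the test errs. The sketch below assumes independent items and no dilution effect; the function and parameter names are ours.

```python
def group_positive_prob(p, k, sens, spec):
    """Probability that a pool of k items tests positive.

    p: incidence probability, sens: sensitivity, spec: specificity.
    """
    clean = (1.0 - p) ** k                 # P(no defective item in the pool)
    return sens * (1.0 - clean) + (1.0 - spec) * clean


print(group_positive_prob(0.01, 10, 0.99, 0.99))   # imperfect test, pools of 10
print(group_positive_prob(0.01, 10, 1.0, 1.0))     # perfect test: 1 - (1 - p)^k
print(group_positive_prob(0.01, 1, 0.99, 0.99))    # k = 1: individual testing
```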
Let X denote the number of defective groups (groups that test positive), then X ~
binomial(n, π). Hence, various statistical measures, namely the mean, standard deviation, kurtosis,
and skewness, have been computed with the aid of statistical packages. In addition, the total number
of tests, the cost, and the relative savings have been computed. We utilize Equations (1) and (3) to
generate these moments as presented in Tables 1a(i), 1a(ii), 1a(iii), through 2(b). Graphical
presentations are provided in Figures 2(a) through 2(c) in the Appendix.
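The simulation step described above (uniform random numbers, conversion to zeros and ones with the indicator function, then sample moments of the count of positive groups) might look as follows. This is a sketch in Python rather than the MATLAB used in the study; the function names, the number of replications, and the example value of the group-positive probability are our assumptions.

```python
import random
import statistics


def simulate_positive_groups(n_groups, pi, reps, seed=0):
    """Simulate the number of positive groups, reps times.

    Each group is a uniform(0, 1) draw u turned into 1 if u <= pi, else 0.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(reps):
        counts.append(sum(1 for _g in range(n_groups) if rng.random() <= pi))
    return counts


def sample_moments(xs):
    """Return (mean, standard deviation, skewness, kurtosis) of a sample."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    if sd == 0:
        return m, sd, float("nan"), float("nan")
    skew = statistics.fmean(((x - m) / sd) ** 3 for x in xs)
    kurt = statistics.fmean(((x - m) / sd) ** 4 for x in xs)
    return m, sd, skew, kurt


# Example: 10 groups, each positive with an illustrative probability 0.1.
counts = simulate_positive_groups(n_groups=10, pi=0.1, reps=5000, seed=1)
print(sample_moments(counts))
```

For a binomial count the sample mean and standard deviation should settle near nπ and the square root of nπ(1 − π) as the number of replications grows, which gives a simple sanity check on the generator.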
The simulation at population size 100 with groups of size 10 when the sensitivity and specificity of
the tests in use are 99% is provided in Table 1(a)(i). It can be observed from the simulated results
that:
• The number of defectives increases with increase in the incidence probability p,
• The number of tests increases with increase in p,
• Relative savings decrease with increase in p.
If the population size is increased to 500 or 1000 from 100 with group size 20, as presented in
Tables 1a(ii) and 1a(iii), respectively, similar observations are made as noted above. Further,
notice that when the population size is fixed but the group size is increased, more defectives are
realized but there is no significant difference in relative savings. This is noted when we compare
Table 1a(ii) and Table 1a(iii).
Now, varying the sensitivity and specificity from 99% to 95%, as provided by Tables 1b(i) to
1b(iii), we draw similar conclusions. Graphical evidence of the observations on the average number of
tests required to identify all defective items in the group is provided by Figures 2(a) through 2(c).
Clearly, the observations made hold in practice. The group testing strategy is only viable when
the incidence probability is small [2]. Otherwise individual testing is preferred. Thus, the tables
provide empirical evidence for the group testing scheme.
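The dependence of relative savings on the incidence probability can be illustrated with a short calculation. The sketch assumes a simplified two-stage scheme (one test per group, then individual tests for every member of a positive group), unit cost per test, and savings measured against testing every item individually. The paper's tables may rest on a different accounting, so the numbers are illustrative only and all names are hypothetical.

```python
def group_positive_prob(p, k, sens, spec):
    """Probability that a pool of k items tests positive (independent items)."""
    clean = (1.0 - p) ** k
    return sens * (1.0 - clean) + (1.0 - spec) * clean


def expected_tests(n_items, k, p, sens, spec):
    """Expected number of tests: one per group plus k per positive group."""
    n_groups = n_items / k
    pi = group_positive_prob(p, k, sens, spec)
    return n_groups * (1.0 + k * pi)


def relative_savings(n_items, k, p, sens, spec):
    """Fraction of tests saved relative to testing all n_items individually."""
    return 1.0 - expected_tests(n_items, k, p, sens, spec) / n_items


for p in (0.01, 0.05, 0.10, 0.20, 0.30):
    print(p, round(relative_savings(100, 10, p, 0.99, 0.99), 3))
```

Under these assumptions the savings fall steadily as p grows and can turn negative once p is large, which matches the qualitative observation that individual testing is preferred at high incidence.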
4. MISCLASSIFICATIONS IN GROUP TESTING STRATEGY
Our main assumption in this study is that tests act independently and that errors
are part of the design, as is the case in practice ([7], [1]). Thus, misclassifications are bound to
arise in the testing scheme. There are two possible misclassifications in the group-testing
literature, namely:
• A defective item classified as non-defective, termed a false negative,
• A non-defective item classified as defective, termed a false positive.
The probabilities of interest here are the probability of a false positive (4)
and the probability of a false negative (5).
We now utilize (4) and (5) to compute misclassifications as presented in Tables 3(a), 3(b), 3(c),
and 3(d). The simulated results are also presented graphically in Figures 3(a), 3(b), 3(c) and 3(d).
Computed values of false positives for population sizes of 100, 500, and 1000 with group sizes of 10,
20, and 50 are presented in Tables 3(a) and 3(b), when tests with equal sensitivity and specificity
are employed. It can be observed that:
• The number of false positives increases with increase in p,
• More false positives are realized with increase in group size. In fact, when the group size is
doubled, the number of false positives increases at least two-fold,
• An increase in the efficiency of the test kits results in a reduction in false positives.
From Tables 3(c) and 3(d), we observe that:
• The number of false negatives increases at a slow rate with increase in the incidence probability,
• The number of false negatives approximately doubles when the group size is doubled,
• If the efficiency of the tests is increased, fewer false negatives are realized.
5. CONCLUSION
We have presented a computational pool testing strategy with test errors based on the design of [5].
It is evident from the computed results, Tables 1a(i) to 1a(iii), that groups should be relatively
small to obtain the desired results, as relative savings decrease with increase in pool size.
This observation is relevant in situations where a dilution effect can affect the results (cf. [4]).
Also notice that relative savings are prominent when the efficiency of the test kits is high.
Furthermore, the computed results support the idea that the procedure is only feasible when the
prevalence rate is small; otherwise individual testing is preferable, i.e., relative savings decrease
with increase in prevalence rate. Misclassifications are prominent when the efficiency of the test
kits is low and the incidence probability is high, calling for re-testing ([7] and [8]).
6. REFERENCES
1. R. Brookmeyer. "Analysis of Multistage Pooling Studies of Biological Specimens for
Estimating Disease Incidence and Prevalence". Biometrics 55, 608-612, 1999.
2. R. Dorfman. "The Detection of Defective Members of Large Populations". Annals of
Mathematical Statistics 14, 436-440, 1943.
3. J.L. Gastwirth, P.A. Hammick. "Estimation of the Prevalence of a Rare Disease, Preserving
the Anonymity of the Subjects by Group Testing: Application to Estimating the Prevalence
of AIDS Antibodies in Blood Donors". Journal of Statistical Planning and Inference 22,
15-27, 1989.
4. F.K. Hwang. "Group Testing with a Dilution Effect". Biometrika 63, 611-613, 1975.
5. R.L. Kline, T. Bothus, R. Brookmeyer, S. Zeyer, T. Quinn. "Evaluation of Human
Immunodeficiency Virus Sera Prevalence in Population Surveys Using Pooled Sera". Journal
of Clinical Microbiology 27, 1449-145, 1989.
6. W.L. Martinez, A.R. Martinez. "Computational Statistics Handbook with MATLAB".
Chapman & Hall/CRC, 2002.
7. L.K. Nyongesa. "Multistage Group Testing Procedure (Group Screening)".
Communications in Statistics-Simulation and Computation 33(3), 621-637, 2004.
8. L.K. Nyongesa. "Dual Estimation of Prevalence and Disease Incidence in Pool-Testing
Strategy". Communications in Statistics - Theory and Methods (in press), 2010.
9. M. Sobel, R.M. Elashoff. "Group-Testing with a New Goal, Estimation".
Biometrika 62, 181-193, 1975.
10. K.H. Thompson. "Estimation of the Proportion of Vectors in a Natural Population of Insects".
Biometrics 18, 568-578, 1962.
11. M. Xie, K. Tatsuoka, J. Sacks, S. Young. "Group Testing with Blockers and Synergism".
Journal of the American Statistical Association 96, 92-101, 2001.
12. N.L. Johnson, S. Kotz, X. Wu. "Inspection Errors for Attributes in Quality Control".
London: Chapman and Hall, 1991.
13. E. Litvak, X.M. Tu, M. Pagano. "Screening for the Presence of a Disease by Pooling Sera
Samples". Journal of the American Statistical Association 89, 424-434, 1994.

Group Testing with Test Errors Made Easier

  • 1. Nyongesa L.Kennedy & Syaywa J.Paul International Journal of Scientific and Statistical Computing (IJSSC), Volume (1): Issue (1) 1 Group Testing With Test Errors Made Easier Nyongesa L. Kennedy knyongesa@hotmail.com Department of Mathematics, Masinde Muliro University of Science and Technology, 190 kakamega, Kenya Paul J. Syaywa syaywa@yahoo.com Department of Mathematics Masinde Muliro University of Science and Technology, 190 kakamega, Kenya Research Partially Supported by MMUST-URF Abstract Group testing is a cost-effective procedure for identifying defective items in a large population. It also improves the efficiency of the testing procedure when imperfect tests are employed. This study develops computational group-testing strategy based on [5] testing strategy. Statistical moments based on this applied design have been generated. With advent of digital computers in 1980‘s, group- testing strategy under discussion is handled in the context of computational statistics. Keywords: False-negative; False-positive; Group; Imperfect-tests; Pool. 1. INTRODUCTION Sequential testing of a population in the form of grouped sample started the way back in the second world war by [2] as a cost-effective method for screening syphilis in US soldiers returning from abroad. The [2] idea entails putting together individuals to form a group, and then testing the group rather than testing each individual for evidence or absence of the characteristic of interest. Epidemiological studies that use group testing have one of the two objectives. The first objective is to screen a large population with a view to identifying those individuals with a trait (cf. [2]). The second objective is to estimate the rate of the trait (cf. [10] and [9]). 
For either objective, group testing is more cost effective than individual testing especially when the rate of the trait is low because if a group tests negative it implies that none of the individuals that constitute the group have the trait, and thus it is not necessary to test each individual in the group. In recent years, there has been renewed interest in group testing strategies of biological specimens because of the application in HIV/Aids epidemiology (cf. [5]). The procedure has potential in the application of HIV/Aids testing because disease prevalence is estimated without necessarily identifying the subject (cf. [3]). [12] studied the cost-effectiveness of pooling algorithm
  • 2. Nyongesa L.Kennedy & Syaywa J.Paul International Journal of Scientific and Statistical Computing (IJSSC), Volume (1): Issue (1) 2 for the first objective of identifying individuals with the trait. In their procedure, each individual group that test positive is divided into two equal groups, which are then tested. Groups that tested positive were further sub-divided and tested and so on. [13] extended this work by considering pooling algorithms when there are errors and showed that some of these algorithms can reduce the error rates of the screening procedures (the false positives and false negatives) compared to individual testing. [7] examined group testing with re-testing and observed that re-test improves the sensitivity and specificity of the group-testing algorithm. Recent studies have focused on the second objective of estimating the rate of the trait using group-testing strategy. [11] discussed the procedure as a potential method for use by pharmaceutical companies in discovering drugs in the early stages. [8] has proposed an estimator in pool testing strategy that benefit from re-testing the pools. He observed that re- testing improves the efficiency of the estimator. In this study, we discuss the computation of statistical measures in pool testing strategy with imperfect test via computer package MATLAB based on [5] design of pool testing strategy. To the authors knowledge no article has appeared in the literature of group-testing as championed by [2] that has discussed the procedure in computational aspect. The rest of the paper is arranged as follows: Group testing strategy with imperfect test or in the presence of test errors is introduced in Section 2. Various statistical moments are generated in Section 3. Misclassification in the proposed algorithm as a result of test errors is discussed in Section 4.Section 5 provides the conclusion to the present study. 2. 
THE TESTING STRATEGY In this study, we generalize the group testing strategy by introducing the error component in the testing scheme so that the earlier proposed strategies become special cases as proposed by [5]. The strategy proposed in this study is as follows. Initially, group the population under investigation into a single group of size n and carry out a test on the group. If the test result is negative, further testing is discontinued. If the test result is positive, the group is divided into groups of equal sizes (nk), and each group is subjected to group-testing. If the group tests positive, individual testing is carried out. Diagrammatic description of the procedure has been presented in Figure 1. FIGURE 1: Block Testing Strategy nb21 ……………………i _ + …….………………1 2 i Positive result on the test Negative result on the test
  • 3. Nyongesa L.Kennedy & Syaywa J.Paul International Journal of Scientific and Statistical Computing (IJSSC), Volume (1): Issue (1) 3 3. MOMENTS IN THE GROUP TESTING STRATEGY Generation of random numbers from distributions form a basis for generation of moments in this section. Notice that, in order to use a computer to initiate a simulation study, we must be able to generate the values of a uniform (0,1) random variables; such variates are called random numbers, most computers have in-built subroutines, called a random number generator. For further discussion on this subject see [6].With the above in mind, we are in a position to generate moment measures in our proposed group testing strategy. In our testing strategy, we shall assume that tests under use are imperfect so that when tests are assumed to be perfect would be a special case. Before the generation of moments, we shall require the composite probability of classifying a group as positive, denoted by π and given by (1) where k is the group size, p is the probably of incidence, and are the sensitivity and specificity of the test in use, respectively. Equation (1) is easily derived by the law of total probability. In our study, (1) is the probability of success. Therefore, we shall generate random numbers from a binomial distribution with probability of success . Now, with (1) at hand, to convert the data set {xi} generated from U (0, 1) into zeros and ones, we use the indicator function But from (1), it is clear that since (0, 1) and and . Also, notice that in situations where the test kits are perfect, . Then (1) reduces to (2) and if the group size is one, i.e., k = 1, (1) reduces to (3) Let X denote the number of defective groups (groups that test positive on the test), then X ~ binomial(n, π ). Hence, various statistical measures; mean, standard deviation, Kurtosis and skewness have been computed by the aid of statistical packages. 
In addition, the total number of tests, the cost, and the relative savings have been computed. We utilize Equations (1) and (3) to generate these moments, as presented in Tables 1a(i), 1a(ii), 1a(iii), through 2(b). Graphical presentations are provided in Figures 2(a) through 2(c) in the Appendix. The simulation at population size 100 with groups of size 10, when the sensitivity and specificity of the tests in use are 99%, is provided in Table 1a(i). It can be observed from the simulated results that:
• The number of defectives increases with increase in the incidence probability p,
• The number of tests increases with increase in p,
• Relative savings decrease with increase in p.
If the population size is increased from 100 to 500 or 1000 with group size 20, as presented in Tables 1a(ii) and 1a(iii), respectively, observations similar to those above are made. Further, notice that when the population size is fixed but the group size is increased, more defectives are realized but there is no significant difference in relative savings. This is noted when we compare Table 1a(ii) and Table 1a(iii). Varying the sensitivity and specificity from 99% to 95%, as provided by Tables 1b(i) to 1b(iii), we draw similar conclusions. Graphical evidence of the observations on the average number of tests required to identify all defective items in the group is provided by Figures 2(a) through 2(c). Clearly, the observations made hold in practice. The group testing strategy is only viable when the incidence probability is small [2]; otherwise individual testing is preferred. Thus, the tables provide empirical evidence for the group testing scheme.

4. MISCLASSIFICATIONS IN GROUP TESTING STRATEGY
Our main assumption in this study is that tests act independently and that errors are part of the design, as is the case in practice ([7], [1]). Thus, misclassifications are bound to arise in the testing scheme. There are two possible misclassifications in the group-testing literature, namely:
• A defective item is classified as non-defective, termed a false negative,
• A non-defective item is classified as defective, termed a false positive.
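The two kinds of misclassification defined above can be counted directly once the true status and the assigned classification of each item are known, as they are in a simulation. The following is a minimal sketch; the function name `count_misclassifications` is hypothetical, and it is not the authors' computation of the misclassification probabilities.

```python
def count_misclassifications(true_status, classified):
    """Return (false_positives, false_negatives) for paired boolean lists.

    true_status[i] is True when item i is really defective;
    classified[i] is True when the testing scheme declares it defective.
    """
    if len(true_status) != len(classified):
        raise ValueError("both lists must describe the same items")
    # False positive: non-defective item classified as defective.
    false_positives = sum(
        1 for truth, label in zip(true_status, classified) if label and not truth
    )
    # False negative: defective item classified as non-defective.
    false_negatives = sum(
        1 for truth, label in zip(true_status, classified) if truth and not label
    )
    return false_positives, false_negatives
```

Averaging these counts over many simulated populations gives empirical estimates of the expected numbers of false positives and false negatives for a given group size, incidence probability, and test efficiency.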
The probabilities of interest here are the probability of a false positive, Equation (4), and the probability of a false negative, Equation (5). We now utilize (4) and (5) to compute misclassifications as presented in Tables 3(a), 3(b), 3(c), and 3(d). The simulated results are also presented graphically in Figures 3(a), 3(b), 3(c) and 3(d). Computed values of false positives for population sizes 100, 500, and 1000 with group sizes of 10, 20, and 50 are presented in Tables 3(a) and 3(b), for tests with equal sensitivity and specificity. It can be observed that:
• The number of false positives increases with increase in p,
• More false positives are realized with increase in group size. In fact, when the group size is doubled, false positives increase at least twofold,
• An increase in the efficiency of the test kits results in a reduction in false positives.
From Tables 3(c) and 3(d), we observe that:
• The number of false negatives increases at a slow rate with increase in the incidence probability,
• The number of false negatives approximately doubles when the group size is doubled,
• If the efficiency of the tests is increased, fewer false negatives are realized.

5. CONCLUSION
We have presented a computational pool testing strategy with test errors based on the design of [5]. It is evident from the computed results, Tables 1a(i) to 1a(iii), that groups should be relatively small to obtain the desired results, since relative savings decrease with increase in pool size. This observation is particularly relevant in situations where a dilution effect can affect the results (cf. [4]). Also notice that relative savings are prominent when the efficiency of the test kits is high. Furthermore, the computed results support the idea that the procedure is only feasible when the prevalence rate is small; otherwise individual testing is preferable, i.e., relative savings decrease with increase in prevalence rate. Misclassifications are prominent when the efficiency of the test kits is low and the incidence probability is high, calling for re-testing ([7], [8]).

6. REFERENCES
1. R. Brookmeyer. "Analysis of Multistage Pooling Studies of Biological Specimens for Estimating Disease Incidence and Prevalence". Biometrics 55, 608-612, 1999.
2. R. Dorfman. "The Detection of Defective Members of Large Populations". Annals of Mathematical Statistics 14, 436-440, 1943.
3. J.L. Gastwirth, P.A. Hammick.
"Estimation of the Prevalence of a Rare Disease, Preserving the Anonymity of the Subjects by Group Testing: Application to Estimating the Prevalence of AIDS Antibodies in Blood Donors". Journal of Statistical Planning and Inference 22, 15-27, 1989.
4. F.K. Hwang. "Group Testing with a Dilution Effect". Biometrika 63, 611-613, 1975.
5. R.L. Kline, T. Bothus, R. Brookmeyer, S. Zeyer, T. Quinn. "Evaluation of Human Immunodeficiency Virus Sera Prevalence in Population Surveys Using Pooled Sera". Journal of Clinical Microbiology 27, 1449-145, 1989.
6. W.L. Martinez, A.R. Martinez. "Computational Statistics Handbook with MATLAB". Chapman & Hall/CRC, 2002.
7. L.K. Nyongesa. "Multistage Group Testing Procedure (Group Screening)". Communications in Statistics - Simulation and Computation 33(3), 621-637, 2004.
8. L.K. Nyongesa. "Dual Estimation of Prevalence and Disease Incidence in Pool-Testing Strategy". Communications in Statistics - Theory and Methods (in press), 2010.
9. M. Sobel, R.M. Elashoff. "Group Testing with a New Goal, Estimation". Biometrika 62, 181-193, 1975.
10. K.H. Thompson. "Estimation of the Proportion of Vectors in a Natural Population of Insects". Biometrics 18, 568-578, 1962.
11. M. Xie, K. Tatsuoka, J. Sacks, S. Young. "Group Testing with Blockers and Synergism". Journal of the American Statistical Association 96, 92-101, 2001.
12. N.L. Johnson, S. Kotz, X. Wu. "Inspection Errors for Attributes in Quality Control". London: Chapman and Hall, 1991.
13. E. Litvak, X.M. Tu, M. Pagano. "Screening for the Presence of Disease by Pooling Sera Samples". Journal of the American Statistical Association 89, 424-434, 1994.