An Analysis of the Accuracies of AC50 Estimates of Dose-Response Curve Modeling Equations
Mitas Ray
Background
The goal of the High Throughput Screening Initiative is to transform
traditional toxicological testing, which uses animals such as rodents
and suffers from very high costs and very low throughput, into
non-rodent animal cell-based assays that can use technological advances
to produce much higher throughput at much lower cost (National
Toxicology Program 2012). In a cytotoxicity assay, quantitative high
throughput screening (qHTS) produced robust and reproducible data (Xia
et al. 2008). An equation to describe dose-response data was first
created by A.V. Hill in 1910 and is known as the Hill equation (Hill
1910). An alternative model, proposed by K.R. Shockley, is also used for
dose-response model fitting (Shockley 2012).
Background (cont.)
The alternative model is defined in terms of the response at a given
concentration, the minimum response, the maximum response, the AC50
(the concentration at which 50% of the maximum response is achieved),
and a parameter that determines how wide or narrow the function is
(Shockley 2012). In this model, a log2-transformed AC50 parameter is fit
to data generated by a Hill equation.
Similarly, a logistic 4-parameter fit was proposed by C. Ritz and
J.C. Streibig for model fitting. Its four parameters are the upper
limit, the lower limit, the AC50, and the slope (Ritz and Streibig
2005).
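For reference, the two model families are often written in the forms
below. This is a sketch, not a reproduction of the slide equations. The
four-parameter log-logistic form follows the parameterization documented
for the drc package (b slope, c lower limit, d upper limit, e AC50). The
symbols R_0, R_max and W in the first two lines are assumed names chosen
to match the verbal definitions above, and whether the log2 form matches
the exact parameterization in Shockley (2012) is an assumption.

```latex
% Hill equation with minimum R_0, maximum R_max, and shape parameter W
f(x) = R_0 + \frac{R_{\max} - R_0}{1 + \left( \mathrm{AC}_{50} / x \right)^{W}}

% Algebraically equivalent form with \log_2 \mathrm{AC}_{50} as the fitted parameter
f(x) = R_0 + \frac{R_{\max} - R_0}{1 + 2^{-W \left( \log_2 x - \log_2 \mathrm{AC}_{50} \right)}}

% Four-parameter log-logistic model in the drc parameterization
f(x) = c + \frac{d - c}{1 + \exp\!\left( b \left( \log x - \log e \right) \right)}
```

Setting c = R_0, d = R_max, e = AC50 and b = -W makes the third form
identical to the first, so both fits describe the same curve and differ
only in which transformation of the AC50 is estimated.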
Problem
Analyzing data like those produced by a cell-based assay is essential to
interpreting what the data mean. However, it is unclear which method
should be used to analyze the data. Both equations, the alternative
model and the standard logistic 4-parameter model, were written to fit
curves that follow the Hill equation. In terms of the accuracy of the
AC50 parameter estimates from the model fitting, which of the two
equations better fits a data set produced from the Hill equation itself?
Goals/Hypothesis
The goal of the experiment is to determine which method is better for
fitting a dose-response curve to a data set similar to one that could be
expected from a cell-based assay, that is, one produced from a Hill
equation. Because of its log2 transformation, the alternative model's
approach to estimating the AC50 parameter seems to account for error
more effectively than the standard logistic 4-parameter equation. I
therefore hypothesize that if the standard deviation of the normal error
added to the expected values from a Hill equation is increased, while
the other Hill equation parameters (maximum value, minimum value, and
slope) are held constant, then the alternative model's estimates of the
AC50 parameter will become more accurate relative to those of the
standard logistic 4-parameter equation.
Methods
Using the programming language R (Chambers 2003) and the add-on package
drc, which is used for bioassay analysis, I ran simulations to test both
the alternative model and the standard logistic 4-parameter model. For
the Hill equation from which I generated the test data, I held the
maximum response at 100 percent of positive control, the minimum
response at 0 percent, and the slope at 4 over a range of concentration
values. Normal error was added by drawing each response from a normal
distribution with the expected value as the mean and a chosen standard
deviation. The standard deviation was varied from 0 to 10 in integer
increments. For each standard deviation, both the standard logistic
4-parameter equation and the alternative model were fitted, and the
standard error of the AC50 parameter estimate, used here as the measure
of accuracy, was recorded for each equation in nine trials. The average
of the nine trials was taken to represent the accuracy of the AC50
parameter estimates for that standard deviation.
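The data-generation step can be sketched as follows. The original work
used R and drc; this is a Python illustration only, and it covers the
simulation, not the model fitting. The AC50 of 50, the grid of 15 doses
from 5 to 100, and the function names hill and simulate are assumptions
made for the example, since the slides do not state them.

```python
import random


def hill(dose, ac50, slope=4.0, lower=0.0, upper=100.0):
    """Expected response of a Hill curve at a given dose (dose must be > 0)."""
    return lower + (upper - lower) / (1.0 + (ac50 / dose) ** slope)


def simulate(doses, ac50, sd, rng):
    """Draw one noisy response per dose from Normal(mean=hill(dose), sd=sd)."""
    return [rng.gauss(hill(d, ac50), sd) for d in doses]


# Assumed values, not taken from the slides.
AC50 = 50.0
DOSES = [5.0 + i * (95.0 / 14) for i in range(15)]  # 15 doses from 5 to 100

rng = random.Random(2013)  # fixed seed so the run is repeatable

# Eleven standard deviations (0 to 10), nine simulated data sets for each.
datasets = {sd: [simulate(DOSES, AC50, sd, rng) for _ in range(9)]
            for sd in range(0, 11)}

print(len(datasets), len(datasets[4]), len(datasets[4][0]))
```

Each of the 99 simulated data sets would then be passed to both model
fits, and the standard error of the fitted AC50 recorded.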
Methods (cont.)
After the averages were calculated for all eleven standard deviations,
a separate plot was created for each equation, and a regression line
was fitted to each so that the accuracy of the AC50 parameter estimates
could be projected for higher standard deviations. If the two
regression lines intersected, this would indicate that up to a certain
standard deviation one of the dose-response model equations estimated
the AC50 parameter more accurately, but beyond that standard deviation
the other model equation provided the better estimate. This allowed me
to test my hypothesis, as I could directly see the relationship between
increasing standard deviation and the accuracy of the AC50 parameter
estimates.
Figure 1
[Plot: Hill Function. x-axis: Dose (0 to 100); y-axis: Response (0 to 100).]
This is a model Hill function that was used to simulate the data that
tested the two models. The blue curve is the Hill function without any
normal error, whereas the red points represent the Hill function with
normal error of a chosen standard deviation added. In this case, the
standard deviation is 4.
Table 1
AC50 standard error, by trial (rows) and standard deviation (columns)

Std Dev:       0       1       2       3       4       5       6       7       8       9      10

Trial 1   0.0000  0.1088  0.1847  0.5053  0.8776  0.9027  1.1688  1.0962  1.3599  1.5815  1.6782
Trial 2   0.0000  0.1072  0.1886  0.4541  0.5759  0.9152  1.0264  1.1318  1.3191  1.7698  1.5294
Trial 3   0.0000  0.1101  0.4842  0.3856  0.6051  0.8325  0.9367  1.2728  1.2626  1.6200  1.4427
Trial 4   0.0000  0.1056  0.2116  0.7112  0.4365  0.8944  0.8799  1.2474  1.2025  1.2606  1.5382
Trial 5   0.0000  0.1141  0.3656  0.7554  0.7739  0.9451  0.9895  1.2467  1.2676  1.3560  1.4801
Trial 6   0.0000  0.1073  0.4706  0.3840  0.8740  0.8744  0.9654  1.2351  1.2962  1.4443  1.7880
Trial 7   0.0000  0.0996  0.3445  0.7796  0.7056  0.8458  1.0201  1.2479  1.1218  1.4215  1.7174
Trial 8   0.0000  0.1103  0.2306  0.6018  0.7786  0.8568  0.9834  1.3414  1.4883  1.4464  1.3557
Trial 9   0.0000  0.1083  0.4852  0.4351  0.3554  0.8568  0.8808  1.1643  1.3816  1.5715  1.6060

Avg:      0.0000  0.1079  0.3295  0.5569  0.6647  0.8804  0.9835  1.2204  1.3000  1.4968  1.5706

This table lists the AC50 standard errors for nine trials of the logistic
4-parameter model. The average AC50 standard error for each standard
deviation is at the bottom of its column.
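As a small check of how the Avg row is formed, the nine trial values in
one column can be averaged directly. The sketch below uses the standard
deviation 1 column of Table 1; the variable names are illustrative.

```python
from statistics import mean

# AC50 standard errors of the nine trials at standard deviation 1 (Table 1).
sd1_trials = [0.1088, 0.1072, 0.1101, 0.1056, 0.1141,
              0.1073, 0.0996, 0.1103, 0.1083]

avg = mean(sd1_trials)
print(round(avg, 4))  # 0.1079, the Avg entry for that column
```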
Table 2
AC50 standard error, by trial (rows) and standard deviation (columns)

Std Dev:       0       1       2       3       4       5       6       7       8       9      10

Trial 1   0.0000  0.1778  0.2642  0.7292  0.9814  0.9416  1.1568  1.0942  1.3594  1.5925  1.6806
Trial 2   0.0000  0.1708  0.2499  0.5934  0.8911  0.9219  1.0475  1.1660  1.3362  1.7678  1.5206
Trial 3   0.0000  0.1715  0.2870  0.6008  0.6630  0.8358  0.9663  1.2747  1.2784  1.6158  1.4857
Trial 4   0.0000  0.1727  0.2304  0.8109  0.6125  0.8984  0.8828  1.2430  1.2125  1.2861  1.5528
Trial 5   0.0000  0.1772  0.2408  0.9933  0.9187  0.9573  0.9931  1.3091  1.2610  1.3928  1.4876
Trial 6   0.0000  0.1708  0.2758  0.4578  1.1011  0.8820  0.9902  1.2389  1.3340  1.4736  1.8113
Trial 7   0.0000  0.1513  0.2241  1.0280  0.8663  0.9039  1.0032  1.2661  1.1197  1.4390  1.7095
Trial 8   0.0000  0.1853  0.2462  0.8098  0.7801  0.8903  1.0225  1.3510  1.5024  1.4391  1.3618
Trial 9   0.0000  0.1665  0.2557  0.7028  0.6538  0.8903  0.8483  1.1767  1.4214  1.5578  1.6386

Avg:      0.0000  0.1716  0.2527  0.7473  0.8298  0.9024  0.9901  1.2355  1.3139  1.5072  1.5832

This table lists the AC50 standard errors for nine trials of the
alternative model. The average AC50 standard error for each standard
deviation is at the bottom of its column.
Figure 2
[Plot: Avg AC50 Standard Error vs. Std Dev. x-axis: Standard Deviation (0 to 12); y-axis: Avg AC50 Standard Error (0 to 1.8); fitted line: y = 0.1633x + 0.0116.]
This graph shows the logistic 4-parameter model AC50 standard error
results. The graph plots average AC50 standard error versus standard
deviation. More importantly, the linear regression is given as
y = 0.1633x + 0.0116.
Figure 3
[Plot: Avg AC50 Standard Error vs. Std Dev. x-axis: Standard Deviation (0 to 12); y-axis: Avg AC50 Standard Error (0 to 1.8); fitted line: y = 0.1598x + 0.0677.]
This graph shows the alternative model AC50 standard error results. The
graph plots average AC50 standard error versus standard deviation. More
importantly, the linear regression is given as y = 0.1598x + 0.0677.
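The two regression lines can be recomputed from the Avg rows of Tables 1
and 2 with ordinary least squares. The sketch below is a Python
illustration (the original analysis was not done in Python), and
fit_line is a hypothetical helper name. It assumes the lines in Figures
2 and 3 were fitted to those eleven averages.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


sds = list(range(11))  # standard deviations 0 to 10

# Avg rows copied from Table 1 (logistic) and Table 2 (alternative).
logistic_avg = [0.0000, 0.1079, 0.3295, 0.5569, 0.6647, 0.8804,
                0.9835, 1.2204, 1.3000, 1.4968, 1.5706]
alternative_avg = [0.0000, 0.1716, 0.2527, 0.7473, 0.8298, 0.9024,
                   0.9901, 1.2355, 1.3139, 1.5072, 1.5832]

for name, ys in (("logistic", logistic_avg), ("alternative", alternative_avg)):
    slope, intercept = fit_line(sds, ys)
    print(f"{name}: y = {slope:.4f}x + {intercept:.4f}")
```

Run on the tabulated averages, this gives slopes and intercepts that
round to the values shown in the two figures.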
Discussion
The research conducted in this project led to a better understanding of
the accuracy of two models, the logistic 4-parameter model and the
alternative model, in estimating the AC50 parameter of a Hill function
with normal error. This is a critical step in determining which model is
best for analyzing data from high throughput screening (HTS) cell-based
assays. A future goal of this project is to simulate more than ten sets
of data to obtain more stable results for the accuracy of the parameter
estimates. Another future goal is to branch out from analyzing only the
accuracy of the AC50 parameter estimates of the two models to analyzing
the accuracy of all the parameters and, finally, of the model itself.
Other settings, such as the range of concentrations per chemical, would
also be varied. An important aspect to consider, however, is that HTS
data typically contain fifteen data points or fewer. More methods would
then be analyzed in head-to-head comparisons under this constraint to
determine which statistical method is best for analyzing HTS data.
Conclusion
From the two graphs presented above, for smaller standard deviations the
standard logistic 4-parameter model is more accurate for estimating the
AC50 parameter. However, the regression lines of the two graphs
intersect at a standard deviation of 15.743. This indicates that at a
standard deviation of 16 and beyond, the accuracy of the AC50 parameter
estimates from the alternative model is projected to surpass that of
the standard logistic 4-parameter model.
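The crossing point follows from setting the two regression lines equal.
The worked arithmetic below uses the rounded coefficients printed in
Figures 2 and 3; the symbols m_1, b_1 (logistic) and m_2, b_2
(alternative) are introduced here for illustration.

```latex
m_1 x^{*} + b_1 = m_2 x^{*} + b_2
\quad\Longrightarrow\quad
x^{*} = \frac{b_2 - b_1}{m_1 - m_2}
      = \frac{0.0677 - 0.0116}{0.1633 - 0.1598}
      = \frac{0.0561}{0.0035}
      \approx 16.03
```

With the rounded coefficients the crossing falls near 16.0 rather than
exactly 15.743. Because the slopes differ by only about 0.0035, small
rounding differences in the coefficients shift the result noticeably, so
the reported value presumably comes from unrounded coefficients. Either
way the crossing lies beyond the simulated range of 0 to 10 and is an
extrapolation.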
References
Chambers J. 2003. "What Is R?" The R Project for Statistical Computing. R-project.
     Web. 14 Feb. 2013. <http://www.r-project.org/>.
Hill A.V. 1910. The possible effects of the aggregation of the molecules of hemoglobin on
     its dissociation curves. J Physiol 40.
National Toxicology Program. 2012. "'Toxicology Testing in the 21st Century' - A New
     Strategy." High Throughput Screening Initiative. National Institutes of Health. Web.
     5 Sept. 2012. <http://ntp.niehs.nih.gov/?objectid=06002ADB-F1F6-975E-73B25B4E3F2A41CB>.
Ritz C., Streibig J.C. 2005. Bioassay analysis using R. J Stat Softw 12.
Shockley K.R. 2012. A Three-Stage Algorithm to Make Toxicologically Relevant Activity
     Calls from Quantitative High Throughput Screening Data. Environmental Health
     Perspectives 120.
Xia M., et al. 2008. Compound Cytotoxicity Profiling Using Quantitative High-Throughput
     Screening. Environmental Health Perspectives 116.
Acknowledgements
This research was conducted in the Biostatistics Branch at the National
Institute of Environmental Health Sciences, NIH, DHHS, Research
Triangle Park, NC 27709. Many thanks to Dr. Kissling and Dr. Shockley
for their continued encouragement and guidance throughout this
project.
