Bayesian inference of deterministic population growth models
Luiz Max F. de Carvalho∗
[lmax.procc@gmail.com]
Claudio J. Struchiner [stru@fiocruz.br]
Leonardo S. Bastos [lsbastos@fiocruz.br]
Scientific Computing Programme (PROCC), Oswaldo Cruz Foundation (Fiocruz), Rio de Janeiro,
Brazil
March, 2014
12th EBEB - Atibaia – SP
Nice to meet you!
• BSc. in Microbiology, UFRJ (2013);
• Statistics Assistant, Pan American Health Organization, 2010-2013;
• Currently at PROCC and DME/IM-UFRJ (MSc);
• Soon to be moving to the University of Edinburgh for a PhD in
Evolutionary Biology.
Motivation
• Deterministic models are widely used in science, and in biology in particular:
◦ Population Growth;
◦ Disease Spreading;
◦ Cell and molecular interactions.
• They provide a crude but easily interpretable representation of reality;
• Temperature is a key factor in the growth of many organisms:
◦ Disease-carrying arthropods;
◦ Pathogenic bacteria;
◦ Economically important plants.
• With a deterministic model and some time series data at hand, how do we
learn about the model parameters?
Background
• Consider a deterministic model M(·);
◦ Let $x \in X \subset \mathbb{R}^p$ be the set of model inputs and $y \in Y \subset \mathbb{R}^n$ be the model
outputs. The deterministic model $M(x; \theta) = y$, where $\theta \in \Theta \subset \mathbb{R}^q$ is a
q-dimensional parameter vector, completely specifies the relationship
between x and y (Poole & Raftery, 2000);
◦ In our particular case, we have laid our dirty hands on some data y and
inputs x that we think can be modelled as y = M(x; θ)
• We are now interested in learning about θ
Temperature-dependent Population Growth
• Consider the non-linear ordinary differential equation (Verhulst, 1838):
$$\frac{dP}{dt} = r\left(1 - \frac{P}{K}\right)P \quad\therefore\quad P(t) = \frac{K}{1 + \frac{K - N_0}{N_0}\,e^{-rt}} \tag{1}$$
• We formulate a modified version of (1), with temperature-dependent
parameters
$$P(t, T) = \frac{K(T)}{1 + \frac{K(T) - N_0}{N_0}\,e^{-r(T)\,t}} \tag{2}$$
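As a quick numerical sanity check of equations (1) and (2), the sketch below integrates the logistic ODE with a hand-written fourth-order Runge-Kutta scheme and compares the result with the closed-form solution. The function names, the step count and the values r = 0.4 and K = 700 are illustrative assumptions (N0 = 30 matches the initial population reported for the data later in the deck); this is not code from the original analysis.

```python
import math


def logistic_closed_form(t, r, K, N0):
    """Closed-form solution P(t) of dP/dt = r (1 - P/K) P with P(0) = N0."""
    return K / (1.0 + (K - N0) / N0 * math.exp(-r * t))


def logistic_rk4(t_end, r, K, N0, n_steps=1000):
    """Integrate the logistic ODE from 0 to t_end with classical RK4."""
    def f(p):
        return r * (1.0 - p / K) * p

    h = t_end / n_steps
    p = N0
    for _ in range(n_steps):
        k1 = f(p)
        k2 = f(p + 0.5 * h * k1)
        k3 = f(p + 0.5 * h * k2)
        k4 = f(p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


# Illustrative values (assumptions, not estimates from the talk).
r, K, N0 = 0.4, 700.0, 30.0
exact = logistic_closed_form(35.0, r, K, N0)
numeric = logistic_rk4(35.0, r, K, N0)
```

With a temperature-dependent model, the same check applies at any fixed T by substituting K(T) and r(T) for K and r.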
Temperature-dependent Population Growth (cont.)
• To complete the model specification, we propose two smooth functions of
temperature T:
$$K(T) = c_K \exp\left(-\frac{(T - a_K)^2}{b_K}\right) \tag{3}$$
$$r(T) = c_r \exp\left(-\frac{(T - a_r)^2}{b_r}\right) \tag{4}$$
We want to learn about $\theta = \{a_K, b_K, c_K, a_r, b_r, c_r\}$.
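Equations (2) to (4) translate almost line by line into code. The sketch below is a minimal illustration: the function names and the dictionary layout for θ are assumptions, the parameter values are the "true" values listed in the simulation study table of this deck, N0 = 30 is the initial population reported for the Rhodnius prolixus data, and the temperature of 28 °C is an arbitrary choice.

```python
import math


def K_of_T(T, a_K, b_K, c_K):
    """Carrying capacity K(T), equation (3)."""
    return c_K * math.exp(-((T - a_K) ** 2) / b_K)


def r_of_T(T, a_r, b_r, c_r):
    """Growth rate r(T), equation (4)."""
    return c_r * math.exp(-((T - a_r) ** 2) / b_r)


def population(t, T, theta, N0):
    """Temperature-dependent logistic curve P(t, T), equation (2)."""
    K = K_of_T(T, theta["a_K"], theta["b_K"], theta["c_K"])
    r = r_of_T(T, theta["a_r"], theta["b_r"], theta["c_r"])
    return K / (1.0 + (K - N0) / N0 * math.exp(-r * t))


# Values taken from the simulation study table; the layout is an assumption.
theta_star = {"a_K": 30.0, "b_K": 10.0, "c_K": 700.0,
              "a_r": 23.0, "b_r": 15.0, "c_r": 0.4}

# Growth curve at a hypothetical 28 degrees C, sampled every 5 days up to day 35.
curve = [population(t, 28.0, theta_star, N0=30.0) for t in range(0, 36, 5)]
```

Here $a$ sets the temperature at which each function peaks, $c$ sets the peak height and $b$ controls how quickly the function falls off away from the peak.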
Likelihood
• Assume P(t, T) to be a Gaussian process with fixed variance $\tau^2$;
• Let $y = \{y_1, y_2, \ldots, y_N\}$ be an output vector with N measurements,
which we observe directly;
• Moreover, let $t = \{t_1, t_2, \ldots, t_N\}$ and $T = \{T_1, T_2, \ldots, T_N\}$ be the vectors
of observed times and temperatures. Then
$$y_i \mid t_i, T_i, N_0, \theta \sim \mathcal{N}\left(\mu(t_i, T_i, N_0; \theta), \tau^2\right) \tag{5}$$
$$\mu(t_i, T_i, N_0; \theta) = \frac{K(T_i; \theta_K)}{1 + \frac{K(T_i; \theta_K) - N_0}{N_0}\,e^{-r(T_i; \theta_r)\,t_i}}, \quad \forall i = 1, 2, \ldots, N \tag{6}$$
which is equivalent to writing $y_i = M(t_i, T_i, N_0; \theta) + \varepsilon_i$, $\varepsilon_i \sim \mathcal{N}(0, \tau^2)$.
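Under equations (5) and (6) the log-likelihood is a sum of independent Gaussian log-densities centred on the logistic mean. The following is a minimal sketch, not the authors' Stan program: the function names and the dictionary layout for θ are assumptions, and the parameter values and three-point data set are made up purely to exercise the code.

```python
import math


def mu(t, T, N0, theta):
    """Mean function of equation (6), using K(T) and r(T) from (3) and (4)."""
    K = theta["c_K"] * math.exp(-((T - theta["a_K"]) ** 2) / theta["b_K"])
    r = theta["c_r"] * math.exp(-((T - theta["a_r"]) ** 2) / theta["b_r"])
    return K / (1.0 + (K - N0) / N0 * math.exp(-r * t))


def log_likelihood(y, t, T, N0, theta, tau2):
    """Sum of independent N(mu_i, tau2) log-densities, equation (5)."""
    total = 0.0
    for y_i, t_i, T_i in zip(y, t, T):
        resid = y_i - mu(t_i, T_i, N0, theta)
        total += -0.5 * math.log(2.0 * math.pi * tau2) - 0.5 * resid ** 2 / tau2
    return total


# Hypothetical inputs, chosen only for illustration.
theta = {"a_K": 30.0, "b_K": 10.0, "c_K": 700.0,
         "a_r": 23.0, "b_r": 15.0, "c_r": 0.4}
t = [0.0, 10.0, 20.0]
T = [25.0, 25.0, 25.0]
y_exact = [mu(t_i, T_i, 30.0, theta) for t_i, T_i in zip(t, T)]
y_off = [v + 5.0 for v in y_exact]  # shift every observation by 5 units

ll_exact = log_likelihood(y_exact, t, T, 30.0, theta, tau2=10.0)
ll_off = log_likelihood(y_off, t, T, 30.0, theta, tau2=10.0)
```

Shifting each observation by 5 units lowers the log-likelihood by 3 × 25 / (2 × 10) = 3.75, which is the quadratic penalty the Gaussian error model places on residuals.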
Priors
• Biologically motivated, proper priors, elicited to maintain functional form
while remaining diffuse.
$a_K, a_r \sim \text{Normal}(20, 10)$
$b_K, b_r \sim \text{Gamma}(4, 1/5)$
$c_K \sim \text{Gamma}(1, 1/1000)$
$c_r \sim \text{Normal}(1/2, 2)$
$\tau^2 \sim \text{Gamma}(1/10, 1/10)$
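The joint log-prior is the sum of the individual log-densities. The sketch below writes them out by hand under two labeled assumptions about the notation on this slide: Normal(m, s) is read as mean m and standard deviation s, and Gamma(a, b) as shape a and rate b (so that Gamma(4, 1/5) has mean 20 and Gamma(1, 1/1000) has mean 1000, which agrees with the prior means reported in the results table). The function names are hypothetical.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log-density of a normal distribution (mean, standard deviation)."""
    return -0.5 * math.log(2.0 * math.pi * sd ** 2) - 0.5 * ((x - mean) / sd) ** 2


def gamma_logpdf(x, shape, rate):
    """Log-density of a gamma distribution (shape, rate); -inf outside x > 0."""
    if x <= 0.0:
        return -math.inf
    return (shape * math.log(rate) - math.lgamma(shape)
            + (shape - 1.0) * math.log(x) - rate * x)


def log_prior(theta, tau2):
    """Joint log-prior for the six curve parameters and the variance tau2."""
    return (
        normal_logpdf(theta["a_K"], 20.0, 10.0)
        + normal_logpdf(theta["a_r"], 20.0, 10.0)
        + gamma_logpdf(theta["b_K"], 4.0, 1.0 / 5.0)
        + gamma_logpdf(theta["b_r"], 4.0, 1.0 / 5.0)
        + gamma_logpdf(theta["c_K"], 1.0, 1.0 / 1000.0)
        + normal_logpdf(theta["c_r"], 0.5, 2.0)
        + gamma_logpdf(tau2, 0.1, 0.1)
    )


# Hypothetical evaluation point.
theta = {"a_K": 30.0, "b_K": 10.0, "c_K": 700.0,
         "a_r": 23.0, "b_r": 15.0, "c_r": 0.4}
lp = log_prior(theta, tau2=10.0)
```

Adding this quantity to a Gaussian log-likelihood gives the unnormalized log-posterior that a sampler explores.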
Posterior
• From Bayes' theorem:
$$p(\theta \mid y, t, T) \propto p(y \mid \theta, t, T)\,\pi(\theta \mid t, T) \tag{7}$$
• The model for P(t, T) is thus hierarchical and depends on two latent
quantities, r(T) and K(T).
Posterior Approximation – Stan
• Hamiltonian Monte Carlo (HMC):
◦ Avoids Random Walk behaviour;
◦ Allows big moves, with high acceptance probability;
◦ Does not suffer from highly correlated posteriors;
• We used the Stan package for the R statistical computing environment
to approximate the posterior through HMC:
◦ Fast C++ implementation;
◦ No-U-Turn Sampler (NUTS);
◦ Neat BUGS-like syntax for model specification;
◦ Smooth interface with R.
• MCMC was run for 50,000 iterations with 25,000 burn-in, and
convergence was assessed by inspecting the trace and autocorrelation
plots and the potential scale reduction factor.
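The potential scale reduction factor compares between-chain and within-chain variance for a scalar quantity; values close to 1 suggest the chains agree. The sketch below implements the classic Gelman-Rubin form for equal-length chains. It is an illustration under assumptions: Stan reports its own variant of this statistic, and the function name and the synthetic chains are hypothetical.

```python
import random
import statistics


def potential_scale_reduction(chains):
    """Classic Gelman-Rubin R-hat for equal-length chains of one scalar."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance(chain_means)
    var_plus = (n - 1) / n * within + between / n
    return (var_plus / within) ** 0.5


rng = random.Random(1)
# Four chains drawn from the same distribution: should look converged.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
# Four chains stuck around different locations: should look unconverged.
stuck = [[rng.gauss(3.0 * k, 1.0) for _ in range(2000)] for k in range(4)]

rhat_mixed = potential_scale_reduction(mixed)
rhat_stuck = potential_scale_reduction(stuck)
```

In practice the statistic would be computed per parameter from the post-burn-in draws of each chain.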
Results I – Simulation study
• From a set of parameters $\theta^*$ and a grid $\{t, T\}$ we generate Q data sets
of size $N = n_t \times n_T$ by sampling from $y \mid \theta^*, t, T$;
• We then obtain Q posterior estimates and calculate the MSE, normalized
bias and coverage probability of the 95% credible intervals;
Parameter | Value  | Posterior Mean | Bias | MSE     | Coverage
$a_K$     | 30.00  | 29.44          | 0.01 | 7.99    | 0.93
$a_r$     | 23.00  | 22.71          | 0.00 | 3.31    | 0.86
$b_K$     | 10.00  | 13.08          | 0.95 | 450.26  | 0.94
$b_r$     | 15.00  | 16.82          | 0.22 | 16.77   | 0.88
$c_K$     | 700.00 | 692.17         | 0.09 | 7203.06 | 0.96
$c_r$     | 0.40   | 0.43           | 0.00 | 0.04    | 0.85
$\tau$    | 3.16   | 4.89           | 0.94 | 67.02   | 0.88
◦ Consistent results for N = 350, 630 and 1600.
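The summaries in a simulation study of this kind can be computed from the Q replicate fits as sketched below. The slides do not state how the bias was normalized, so the definition used here (bias of the average estimate divided by the magnitude of the true value) is an assumption and may not reproduce the Bias column above. The function name and the four toy replicates are hypothetical.

```python
import statistics


def simulation_summary(estimates, intervals, true_value):
    """Summarise Q replicate fits for one parameter.

    estimates: Q posterior means.
    intervals: Q (lower, upper) 95% credible intervals.
    """
    q = len(estimates)
    mean_est = statistics.fmean(estimates)
    # Assumed normalisation: bias relative to the magnitude of the true value.
    bias = (mean_est - true_value) / abs(true_value)
    mse = statistics.fmean([(e - true_value) ** 2 for e in estimates])
    # Coverage: fraction of intervals that contain the true value.
    covered = sum(lo <= true_value <= hi for lo, hi in intervals)
    return {"mean": mean_est, "bias": bias, "mse": mse, "coverage": covered / q}


# Toy replicates for a parameter whose true value is 30 (made-up numbers).
estimates = [29.0, 30.5, 31.0, 29.5]
intervals = [(27.0, 31.0), (28.5, 32.5), (30.2, 33.0), (27.5, 31.5)]
summary = simulation_summary(estimates, intervals, true_value=30.0)
```

For a well-calibrated procedure the coverage of 95% intervals should be close to 0.95, which is the benchmark for reading the Coverage column.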
Results II – Rhodnius prolixus data
• An important vector of Chagas disease;
◦ We fear climate change may increase suitability in previously uncolonized
areas;
• Population sizes were measured at 10 temperatures in the range 16–38 °C for
35 days ($N_0 = 30$ for all T);
[Figure: (a) K(T); (b) r(T)]
Results II – Rhodnius prolixus data (cont.)
Parameter | Posterior Mean (95% C.I.)   | Prior Mean (95% C.I.)
$a_K$     | 19.23 (17.56 – 21.09)       | 25.00 (5.40 – 44.60)
$a_r$     | 25.73 (25.44 – 26.10)       | 25.00 (5.40 – 44.60)
$b_K$     | 106.17 (75.25 – 137.31)     | 20.00 (5.44 – 43.84)
$b_r$     | 26.77 (22.59 – 32.19)       | 20.00 (5.44 – 43.84)
$c_K$     | 1023.32 (898.28 – 1165.40)  | 1000.00 (25.31 – 3688.87)
$c_r$     | 0.66 (0.58 – 0.76)          | 0.50 (-3.41 – 4.41)
$\tau$    | 177.33 (166.10 – 191.78)    | 1.00 (0.00 – 9.78)
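Prior intervals such as those in the right-hand column follow directly from the quantile functions of the prior distributions. The sketch below works through two rows under labeled assumptions: the normal prior is centred on the prior mean of 25.00 shown in this table with a standard deviation of 10, and the prior on $c_K$ is read as Gamma with shape 1 and rate 1/1000, which is an exponential distribution with a closed-form quantile function. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def normal_interval(mean, sd, level=0.95):
    """Central interval of a normal distribution at the given level."""
    tail = (1.0 - level) / 2.0
    dist = NormalDist(mean, sd)
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


def exponential_interval(rate, level=0.95):
    """Central interval of Gamma(1, rate), i.e. an exponential distribution."""
    tail = (1.0 - level) / 2.0
    # Quantile function of the exponential: q(p) = -log(1 - p) / rate.
    return -math.log(1.0 - tail) / rate, -math.log(tail) / rate


a_lo, a_hi = normal_interval(25.0, 10.0)          # compare with the a_K, a_r rows
c_lo, c_hi = exponential_interval(1.0 / 1000.0)   # compare with the c_K row
```

Both computed intervals land within about 0.01 of the tabulated values, and comparing them with the much narrower posterior intervals shows how strongly the data update the priors.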
Results II – Rhodnius prolixus data (cont.)
[Figure: (c) Data; (d) Posterior]
Conclusions and Perspectives
• We stress the importance of using Bayesian Inference to learn about
model parameters
◦ Parameters retain direct interpretability
• Stan
◦ Efficient sampling through HMC;
◦ NUTS drops the need for hand-tuning;
◦ Considerable speed-up and quicker convergence.
• Perspectives
◦ Dynamic variance, $\tau^2(t)$;
◦ Other data sets, e.g., bacterial growth;
◦ Complete treatment of uncertainty: Bayesian melding.
Thank you!
• References
D. Poole and A. E. Raftery, “Inference for deterministic simulation
models: the Bayesian melding approach,” Journal of the American
Statistical Association, vol. 95, no. 452, pp. 1244–1255, 2000.
P.-F. Verhulst, “Notice sur la loi que la population suit dans son
accroissement,” Correspondance Mathématique et Physique (publiée
par A. Quetelet), vol. 10, pp. 113–121, 1838.
• Acknowledgements
◦ My advisors, Claudio and Leo;
◦ Leonardo B. Santos, INPE;
◦ PROCC, who provided an office and a Gourmet coffee machine!
Questions


Bayesian Inference of deterministic population growth models -- Brazilian Meeting on Bayesian Statistics
