Modelling tick dynamics
using volunteer data
Irene GARCIA-MARTI
20th June 2017
This session
• Present a use case of GIS modelling for Public Health:
  • data collection → modelling → production of results
• Volunteered Geographic Information (VGI) plays a key role
• Relate to the course objectives:
  • Visualize spatial patterns of a disease
  • Identify risk factors of a disease
Ticks
Distribution of Lyme disease
http://en.wikipedia.org/wiki/Lyme_disease
Lyme, CT
Evolution of Lyme disease in the Netherlands
Source: RIVM
Tick bites and Lyme cases, 1994 – 2014
[Chart: number of cases (thousands, axis from 0 to 100) for 1994, 2001, 2005, 2009 and 2014; series: Tick Bites, Lyme Disease]
Sources:
• Dutch National Atlas
• Press releases
• (Hofhuis et al, 2015)
• Highlights:
  • Decrease in tick bite consultations
  • Increase (gentler) in the cases of Lyme disease
• Causes:
  • More awareness and more people removing ticks themselves
  • More carelessness among people due to higher tick densities
  • More ticks means more Lyme cases, but there is no increase in the infection rate
Serious matter
Actions in the Netherlands
Source:
Serious matter
• Lyme disease has a high cost:
  • For individuals:
    • Long-lasting sequelae: muscle, joint and brain deterioration
  • For public health agents:
    • Treatment for a potentially chronic disease
• Population at risk:
  • Children and the elderly
Why is this happening?
Causes
• Global changes
  • Weather dynamics
  • Wildlife ecosystems
• Socio-economic changes
  • Heavy urbanization
  • Fragmentation
  • Human leisure
Consequences
• Ecological:
  • Longer season
  • Higher densities
  • New suitable habitats
• Spatial:
  • Northwards expansion
  • More human-tick contact
Challenges
• How to monitor ticks?
• What variables influence the number of ticks?
• Can we predict tick dynamics for each point in NL?
Do we even have data
to start figuring out
these mechanisms?
Volunteered data collection
• Since 2006:
  • A group of volunteers samples 17 locations in NL on a monthly basis
  • They count ticks in their different life stages (larvae, nymphs, adults)
  • First citizen science project of its kind!
Source: WUR
Volunteered data collection
• Volunteer flagging dataset:
  • Three types of habitat: coniferous, deciduous and grasses/bushes
  • One-day tick counts per month between 2006 and 2014 (currently, around 3,000 samples)
Motivation
Now we know the evolution of tick counts in the time series
and we can link it to environmental variables
to train models that predict tick activity…
…and understand main drivers of the phenomenon
Important factors on tick dynamics
From “Lyme disease: The ecology of a complex system”
R.Ostfeld (2012)
Important factors on tick dynamics
• Temperature
  • Starts the questing season
  • Survival through winter
• Precipitation
  • Increases tick survival
  • Prevents tick desiccation
• Vegetation
  • Keeps soil moisture high
  • Prevents tick desiccation
• Wildlife
  • Sustains tick population
Where are the data coming from?
• Temperature: KNMI
• Precipitation: KNMI
• Vegetation:
  • Official product: Land Cover by WUR
  • Remote sensing: MODIS imagery in GEE
• Wildlife: No data
Mostly, raster layers
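The vegetation indices listed above are computed from band reflectances in the imagery. The slides do not say how this was done in GEE, so the following is only a minimal sketch of the standard NDVI formula, (NIR - Red) / (NIR + Red), applied to two small NumPy arrays. The function name `ndvi` and the reflectance values are hypothetical and are not actual MODIS data.

```python
import numpy as np


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).

    Pixels where NIR + Red equals zero are returned as NaN.
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other cells stay NaN.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out


# Hypothetical 2 x 2 reflectance rasters (illustrative values only).
nir_band = np.array([[0.5, 0.4], [0.3, 0.0]])
red_band = np.array([[0.1, 0.2], [0.3, 0.0]])

ndvi_raster = ndvi(nir_band, red_band)
print(ndvi_raster)
```

Dense green vegetation reflects strongly in the near infrared and absorbs red light, so higher NDVI values indicate greener pixels.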
Modelling
Is tick dynamics a linear phenomenon?
Source: (Dantas-Torres & Otranto, 2013)
A single variable alone does not predict tick dynamics accurately
Yet, there should be a non-linear relationship out there!
Modelling
• Machine Learning algorithms:
  • Non-linear models
  • Not based on any previously known assumption
  • They learn from data
  • Capable of working with a huge number of variables
Modelling: Random Forest
• Ensemble learning method for classification and regression
• Based on multiple decision trees (a forest)
• Data are randomized and passed to each tree in the ensemble
• Robust and stable algorithm
[Figure panels: Data and underlying function | Single regression tree | 10 regression trees | Average of 100 regression trees]
Source: Quantitative Economics Tartu Blog
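The idea behind these panels, that averaging many randomized trees gives a smoother and more accurate fit than a single tree, can be reproduced with scikit-learn. This is a minimal sketch under stated assumptions: the noisy sine data are synthetic and unrelated to the tick dataset, and the helper `rmse` is a hypothetical name introduced only for this illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic data (assumption): a smooth underlying function plus noise.
x_train = np.sort(rng.uniform(0.0, 10.0, 200))
y_train = np.sin(x_train) + rng.normal(0.0, 0.3, x_train.size)
x_test = np.linspace(0.0, 10.0, 500)
y_true = np.sin(x_test)

X_train = x_train.reshape(-1, 1)
X_test = x_test.reshape(-1, 1)


def rmse(model):
    """Fit the model and return its RMSE against the noise-free function."""
    pred = model.fit(X_train, y_train).predict(X_test)
    return float(np.sqrt(np.mean((pred - y_true) ** 2)))


# A single fully grown tree follows the noise in the training data.
single_tree_rmse = rmse(DecisionTreeRegressor(random_state=0))

# Forests average trees grown on bootstrap resamples of the same data.
forest_rmse = {
    n: rmse(RandomForestRegressor(n_estimators=n, random_state=0))
    for n in (10, 100)
}

print("single tree:", single_tree_rmse)
print("forests:", forest_rmse)
```

Each tree in the forest sees a different bootstrap sample, so the errors of individual trees partly cancel out when their predictions are averaged.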
Modelling: Random Forest (Regression)
$y = f(x_1, x_2, \ldots, x_n)$
1) Given a set of predictor variables X:
  • Weather predictors: T, P, EV, RH, SD, VP
  • Vegetation: NDVI, EVI, NDWI
  • Land use and land cover
2) Given a continuous response variable Y:
  • The number of active questing ticks
Build a model to predict the value of Y for a new array of X
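A regression set-up of this kind could look roughly as follows in scikit-learn. This is a sketch, not the model from the talk: the column names reuse a subset of the abbreviations on the slide (their exact definitions are not given here), all values are random placeholders, the response is a synthetic count, and coding land cover as integers 0-2 is a simplifying assumption.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_samples = 500

# Hypothetical predictor table X: names follow the slide, values are random.
X = pd.DataFrame({
    "T": rng.normal(12.0, 6.0, n_samples),
    "P": rng.gamma(2.0, 1.5, n_samples),
    "RH": rng.uniform(60.0, 100.0, n_samples),
    "NDVI": rng.uniform(0.2, 0.9, n_samples),
    "EVI": rng.uniform(0.1, 0.7, n_samples),
    "NDWI": rng.uniform(-0.2, 0.5, n_samples),
    "land_cover": rng.integers(0, 3, n_samples),  # three classes coded 0-2
})

# Synthetic response Y standing in for tick counts (assumption).
rate = np.clip(
    2.0 + 0.8 * X["T"].to_numpy() + 10.0 * X["NDVI"].to_numpy(), 0.1, None
)
y = rng.poisson(rate)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# R^2 on held-out samples and the relative importance of each predictor.
r2_test = model.score(X_test, y_test)
importances = pd.Series(
    model.feature_importances_, index=X.columns
).sort_values(ascending=False)

# Predict Y for a new array of X (here: the first held-out row).
new_prediction = model.predict(X_test.iloc[[0]])

print("R^2 on test set:", r2_test)
print(importances)
```

The feature importances give a first indication of which predictors drive the response, which connects to the goal of understanding the main drivers of the phenomenon.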
Modelling: The recipe
for each flagging_site:
    for each sampling_date:
        # Time to get our environmental predictors!
        Get mean weather in the 7 days before the sampling date
        Get vegetation indices (NDVI, EVI, NDWI) for the site and sampling date
        Get land cover of the site
Build a table with 3,000 rows and 7 columns
Model with the Random Forest for Regression
Visualize results in the geographic space
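The table-building loop of the recipe can be sketched with pandas. Everything in this example is hypothetical: the site names, the daily weather values, the tick counts and the helper `mean_weather_before` are placeholders, and only two weather variables are used. Vegetation indices and land cover would be added as further columns in the same way.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Hypothetical daily weather per flagging site (placeholder values).
days = pd.date_range("2014-05-01", "2014-06-30", freq="D")
weather = pd.DataFrame({
    "site": np.repeat(["site_A", "site_B"], len(days)),
    "date": np.tile(days.to_numpy(), 2),
    "T": rng.normal(15.0, 4.0, 2 * len(days)),
    "P": rng.gamma(2.0, 1.0, 2 * len(days)),
})

# Hypothetical volunteer samples: one tick count per site and sampling date.
samples = pd.DataFrame({
    "site": ["site_A", "site_B", "site_A"],
    "date": pd.to_datetime(["2014-06-01", "2014-06-01", "2014-06-29"]),
    "tick_count": [12, 30, 7],
})


def mean_weather_before(site, date, n_days=7):
    """Mean of each weather variable over the n_days before the sampling date."""
    start = date - pd.Timedelta(days=n_days)
    window = weather[
        (weather["site"] == site)
        & (weather["date"] >= start)
        & (weather["date"] < date)
    ]
    return window[["T", "P"]].mean()


rows = []
for sample in samples.itertuples(index=False):
    means = mean_weather_before(sample.site, sample.date)
    rows.append({
        "site": sample.site,
        "date": sample.date,
        "T_7d": means["T"],
        "P_7d": means["P"],
        "tick_count": sample.tick_count,
    })

# One row per sample: predictors plus the response, ready for modelling.
table = pd.DataFrame(rows)
print(table)
```

The window excludes the sampling day itself, which is one reading of "the 7 days before the sampling"; including that day would be an equally plausible choice.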
Tools
Python
GDAL
scikit-learn
matplotlib
QGIS/ArcGIS
Modelling: General performance
Visualizing: Back to geographic space
Predicted tick activity
June 1st, 2014
• The trained model is applied to each forested pixel of the Netherlands
• Interpretation:
  • The provinces of Drenthe and Groningen showed high tick activity on that day
  • The Randstad area showed the lowest tick activity
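Applying a trained model to every forested pixel amounts to reshaping the stack of raster layers into a table with one row per pixel, predicting, and reshaping back into a map. The sketch below assumes the layers are already loaded as NumPy arrays (in practice they might be read with GDAL). The stand-in model, the random raster stack and the forest mask are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)

# Stand-in model trained on random data (assumption, not the talk's model).
n_features = 3
X_train = rng.uniform(0.0, 1.0, (300, n_features))
y_train = 10.0 * X_train[:, 0] + rng.normal(0.0, 0.5, 300)
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Hypothetical raster stack for one day, shaped (bands, rows, cols),
# plus a boolean mask marking the forested pixels.
n_rows, n_cols = 40, 60
stack = rng.uniform(0.0, 1.0, (n_features, n_rows, n_cols))
forest_mask = rng.random((n_rows, n_cols)) < 0.3

# One row per pixel, one column per predictor layer.
pixels = stack.reshape(n_features, -1).T
flat_mask = forest_mask.ravel()

# Predict only on forested pixels; everything else stays NaN (no data).
prediction = np.full(n_rows * n_cols, np.nan)
prediction[flat_mask] = model.predict(pixels[flat_mask])
prediction_map = prediction.reshape(n_rows, n_cols)

print(prediction_map.shape, int(flat_mask.sum()), "forested pixels predicted")
```

The resulting array can be written back to a georeferenced raster and styled in QGIS or ArcGIS, or plotted directly with matplotlib.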
Conclusions
• The collective effort of volunteers can be used to devise models capturing tick dynamics at the country level
• Contribution intended to help design public health campaigns:
  • Reduce the incidence of Lyme disease
  • Increase citizens' awareness of this problem
  • Deliver results on platforms such as tekenradar.nl
Discussion
• Do any of you have experience in modelling species?
• What kind of spatial modelling is done in your countries?
• What else would you include in the analysis?
Thanks!
Questions?
