GenGIS 2
New approaches to understand the
geography of our microbial world
Rob Beiko
Donovan Parks
Timothy Mankowski
Mike Porter
Brett O'Donnell
demo: the GenGIS environment
2-24 GenGIS v1: Parks et al. (2009) Genome Res
GenGIS v1 overview

• Interfaces:
  – GUI (wxPython)
  – Scripting interface (Python, R)
• Core application (C++)
• Output:
  – Saved image files
• Data:
  – Map – many formats (GDAL)
  – Samples – CSV
  – Sequences – CSV
  – Trees – Newick

 Crossing minimization + statistical test
 Supported platforms: Windows XP, Vista, 7; OS X 10.4, 10.5, 10.6
 Open source: Creative Commons Attribution – Share Alike 3.0
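The "crossing minimization + statistical test" item can be illustrated with a minimal sketch. It assumes each sample appears once as a tree leaf and once along a geographic axis, so that the number of line crossings equals the number of sample pairs ordered differently in the two lists. It is not GenGIS code: the function names are hypothetical, and it does not search over tree node rotations or handle several samples at one site. The permutation test simply asks how often a random geographic ordering gives as few crossings as the observed one.

```python
import random


def count_crossings(leaf_order, site_order):
    """Count crossings between lines joining each tree leaf to its site.

    leaf_order: sample IDs in the order they appear as tree leaves.
    site_order: the same IDs in their order along a geographic axis.
    Two lines cross exactly when a pair of samples is ordered
    differently in the two lists.
    """
    rank = {sample: i for i, sample in enumerate(site_order)}
    ranks = [rank[sample] for sample in leaf_order]
    crossings = 0
    for i in range(len(ranks)):
        for j in range(i + 1, len(ranks)):
            if ranks[i] > ranks[j]:
                crossings += 1
    return crossings


def permutation_p_value(leaf_order, site_order, n_perm=1000, seed=0):
    """Estimate how often a random site ordering gives as few crossings."""
    rng = random.Random(seed)
    observed = count_crossings(leaf_order, site_order)
    shuffled = list(site_order)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_crossings(leaf_order, shuffled) <= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    leaves = ["s1", "s2", "s3", "s4", "s5", "s6"]  # hypothetical sample IDs
    sites = ["s1", "s3", "s2", "s4", "s5", "s6"]
    print(count_crossings(leaves, sites))
    print(permutation_p_value(leaves, sites))
```

A small p-value here would suggest that the tree order follows the geographic axis more closely than expected by chance.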
what's new in v2

• Interfaces:
  – GUI (wxPython)
  – Scripting interface (Python, R)
  – Python plugins
• Core application (C++)
• Output:
  – Saved image files
  – Save / restore sessions
• Data:
  – Map – many formats (GDAL)
  – Samples – CSV
  – Sequences – CSV
  – Trees – Newick
  – External files
Stability improvements, various things now work properly on the Mac
Interface updates (legends, data visualizations)
Linear axes analysis
bringing map data into GenGIS
• Maps:
  – MapMaker (included application)
  – Digital elevation data (Geobase.ca, NASA Shuttle Radar Topography Mission data, etc.)
  – Images (.png, .tif, etc.)
three views of the LineP transect
Original data: Jody Wright, Steven Hallam
diversity and depth
clustering based on Canberra
beta-diversity
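For reference, the Canberra distance between two abundance profiles can be sketched as below. This is a minimal illustration and not the code used for the slide: the function names and the example counts are hypothetical, and terms where both values are zero are skipped, which is one common convention. The resulting pairwise distances would then be passed to a clustering method of choice.

```python
def canberra(x, y):
    """Canberra distance: sum of |a - b| / (|a| + |b|) over all taxa.

    Terms where both abundances are zero are skipped (treated as 0).
    """
    if len(x) != len(y):
        raise ValueError("profiles must have the same length")
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom > 0:
            total += abs(a - b) / denom
    return total


def distance_matrix(profiles):
    """Pairwise Canberra distances for a dict of sample name -> profile."""
    names = sorted(profiles)
    return {(p, q): canberra(profiles[p], profiles[q])
            for p in names for q in names}


if __name__ == "__main__":
    # Hypothetical taxon counts for three samples along a depth gradient.
    samples = {
        "depth_10m": [12, 0, 5, 1],
        "depth_500m": [3, 7, 5, 0],
        "depth_1000m": [0, 9, 2, 0],
    }
    for pair, dist in sorted(distance_matrix(samples).items()):
        print(pair, round(dist, 3))
```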
relative abundance of SUP05
demo: plugins and R scripts



• Linear regression of group frequencies
• Heatmap RPy2 script

Original data:
10-29                    Costello et al. Science 326:1694-1697
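The regression step in the demo can be illustrated with an ordinary least-squares fit. The sketch below is not the plugin's code: the function name and the example values are hypothetical, and it assumes one predictor (for example a position along a gradient) and one response (the relative frequency of a group at each site).

```python
def linear_regression(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x values are all identical")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 is undefined when y does not vary at all.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else float("nan")
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Hypothetical values: site position vs. relative frequency of one group.
    position = [0.0, 1.0, 2.0, 3.0, 4.0]
    frequency = [0.10, 0.18, 0.33, 0.41, 0.52]
    slope, intercept, r2 = linear_regression(position, frequency)
    print(slope, intercept, r2)
```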
clustering of fecal samples

Female subjects: F1 – F3
Male subjects: M1 – M3

Two sampling methods:
       - TP
       - Direct from feces

Two time points

= 4 samples per individual. Do these
samples cluster with each other?
Beiko gen gis2-share
Wood Buffalo National Park


      • Canada’s largest National Park

      • UNESCO World Heritage status (Boreal Forest)

      • Threatened by encroaching development
          – Oil Sands mining (Alberta)
          – Metal mining (NWT)
          – Hydro-electric dams (Peace River, BC)

      • Natural resources sustain traditional use by Métis and
      First Nations peoples


Photos: D Baird
biomonitoring 2.0
    what is being collected

•   Benthic invertebrates (COI, 28S) – kick sample
•   Water (16S, 18S, 28S) – 1 L volume
•   Soil (16S, COI, ITS, 18S, 28S, rbcL) – cores
•   Terrestrial arthropods (COI, 28S) – Malaise / pitfall traps



• All samples replicated 3 times
• 5 time points in initial study
• Lots of metadata (soil chemistry,
  flooding, etc.)
biomonitoring 2.0
replication results – 2010 trial
biomonitoring 2.0
sampling progress


• August 2011
   • Samples collected, starting analysis of sequences
   • 'traditional' taxonomy where applicable (arthropods yes, bacteria no)
• June 2012
   • Samples collected
• Future sampling: August 2012, June – August 2013
biomonitoring 2.0
our three-year mission (and beyond)

• Develop robust sampling techniques for sequence-based biomonitoring
• Develop and apply different approaches for assessing biodiversity (taxon-based and taxon-free), and compare their performance on WBNP data
• Identify whether “reference conditions” can be established against which future samples can be compared
call for collaborators
• Currently underway:
  – Combined axis tests (Many trees, one optimal gradient)
  – Regional tests of diversity
  – Canonical correlation analysis and related methods
  – Bio2.0 analysis
• Goals:
  – Integrate with online data sources
  – Support more data types (especially vector data)
  – More plugins!
the long-term goal




• Inputs: online data sources with APIs + local data
• Automated dataset generation / visualization
• Analysis:
  – Geo gradients
  – Diversity vs. habitat
  – Diversity networks
  – Functional models
acknowledgments
GenGIS developers (Dal):
  Donovan Parks, Mike Porter, Timothy Mankowski, Brett O'Donnell, Kathryn Dunphy, Sylvia Churcher, Suwen Wang, Harman Clair, Greg Smolyn, Stephen Brooks, Christian Blouin, Jacqueline Whalley (Auckland U Tech)
LineP (UBC):
  Jody Wright, Steven Hallam
Bio2.0:
  Mehrdad Hajibabaei (Guelph); Donald Baird, Wendy Monk (UNB); Brian Golding (McMaster); Jeff Shatford (Parks Canada)
New Zealand fungus beetle (Agyrtodes labralis)
COI phylogeny
Ecological niche modelling suggests several glacial refugia; phylogenies suggest transalpine migration
Marske et al. Mol Ecol (2009)
Data shown in GenGIS
map
locations
sequence summaries
tree vs geography
axes test
body site data
linear regression
heatmaps using R
