Community           Integration        Democratization




            Biopython: challenges

                 Brad Chapman
                   Peter Cock
              Biopython contributors
             http://biopython.org


                  10 July 2010




    3 challenges for successful open source projects

            Community
            Integration
            Democratization



Distributed code access



Recruiting and training
    Google Summer of Code

            2009   Eric Talevich
                   phyloXML; Bio.Phylo
                   Nick Matzke
                   Biogeographical Phylogenetics
            2010   João Rodrigues
                   Structural biology; Bio.PDB



Answering questions better



Recognizing contributions



Diversity of Python bioinformatics



Interoperability


            Avoid re-implementation
            Convert core objects
            Document workflows with multiple libraries
            Communicate better



Wrapping external tools


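    import subprocess
    from Bio.Blast.Applications import (
            NcbiblastxCommandline)
    # Build the blastx command line; outfmt=5 requests XML output
    cl = NcbiblastxCommandline(query="opuntia.fasta",
            db="nr", evalue=0.001, outfmt=5,
            out="opuntia.xml")
    # str(cl) is one command string, so run it through the shell
    subprocess.call(str(cl), shell=True)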
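
The wrapping idea on this slide can be sketched without Biopython or BLAST installed. This is only an illustration, not how Bio.Blast.Applications is implemented: build_command is a hypothetical helper that assumes every option maps to a single-dash flag followed by its value, and the final call runs the Python interpreter as a stand-in for an external tool.

```python
import shlex
import subprocess
import sys


def build_command(program, **options):
    """Turn keyword options into a list of command-line arguments.

    Hypothetical helper, not part of Biopython: each option name becomes
    a single-dash flag followed by its value.
    """
    args = [program]
    for name, value in options.items():
        args.extend(["-" + name, str(value)])
    return args


# Same options as on the slide; this only builds the command, it runs nothing.
blastx = build_command("blastx", query="opuntia.fasta", db="nr",
                       evalue=0.001, outfmt=5, out="opuntia.xml")
print(shlex.join(blastx))

# Passing a list of arguments to subprocess avoids shell quoting issues.
# The current Python interpreter stands in for the external tool here.
exit_code = subprocess.call([sys.executable, "-c", "print('tool ran')"])
print("exit code:", exit_code)
```

Keeping the arguments as a list until the last moment means file names with spaces need no manual quoting.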



Documenting standards



Making code easier to use

    >>> from Bio import SeqIO
    >>> memory_dict = SeqIO.index("in.gb", "genbank")
    >>> memory_dict.keys()
    ['Z78484.1', ... 'Z78471.1']
    >>> seq_record = memory_dict["Z78475.1"]
    >>> print seq_record.description
    P.supardii 5.8S rRNA gene and ITS1 and ITS2 DNA
    >>> seq_record.seq
    Seq('CGTAACAAGGTTTCCGTAGGTGAACCTGCGGAAGG...GGT',
            IUPACAmbiguousDNA())
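
One way a dictionary-like index over a large file can work is to scan the file once, remember the byte offset of each record, and parse a record only when it is requested. The sketch below shows that technique for FASTA using only the standard library. It is a toy under stated assumptions, not Biopython's actual SeqIO.index code: the class name FastaOffsetIndex and the two records are made up for illustration.

```python
import os
import tempfile


class FastaOffsetIndex:
    """Toy read-only index mapping record id -> byte offset of its '>' line."""

    def __init__(self, path):
        self._path = path
        self._offsets = {}
        with open(path, "rb") as handle:
            while True:
                offset = handle.tell()
                line = handle.readline()
                if not line:
                    break
                if line.startswith(b">") and line[1:].strip():
                    # The id is the first word after '>'.
                    record_id = line[1:].split()[0].decode()
                    self._offsets[record_id] = offset

    def keys(self):
        return list(self._offsets)

    def __getitem__(self, record_id):
        """Seek to one record and parse only that record."""
        with open(self._path, "rb") as handle:
            handle.seek(self._offsets[record_id])
            description = handle.readline()[1:].decode().strip()
            chunks = []
            for line in handle:
                if line.startswith(b">"):
                    break
                chunks.append(line.decode().strip())
        return description, "".join(chunks)


# Made-up example data written to a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">seq1 first made-up record\nACGT\nACGT\n"
              ">seq2 second made-up record\nGGCC\n")

index = FastaOffsetIndex(tmp.name)
ids = index.keys()
first = index["seq1"]
second = index["seq2"]
os.remove(tmp.name)
print(ids, first, second)
```

Only the ids and offsets stay in memory, so the cost of the index grows with the number of records rather than with the size of the sequences.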



Challenges of big data



Cloud: easier to distribute

            On-demand computational resources like Amazon EC2
            Provide ready-to-go images
            Biopython and many associated bioinformatics libraries
            Biological data
    http://github.com/chapmanb/bcbb/tree/master/ec2/biolinux/



Following up


       Home http://biopython.org
        Code http://github.com/biopython
       BOSC Talk to Eric, Tiago or myself

Biopython at BOSC 2010
