MicrobeDB Overview

      Morgan Langille
morgan.gi.langille@gmail.com
Main Features
• Centralized storage of, and access to, completed archaeal and bacterial genomes
   – Genomes are obtained from NCBI RefSeq:
     http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi
   – Genome flat files are stored in one central location
      – Includes .gbk, .gff, .fna, .faa files, etc.
   – Unpublished genomes can be added as well

• Information at the genome project, chromosome, and gene level is parsed and stored in a MySQL database

• A Perl MicrobeDB API provides a non-SQL interface to the database
Main MicrobeDB Tables
• Version
   – Each download of genomes from NCBI is given a new version number
   – Data will not change as long as you use the same MicrobeDB version number
   – The version date can be cited in method publications
   – Users can save a version so that it is not automatically deleted
• Genome Project
   – Contains information about the genome project and the organism that was sequenced
   – Each genome project contains one or more replicons
• Replicon
   – A chromosome, plasmid, or contig
   – Each replicon contains one or more genes
• Gene
   – Contains gene annotations plus the DNA sequence and, for protein-coding genes, the protein sequence
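
The nesting of these four tables can be pictured with a small, self-contained Perl sketch. It does not use MicrobeDB or a database at all: the records are hand-written placeholders, and every field name and value in it is an assumption chosen for illustration rather than the real schema. It only shows how one version holds genome projects, which hold replicons, which hold genes.

```perl
use strict;
use warnings;

# Hypothetical, hand-written records mirroring the four-level hierarchy.
# Field names and values are illustrative only, not the real MicrobeDB schema.
my %version = (
    version_id      => 1,
    genome_projects => [
        {
            org_name  => 'Example organism',
            replicons => [
                {
                    rep_type => 'chromosome',
                    genes    => [
                        { gid => 1, gene_type => 'CDS' },
                        { gid => 2, gene_type => 'tRNA' },
                    ],
                },
                {
                    rep_type => 'plasmid',
                    genes    => [ { gid => 3, gene_type => 'CDS' } ],
                },
            ],
        },
    ],
);

# Walk version -> genome project -> replicon and count genes per replicon type.
sub count_genes_by_replicon_type {
    my ($version) = @_;
    my %count;
    for my $gp ( @{ $version->{genome_projects} } ) {
        for my $rep ( @{ $gp->{replicons} } ) {
            $count{ $rep->{rep_type} } += scalar @{ $rep->{genes} };
        }
    }
    return \%count;
}

my $counts = count_genes_by_replicon_type( \%version );
print "$_: $counts->{$_}\n" for sort keys %$counts;
```

Because every level hangs off a single version, pinning the version number fixes everything beneath it, which is why results stay reproducible for a given version.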
MicrobeDB Annotations
Accessing MicrobeDB
• Any standard MySQL client
   – phpMyAdmin
      – Web-based
      – http://phpmyadmin.net
   – MySQL Workbench
      – Local desktop client
      – http://www.mysql.com/products/workbench/

• MicrobeDB Perl API
   – Allows interaction with the database directly from within a Perl script
   – Requires no knowledge of SQL
MySQL Workbench
phpMyAdmin
MicrobeDB API Example
# Load the MicrobeDB search library and the genome project class
# (assumes the GenomeProject class lives in the MicrobeDB:: namespace)
use strict;
use warnings;
use MicrobeDB::Search;
use MicrobeDB::GenomeProject;

# Create the search object
my $search_obj = MicrobeDB::Search->new();

# Create an object with the features we want to match (e.g. only pathogens)
my $obj = MicrobeDB::GenomeProject->new( version_id => '1', patho_status => 'pathogen' );

# Run the search; returns a list of all genome projects that match the search parameters
my @result_objs = $search_obj->object_search($obj);

# Iterate through each matching genome project
foreach my $gp_obj (@result_objs) {

    # Get the organism name of the genome
    my $org_name = $gp_obj->org_name();

    foreach my $gene_obj ( $gp_obj->genes() ) {
        if ( $gene_obj->gene_type() eq 'tRNA' ) {
            # Write the gene in FASTA format with the gid as the identifier
            print '>', $gene_obj->gid(), "\n", $gene_obj->gene_seq(), "\n";
        }
    }
}
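
The example above prints each sequence on a single line. Some downstream tools prefer wrapped FASTA records, so here is a minimal, self-contained sketch of a formatting helper. The name format_fasta and the default width of 60 are assumptions for illustration; neither is part of the MicrobeDB API.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MicrobeDB): build one FASTA record,
# wrapping the sequence at $width characters per line (default 60).
sub format_fasta {
    my ( $id, $seq, $width ) = @_;
    $width ||= 60;
    my $record = ">$id\n";

    # Remove up to $width characters from the front of the sequence each pass.
    while ( length $seq ) {
        $record .= substr( $seq, 0, $width, '' ) . "\n";
    }
    return $record;
}

# Placeholder identifier and sequence, wrapped at 4 characters for display.
print format_fasta( 'example_gid', 'ACGTACGTAC', 4 );
```

Inside the loop from the API example, the print statement could be swapped for print format_fasta( $gene_obj->gid(), $gene_obj->gene_seq() ), on the assumption that gene_seq() returns the sequence as a plain string.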
