Biological Databases
Some databases in the field of molecular biology…
AATDB, AceDb, ACUTS, ADB, AFDB, AGIS, AMSdb,
ARR, AsDb, BBDB, BCGD, Beanref, BioImage,
BioMagResBank, BIOMDB, BLOCKS, BovGBASE,
BOVMAP, BSORF, BTKbase, CANSITE, CarbBank,
CARBHYD, CATH, CAZY, CCDC, CD40Lbase, CGAP,
ChickGBASE, Colibri, COPE, CottonDB, CSNDB, CUTG,
CyanoBase, dbCFC, dbEST, dbSTS, DDBJ, DGP, DictyDb,
Dicty_cDB, DIP, DOGS, DOMO, DPD, DPInteract, ECDC,
ECGC, ECO2DBASE, EcoCyc, EcoGene, EMBL, EMD db,
ENZYME, EPD, EpoDB, ESTHER, FlyBase, FlyView,
GCRDB, GDB, GENATLAS, GenBank, GeneCards,
Genline, GenLink, GENOTK, GenProtEC, GIFTS,
GPCRDB, GRAP, GRBase, gRNAsdb, GRR, GSDB,
HAEMB, HAMSTERS, HEART-2DPAGE, HEXAdb, HGMD,
HIDB, HIDC, HIVdb, HotMolecBase, HOVERGEN, HPDB,
HSC-2DPAGE, ICN, ICTVDB, IL2RGbase, IMGT, Kabat,
KDNA, KEGG, Klotho, LGIC, MAD, MaizeDb, MDB,
Medline, Mendel, MEROPS, MGDB, MGI, MHCPEP,
Micado, MitoDat, MITOMAP, MJDB, MmtDB, Mol-R-Us,
MPDB, MRR, MutBase, MycDB, NDB, NRSub, O-GlycBase,
OMIA, OMIM, OPD, ORDB, OWL, PAHdb, PatBase, PDB,
PDD, Pfam, PhosphoBase, PigBASE, PIR, PKR, PMD,
PPDB, PRESAGE, PRINTS, ProDom, Prolysis, PROSITE,
PROTOMAP, RatMAP, RDP, REBASE, RGP, SBASE,
SCOP, SeqAnalRef, SGD, SGP, SheepMap, Soybase,
SPAD, SRNA db, SRPDB, STACK, StyGene, Sub2D,
SubtiList, SWISS-2DPAGE, SWISS-3DIMAGE,
SWISS-MODEL Repository, SWISS-PROT, TelDB, TGN, tmRDB,
TOPS, TRANSFAC, TRR, UniGene, URNADB, V BASE,
VDRR, VectorDB, WDCM, WIT, WormPep, YEPD, YPD,
YPM, etc.
What we expect from a database
• Sequence, functional, and structural information,
with related bibliography
• Well structured and indexed
• Well cross-referenced (with other databases)
• Periodically updated
• Tools for analysis and visualization
Biological Databases
• Sequence databases
• Structure databases
Sequence databases
• Nucleotide databases
• Protein databases
Sequence databases
Nucleotide databases
• International Nucleotide Sequence
Database Collaboration (INSDC)
– NCBI
– EMBL
– DDBJ
Standard contents of a sequence
database
• Sequences
• Accession number
• References
• Taxonomic data
• Annotation/curation
• Keywords
• Cross-references
• Documentation
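The fields listed above can be pictured as one record per entry. The following is only a sketch of such a record; the class name, field names, and the accession "X00001" are hypothetical and do not reflect any database's actual schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SequenceEntry:
    """Illustrative container for the standard contents of a sequence entry."""
    accession: str                                  # stable identifier of the entry
    sequence: str                                   # nucleotide or protein sequence
    organism: str = ""                              # taxonomic data
    references: List[str] = field(default_factory=list)   # related bibliography
    keywords: List[str] = field(default_factory=list)
    cross_references: Dict[str, str] = field(default_factory=dict)  # database -> identifier
    annotation: str = ""                            # free-text annotation/curation notes

    def to_fasta(self, width: int = 60) -> str:
        """Render the entry as FASTA, wrapping the sequence at `width` characters."""
        header = f">{self.accession} {self.organism}".rstrip()
        chunks = [self.sequence[i:i + width] for i in range(0, len(self.sequence), width)]
        return "\n".join([header, *chunks])


# Made-up example entry: an 80-base sequence wraps onto two lines.
entry = SequenceEntry(accession="X00001", sequence="ATGC" * 20, organism="Example organism")
print(entry.to_fasta())
```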
NCBI
• Very comprehensive collection of biological databases
• GenBank: the nucleotide sequence database
• Provides 42 different resources
• Provides a simple, easy-to-use web
interface
http://www.ncbi.nlm.nih.gov/
• Sequence submission: done using BankIt or
Sequin
• Search engine for data retrieval: Entrez
• Entrez retrieves information across all the resources
under NCBI
Examples: PubMed, Taxonomy, SNP, PubChem,
etc.
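Entrez can also be queried programmatically through NCBI's E-utilities web service. The sketch below only builds ESearch URLs and sends no request; the helper name esearch_url and the search term are illustrative assumptions, and NCBI's usage guidelines should be checked before sending real queries.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL for one Entrez database (no request is sent)."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# The same term can be searched across several Entrez databases.
for db in ("pubmed", "taxonomy", "snp"):
    print(esearch_url(db, "PAX6"))
```

Changing only the db parameter is what lets one search term be reused across the different NCBI resources.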
Tools for analysis
• BLAST
• Primer-BLAST
• ORF Finder
• Genome Workbench
Protein Sequence databases
• UniProt
• Pfam
• PROSITE
• Motif scan
UniProt
• Universal Protein Resource
• Maintained by the UniProt consortium (EBI, SIB, PIR)
• Formed through the merger of:
– Swiss-Prot
– TrEMBL
– PIR-PSD
• Entry names usually combine a protein or gene
mnemonic with a species code, e.g. PAX6_HUMAN
• Accession numbers are short alphanumeric
identifiers, e.g. P26367 (the accession for PAX6_HUMAN)
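The two identifier styles above can be told apart mechanically. The sketch below uses the accession pattern given in the UniProt help pages; the function names are illustrative assumptions.

```python
import re
from typing import Tuple

# Accession number pattern as documented in the UniProt help pages.
ACCESSION_RE = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_accession(text: str) -> bool:
    """Return True if the whole string has the form of a UniProt accession."""
    return ACCESSION_RE.fullmatch(text) is not None


def split_entry_name(entry_name: str) -> Tuple[str, str]:
    """Split an entry name such as 'PAX6_HUMAN' into mnemonic and species code."""
    mnemonic, _, species = entry_name.partition("_")
    return mnemonic, species


print(is_accession("P26367"))          # True
print(is_accession("PAX6_HUMAN"))      # False: this is an entry name
print(split_entry_name("PAX6_HUMAN"))  # ('PAX6', 'HUMAN')
```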
UniProt features
• BLAST
• Align
• Retrieve
• ID mapping
Pfam
• Proteins contain conserved regions
• Based on the conserved regions, proteins are
classified into families
• Provides links to external databases like PDB,
SCOP, CATH etc.
Pfam: Features
• Sequence search
• View Pfam family
• View a clan
• View a sequence
• View a structure
• Keyword search
Gene Indices
• Project aimed at indexing genes and their
variants in the various genome sequences.
• Creating a catalogue of genes in a wide range
of organisms
• Reduce redundancy
Gene Indices Software Tools
• TGI Clustering tools
• Clview
• SeqClean
• Cdbfasta/cdbyank
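Working with large multi-FASTA files usually means indexing them so that single records can be pulled out by name, which is the kind of task cdbfasta/cdbyank address. The sketch below illustrates that general idea with a byte-offset index; it is not the tools' implementation or index format, and the function names and sequences are made up.

```python
import io
from typing import BinaryIO, Dict


def build_index(handle: BinaryIO) -> Dict[str, int]:
    """Map each record identifier to the byte offset of its '>' header line."""
    index: Dict[str, int] = {}
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line.startswith(b">"):
            fields = line[1:].split()
            if fields:  # skip headers that carry no identifier
                index[fields[0].decode("ascii")] = offset
    return index


def fetch(handle: BinaryIO, index: Dict[str, int], key: str) -> str:
    """Jump to one record and return it, without scanning the whole file."""
    handle.seek(index[key])
    lines = [handle.readline().decode("ascii").rstrip()]
    for line in iter(handle.readline, b""):
        if line.startswith(b">"):
            break
        lines.append(line.decode("ascii").rstrip())
    return "\n".join(lines)


# Made-up in-memory FASTA data standing in for a file on disk.
data = io.BytesIO(b">seq1 first\nATGC\nGGCC\n>seq2 second\nTTAA\n")
idx = build_index(data)
print(fetch(data, idx, "seq2"))
```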
Structural databases
• PDB – Protein Data Bank
• CATH
• SCOP – Structural Classification of Proteins
wwPDB
• Contains information about experimentally
determined structures of proteins, nucleic
acids, and complex assemblies
• RCSB-PDB, PDBe, PDBj, BMRB – repositories of
protein structure data
• Files in PDB, mmCIF, PDBML/XML formats
• Advanced search – provides comprehensive
information about a protein.
• Sequence info, domain info, sequence
similarity, literature, apart from the details of
the structure.
• Cross referenced to SCOP and CATH
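Of the file formats listed above, the legacy PDB format is fixed-width: each field of an ATOM record sits in set columns. The sketch below reads a few of those fields from one line; the Atom type and function name are illustrative assumptions, and the example coordinates are invented.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> Atom:
    """Parse one fixed-width ATOM/HETATM record from a PDB-format file."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return Atom(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


# Invented example record, laid out in the fixed columns described above.
example = "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N"
atom = parse_atom_line(example)
print(atom.res_name, atom.chain_id, atom.x, atom.y, atom.z)
```

mmCIF and PDBML are tagged rather than fixed-width, so column slicing like this applies only to the PDB format.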
CATH
• Classification of proteins based on domain
structures
• Each protein chopped into individual domains
and assigned into homologous superfamilies.
• Hierarchical domain classification of PDB
entries.
CATH hierarchy
• Class – derived from secondary structure content; assigned
automatically
• Architecture – describes gross orientation of secondary
structures, independent of connectivity
• Topology – clusters structures according to their
topological connections and numbers of secondary
structures
• Homologous superfamily – this level groups
together protein domains which are thought to
share a common ancestor and can therefore be
described as homologous
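CATH writes a domain's position in this hierarchy as four dot-separated numbers, one per level (C.A.T.H). The sketch below splits such a code and finds the deepest level two domains share; the type and function names are illustrative assumptions, and the two codes are used only as examples.

```python
from typing import NamedTuple

LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")

# Names of the four main CATH classes.
CLASS_NAMES = {
    1: "Mainly Alpha",
    2: "Mainly Beta",
    3: "Alpha Beta",
    4: "Few Secondary Structures",
}


class CathCode(NamedTuple):
    class_: int
    architecture: int
    topology: int
    superfamily: int


def parse_cath_code(code: str) -> CathCode:
    """Split a code such as '3.40.50.300' into its four hierarchy levels."""
    parts = code.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathCode(*(int(p) for p in parts))


def deepest_shared_level(a: CathCode, b: CathCode) -> str:
    """Name the deepest hierarchy level at which two domains still agree."""
    shared = "none"
    for level, x, y in zip(LEVELS, a, b):
        if x != y:
            break
        shared = level
    return shared


first = parse_cath_code("3.40.50.300")
second = parse_cath_code("3.40.50.720")
print(CLASS_NAMES[first.class_])            # Alpha Beta
print(deepest_shared_level(first, second))  # Topology
```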
SCOP
• Description of structural and evolutionary
relationships between all the proteins with
known structures
• Uses the PDB entries
• Search using keywords or PDB identifiers
Hierarchy in SCOP
• Class
• Fold
• Superfamily
• Family
• Protein
• Species
Thank you

Editor's Notes

  • #8: The databases exchange data every day. Each database has its own sequence submission and retrieval tools. They follow a standardized annotation. The Collaboration created a Feature Table Definition that outlines legal features and syntax.
  • #10: Currently, NCBI receives and processes about 20,000 direct-submission sequences per month, in addition to the approximately 200,000 bulk submissions that are processed automatically. It collaborates with EMBL and DDBJ.
  • #11: The database continues to grow at an exponential rate, doubling in size every 10 months. It has sequences from 250,000 distinct organisms.
  • #12: All tools can be downloaded and run standalone on a local workstation.
  • #19: The goal of this project is ultimately to represent a non-redundant view of all human genes and data on their expression patterns, cellular roles, functions, and evolutionary relationships. The database will also include links to genomic sequences, mapping data, 3D structures, and literature references.