Databases
INSDC
International Nucleotide Sequence Database Collaboration
GenBank, EMBL, DDBJ
Sequence types:
Eukaryotic gene
Bacterial operon
Artificial Cloning vectors
Plasmid
Repeat element
Transfer RNA
GenBank
Nucleic Acids Research, 2008, 36, D25-D30
GenBank
Doubles in size about every 18 months
Contains 80 billion nucleotide bases from more than 76 million individual sequences
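The doubling figure can be turned into a rough projection. The sketch below assumes idealized, steady exponential growth starting from the 80 billion bases quoted above; the function name and the constant-rate assumption are illustrative and do not come from GenBank itself.

```python
def projected_bases(months_elapsed, start_bases=80e9, doubling_months=18.0):
    """Project database size, assuming it doubles every `doubling_months`.

    This is an idealization: real growth is not perfectly exponential.
    """
    return start_bases * 2 ** (months_elapsed / doubling_months)


if __name__ == "__main__":
    for months in (0, 18, 36):
        size = projected_bases(months) / 1e9
        print(f"{months:>2} months: {size:.0f} billion bases")
```

Under these assumptions the size after one doubling period is twice the starting size, and after two periods four times.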
Sequence-based taxonomy
260,000 named species
1,700 species are added per month
12% are of human origin and 8% are human ESTs
The top species (by number of bases)
Homo sapiens: 12.7 billion
Mus musculus: 8.3 billion
Rattus norvegicus: 5.8 billion
Bos taurus: 3.8 billion
Zea mays: 3.6 billion
GenBank
Records and Divisions
GenBank records
Partitioned into “divisions”
Traditional: BCT, VRL, PRI, ROD
Recent: EST, GSS, HTG, HTC, ENV
WGS
Special: TPA
WGS: accession numbers are issued to these sequences
e.g., AAAA01072744
AAAA: project ID
01: version number
072744: contig number
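The accession layout above can be sketched as a small parser. It assumes exactly the 4-letter, 2-digit, 6-digit layout of the example; the names `WgsAccession` and `parse_wgs_accession` are hypothetical, and accessions with other lengths are rejected rather than guessed at.

```python
import re
from typing import NamedTuple


class WgsAccession(NamedTuple):
    project_id: str
    version: int
    contig: int


# Layout taken from the example above: 4 letters, 2 digits, 6 digits.
_WGS_PATTERN = re.compile(r"^([A-Z]{4})(\d{2})(\d{6})$")


def parse_wgs_accession(accession: str) -> WgsAccession:
    """Split a WGS accession into project ID, version, and contig number."""
    match = _WGS_PATTERN.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not a 4+2+6 WGS accession: {accession!r}")
    project_id, version, contig = match.groups()
    return WgsAccession(project_id, int(version), int(contig))


if __name__ == "__main__":
    print(parse_wgs_accession("AAAA01072744"))
```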
TPA: Third Party Annotation
1. Experimental
2. Inferential
Data submission
BankIt
Use BankIt if:
you have one or a few sequence submissions
you prefer to use a WWW-based submission tool
your sequence annotation is not complicated
you do not require sequence analysis tools to submit your sequence(s)
Sequin
Use Sequin if:
you are submitting long or complex submissions
you are submitting mutation, phylogenetic, population,
environmental, or segmented sets
you would like graphical viewing and editing options,
including the alignment editor
you would like network access to related analytical tools
EMBL
Maintained by EBI
Other databases of EBI
Swiss-Prot
TrEMBL
UniProt
InterPro
E-MSD
ArrayExpress
EMBL
Taxonomic
Invertebrates, Organelles, Bacteriophages, Plants, Prokaryotes,
Rodents
Non-Taxonomic
HTG, HTC, GSS, WGS, EST
EMBL representation
Genomic data: coding strand
cDNA data: RNA sequence
tRNA: mature transcript
WebIn: data submission tool
DDBJ
Entry, format, and abbreviation keys are the same as in GenBank
SAKURA: data submission tool
Secondary Nucleotide Sequence Databases
UniGene
Database of unique gene clusters
STACK (Sequence Tag Alignment and Consensus Knowledgebase)
Ribosomal database
HIV sequence database
EPD (Eukaryotic Promoter Database)
REBASE
SwissProt
Curated protein sequence database
High level of annotation
Description of the function
Domain structure
Post-translational modifications (PTMs)
Variants
TrEMBL
Consists of entries in SWISS-PROT-like format
PIR-PSD
Protein Information Resource - Protein Sequence Database
World's first database of classified and functionally annotated protein sequences
Grew out of the Atlas of Protein Sequence and Structure
UniProt
Universal Protein Resource
Comprehensive resource for protein sequence and annotation data
Sequence Alignments
Alignments:
Pairwise
Multiple
Multiple sequence alignments provide information on:
Alignment itself
Consensus Sequences
Conserved residues
Conserved residue patterns
Sequence Profiles
Consensus Sequence Databases
Multiple Alignment
↓
A single sequence in which each residue is the most common or
consensus for the sequence family
↓
Consensus Sequence Database
Consensus Sequence Databases
Disadvantage:
Information from sequences that do not contain the consensus residue is ignored, even though these sequences hold information about allowed substitutions.
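The consensus idea and its drawback can be sketched in a few lines. The alignment below is a made-up toy example, and the function names are hypothetical; the sketch picks the most common residue in each column and then lists the residues that the consensus throws away.

```python
from collections import Counter


def consensus(alignment):
    """Return the most common residue in each column of an alignment.

    All rows must have the same length. Ties are broken in favor of the
    residue that appears first in the column.
    """
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    residues = []
    for column in zip(*alignment):
        residues.append(Counter(column).most_common(1)[0][0])
    return "".join(residues)


def discarded_residues(alignment):
    """List, per column, the observed residues that the consensus omits."""
    cons = consensus(alignment)
    return [sorted(set(col) - {c}) for col, c in zip(zip(*alignment), cons)]


if __name__ == "__main__":
    toy_alignment = ["LTAAHC", "ISASHC", "LTAGHC", "VTAAHC"]  # made-up data
    print(consensus(toy_alignment))
    print(discarded_residues(toy_alignment))
```

The second function makes the disadvantage concrete: every non-empty entry is an allowed substitution that a consensus-only database would not record.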
PROSITE
Database of sequence patterns associated with protein family membership.
Developed using patterns that best fit particular protein families and functions.
PROSITE
Serine protease family:
Pattern 1:
[LIVM]-[ST]-A-[STAG]-H-C
Pattern 2:
[DNSTAGC]-[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-[LIVMFYWH]-PA-[LIVMFYSTANQH]
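A PROSITE-style pattern can be searched with an ordinary regular expression once the notation is translated. The sketch below is a minimal converter covering a common subset of the syntax (single residues, `[...]` allowed sets, `{...}` excluded sets, `x` wildcards, `(n)` and `(n,m)` repeats, `<` and `>` anchors); the function name is hypothetical, and any element outside that subset raises an error instead of being guessed. It is demonstrated on Pattern 1 with a made-up test sequence.

```python
import re

# One pattern element: a residue, x, [set], or {excluded set},
# optionally followed by a repeat count such as (2) or (2,4).
_ELEMENT = re.compile(
    r"^(?P<core>x|[A-Z]|\[[A-Z]+\]|\{[A-Z]+\})"
    r"(?:\((?P<min>\d+)(?:,(?P<max>\d+))?\))?$"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regex string."""
    pattern = pattern.strip().rstrip(".")
    prefix = suffix = ""
    if pattern.startswith("<"):  # anchored at the N-terminus
        prefix, pattern = "^", pattern[1:]
    if pattern.endswith(">"):  # anchored at the C-terminus
        suffix, pattern = "$", pattern[:-1]
    parts = []
    for element in pattern.split("-"):
        m = _ELEMENT.match(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core = m.group("core")
        if core == "x":
            piece = "."  # any residue
        elif core.startswith("{"):
            piece = "[^" + core[1:-1] + "]"  # any residue except these
        else:
            piece = core  # single residue or [allowed set]
        if m.group("min"):
            lo, hi = m.group("min"), m.group("max")
            piece += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(piece)
    return prefix + "".join(parts) + suffix


if __name__ == "__main__":
    regex = prosite_to_regex("[LIVM]-[ST]-A-[STAG]-H-C")
    print(regex)
    print(re.search(regex, "MKLTAAHCG"))  # made-up sequence
```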
PROSITE
Features:
1. Much shorter than the total sequence length.
2. Provide information on acceptable substitutions.
3. Provide information on shared biological functions.
PROSITE
Disadvantages:
1. Lack of specificity.
2. They have no way of attaching probabilities to the variation.
PRINTS and BLOCKS
Contain multiply aligned ungapped segments.
BLOCKS: the segments are called blocks
PRINTS: the segments are called motifs
PRINTS and BLOCKS
Advantages
1. Potentially more sensitive (more distant relationships can be found)
2. More specific (fewer false positives occur)
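One way to see why ungapped aligned segments can be more sensitive and more specific than a fixed pattern is to turn a block into per-position residue frequencies and score candidates against them. This is a minimal sketch under stated assumptions: the block is made-up toy data, the pseudocount and the uniform background are arbitrary illustrative choices, and this is not the scoring scheme actually used by BLOCKS or PRINTS.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def position_frequencies(block, pseudocount=1.0):
    """Per-column residue probabilities for an ungapped block.

    Assumes every segment has the same length and uses only the 20
    standard amino acids. The pseudocount keeps unseen residues nonzero.
    """
    if len({len(segment) for segment in block}) != 1:
        raise ValueError("all segments in a block must have the same length")
    table = []
    for column in zip(*block):
        counts = Counter(column)
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        table.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return table


def log_odds_score(segment, table, background=1.0 / len(AMINO_ACIDS)):
    """Sum of log2(p / background) over positions; higher is a better fit."""
    return sum(
        math.log2(column[aa] / background) for aa, column in zip(segment, table)
    )


if __name__ == "__main__":
    toy_block = ["LTAAHC", "ISASHC", "LTAGHC", "VTAAHC"]  # made-up data
    table = position_frequencies(toy_block)
    for candidate in ("LTAAHC", "ITASHC", "WWWWWW"):
        print(candidate, round(log_odds_score(candidate, table), 2))
```

Unlike a pattern, which only says whether a residue is allowed, the table gives every residue at every position a probability, so a candidate that combines residues from different block members still scores well.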
Specialized Sequence Databases
rRNA database
tRNA database
5S rRNA database
Promoter sequence database
InBase, a database of inteins
OMIM
Online Mendelian Inheritance in Man
Comprehensive database of human genes and genetic disorders.
Has numerous links to databases such as SWISS-PROT, PubMed, mutation databases, and Map Viewer.
Biological databases
Structural Databases
RCSB
1. PDB
2. NDB
Structural Databases
PDBe of EBI
MMDB
Structures derived from the PDB, with value-added features such as:
Explicit chemical graphs
Links to literature
Similar sequences
Related 3D structures
Information about chemicals
Structural Databases
CATH
C: Class
A: Architecture
T: Topology
H: Homologous superfamily
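The four levels map onto the dot-separated numeric codes that CATH assigns to its nodes, for example a code of the form 3.40.50.300. The sketch below splits such a code into its four levels; the names `CathNode` and `parse_cath_code` are hypothetical, and the parser assumes a full four-field code.

```python
from typing import NamedTuple


class CathNode(NamedTuple):
    cath_class: int
    architecture: int
    topology: int
    homologous_superfamily: int

    def lineage(self):
        """Return the code prefix at each level, from Class down to H."""
        fields = [str(value) for value in self]
        return [".".join(fields[: depth + 1]) for depth in range(4)]


def parse_cath_code(code: str) -> CathNode:
    """Split a four-field C.A.T.H code into its hierarchy levels."""
    fields = code.strip().split(".")
    if len(fields) != 4 or not all(field.isdigit() for field in fields):
        raise ValueError(f"expected four numeric fields, got {code!r}")
    return CathNode(*(int(field) for field in fields))


if __name__ == "__main__":
    node = parse_cath_code("3.40.50.300")
    print(node)
    print(node.lineage())
```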
SCOP
Structural Classification of Proteins
Higher Order Functions Databases
KEGG (Kyoto Encyclopedia of Genes and Genomes)
Subsidiary Databases
Contains 16 main databases
Higher Order Functions Databases
DIP (Database of Interacting Proteins)
BIND (Biomolecular Interaction Network Database)
Literature Databases
PubMed
Web of Science
BioMedNet
Data retrieval tools
Entrez
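Entrez can also be queried programmatically through NCBI's E-utilities web service. The sketch below only builds an `efetch` request URL for a nucleotide record in GenBank flat-file format; the helper name `efetch_url` is hypothetical, the accession U49845 is used purely as an example input, and a real client should also follow NCBI's usage guidelines (rate limits, contact parameters), which are not handled here.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def efetch_url(accession, db="nuccore", rettype="gb", retmode="text"):
    """Build an E-utilities efetch URL for one record.

    db="nuccore" selects the nucleotide database; rettype="gb" with
    retmode="text" requests the GenBank flat-file format.
    """
    query = urlencode(
        {"db": db, "id": accession, "rettype": rettype, "retmode": retmode}
    )
    return f"{EUTILS_BASE}/efetch.fcgi?{query}"


if __name__ == "__main__":
    # Prints the URL only; pass it to urllib.request.urlopen to download.
    print(efetch_url("U49845"))
```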