BIOINFORMATICS
Name: Anuja Vilas Konde
MSc II
CONTENT
What is bioinformatics?
What is data and information?
Biological databases
Types of biological databases
Retrieval of databases
Advantages of biological databases
Bioinformatics
Definition
A marriage between computer science and molecular biology.
Applies techniques of computer science to problems of molecular biology.
Information technology applied to analyse biological data.
Helps to gain an understanding of biological data.
Plays an important role in molecular medicine, evolutionary studies, drug
development and biotechnology.
Covers analysis of gene and protein expression, comparison of genomic data, and storage of
biological information.
Data and Information
Raw data -> Processed and analysed -> Information
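The step from raw data to information can be sketched in a few lines. This is only an illustration: the read below is a made-up DNA string, not a real database entry, and the function name `summarize_sequence` is an assumption introduced here.

```python
def summarize_sequence(raw: str) -> dict:
    """Turn a raw DNA string into simple derived information."""
    # Processing: strip whitespace and line breaks, normalise to upper case.
    seq = "".join(raw.split()).upper()
    # Analysis: count G and C bases.
    gc = sum(1 for base in seq if base in "GC")
    return {
        "length": len(seq),
        "gc_fraction": gc / len(seq) if seq else 0.0,
    }


# Hypothetical raw read with inconsistent case and line breaks.
raw_read = "atgc gcta\nGGCC"
info = summarize_sequence(raw_read)
print(info)
```

The string is the raw data; the length and GC fraction are the information obtained after processing and analysis.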
BIOLOGICAL DATABASES
Biological: relating to living organisms.
Database: a collection of data stored in an organized manner (i.e. information).
Libraries of life sciences information, collected from scientific experiments and
computational analysis.
Information can be accessed, managed and updated.
Features:
Data heterogeneity
High volume data
Data curation
Types of biological databases
On the basis of source of data:
Primary
Secondary
On the basis of data stored:
Sequence: Nucleic acid, Protein
Structure: PDB, SCOP, CATH
On the basis of data source
Primary databases:
Contain experimentally derived data, i.e. raw data
Examples: nucleotide sequences, protein or other macromolecular sequences
Experimental results are submitted directly into the databases.
Swiss-Prot, PIR, GenBank, DDBJ.
Secondary databases:
Contain data derived from analysing primary data, i.e. information
Examples: conserved regions, signature sequences, etc.
Submitted data is analysed and then stored
SCOP, CATH
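How secondary data is derived from primary data can be shown with a small sketch. It assumes the primary sequences are already aligned to equal length and reports the fully conserved columns; the sequences and the function name `conserved_columns` are hypothetical.

```python
def conserved_columns(aligned: list[str]) -> list[int]:
    """Return 0-based column indices where every aligned sequence has the same residue."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    first = aligned[0]
    return [
        i for i in range(length)
        # Gap columns ("-") are never counted as conserved.
        if first[i] != "-" and all(s[i] == first[i] for s in aligned)
    ]


# Hypothetical primary data: three short, already aligned protein fragments.
primary = ["MKTAYLG", "MKSAYIG", "MRTAYLG"]
print(conserved_columns(primary))
```

The input list plays the role of primary (raw) entries; the list of conserved positions is the kind of derived information a secondary database would store.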
On the basis of data stored
Structural databases:
Include experimentally determined structures of proteins and domains
Main aim is to organize protein structures and give the biological community access to the
information
A. PDB (Protein Data Bank):
Database of experimentally determined 3D structures of proteins
Stores more than 80,000 protein structures
Structures are obtained mainly from X-ray crystallography and NMR spectroscopy
Easily accessible; entries can be downloaded and used
B. SCOP (Structural Classification of Proteins):
Contains information about the classification and structures of proteins
Also describes evolutionary relationships between proteins.
Currently contains 38,000 protein structures
Freely accessible over the internet
C. CATH (Class, Architecture, Topology, Homology):
Contains information about the classification and structures of protein domains.
Classifies domains by class, architecture, topology (fold) and homologous superfamily, which captures their evolutionary relationships.
Currently contains information on 8,078 protein domains
Sequence databases:
Composed of large collections of nucleic acid and protein sequences stored on computers.
Mainly of two types
Nucleic acid sequence:
Contains collections of genome, gene and transcript sequences.
Three chief databases store raw nucleic acid data and make it available to the public:
GenBank, EMBL and DDBJ,
referred to as the primary sequence databases.
GenBank:
Located in the USA.
Accessible through the NCBI portal
Contains an annotated collection of nucleotide sequences and their protein translations.
Receives sequences for more than 100,000 distinct organisms from all over the world.
EMBL (European Molecular Biology Laboratory):
Maintained by the EBI (European Bioinformatics Institute)
Comprises primary nucleotide sequences
Receives data from genome sequencing centres.
DDBJ (DNA Data Bank of Japan):
1. Located at the National Institute of Genetics (NIG).
2. Only nucleotide sequence data bank in Asia.
3. Exchanges data with GenBank and EMBL.
4. Mainly receives data from Japanese researchers.
Protein sequence:
1. Databases which include a protein’s amino acid sequence, conformation, structure, and features
such as active sites.
2. Compiled by translating DNA sequences from different gene databases.
3. An important resource because proteins mediate most biological functions.
4. Include PIR, Swiss-Prot and PDB.
PIR (Protein Information Resource):
1. Established in 1984 by the National Biomedical Research Foundation
2. Provides a high level of annotation.
3. Contains amino acid sequences and information about protein function prediction
4. Also contains sequences of domains.
Swiss-Prot:
1. A databank provided by the Swiss Institute of Bioinformatics in collaboration with EMBL
2. Very high quality and consistent annotations
3. It incorporates:
A. Functions of proteins
B. Post-translational modifications such as phosphorylation and acetylation
C. Domains and sites
D. Secondary structural features and quaternary structure of the protein.
PDB (Protein Data Bank):
1. Includes sequences of proteins.
2. Helps to predict the 3D structure of proteins.
3. Holds data derived mainly from two sources: structures determined by X-ray
crystallography and by NMR experiments
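PDB entries are distributed, among other formats, as fixed-width text files in which each atom occupies one ATOM line with its coordinates in set columns. The sketch below builds and parses one such line. The atom values are hypothetical, the helper names `make_atom_line` and `parse_atom_line` are assumptions, and only the fields up to the z coordinate (column 54) are handled.

```python
from typing import NamedTuple


class Atom(NamedTuple):
    serial: int
    name: str
    residue: str
    chain: str
    res_seq: int
    x: float
    y: float
    z: float


def make_atom_line(a: Atom) -> str:
    """Build a fixed-width ATOM line up to column 54 (end of the z coordinate)."""
    return (f"ATOM  {a.serial:>5} {a.name:<4} {a.residue:>3} {a.chain}"
            f"{a.res_seq:>4}    {a.x:8.3f}{a.y:8.3f}{a.z:8.3f}")


def parse_atom_line(line: str) -> Atom:
    """Slice the fixed columns of an ATOM line (0-based Python slices)."""
    return Atom(
        serial=int(line[6:11]),       # columns 7-11
        name=line[12:16].strip(),     # columns 13-16
        residue=line[17:20].strip(),  # columns 18-20
        chain=line[21],               # column 22
        res_seq=int(line[22:26]),     # columns 23-26
        x=float(line[30:38]),         # columns 31-38
        y=float(line[38:46]),         # columns 39-46
        z=float(line[46:54]),         # columns 47-54
    )


# Hypothetical alpha-carbon atom, not taken from a real entry.
sample = Atom(1, "CA", "GLY", "A", 12, 1.5, -2.25, 10.0)
line = make_atom_line(sample)
print(line)
print(parse_atom_line(line))
```

Slicing by column rather than splitting on whitespace matters here, because adjacent fields in real files can touch without a separating space.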
Retrieval of biological databases
Accessing the stored data of an organism or a particular gene from the databases.
When obtaining a new DNA sequence, one needs to know whether it has already been deposited in
the databanks.
Requirement for retrieval:
name of organism
name of gene
Data retrieval systems:
Entrez
SRS
BLAST
Entrez:
Molecular biology database and retrieval system
Developed by NCBI
Covers nucleotide and protein sequence data and 3D structure data
Easy to access, but with limited information to search
SRS (Sequence Retrieval System):
Provides access to a large number of biological databases
Developed at the European Bioinformatics Institute (EBI)
Includes data on metabolic pathways, transcription factors and conserved regions.
Provides the description of a gene and the dates on which its entry was uploaded and updated.
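For Entrez, the two retrieval requirements (name of organism and name of gene) can be combined into one search term. The sketch below only builds a query URL for the NCBI E-utilities `esearch` endpoint and does not send it. The gene and organism are example values, the helper name `build_esearch_url` is an assumption, and the field tags available differ between Entrez databases.

```python
from urllib.parse import urlencode

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(gene: str, organism: str,
                      db: str = "nucleotide", retmax: int = 5) -> str:
    """Build (but do not send) an Entrez search URL for a gene in an organism."""
    # Field tags in square brackets restrict each word to one index.
    term = f"{gene}[Gene Name] AND {organism}[Organism]"
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


url = build_esearch_url("BRCA1", "Homo sapiens")
print(url)
```

Fetching this URL would return a list of matching record identifiers, which can then be used to download the records themselves.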
BLAST (Basic Local Alignment Search Tool):
Developed at NCBI
BLAST programs were designed for fast database searching.
Helps to retrieve data
Also helps to compare primary biological sequence information
Workflow:
1. Raw data obtained from experiment
2. Submit the data to a database
3. Receive an accession number
4. Enter the accession number in BLAST
5. Search
6. Find relationships among the sequences
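The "local alignment" in BLAST's name means finding the best-matching stretch shared by two sequences rather than aligning them end to end. BLAST itself uses a faster seed-and-extend heuristic. The sketch below is not BLAST but the exact Smith-Waterman dynamic programme that defines a local alignment score, with scoring values (match 2, mismatch -1, gap -1) chosen here purely for illustration.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2, mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score between a and b, using a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for ca in a:
        curr = [0]  # first column is always 0
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Hypothetical sequences sharing only the stretch "ACG".
print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

This exact method fills a table whose size is the product of the two sequence lengths, which is why heuristic tools are used for searching whole databases.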
Variants of BLAST
BLASTN - Compares a DNA query to a DNA database.
BLASTP - Compares a protein query to a protein database.
BLASTX - Compares a DNA query to a protein database, by translating the query in the 6 possible reading frames.
TBLASTN - Compares a protein query to a DNA database, by translating the database in the 6 possible reading frames.
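The "6 possible frames" used by BLASTX and TBLASTN are the three reading offsets on the forward strand plus three on the reverse complement. A minimal sketch of that translation step follows, assuming the standard genetic code and an unambiguous A/C/G/T input; the sequence and function names are illustrative and this is not BLAST's own code.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna: str) -> str:
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna: str) -> str:
    """Translate complete codons only; unknown codons become 'X', stops are '*'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frames(dna: str) -> dict:
    """Return translations for frames +1, +2, +3 and -1, -2, -3."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


# Hypothetical 9-base query.
frames = six_frames("ATGGCCTAA")
for name, protein in frames.items():
    print(name, protein)
```

Each of the six protein strings is then compared against protein sequences, which is how a DNA sequence can be matched to a protein.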
Advantages
Databases act as a storehouse of information.
They store and organize data in such a way that information can be retrieved
easily via a variety of search criteria.
They allow knowledge discovery, which refers to the identification of connections
between pieces of information.
Databases are important tools in assisting scientists to analyze and explain a
host of biological phenomena, from the structure of biomolecules and their
interactions, to the whole metabolism of organisms, to understanding the
evolution of species.
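Retrieval "via a variety of search criteria" can be illustrated with a toy in-memory database. The table layout, the `DEMO` accession numbers and the gene names are all invented for this sketch and do not correspond to real records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries ("
    "accession TEXT PRIMARY KEY, organism TEXT, gene TEXT, length INTEGER)"
)
# Hypothetical entries, for illustration only.
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?, ?)",
    [
        ("DEMO0001", "Homo sapiens", "geneA", 1200),
        ("DEMO0002", "Mus musculus", "geneA", 1185),
        ("DEMO0003", "Homo sapiens", "geneB", 640),
    ],
)


def search(organism=None, gene=None):
    """Return accession numbers matching any combination of the given criteria."""
    clauses, args = [], []
    if organism is not None:
        clauses.append("organism = ?")
        args.append(organism)
    if gene is not None:
        clauses.append("gene = ?")
        args.append(gene)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    rows = conn.execute(
        "SELECT accession FROM entries" + where + " ORDER BY accession", args
    )
    return [row[0] for row in rows]


print(search(organism="Homo sapiens"))
print(search(gene="geneA"))
print(search(organism="Homo sapiens", gene="geneA"))
```

The same stored data answers three different questions, which is the practical benefit of organizing it in a database rather than in loose files.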
THANK YOU
