NEETHUASOKAN
Introduction
• Bioinformatics is the science concerned with the development and
application of computer hardware and software to the acquisition,
storage, analysis, and visualization of biological information.
• It has the following three components:
- The development of new algorithms and statistics for
assessing the relationships among large sets of biological data,
e.g. DNA sequence data.
- The application of these tools to the analysis and interpretation
of various biological data, e.g. nucleotide sequences and amino
acid sequences.
- The development of databases for the efficient storage, access,
and management of various kinds of biological information.
• The term ‘bioinformatics’ is a combination of ‘biology’ and
‘informatics’.
Definition
• Bioinformatics derives knowledge from computer analysis of
biological data.
• These data can consist of the information stored in the genetic
code, as well as experimental results from various sources, patient
statistics, and scientific literature.
• Research in bioinformatics includes method development for
storage, retrieval, and analysis of the data.
• Bioinformatics is a rapidly developing branch of biology and
is highly interdisciplinary, using techniques and concepts from
informatics, statistics, mathematics, chemistry, biochemistry,
and physics.
• It has many practical applications in different areas of
biology and medicine.
History of bioinformatics
• A collection of amino acid sequences was compiled in the
‘Atlas of Protein Sequence and Structure’ by the National
Biomedical Research Foundation.
• This collection was edited by Margaret O. Dayhoff from 1965
to 1978.
• Dayhoff and coworkers contributed to the comparison of
amino acid sequences by developing computer software for
detecting distantly related sequences.
• The EMBL established its data library in 1980 to collect,
organize, and distribute nucleotide sequence data and related
information.
• NCBI was established in the U.S.A. It serves as a primary
information databank and provider of information.
• The National Biomedical Research Foundation established the
PIR in 1984.
DNA Sequences
• A standard set of symbols is used to represent DNA sequence data.
• The four bases are denoted by the single letters A (adenine), C
(cytosine), G (guanine), and T (thymine).
• Sequence data often contain ambiguities: it is not clear which
of the four bases is present at some positions.
• For example, the sequence data may indicate that the base
at a specific position is either G or A, i.e. a purine (symbol R).
• Similarly, if a position may have either C or T, it is a
pyrimidine (symbol Y).
• The base sequences of the two complementary strands of a
DNA molecule are represented by this system of symbols.
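The symbol system described above can be sketched in Python. This is a minimal illustration, not a complete implementation: it assumes the IUPAC ambiguity letters R (G or A) and Y (C or T), plus N for any base, and the function names are hypothetical.

```python
# Each symbol maps to the set of bases it may stand for (a subset of the IUPAC codes).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine: G or A
    "Y": "CT",    # pyrimidine: C or T
    "N": "ACGT",  # any base
}

# Complement of each symbol; R and Y swap because purines pair with pyrimidines.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "R": "Y", "Y": "R", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written in the 5'->3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def matches(pattern: str, seq: str) -> bool:
    """True if every base of seq is allowed by the symbol at the same position of pattern."""
    return len(pattern) == len(seq) and all(
        base in IUPAC[symbol] for symbol, base in zip(pattern.upper(), seq.upper())
    )


print(reverse_complement("ATGC"))
print(matches("ARY", "AGC"))
```

Because R and Y are each other's complements, the second strand of an ambiguous sequence can be written with the same alphabet as the first.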
Amino Acid Sequences of Proteins
• Amino acids were conventionally represented by three-letter
symbols, e.g. Ala for alanine, Val for valine, etc.
• In bioinformatics, they are denoted by single letters, e.g. A
for alanine, C for cysteine, D for aspartic acid, etc.
• Some positions in protein sequences are ambiguous; this
situation is comparable to that for DNA sequences.
• For example, when it is not clear whether a position has
glutamine or glutamic acid, the position is given the symbol Z.
• Protein synthesis begins at the N-terminus and proceeds
to the C-terminus.
• Amino acid sequences in databases are listed from the
N-terminus to the C-terminus of the polypeptide.
Single-letter code   Amino acid                    Three-letter code
A                    Alanine                       Ala
B                    Aspartic acid or asparagine   Asx
C                    Cysteine                      Cys
D                    Aspartic acid                 Asp
E                    Glutamic acid                 Glu
F                    Phenylalanine                 Phe
G                    Glycine                       Gly
H                    Histidine                     His
I                    Isoleucine                    Ile
K                    Lysine                        Lys
L                    Leucine                       Leu
M                    Methionine                    Met
N                    Asparagine                    Asn
P                    Proline                       Pro
Q                    Glutamine                     Gln
R                    Arginine                      Arg
S                    Serine                        Ser
T                    Threonine                     Thr
V                    Valine                        Val
W                    Tryptophan                    Trp
Y                    Tyrosine                      Tyr
Z                    Glutamic acid or glutamine    Glx
X                    Any amino acid                Xaa
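The table above lends itself to a simple lookup. The following Python sketch converts a list of three-letter symbols into a single-letter sequence written from the N-terminus to the C-terminus. The dictionary uses the standard assignments (Gln for Q; Asx and Glx as the ambiguity symbols B and Z), and the function name is hypothetical.

```python
# Three-letter symbol -> single-letter code, including the ambiguity symbols.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Asx": "B",  # aspartic acid or asparagine
    "Glx": "Z",  # glutamic acid or glutamine
    "Xaa": "X",  # any amino acid
}

# Reverse lookup, built from the same table so the two cannot disagree.
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}


def to_one_letter(residues):
    """Convert residues given as three-letter symbols (N- to C-terminus) to a string."""
    return "".join(THREE_TO_ONE[residue.capitalize()] for residue in residues)


print(to_one_letter(["Met", "Ala", "Glx", "Val"]))
```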
Types of Sequences in Nucleotide Sequence
Databases
• Nucleotide sequence databases contain several different types
of sequences.
cDNA sequences:
• A cDNA molecule is obtained by reverse transcription of an
RNA molecule.
• The cDNA sequences, therefore, represent that part of the
genome that is transcribed into RNA.
• If the cDNA is obtained from mRNA, it will represent only the
exon sequences of the genes expressed in the concerned cell,
tissue, or organism.
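The relationship between a gene, its mRNA, and the resulting cDNA can be sketched in Python. This is a toy illustration under stated assumptions: the sequence and exon coordinates are invented, the exon coordinates are given directly rather than predicted, and the function names are hypothetical.

```python
def splice_exons(genomic: str, exons) -> str:
    """Join the exon intervals (0-based, end-exclusive) of the coding strand."""
    return "".join(genomic[start:end] for start, end in exons)


def transcribe(coding_dna: str) -> str:
    """Write the coding strand as RNA (T becomes U)."""
    return coding_dna.replace("T", "U")


def reverse_transcribe(mrna: str) -> str:
    """First-strand cDNA: the DNA complement of the mRNA, written 5'->3'."""
    pairing = {"A": "T", "U": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in reversed(mrna))


# Invented toy gene: two exons separated by one intron.
EXON1, INTRON, EXON2 = "ATGGCC", "GTAAGTTTCAG", "TGGTAA"
genomic = EXON1 + INTRON + EXON2
exons = [(0, len(EXON1)), (len(EXON1) + len(INTRON), len(genomic))]

mrna = transcribe(splice_exons(genomic, exons))
cdna = reverse_transcribe(mrna)
print(mrna, cdna)
```

The intron never reaches the cDNA, which is why cDNA obtained from mRNA reflects only the exon sequences.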
Genomic DNA sequences:
• These sequences represent the complete genome of the
organism.
• When genome sequencing is completed, the database will contain
the sequence of the entire genome of the organism.
• In prokaryotes, the genome usually consists of a single
chromosome, while in eukaryotes the term refers to the
nuclear DNA.
Expressed Sequence Tag (EST) sequences:
• EST sequences are obtained by sequencing only a part of each
cDNA molecule produced from mRNA.
• These sequences are dubbed ‘tags’ because they can be used
as probes for the isolation of the concerned genes from the
genomic DNA.
• This approach was used by J. Craig Venter and his group to
obtain the sequence of the expressed portion of the human
genome.
• The EST technique generated enormous amounts of sequence data
that permitted the construction of a preliminary transcript map
of the human genome.
Genome Sequence Tag (GST) Sequences:
• GSTs were developed for identifying the genes of Plasmodium
falciparum.
• It was observed that the enzyme mung bean nuclease (Mnase)
cuts P. falciparum genomic DNA between genes.
• GSTs are developed by sequencing the DNA fragments on
either side of the cut sites generated by Mnase.
Organellar DNA Sequences:
• Organellar DNA is the DNA found in mitochondria (mtDNA)
and chloroplasts (cpDNA).
• These sequence data are compiled in databases.
Branches of Bioinformatics
• A living cell is a system in which cellular components such as
the genome, the gene transcripts, and the proteins interact with
each other, and these interactions determine the fate of the
cell, e.g. whether a stem cell is going to become a liver cell
or a cancer cell.
The three branches of bioinformatics are:
1. Genomics
2. Transcriptomics
3. Proteomics
Figure: The three major branches of bioinformatics. DNA (genomics)
makes RNA (transcriptomics) makes protein (proteomics).
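The flow from DNA to RNA to protein named above can be sketched in Python. This is a minimal illustration that assumes the standard genetic code and a coding-strand input read from its first base. The function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna: str) -> str:
    """DNA makes RNA: write the coding strand with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """RNA makes protein: read codons from the first base until a stop codon."""
    dna = mrna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTGGTAA")
print(mrna, translate(mrna))
```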
Genomics:
• Genomics plays a significant role in modern biological research:
the nucleotide sequences of all the chromosomes of an organism
are mapped, and the locations of the different genes and their
sequences are determined.
• This involves extensive analysis of the nucleic acids through
molecular biology techniques before the data are ready for
processing by computer.
• It is a science that attempts to describe a living organism in
terms of the sequence of its genome.
• Estimating the number of genes in an organism from the number
of nucleotide base pairs was not reliable, because many genes
are present in large numbers of redundant copies.
• Genomics has helped to rectify this problem.
• Genomics uses techniques of molecular biology and bioinformatics
to identify cellular components such as proteins, rRNA, tRNA, etc.,
and to analyse the sequences attributed to structural genes,
regulatory sequences, and non-coding sequences.
• The first automated DNA sequencer was developed in 1986 by
Leroy Hood.
• Haemophilus influenzae was the first bacterium to have its
genome sequenced, in 1995.
• Even if one can identify all the genes in a genome, a gene
only indicates that, at some point in time, it might be transcribed
to produce cellular components.
• E.g. the human genome contains roughly 20,000 protein-coding
genes (earlier estimates ranged from about 30,000 to 60,000), but
only a subset of them is expressed in a particular cell type at a
particular time.
Transcriptomics:
• Transcriptomics is the study of the transcriptome, which
includes the whole set of mRNA molecules in one cell or a
population of cells.
• This study helps us to depict the expression levels of genes,
often using techniques such as DNA microarrays, which can
sample tens of thousands of different mRNAs at a time.
• This kind of technique has helped biologists to routinely
compare gene expression between control cells and treated
cells.
• Transcriptomics has a few limitations.
• The relative abundance of transcripts, as characterized by
serial analysis of gene expression (SAGE) or microarray
experiments, is not always a good predictor of the relative
abundance of proteins, because of:
1. Differential adaptation to the translational machinery.
2. Differential usage of amino acids of different abundances.
3. The lack of information on post-translational modification of
amino acid residues, although post-translational
modifications such as acetylation, hydroxylation,
glycosylation, phosphorylation, and cleavage are
fundamental to understanding the interactions of cellular
components.
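The comparison of gene expression between control and treated cells mentioned above is often summarised as a log2 fold change per gene. The following Python sketch shows the arithmetic only. The gene names and counts are invented, and the pseudocount of 1 (used to avoid division by zero) is an assumption of this example, not something the slides specify.

```python
import math


def log2_fold_change(control, treated, pseudocount=1.0):
    """Return log2((treated + p) / (control + p)) for genes measured in both samples."""
    return {
        gene: math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in control
        if gene in treated
    }


# Hypothetical expression measurements (e.g. normalised read or spot intensities).
control = {"geneA": 99, "geneB": 49, "geneC": 7}
treated = {"geneA": 399, "geneB": 49, "geneC": 1}

changes = log2_fold_change(control, treated)
for gene, value in sorted(changes.items()):
    print(gene, round(value, 2))
```

A positive value means the transcript is more abundant in the treated cells, a negative value means it is less abundant, and zero means no change. As the limitations above note, such transcript-level changes do not necessarily carry over to protein abundance.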
Proteomics:
• Proteomics represents the earliest effort to identify a major
subclass of cellular components, the proteins, and their
interactions.
• Proteomics involves sequencing the amino acids in a protein,
determining its 3D structure, and relating the structure to the
function of the protein.
• Before computer processing comes into the picture, extensive
data must be collected, particularly through crystallography and
nuclear magnetic resonance (NMR).
• With such data on known proteins, the structure of newly
discovered proteins and its relationship to their function can
be predicted.
• In such areas, bioinformatics has enormous analytical and
predictive potential.
• Metabolic proteins such as haemoglobin and insulin have been
subjected to intensive proteomic investigation.
• The term ‘proteomics’ was coined by analogy with
genomics.
• Scientists feel that the bioinformatics of proteins is crucial
to understanding the cellular components and their interactions
completely.
Aims of Bioinformatics
• Bioinformatics can be used in several important ways.
• The aim of bioinformatics is fourfold: data acquisition, tool
and database development, data analysis, and data integration.
Data Acquisition:
• Data acquisition is primarily concerned with accessing and
storing data generated directly from biological
experiments.
• The data generated by various sequencing projects have to be
retrieved in the appropriate format and must be capable of being
linked to all the information related to the DNA samples.
• The data are organized in different databases so that
researchers can access existing information.
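As one illustration of retrieving sequence data "in the appropriate format", the sketch below parses FASTA text, a widely used plain-text sequence format. The choice of FASTA is an assumption of this example (the slides do not name a format), and the function name and record names are hypothetical.

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA-formatted text into a dict mapping record name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record name is the first word of the header line.
            name = (line[1:].split() or ["unnamed"])[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {record: "".join(parts) for record, parts in records.items()}


example = ">seq1 sample A\nATGG\nCCTG\n>seq2\nGGTAA\n"
sequences = parse_fasta(example)
print(sequences)
```

Keeping the record name as the dictionary key is what allows each sequence to be linked back to the other information recorded for the same sample.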
Tool and Database Development
• Many laboratories generate large volumes of data such as DNA
sequences, gene expression information, 3D molecular structures,
and high-throughput screening results.
• Consequently, they must develop effective databases for storing
and quickly accessing data. A further aim is to develop tools and
resources that aid in the analysis of data.
Data Analysis:
• The third aim is to use these tools to analyse the data and
interpret the results in a biologically meaningful manner. Efficient
analysis requires an efficiently designed database.
• It must allow researchers to pose their queries effectively and
provide them with all the information they need to begin their data
analysis.
• If queries cannot be performed, or if the performance is too
slow, the whole system breaks down, since scientists will not
be inclined to use the database.
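The point about storing data and answering queries quickly can be sketched with Python's built-in sqlite3 module. This is a minimal example under stated assumptions: the table layout, column names, and accession values are invented for illustration and do not describe any real sequence database.

```python
import sqlite3


def build_database(records):
    """Create an in-memory database of (accession, organism, sequence) records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences ("
        "accession TEXT PRIMARY KEY, organism TEXT NOT NULL, sequence TEXT NOT NULL)"
    )
    # An index on the queried column keeps lookups fast as the table grows.
    conn.execute("CREATE INDEX idx_organism ON sequences (organism)")
    conn.executemany("INSERT INTO sequences VALUES (?, ?, ?)", records)
    conn.commit()
    return conn


def sequences_for(conn, organism):
    """Return (accession, sequence) pairs for one organism, ordered by accession."""
    cursor = conn.execute(
        "SELECT accession, sequence FROM sequences WHERE organism = ? ORDER BY accession",
        (organism,),
    )
    return cursor.fetchall()


# Hypothetical records; the accession values are placeholders.
conn = build_database([
    ("X1", "Homo sapiens", "ATGG"),
    ("X2", "Plasmodium falciparum", "TTAA"),
    ("X3", "Homo sapiens", "GGCC"),
])
print(sequences_for(conn, "Homo sapiens"))
```

The query is parameterised (the `?` placeholder) rather than built by string concatenation, which keeps it safe when the organism name comes from user input.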
Data Integration:
• Once information has been analysed, a researcher must often
associate or integrate it with related data from other
databases.
• For example, a scientist may run a series of gene expression
analysis experiments and observe that a particular set of 100
genes is more highly expressed in cancerous lung tissue than in
normal lung tissue.
• The scientist may wonder which of the genes is most likely to
be truly related to the disease.
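One simple way to approach the question above is to cross-reference the highly expressed genes against other data sources and rank them by how many sources support a link to the disease. The Python sketch below illustrates that idea only. The gene names, source names, and the ranking rule are all assumptions of this example.

```python
def rank_candidates(overexpressed, sources):
    """Rank genes by the number of external sources that link them to the disease.

    sources maps a source name to the set of genes that source associates
    with the disease. Genes with no support are left out.
    """
    ranked = []
    for gene in sorted(overexpressed):
        hits = [name for name, genes in sources.items() if gene in genes]
        if hits:
            ranked.append((gene, hits))
    # Stable sort: genes with equal support stay in alphabetical order.
    ranked.sort(key=lambda item: -len(item[1]))
    return ranked


# Hypothetical gene set and annotation sources.
overexpressed = {"GENE1", "GENE2", "GENE3", "GENE4"}
sources = {
    "literature": {"GENE2", "GENE3"},
    "pathways": {"GENE3", "GENE9"},
}

for gene, hits in rank_candidates(overexpressed, sources):
    print(gene, hits)
```

Counting supporting sources is a crude heuristic. It narrows the list for follow-up work but does not by itself establish that a gene is related to the disease.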
Bioinformatics Applications
Molecular medicine:
• The sequencing of the human genome will have profound effects on
the fields of biomedical research and clinical medicine. Every
disease has a genetic component.
• This may be inherited or a result of the body's response to an
environmental stress which causes alterations in the genome
(e.g. cancers, heart disease, diabetes).
• The completion of the human genome means that we can
search for the genes directly associated with different diseases
and begin to understand the molecular basis of these diseases
more clearly.
• This new knowledge of the molecular mechanisms of disease
will enable better treatments, cures and even preventative tests
to be developed.
Personalised medicine:
• Clinical medicine will become more personalised with the
development of the field of pharmacogenomics.
• This is the study of how an individual's genetic inheritance
affects the body's response to drugs.
• At present, some drugs fail to make it to the market because a
small percentage of the clinical patient population show
adverse effects to a drug due to sequence variants in their
DNA.
• As a result, potentially life-saving drugs never make it to the
marketplace. Today, doctors have to use trial and error to find
the best drug to treat a particular patient, as those with the same
clinical symptoms can show a wide range of responses to the
same treatment.
Drug development :
• At present all drugs on the market target only about 500
proteins.
• With an improved understanding of disease mechanisms and
using computational tools to identify and validate new drug
targets, more specific medicines that act on the cause, not
merely the symptoms, of the disease can be developed.
• These highly specific drugs promise to have fewer side effects
than many of today's medicines.
Gene therapy :
• In the not too distant future, the potential for using genes
themselves to treat disease may become a reality.
• Gene therapy is the approach used to treat, cure, or even
prevent disease by changing the expression of a person's genes.
• Currently, this field is in its infancy, with clinical trials
for many different types of cancer and other diseases ongoing.
The reality of bioweapon creation:
• Scientists have recently built the poliomyelitis virus
(poliovirus) using entirely artificial means.
• They did this using genomic data available on the Internet and
materials from a mail-order chemical supply.
• The research was financed by the US Department of Defence
as part of a biowarfare response program to prove to the world
the reality of bioweapons. The researchers also hope their
work will discourage officials from ever relaxing programs of
immunisation. This project has been met with very mixed
feelings.
Antibiotic resistance :
• Scientists have been examining the genome of Enterococcus
faecalis, a leading cause of bacterial infection among hospital
patients.
• They have discovered a virulence region made up of a number
of antibiotic-resistance genes that may contribute to the
bacterium's transformation from a harmless gut bacterium to a
menacing invader.
• The discovery of the region, known as a pathogenicity island,
could provide useful markers for detecting pathogenic strains
and help to establish controls to prevent the spread of infection
in wards.

Introduction of bioinformatics

  • 2. Introduction • Bioinformatics is the science concerned with the development and application of computer hardware and software to the acquisition, storge,analysis, and visualization of biological information. • It has the following three component. - The development of new algorithms and statistics for assessing the relationship among large sets of biological data. e.g DNA Sequence data. - Application of these tools for the analysis and interpretation of the various biological data. e.g nucleotide sequences, amino acid sequences. - The development of database of database for an efficient storage, access and management of various biological informations. • The ‘bioinformatics’ is a combination of ‘biology’ and informatics. NEETHUASOKAN
  • 3. Definition • Bioinformatics derives knowledge from computer analysis of biological data. • These can consist of the information stored in the genetic code, but also experimental results from various sources, patient statistics, and scientific literature. • Research in bioinformatics includes method development for storage, retrieval, and analysis of the data. • Bioinformatics is a rapidly developing branch of biology and is highly interdisciplinary, using techniques and concepts from informatics, statistics, mathematics, chemistry, biochemistry, and physics. • It has many practical applications in different areas of biology and medicine. NEETHUASOKAN
  • 4. History of bioinformatics • The collection of amino acids sequences was complied in the ‘ Atlas of protein sequence and structure’ by the National Biomedical Foundation. • This collection was edited by margaret O.Dayhoff from 1965 to 1978. • Dayhoff and coworkers contributions to the comparison of amino acid sequences by developing computer software for detecting distantly related sequences. • The EMBL established their data library in 1980 to collect, organize and distribute nucleotide sequence data and related information. • NCBI was established in U.S.A. NCBI serves as primary information databank and provider of information. • The National Biomedical Research Foundation established the PIR in 1984. NEETHUASOKAN
  • 5. DNASequences • The symbols used to represent DNA sequence data. • The four bases are denoted by single letters A (Adenine), C (cytosine ), G (guanine), and T (Thymine) • But often sequence data contain ambiguities in that it is not clear as to which of the four base present at several positions. • For example , the sequence data may indicate that the base present at a specific position may be either G or A, it is purine. • Similarly , if a position may have either C or T, it is pyrimidine. • The base sequence of the two complementary strands of a DNA molecules are represented by this system of symbols. NEETHUASOKAN
  • 6. AminoAcid Sequences of Proteins • The amino acids were conventionally represented by three- letters symbols..e.g. Ala for alanine, Val for valine, etc. • But in Bioinformatics, they are denoted by single letter, e.g A for alanine C for cyctine, D for aspartics acid, etc. • But some position in protein sequences have ambiguities this situation is comparable to that for DNA sequences. • For e.g , it may not be clear that a position has glutamine or glutamic acid , the position is given the symbol Z. • The Protein synthesis begin at the N-terminus and proceeds to the C-terminus. • The amino acid Sequences in databases are listed from the N- terminus to the C-terminus of the polypeptide. NEETHUASOKAN
  • 7. Conti... Single letter code Amino acid Three letter Code A Alanine Ala B Asparagine Asx C Cystine Cys D Aspartic acid Asp E Glutamic Acid Glu F Phenylanine Phe G Glcine Gly H Histidine His I Isoleucine Ile K Lysine Lys L Leucine Leu M Methionine Met NEETHUASOKAN
  • 8. Conti..... Single letter code Amino acid Three letter Code N Asparagine Asn P Proline Pro Q Glutamine Glu R Arginine Arg S Serine Ser T Threonine Thr V Valine Val W Tryptophan Trp Y Tyrosine Tyr Z Glutamic acid Glx X Any amino acid Xaa NEETHUASOKAN
  • 9. Types of Sequences in Nucleotide Sequence Databases • The databases on DNA sequences contain a different types. cDNA sequences : • A cDNA molecule is obtained by reverse transcription of an RNA molecule. • The cDNA sequences, therefore represent that part of the genome that is transcribed into RNA. • If the cDNA is obtained from mRNA, it will represent only the exon sequences of the gene expressed in the concerned cell / tissue/organisms. Genomic DNA sequences : • These sequences represent the complete genome of the organisms. • When the genome sequences is completed, it will contain the sequences of the entire genome of the organisms. NEETHUASOKAN
  • 10. Cont... • In Case of prokaryotes, genome consists of usually, a single chromosome, while in case of eukaryotes it relates to the nuclear DNA Expressed Sequence Tag (EST) sequences : • The sequences are obtained by sequencing only a part of the cDNA molecules produced using mRNA. • These sequences are dubbed as ‘tags’ because they can be used as probes for the isolation of the concerned genes from the genomic DNA. • This approach was used by J. Craig Venter and his group for obtaining the sequence of expressed portion of human genome. • The EST technique generated enormous sequence data that permitted the construction of a preliminary transcript map of the human genome. NEETHUASOKAN
  • 11. Conti... Genome Sequence Tag (GST) Sequences: • GSTs were developed for identifying the genes of Plasmodium falciparum. • It was observed that the enzyme mung bean nuclease (Mnase) cuts P.falciparum genomic DNA between genes. • GSTs are developed by sequencing the DNA Fragments on either side of the points of cuts generated by Mnase. Organellar DNA Sequences: • Organellar DNA is the DNA found in mitochondria (mtDNA ) and chloroplasts (cpDNA). • The sequence of the data are complied in databases. NEETHUASOKAN
  • 12. Branches of Bioinformatics • A living cell is a system where cellular components such as genome, the gene transcript, and the proteins interact with each other, and these interactions determine the fact of the cell. e.g Whether a stem cell is going to become a liver cell or a cancer cell. The three branches of bioinformatcs... 1. Genomics 2. Transcriptomics 3. Poteomics NEETHUASOKAN
  • 13. Conti.... Genomics Makes Trancriptomics Makes Proteomics The three major branches of Bioinformatics DNA RNA Protein NEETHUASOKAN
  • 14. Genomics : • Genomics play a significant role in modern biological research in which the nucleotide sequences of ali the chromosomes of an organism are mapped and the location of different genes and their sequence are determined. • This involves extensive analysis of the nucleic acids through molecular biology techniques before the data are ready for processing by Computer. • It is a science that attempts to describe a living organisms in terms of the sequence of its genome. • It Was not reliable to estimate the number of genes in an organism based on the number of nucleotide base pairs because of the presence of high numbers of redundant copies of many genes. • Genomics has helped to rectify this problem. NEETHUASOKAN
  • 15. Conti... • Genomics uses technique of molecular biology and bioinformatics to identify cellular components such as proteins, rRNA, tRAN,etc and analyse the sequences attributed to the structural genes regulatory sequences, and non-coding sequence. • The first automatic DNA sequencer was developed in 1986 by Leroy Hood. • Haemophilus influenzae was the first bacterium to be sequenced in 1995. • Even if one can identify all the genes on a genome , the genes only indicate that, at some point in time, it might be transcribed to produce cellular componts. • eg. A human genome contains about 30,000 to 60,000 protein coding genes, but only a subset of them is expressed in a particular cell type at a particular time. NEETHUASOKAN
  • 16. Transcriptomics • Transcriptomics is the study of the transcriptome, which includes the whole set of mRAN molecules in one or a population of biological cells. • This study helps us to depict the expression level of genes, often using techaniques such as DNA microarrys, that is capable of sampling ten thousands of different mRNAs at a time. • This kind of new technique has helped biologist to routinely monitor the gene expression between the control cells and treatment cells. • Transcriptomics has a few limitations • The relative abundance of transcripts as characterized by the sequential analysis of gene expression (SAGE) or microarry experiments. NEETHUASOKAN
  • 17. Conti.... 1. Differential adaptation to the translational machinery. 2. Differential usage of amino acid of different abundances. 3. The lack of information on post-translation modification of amino acid residues although post-transcriptional modification such as acetylation , hydroxylation, glycosylation, phosphorylation, and cleavage are fundamental in understanding the interaction of cellular components. Proteomics : • Proteomics represents the earliest to identify a major sub- class of cellular components, the proteins and their interactions. • Proteomics involves the sequencing of amino acid in a protein determining its 3D structure and relating it to the function of the protein. NEETHUASOKAN
  • 18. Cont... • Before computer processing comes into the picture, extensive data, particularly through crystallography and nuclear magnetic resonance (NMR). • With such data known as proteins, the structure and its relationship to the function of newly discovered proteins. • In such areas, bioinformatics has enormous analytical and predictive potential. • Metabolic proteins such as haemoglobin and insulin have been subjected to intensive proteomic investigation. • The term ’proteomics’ was coined to make an analogy with genomics. • Scientists feel that the bioinformatics of proteins is crucial , to understands the cellular components and the interactions completely. NEETHUASOKAN
  • 19. Aims of Bioinformatics • Bioinformatics can be used in several important ways. • The aim of bioinformatics is fourfold and includes data acquisition, tool and database development, data analysis, and data integration. Data Acquisition: • Data acquisition is primarily concerned with accessing and storing data generated directly from biological experiments. • The data generated by various sequencing projects have to be retrieved in the appropriate format, and must be capable of being linked to all the information related to the DNA samples. • The data are organized in different databases so that researchers can access existing information.
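As a minimal sketch of what retrieving sequence data "in the appropriate format" can look like, the Python snippet below parses FASTA-formatted text into a mapping from record ID to sequence. FASTA is an assumption here (the slides do not name a format), and the record IDs and sequences are invented for illustration only.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {record_id: sequence} mapping."""
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: the ID is the first whitespace-delimited token.
            tokens = line[1:].split()
            if not tokens:
                raise ValueError("FASTA header without an identifier")
            current_id = tokens[0]
            records[current_id] = ""
        elif current_id is not None:
            # Sequence line: append to the current record, normalised to upper case.
            records[current_id] += line.upper()
    return records


# Hypothetical example data (IDs and sequences are made up).
EXAMPLE_FASTA = """\
>sample1 hypothetical DNA fragment
atgcgtac
ggta
>sample2 another hypothetical fragment
TTGACA
"""

sequences = parse_fasta(EXAMPLE_FASTA.splitlines())
for record_id, seq in sequences.items():
    print(record_id, len(seq), seq)
```

Keeping the record ID as the dictionary key is what lets each sequence be linked later to other information about the same DNA sample.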
  • 20. Tool and Database Development • Many laboratories generate large volumes of data such as DNA sequences, gene expression information, 3D molecular structures, and high-throughput screening results. • Consequently, they must develop effective databases for storing and quickly accessing data. The other aim is to develop tools and resources that aid in the analysis of data. Data Analysis: • The third aim is to use these tools to analyse the data and interpret the results in a biologically meaningful manner. Efficient analysis requires an efficiently designed database. • It must allow researchers to pose their queries effectively and provide them with all the information they need to begin their data analysis.
  • 21. Cont. • If queries cannot be performed, or if the performance is too slow, the whole system breaks down, since scientists will not be inclined to use the database. Data Integration: • Once information has been analysed, a researcher must often associate or integrate it with related data from other databases. • For example, a scientist may run a series of gene expression analysis experiments and observe that a particular set of 100 genes is more highly expressed in cancerous lung tissue than in normal lung tissue. • The scientist may wonder which of the genes is most likely to be truly related to the disease.
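The lung-tissue scenario can be sketched in Python under some loud assumptions: expression is summarised as one mean value per gene, "more highly expressed" is taken to mean a log2 fold change above a chosen threshold, and the second database is stood in for by a small dictionary. All gene names, values, the threshold, and the annotations are hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

# Hypothetical mean expression values (arbitrary units) per gene.
normal: Dict[str, float] = {"GENE_A": 10.0, "GENE_B": 8.0, "GENE_C": 5.0, "GENE_D": 4.0}
tumour: Dict[str, float] = {"GENE_A": 40.0, "GENE_B": 8.0, "GENE_C": 20.0, "GENE_D": 2.0}

# Hypothetical annotations standing in for a second, external database.
disease_annotations: Dict[str, str] = {
    "GENE_A": "reported in lung cancer studies",
    "GENE_D": "reported in lung cancer studies",
}


def log2_fold_change(treated: float, control: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of treated to control; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def upregulated_genes(
    treated: Dict[str, float], control: Dict[str, float], threshold: float = 1.0
) -> List[Tuple[str, float]]:
    """Return (gene, log2 fold change) pairs above the threshold, largest first."""
    hits = []
    for gene in treated.keys() & control.keys():  # only genes measured in both
        lfc = log2_fold_change(treated[gene], control[gene])
        if lfc > threshold:
            hits.append((gene, lfc))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


def integrate(
    hits: List[Tuple[str, float]], annotations: Dict[str, str]
) -> List[Tuple[str, float, Optional[str]]]:
    """Attach an annotation (or None) from the second data source to each hit."""
    return [(gene, lfc, annotations.get(gene)) for gene, lfc in hits]


hits = upregulated_genes(tumour, normal)
integrated = integrate(hits, disease_annotations)
for gene, lfc, note in integrated:
    print(f"{gene}\tlog2FC={lfc:.2f}\t{note or 'no annotation found'}")
```

Genes that are both upregulated and annotated in the second source are the natural first candidates for follow-up; a real analysis would also need replicates and a statistical test, which this sketch omits.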
  • 22. Bioinformatics Applications Molecular medicine: • The human genome will have profound effects on the fields of biomedical research and clinical medicine. Every disease has a genetic component. • This may be inherited or a result of the body's response to an environmental stress that causes alterations in the genome (e.g. cancers, heart disease, diabetes). • The completion of the human genome means that we can search for the genes directly associated with different diseases and begin to understand the molecular basis of these diseases more clearly. • This new knowledge of the molecular mechanisms of disease will enable better treatments, cures and even preventative tests to be developed.
  • 23. Cont. Personalised medicine: • Clinical medicine will become more personalised with the development of the field of pharmacogenomics. This is the study of how an individual's genetic inheritance affects the body's response to drugs. • At present, some drugs fail to make it to the market because a small percentage of the clinical patient population show adverse effects to a drug due to sequence variants in their DNA. • As a result, potentially life-saving drugs never make it to the marketplace. Today, doctors have to use trial and error to find the best drug to treat a particular patient, as those with the same clinical symptoms can show a wide range of responses to the same treatment.
  • 24. Cont. Drug development: • At present all drugs on the market target only about 500 proteins. • With an improved understanding of disease mechanisms and using computational tools to identify and validate new drug targets, more specific medicines that act on the cause, not merely the symptoms, of the disease can be developed. • These highly specific drugs promise to have fewer side effects than many of today's medicines.
  • 25. Cont. Gene therapy: • In the not too distant future, the potential for using genes themselves to treat disease may become a reality. • Gene therapy is the approach used to treat, cure or even prevent disease by changing the expression of a person's genes. • Currently, this field is in its infancy, with clinical trials ongoing for many different types of cancer and other diseases.
  • 26. Cont. The reality of bioweapon creation: • Scientists have recently built the poliovirus, which causes poliomyelitis, using entirely artificial means. • They did this using genomic data available on the Internet and materials from a mail-order chemical supplier. • The research was financed by the US Department of Defense as part of a biowarfare response program to prove to the world the reality of bioweapons. The researchers also hope their work will discourage officials from ever relaxing programs of immunisation. This project has been met with very mixed feelings.
  • 27. Cont. Antibiotic resistance: • Scientists have been examining the genome of Enterococcus faecalis, a leading cause of bacterial infection among hospital patients. • They have discovered a virulence region made up of a number of antibiotic-resistance genes that may contribute to the bacterium's transformation from a harmless gut bacterium to a menacing invader. • The discovery of the region, known as a pathogenicity island, could provide useful markers for detecting pathogenic strains and help to establish controls to prevent the spread of infection in wards.