Genomics and Bioinformatics


          Peter Gregory and Senthil Natesan
Genomics
   ▪ Genomics is the study of the genomes (i.e. the entire
     hereditary information) of organisms and includes:
     • Determining the entire DNA sequence
     • Fine-scale genetic mapping
     • Studies of intragenomic phenomena
   ▪ Used to determine an ideal genotype instead of just a
     few genes.
   ▪ The study of whole genomes of populations of individuals
     can reveal the genetic basis of different responses to both
     biotic and abiotic stresses
   ▪ Requires a large amount of information per individual:
     • Expensive in agriculture, where many individuals
       need to be analyzed
Benefits of Genomics to Crop Improvement
   ▪ Unlimited possibilities for crop improvement,
     especially in combination with genetic
     engineering:
     • Improved crop productivity
     • Increased nutritional quality and quantity
     • Tolerance to abiotic stresses: drought, low
       quality soils (acidity, low nutrient content)
     • Tolerance to biotic stresses: pests and diseases
     • Etc.
Proteomics
   ▪ The study of the proteins
     present in a cell (at a specific
     time and under specific conditions)
   ▪ Proteomics includes:
     • Identification of all proteins
       in a cell
     • Posttranslational
       modifications of proteins
     • Protein-protein interactions
     • Subcellular location of
       proteins
Bioinformatics
   ▪ A term describing the tools used to
     handle the enormous
     amounts of data coming from
     genomics and proteomics
     programs
   ▪ Makes possible the
     investigation of correlations
     that could not be found
     manually
   ▪ The algorithms used to analyze
     the data are still at an
     experimental stage, and there
     are still questions about whether
     experiments done in silico
     are relevant in the
     biological world
Subfields of Genomics
   ▪ Structural genomics:
     • Construction of genomic sequence data
     • Gene discovery and localization
     • Construction of gene maps
   ▪ Functional genomics:
     • Biological function of genes
     • Regulation
     • Products
     • Plant development studies
   ▪ Comparative genomics:
     • Compares gene sequences to elucidate
       functional or evolutionary relationships
Structural Genomics

   ▪ Uses DNA sequencing technology and
     software programs to generate, store, and
     analyze genomic sequence information
   ▪ Two approaches to genome sequencing:
     • Map-based sequencing
     • Shotgun sequencing
Map-Based Sequencing
Shotgun Sequencing
       •   Multiple copies of the genome are randomly
           sheared into pieces about 2,000 bp long by
           squeezing the DNA through a pressurized syringe.
           This is done a second time to generate pieces
           that are about 10,000 bp long
       •   Each 2,000 bp and 10,000 bp fragment is
           inserted into a plasmid
            – The two collections of plasmids containing
              2,000 bp and 10,000 bp chunks of DNA are
              plasmid libraries
       •   Both the 2,000 bp and the 10,000 bp plasmid
           libraries are sequenced. About 500 bp from each
           end of each fragment are decoded, generating
           millions of sequences. Sequencing both ends of
           each insert is critical for assembling the
           entire chromosome
       •   Computer algorithms assemble the millions
           of sequenced fragments into a continuous
           stretch corresponding to each chromosome
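The assembly step can be illustrated with a toy example. The sketch below is a minimal greedy overlap merger, not the algorithm used by any real assembler: it assumes error-free reads from one strand and a sequence without repeats, and it ignores the paired-end information described above. The function names `overlap` and `greedy_assemble` and the example sequence are made up for illustration.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if the longest such overlap is shorter than `min_len`.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences with the longest overlap.

    Stops when no pair overlaps by at least `min_len` bases and
    returns the remaining contigs.
    """
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs
        # Join the pair, writing the shared bases only once.
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    # Hypothetical 25 bp "genome", cut into overlapping 10 bp reads.
    genome = "ATGGCGTACGTTAGCCTAGGATCCA"
    reads = [genome[i:i + 10] for i in range(0, len(genome) - 9, 3)]
    print(greedy_assemble(reads, min_len=5))
```

Real genomes contain repeats and sequencing errors, which is why reads from both ends of 2,000 bp and 10,000 bp inserts are needed to order and orient the assembled pieces.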
Finding the Genes

   ▪ After sequencing, the genes need to be found
     using computer algorithms;
     this step is called ‘annotation’
   ▪ Annotation identifies:
     • Protein-coding genes
     • Initiation sequences
     • Regulatory sequences
     • Termination sequences
     • Nonprotein-coding sequences
Finding the Genes, Cont’d
   ▪ The identifying features of protein-coding
     genes are open reading frames (ORFs):
     • Continuous sets of DNA nucleotide triplets
       that can be translated into the amino acid
       sequence of a protein
     • ORFs begin with an initiation sequence,
       usually ATG
     • ORFs end with a termination sequence,
       usually TAA, TAG or TGA
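The ORF definition above translates directly into a scan for ATG followed, in the same reading frame, by a stop triplet. This is a minimal sketch under simplifying assumptions: it scans only the three forward frames of one strand, ignores introns, and uses a hypothetical helper name `find_orfs` with an arbitrary `min_codons` cutoff. Real annotation software does considerably more.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int, str]]:
    """Return (start, end, sequence) for each ORF on the forward strand.

    `start` is the 0-based index of the A in ATG, `end` is one past the
    last base of the stop triplet, and `min_codons` counts the start and
    stop triplets as well.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOPS:
                end = i + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, dna[start:end]))
                start = None  # look for the next ATG in this frame
    return sorted(orfs)


if __name__ == "__main__":
    # Hypothetical sequence with one ORF in the third reading frame.
    print(find_orfs("GGATGAAACCCTAGTT"))  # [(2, 14, 'ATGAAACCCTAG')]
```

A complete search would also scan the reverse complement, since genes can lie on either strand.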
Introns and Exons
Analysis of DNA Sequence Information I: Location of Genes Not Apparent
Analysis of DNA Sequence Information II: Location of Regulatory Sequences and ORFs
Gene Function?
   ▪ After the genome sequence is annotated, functions
     need to be assigned to all genes in the sequence
     • Some of the identified genes might already have functions
       assigned via classical methods of
       mutagenesis and linkage mapping
     • Some may not have assigned functions; for these, use
       homology searches:
          ➢ Computer-based comparisons of the sequence under
            study with known sequences from other organisms
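A homology search boils down to scoring the query sequence against each known sequence and reporting the best match. The sketch below uses a plain Smith-Waterman local alignment score as a stand-in; production tools such as BLAST use faster heuristics and statistical significance estimates. The scoring values, the function names and the two database entries (`geneA`, `geneB`) are illustrative assumptions, not taken from the slides.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Best Smith-Waterman local alignment score between `a` and `b`."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def best_hit(query: str, database: dict[str, str]) -> tuple[str, int]:
    """Return the name and score of the database sequence most similar to `query`."""
    scores = {name: local_alignment_score(query, seq)
              for name, seq in database.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


if __name__ == "__main__":
    # Hypothetical database of sequences with known function.
    known = {"geneA": "TTTACGTACGTTT", "geneB": "GGGGCCCC"}
    print(best_hit("ACGTACGT", known))  # ('geneA', 16)
```

A high-scoring hit suggests, but does not prove, that the unknown gene shares a function with the known one.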
Genome Size and Gene Number in Selected Eukaryotes
Unique Features of Eukaryotic Genomes

   ▪ Gene density
     • Wide range compared to prokaryotes
   ▪ Introns
     • Wide variation among eukaryotes
   ▪ Repetitive sequences
     • Along with the presence of introns, repetitive
       sequences are responsible for the wide range
       of genome sizes in eukaryotes
          ➢ In maize, two thirds of the genome is
            repetitive DNA
Plant Model Organisms
          • Arabidopsis thaliana
             --Model flowering plant and dicot
               --Sequence published in 2000
               --First flowering plant to be sequenced

          • Oryza sativa (rice)
              --Model monocot
               --Sequence finished in 2005
               --First crop plant to be sequenced
          • Medicago truncatula (barrel medic)
             --Model legume
          • Lycopersicon esculentum (tomato)
              --Model fruit-bearing plant


Note: Hundreds of other genomes (plant, animal, bacterial and viral)
have been, or are being, sequenced
Arabidopsis Sequencing Facts
• Arabidopsis has a small genome (125 Mb) on 5 chromosomes
    - Human has 3,000 Mb on 23 chromosomes
    - Maize has 2,500 Mb on 10 chromosomes
    - Medicago has 520 Mb on 8 chromosomes
    - Rice has 430 Mb on 12 chromosomes
    - Lily has 50,000 Mb on 12 chromosomes

• Arabidopsis has approx. 25,500 genes
    - Humans have slightly fewer, about 24,000
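These figures make the difference in gene density concrete. The short calculation below uses only the numbers quoted on this slide; the dictionary layout and function names are illustrative choices.

```python
# Genome sizes (Mb) as quoted on the slide.
GENOME_SIZE_MB = {
    "Arabidopsis": 125,
    "Rice": 430,
    "Medicago": 520,
    "Maize": 2_500,
    "Human": 3_000,
    "Lily": 50_000,
}

# Approximate gene counts, given on the slide for two species only.
GENE_COUNT = {"Arabidopsis": 25_500, "Human": 24_000}


def relative_size(species: str, reference: str = "Arabidopsis") -> float:
    """Genome size of `species` as a multiple of the reference genome."""
    return GENOME_SIZE_MB[species] / GENOME_SIZE_MB[reference]


def gene_density(species: str) -> float:
    """Average number of genes per Mb of genome."""
    return GENE_COUNT[species] / GENOME_SIZE_MB[species]


if __name__ == "__main__":
    for name in GENOME_SIZE_MB:
        print(f"{name}: {relative_size(name):g}x the Arabidopsis genome")
    for name in GENE_COUNT:
        print(f"{name}: about {gene_density(name):g} genes per Mb")
```

With these numbers, Arabidopsis averages about 204 genes per Mb against about 8 per Mb in human, although the human genome is 24 times larger.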
Arabidopsis Genome
Comparative Genomics
   ▪ The study of the relationship of genome structure
     and function across different biological species or
     strains:
     • Holds great promise to yield insights into many
       aspects of the evolution of modern species
          ➢ Enormous potential for crop genetics and breeding
     • The vast amount of information contained in
       modern genomes means that the methods of
       comparative genomics must be automated
     • Having come a long way from its initial use of
       finding functional proteins, comparative genomics
       is now concentrating on finding regulatory regions
       and other features of the genome


Editor's Notes

  • #3: Bullet 1: The genome of an organism is the complete genetic sequence of one full set of chromosomes, as found in a gamete. Intragenomic phenomena include heterosis, epistasis, pleiotropy and other interactions between loci and alleles within the genome. In contrast, the investigation of the roles and functions of single genes is a primary focus of molecular biology or genetics. Research on single genes does not fall under the definition of genomics unless the aim of the study is to elucidate the gene's effect on, place in, and response to the entire genome. Heterosis, or hybrid vigor: increased strength of different characteristics in hybrids. Epistasis: the interaction between genes, in which the effects of one gene are modified by one or several other genes, sometimes called modifier genes. Pleiotropy: the effect of a single gene on multiple phenotypic traits. Bullet 3: Goes beyond the single-gene effects that are currently being studied.
  • #5: An outgrowth of genomics. Analyzing the functional aspects of genes (i.e. the proteins and their activation) provides information about how genes operate to affect traits. This information enables rational design of modifications to genes and genomes rather than simply identifying ‘good’ or ‘bad’ genes. The level of complexity is very high, and currently the protein spectrum of very few tissues (mostly human) has been analyzed in any detail. Metabolomics: the study of the metabolome, the complete set of metabolites in an organism.
  • #6: Bullet 1: Not a technology per se, but already a vital tool for handling data. Much of the software was developed during the Human Genome Project in the 1990s. Bullet 3: An algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function.
  • #9: A genetic map is a chromosome map of a species or experimental population that shows the position of its known genes and/or markers relative to each other, rather than as specific physical points on each chromosome. Map-based sequencing begins with the construction of a genomic library using vectors that can accommodate large fragments of an organism’s genome. A genomic library is a population of host bacteria, each of which carries a DNA molecule that was inserted into a cloning vector, such that the collection of cloned DNA molecules represents the entire genome of the source organism. Next, the clones are assembled into genetic maps. Genetic maps are based on the frequencies of recombination between markers during crossover of homologous chromosomes. The greater the frequency of recombination (segregation) between two genetic markers, the farther apart they are assumed to be.
  • #10: A physical map is based on direct analysis of DNA rather than recombination frequency, which increases map resolution. See figure: the physical map is based on a set of overlapping ordered clones (contigs). A contig (from ‘contiguous’) is a set of overlapping DNA segments derived from a single genetic source. Each clone is sequenced individually. Resolution increases from start to finish.
  • #11: The shotgun sequencing method goes straight to the job of decoding, bypassing the need for a physical map. It is much faster than map-based sequencing. It relies on a combination of sequencing technology and software development. Clonal selection is random. Last step: the computer facilitates identification of sequence overlaps to assemble the total sequence.
  • #14: But the organization of genes in eukaryotes makes direct searching for ORFs more difficult than in prokaryotes. Eukaryotic genes have introns (non-coding regions) between coding regions, so most eukaryotic genes comprise ORFs (exons) interspersed with introns. Bottom line: it is important to distinguish between introns, exons, and genes.
  • #15: Shows a portion of the human genome sequence. Initially it is not clear whether it contains any genes, but control regions at the beginning of genes are marked by identifiable sequences. Splice sites between exons and introns have a predictable sequence: most introns begin with GT and end with AG. The end of the gene has a poly-A tail. If a DNA sequence encodes a protein, the sequence after splicing contains one or more ORFs. In molecular biology, splicing is a modification of an RNA after transcription, in which introns are removed and exons are joined. This is needed for the typical eukaryotic messenger RNA before it can be used to produce a correct protein through translation. For many eukaryotic introns, splicing is done in a series of reactions catalyzed by the spliceosome, a complex of small nuclear ribonucleoproteins (snRNPs), but there are also self-splicing introns.
  • #16: Analysis of the sequence shows that it contains a control region and three exons. Using this sequence to search genomic databases showed that it is the sequence of a single gene, the human beta-globin gene. Most introns begin with GT and end with AG. The end of the gene has a poly-A tail.
  • #18: Now consider some features of eukaryotic genomes. Basic features are similar, but genome size is highly variable: there is a 10,000-fold range between fungi and flowering plants. The number of genes varies much less.
  • #21: The fruit fly of the plant world: small, with a short generation time and a small genome on 5 chromosomes. Map-based sequencing was used.
  • #22: Assignment of genes is based on homology searches. Crop plants have much larger genomes than Arabidopsis, but some have about the same number of genes. In the large-genome plants, genes are clustered in stretches of DNA separated by long stretches of spacer DNA. Sequencing of the rice genome was completed in 2005. 90% of Arabidopsis genes are found in rice, but only 70% of rice genes are found in Arabidopsis. This indicates that cereal crops may have unique gene sets. Identification of these unique genes is critical to realizing the full potential of genomics (and genetic engineering) in helping to feed the Earth’s growing population.