Scientific Applications of Machine Learning
Eric Mjolsness
Scientific Inference Systems Laboratory
Donald Bren School of Information and Computer Sciences, and Institute for Genomics and Bioinformatics
University of California, Irvine
Scientific Imagery Applications
  • NGC 7331 - http://photojournal.jpl.nasa.gov/catalog/PIA06322
  • Arabidopsis SAM - Meyerowitz Lab
Some Basic Machine Learning Distinctions
  • Supervised vs. unsupervised learning
    • Supervised, e.g. classification and regression
    • Feature selection; regression for phenomenological model fitting, e.g. GRNs
    • Unsupervised, e.g. clustering; may be a preprocessor
  • Generative vs. kernel methods
    • Generative (statistical inference) models
    • Kernel methods, e.g. Support Vector Machines
  • Vector vs. relationship data
    • Vector data: preprocessed image features log I, x, …
    • Images, time series, shifted spectra - semigroup actions
    • Sparse graph/relationship data - permutation actions
Correspondence Problems
  • Extended sources - map morphologies
    • Similar to biological imaging problems
    • Fewer sources but many pixels
  • Moving or changing point sources
    • E.g. Ida and Dactyl / JPL MLS
  • Dense point sources with instrument noise
    • E.g. globular clusters (radial density function)
  • Techniques: soft permutations, geometric transformations via optimization & continuation
    • Embedding inside a graph clustering (optimization) algorithm
    • Multiscale acceleration of optimization
Mixture Models
  • Mixture of Gaussians, t-distributions, …
    • Can do outlier detection
  • Mixture of factor analyzers
  • Mixture of time series models *
  • Problem-specific generative models
    • Can formulate with a Stochastic Parameterized Grammar
  • Clustering graphs
  • Frey et al. 1998
  • Utsugi and Kumagai 2000
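As a minimal sketch of the first item, the code below fits a one-dimensional mixture of Gaussians with the standard EM algorithm and then flags a point as an outlier when its fitted density is low. The data, the starting means, the variance floor, and the density threshold are all assumptions made for illustration, not values from the talk.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fit_mixture(data, means, iterations=50, min_var=1e-3):
    """Fit a 1-D Gaussian mixture by EM, starting from the given means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            p = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([value / total for value in p])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(min_var, var)  # floor avoids a collapsing component
            weights[j] = nj / len(data)
    return weights, means, variances

def density(x, weights, means, variances):
    """Mixture density at x."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))

def is_outlier(x, model, threshold=1e-3):
    """Flag x when the fitted mixture assigns it very low density (assumed threshold)."""
    return density(x, *model) < threshold

# Hypothetical data: two well-separated groups.
data = [-0.3, -0.1, 0.0, 0.2, 0.4, 4.7, 4.9, 5.0, 5.2, 5.3]
model = fit_mixture(data, means=[1.0, 4.0])
```

With this model, `is_outlier(12.0, model)` tests a point far from both groups, while `is_outlier(0.1, model)` tests a point inside the first group.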
Stochastic Grammars for Data Modeling
Text & Biology Models
More Detailed Clustering Grammars
  • Clusters generate data
  • Priors on cluster centers & variances
  • Iterative through levels in a hierarchy
  • Recursive through hierarchy
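The recursive reading of a clustering grammar can be sketched as a generative sampler: a cluster node either expands into child clusters, whose centers and spreads are drawn from priors, or emits data points. This is an illustrative sketch only and is not the stochastic parameterized grammar notation used in the talk. The Gaussian prior on centers, the uniform shrink factor on spreads, and all parameter values are assumptions.

```python
import random

def generate_cluster(rng, center, spread, depth, branching=2, points_per_leaf=3):
    """Recursively expand one cluster node into child clusters or data points."""
    if depth == 0:
        # Terminal rule: a leaf cluster emits data points around its center.
        return [rng.gauss(center, spread) for _ in range(points_per_leaf)]
    points = []
    for _ in range(branching):
        # Assumed prior on child centers: Gaussian around the parent center.
        child_center = rng.gauss(center, spread)
        # Assumed prior on child spread: a random fraction of the parent spread.
        child_spread = spread * rng.uniform(0.1, 0.3)
        points.extend(
            generate_cluster(rng, child_center, child_spread, depth - 1,
                             branching, points_per_leaf)
        )
    return points

# A two-level hierarchy: 2 clusters, each with 2 subclusters of 3 points.
rng = random.Random(0)
data = generate_cluster(rng, center=0.0, spread=10.0, depth=2)
```

Inference would run this process in reverse, recovering the hidden cluster tree from the data; the sampler only shows the forward, generative direction.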
Rock Field Grammar grammar rockfield() {
Transcriptional Gene Regulation Networks
  • Gene Regulation Network (GRN) model
  • [Figure labels: T, v, extracellular communication]
  • Drosophila eve stripe expression in model (right) and data (left). Green: eve expression, red: kni expression. From [Reinitz and Sharp, Mech. of Devel., 49:133-158, 1995].
  • [Mjolsness et al. J. Theor. Biol. 152: 429-453, 1991]
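A rough sketch of how such a GRN model can be simulated is given below. It assumes the commonly quoted single-nucleus form dv_i/dt = R g(Σ_j T_ij v_j + h_i) − λ v_i, where v holds gene product concentrations and T is the matrix of regulatory connection strengths; this form, the sigmoid g, and the omission of any diffusion or extracellular communication term are assumptions rather than details stated on the slide. The two-gene circuit and every parameter value are hypothetical.

```python
import math

def g(u):
    """Sigmoid squashing function mapping any real input into (0, 1)."""
    return 0.5 * (1.0 + u / math.sqrt(1.0 + u * u))

def simulate_grn(T, h, v0, rate=1.0, decay=1.0, dt=0.01, steps=2000):
    """Forward-Euler integration of dv_i/dt = rate*g(sum_j T[i][j]*v[j] + h[i]) - decay*v[i]."""
    v = list(v0)
    n = len(v)
    for _ in range(steps):
        # Net regulatory input to each gene.
        u = [sum(T[i][j] * v[j] for j in range(n)) + h[i] for i in range(n)]
        # Synthesis minus first-order decay.
        v = [v[i] + dt * (rate * g(u[i]) - decay * v[i]) for i in range(n)]
    return v

# Hypothetical mutual-repression circuit: each gene represses the other.
T = [[0.0, -5.0],
     [-5.0, 0.0]]
h = [2.0, 2.0]

# Gene 0 starts high and gene 1 starts low.
final = simulate_grn(T, h, v0=[1.0, 0.1])
```

Fitting such a model to expression data means adjusting T, h, and the rate constants until simulated concentrations match the measured ones, which is the regression setting mentioned earlier for GRNs.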
Gene Regulation + Signal Transduction Network
  • [Figure labels: T, transcriptional regulation, targets, receptors, ligands, cell nucleus]
Software architectures for  systems biology: Sigmoid & Cellerator
3-tier architecture
  • [Diagram labels: Database Access; Model Translation; Sigmoid Pathway Representation/Storage Database; Cellerator Simulation/Inference Engine; Menu; Interactive Graphic Model (SVG/Applet); Graphic Output; OJB API; JLink API; PROPERTY; SOAP: Web Service; XML(Object), Image, via HTTP]
Possible software support
  • Machine learning (open source/academic)
    • CompClust (CIT/JPL): Scripting/GUI dichotomy; data point; dataset views
    • WEKA data mining
    • Intel: PNL Probabilistic Networks Library
    • Future: stochastic grammar modeler + autogeneration (as in Cellerator)
  • Image processing, data environments
    • Matlab, IDL, Mathematica, Khoros/VisiQuest, …
    • NIHImage/ImageJ, …
Metadata in Systems Biology SBML Sigmoid UML
Fletcher et al., Science v. 283, 1999; Brand et al., Science 289, 617-619 (2000). WUS
SAM gene network: Results protein concentrations  Y wus (init)  and  L1 X
SAM: Gene Network Model Z wus clv1 clv3 X diffusive Y L1 diffusive
SAM growth imagery PIN1 cell walls
Venu Gonehal
Basic Machine Learning Distinctions
  • Supervised vs. unsupervised learning
    • Supervised, e.g. classification and regression
    • Feature selection; regression for phenomenological model fitting, e.g. GRNs
    • Unsupervised, e.g. clustering; may be a preprocessor
  • Generative vs. kernel methods
    • Generative (statistical inference) models
    • Kernel methods, e.g. Support Vector Machines
  • Vector vs. relationship data
    • Vector data: preprocessed image features log I, x, …
    • Images, time series, shifted spectra - semigroup actions
    • Sparse graph/relationship data - permutation actions
Contacts
  • Wayne Hayes, UCI ICS faculty, scientific computing
  • UCI ICS Machine Learning: Padhraic Smyth, Pierre Baldi
  • Chris Hart, Caltech Biology grad student
