Computational Classification
Techniques for Neuroimaging
A Machine Learning Based Approach
Adrian Smith – Undergraduate
Computer Science Department
Sonoma State University
Fundamentals
• Understanding the human
brain has been a central
theme of human history
• By growing our
understanding of the brain,
we improve our ability to
treat diseases (Gur2002)
• Understanding the brain
helps us be aware of its
limitations
Artist’s Depiction of Neurons
UCI Research
Courtesy of OSA Student Chapter at UCI Art in Science Contest.
Photo by: Ardy Rahman
fMRI Scanning
• Functional Magnetic
Resonance Imaging (fMRI)
allows us to measure localized
brain activity
• This allows one to find
relationships between cognition
and brain activity
• Blood oxygen is used as a
measure of activity (BOLD
imaging)
• This technique produces rich
data, but contains high levels of
noise
CSRB (Keck MRI Center)
Data Collection
• One major advantage of
researching fMRI data is its
availability from a variety of
online sources
• We worked with 1452 total
brain scans, each
corresponding to one of 9
categories
• The categories refer to the
image a subject was
observing
Analysis Goals
• Our goal was to predict, given the
fMRI scan of a subject, what
image they were observing
• This means differentiating scans
based on the image the subject is
observing
• What is the relationship?
Haxby2001 Stimulus Images
Machine Learning Techniques
• Machine learning is an
information processing
technique
• The field of machine learning is
at the heart of understanding
“Big Data”
• We aimed to use modern
machine learning techniques to
help classify fMRI data.
How does Machine Learning Work?
• Machine Learning classification
focuses on designing
algorithms which are trained to
categorize objects
• This is done by combining
some defining characteristics
and a label
• The algorithm trains on one set
of data, and then is tested to
see how accurately it can
predict the label of some piece
of data.
• What is the data?
By Antti Ajanki AnAj (Own work) [GFDL
(http://www.gnu.org/copyleft/fdl.html), CC-BY-SA-3.0
(http://creativecommons.org/licenses/by-sa/3.0/)]
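The train-then-test idea above can be sketched with a tiny k Nearest Neighbor classifier. This is only an illustration in plain Python: the 2-D feature vectors, the labels "face" and "house", and the helper name knn_predict are made up for the example and are not taken from the study.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training feature vector
    distances = [
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    ]
    # Keep the labels of the k closest training points
    nearest = [label for _, label in sorted(distances)[:k]]
    return Counter(nearest).most_common(1)[0][0]


# Toy training set (made up): defining characteristics plus a label
train_features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
                  (2.0, 2.1), (2.2, 1.9), (1.9, 2.0)]
train_labels = ["face", "face", "face", "house", "house", "house"]

# Held-out test set: the classifier never saw these points during training
test_features = [(0.1, 0.1), (2.1, 2.0)]
test_labels = ["face", "house"]

predicted = [knn_predict(train_features, train_labels, q) for q in test_features]
accuracy = sum(p == t for p, t in zip(predicted, test_labels)) / len(test_labels)
```

Accuracy here is simply the fraction of test points whose predicted label matches the true label.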
Which is active before processing?
Unprocessed Active Unprocessed Rest
Which is active after processing?
Processed Active Processed Rest
Preprocessing
• We applied masks that came with
the dataset in order to focus on the
Ventral Temporal cortex, our region
of interest
• We then applied a polynomial
detrender, which eliminates
systematic trends, such as signal
increase as the machine warms up
• This was followed by a key step,
z-scoring against the rest condition
Graph of Normal Distribution
Public Domain
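A minimal sketch of the last two preprocessing steps, written in plain NumPy rather than with the toolkit routines used in the study. It assumes the data is already masked into an array of shape (volumes, voxels), treats the whole series as one run, and uses a synthetic drift and a hypothetical is_rest mask; the polynomial order is also an assumption.

```python
import numpy as np


def detrend_polynomial(data, order=1):
    """Remove a fitted polynomial trend from each voxel's time series.

    data: array of shape (n_volumes, n_voxels).
    """
    t = np.linspace(-1.0, 1.0, data.shape[0])
    # Design matrix with columns 1, t, ..., t**order
    design = np.vander(t, order + 1, increasing=True)
    # Least-squares fit of the trend, one coefficient set per voxel
    coeffs, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ coeffs


def zscore_against_rest(data, is_rest):
    """Scale every voxel using the mean and std of the rest volumes only."""
    rest = data[is_rest]
    mean = rest.mean(axis=0)
    std = rest.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant voxels
    return (data - mean) / std


# Synthetic stand-in for masked fMRI data: noise plus a slow linear drift
rng = np.random.default_rng(0)
n_volumes, n_voxels = 120, 5
drift = np.linspace(0.0, 3.0, n_volumes)[:, None]
data = rng.normal(size=(n_volumes, n_voxels)) + drift
is_rest = np.arange(n_volumes) % 2 == 0  # hypothetical rest/active labelling

clean = zscore_against_rest(detrend_polynomial(data, order=1), is_rest)
```

After this step the rest volumes have zero mean and unit variance per voxel, so activity in the other volumes is expressed relative to rest.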
Classification
• We now had to decide how to
process the image data
• This meant choosing features
that best represented the data
we sought
• We also tested a variety of
classification algorithms which
would label images based on the
chosen feature
Features
• We started with our preprocessed values, and then looked at a
variety of transforms
• We chose the full vector and the PCA reduced version as our main
features of interest
• Principal Component Analysis (PCA) is a tool to reduce the dimensionality of a
dataset
Figure: one volume as a full vector (samples), [0.5, .01, -.02, 1.5, 2.0, … -3.0] (length 576), with derived features: PCA, 50 highest values, histogram
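A rough sketch of two of the features named above, using scikit-learn's PCA and NumPy. The data is random noise standing in for preprocessed volumes; the 576 columns follow the number in the figure (assumed to be the vector length), and the 90% explained-variance threshold is an assumption, not a setting confirmed by the slides.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic stand-in: 200 volumes, each a full vector of 576 values
samples = rng.normal(size=(200, 576))

# PCA feature: keep enough components to explain 90% of the variance
# (assumed threshold)
pca = PCA(n_components=0.90, svd_solver="full")
reduced = pca.fit_transform(samples)

# "50 highest values" feature: the 50 largest values of each volume
top50 = np.sort(samples, axis=1)[:, -50:]
```

In an actual experiment the PCA would be fitted on the training volumes only and then applied to the test volumes, so that no test information leaks into the features.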
Experimental Design
• Data was split evenly and randomly into training and test sets
• We used several feature vectors to test each classifier
• We primarily focused on k Nearest Neighbor (kNN) and Support
Vector Machine (SVM) classifiers
• Tests were repeated 15 times and scores averaged
Diagram: training features and training labels train the classifier; the trained classifier maps testing features to predicted labels; predicted labels are compared with testing labels to give accuracy and a confusion matrix
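The design above could be sketched with scikit-learn roughly as follows. The data comes from make_classification and is not fMRI data; the number of neighbors, the linear SVM kernel, and the use of the loop index as random seed are all assumptions made for the illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for (feature vectors, labels)
X, y = make_classification(
    n_samples=450, n_features=60, n_informative=20,
    n_classes=3, random_state=0,
)

# Factories so that every repetition starts from an untrained classifier
classifiers = {
    "kNN": lambda: KNeighborsClassifier(n_neighbors=5),  # assumed k
    "SVM": lambda: SVC(kernel="linear"),                 # assumed kernel
}

n_repeats = 15
scores = {name: [] for name in classifiers}
for seed in range(n_repeats):
    # Even, random split into training and test halves
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.5, random_state=seed
    )
    for name, make_clf in classifiers.items():
        clf = make_clf().fit(X_train, y_train)
        predicted = clf.predict(X_test)
        scores[name].append(accuracy_score(y_test, predicted))

# Average over the repetitions, with the spread between runs
for name, values in scores.items():
    print(f"{name}: mean={np.mean(values):.3f} std={np.std(values):.3f}")

# Confusion matrix for the last split and the last classifier (SVM)
matrix = confusion_matrix(y_test, predicted)
```

Repeating the random split and averaging reduces the chance that a single lucky or unlucky split decides the comparison.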
kNN vs. SVM
• SVM performs better than kNN
• Increase in accuracy is likely due to the weakness of kNN when
dealing with high dimensionality
SVM on samples, 90.9%
accuracy
kNN on samples, 75.6%
accuracy
• We applied PCA to the processed data
• This produced a vector more than half the length of our original
• This smaller vector produces more accurate results
Samples vs. PCA
PCA (SVM), 92.1%
accuracy
SVM on samples, 90.9%
accuracy
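One way to compare an SVM on the full vector with an SVM on PCA-reduced features is a scikit-learn pipeline, sketched below on synthetic data. The linear kernel and the 90% variance threshold are assumptions, and the accuracies this toy produces say nothing about the fMRI results reported here.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in for preprocessed samples and their labels
X, y = make_classification(
    n_samples=400, n_features=100, n_informative=15,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# SVM trained directly on the full feature vector
svm_full = SVC(kernel="linear").fit(X_train, y_train)

# PCA followed by SVM; the pipeline fits PCA on the training half only
svm_pca = make_pipeline(
    PCA(n_components=0.90, svd_solver="full"),
    SVC(kernel="linear"),
).fit(X_train, y_train)

acc_full = svm_full.score(X_test, y_test)
acc_pca = svm_pca.score(X_test, y_test)
n_kept = svm_pca.named_steps["pca"].n_components_  # length of reduced vector
```

Wrapping both steps in one pipeline keeps the test half out of the PCA fit, which is easy to get wrong when the reduction is done by hand.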
• PCA and SVM in combination gave the best results
after repeated testing
• We achieved on average 92.1% accuracy among 9
labels, with a 2.0% standard deviation.
• Our classification methods are effective and
repeatable
• We also gained a variety of insights about the
nature of the data
Classification Results: Accuracy
• We saw several labels which
were repeatedly misclassified, and
saw accuracy improve as they
were removed
• One area of further study is
investigating whether these
patterns exist between multiple
subjects, and why
PCA (SVM), 92.1%
accuracy
Classification Results: Insights
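A small sketch of how repeatedly misclassified labels can be spotted and how removing one changes accuracy. The labels "A", "B", "C" and the predictions are invented for the example; per_label_errors and accuracy are hypothetical helper names.

```python
from collections import Counter


def per_label_errors(true_labels, predicted_labels):
    """Return the error rate per true label and the confusions, most common first."""
    pairs = list(zip(true_labels, predicted_labels))
    totals = Counter(true_labels)
    wrong = Counter(t for t, p in pairs if t != p)
    confusions = Counter((t, p) for t, p in pairs if t != p)
    error_rate = {label: wrong[label] / totals[label] for label in totals}
    return error_rate, confusions.most_common()


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Invented test labels and predictions
true = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
pred = ["A", "A", "A", "A", "B", "B", "C", "C", "C", "C", "C", "B"]

error_rate, confusions = per_label_errors(true, pred)
overall = accuracy(true, pred)

# Drop the scans whose true label is the worst-classified one
worst = max(error_rate, key=error_rate.get)
kept = [(t, p) for t, p in zip(true, pred) if t != worst]
without_worst = accuracy([t for t, _ in kept], [p for _, p in kept])
```

Note that this toy only drops test scans; in a full rerun the classifier would also be retrained without the removed label.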
Future Exploration
• We intend to move towards classifying
across multiple subjects
• This is of utmost importance to clinical
applications of fMRI data
• Multisubject comparison presents
challenges due to the variation in brain
structure
• We intend to build upon previous work on
feature detection and scaling maps
(Gill2014)
Sources
• Gur, R. E., McGrath, C., Chan, R. M., Schroeder, L., Turner, T., Turetsky, B. I., ...
& Gur, R. C. (2002). An fMRI study of facial emotion processing in patients with
schizophrenia. American Journal of Psychiatry.
• Haxby, J. V., Gobbini, M. I., Furey, M. L., Ishai, A., Schouten, J. L., & Pietrini, P.
(2001). Distributed and overlapping representations of faces and objects in
ventral temporal cortex. Science, 293(5539), 2425-2430.
• Gill, G., Bauer, C., & Beichel, R. R. (2014). A method for avoiding overlap of left
and right lungs in shape model guided segmentation of lungs in CT volumes.
Medical Physics, 41(10), 101908.
• Dataset: This data was obtained from the OpenfMRI database. Its accession
number is ds000105. The original authors of Haxby et al. (2001)
hold the copyright of this dataset and made it available under the terms of the
Creative Commons Attribution-Share Alike 3.0 license.
Acknowledgments
• Dr. Gurman Gill – Mentor
• OpenfMRI – Source of all data, and an amazing example of open
data in science
• PyMVPA – Python toolkit used in preprocessing
• scikit-learn – Python toolkit used in classification
• Dr. Yaroslav Halchenko – Researcher who provided extensive
aid in understanding and dealing with fMRI data
Questions?
Extra Graphics
SVM of top 400 values. 30.9% accuracy
SVM on 90% PCA. 92.2% accuracy