Machine Learning on Cell Processor

                    Supervisor: Dr. Eric McCreath
                       Student: Robin Srivastava
Background and Motivation

    [Diagram: machine learning splits into batch learning and online
    learning. A stream of emails (Email-N, …, Email-2, Email-1) arrives
    one at a time and each is classified as either HAM or SPAM.]
Background and Motivation

    [Diagram: the same split into batch learning and online learning,
    with online learning annotated as sequential in nature. The email
    stream (Email-N, …, Email-2, Email-1) is again classified as either
    HAM or SPAM.]
Objective
    Performance evaluation of a parallel online machine
     learning algorithm (Langford et al. [1])
    Target machines
         Cell Processor: one 3 GHz 64-bit IBM PowerPC core with six
          specialized co-processors (SPEs)
         Intel dual-core machine: 2 GHz dual-core processor, 1.86 GB
          of main memory
Stochastic Gradient Descent
        Step 1: Initialize the weight vector w_0 with some arbitrary
         values
        Step 2: Update the weight vector as follows:

                      w_{t+1} = w_t − η ∇E(w_t)

    where ∇E is the gradient of the error function and η is the
       learning rate
        Step 3: Repeat Step 2 for all units of data
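The update rule above can be sketched for logistic regression, the model used later in the deck. This is a minimal NumPy illustration; the toy data, learning rate, and epoch count are illustrative, not from the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgd_logistic(X, y, eta=0.1, epochs=5):
    """Plain stochastic gradient descent for logistic regression:
    w_{t+1} = w_t - eta * grad E(w_t), one example at a time."""
    w = np.zeros(X.shape[1])                     # Step 1: initial weights
    for _ in range(epochs):
        for x_t, y_t in zip(X, y):               # Step 3: sweep all data
            grad = (sigmoid(x_t @ w) - y_t) * x_t  # gradient of the log-loss
            w = w - eta * grad                   # Step 2: update
    return w

# Toy data: two linearly separable "emails" with two features each.
X = np.array([[1.0, 2.0], [1.0, -2.0]])
y = np.array([1.0, 0.0])
w = sgd_logistic(X, y)
```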
Delayed Stochastic Gradient Descent
        Step 1: Initialize the weight vector w_0 with some arbitrary
         values
        Step 2: Update the weight vector using a gradient that is τ
         steps stale:

                      w_{t+1} = w_t − η ∇E(w_{t−τ})

    where ∇E is the gradient of the error function and η is the
       learning rate
        Step 3: Repeat Step 2 for all units of data
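A minimal sketch of the delayed update, again for logistic regression. A deque of stale weight vectors stands in for the delay that a parallel implementation introduces; the toy data and hyperparameters are illustrative:

```python
import numpy as np
from collections import deque

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def delayed_sgd_logistic(X, y, eta=0.1, tau=2, epochs=5):
    """Delayed SGD: the update at step t uses the gradient evaluated
    at the weight vector from tau steps earlier, i.e. w_{t-tau}."""
    w = np.zeros(X.shape[1])
    stale = deque([w.copy()] * tau, maxlen=tau)   # holds w_{t-tau} .. w_{t-1}
    for _ in range(epochs):
        for x_t, y_t in zip(X, y):
            w_old = stale[0]                      # w_{t-tau}
            grad = (sigmoid(x_t @ w_old) - y_t) * x_t
            stale.append(w.copy())                # record w_t before updating
            w = w - eta * grad
    return w

# Same toy data as before: two separable points.
X = np.array([[1.0, 2.0], [1.0, -2.0]])
y = np.array([1.0, 0.0])
w = delayed_sgd_logistic(X, y)
```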
Implementation Model

    [Diagram: the implementation model, labelled "Complete Dataset".]
Implementation
    Dataset – TREC 2007 Public Corpus
         Number of emails: 75,419
         Each email classified as either ‘ham’ or ‘spam’
    Pre-processing
         Total number of features extracted: 2,218,878
         Pre-processed email format:


<Number of features><space><index>:<count><space>…<index>:<count>
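For illustration, one line of this sparse format could be parsed as follows; the function name and the example line are hypothetical, only the field layout comes from the slide:

```python
def parse_email(line):
    """Parse '<nfeat> <index>:<count> <index>:<count> ...' into
    (nfeat, {index: count})."""
    fields = line.split()
    nfeat = int(fields[0])           # leading field: number of features
    counts = {}
    for pair in fields[1:]:          # remaining fields: index:count pairs
        idx, cnt = pair.split(":")
        counts[int(idx)] = int(cnt)
    return nfeat, counts

# A made-up email with three active features:
nfeat, counts = parse_email("3 12:2 507:1 99001:4")
```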
Memory Requirement
    Algorithm implemented
         Online logistic regression with delayed update
         Requirement per level of parallelization
              Two private copies of the weight vector
              Two shared copies of the weight vector
              Two error gradients
              Required dimension of each = number of features = 2,218,878
              Data type: float (4 bytes on Cell)
              Total = (6 × 2,218,878) × 4 = 53,253,072 bytes ≈ 50.78 MB
              Plus the space occupied by other auxiliary variables
         Alternatively
              Let only the shared copies use the full dimension
              Total size = (2 × 2,218,878) × 4 bytes ≈ 16.9 MB, plus auxiliaries
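The figures above follow from simple arithmetic; a quick check:

```python
# Six full-dimension float vectors (2 private + 2 shared weight copies
# + 2 error gradients), 4 bytes per element.
N_FEATURES = 2_218_878
BYTES_PER_FLOAT = 4

full = 6 * N_FEATURES * BYTES_PER_FLOAT      # all six vectors
reduced = 2 * N_FEATURES * BYTES_PER_FLOAT   # shared copies only

print(full, full / 2**20)        # 53,253,072 bytes, ~50.8 MB
print(reduced, reduced / 2**20)  # 17,751,024 bytes, ~16.9 MB
```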
Limitations on Cell
    Memory limitation of the SPE
         Available local store: 256 KB
         Required: approx. 51 MB
         Workaround:
              Reduced the number of features
              Performed one more level of pre-processing
    SIMD limitation
         The time spent preparing data for SIMD outweighed its
          benefits for this implementation
Results
    The serial implementation of logistic regression on the Intel
     dual-core machine took 36.93 s and 36.45 s for two consecutive
     executions.
    Parallel implementation using the stochastic gradient descent
     process
Results (contd.)
    Performance on Cell

    [Chart: execution times on the Cell, measured in microseconds.]
References
①    John Langford, Alexander J. Smola and Martin Zinkevich. Slow
     Learners are Fast. Journal of Machine Learning Research 1 (2009).
②    Michael Kistler, Michael Perrone and Fabrizio Petrini. Cell
     Multiprocessor Communication Network: Built for Speed.
③    Thomas Chen, Ram Raghavan, Jason Dale and Eiji Iwata. Cell
     Broadband Engine Architecture and its First Implementation.
④    Jonathan Bartlett. Programming High-Performance Applications on
     the Cell/B.E. Processor, Part 6: Smart Buffer Management with
     DMA Transfers.
⑤    Introduction to Statistical Machine Learning, 2010 course,
     Assignment 1.
⑥    Christopher Bishop. Pattern Recognition and Machine Learning.
