Parallel MCMC
         Random Number Generators
                        Summary




Parallel Bayesian computation in R ≥ 2.14
     using the packages foreach and parallel


            Matt Moores             Cathy Hargrave

           Bayesian Research & Applications Group
     Queensland University of Technology, Brisbane, Australia
                 CRICOS provider no. 00213J


            Thursday September 27, 2012




                    BRAG Sept. 27    Parallel MCMC in R


Outline



  1   Parallel MCMC
        Introduction
        R packages


  2   Random Number Generators
        RNG and parallel MCMC
        RNGs available in R






Motivation




  Why parallel?
      large datasets
      many MCMC iterations
      multiple CPU cores now commonplace
          e.g. Intel Core i5 and i7
          even mobile phones have multicore CPUs





Parallel MCMC



  2 kinds of parallelism:
      concurrent MCMC chains
          always applicable
          straightforward to implement
      concurrent updates within an iteration
          only useful for a very large parameter space
          ideally in a compiled language (e.g. Rcpp with OpenMP)
  also implicit parallelism, e.g. with Intel Math Kernel Library






Concurrent Chains






Simple Network Of Workstations


  R package snow by Luke Tierney, et al.
      spawns multiple copies of R
      provides several options for inter-process communication
          TCP sockets
               available on any platform, including Microsoft Windows
          Message Passing Interface (via the package Rmpi)
          Parallel Virtual Machine (via the package rpvm)
          NetWorkSpaces (via the package nws)
      can run either on a local machine or on a cluster (e.g. Lyra)






multicore



  R package by Simon Urbanek
      implemented using the POSIX fork system call
            available on Linux and Mac OS X
            clones the R instance (functions + data)
            takes advantage of copy-on-write
            will fork as many processes as there are available CPU
            cores, unless told otherwise






parallel




  R package parallel included in the core R distribution
      available in versions ≥ 2.14.0
      incorporates subsets of snow, multicore, and rlecuyer
      sensible default behaviour
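
  A minimal sketch of the three pieces that parallel bundles, using only
  functions shipped with R. The helper run_chain is hypothetical: it is a
  Gaussian random walk standing in for a real MCMC chain, and the worker
  count of 2 and the seed 2012 are arbitrary choices for illustration.

```r
library(parallel)

# Toy stand-in for one MCMC chain: a Gaussian random walk of length n.
# run_chain is a hypothetical helper, not part of any package.
run_chain <- function(chain_id, n = 1000) {
  cumsum(rnorm(n))
}

# snow-derived interface: socket cluster, available on any platform
cl <- makePSOCKcluster(2)
clusterSetRNGStream(cl, iseed = 2012)  # rlecuyer-derived RNG streams
snow_style <- parLapply(cl, 1:2, run_chain)
stopCluster(cl)

# multicore-derived interface: forked child processes
# (forking is unavailable on Windows, where mc.cores must be 1)
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L
fork_style <- mclapply(1:2, run_chain, mc.cores = n_cores)

str(snow_style)
str(fork_style)
```

  Both calls return a list with one element per chain, so downstream code
  does not need to know which interface produced the result.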






foreach

  "syntactic sugar"

  library(foreach)
  library(parallel)
  library(doParallel)

  # will automatically use a SOCK cluster on Windows
  # (otherwise uses multicore)
  registerDoParallel(cores = detectCores())

  foreach(i = 1:getDoParWorkers()) %dopar% {
      # this code will be executed concurrently
      ...
  }



foreach with SNOW

  library(foreach)
  library(parallel)
  library(doParallel)

  # set up a local SOCK cluster for 4 CPU cores
  cl <- makePSOCKcluster(4)
  registerDoParallel(cl)

  foreach(i = 1:getDoParWorkers()) %dopar% {
      # this code will be executed concurrently
      ...
  }
  stopCluster(cl)



foreach with multicore

  library(foreach)
  library(parallel)
  library(doParallel)

  # fork one child process for each CPU core
  # (makeForkCluster is not available on Windows)
  cl <- makeForkCluster(detectCores())
  registerDoParallel(cl)

  foreach(i = 1:getDoParWorkers()) %dopar% {
      # this code will be executed concurrently
      ...
  }
  stopCluster(cl)  # shut the child processes down when finished




foreach with CODA


  If each run of your Gibbs sampler returns an mcmc object, the
  results can be combined into an mcmc.list:

  library(coda)

  samples.list <- foreach(i = 1:getDoParWorkers(),
                          .combine = mcmc.list,
                          .multicombine = TRUE) %dopar% {
    # this code will be executed concurrently
    ...
  }
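
  To make the pattern concrete, here is a hedged end-to-end sketch. The
  sampler in the loop body is an assumption made for illustration: a toy
  Gibbs sampler for a standard bivariate normal with correlation rho, not
  the model from this talk. The 2 workers, 1000 iterations, the seed and
  the starting values are likewise arbitrary.

```r
library(foreach)
library(parallel)
library(doParallel)
library(coda)

cl <- makePSOCKcluster(2)
clusterSetRNGStream(cl, iseed = 2012)  # independent RNG stream per worker
registerDoParallel(cl)

samples.list <- foreach(i = 1:2,
                        .combine = mcmc.list,
                        .multicombine = TRUE,
                        .packages = "coda") %dopar% {
  # Toy Gibbs sampler (illustrative assumption): standard bivariate
  # normal with correlation rho, sampled via its full conditionals.
  rho <- 0.8
  n_iter <- 1000
  draws <- matrix(NA_real_, nrow = n_iter, ncol = 2,
                  dimnames = list(NULL, c("x", "y")))
  x <- 0
  y <- 3 * i  # a different initial value in each chain
  for (iter in seq_len(n_iter)) {
    x <- rnorm(1, mean = rho * y, sd = sqrt(1 - rho^2))  # x | y
    y <- rnorm(1, mean = rho * x, sd = sqrt(1 - rho^2))  # y | x
    draws[iter, ] <- c(x, y)
  }
  mcmc(draws)  # each task returns one mcmc object
}

stopCluster(cl)

summary(samples.list)      # summaries pooled over both chains
gelman.diag(samples.list)  # between-chain convergence diagnostic
```

  The sampler is written inline so that nothing has to be exported to the
  workers apart from the coda package itself.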





foreach with other libraries


  You need to declare any libraries that are used inside the child
  process. For example:

  library(mvtnorm)
  library(coda)

  foreach(i = 1:getDoParWorkers(),
          .packages = c("mvtnorm", "coda")) %dopar% {
      # this code uses mcmc(...) and rmvnorm(...)
      ...
  }





Random Number Generators for parallel MCMC


  The chains of our Gibbs sampler run independently, but:
      if the same RNG is seeded with the same value, all of the
      chains will generate the same random numbers in the
      same sequence - they will be identical!
      we either need to use:
          different seeds, or
          different random number generators
      for each chain (preferably both)
      it is also advisable to choose (or generate) different initial
      values in each chain of our Gibbs sampler
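
  The seeding problem is easy to reproduce in a single R session. In this
  small sketch the helper draw_chain is hypothetical: it stands in for a
  chain by drawing a few uniforms from a freshly seeded RNG.

```r
# Toy stand-in for one chain: n uniform draws from a freshly seeded RNG.
# draw_chain is a hypothetical helper, not part of any package.
draw_chain <- function(seed, n = 5) {
  set.seed(seed)
  runif(n)
}

same_a <- draw_chain(seed = 1)
same_b <- draw_chain(seed = 1)
diff_c <- draw_chain(seed = 2)

identical(same_a, same_b)  # TRUE: same RNG and same seed, same numbers
identical(same_a, diff_c)  # FALSE: a different seed, a different sequence
```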





Mersenne Twister



   The default RNG in R
        pseudo-random sequence with 32-bit precision
        period of 2^19937 − 1
        takes 0.4 seconds to generate 10^7 random numbers
        on an Intel Core i5 running R 2.15.1 and Windows 7
   open-source implementation available at:
   http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html




Matsumoto & Nishimura (1998) TOMACS 8: 3–30.
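
A timing of this kind can be repeated on any machine with a sketch like the following. It selects the Mersenne Twister explicitly so that the measurement does not depend on earlier RNGkind calls; the elapsed time will differ from the figure quoted above depending on hardware and R version.

```r
# Select the Mersenne Twister explicitly, remembering the previous settings.
old_kind <- RNGkind("Mersenne-Twister")

# Time the generation of 10^7 uniform random numbers.
timing <- system.time(x <- runif(1e7))
timing["elapsed"]

# Restore whichever generator was active before.
RNGkind(old_kind[1])
```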



Other RNGs in the base package



      Wichmann-Hill (1982) Applied Statistics 31, 188–190.
      Marsaglia-Multicarry
      (Usenet newsgroup sci.stat.math, 1997)
      Super-Duper
      (Reeds, J., Hubert, S. and Abrahams, M., 1982–4)
  For JAGS with up to 4 concurrent chains:

  library(rjags)  # provides parallel.seeds()
  rngInits <- parallel.seeds("base::BaseRNG", 4)
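
  Within R itself, the same generators can be selected through the kind
  argument of set.seed (or through RNGkind). The sketch below is only an
  illustration: the seed 123 and the three draws per generator are
  arbitrary. It shows that different generators produce different numbers
  even when given the same seed.

```r
old_kind <- RNGkind()  # remember the current settings

kinds <- c("Wichmann-Hill", "Marsaglia-Multicarry", "Super-Duper",
           "Mersenne-Twister")

# Draw three uniforms from each generator, using the same seed every time.
draws <- sapply(kinds, function(k) {
  set.seed(123, kind = k)
  runif(3)
})

RNGkind(old_kind[1])  # restore the original generator
draws                 # one column per generator
```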






L’Ecuyer

         Available via R libraries rlecuyer or parallel
         Multiple independent streams of random numbers
         Periodicity ≈ 2^191
         (each stream is a subsequence of length 2^127)
         0.6 seconds to generate 10^7 random numbers via runif
    To initialize each child process in a SNOW cluster with an
    independent stream:

    cl <- makeCluster(4)
    clusterSetRNGStream(cl)
    registerDoParallel(cl)

L’Ecuyer, et al. (2002) Operations Research, 50(6): 1073–1075.
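
A short check of what clusterSetRNGStream provides, using only the parallel package. The 2 workers and the iseed value of 42 are arbitrary choices for this sketch; passing iseed is what makes the streams repeatable from one run to the next.

```r
library(parallel)

cl <- makePSOCKcluster(2)  # 2 workers keep the sketch light

# Same iseed twice: each worker is handed the same stream both times.
clusterSetRNGStream(cl, iseed = 42)
first_run <- clusterEvalQ(cl, runif(3))
clusterSetRNGStream(cl, iseed = 42)
second_run <- clusterEvalQ(cl, runif(3))

stopCluster(cl)

identical(first_run, second_run)           # TRUE: runs are reproducible
identical(first_run[[1]], first_run[[2]])  # FALSE: workers use distinct streams
```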



Summary


    Most MCMC algorithms are "embarrassingly parallel"
        chains run independently
        (as long as the RNG is set up correctly)
    The R packages foreach and doParallel make parallelism
    easy, on any computing platform


    Related topics (not covered in this presentation):
        Running R on a supercomputer (e.g. lyra.qut.edu.au)
        Cloud computing with Apache Hadoop
        GPU programming in R (nVidia CUDA)







For Further Reading

     N. Matloff
     The Art of R Programming.
     No Starch Press, 2011.

     M. Schmidberger, M. Morgan, D. Eddelbuettel, H. Yu, L. Tierney & U. Mansmann
     State of the Art in Parallel Computing with R.
     Journal of Statistical Software, 31(1), 2009.

     P. L’Ecuyer, R. Simard, E.J. Chen & W.D. Kelton
     An Object-Oriented Random-Number Package with Many Long Streams and Substreams.
     Operations Research, 50(6): 1073–1075, 2002.

     M. Matsumoto & T. Nishimura
     Mersenne Twister: A 623-Dimensionally Equidistributed Uniform Pseudo-Random Number Generator.
     ACM Transactions on Modeling and Computer Simulation, 8: 3–30, 1998.
