Simple Regenerating Codes:
Network Coding for Cloud
Storage
Dimitris S. Papailiopoulos, Jianqiang Luo,
Alexandros G. Dimakis, Cheng Huang, and Jin Li

INFOCOM 2012

Presented by Tangkai
Index
 About the author
 Introduction
 SRC
 Simulations
 Conclusion
About the author
   Jianqiang Luo
    ◦ Experience
        Senior Software Engineer @ EMC
        Received PhD, Wayne State University
        Intern @ Microsoft, Data Domain
        Team Leader @ Actuate
        Received MS, SJTU
    ◦ Specialties
      Working on distributed storage systems during
       PhD
       Performance profiling.
About the author
   Alexandros G. Dimakis
    ◦ Assistant Professor
      Dept of EE – Systems, USC

    ◦ Research interests:
      Communications, signal processing and
       networking.


    ◦ INFOCOM 2012 - 2
    ◦ Erasure code MDS MSR MBR etc
About the author
   Cheng Huang
    ◦ Education
      Microsoft Research
      Ph.D. Washington University
      B.S. and M.S. EE Dept, SJTU

    ◦ Research interest
      cloud services, internet measurements, erasure
       correction codes, distributed storage systems, peer-to-
       peer streaming, networking and multimedia
       communications.

    ◦ INFOCOM 2011
      Public DNS System and Global Traffic Management
      Estimating the Performance of Hypothetical Cloud
       Service Deployments: A Measurement-Based Approach
About the author
   Jin Li
    ◦ Experience
       Microsoft Research
       BS/MS/PhD THU (within 7 years)
       "The popularization of computers should start with children"


    ◦ Title
       IEEE Fellow
       GLOBECOM/ICME/ACM MM Chair
Index
 About the author
 Introduction
 SRC
 Simulations
 Conclusion
Introduction
   Background
    ◦ We have entered the BIG DATA ERA!
      Digital Universe: 1.8 ZB (= 1.8e9 TB)
      Several PB of photos stored on Facebook
      14.1 PB of data stored on Taobao (2010)


    ◦ Data security is IMPORTANT
      Free from unwanted actions of unauthorized
       users.
      Free from data loss caused by destructive
       forces
Introduction
   Background
    ◦ Recovery
       rare exception -> regular operation
         GFS[1]:
           Hundreds or even thousands of machines
           Inexpensive commodity parts
           High concurrency/IO
    ◦ High failure tolerance is required, both for
       high availability and to prevent data loss
[1] S. Ghemawat, H. Gobioff, and S.-T. Leung, “The Google file system,” in
SOSP ’03: Proc. of the 19th ACM Symposium on Operating Systems
Principles, 2003.
Introduction
   Background
    ◦ Erasure coding > replication
      1. Same redundancy level -> higher reliability
      2. Same reliability -> lower storage cost
    ◦ Some applications
      Cloud storage systems
      Archival storage
      Peer-to-peer storage systems
Introduction
   Erasure coding: MDS
    ◦ File or data object split into k=2 blocks: A, B
    ◦ n=3: store A, B, A+B
       (3,2) MDS code (single parity), used in RAID 5
    ◦ n=4: store A, B, A+B, A+2B
       (4,2) MDS code, tolerates any 2 failures, used in RAID 6
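The (4,2) example on this slide can be checked numerically. The sketch below is only an illustration: it assumes the symbols live in the prime field GF(257) (the slide does not name a field), and the names `ROWS`, `encode` and `decode` are hypothetical. It encodes two symbols as A, B, A+B, A+2B and recovers them from every pair of coded symbols.

```python
from itertools import combinations

P = 257  # prime modulus; working in GF(257) is an assumption of this sketch

# Each stored symbol is a*A + b*B; rows follow the slide: A, B, A+B, A+2B.
ROWS = [(1, 0), (0, 1), (1, 1), (1, 2)]


def encode(a_sym, b_sym):
    """Return the four coded symbols for source symbols A and B."""
    return [(a * a_sym + b * b_sym) % P for a, b in ROWS]


def decode(i, ci, j, cj):
    """Recover (A, B) from coded symbols ci and cj stored at positions i and j."""
    (a1, b1), (a2, b2) = ROWS[i], ROWS[j]
    det = (a1 * b2 - a2 * b1) % P       # nonzero for every pair of rows
    inv = pow(det, -1, P)               # modular inverse (Python 3.8+)
    a_sym = (ci * b2 - cj * b1) * inv % P   # Cramer's rule, first unknown
    b_sym = (a1 * cj - a2 * ci) * inv % P   # Cramer's rule, second unknown
    return a_sym, b_sym


coded = encode(42, 200)
recovered = [decode(i, coded[i], j, coded[j])
             for i, j in combinations(range(4), 2)]
print(all(pair == (42, 200) for pair in recovered))
```

Every 2x2 submatrix of `ROWS` is invertible, which is exactly why any two surviving symbols are enough.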
Introduction
   Erasure coding vs. replication [3]
    ◦ File or data object: split into blocks A and B
    ◦ Replication: store A, A, B, B
    ◦ (4,2) MDS erasure code: store A, B, A+B, A+2B
       (any 2 suffice to recover)

[3]A. G. Dimakis, P. G. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran,“Network
coding for distributed storage systems,” in IEEE Trans. on Inform. Theory, vol. 56, pp.
Introduction
   Erasure coding vs. replication [3]
    ◦ Erasure coding introduces redundancy in an optimal way.
    ◦ Very useful in practice,
       e.g. Reed-Solomon codes, Fountain codes (LT and Raptor)…

[3]A. G. Dimakis, P. G. Godfrey, Y. Wu, M. J. Wainwright, and K. Ramchandran,“Network
coding for distributed storage systems,” in IEEE Trans. on Inform. Theory, vol. 56, pp.
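To make the comparison concrete, the following sketch enumerates every double failure among four nodes. It assumes the two layouts from the slide (replication stores A, A, B, B; the (4,2) MDS code stores A, B, A+B, A+2B); the function names are hypothetical.

```python
from itertools import combinations

REPLICATION = ["A", "A", "B", "B"]     # two copies of each half of the file


def replication_recoverable(alive):
    """The file survives only if a copy of A and a copy of B both remain."""
    return {"A", "B"} <= {REPLICATION[i] for i in alive}


def mds_recoverable(alive):
    """For the (4,2) MDS code any k = 2 surviving blocks suffice."""
    return len(alive) >= 2


# Each pattern lists the two nodes that are still alive after a double failure.
patterns = list(combinations(range(4), 2))
rep_ok = sum(replication_recoverable(p) for p in patterns)
mds_ok = sum(mds_recoverable(p) for p in patterns)
print(f"replication survives {rep_ok} of {len(patterns)} double failures")
print(f"(4,2) MDS survives {mds_ok} of {len(patterns)} double failures")
```

Both schemes store four blocks for a two-block file, so the storage cost is identical; only the set of survivable failures differs.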
Introduction
   Metrics
    ◦ Storage per node (α)
    ◦ Repair Bandwidth per single node repair
      (γ)
    ◦ Disk Accesses per single node repair (d)
    ◦ Effective Coding Rate (R)

   Contribution
    ◦ High R, Small d
    ◦ Low repair computation complexity
Index
 About the author
 Introduction
 SRC
 Simulations
 Conclusion
SRC
   SRC: Simple Regenerating Codes
    ◦ Regenerating Codes
      address the issue of rebuilding (also called
       repairing) lost encoded fragments from existing
       encoded fragments. This issue arises in
       distributed storage systems where
       communication to maintain encoded
       redundancy is a problem.
SRC
    Objective
        Requirement I: (n, k) property
            MDS[2]




[2] Alexandros G. Dimakis, Kannan Ramchandran, Yunnan
Wu, Changho Suh:
A Survey on Network Codes for Distributed Storage. in Proceedings of the
SRC
 ◦ MDS
SRC
    Requirement II: efficient exact repair
     ◦ Efficient: Low complexity
     ◦ Exact repair (vs. functional repair) [3]:
        1. [demands] Data have to stay in systematic
         form
        2. [complexity] Updating repairing-and-decoding
         rules -> additional overhead
        3. [security] Dynamic repairing-and-decoding
         rules observed by eavesdroppers ->
         information leakage
[2] Changho Suh, Kannan Ramchandran: Exact Regeneration Codes
for Distributed Storage Repair Using Interference Alignment. in IEEE
TRANSACTIONS ON INFORMATION THEORY, VOL. 57, NO. 3, MARCH
SRC
   Solution

    ◦ MDS codes are used to provide reliability
      and meet Requirement I

    ◦ Simple XORs applied over the MDS-coded
      packets provide efficient exact repair and
      meet Requirement II
SRC
   Construction
SRC
   Repair
(n,k,2)-SRC
   Code Construction
    ◦ File f, of size M = 2k
    ◦ Split into 2 parts

    ◦ 1. Two independent (n,k)-MDS encodings

    ◦ 2. Generate a parity-sum vector using
      XOR
(n,k,2)-SRC
   Distribution
    ◦ 3n chunks in n storage nodes
(n,k,2)-SRC
   Repair
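The construction, distribution and repair steps for f = 2 can be sketched as follows. Several things here are assumptions made for illustration only: the MDS pre-code is a toy (5,2) code over GF(257), the XOR parity is taken on the integer value of each symbol, and the placement is a circular shift in which node i holds x_i, y_{i+1} and s_{i+2}. The slides for these steps survive only as titles, so none of this is quoted from them.

```python
P = 257          # toy prime field for the MDS pre-code (an assumption)
N, K = 5, 2      # (n, k) = (5, 2)


def mds_encode(a, b):
    """Toy (5,2) MDS code: symbols a, b, a+b, a+2b, a+3b over GF(257)."""
    return [a % P, b % P] + [(a + c * b) % P for c in (1, 2, 3)]


def build_nodes(part1, part2):
    """Encode both halves, add the XOR parity, and place 3 chunks per node."""
    x = mds_encode(*part1)
    y = mds_encode(*part2)
    s = [xi ^ yi for xi, yi in zip(x, y)]           # parity-sum vector
    # Assumed circular placement: node i holds x_i, y_{i+1}, s_{i+2}.
    return [{"x": x[i], "y": y[(i + 1) % N], "s": s[(i + 2) % N]}
            for i in range(N)]


def repair(nodes, i):
    """Rebuild node i exactly; return its chunks and the helper nodes read."""
    reads = []

    def fetch(node, kind):
        reads.append(node % N)
        return nodes[node % N][kind]

    x_i = fetch(i - 1, "y") ^ fetch(i - 2, "s")     # y_i XOR s_i
    y_next = fetch(i + 1, "x") ^ fetch(i - 1, "s")  # x_{i+1} XOR s_{i+1}
    s_next = fetch(i + 2, "x") ^ fetch(i + 1, "y")  # x_{i+2} XOR y_{i+2}
    return {"x": x_i, "y": y_next, "s": s_next}, reads


nodes = build_nodes((10, 20), (30, 40))
rebuilt, reads = repair(nodes, 0)
print(rebuilt == nodes[0], len(reads), sorted(set(reads)))
```

Under this placement each lost chunk is the XOR of two surviving chunks, so a node repair reads six chunks from four neighbouring nodes and never decodes the MDS code.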
(n,k,f)-SRC
   General Code Construction
    ◦ File f, of size M = fk
    ◦ Cut into f parts

    ◦ 1. f independent (n,k)-MDS encodings

    ◦ 2. Generate a parity-sum vector using
      XOR
(n,k,f)-SRC
   Distribution
    ◦ (f+1)n chunks in n storage nodes
(n,k,f)-SRC
   Repair
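For general f the same idea can be sketched with one extra row per codeword. The code below assumes the f codewords are already MDS-encoded (any integers work for the repair logic) and assumes a placement in which row r is circularly shifted by r positions; `build_nodes` and `repair` are hypothetical names.

```python
from functools import reduce
from operator import xor


def build_nodes(codewords):
    """codewords: f MDS-coded vectors of length n (assumed already encoded).

    Returns n nodes, each a list of f+1 chunks. Row r of the layout is
    codeword r (or the XOR parity for r = f), circularly shifted by r.
    """
    f, n = len(codewords), len(codewords[0])
    parity = [reduce(xor, column) for column in zip(*codewords)]
    rows = list(codewords) + [parity]
    return [[rows[r][(i + r) % n] for r in range(f + 1)] for i in range(n)]


def repair(nodes, i):
    """Rebuild node i by XORing, for each lost chunk, the f other chunks
    that share its codeword index. Returns (chunks, helper_nodes, reads)."""
    n, f = len(nodes), len(nodes[0]) - 1
    rebuilt, helpers, reads = [], set(), 0
    for r in range(f + 1):               # lost chunk: row r, index i + r
        value = 0
        for q in range(f + 1):           # same index in every other row q
            if q == r:
                continue
            j = (i + r - q) % n          # node holding row q at index i + r
            value ^= nodes[j][q]
            helpers.add(j)
            reads += 1
        rebuilt.append(value)
    return rebuilt, helpers, reads


F, N = 3, 8
# Stand-in values; a real system would use the output of f MDS encoders.
codewords = [[(7 * r + 3 * c + 1) % 251 for c in range(N)] for r in range(F)]
nodes = build_nodes(codewords)
rebuilt, helpers, reads = repair(nodes, 0)
print(rebuilt == nodes[0], len(helpers), reads)
```

With n larger than 2f, the repair touches the f nodes on either side of the failed one and reads f(f+1) chunks in total.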
(n,k,f)-SRC
   Theorem
    ◦ Effective Coding Rate (R)

      The rate of the SRC is a fraction f/(f+1) of the coding
       rate of an (n, k) MDS code, hence is upper bounded
       by f/(f+1)
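The formula on this slide did not survive extraction. A derivation consistent with the construction slides (file size M = fk, stored as (f+1)n chunks of one symbol each) would read as follows; it is a reconstruction under those assumptions, not a quotation.

```latex
% M = fk source symbols are stored as (f+1)n coded symbols of the same size.
R = \frac{M}{(f+1)\,n}
  = \frac{f k}{(f+1)\,n}
  = \frac{f}{f+1} \cdot \frac{k}{n}
  \le \frac{f}{f+1}, \qquad \text{since } k \le n .
```

For f = 2 this gives a rate of at most 2/3.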
(n,k,f)-SRC
   Theorem
    ◦ Effective Coding Rate (R)
(n,k,f)-SRC
   Theorem
    ◦ Storage per node (α)

    ◦ Repair Bandwidth per single node repair
      (γ)

    ◦ Disk Accesses per single node repair (d)
      Seek time
(n,k,f)-SRC
   Theorem
    ◦ Disk Accesses per single node repair (d)
      Starting with f disk accesses for the first chunk
       repair
(n,k,f)-SRC
   Theorem
    ◦ Disk Accesses per single node repair (d)



      each additional chunk repair requires an
       additional disk access
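The theorem's formulas are missing from this transcript, but the three metrics can be derived from the construction: each node stores f+1 chunks of size M/(fk), each lost chunk is rebuilt from f others, and the helper count is f for the first chunk plus one more for each of the remaining f chunks. The sketch below encodes that derivation and compares it with naive repair of an (n,k) MDS code, which reads k whole chunks. The function names and the (14, 10) example parameters are hypothetical.

```python
from fractions import Fraction


def src_metrics(n, k, f, file_size):
    """Per-node storage, repair bandwidth, disk accesses and rate of an
    (n,k,f)-SRC, derived from the construction described in the slides."""
    chunk = Fraction(file_size, f * k)          # size of one stored chunk
    return {
        "storage_per_node": (f + 1) * chunk,    # f coded chunks + 1 parity
        "repair_bandwidth": f * (f + 1) * chunk,  # f reads per lost chunk
        "disk_accesses": min(2 * f, n - 1),     # f, then +1 per extra chunk
        "rate": Fraction(f * k, (f + 1) * n),
    }


def mds_metrics(n, k, file_size):
    """Same metrics for an (n,k) MDS code repaired by full decoding."""
    chunk = Fraction(file_size, k)
    return {
        "storage_per_node": chunk,
        "repair_bandwidth": k * chunk,          # the whole file is downloaded
        "disk_accesses": k,
        "rate": Fraction(k, n),
    }


src = src_metrics(n=14, k=10, f=2, file_size=60)
mds = mds_metrics(n=14, k=10, file_size=60)
for name in src:
    print(f"{name:18} SRC={src[name]}  MDS={mds[name]}")
```

The trade is visible in the numbers: the SRC stores (f+1)/f times more per node in exchange for far less data read during a repair.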
(n,k,f)-SRC
   Comparison
(n,k,f)-SRC
   Asymptotics of the SRC -> MDS
    ◦ let the degree of parities f grow as a
      function of k

    ◦ Repair Bandwidth per single node repair
      (γ)



    ◦ Effective Coding Rate (R)
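The trend can be illustrated numerically. The sketch assumes f = ceil(log2 k) as the growth law (the slide only says f grows with k) and uses two quantities that follow from the construction: repair bandwidth as a fraction of the file size, (f+1)/k, and the SRC rate as a fraction of the MDS rate k/n, f/(f+1). The function name is hypothetical.

```python
def src_trend(k):
    """Return (f, bandwidth fraction, rate fraction) for a given k."""
    f = (k - 1).bit_length()            # equals ceil(log2(k)) for k >= 2
    bandwidth_fraction = (f + 1) / k    # repair bandwidth divided by M
    rate_fraction = f / (f + 1)         # SRC rate divided by k/n
    return f, bandwidth_fraction, rate_fraction


KS = (8, 64, 512, 4096)
rows = [src_trend(k) for k in KS]
for k, (f, bw, rf) in zip(KS, rows):
    print(f"k={k:5d}  f={f:2d}  bandwidth/M={bw:.4f}  rate/(k/n)={rf:.4f}")
```

As k grows, the bandwidth fraction shrinks toward zero while the rate fraction approaches one, which is the sense in which the SRC approaches an MDS code.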
Index
 About the author
 Introduction
 SRC
 Simulations
 Conclusion
Simulations
   Simulator Introduction
    ◦ One master server; the others are storage servers.
    ◦ Chunks form the smallest accessible data
      units and in our system are set to be
      64MB

   Simulator Validation
    ◦   16 machines
    ◦   1Gbps network.
    ◦   410GB data per machine
    ◦   Approximately 6400 chunks
Simulations
   Simulator Validation
    ◦ matches very well, when the percentile is
      below 95
Simulations
   Storage Cost Analysis
    ◦ 3-way replication as baseline
Simulations
   Repair Performance
    ◦ Calculated on time
    ◦ Highlights: Scalability
Simulations
   Degraded Read Performance
    ◦ The only difference is after a chunk is
      repaired, we do not write it back.
Simulations
   Data Reliability Analysis
    ◦ simple Markov model to estimate the
      reliability
    ◦ 5 years /1PB data /
    ◦ 30 min for replica / 15 min for SRC
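A Markov reliability model of this kind can be sketched as a birth-death chain whose state is the number of failed nodes. Everything specific below is an assumption: the chain structure, independent exponential failures, one repair at a time, the hypothetical node lifetime, and the reading of the slide's 30 min and 15 min figures as repair times. It is not the model used in the paper.

```python
def mttdl_hours(n, tolerated, mttf_hours, repair_hours):
    """Mean time to data loss of a birth-death Markov chain.

    State i = number of failed nodes; data is lost at tolerated + 1 failures.
    Failures leave state i at rate (n - i) / mttf_hours; a single repair
    proceeds at rate 1 / repair_hours (both are modelling assumptions).
    """
    lam = 1.0 / mttf_hours
    mu = 1.0 / repair_hours
    total, prev_gap = 0.0, 0.0
    for i in range(tolerated + 1):
        fail_rate = (n - i) * lam
        repair_rate = mu if i > 0 else 0.0
        # gap = E[time to loss from state i] - E[time to loss from state i+1]
        gap = (1.0 + repair_rate * prev_gap) / fail_rate
        total += gap
        prev_gap = gap
    return total


MTTF_HOURS = 5 * 365 * 24       # hypothetical node lifetime
slow = mttdl_hours(3, 2, MTTF_HOURS, repair_hours=0.5)
fast = mttdl_hours(3, 2, MTTF_HOURS, repair_hours=0.25)
print(f"halving the repair time multiplies MTTDL by {fast / slow:.2f}")
```

The example only varies the repair time for a group of three nodes that tolerates two failures; it shows why faster repair matters, not how replication and SRC compare.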
Simulations
   Data Reliability Analysis
      Several orders of magnitude of reliability
      Scalability
Index
 About the author
 Introduction
 SRC
 Simulations
 Conclusion
Conclusions
   Highlight
    ◦ R-S
      Low IO/bandwidth -> scalability
    ◦ replica
      High reliability
      Decent repair/degraded read performance
Critical Thinking
 Simulation
 (n, k): as n grows, erasure
  performance is weaker
 Compare
    ◦ MSR?
    ◦ Exact?
    ◦ Implementation -> Simulation

