GlusterFS 3.3 Deep-dive
                              AB Periasamy
                Office of the CTO, Red Hat

                        John Mark Walker
                  Gluster Community Guy
Topics
 Review
 Community and Evolution of GlusterFS
 Feature overview
      Granular locking
      Replication Improvements (AFR)
      Unified file and object storage
      HDFS compatibility



06/13/12
1. Quick Review



Simple Economics
        Simplicity, scalability, less cost


  Virtualized     Multi-Tenant   Automated   Commoditized


Scale on Demand   In the Cloud   Scale Out   Open Source




What is GlusterFS, Really?
           Gluster is a unified, distributed storage system
             DHT, stackable, POSIX, Swift, HDFS




What Can You Store?
      Media – Docs, Photos, Video
      VM Filesystem – VM Disk Images
      Big Data – Log Files, RFID Data
      Objects – Long Tail Data



2. Community and GlusterFS Evolution

Community-led Features
      2009 – GlusterFS easier to use
      2010 – CLI, shell, glusterd
      2011 – Marker framework, geo-replication




GlusterFS in 2011
      Scale-out NAS
      Distributed and replicated
      NFS, CIFS and native GlusterFS
      User-space, stackable architecture

 → A good platform to build on
GlusterFS in 2011: The Gaps
 Object storage – popularized by S3
      Simplicity bias – GET & PUT
      Combined with RESTful API
      Used mostly in web-based applications


GlusterFS in 2011: The Gaps
 Big data, semi-structured data
   No Hadoop, MapReduce capabilities
 Structured data (databases)
      No MongoDB, Oracle, MySQL capability

GlusterFS in 2011: The Gaps
 VM image hosting difficulties
  Difficulty in self-heal, rebalancing
 Small files
      PHP-based web sites, primary email storage

3. Feature Overview



GlusterFS in 2012: Filling the Gaps
 Better replication
   Granular locking
   Proactive self-healing
   Quorum enforcement
 Synchronous translator API

Granular Locking
           Server fails, comes back
           Files evaluated
           Block-by-block until healed

           [Diagram: Virtual Disk 1-1 and Virtual Disk 1-2 on GlusterFS Server 1; Virtual Disk 2-1 and Virtual Disk 2-2 on GlusterFS Server 2; blocks compared between the servers]
Proactive Self-healing
              Performed server-to-server
              Recovered node queries peers

              [Diagram: distributed-replicated layout with Server 1 (good), Server 2 (recovered), Server 3 (good) and Server 4 (good). A hidden directory holds Symlink 1, Symlink 2 and Symlink 3. File 1, File 2 and File 3 are self-healed between Server 2 and Server 4.]


Split Brain
           Nodes cannot see each other, but can all still write
           Often due to network outages
           Sometimes results in conflicts
           Up to 3.2, GlusterFS had no concept of “quorum”


Quorum Enforcement
           Which node has valid data?
           If quorum, keep writing, else stop
               Configurable option

           Server 1          Server 2          Server 3
           No quorum         Quorum            Quorum
           Stops writing     Keeps writing     Keeps writing

           [Diagram: broken connection between Server 1 and Servers 2 and 3]
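
The quorum decision on this slide can be sketched in Python. This is a minimal illustration that assumes a simple majority rule; the slide only says the behaviour is a configurable option, so the rule, the function name `has_quorum`, and the server names are assumptions rather than GlusterFS internals. Each node counts itself plus the peers it can still reach, and keeps writing only if that group is more than half of the replica set.

```python
def has_quorum(reachable: int, replica_count: int) -> bool:
    """Majority rule: the reachable group must exceed half of the replica set."""
    return 2 * reachable > replica_count


# Hypothetical network split matching the slide: server1 is cut off,
# while server2 and server3 can still see each other.
partitions = [{"server1"}, {"server2", "server3"}]
replica_count = 3

decisions = {
    node: "keeps writing" if has_quorum(len(group), replica_count) else "stops writing"
    for group in partitions
    for node in sorted(group)
}
print(decisions)
```

Because at most one side of a split can hold a strict majority, only that side accepts writes, which avoids the conflicting updates that define split brain.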



Quorum Enforcement
              After connection restored, self-heal kicks off


           Replica 1         Replica 2         Replica 3
           No quorum         Quorum            Quorum
           Stops writing     Keeps writing     Keeps writing

           [Diagram: self-heal between Replica 1 and Replicas 2 and 3]




GlusterFS in 2012: Filling the Gaps
 Synchronous translator API
 Unified File and Object Storage (UFO)
 HDFS-compatible storage layer


Synchronous Translator API
 GlusterFS runs asynchronously
  Non-blocking I/O, for performance
 Writing code for async I/O is confusing


Synchronous Translator API
 3.3 introduces synchronous translators
      Easier to write
      Great for non-core operations
        e.g. background scrubbing

Unified File and Object (UFO)
           S3, Swift-style object storage
           Access via UFO or Gluster mount

           [Diagram: one client sends an HTTP request (ID=/dir/sub/sub2/file) to the Proxy; another client uses an NFS or GlusterFS mount. Mapping: Account = Volume, Container = Directory, Object = File]
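
The mapping in this diagram (account to volume, container to directory, object to file) can be sketched in Python. This is a hypothetical illustration, not the UFO implementation: the function name `object_to_mount_path`, the `/mnt` mount root, and the volume name `myvolume` are assumptions. It shows how the same data could be addressed as an object over HTTP or as a file on a mounted volume.

```python
from pathlib import PurePosixPath


def object_to_mount_path(account: str, container: str, obj: str,
                         mount_root: str = "/mnt") -> str:
    """Translate an object name into the file path seen on a mounted volume.

    Assumed layout: the account names a volume mounted at <mount_root>/<volume>,
    the container is a top-level directory, and the object is a file below it.
    """
    return str(PurePosixPath(mount_root, account, container, obj))


# Using the ID from the slide, read as container "dir" and object "sub/sub2/file".
path = object_to_mount_path("myvolume", "dir", "sub/sub2/file")
print(path)
```

A file written through the mount at that path would then be readable as an object, and the reverse, which is the point of the "access via UFO or Gluster mount" bullet.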

Unified File and Object (UFO)
           Your gateway to the cloud
           Your data, accessed your way




HDFS Compatibility
           Run MapReduce jobs on GlusterFS
           Add unstructured data to Hadoop


           [Diagram: a Hadoop Server with Local Disk connects through the HDFS Connector (Jar file) to four GlusterFS servers]
Thank you!
                     AB Periasamy
       Office of the CTO, Red Hat
                   ab@redhat.com

               John Mark Walker
         Gluster Community Guy
          johnmark@redhat.com



Editor's Notes

  • #17: Previously, self-heal needed a client mount that loaded the replication translator. A client mount is no longer needed, because the server now loads the replication translator to enable server-to-server self-heal.