Store your trillions of bytes using
commodity hardware and open source
(GlusterFS)
Theophanis K. Kontogiannis
RHC{SA,E,EV,ESM,I,X}
tkonto@gmail.com
@tkonto
The problem
● Data growth beyond manageable sizes
● Data growth beyond cost-effective sizes
How much would it cost to store 100 PB of
unstructured data on a traditional storage array?
The idea
Create a scalable data storage infrastructure,
presented uniformly to clients, using:
● Commodity (even off-the-shelf) hardware
● Open standards
The concept
The vision
GlusterFS:
Open - Unified - Extensible
Scalable - Manageable - Reliable
Scale-out Network Attached Storage (NAS) Software Solution
for
On-Premise - Virtualized - Cloud Environments
The implementation
● Open-source, distributed file system capable
of scaling to thousands of petabytes (actually,
72 brontobytes!) and handling thousands of clients.
For scale:
1024 Terabytes = 1 Petabyte
1024 Petabytes = 1 Exabyte
1024 Exabytes = 1 Zettabyte
1024 Zettabytes = 1 Yottabyte
1024 Yottabytes = 1 Brontobyte
● Clusters together storage building blocks over InfiniBand
RDMA or TCP/IP interconnects, aggregating disk and memory
resources and managing data in a single global namespace.
● Based on a stackable user-space design; delivers
exceptional performance for diverse workloads.
● Self-healing
● Not tied to I/O profiles, hardware, or OS
- The question is: how much is a brontobyte?
- The question is: WHO CARES?
Can it really support that much?
Yes it can!
2^32 (max subvolumes of the distribute translator)
×
18 exabytes (max XFS volume size)
=
72 brontobytes
(or 89,131,682,828,547,379,792,736,944,128 bytes)
GlusterFS supports 2^128 (UUID) inodes.
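Spelling out that arithmetic (a sketch assuming binary prefixes throughout, i.e. 18 exabytes read as 18 × 2^60 bytes, the only reading consistent with the byte count above):
\[
2^{32} \times \bigl(18 \times 2^{60}\,\mathrm{B}\bigr)
  = 18 \times 2^{92}\,\mathrm{B}
  = 72 \times 2^{90}\,\mathrm{B}
  \approx 72\ \text{brontobytes},
\]
\[
\text{since } 1\ \text{brontobyte} = 1024^{9}\,\mathrm{B} = 2^{90}\,\mathrm{B}.
\]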
And this is how it goes
A bit of (business as usual) history
● Gluster Inc. was founded in 2005
● Focused on public and private cloud storage
● Its main product, GlusterFS, was written by
Anand Babu Periasamy, Gluster’s founder
and CTO
● Received $8.5M in 2010 via VC funding
● Acquired for $136M by Red Hat in 2011
GlusterFS <--> Red Hat Storage
● Gluster.com redirects to RHS pages
● Gluster.org actively supported by Red Hat
What is important is the integration of
technologies in ways that demonstrably
benefit the customers.
Components
● brick
The storage filesystem that has been assigned to a volume.
● client
The machine which mounts the volume (this may also be a server).
● server
The machine (virtual or bare metal) which hosts the actual filesystem in which
data will be stored.
● subvolume
A brick after being processed by at least one translator.
● volume
The final share, after it has passed through all the translators.
● translator
Code that implements the actual geometry/location/distribution of files on the
disks comprising a volume, and which largely determines perceived performance.
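To make the brick -> subvolume -> volume chain concrete, here is a minimal sketch of a hand-written server-side volfile in classic GlusterFS volfile syntax; the translator names (posix1, locks1, server1) and the brick path /data/brick1 are hypothetical:

# A minimal, hypothetical server-side volfile: each volume...end-volume
# stanza declares one translator; "subvolumes" stacks it on the one below,
# turning that lower stanza into a subvolume.
cat > /tmp/example-server.vol <<'EOF'
# storage translator: exposes the raw brick directory
volume posix1
  type storage/posix
  option directory /data/brick1
end-volume

# features translator stacked on the brick (posix1 is now a subvolume)
volume locks1
  type features/locks
  subvolumes posix1
end-volume

# protocol translator: exports the finished stack to clients as the volume
volume server1
  type protocol/server
  option transport-type tcp
  subvolumes locks1
end-volume
EOF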
The Outer Atmosphere View
The 100,000 ft view
Storage Node
The 50,000 ft view
The 10,000 ft view
The ground-level view
...and the programmer's view
/* Resolving a translator's entry points after dlopen(): every translator
 * shared object must export fops (file-operation table), cbks (callback
 * table), and init/fini (constructor and destructor). */
if (!(xl->fops = dlsym (handle, "fops"))) {
        gf_log ("xlator", GF_LOG_WARNING, "dlsym(fops) on %s",
                dlerror ());
        goto out;
}
if (!(xl->cbks = dlsym (handle, "cbks"))) {
        gf_log ("xlator", GF_LOG_WARNING, "dlsym(cbks) on %s",
                dlerror ());
        goto out;
}
if (!(xl->init = dlsym (handle, "init"))) {
        gf_log ("xlator", GF_LOG_WARNING, "dlsym(init) on %s",
                dlerror ());
        goto out;
}
if (!(xl->fini = dlsym (handle, "fini"))) {
        gf_log ("xlator", GF_LOG_WARNING, "dlsym(fini) on %s",
                dlerror ());
        goto out;
}
Course of action
● Partition, format, and mount the bricks:
  ● Format the partition
  ● Mount the partition as a Gluster "brick"
  ● Add an entry to /etc/fstab
● Install the Gluster packages on all nodes
● Run the gluster peer probe command
● Configure your Gluster volume (and its translators)
● Test using the volume
A sketch of these steps follows below.
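A minimal sketch of that course of action for two nodes (node1, node2), each with a spare partition /dev/sdb1, building a two-way replicated volume gv0; all hostnames, device names, and paths are hypothetical, and the package/service names assume a RHEL-family distribution:

# On every node: format the brick filesystem (XFS, as the deck's math
# assumes) and mount it persistently.
mkfs.xfs -i size=512 /dev/sdb1
mkdir -p /export/brick1
echo '/dev/sdb1 /export/brick1 xfs defaults 0 0' >> /etc/fstab
mount /export/brick1

# On every node: install the Gluster packages and start the daemon.
yum install -y glusterfs-server
service glusterd start

# From node1: form the trusted pool, then create and start the volume.
gluster peer probe node2
gluster volume create gv0 replica 2 node1:/export/brick1 node2:/export/brick1
gluster volume start gv0

# From a client: mount via the native FUSE client and test.
mkdir -p /mnt/gluster
mount -t glusterfs node1:/gv0 /mnt/gluster
dd if=/dev/zero of=/mnt/gluster/testfile bs=1M count=16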
Translators?
Translator Type   Functional Purpose
Storage           Lowest-level translators; store and access data on the local file system.
Debug             Provide interfaces and statistics for errors and debugging.
Cluster           Handle distribution and replication of data as it is written to and read
                  from bricks and nodes.
Encryption        Extension translators for on-the-fly encryption/decryption of stored data.
Protocol          Interface translators for client/server authentication and communication.
Performance       Tuning translators to adjust for workload and I/O profiles.
Bindings          Add extensibility, e.g. the Python interface written by Jeff Darcy to
                  extend API interaction with GlusterFS.
System            System-access translators, e.g. interfacing with file-system access control.
Scheduler         I/O schedulers that determine how to distribute new write operations
                  across clustered systems.
Features          Add extra features such as quotas, filters, locks, etc.
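Translators are usually enabled and tuned per volume from the gluster CLI rather than by hand-editing volfiles. A small sketch against a hypothetical volume gv0 (the option and subcommand names are real gluster CLI features; the values are arbitrary examples):

# Performance translator tuning: grow the io-cache read cache.
gluster volume set gv0 performance.cache-size 256MB

# Features translator: enable quotas and cap one directory.
gluster volume quota gv0 enable
gluster volume quota gv0 limit-usage /projects 10GB

# Review the volume's current options.
gluster volume info gv0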
Not comfortable with the command line?
Benchmarks?
Method and platforms pretty much standard:
● Multiple 'dd' runs with varying block sizes are read and written from
multiple clients simultaneously (a sketch of such runs follows this list).
● GlusterFS brick configuration (16 bricks):
Processor - Dual Intel(R) Xeon(R) CPU 5160 @ 3.00GHz
RAM - 8GB FB-DIMM
Disk - SATA-II 500GB
HCA - Mellanox MHGS18-XT/S InfiniBand HCA
● Client configuration (64 clients):
Processor - Single Intel(R) Pentium(R) D CPU 3.40GHz
RAM - 4GB DDR2 (533 MHz)
Disk - SATA-II 500GB
HCA - Mellanox MHGS18-XT/S InfiniBand HCA
● Interconnect switch: Voltaire InfiniBand switch (14U)
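A sketch of the kind of 'dd' runs described above, assuming each client has the volume mounted at /mnt/gluster; the file names, sizes, and block sizes are arbitrary examples:

# Sequential write at one block size (run on all 64 clients at once);
# conv=fsync makes dd flush before reporting a rate.
dd if=/dev/zero of=/mnt/gluster/test.$(hostname) bs=1M count=4096 conv=fsync

# Drop the local page cache, then time the sequential read back.
echo 3 > /proc/sys/vm/drop_caches
dd if=/mnt/gluster/test.$(hostname) of=/dev/null bs=1M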
Size does not matter....
...number of participants does
Suck the throughput, you can!
And you can geo-distribute it :)
Multi-site cascading
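A sketch of kicking off geo-replication, assuming a master volume gv0 and a remote slave volume gv0-backup; the exact CLI syntax varies across GlusterFS releases, and all host and volume names here are hypothetical:

# Start replicating the master volume to the remote slave volume.
gluster volume geo-replication gv0 slave-host::gv0-backup start

# Check session health and progress.
gluster volume geo-replication gv0 slave-host::gv0-backup status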
Enough food for thought...
● www.redhat.com/products/storage-server/
● www.gluster.org
Now back to your consoles!!!!
Thank you...