Towards a Scalable File System
Progress on adapting BlobSeer to WAN scale
for the HGMDS distributed metadata system

Viet-Trung Tran, Gabriel Antoniu, Alexandru Costan (INRIA - Rennes)
In collaboration with Kohei Hiraga, Osamu Tatebe (U Tsukuba)



FP3C meeting
Bordeaux, 2 – 3 September 2011
Plan

1. Background and context
2. Goal
3. Approach and solution
4. Preliminary evaluation
5. Conclusion




1. Background
BlobSeer & HGMDS




BlobSeer: A large-scale data management service
Generic data-management platform for huge, unstructured data
•  Huge data (TB scale), stored as BLOBs
•  Highly concurrent, fine-grain access (MB scale): read/write/append
•  Prototype available

Key design features
•  Decentralized metadata management
•  Beyond MVCC: multiversioning exposed to the user
•  Lock-free write access through versioning

A back-end for higher-level, sophisticated data management systems
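
The versioning idea above can be illustrated with a minimal, single-process sketch. It is not BlobSeer's API: the class and method names are hypothetical, and each snapshot is stored as a full copy for simplicity, whereas a real system would share unmodified pages between versions. The sketch only shows the user-visible behaviour: every write or append yields a new version, and older versions stay readable.

```python
class VersionedBlob:
    """Toy blob in which every update produces a new immutable snapshot.

    Hypothetical illustration only; names and storage layout are assumptions.
    """

    def __init__(self):
        self._snapshots = [b""]  # version 0 is the empty blob

    def latest(self):
        """Return the most recent version number."""
        return len(self._snapshots) - 1

    def append(self, data):
        """Append data to the latest snapshot and return the new version."""
        self._snapshots.append(self._snapshots[-1] + data)
        return self.latest()

    def write(self, offset, data):
        """Overwrite bytes at offset in the latest snapshot; return the new version."""
        base = self._snapshots[-1]
        if offset > len(base):
            raise ValueError("offset beyond end of blob")
        updated = base[:offset] + data + base[offset + len(data):]
        self._snapshots.append(updated)
        return self.latest()

    def read(self, version, offset, size):
        """Read size bytes at offset from any published version."""
        return self._snapshots[version][offset:offset + size]


blob = VersionedBlob()
v1 = blob.append(b"hello")
v2 = blob.write(0, b"J")
old = blob.read(v1, 0, 5)   # the older version is still readable
new = blob.read(v2, 0, 5)
```

Because snapshots are never modified in place, a reader holding a version number is unaffected by later writers; this is the property that makes lock-free writes possible, although the sketch itself does not model concurrency.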




BlobSeer: Architecture

Clients
•  Perform fine-grain blob accesses
Providers
•  Store the pages of the blob
Provider manager
•  Monitors the providers
•  Favours data load balancing
Metadata providers
•  Store information about page location
Version manager
•  Ensures concurrency control

[Figure: architecture diagram showing clients, providers, the provider manager, the version manager, and metadata providers]
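
As an illustration of the provider manager's load-balancing role, here is a small sketch that assigns each new page to the provider currently holding the fewest pages. The slide does not specify the actual placement policy, so the least-loaded strategy, the class name, and the provider identifiers are all assumptions.

```python
import heapq


class ProviderManagerSketch:
    """Hypothetical provider manager using a least-loaded placement policy."""

    def __init__(self, provider_ids):
        # Heap of (pages stored, provider id); all providers start empty.
        self._heap = [(0, pid) for pid in sorted(provider_ids)]
        heapq.heapify(self._heap)

    def allocate(self, n_pages):
        """Return one provider id per page, always picking the least loaded."""
        chosen = []
        for _ in range(n_pages):
            load, pid = heapq.heappop(self._heap)
            chosen.append(pid)
            heapq.heappush(self._heap, (load + 1, pid))
        return chosen


manager = ProviderManagerSketch(["p1", "p2", "p3"])
placement = manager.allocate(4)
```

In this sketch a client would ask the manager where to store its pages, then send each page directly to the returned provider.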


HGMDS: A distributed metadata management system for global file systems

•  Multi-master file system metadata server (MDS).
•  Manages the inode structure.
•  High-latency networks do not affect metadata operation performance.
      - Both reading and writing.
•  One MDS per site.
•  Metadata versioning using vector clocks for collision detection.
•  Automatic collision resolution on the system side.

[Figure: Sites A, B and C connected through the Internet; file system clients send mkdir/rmdir/create/stat/unlink to an HGMDS server, and updates are propagated in the background]
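
Collision detection with vector clocks can be sketched as follows. This is the textbook comparison, not HGMDS code: the function name and the dictionary representation (site name mapped to update counter) are assumptions made for illustration. Two metadata versions collide when neither clock is less than or equal to the other.

```python
def compare_clocks(a, b):
    """Compare two vector clocks given as {site: counter} dictionaries.

    Returns "equal", "before" (a happened before b), "after", or
    "concurrent" (a collision: neither update has seen the other).
    """
    sites = set(a) | set(b)
    a_le_b = all(a.get(s, 0) <= b.get(s, 0) for s in sites)
    b_le_a = all(b.get(s, 0) <= a.get(s, 0) for s in sites)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


# Site A and site B both updated the same entry without seeing each other.
result = compare_clocks({"A": 2, "B": 0}, {"A": 1, "B": 1})
```

Only the "concurrent" case needs the automatic resolution mentioned on the slide; in the other cases one version simply supersedes the other.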

2. Goal
A joint architecture integrating BlobSeer and HGMDS




Goal
BlobSeer                         HGMDS
Data management                  Metadata management
Typically on a single site       Global scale, multiple sites




Idea: build a global file system deployed on multiple sites by integrating
BlobSeer with HGMDS

Potential benefits:
•  HGMDS: efficient multi-site file metadata management
•  BlobSeer: concurrency-optimized access to globally shared data




3. Our approach and solution




Two approaches

Multiple BlobSeer instances
•  One BlobSeer / site



A single BlobSeer-WAN instance spanning geographically distributed sites




1st approach: 1 BlobSeer instance / site




[Figure: a client accessing one BlobSeer instance per site]




1st approach: Zoom




High latency when accessing remote BLOBs:
•  Too many remote requests for small metadata
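
The cost of many small remote requests can be estimated with a rough model. The function below is a back-of-the-envelope sketch, assuming the metadata requests are issued one after another so that each pays a full round trip; the function name and the numbers in the usage line are hypothetical and are not measurements from this work.

```python
def metadata_latency_ms(n_requests, rtt_ms):
    """Estimated time spent on n sequential metadata requests.

    Assumes each request costs one full round trip and that the
    payload is small enough for transfer time to be negligible.
    """
    return n_requests * rtt_ms


# Hypothetical values: 10 metadata requests, 0.2 ms LAN vs 20 ms WAN round trip.
local_ms = metadata_latency_ms(10, 0.2)
remote_ms = metadata_latency_ms(10, 20.0)
```

Under these assumptions the round-trip time dominates, which is why resolving metadata locally matters more than the bandwidth of the inter-site link.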
2nd approach: 1 BlobSeer-WAN instance over geographically distributed sites

Multiple version managers
•  1 version manager/site
Multiple provider managers
•  1 provider manager/site


On each site
•  Multiple data providers and metadata servers
•  Data providers are controlled by the local provider manager
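
The per-site layout described above can be written down as a small data structure. This is only a sketch of the topology: the dataclass, the helper function, and the host naming scheme are hypothetical and do not correspond to BlobSeer-WAN's configuration format.

```python
from dataclasses import dataclass, field


@dataclass
class SiteDeployment:
    """Services running on one site of a BlobSeer-WAN instance (illustrative)."""
    name: str
    version_manager: str       # exactly one per site
    provider_manager: str      # exactly one per site
    data_providers: list = field(default_factory=list)
    metadata_servers: list = field(default_factory=list)


def plan_site(name, n_data, n_meta):
    """Build a site layout using made-up host names."""
    return SiteDeployment(
        name=name,
        version_manager=f"{name}-vmanager",
        provider_manager=f"{name}-pmanager",
        data_providers=[f"{name}-provider-{i}" for i in range(n_data)],
        metadata_servers=[f"{name}-meta-{i}" for i in range(n_meta)],
    )


deployment = [plan_site("rennes", 3, 2), plan_site("grenoble", 3, 2)]
```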




Idea: leverage locality for remote metadata accesses








Metadata I/O is resolved locally
2nd approach: I/O scheme in BlobSeer-WAN

Writing
•  Publish the version on the local version manager
•  Write metadata to the local metadata servers
•  Write data to the local data providers


Reading (read-your-writes in many cases)
•  Request a version from the local version manager
•  Access metadata locally
•  Access remote or local providers as necessary
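
The write and read steps above can be sketched in a few lines. This is a toy in-memory model, not the BlobSeer-WAN implementation: the `Site` class, the `(site, counter)` version identifiers, and the `propagate_metadata` helper are assumptions introduced to show which steps stay local and where a remote access can occur.

```python
class Site:
    """One site: local version manager, metadata servers and data providers."""

    def __init__(self, name):
        self.name = name
        self.counter = 0     # local version manager: last version number issued
        self.latest = None   # latest version published at this site
        self.metadata = {}   # local metadata servers: version -> (owner site, page key)
        self.pages = {}      # local data providers: page key -> bytes


def write(site, payload):
    """Write path, in the order listed above; every step is local."""
    site.counter += 1
    version = (site.name, site.counter)
    site.latest = version                      # publish on the local version manager
    key = f"{site.name}-{site.counter}"
    site.metadata[version] = (site.name, key)  # write metadata locally
    site.pages[key] = payload                  # write data locally
    return version


def propagate_metadata(src, dst):
    """Hypothetical background step: copy metadata only, never the data."""
    dst.metadata.update(src.metadata)


def read(sites, site_name, version=None):
    """Read path: local version lookup, local metadata, local or remote data."""
    site = sites[site_name]
    if version is None:
        version = site.latest                  # ask the local version manager
    owner, key = site.metadata[version]        # local metadata access
    return sites[owner].pages[key]             # local or remote data provider


sites = {name: Site(name) for name in ("rennes", "grenoble")}
v = write(sites["rennes"], b"abc")
own = read(sites, "rennes")                    # read-your-writes, fully local
propagate_metadata(sites["rennes"], sites["grenoble"])
remote = read(sites, "grenoble", v)            # metadata local, data fetched from rennes
```

In the second read only the final data fetch crosses sites; the version and metadata lookups are served locally, which is the point of the scheme.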




Vector clocks and optimistic metadata replication
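
A minimal sketch of optimistic replication driven by vector clocks is given below. Each replica applies local updates immediately and ships them later; on receipt, an update is applied if its clock dominates the current one, ignored if it is stale, and treated as a collision otherwise. The `Replica` class is hypothetical, and the tie-break rule used here (the update from the alphabetically smallest site name wins) is an assumption chosen only because it is deterministic; the rule actually used by the system is not given in these slides.

```python
def dominates(a, b):
    """True if clock a has seen every update recorded in clock b."""
    return all(a.get(site, 0) >= count for site, count in b.items())


def merge(a, b):
    """Element-wise maximum of two vector clocks."""
    return {s: max(a.get(s, 0), b.get(s, 0)) for s in set(a) | set(b)}


class Replica:
    """Metadata replica at one site (illustrative, in-memory)."""

    def __init__(self, site):
        self.site = site
        self.entries = {}  # path -> (value, clock, origin site)

    def local_update(self, path, value):
        """Apply an update locally without waiting for other sites."""
        _, clock, _ = self.entries.get(path, (None, {}, None))
        clock = dict(clock)
        clock[self.site] = clock.get(self.site, 0) + 1
        self.entries[path] = (value, clock, self.site)

    def receive(self, path, value, clock, origin):
        """Apply an update propagated from another site."""
        if path not in self.entries:
            self.entries[path] = (value, dict(clock), origin)
            return "applied"
        cur_value, cur_clock, cur_origin = self.entries[path]
        if dominates(cur_clock, clock):
            return "ignored"        # already seen, or older than what we hold
        if dominates(clock, cur_clock):
            self.entries[path] = (value, dict(clock), origin)
            return "applied"
        # Concurrent updates: resolve with the assumed deterministic rule.
        win_value, win_origin = min(
            (cur_value, cur_origin), (value, origin), key=lambda pair: pair[1]
        )
        self.entries[path] = (win_value, merge(cur_clock, clock), win_origin)
        return "collision"


a, b = Replica("A"), Replica("B")
a.local_update("/dir", "x")
b.local_update("/dir", "y")
update_a = a.entries["/dir"]
update_b = b.entries["/dir"]
outcome_b = b.receive("/dir", *update_a)
outcome_a = a.receive("/dir", *update_b)
```

Because both replicas apply the same rule to the same pair of updates, they end up with identical entries without any coordination at write time.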




Expected benefits

•  On WAN: BlobSeer coordinates with HGMDS to provide a
   global versioning file system
     - Low-latency metadata I/O
     - Eventual consistency model
     - Load balancing/fault tolerance
•  On LAN:
     - Distributed version management
     - Load balancing/fault tolerance




4. Preliminary evaluation
BlobSeer-WAN on Grid5000 (G5K)




Testbed

Using 2 sites of G5K
•  Rennes: 40 nodes
     • 30 nodes reserved for BlobSeer services
     • 10 nodes for clients
•  Grenoble: 40 nodes
     • 30 nodes reserved for BlobSeer services
     • 10 nodes for clients
•  Inter-site network: 10 Gbps




Concurrent appending: 512 MB/client




5. Conclusion
Ongoing work




Summary
Discussed the integration of BlobSeer and HGMDS:
•  BlobSeer-WAN extension is required


BlobSeer-WAN
•  Preliminary results look encouraging
•  Performance of BlobSeer-WAN on two sites is similar to that of
   vanilla BlobSeer on a single site
•  Prototype available at BlobSeer’s repository/branches/BlobSeer-WAN-dev/


HGMDS
•  Implementation almost done
•  Works across multiple sites
•  Collisions are resolved automatically by a rule
Next steps

•  A more extensive evaluation of BlobSeer-WAN
•  Integrate BlobSeer-WAN with HGMDS
•  Preliminary evaluation of HGMDS with BlobSeer-WAN on
   Grid5000 and on the Japanese clusters
•  Submit a co-authored paper by Spring 2012
•  Next internship: Kohei at Inria Rennes




Thank you!




