Introduction to Filesystem
Outline
What is a File System
What's GlusterFS and NFS
How Aurora works
FileSystem
[Diagram: the I/O stack. User-Space: User-Application. Kernel (OS): File System, Kernel Block I/O, Kernel Driver. Hardware: Device (HDD/SSD), each device divided into Partitions.]
Hardware
HDD/SSD
We won’t use the whole disk
You need to plan how to use the disk
○ How many partitions
○ Size of each partition
/dev/sda (100G)
○ /dev/sda1 (50G)
○ /dev/sda2 (25G)
○ /dev/sda3 (25G)
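A partition plan like the one above can be sanity-checked before touching the disk. The following is a minimal sketch, not a partitioning tool: the helper name `unallocated` is hypothetical, and the sizes are the ones from the slide.

```python
# Sizes in G, copied from the slide's /dev/sda example.
DISK_SIZE_G = 100
PLAN = {"/dev/sda1": 50, "/dev/sda2": 25, "/dev/sda3": 25}


def unallocated(disk_size_g, plan):
    """Return the space left after all partitions (hypothetical helper).

    Raises ValueError if the plan asks for more space than the disk has.
    """
    used = sum(plan.values())
    if used > disk_size_g:
        raise ValueError(f"plan needs {used}G but the disk has {disk_size_g}G")
    return disk_size_g - used


print(unallocated(DISK_SIZE_G, PLAN))  # prints 0: the plan uses the whole disk
```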
The basic concept of Linux FileSystem
FileSystem
Basic FS
○ EXT4/ZFS/NFS/NTFS/BTRFS
Logical FS
○ LVM
Distributed FS
○ GlusterFS/Ceph/OpenEBS/BeeGFS/LizardFS
Only focus on GlusterFS and NFS today
Write/Read Path
[Diagram: writes travel down the stack from User-Application through File System, Kernel Block I/O and Kernel Driver to the partitions on the Device (HDD/SSD); reads travel back up the same path.]
LVM
It’s impossible to predict the usage of each partition.
Sometimes you need to re-partition the whole disk to fit your users' usage.
We can use LVM
○ Logical Volume Manager
LVM
RAID
RAID (software RAID)
○ MDADM
○ Block Level
○ Also integrates with LVM for more flexible management
File system implementations
○ BTRFS
○ ZFS
RAID
NFS
Network File System
○ No RAID support of its own
Read/Write via Network
[Diagram: four NFS clients connected through a switch to a single NFS server.]
The problem of filesystem
IOPS (input/output operations per second)
○ Slow
RAID mechanisms aren't designed for a large number of disks
○ 100+ disks (tolerates only 2 or 3 broken disks)
Why DFS
Scale up vs. scale out
Distributed filesystem
Allows access to files from multiple hosts sharing via a computer network.
Multiple users on multiple machines can share files and storage resources.
Components
○ Server
○ Client
○ Metadata Servers
Different features
○ Security/Redundancy
GlusterFS
Distributed File System
Volume
○ Distributed
○ Replicated
○ Distributed-Replicated
○ Stripe
○ Disperse
Metrics
○ Large/Small files
○ Random/Sequential access
Distributed
File Based
Like RAID-0
Volume = V1 + V2 + … + Vn
IOPS
○ Read: v * n
○ Write: v * n
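"File based" here means each whole file lives on exactly one brick. The sketch below illustrates that idea with a plain modulo hash. This is a simplification chosen for illustration: GlusterFS uses its own hashing scheme, and the function name `pick_brick` and the brick names are hypothetical.

```python
import hashlib


def pick_brick(filename, bricks):
    """Map a file name to exactly one brick with a stable hash.

    Simplified stand-in for illustration only; not GlusterFS's real algorithm.
    """
    digest = hashlib.sha256(filename.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(bricks)
    return bricks[index]


# Hypothetical brick names.
bricks = ["server1:/brick", "server2:/brick", "server3:/brick"]
for name in ["a.txt", "b.txt", "c.txt", "d.txt"]:
    print(name, "->", pick_brick(name, bricks))
```

Because the hash depends only on the file name, any client can locate a file without asking a central metadata server, and different files spread their I/O over different bricks.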
Replicated
File Based
Like RAID-1
Assume Replication Factor =2
Volume = sum(V)/2
IOPS
○ Read: v*n
○ Write: v*n/2
Distributed-Replicated
File-Based
Servers mirror each other (within the same replica set)
Volume: sum(V)/2
IOPS:
○ Read: n *v
○ Write: n*v/2
Stripe
Object-Based
Splits a file across many bricks
Volume: sum(v)
IOPS:
○ Read: n*v
○ Write: n*v
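The capacity and IOPS rules of thumb on the preceding volume-type slides can be put side by side in one small function. This is a sketch of the slides' formulas only, not a performance model: `volume_estimate` is a hypothetical name, `v` is the per-brick IOPS, and generalising the slides' "/2" to an arbitrary replication factor is an assumption.

```python
def volume_estimate(kind, brick_sizes, v, replica=2):
    """Apply the slides' rules of thumb (hypothetical helper).

    brick_sizes: capacity of each brick; v: IOPS of a single brick.
    """
    total = sum(brick_sizes)
    n = len(brick_sizes)
    if kind in ("distributed", "stripe"):
        # Volume = sum(V); reads and writes both scale with n.
        return {"capacity": total, "read_iops": v * n, "write_iops": v * n}
    if kind in ("replicated", "distributed-replicated"):
        # Every write lands on `replica` bricks, so capacity and
        # write IOPS are divided by the replication factor.
        return {
            "capacity": total / replica,
            "read_iops": v * n,
            "write_iops": v * n / replica,
        }
    raise ValueError(f"unknown volume kind: {kind}")


# Example: four bricks of 100G, 1000 IOPS each.
for kind in ("distributed", "replicated", "distributed-replicated", "stripe"):
    print(kind, volume_estimate(kind, [100] * 4, 1000))
```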
Disperse
Erasure Coding
Volume: based on config
Erasure Coding
Erasure Coding
○ N = k + m (data + redundancy)
○ Take 6 = 4 + 2 as an example
[Diagram: a 10 MB file is stored as six 2.5 MB fragments, one per server. Servers 1 to 4 hold data; Servers 5 and 6 hold redundancy.]
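The numbers in the 6 = 4 + 2 example follow from simple arithmetic, sketched below. The assumptions are that all fragments are the same size and that any m fragments can be lost without losing data (the slide does not name the code used); `erasure_layout` is a hypothetical helper.

```python
def erasure_layout(file_mb, k, m):
    """Sizes for an N = k + m erasure-coded file (hypothetical helper)."""
    fragment_mb = file_mb / k          # the file is cut into k data fragments
    return {
        "fragments": k + m,            # N servers, one fragment each
        "fragment_mb": fragment_mb,
        "stored_mb": fragment_mb * (k + m),
        "overhead": (k + m) / k,       # raw space used per MB of data
        "tolerated_failures": m,       # assumption: any m fragments may be lost
    }


layout = erasure_layout(file_mb=10, k=4, m=2)
print(layout)
# fragment_mb is 2.5, stored_mb is 15.0, overhead is 1.5
```

For comparison, a replicated volume with replication factor 2 would store 20 MB for the same file and still tolerate only one lost copy.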
Aurora
How Kubernetes works with volume
○ local
○ GlusterFS
○ NFS
Local Mount
Read/Write from/to local disk
○ Memory
○ Disk
Can’t share data across nodes
Share data in the same node
○ Access control
○ Read-Write many
○ Read-Write once
○ Read-Only
Mount Local Volume
[Diagram: three servers under the Platform (K8S) with six pods. Each pod mounts a local volume on its own server. Labels: Share, No Share.]
GlusterFS
Read/Write from/to GlusterFS
○ Memory
○ Disk/Network Access
Share data across nodes (same name)
○ Access Control
■ Read-Write Once
■ Read-Write Many
■ Read-Only
Request the volume size from GlusterFS
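In Kubernetes terms, an access mode plus a requested size is what a PersistentVolumeClaim expresses. The sketch below builds such a manifest as a Python dict. The field names are the standard Kubernetes PersistentVolumeClaim fields, but the claim name, the storage class name, and the idea that Aurora uses claims this way are assumptions made for illustration.

```python
import json

# Access modes defined by Kubernetes for the three cases on the slide.
ACCESS_MODES = {"ReadWriteOnce", "ReadWriteMany", "ReadOnlyMany"}


def make_pvc(name, size, access_mode, storage_class):
    """Build a PersistentVolumeClaim manifest (hypothetical helper)."""
    if access_mode not in ACCESS_MODES:
        raise ValueError(f"unsupported access mode: {access_mode}")
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": [access_mode],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": size}},
        },
    }


# "shared-data" and "glusterfs-example" are made-up names.
pvc = make_pvc("shared-data", "10Gi", "ReadWriteMany", "glusterfs-example")
print(json.dumps(pvc, indent=2))
```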
Mount GlusterFS
[Diagram: pods on three servers under the Platform (K8S) mount remote volumes of 5G, 10G and 25G from GlusterFS, which is backed by the local disks of its servers.]
NFS
Read/Write from/to NFS
○ Memory
○ Network Access
Share data across nodes (same name)
○ Access control (no, since NFS doesn’t support those features)
Request the data size
○ It doesn’t support this feature
Mount NFS
[Diagram: pods on three servers under the Platform (K8S) all mount from one NFS server, which becomes the performance bottleneck.]
How about storage for AI training
Storage systems available today are optimized for a design point that’s different from what AI truly requires.
They are optimized for structured workloads: predictable, sequential access, not random patterns.
Pure Storage (FlashBlade)
A single storage server for NVIDIA DGX-1
Based on NFS
Storage
○ Flash-Array
○ PB
○ 17 GB/s bandwidth
○ 1.5M IOPS
Network
○ 8 * 40Gb/s
Price <$1 per GB
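A quick unit check relates the network and storage figures above. This is a sketch under two stated assumptions: the eight 40 Gb/s links can be used in aggregate, and protocol overhead is ignored. The variable names are made up for the example.

```python
# Figures from the slide.
LINKS = 8
LINK_GBIT_PER_S = 40
STORAGE_GBYTE_PER_S = 17

# 8 bits per byte; protocol overhead is ignored (assumption).
network_gbyte_per_s = LINKS * LINK_GBIT_PER_S / 8

# The slower side caps what clients can actually read.
bottleneck = min(network_gbyte_per_s, STORAGE_GBYTE_PER_S)

print(network_gbyte_per_s)  # 40.0 GB/s of aggregate network
print(bottleneck)           # 17: storage bandwidth is the limit
```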
Pure Storage (AIRI)
Four DGX-1 systems
Based on NFS
Storage
○ Flash-Array
○ PB
○ 17 GB/s bandwidth
○ 1.5M IOPS
Network
○ 100G
○ GPUDirect RDMA
Price
○ > 1M
Pure Storage
You can use GPFS/Lustre and other HPC storage.
But data scientists don’t want to deal with data-center infrastructure
