Exascale storage
By 2016, server-based storage solutions will lower storage hardware costs by 50% or more…
Storage @ a Tipping Point…
What does this mean for us?
By 2018, 3 of the top 7 general-purpose disk array vendors will either be acquired or exit the storage hardware business…
Leverage Framework Integration, Management Software & Automation, strong Customer Support capabilities and the evolving overall storage ecosystem as a robust global offering
Opportunity up for grabs…
Assess emerging storage architectures, technologies & approaches to create a combined strategy to meet specific workload requirements
Making the most of it…
New breed of Storage Services…

Designed for…

•  Web-Scale – Scale-Up & Scale-Out
•  Multi-Tenancy – Multi-Customer/Container
•  Hyper-Access – Millions of end-consumers
•  Resilience – Getting over Practical Limitations
Store Global – Access Local!
OSS Addresses the Need…

OSS SDS Solutions

✓  Nutanix
✓  Gluster

Enterprise
Hyper-Scale
Transactional
Distributed
Appliance
COTS/HPC
One Workload doesn’t fit all…

Architectures to fit various workloads
Type 1: Clustered Architecture
A ‘federated model’ layered atop a ‘scale-up’ architecture makes it look more ‘scale-out’ from a management standpoint.
It tends to ‘bounce the IO’ until it reaches the brain (header) that has the data. ‘Federated’ models use a data mobility approach to rebalance between brains & persistence pools, leading to low latency on writes.
Brains (HA Header)
Persistent Pool
Type 2: Tightly Coupled, Scale-Out
Uses shared memory (cache and metadata) between nodes, while the data itself is distributed across some number of nodes. This architecture involves a large amount of inter-node communication.
Shared memory is the defining element of these designs. It enables ‘symmetric’ IO paths through all brains, so that in failure modes (planned or unplanned) IO operations remain relatively balanced.
Brains (HA Header)
Persistent Pool
IO Path (Shared Memory)
Type 3: Loosely Coupled, Scale-Out
This model does not use shared memory between nodes, but the data itself is distributed across multiple nodes. It involves a larger amount of inter-node communication on writes (IO intensive), as data is distributed. As it is transactional, the writes are distributed & always coherent.
The design –
•  Simple in operations and scaling.
•  Very good distributed reads, as data is serviced by multiple nodes.
•  Not ‘HA’. The resilience comes from data copies & distribution.
Brains (multi-node)
Distributed Pool
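
The point that resilience comes from data copies and distribution rather than from 'HA' heads can be sketched roughly as below. This is only an illustration: the function names, the hash-based placement and the default of two copies are assumptions, not any product's actual placement logic.

```python
import hashlib

def replica_nodes(key, nodes, copies=2):
    """Pick `copies` distinct nodes for a key (assumes copies <= len(nodes))."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    # Walk the node list from the hashed starting point, wrapping around.
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]

def readable(key, nodes, failed, copies=2):
    """Data stays readable while at least one node holding a copy is alive."""
    return any(n not in failed for n in replica_nodes(key, nodes, copies))

nodes = ["n1", "n2", "n3", "n4"]
placed = replica_nodes("vol/file1", nodes)
survives_one_failure = readable("vol/file1", nodes, failed={placed[0]})
survives_losing_all_copies = readable("vol/file1", nodes, failed=set(placed))
```

Losing one node that holds a copy leaves the data readable from the other; only losing every node that holds a copy makes it unavailable.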
Type 4: Distributed, Share Nothing
The Design –
•  No shared memory
•  Non-transactional, lazy data
•  Distributed reads can be achieved
•  No ‘HA’. The resilience of specified data can come from distribution.
The Architecture –
•  The ‘Most Scalable’ architecture
•  Super-simple implementation
•  Highly COTS-reliant, low cost
•  Mostly ‘Software Only’ design
•  Object & non-POSIX support on base filesystem
Workload based Architecture…
On-Premise | Hosted/Cloud (Private) | Hosted/Cloud (Public)
Gluster Storage has a fully supported integration with
•  Hadoop Data Platform 2.1
•  Ambari Management Suite

This integration can run various Hadoop jobs with
•  accomplished file system plug-in
•  reliable enterprise grade storage back-end
•  standard protocol-based ingest options
•  no single point of failure

Gluster Storage is a verified, high performance back-end for Splunk's cold storage tier, used for vast machine data analysis.

Web-scale object storage solutions for archival & rich media are CloudStack offerings on Ceph Storage
ISV Maturity focused…
Exascale…

Scale-out
Stack Design
Single global namespace
Aggregates disk and memory resources into a single trusted storage pool.

Security
Supports SELinux enforcing mode with SSL-based in-flight encryption.

Object access to file storage
Filestore can be accessed using an object API.

Erasure coding
Enhance data protection by using information stored in the system to reconstruct lost or corrupted data.
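
As a rough illustration of reconstructing lost data from information already stored in the system, the sketch below uses a single XOR parity fragment, the simplest possible erasure code. It is an assumption chosen for brevity; production systems typically use stronger codes (for example Reed-Solomon) that tolerate more than one lost fragment. All names are hypothetical.

```python
from functools import reduce

def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(fragments):
    """Return the data fragments plus one XOR parity fragment."""
    parity = reduce(xor_bytes, fragments)
    return list(fragments) + [parity]

def reconstruct(stored, lost_index):
    """Rebuild the fragment at lost_index by XOR-ing all surviving fragments."""
    survivors = [f for i, f in enumerate(stored) if i != lost_index]
    return reduce(xor_bytes, survivors)

data = [b"AAAA", b"BBBB", b"CCCC"]   # equal-length data fragments
stored = encode(data)                # three data fragments + one parity
rebuilt = reconstruct(stored, 1)     # pretend fragment 1 was lost
```

With one parity fragment, any single lost fragment (data or parity) can be rebuilt from the remaining ones.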
	
Bit-rot detection
Help preserve the integrity of data assets by detecting silent corruption.
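
A minimal sketch of how silent corruption might be detected: record a checksum when data is written, then periodically re-hash and compare. The function names and the choice of SHA-256 are assumptions for illustration, not a description of any specific product's scrubber.

```python
import hashlib

def checksum(data):
    """Content signature recorded when the object is written."""
    return hashlib.sha256(data).hexdigest()

def scrub(objects, signatures):
    """Return names whose current content no longer matches its signature."""
    return sorted(
        name for name, data in objects.items()
        if checksum(data) != signatures[name]
    )

objects = {"a.bin": b"hello", "b.bin": b"world"}
signatures = {name: checksum(data) for name, data in objects.items()}

objects["b.bin"] = b"worle"          # simulate silent on-disk corruption
corrupted = scrub(objects, signatures)
```

Here `corrupted` lists only `b.bin`, the object whose bytes changed after its signature was recorded.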
Tiering
Automatically move data between fast (SSD-based) and slow (HDD) tiers based on access frequency.

Replication
Supports synchronous replication within a data center and asynchronous replication for disaster recovery.

Snapshots
Assure data protection through cluster-wide filesystem snapshots. User accessible for easy recovery of files.

Elastic hashing algorithm
No metadata server layer, which eliminates performance bottlenecks and single points of failure.
Feature Glance…
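
The idea behind the elastic hashing feature above (locating data without a metadata server) can be sketched as follows: every client computes the location from the file name alone, so there is no central lookup to bottleneck or fail. The modulo placement and the names `pick_brick` and `bricks` are simplifying assumptions; this is not GlusterFS's actual hashing scheme.

```python
import hashlib
from typing import Sequence

def pick_brick(path: str, bricks: Sequence[str]) -> str:
    """Map a file path to a storage brick purely by hashing its name."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(bricks)
    return bricks[index]

bricks = ["node1:/brick", "node2:/brick", "node3:/brick"]
paths = ["/a.txt", "/b.txt", "/logs/c.log"]

# Any client running the same function arrives at the same placement,
# with no shared metadata table to consult.
placement = {p: pick_brick(p, bricks) for p in paths}
```

A plain modulo scheme like this reshuffles most files when bricks are added or removed; real systems use range-based or consistent hashing to limit that movement.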
Industry Standard Client Support

•  NFS, SMB protocols for file-based access
•  NFSv4 multi-headed support for enhanced security & resilience
•  OpenStack Swift support for object access
•  GlusterFS native client for highly parallelized access
Deep Hadoop Integration

•  HDFS-compatible filesystem
•  No single point of failure
•  NFS and FUSE based data ingestion
Integration with RHEV

•  Centralized visibility and unified management of storage and virtual infrastructures through RHEV Manager console.
•  Live migration of virtual machines
Feature Glance…
Easy online management

•  Web-based management console
•  Powerful and intuitive CLI for Linux admins
•  Monitoring (Nagios-based)
•  Expand/shrink storage capacity without downtime
Scale-out Write…
•  The client initiates an IO and transmits it to the node it's communicating with. For all-in-one style architectures, this is a VM node that's co-located with the client on the same hardware.
•  Once the node receives the write acknowledgement from the other node(s), it responds back to the client acknowledging the write.
•  Depending on the array platform, other things can be done with the write, like inline deduplication, compression, etc.
•  Some arrays that implement flash-based write caching can stage the writes to flash to clear the RAM for more incoming writes.
•  The write is eventually flushed to disk (SSD or Magnetic) on each node that received the write.
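
The write steps above can be sketched as a small in-memory simulation. The `Node` class and `client_write` function are hypothetical, and the step where the receiving node forwards the write to its peers before acknowledging is an assumption inferred from the acknowledgement bullet. Deduplication and compression are left out.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.ram = {}     # in-memory write buffer
        self.flash = {}   # optional flash write cache
        self.disk = {}    # persistent media (SSD or magnetic)

    def accept(self, key, value):
        """Buffer a write in RAM and acknowledge it."""
        self.ram[key] = value
        return True

    def stage_to_flash(self):
        """Move buffered writes to flash, freeing RAM for new writes."""
        self.flash.update(self.ram)
        self.ram.clear()

    def flush(self):
        """Persist buffered writes to disk; newer RAM entries win over flash."""
        self.disk.update(self.flash)
        self.disk.update(self.ram)
        self.flash.clear()
        self.ram.clear()


def client_write(primary, peers, key, value):
    """Acknowledge to the client only after every peer has acknowledged."""
    primary.accept(key, value)
    acks = [peer.accept(key, value) for peer in peers]  # assumed replication
    return all(acks)


n1, n2 = Node("n1"), Node("n2")
ok = client_write(n1, [n2], "blk7", b"data")  # client sees the ack here
n1.stage_to_flash()                           # frees n1's RAM buffer
n1.flush()
n2.flush()                                    # eventually on disk on both
```

The client is acknowledged before anything reaches disk, which is why the buffered copies on multiple nodes matter for durability in this model.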
Scale-out Read…
•  The client initiates an IO request and transmits it to the node it's communicating with. For all-in-one style architectures, this is a VM node that's co-located with the client on the same hardware.
•  The node receives that IO, checks its read cache in RAM for the data and then (depending on the array) checks SSD cache for the data.
•  If the data isn't in either location, the node checks its metadata table to locate the data on disk (local or on another node/nodes). Data is read directly from the underlying disks if local, or is requested from the containing node across the inter-node link.
•  The node places a copy of the read in cache and responds to the client with the requested data.
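
A hedged sketch of the read lookup order described above: RAM cache, then SSD cache, then the metadata table pointing at a local or remote disk. `ReadNode`, `cluster_map` and the returned source labels are illustrative names, not part of any real array's API.

```python
class ReadNode:
    def __init__(self, name, cluster_map):
        self.name = name
        self.ram_cache = {}
        self.ssd_cache = {}
        self.disk = {}
        self.cluster_map = cluster_map  # metadata table: key -> owning node
        self.peers = {}                 # node name -> ReadNode

    def read(self, key):
        """Return (data, source), checking RAM, then SSD, then disk."""
        if key in self.ram_cache:
            return self.ram_cache[key], "ram"
        if key in self.ssd_cache:
            data, source = self.ssd_cache[key], "ssd"
        else:
            owner = self.cluster_map[key]
            if owner == self.name:
                data, source = self.disk[key], "local-disk"
            else:
                # Fetch from the containing node over the inter-node link.
                data, source = self.peers[owner].disk[key], "remote:" + owner
        self.ram_cache[key] = data      # keep a copy for later reads
        return data, source


cluster_map = {"x": "n1", "y": "n2"}
n1 = ReadNode("n1", cluster_map)
n2 = ReadNode("n2", cluster_map)
n1.peers["n2"] = n2
n1.disk["x"] = b"X"
n2.disk["y"] = b"Y"

first = n1.read("y")    # served from the remote node's disk
second = n1.read("y")   # served from n1's RAM cache
local = n1.read("x")    # served from n1's own disk
```

The second read of the same key never leaves the node, which is the effect the caching step is meant to achieve.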
Scale-out Resilience…
Distributed Clustered

•  Use of SSD (& Magnetic) across the environment as one shared read cache
•  Speed comparable with an All-Flash Array; all VM IO will be from flash, while backup will be from SSD-SSD-Magnetic
•  Scaling of capacity and performance achieved by adding more SSDs
•  Limits failure impact of SSD. IO available for rebuild & hot cache for Live-Migration
The Bottom Line…
Software Defined Storage (SDS) can achieve ‘Exascale’ proportions, which to date have been difficult to manage cost-effectively, even at Enterprise levels.

Widespread adoption of Web-Scale and Distributed Application architecture by Enterprises poses significant opportunities for SDS usage to go mainstream. Enterprises would essentially look to Service Providers to provision this hyper-scale infrastructure, while they focus on more engaging Business Apps & Dev-Ops.

Bear in mind, however, that a strategically positioned SDS Service portfolio may require substantial specialist skills and resources in areas such as sizing, integration, tuning, maintenance and support. A packaged Service offering from the Service Provider is therefore a much anticipated move.
GlusterFS Current Features & Roadmap:
http://gluster.readthedocs.org/en/latest/presentationsGlusterFS_Current_Features_and_Roadmap.pdf
Additional Reading…
Gartner Doc ID: G00255093
http://www.gartner.com/technology/reprints.do?id=1-23NR9T2&ct=141027&st=sb
Red Hat Gluster Storage
http://www.redhat.com/en/files/resources/en-rhst-gluster-datasheet-INC0210625.pdf
Understanding Storage Architecture
http://virtualgeek.typepad.com/virtual_geek/2014/01/understanding-storage-architectures.html
Distributed File System
http://cecs.wright.edu/~pmateti/Courses/7370/Lectures/DistFileSys/distributed-fs.html
Discussion	&	Huddle…	
Abhijeet	Upponi	
aupponi@yahoo.com	
+91	9619	455	020
