DDN Storage | ©2018 DataDirect Networks, Inc.
A Glimpse into the Future of I/O
DDN Usergroup Meeting SC18
November 2018
Sven Oehme – Chief Research Officer, DDN
Data At Scale
[Diagram: map of terms. Labels shown: HPC, AI, Analytics, BigData, Cloud Integration, Data Management, Distributed Object, Advanced Data Management, I/O Performance and Behavior, Latency, Latency-Tail, Throughput, Byte Addressable, GPUs, More Heterogeneous, CPU, Economic Model, Workloads, SuperComputing, Themes, Meta-Analytics, Simplicity, Insight, Scale, Security, Multi Tenancy, Data Path Optimisation, Tag&Search, HW, Accelerators, Fat Nodes, Orchestrators, Containers, Open Monitoring, OpenStack/Kubernetes, NVMe/SCM, KV, HDFS, Trending Technologies, Audit, Data Views]
Challenges in Broader Multi-Tenanted Security and Flexibly Managing Data Collections
►  Multi-Tenancy == Tenanted Isolation by
•  Data Collections: flexible allocation of collections of data to users, groups, and organizations, allowing collaborative access, publication, etc.
•  Performance: the noisy neighbor problem, particularly latency-critical applications being impacted by throughput-hungry workloads
•  Resilience: failures in domain A should not impact domain B
►  Virtualization and Containerization - Data should be accessible from within containers
►  Multi-tenancy needs to be a logical allocation, not a HW or physical separation, unless desired
►  Auditable - All events in the system, both system and tenant, should be auditable
►  One data platform needs to run on your own HW, the cloud, and highly optimized integrated systems
►  Need a more native approach to resolving all these challenges
Challenges in Security and Deployment – Starting down the path
►  Lustre takes strong steps towards better multi-tenancy support
►  This addresses some of the requirements
►  Dynamic ‘views’ on data based on policy and access control rather than namespace segregation
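The idea of a policy-driven view, as opposed to carving the namespace into per-tenant directory trees, can be illustrated with a minimal sketch. Everything here is hypothetical and for illustration only: the `DataObject` and `Policy` classes, the label names, and the paths are assumptions, not Lustre or DDN interfaces.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataObject:
    """One object in a single shared namespace (hypothetical model)."""
    path: str
    owner_group: str
    labels: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: which groups and labels a tenant may see."""
    allowed_groups: frozenset
    required_labels: frozenset = field(default_factory=frozenset)

    def permits(self, obj: DataObject) -> bool:
        # Visible only if the owning group is allowed and the object
        # carries every label the policy requires.
        return (obj.owner_group in self.allowed_groups
                and self.required_labels <= obj.labels)


def view(objects, policy):
    """Return the paths a tenant sees; no data is moved or copied."""
    return [o.path for o in objects if policy.permits(o)]


store = [
    DataObject("/proj/genomics/run1", "genomics", frozenset({"public"})),
    DataObject("/proj/genomics/run2", "genomics", frozenset()),
    DataObject("/proj/climate/model", "climate", frozenset({"public"})),
]

# Two tenants get different views of the same namespace.
genomics_view = view(store, Policy(frozenset({"genomics"})))
public_view = view(
    store,
    Policy(frozenset({"genomics", "climate"}), frozenset({"public"})),
)
```

In this sketch the same three objects yield two different listings depending only on the policy, which is the property the slide contrasts with namespace segregation.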
Challenges in I/O Performance and Behavior
►  Newer applications need to operate on byte-addressable data
►  Significant shift from sequential to random I/O
►  Multifold increase of metadata-to-data ratio
►  Average data sizes are less homogeneous and are now fractions or multiples of those in previous workloads. The gap between small and large data is wide (bytes on one end, GBs on the other end of the spectrum)
►  Interactive, outcome- and event-driven analytics are driven by latency rather than bandwidth
[Figure: Trend for ever more mixed, complex I/O workloads. Labels shown: Big Data, NoSQL, Analytics, Supercomputing, Multi-Physics, Machine Learning, Workflows, Adaptive Mesh Refinement, Checkpointing]
I/O Performance – several techniques to address these challenges
[Chart: IO500 Results, Ratio of Easy:Hard (systems with 100 clients or more). Series: Write Ratio, Read Ratio. Axis: 0.0% to 90.0%. Systems: Oakforest-PACS at JCAHPC (IME), Shaheen at KAUST (DataWarp), Mistral at DKRZ (Lustre), EMSL Cascade at PNNL (Lustre)]
►  Overlapping I/O and conflicting access to the same resources are well addressed with IME
►  Log-structured filesystems mitigate I/O amplification and the significant slowdown of random I/O patterns
►  Leverage HDD for capacity tiers and IME for performance
►  HAWC/LROC in Spectrum Scale and HDFS flash accelerators are other examples
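The log-structured bullet above can be made concrete with a toy model. This is a minimal sketch of the general technique, not IME's or any product's implementation; the `LogStructuredStore` class and its fields are illustrative assumptions, and garbage collection of stale versions is deliberately left out.

```python
class LogStructuredStore:
    """Toy log-structured store: every write is an append to one log."""

    def __init__(self):
        self.log = bytearray()  # stands in for a sequentially written device
        self.index = {}         # logical block number -> (log offset, length)

    def write(self, block_no: int, data: bytes) -> int:
        # Whatever the logical block number, the physical write lands
        # at the current tail of the log.
        offset = len(self.log)
        self.log += data
        self.index[block_no] = (offset, len(data))
        return offset

    def read(self, block_no: int) -> bytes:
        # The index always points at the most recent version of a block.
        offset, length = self.index[block_no]
        return bytes(self.log[offset:offset + length])


store = LogStructuredStore()

# Logical block numbers arrive in random order, with block 3 overwritten.
writes = [(907, b"aaaa"), (3, b"bbbb"), (512, b"cccc"), (3, b"dddd")]
physical_offsets = [store.write(block, data) for block, data in writes]
```

The logical pattern is random (907, 3, 512, 3), yet `physical_offsets` grows monotonically, so the device only ever sees sequential appends. The cost is the index lookup on read and the eventual need to reclaim the stale copy of block 3.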
Challenges in Data Management
►  Metadata growth is substantial. People try to solve this with external data repositories, with mixed success, but at the current growth rate it's not sustainable
►  At exabyte scale, data reliability and availability will be substantially challenged by traditional recovery models
►  We see requirements to support quadrillions (millions of billions) of objects in storage systems, and we are still at the beginning of this
►  Data management at scale (snapshots, tiering to disk, tape, or cloud) is complicated and breaks in larger systems
►  People can’t find their data; there is no easy way to tag and ‘search’ for it
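A rough back-of-the-envelope calculation shows why a quadrillion objects strains both metadata storage and brute-force search. The per-object metadata size and the scan rate below are assumptions chosen purely for illustration; they are not figures from the presentation.

```python
OBJECTS = 10**15                   # one quadrillion = a million billion
METADATA_BYTES_PER_OBJECT = 1024   # assumption for illustration only
SCAN_RATE_PER_SECOND = 10**6       # assumption: objects examined per second

# Total metadata footprint, expressed in exabytes (10**18 bytes).
metadata_bytes = OBJECTS * METADATA_BYTES_PER_OBJECT
metadata_exabytes = metadata_bytes / 10**18

# Time for one full scan of every object at the assumed rate.
scan_seconds = OBJECTS / SCAN_RATE_PER_SECOND
scan_years = scan_seconds / (365 * 24 * 3600)

print(f"metadata: {metadata_exabytes:.3f} EB")
print(f"full scan: {scan_years:.1f} years")
```

Under these assumptions the metadata alone is on the order of an exabyte, and a single linear scan would take roughly three decades, which is why finding data by walking the namespace stops being an option at this scale.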
Challenges in Data Management – need for extensive analytics
►  The acquisition of Tintri provides us with technology to run analytics for systems at scale
►  More insight into the system, to make decisions based on data and trends rather than manually crafted administrative policies
►  Data placement based on historic data
Conclusions
►  Already many strong inroads into these data challenges for 2019+
•  IME
•  Lustre
•  SFAOS
►  Heavy engineering investment underway to resolve these challenges in more native ways
►  Come and talk to us and ask for the ‘secret’ presentation room ;-)
ddn.com ©2018 DataDirect Networks, Inc. *Other names and brands may be claimed as the property of others. Any statements or representations around future events are subject to change.
Thank You!
Keep in touch with us.
9351 Deering Avenue
Chatsworth, CA 91311
1.800.837.2298
1.818.700.4000
company/datadirect-networks
@ddn_limitless
sales@ddn.com
