Maintenance of Big-Data Multi-Cloud Infrastructure
Notes from the Fields
Speaker: Dzmitry Durasau
Maintenance Big Data Multi-Cloud Infrastructure: Notes from the Fields by Dzmitry Durasau
Big Data Startup: Common Issues
Volume, Velocity, Variety
Building Cloud Infrastructure
Manage the Clouds
Expenses Optimization
[Figure: diagram labels only. App, SMB, Adapter, and Buffer stages between a source device and a target device; a virtual machine on a host running Hyper-V; reads and writes go to the new destination VHD.]
Duplication of a Virtual Machine whilst Running
Export a clone of a running VM
• Point-in-time image of a running VM exported to an alternate location
• Useful for troubleshooting a VM without downtime for the primary VM
Export from an existing checkpoint
• Export a full cloned virtual machine from an existing point-in-time checkpoint of a virtual machine
• Checkpoints automatically merged into a single virtual disk
[Figure: VM1, VM2]
Case: Cloud Infrastructure Optimization
NewsRight.com

Editor's Notes

  • #21: Hot data is data that changes frequently; it is stored on the faster but more expensive solid-state drives. All data starts as hot data. Cold data is data that changes infrequently; it is stored on the slower but cheaper hard disk drives. If cold data becomes hot, it is automatically moved to the solid-state drives, and if hot data becomes cold, it is moved to the hard disk drives.
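
The tiering behavior described in note #21 can be sketched as a small simulation. This is an illustrative model only, not how Windows implements storage tiering: the `TieredStore` class, the `hot_threshold` parameter, and the write-count heuristic are all assumptions made for this example.

```python
from dataclasses import dataclass, field

SSD = "ssd"  # fast, expensive tier
HDD = "hdd"  # slow, cheap tier


@dataclass
class TieredStore:
    """Toy model of hot/cold tiering (hypothetical, for illustration only)."""

    hot_threshold: int = 3  # assumed: writes per period needed to stay hot
    tier: dict = field(default_factory=dict)    # block id -> current tier
    writes: dict = field(default_factory=dict)  # block id -> writes this period

    def write(self, block: str) -> None:
        # All data starts as hot data, so new blocks land on the SSD tier.
        if block not in self.tier:
            self.tier[block] = SSD
        self.writes[block] = self.writes.get(block, 0) + 1

    def rebalance(self) -> None:
        """Re-place every block based on its activity in the last period."""
        for block in self.tier:
            count = self.writes.get(block, 0)
            self.tier[block] = SSD if count >= self.hot_threshold else HDD
        self.writes.clear()  # start a new observation period


store = TieredStore()
for _ in range(5):
    store.write("orders")    # changes frequently
store.write("archive-2013")  # written once
store.rebalance()
print(store.tier)  # {'orders': 'ssd', 'archive-2013': 'hdd'}
```

Calling `rebalance()` again after a period in which `archive-2013` is written often and `orders` is not would swap the two placements, which is the cold-to-hot and hot-to-cold movement the note describes.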
  • #22: Data deduplication is a new storage efficiency feature in Windows Server 2012 that helps address the ever-growing demand for file storage. Instead of expanding the storage used to host the data, it reduces the space the data occupies through variable-size chunking and compression. Windows automatically scans your disks, identifies duplicate chunks in the stored data, and stores each of these chunks only once. Because duplicate data is stored as a single copy, you can optimize your existing storage infrastructure, postpone storage upgrades, and extend the lifespan of current storage investments.
    The disk space savings seen with data deduplication during testing, both internally and by ESG Lab, have been phenomenal: 25-60% for general file shares and 98% for OS VHDs. This is far above what was possible with Single Instance Storage (SIS) or NTFS compression.
    Data deduplication throttles its CPU and memory usage, so it can run on large volumes without impacting server performance. Its optimization jobs can also be scheduled for off-peak times to reduce any impact on data access.
    Reliability and data integrity are not a problem: redundancy of metadata and of the most frequently referenced data helps prevent data loss due to unexpected power outages, and checksums, together with data integrity and consistency checks, help prevent corruption on volumes configured to use data deduplication.
    Not for: live VMs, SQL DBs, ReFS file shares, client machines, boot data, cluster shared volumes.
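
To make the idea in note #22 concrete, here is a minimal sketch of chunk-level deduplication with variable-size (content-defined) chunking and compression. It is a toy model, not the Windows Server algorithm: the `chunk` function, its shift-and-add rolling value, the `mask` and `min_size` parameters, and the `DedupStore` class are all assumptions chosen for brevity.

```python
import hashlib
import random
import zlib


def chunk(data: bytes, mask: int = 0x3F, min_size: int = 16) -> list:
    """Split data where a rolling value over recent bytes hits a pattern."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        # The low bits of this value depend only on the last few bytes,
        # so boundaries follow the content rather than fixed offsets.
        rolling = ((rolling << 1) + byte) & 0xFFFFFFFF
        if i + 1 - start >= min_size and (rolling & mask) == 0:
            chunks.append(data[start:i + 1])
            start, rolling = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing chunk without a boundary
    return chunks


class DedupStore:
    """Keeps each distinct chunk once, compressed; files are digest lists."""

    def __init__(self):
        self.chunks = {}  # SHA-256 digest -> compressed chunk
        self.files = {}   # file name -> ordered list of digests

    def put(self, name: str, data: bytes) -> None:
        digests = []
        for piece in chunk(data):
            digest = hashlib.sha256(piece).hexdigest()
            if digest not in self.chunks:  # store duplicate chunks only once
                self.chunks[digest] = zlib.compress(piece)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name: str) -> bytes:
        return b"".join(zlib.decompress(self.chunks[d]) for d in self.files[name])


# Made-up sample data: two versions of a text that differ by a short prefix.
rng = random.Random(0)
words = [b"cloud", b"data", b"storage", b"virtual", b"machine", b"disk"]
base = b" ".join(rng.choice(words) for _ in range(4000))

store = DedupStore()
store.put("report-v1.txt", base)
store.put("report-v2.txt", b"DRAFT " + base)

logical = len(base) + len(b"DRAFT " + base)
physical = sum(len(c) for c in store.chunks.values())
print(f"logical bytes: {logical}, stored bytes: {physical}")
```

Because chunk boundaries depend on content rather than on fixed offsets, the short prefix in the second file should shift only the first boundary or so; the remaining chunks are expected to match those already stored.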
  • #23: Offloaded Data Transfer (ODX) in Windows Server 2012 R2 Preview lets you accomplish more with your existing external storage arrays by moving large files and virtual machines directly between storage arrays, which reduces host CPU and network resource consumption.
    ODX support is a feature of the Hyper-V storage stack in Windows Server 2012 R2 Preview. When used with offload-capable SAN storage hardware, ODX lets a storage device perform a file copy operation without the main processor of the Hyper-V host reading the content from one storage location and writing it to another. ODX uses a token-based mechanism for reading and writing data within or between intelligent storage arrays. Instead of routing the data through the host, a small token is copied between the source and destination. The token serves as a point-in-time representation of the data. For example, when you copy a file or migrate a virtual machine between storage locations (either within or between storage arrays), a token that represents the virtual machine file is copied, which removes the need to copy the underlying data through the servers.
    In a token-based copy operation, the steps are as follows (see the figure):
    1. A user initiates a file copy or move in Windows Explorer, a command-line interface, or a virtual machine migration.
    2. Windows Server automatically translates this transfer request into an ODX (if supported by the storage array) and receives a token representation of the data.
    3. The token is copied between the source and destination systems.
    4. The token is delivered to the storage array.
    5. The storage array performs the copy internally and returns progress status.
    ODX is especially significant in the cloud space when you must provision new virtual machines from virtual machine template libraries, or when virtual hard disk operations are triggered that require large blocks of data to be copied, as in virtual hard disk merges, storage migration, and live migration. These copy operations are handled by the storage device, which must be able to perform offloads (such as an offload-capable iSCSI or Fibre Channel SAN, or a file server based on Windows Server 2012 R2 Preview). This frees up the Hyper-V host processors to carry more virtual machine workloads.
    An ODX-compliant array provides a wide range of benefits:
    - ODX frees up the main processor to handle virtual machine workloads and lets you achieve native-like performance when your virtual machines read from and write to storage.
    - ODX greatly reduces the time needed to copy large amounts of data.
    - With ODX, copy operations don't use processor time.
    - A virtualized workload now operates as efficiently as it would in a non-virtualized environment.
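
The token-based copy in note #23 can be modeled in a few lines. This is a conceptual sketch only: `StorageArray`, `offload_read`, `offload_write`, and `host_copy` are hypothetical names invented for this example and do not correspond to real Windows or storage-array APIs. The point it illustrates is that the host only ever handles a small token, while the data itself stays inside the array.

```python
import secrets


class StorageArray:
    """Toy model of an offload-capable array (hypothetical API)."""

    def __init__(self):
        self.files = {}   # path -> bytes held inside the array
        self.tokens = {}  # token -> point-in-time view of the data

    def offload_read(self, path: str) -> str:
        """Return a small token that represents the data at this moment."""
        token = secrets.token_hex(16)
        # bytes objects are immutable, so this reference acts as a snapshot.
        self.tokens[token] = self.files[path]
        return token

    def offload_write(self, token: str, dest: str) -> int:
        """Perform the copy inside the array and report the bytes copied."""
        data = self.tokens.pop(token)
        self.files[dest] = data
        return len(data)


def host_copy(array: StorageArray, src: str, dest: str) -> int:
    """What the host does: pass a token around, never the data itself."""
    token = array.offload_read(src)          # host receives the token
    return array.offload_write(token, dest)  # host delivers the token


array = StorageArray()
array.files["template.vhdx"] = b"\0" * 10_000_000  # stand-in for a large disk
copied = host_copy(array, "template.vhdx", "vm1.vhdx")
print(f"array copied {copied} bytes; host handled only a 32-character token")
```

In this model the host-side traffic is constant (one token) regardless of file size, which mirrors why the notes say ODX reduces host CPU and network consumption.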