STORAGE VIRTUALIZATION: AN INSIDER’S GUIDE
Jon William Toigo
CEO Toigo Partners International
Chairman Data Management Institute
Copyright © 2013 by the Data Management Institute LLC. All Rights Reserved. Trademarks and tradenames for products
discussed in this document are the property of their respective owners. Opinions expressed here are those of the author.
STORAGE VIRTUALIZATION:
AN INSIDER’S GUIDE
Part 4: The Data Protection Imperative
A confluence of three trends is making disaster preparedness and data
protection more important than ever before. These trends include the
increased use of server and desktop virtualization, growing legal and
regulatory mandates around data governance, privacy and preservation,
and increased dependency on automation in a challenging business
environment as a means to make fewer staff more productive.
Business continuity is now a mission-critical undertaking.
The good news is that storage virtualization can deliver the right tools to
ensure the availability of data assets – the foundation for any successful
business continuity or disaster recovery capability.
STORAGE VIRTUALIZATION: AN INSIDER’S GUIDE
The Data Protection Imperative
DATA PROTECTION MANAGEMENT: THE ESSENTIAL TASK OF BUSINESS CONTINUITY
Data protection and business continuity are subjects that nobody likes to talk about, but that
everyone in contemporary business and information technology must consider. Today, a
confluence of three trends – a kind of “perfect storm”— is making disaster protection planning
and disaster preparedness more important than ever before:
 First is the increased use of server and desktop virtualization technologies in business
computing -- technologies that, for all their purported benefits, also have the downside
of being a risk multiplier. With hypervisor-based server hosting, the failure of one
hosted application can cause many other application “guests” to fail on the same
physical server. While other efficiencies may accrue to server hypervisor computing,
the risks that the strategies introduce must be clearly understood in order to avoid
catastrophic outcomes during operation.
 A second trend underscoring the need for data protection and business continuity
planning is the growing regime of regulatory and legal mandates around data
preservation and privacy that affect a growing number of industry segments. Some of
these rules apply to nearly every company and most carry penalties if businesses cannot
show that reasonable efforts have been taken to safeguard data.
 Third, and perhaps most compelling, is the simple fact that companies are more
dependent than ever before on the continuous operation of IT automation. In today's economic reality, the need to make fewer staff more productive has created a much
greater dependency on the smooth operation of information systems, networks and
storage infrastructure. Even a short term outage can have significant consequences for
the business.
Bottom line: for many companies, business continuity and data protection have moved from
nice-to-have to must-have status. Past debates over the efficacy of investments in
preparedness are increasingly moot.
Put simply, there is no “safe place” to construct a data center: historical data on weather and
seismic events, natural and man-made disaster potentials, and other catastrophic scenarios
demonstrate that all geographies are subject to what most people think of when they hear the
word “disaster.”
Moreover, from a statistical standpoint, big disasters – those with a broad geographical
footprint – represent only a small fraction of the overall causality of IT outages. Only about 5
percent of disasters are those cataclysmic events that grab a spot on the 24 hour cable news
channels. Most downtime is the result of equipment and software maintenance – some call it
planned downtime, though efforts are afoot to eliminate planned downtime altogether through
clustering and high availability engineering. The next big slices of the outage pie chart involve
problems that fall more squarely in the disaster category: those resulting from software failures,
human errors (“carbon robots”), and IT hardware failures.
According to one industry study of 3000 firms in North America and Europe, IT outages in 2010
resulted in 127 million hours of downtime – equivalent to about 65,000 employees drawing
salaries without performing work for an entire year! The impact of downtime in tangible terms,
such as lost revenues, and intangible terms, such as lost customer confidence, can only be
estimated. One study placed idle labor costs across all industry verticals at nearly $1 million per
hour.
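As a quick sanity check on those figures – assuming a work year of roughly 2,000 hours, a convention not stated in the study – the arithmetic does land close to the number of idled employees cited:

```python
# Back-of-the-envelope check of the downtime figures cited above.
# Assumes a ~2,000-hour work year (a common convention, not taken from the study).
total_downtime_hours = 127_000_000
hours_per_work_year = 2_000

idle_employee_years = total_downtime_hours / hours_per_work_year
print(f"{idle_employee_years:,.0f} employee-years of lost work")  # 63,500 – close to the ~65,000 cited
```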
Despite this data, the old truism of disaster preparedness still holds: fewer than 50% of businesses have any sort of prevention and recovery capability. Of those that do, fewer than 50% actually test their plans – the equivalent of having no plan whatsoever.
The reasons are simple. First, planning requires money, time and resources whose allocation
may be difficult to justify given that the resulting capability may never need to be used.
Second, plans are typically difficult to manage, since effective planning typically involves
multiple data protection techniques and recovery processes that lack cost-effective testing and
validation methods. Third, many vendors have imbued customers with a false sense of
security regarding the invulnerability of their product or architecture, conflating the notion of
high availability architecture with business continuity strategy (the former is actually a subset of
the latter).
Constructing the plan itself follows a well-defined roadmap. Following an interruption event, three things need to happen:
1. The data associated with critical applications must be recovered to a usable form.
2. Applications need to be re-instantiated and connected to their data.
3. Users need to be reconnected to their re-hosted applications.
These three central tasks need to occur quickly as the duration of an interruption event is
usually what differentiates an inconvenience from a disaster. Taken together, the three tasks
may be measured using the metric “time to data.” Time to data, sometimes referred to as a
recovery time objective, is both the expression of the goal of a plan and a measure of the
efficacy of a strategy applied to realize that goal.
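For those who prefer to see the metric in concrete terms, the following minimal sketch (in Python) sums hypothetical durations for the three tasks and compares the total to a recovery time objective. The task durations and the eight-hour objective are placeholders for illustration, not figures drawn from any plan or study:

```python
# Minimal sketch: "time to data" as the sum of the three recovery tasks,
# compared against a recovery time objective (RTO). Durations are hypothetical.
from datetime import timedelta

recovery_tasks = {
    "restore data to usable form":             timedelta(hours=4),
    "re-instantiate and connect applications": timedelta(hours=2),
    "reconnect users to re-hosted apps":       timedelta(hours=1),
}

time_to_data = sum(recovery_tasks.values(), timedelta())
rto = timedelta(hours=8)

print(f"time to data: {time_to_data}, RTO: {rto}")
print("objective met" if time_to_data <= rto else "objective missed")
```

Shortening any one of the three phases shortens time to data; the metric remains useful regardless of which techniques are applied to each phase.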
Data Recovery is Key
The process for building a comprehensive continuity capability requires a book-length description. (One is being developed online as a free "blook" at book.drplanning.org.) The much condensed version has three basic components.
To do a good job of developing a continuity capability, you need to know your data – or, more specifically, what data belongs to what applications and what business processes those applications serve. Data and applications "inherit" their criticality – their priority of restore – from the business processes that they serve, so those relationships must be understood.
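The kind of mapping a planner needs to capture can be sketched as a simple data structure. The process names, applications, criticality tiers and volumes below are hypothetical examples, not recommendations:

```python
# Sketch of the relationships a planner needs to capture: business process ->
# applications -> data sets, with criticality inherited downward.
# All names and tiers are hypothetical examples.
business_processes = {
    "order fulfillment":   {"criticality": "tier-1", "applications": ["erp", "warehouse-mgmt"]},
    "marketing analytics": {"criticality": "tier-3", "applications": ["reporting-dw"]},
}

application_data = {
    "erp": ["erp-db-volume"],
    "warehouse-mgmt": ["wms-db-volume"],
    "reporting-dw": ["dw-volume"],
}

# Derive each data set's restore priority from the process it ultimately serves.
# (A fuller model would keep the highest criticality when an application serves
# several processes.)
data_criticality = {}
for process, info in business_processes.items():
    for app in info["applications"]:
        for volume in application_data[app]:
            data_criticality[volume] = info["criticality"]

print(data_criticality)
# {'erp-db-volume': 'tier-1', 'wms-db-volume': 'tier-1', 'dw-volume': 'tier-3'}
```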
The next step is to apply the right stratagems for data recovery, application re-hosting and
reconnecting users to each application and its data based on that earlier criticality assessment.
Third, plans must be tested – both routinely and on an ad hoc basis. Testing is the long-tail cost of a continuity plan, so decisions about recovery objectives and the methods used to build recovery strategies need to take into account how those strategies will be tested – and how testing costs can be contained for a program that virtually nobody wants to spend money on.
As a practical matter, data recovery is almost always the slowest part of recovery efforts
following an outage – but this is contingent on a lot of things. First, how is data being
replicated? Is it backed up to tape, or mirrored by software or disk array hardware to alternative hardware? Is the data accessible and in good condition for restore at the
designated recovery site?
Chances are good that a company uses a mixture of data protection techniques today. That's a good thing, since data is not all the same and budgetary sensibility dictates that the most expensive recovery strategies be applied only to the most critical data. Still, planners need to ensure that the approaches being taken are coordinated and monitored on an ongoing basis.
From this perspective, a data protection management service that provides a coherent way to configure, monitor, and manage the various data replication functions would be a boon. With such a service in place, it would be much simpler to ascertain whether the right data is being replicated, whether cost- and time-appropriate techniques are being applied based on data criticality, and whether data is being replicated successfully on an on-going basis.
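The kind of check such a service makes routine can be sketched as follows. The volume names, criticality tiers and required techniques are illustrative assumptions, not the feature set of any particular product:

```python
# Hypothetical sketch of a data protection management check: is each volume
# protected with techniques appropriate to its criticality tier, and did the
# last replication cycle succeed? All names and tiers are illustrative only.
required_protection = {
    "tier-1": {"cdp", "sync-mirror", "async-replica"},
    "tier-2": {"cdp", "sync-mirror"},
    "tier-3": {"backup"},
}

volumes = [
    {"name": "erp-db-volume", "tier": "tier-1",
     "protection": {"cdp", "sync-mirror"}, "last_replication_ok": True},
    {"name": "dw-volume", "tier": "tier-3",
     "protection": {"backup"}, "last_replication_ok": False},
]

for vol in volumes:
    missing = required_protection[vol["tier"]] - vol["protection"]
    if missing:
        print(f"{vol['name']}: missing {sorted(missing)}")
    if not vol["last_replication_ok"]:
        print(f"{vol['name']}: last replication cycle failed")
```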
As a rule, a “built-in” service is superior to one that is “bolted on” when it comes to data
protection. It follows, therefore, that a data protection management service should be
designed into the storage infrastructure itself and in such a way as to enable its use across
heterogeneous hardware repositories.
Moreover, the ideal data protection management service should be able to manage different
types of data protection services such as those that are used today to provide “defense in
depth” to data assets.
Defense in depth is a concept derived from a realistic appraisal of the risks confronting data
assets. Different methods of protection may be required to safeguard assets against different
risks.
Data needs to be protected, first and foremost, against the most frequent kinds of disaster potentials – those involving user and application errors that cause data deletion or corruption. Many specialized technologies have been developed to help meet this requirement. The most important,
arguably, is some sort of on-going local replication – sometimes called continuous data
protection or CDP. Ideally, CDP provides a way to roll data back to the moment before a
disruption event or error occurs, like rewinding a tape.
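The journaling idea behind CDP can be illustrated with a toy sketch. It models the rollback concept only – a sequence number stands in for a timestamp – and bears no resemblance to how any actual product stores its journal:

```python
# Toy illustration of continuous data protection (CDP): every write is journaled
# in order, so the volume image can be reconstructed as of any earlier point
# ("rewinding the tape"). Illustrative only; not any vendor's implementation.
journal = []          # (sequence_number, block_address, data), in write order
next_seq = 0

def journaled_write(block, data):
    global next_seq
    journal.append((next_seq, block, data))
    next_seq += 1

def image_as_of(seq):
    """Rebuild the volume image as it stood just after journal entry `seq`."""
    image = {}
    for s, block, data in journal:
        if s <= seq:
            image[block] = data
    return image

journaled_write(0, b"good data")
checkpoint = next_seq - 1              # the moment before the bad write
journaled_write(0, b"corrupted by an application error")

print(image_as_of(checkpoint))         # {0: b'good data'}
```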
The second layer protects against localized interruption events – hardware failures, or facility-level problems such as broken pipes in data center walls or ceilings, HVAC outages in equipment rooms, and the like. Typically, protection against localized faults involves synchronous replication – that is, replication in "real time" – between two different physical repositories, usually across a low-latency network. That could be a company LAN connecting
two or more arrays on the same raised floor, or in different buildings on a corporate campus or
at different sites interconnected by a metropolitan area network. Replicating locally provides
an alternative source for the data so that work can proceed with minimal interruption.
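The essential contract of synchronous mirroring – the application is acknowledged only after both copies have been written – can be captured in a few lines. In-memory dictionaries stand in for the two physical repositories in this sketch:

```python
# Minimal sketch of synchronous mirroring: a write is acknowledged to the
# application only after BOTH the local and mirrored copies are updated, so
# either side can serve as the source after a local fault. Illustrative only.
local_array = {}
mirror_array = {}

def synchronous_write(block, data):
    local_array[block] = data      # write to the primary repository
    mirror_array[block] = data     # write to the mirror across the low-latency link
    return "ack"                   # acknowledge only after both writes complete

synchronous_write(42, b"payroll record")
assert local_array[42] == mirror_array[42]
```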
The third layer of data protection protects against a regional disaster, whether the failure of a
power grid or the impact of a severe weather event with a broad geographical footprint.
Recovering data in these circumstances typically requires asynchronous replication – that is,
replication across a wide area network to an alternative location well out of harm’s way. The
challenge of asynchronous replication is one of data deltas – differences between the state of
data in the production system and the state of replicated data in the recovery environment –
resulting from distance-induced latency and other factors. As a rule of thumb, for every 100
kilometers data travels in a WAN, the remote target is about 12 SCSI write operations behind
the source of the data. The effect of latency is cumulative and tends to worsen the further the
data travels. This, in turn, can have a significant impact on the usability of the recovery data set
and the overall recovery effort, so planners need a way to test asynchronous replication on an
on-going basis.
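Applying that rule of thumb gives planners a quick way to size the expected data delta. In the sketch below, the 800-kilometer distance and 8 KB average write size are hypothetical examples, not figures from the text:

```python
# Back-of-the-envelope estimate of asynchronous replication lag using the rule
# of thumb cited above (~12 in-flight SCSI write operations per 100 km of WAN
# distance). Distance and write size are hypothetical examples.
WRITES_BEHIND_PER_100_KM = 12

def estimated_writes_behind(distance_km):
    return WRITES_BEHIND_PER_100_KM * distance_km / 100

distance_km = 800            # e.g., production site to an out-of-region DR site
avg_write_kb = 8             # assumed average write size

writes_behind = estimated_writes_behind(distance_km)
print(f"~{writes_behind:.0f} writes (~{writes_behind * avg_write_kb:.0f} KB) behind the source")
# ~96 writes (~768 KB) behind the source
```

The estimate scales linearly with distance, which is why the latency effect is described as cumulative.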
Bottom line: there are a lot of challenges to setting up effective defense in depth – especially
when the strategy involves the manual integration of many hardware and software processes,
often “bolted on” to infrastructure after the fact. Common challenges include a lack of visibility
into replication processes. (With most hardware-based mirroring schemes, the only way to
check to see whether a mirror is working is to break the mirror and check both the source and
target for consistency. Checking a mirror is a hassle that nobody likes to do. As a result, a lot of
disk mirrors operate without validation.)
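The comparison step itself – once a mirror has been split or replication paused – amounts to checking corresponding blocks on source and target for consistency. The sketch below illustrates the idea with block checksums; the device paths are placeholders, and real arrays expose validation very differently:

```python
# Hypothetical sketch of mirror validation: compare checksums of corresponding
# blocks on the source and target once replication is split or paused.
# Device paths are placeholders; requires Python 3.10+ for zip(strict=True).
import hashlib

def block_checksums(device_path, block_size=1 << 20):
    """Yield an MD5 digest for each block of the device or image file."""
    with open(device_path, "rb") as dev:
        while chunk := dev.read(block_size):
            yield hashlib.md5(chunk).hexdigest()

def mirrors_consistent(source_path, target_path):
    # strict=True flags a size mismatch between source and target.
    return all(a == b for a, b in zip(block_checksums(source_path),
                                      block_checksums(target_path),
                                      strict=True))

# Example (paths are illustrative):
# print(mirrors_consistent("/dev/mapper/source-lun", "/dev/mapper/target-lun"))
```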
Another set of challenges relates to cost and logistics – especially in hardware-based mirroring. Keeping the mirroring hardware itself synchronized – how the two arrays are divided into LUNs, what RAID levels are being applied, whether the two platforms are running the same firmware – requires ongoing effort, time and resources.
Related to the above, the maintenance of local mirrors and remote replication strategies
typically requires tight coordination between server, storage and application administrators
and continuity planners that often doesn't exist. If a set of LUNs is moved around for
production reasons, but this is not communicated to the business continuity planners and
accommodated in the replication strategy, replication issues will develop (mirroring empty
space, for example). The wrong time to find out about the mistake is when a disaster occurs!
Finally, there is the challenge of managing the testing of the data protection strategy holistically
– monitoring and managing the CDP and replication processes themselves, and coordinating all of the software processes, hardware processes, tape processes, disk mirrors, and so forth
that may be involved on an on-going basis. Without a coherent way to wrangle together all of
the protection processes, they can quickly become unwieldy and difficult to manage...not to
mention very expensive.
HOW STORAGE VIRTUALIZATION CAN HELP
Storage virtualization provides a solution to many of these challenges by building in a set of
services for data protection that are extensible to all hardware platforms and that can be
configured and managed from a single management console. To be more exact, storage
virtualization establishes a software-based abstraction layer – a virtual controller, if you will –
above storage hardware. In so doing, it creates an extensible platform on which shared,
centrally-managed storage services can be staged – including data protection management
services.
With storage virtualized, it is easy to pool storage resources into target volumes that can be
designated as repositories for different kinds of data. By segregating data and writing it onto
volumes in specific pools, services can be applied selectively to the data at either the volume or
the pool level.
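Conceptually, the resulting policy model looks something like the sketch below. The pool names, volume names and service labels are hypothetical and do not represent any vendor's actual interface:

```python
# Hypothetical illustration of applying data protection services at the pool or
# volume level once storage has been virtualized into pools of virtual volumes.
# Models the concept only; names and services are not any vendor's actual API.
pools = {
    "tier-1-pool": {"cdp", "sync-mirror", "async-replica"},   # pool-level defaults
    "tier-3-pool": {"nightly-backup"},
}

volumes = {
    "erp-db-vol":  {"pool": "tier-1-pool", "extra": set(),             "suppress": set()},
    "scratch-vol": {"pool": "tier-1-pool", "extra": set(),             "suppress": {"async-replica"}},
    "archive-vol": {"pool": "tier-3-pool", "extra": {"async-replica"}, "suppress": set()},
}

def effective_services(name):
    vol = volumes[name]
    return (pools[vol["pool"]] | vol["extra"]) - vol["suppress"]

for name in volumes:
    print(name, sorted(effective_services(name)))
```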
Providing continuous data protection services to a specific volume is as easy as ticking a
checkbox in DataCore Software™ SANsymphony™-V storage hypervisor software. Equally
simple is the procedure for setting up a mirroring relationship between different volumes in
different pools, with synchronous replication for volumes within a metropolitan region and
asynchronous replication for volumes separated by longer distances.
Virtualized storage enables a wide range of data protection options, including the extension of
the entire storage infrastructure over distance…or into clouds. Perhaps the most important
benefit of this technology is the fact that, with products like DataCore SANsymphony-V, the
capabilities for defense in depth are delivered right out of the box. There is no need to cobble
together a number of third party software, application software, or hardware-driven processes:
data protection services are delivered, configured, managed and tested holistically with one
product. These services are built into infrastructure at the layer of the storage hypervisor,
rather than being bolted-on and separately managed.
Testing is also dramatically simplified, since replication processes can be paused at any time
and mirror sets can be validated without disrupting production systems. In fact, the capability
offered by DataCore Software to leverage remote copies as primary data repositories means
that the percentage of downtime currently attributed to "planned maintenance" can be all but
eliminated by switching to redundant storage infrastructure when you are performing
maintenance on your primary arrays.
CONCLUSION OF PART 4
There are many reasons to virtualize storage infrastructure, but one advantage that cannot be
overlooked is the utility of the strategy from the standpoint of data protection and business
continuity. Virtualized storage infrastructure offers coherent and integrated processes for data protection that can simplify configuration and maintenance, reduce testing costs, and improve the likelihood of full recovery from data-level, localized and even regional interruption events.
Data recovery is not the only component of successful business continuity, but it is an
extremely important one. Think about it: most resources in a business can be recovered through strategies based on either redundancy or replacement. Data, like personnel, is irreplaceable. To protect your data, you need to replicate it and place the replica out of harm's way.