Aspirus Enterprise Backup Assessment and Implementation of Avamar

Written by: Thomas Whalen – Server and Storage Infrastructure Team Leader, Aspirus Information Technology Department

Executive Summary

Since the initial implementation of Epic within the Aspirus Health System, maintaining a consistent backup process has been a recurring challenge. The largest aspect of this challenge was finding a combination of backup technology and storage solutions able to handle the continual growth of data as Aspirus expanded its Epic environment, both in terms of clinical records and application modules.

In late 2009, the Aspirus Information Technology department participated in a proof of concept around EMC's Avamar host-based de-duplication backup grid and NetWorker backup management software to see the results of pushing Epic production data to this backup architecture. In the past, we had leveraged a product from Exagrid to perform target-based de-duplication, but found that the Exagrid didn't yield the performance and de-duplication rates we considered acceptable as the environment continued to grow. In front of Exagrid we also had Symantec's NetBackup backup management software, which was proving inconsistent in performing routine backups and was plagued with system issues, forcing the IT staff to devote constant attention to it just to assure that routine backups could take place.

Once we began the proof of concept with Avamar, we determined very quickly that the de-duplication rates observed were superior to Exagrid's target-based de-duplication appliance. We also felt that EMC's Avamar technology, with its RAIN (Redundant Array of Independent Nodes) architecture, would scale to meet the long-term needs of Aspirus' ever-increasing data growth. Just as important, while implementing the system we never observed any system issues with backups simply not working.

At the end of the proof of concept and its eventual implementation, we now realize an overall de-duplication rate for our Epic environment (based on routine nightly backups) of 110:1, protecting an average of 900GB of total storage with an average nightly change rate of 1.5 – 2.5%, or roughly 8GB of daily changes. Because of the aggressive de-duplication capabilities, this equates to a significantly lower cost of ownership than securing the same amount of data written to tape or even to another disk-based de-duplication system. It also frees up the staff time previously dedicated to hand-holding our old backup system, reallocating that time to more meaningful work in IT. Lastly, the days of random missed backups appear to be a thing of the past, which assures us that our clinical and financial data will be consistently protected through its life-cycle.

Epic Backup Architecture

Aspirus uses a number of technologies to position its Epic clinical data for backup. In the beginning we would simply pull the backups from a snapshot mounted to the Epic shadow server and then spin that data off to magnetic tape.
We found that this posed a number of problems, both in Epic performance and in the performance of writing the backup.

In the area of Epic performance, using the SnapView tools from EMC on our CLARiiON SAN, we found that, because of the way snapshots are designed to work, initiating the backup and mounting the snapshot to the shadow server caused a residual degradation in the performance of the production environment. As our datasets grew larger, this performance issue became more visible to users, and the mission of the IT technical group is to assure the highest degree of performance 24x7. As our environment grew, we also noticed that our backup window was getting longer and longer while we wrote 500-600GB of data off to our DLT tape array. As time progressed and the data grew, we saw the writing on the wall: DLT was not going to be the long-term solution if we wanted to keep a daily backup process intact.

At this point, we decided to use EMC SnapView clones to replicate the data from the production storage LUNs to cloned LUNs. While this is more expensive because of the duplicate storage a clone requires, mitigating the performance issues we saw with the snapshot process was, in our opinion, a good trade-off. The clone could also be used for other purposes such as environment refreshes. The initial clone was created using EMC 500GB SATA drives, which were slower overall but offered more disk capacity. At the same time, we moved away from our DLT tape array to a target-based disk appliance from Exagrid. This transition was a good move: it brought faster backups and restores and introduced target-based de-duplication. As backups began to be written to the Exagrid, we started to see de-duplication rates of around 15:1.

While transitioning from snapshots to clones, we also decided to move the backup processes off the production Shadow Server. The Shadow Server was pulling double duty: it provided the DR shadow as part of Epic's overall best practices, and we were also using it as the extracting Caché database for Epic Clarity reporting, a very intensive process. To reduce the Shadow Server workload, we built a dedicated IBM AIX cloning server to present the Epic production clone to. This allowed us to make sure that no other Epic-specific processes or services were impacted while we performed routine backups. The clone would also remain available for routine non-production environment refreshes for future builds, testing, validation, etc.

Visual Representation of Previous Backup System

In this design, we were getting acceptable backups, but between the SATA disks and the growth of the Epic production database, we started to see limits to the value of the Exagrid storage system in both speed and de-duplication, along with more and more problems managing the Epic backups through NetBackup.
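Before moving on, a rough way to see the snapshot-versus-clone trade-off described above is to compare the extra capacity each approach consumes. The sketch below is purely illustrative: the LUN size matches the figures quoted in this paper, but the change-rate assumption is ours, not a measurement from our array.

```python
# Illustrative capacity comparison for protecting one production LUN (values in GB).
LUN_SIZE_GB = 900            # size of the Epic production data being protected
DAILY_CHANGE_RATE = 0.025    # assumed ~2.5% of blocks rewritten while the backup runs

# A SnapView clone is a full block-for-block copy: it avoids the copy-on-write
# load on production at backup time, but it costs a second full LUN of capacity.
clone_overhead_gb = LUN_SIZE_GB

# A snapshot only needs a reserve sized to the changed blocks, but every production
# write to an unchanged block pays an extra copy while the snapshot is mounted,
# which is the performance impact we observed.
snapshot_reserve_gb = LUN_SIZE_GB * DAILY_CHANGE_RATE

print(f"clone capacity overhead    : {clone_overhead_gb:.0f} GB")
print(f"snapshot reserve (approx.) : {snapshot_reserve_gb:.0f} GB")
```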
Avamar and Networker Assessment

Contrasting Avamar vs. Exagrid

The Avamar technology is a collection of servers, or nodes, organized as a RAIN (Redundant Array of Independent Nodes) that together comprise a "grid" of storage resources. The grid can grow as your storage needs grow and can natively support backups across the network, along with the ability to manage NDMP backups for NAS-based storage solutions. Avamar also has the capability to replicate backups across separate grids to provide DR for your critical data.

While the Avamar and Exagrid storage architectures are similar in function, the biggest difference is in how each handles the backup data itself. Avamar is a host-based de-duplication system: a client sits on the server holding the data to be backed up, interrogates the data destined for the Avamar grid, and sends only the changed data down the wire. This results in less network traffic for your backup data. Avamar uses a patented "Commonality Factor" process which learns patterns of data behavior and uses them to separate changed data from unchanged data, which in turn determines its de-duplication rates. Exagrid, on the other hand, is a target-based de-duplication system in which all data is sent to the grid into a high-speed disk repository. In this repository, Exagrid compares the data to perform its de-duplication at the byte level and moves the changed data to a lower-speed, higher-capacity disk area for long-term retention and compression. This process takes place only after all the data has been passed to the Exagrid and the software client has been told that the backup is complete.

One can argue the benefits of both types of technologies, and in fact both are generally very good. But with Epic as our target application, we determined that sending close to 1TB down the wire nightly was a big part of our backup pains. The Avamar system mitigates that by using the host-based client to determine what changes have been made and sending only the changed data down the wire, so Avamar stores and manages only the data that changes between backup cycles.

In using the client to manage the changed data, however, we uncovered an issue with our cloning process. Our initial testing using the SATA-based clone of Epic production showed an unacceptable IOPS load on the host as it interrogated the data. Our first backups against SATA ran in excess of 10 hours before completing. After investigating the host's performance during the backup process, it was easy to see that the clone IOPS were slowing the Avamar client's ability to interrogate and move the changed data down the wire to the grid. Based on this, we created a new Fibre Channel-based clone running on 300GB, 15K RPM drives. In this configuration, the impact was very positive: our backup went from 10 hours to 6 hours on the first run, faster than any backup we had ever cut since we went live in 2004. After a number of days of testing nightly backups, we began to see Avamar's de-duplication process at work.
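To make the host-based approach concrete, the sketch below shows the general idea of client-side change detection: hash chunks of a file locally and ship only the chunks the backup target has not already seen. This is a simplified illustration under our own assumptions (fixed 64KB chunks, SHA-256 hashes, an in-memory index), not Avamar's actual Commonality Factor implementation.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 64 * 1024  # illustrative fixed-size chunks; real products use variable-length segments

def backup_file(path: Path, known_hashes: set) -> tuple:
    """Hash each chunk locally and 'send' only chunks the target has not seen.

    known_hashes stands in for the index of segments already stored on the
    backup grid. Returns (total_chunks, chunks_sent).
    """
    total = sent = 0
    with path.open("rb") as f:
        while chunk := f.read(CHUNK_SIZE):
            total += 1
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in known_hashes:
                known_hashes.add(digest)  # in reality the chunk is shipped over the wire here
                sent += 1
    return total, sent

if __name__ == "__main__":
    sample = Path("sample.dat")
    sample.write_bytes(b"".join(bytes([i]) * CHUNK_SIZE for i in range(10)))  # pretend production data
    grid_index = set()
    print(backup_file(sample, grid_index))   # (10, 10): first backup, every chunk is new
    data = bytearray(sample.read_bytes())
    data[0:CHUNK_SIZE] = b"\xff" * CHUNK_SIZE             # change only the first chunk
    sample.write_bytes(data)
    print(backup_file(sample, grid_index))   # (10, 1): only the changed chunk crosses the wire
```

In a target-based model such as Exagrid's, the same comparison still happens, but only on the appliance after every byte has already crossed the network.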
Avamar Backup Performance Results

The Avamar grid showed very good performance in accepting the data from the host, even over a single 1 gigabit network connection. Over the course of 10 backup tests, backup timings were recorded along with the amount of changed data, and de-duplication ratios were then computed from the change rate. Figure 1 illustrates those results.

Figure 1: De-Duplication Change Rate

Figure 1 shows 11 backups that were run against Aspirus Epic production data using the Avamar de-duplication grid. Over the backup cycles, the daily change rate decreases as the host-based client "learns" the pattern of changes from day to day; this knowledge is used to capture only the differences and send those to the grid. The chart's left axis is the percentage of de-duplication. As the daily backups were performed, the amount of data the client determined was unchanged increased each night. The first backup showed no de-duplication, since it was the first backup performed and Avamar saw all data as new. Backups 2 through 11 then showed a steady increase in de-duplicated data, and by backup 11 the rate of de-duplication was over 90%. Given a 900GB Epic database, this means the backup consisted of roughly 7 to 10GB in total changes sent to the grid. The benefits are a dramatic decrease in network traffic over the continuum of backups, along with a significant decrease in the storage needed to keep a longer retention of Epic backups available. Based on the total amount of data over the amount of changed data, this works out to a de-duplication rate of approximately 110:1.

The value of this can be measured in a number of ways. The largest consideration is the space required to store the same data to tape. Using DLT, even with compression, you would need 2-3 tapes per night to keep that data safe. With Avamar, the amount of data required is the 900GB baseline plus the daily changes. So for a week's worth of backups, that equates to storage needs of about 950GB, versus approximately 21 tapes holding about 4.9TB, factoring in a moderate compression ratio on the DLT tape drive.

Figure 2: Backup Time – Snapshot 1

Figure 2 shows the backup time, in hours, for the 11 backups we monitored. Note the dramatic drop in backup time between backups 5 and 6: this is the impact of using the Fibre Channel clone instead of the SATA clone, which cut the backup time by 50%. What this chart does not show is the impact of Commonality Factor as the Avamar client learns the pattern of data change between backup cycles. As of the writing of this document, another capture of backup times shows a much more interesting chart that illustrates the impact of Avamar's Commonality Factor.

Figure 3: Backup Time with Commonality Factor

Figure 3 illustrates, over a longer period of time, how Commonality Factor plays a role in reducing your backup window. As Commonality Factor learns the pattern of changed and unchanged data day by day, it uses algorithms to determine how best to scan the data on the host. This reduces the work the client needs to do to review the data, with the impact being an overall reduction in backup time. You can see a slow decline in backup time from roughly backup 8 through backup 20 as Commonality Factor plays a larger role in how much time the client spends scanning the file systems.

The fact that Aspirus can now back up its entire Epic production Caché database instance in roughly 4 hours speaks volumes about the power of the Commonality Factor process compared with the other backup and de-duplication technologies we've used in the past. This is simply the finest backup process we've encountered to date.
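The storage math behind these figures is simple; the short sketch below reproduces the comparison using the round numbers quoted above (900GB protected, roughly 8GB of nightly change, and the ~4.9TB weekly tape footprint). The exact tape count depends on the cartridge capacity and compression you assume.

```python
# Back-of-the-envelope check of the figures quoted above (values in GB).
FULL_SIZE_GB = 900            # Epic production data set protected each night
NIGHTLY_CHANGE_GB = 8         # ~7-10GB of changed data actually sent to the grid per night
TAPE_WEEK_GB = 4900           # weekly tape footprint quoted above (about 21 DLT cartridges)

dedup_ratio = FULL_SIZE_GB / NIGHTLY_CHANGE_GB           # in line with the ~110:1 observed
avamar_week_gb = FULL_SIZE_GB + 6 * NIGHTLY_CHANGE_GB    # initial full plus six more nights of changes

print(f"effective de-duplication ratio : ~{dedup_ratio:.0f}:1")
print(f"Avamar storage for one week    : ~{avamar_week_gb:.0f} GB (vs. ~{TAPE_WEEK_GB} GB on tape)")
print(f"disk-to-tape space savings     : ~{TAPE_WEEK_GB / avamar_week_gb:.1f}x")
```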
Avamar Restore Performance Results

Using NetWorker as the front end to our backup and restore process posed a challenge in the area of Epic production data restores. The reason is that NetWorker is designed around a more Windows-oriented restore process. Said differently, unlike backups, where NetWorker fires off a multi-threaded backup process (multiple avtar or NetWorker save processes), for restores it creates only a single restore for each file system, working through them one at a time until all file systems are restored.

Because of this characteristic, Epic restores pose a challenge. The traditional Epic Caché database instance is comprised of multiple Epic production file systems, and restoring those file systems one at a time would take a significant amount of time to complete, even with the smallest of Caché instances. In our testing, we found that launching a separate restore process against each file system allowed NetWorker to leverage the horsepower of the Avamar grid and the network infrastructure to pull back all of the Epic file systems at the same time, thus simulating a multi-threaded restore. Over the course of testing 4 restore points with the EMC NetWorker/Avamar technology, we recorded the aggregate restore times noted in Figure 4.

Figure 4: Restores Single Instance vs. Multi-Instance

In Figure 4, we see that a single-instance restore takes significantly longer, upwards of days to finish. With a multi-instance restore, the ability to pull back your Epic production data is far more palatable and results in a restore time you can base an SLA around. The graph also shows that a multi-instance restore performs almost as well as the backup, which runs contrary to the conventional 2:1 backup-to-restore baseline used in the IT industry today.

But as we were learning about the recovery process, an optimization concern emerged that plays a significant role in the restoration process.

Figure 5: Epic System File System Provisioning

In Figure 5, what we found as we dissected the restore process for our Epic production system was that one file system was significantly larger than any other file system in the production instance. Individually, all of our restores were finishing within about a 6-hour recovery time frame, all file systems except /epic/prd01. The /epic/prd01 area of Caché was individually taking ~12 hours to finish and thus pushed our recovery window to 12 hours in total. Considering /epic/prd01 is 2½ times the size of any other /epic/prdxx file system, the restore time made sense, albeit not optimal.

What we learned is that, to avoid this situation, we need to do a better job of balancing data between file systems and keeping them all relatively equal in size, so that in a restore situation we can make the most of our time to recovery across all file systems. In this case, by balancing /epic/prd01 with the remaining file systems, even if they all grow an additional 10-20%, we should be able to reduce our recovery window from 12-13 hours to approximately 6-6½ hours, given the restore timings we have already collected for the other /epic/prd02 – 08 file systems.

Aspirus will actively engage Epic to better balance the /epic/prd01 file system against the remaining Epic Caché file systems and will then revisit the recovery window, but we feel our recovery projections will be acceptable given the testing already completed. Also, as stated earlier, for every instance we restored, the file systems passed Epic integrity checks without issue.
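Two points from this section lend themselves to a quick illustration: restores can be parallelized by launching one restore per file system, and the total recovery window is then bounded by the slowest (largest) file system, which is why rebalancing /epic/prd01 matters. The sketch below is generic: restore_cmd is a placeholder for whatever NetWorker/Avamar recover invocation your environment uses, and the per-file-system hours are illustrative, not measurements.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

FILE_SYSTEMS = [f"/epic/prd{n:02d}" for n in range(1, 9)]   # /epic/prd01 .. /epic/prd08

def restore_cmd(fs: str) -> list:
    """Placeholder restore command for one file system.

    Substitute the actual NetWorker/Avamar recover invocation your environment
    uses; 'echo' is used here only so the sketch runs end to end.
    """
    return ["echo", f"restoring {fs}"]

# Launch one restore per file system so they run concurrently, simulating a
# multi-threaded restore instead of the default one-file-system-at-a-time behavior.
with ThreadPoolExecutor(max_workers=len(FILE_SYSTEMS)) as pool:
    list(pool.map(lambda fs: subprocess.run(restore_cmd(fs), check=True), FILE_SYSTEMS))

# The recovery window is set by the slowest file system, not the sum of them all,
# which is why balancing /epic/prd01 against the others matters (illustrative hours).
restore_hours = {"/epic/prd01": 12.0, **{fs: 6.0 for fs in FILE_SYSTEMS[1:]}}
print(f"serial restore   : ~{sum(restore_hours.values()):.0f} hours")
print(f"parallel restore : ~{max(restore_hours.values()):.0f} hours (bounded by the largest file system)")
```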
Analysis Summary

The finding we captured across both the backup and recovery processes is that, from an Epic perspective, the EMC NetWorker and Avamar grid technology offers a significant improvement in the overall management of backup data. From an SLA perspective, Aspirus was able to move its Epic backup window from 12-14 hours to roughly 4-4½ hours, with a recovery RTO of 7-8 hours, down from the 24-48 hours needed to spin data back from tape. Care must be taken in assessing your Epic file systems to ensure they are balanced, and in positioning a host that the Epic data can be presented to; these steps are critical to the success of the implementation.

From a cost of ownership perspective, by using Avamar's aggressive de-duplication technology and its Commonality Factor, we have been able to reduce the long-term size of our backups, for the retention windows we feel are necessary, by almost 90% compared with the Exagrid. This equates to less money spent continually adding capacity for all the other backups in the enterprise and extends our initial storage provisioning far longer than we originally anticipated. Also, because of the host-based client, less data traverses the network, which helps maintain overall network performance and makes WAN-based NetWorker/Avamar backups a reality rather than a wish-list item.

The Avamar grid is sold based on your de-duplicated data needs, not on the total amount of backup space as with other backup storage technologies. Again, because of the Commonality Factoring process, your initial determination of storage needs means that an Avamar RAIN grid will generally cost less per GB and require less total storage space, thanks to the higher degrees of de-duplication achieved over other de-duplication systems. Another major cost factor is client costs. With other backup technologies you must license the clients or hosts you wish to back up. With Avamar, the clients are free for a wide array of hosts (Windows, IBM AIX, HP-UX, Linux), and this includes agents for Microsoft Exchange and SharePoint, Oracle, DB2, and others, which are usually high-priced add-on licenses on top of the host license itself. In many cases, it is in this area that backup system implementations become very expensive very quickly.

In closing, Aspirus has spent a lot of time working with the EMC Avamar/NetWorker backup technology and feels it was absolutely the right move for all the reasons above, but there is one final point I have yet to cover. The best part, beyond everything that is technically interesting about this backup environment, is that we feel our backups are safe and recoverable, and we manage our backups rather than the backups managing us. Because we have a technically sound and functionally stable backup environment, that time can now be devoted to other work.

The Aspirus Backup Architecture Today