Data Center Forum: Power & Cooling Issues October 12, 2006 Presenters: Dr. Robert Sullivan ("Dr. Bob"), Triton Technology Systems; Fritz Menchinger, NER
AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution  Case study #3: Disaster recovery site
MOORE'S LAW "The number of transistors on a chip doubles every 24 months and the performance doubles every 18 months." Intel cofounder Gordon Moore (1965)
GROWTH IN TRANSISTORS PER DIE © 2003 Intel Corporation
2005 – 2010 PROJECTIONS PRODUCT HEAT DENSITY TREND CHART
INTEL PROJECTED POWER CONSUMPTION FOR A RACK OF 42 1U SERVERS
Actual product power consumption has lagged behind these projections by about 2 years.
Q3, 2001: 270 W/RU | 11.3 kW | 1,890 W/ft²*
Q3, 2002: 354 W/RU | 14.8 kW | 2,478 W/ft²*
Q3, 2003: 420 W/RU | 17.6 kW | 2,900 W/ft²*
Q3, 2004: 480 W/RU | 20.2 kW | 3,360 W/ft²*
* Based on product footprint
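As a plausibility check, the per-rack kilowatts follow directly from the per-RU figures. A minimal sketch of that arithmetic (assuming a rack of 42 1U servers; values match the table to within rounding):

```python
# Rack power from per-RU power for a 42 x 1U rack.
W_PER_RU = {2001: 270, 2002: 354, 2003: 420, 2004: 480}

for year, w_per_ru in sorted(W_PER_RU.items()):
    rack_kw = 42 * w_per_ru / 1000
    print(f"Q3 {year}: {rack_kw:.1f} kW per rack")
# Q3 2004 -> 20.2 kW; at the stated 3,360 W/ft², that implies a
# product footprint of about 20,160 / 3,360 ≈ 6 ft² per rack.
```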
2004 HIGH-END PRODUCTS (maximum configurations & options)
The 2004 trend chart mid-point projection was 1,800 W/ft².
Sun F15K: 36"x56" | 24.0 kW | 1,700 W/ft²*
IBM Z-Series: 36"x36" | 16.0 kW | 1,800 W/ft²*
HP Superdome: 28"x40" | 10.0 kW | 1,300 W/ft²*
IBM DASD: 32"x87" | 21.0 kW | 1,100 W/ft²*
EMC DMX3: 32"x324" | 54.0 kW | 750 W/ft²*
* Based on product footprint
2004 BLADE AND 1U SERVERS (maximum configurations & options)
The trend chart mid-point projection for 2004 is 3,000 W/ft².
Dell PowerEdge 1850MC: 24.0 kW | 4,000 W/ft²*
IBM eServer BladeCenter: 18.0 kW | 3,000 W/ft²*
HP ProLiant Blade: 18.0 kW | 3,000 W/ft²*
Sun Sunfire: 14.0 kW | 2,300 W/ft²*
RLX ServerBlade 3000i: 13.3 kW | 2,200 W/ft²*
Electric oven (for comparison): 8.0 kW | 2,000 W/ft²*
* W/ft² based on actual cabinet size
IMPLICATIONS OF THE CMOS POWER CRISIS Cost per processor is decreasing at 29% per year Constant dollars spent on high performance IT hardware three years from now will buy: 2.7 times more processors 12 times more processing power in the same or less floor space A 3.3-times increase in UPS power consumption Site power consumption will increase by at least 2x the UPS power consumption increase
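The 2.7x figure is just the 29% annual cost decline compounded over three years. A minimal sketch of that arithmetic (assuming simple annual compounding):

```python
# Constant dollars buy 1 / (1 - 0.29) times more processors each year.
decline = 0.29
years = 3
multiplier = (1 / (1 - decline)) ** years
print(f"{multiplier:.2f}x more processors")  # ~2.8x, quoted above as 2.7x
```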
AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution  Case study #3: Disaster recovery site
TEMPERATURE RISE UPON LOSS OF AIR COOLING Time to critical temperature rise: 40 W/ft² - 10 minutes 100 W/ft² - 3 to 5 minutes 200 W/ft² - 1 to 3 minutes 300 W/ft² - less than a minute. 300 W/ft² is just 4.5 kW in 15 ft²: a single cabinet in a typical cold aisle / hot aisle arrangement.
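A back-of-the-envelope estimate shows why the time shrinks so quickly. A minimal sketch, assuming only the room air absorbs the heat (a 10 ft ceiling and standard air properties, both assumptions); this air-only estimate is a lower bound, since the thermal mass of equipment and structure, reflected in the longer times above, slows the actual rise:

```python
# Rough time for room air to rise ~25°F (~14 K) after cooling is lost,
# per ft² of floor area, at the heat densities quoted above.
AIR_DENSITY = 1.2   # kg/m³
AIR_CP = 1005.0     # J/(kg·K)
CEILING_FT = 10.0   # assumed ceiling height
air_kg_per_ft2 = CEILING_FT * 0.0283168 * AIR_DENSITY  # ft³ -> m³ -> kg

for w_per_ft2 in (40, 100, 200, 300):
    rate_k_per_s = w_per_ft2 / (air_kg_per_ft2 * AIR_CP)
    print(f"{w_per_ft2} W/ft²: ~{14.0 / rate_k_per_s / 60:.1f} min (air only)")
# At 300 W/ft² the air alone passes critical in well under a minute.
```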
HIGH DENSITIES REQUIRE A COOLING PARADIGM SHIFT A paradigm shift occurs when a previously loosely coupled site infrastructure system becomes tightly coupled  Small changes have a big impact (often with unexpected results) Reliability of individual components, and fault tolerance if they malfunction, becomes critical
AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution  Case study #3: Disaster recovery site
MISMATCHED EXPECTATIONS "Does it NEVER FAIL?" versus "Does it WORK?" (Always available versus normally available) Linguistic Choices Are Critical
MISMATCHED EXPECTATIONS When expectations do not match reality IT demand = “24 by Forever” availability Infrastructure = Tier I or Tier II facility Failure to Define Expectations
FUNCTIONALITY DEFINITIONS
Tier I: Single path, no redundancy
Tier II: Single path, redundant components
Tier III: Single active path, redundant
Tier IV: Multiple active paths, redundant
SINGLE POWER PATH SINGLE POINTS-OF-FAILURE UPS system level failure Major circuit breakers (2-20) Minor circuit breakers (20-500) Plugs and receptacles (21-505) Electrical connections (258-6,180) Human error False EPO [Diagram: three power sources (utility, battery, generator) converge into one power path feeding the computer hardware]
DUAL POWER PATH SINGLE POINTS-OF-FAILURE UPS system level failure Major circuit breakers (2-20) Minor circuit breakers (20-500) [Diagram: duplicated utility, battery, and generator sources form two independent power paths feeding the computer hardware]
MISMATCHED EXPECTATIONS “Match the required level of site infrastructure capacity, functionality, master planning, organizational charter and doctrine, staffing, processes, and training to availability expectations.” The Only Way to Assure Success
AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution  Case study #3: Disaster recovery site
Cooling concerns > No computer room master plan Failure to measure, monitor, and use installation best practices Mechanical incapacity Bypass airflow Need for supplemental cooling
POOR MASTER PLANNING: COOLING UNIT LAYOUT The "circle the wagons" approach is common. It is not good.
POOR MASTER PLANNING: A WORSE COOLING UNIT LAYOUT Random placement: units added as a hot spot "solution" to a "circle the wagons" layout. This is worse.
RAISED-FLOOR UTILIZATION: TRADITIONAL (LEGACY) LAYOUT All aisles have elevated "mixed" temperature (starved supply airflow compounds the problem) Fails to deliver predictable air intake temperatures Reduces return air temperature, which reduces cooling unit capacity and removes moisture
MASTER PLANNING: ACHIEVE HIGH IT YIELD BY MAXIMIZING COOLING DELIVERY Static regain improves usable cooling unit redundancy Maximizes static pressure & CFM per perforated tile Minimizes the effect of high discharge velocity
COMPUTER ROOM LAYOUT
IT YIELD: MAXIMIZE SITE INVESTMENT UTILIZATION Cooling delivery is typically the constraining factor Manage cooling by zones of the overall room (one to four building bays max) Monitor and manage IT Yield performance metrics: racks/thermal conduction ft², rack unit positions, PDU power, breaker positions, redundant sensible cooling capacity, floor loading
Cooling concerns No computer room master plan > Failure to measure, monitor, and use installation best practices Mechanical incapacity Bypass airflow Need for supplemental cooling
FAILURE TO MEASURE AND MONITOR If you do not measure and record, you cannot monitor. If you do not monitor, you cannot control. Without controls, chaos reigns.
MEASURING, MONITORING, AND CONTROL Cooling delivery is typically the constraining factor Manage cooling by zones of the overall room (one to four building bays max) Monitor and manage IT Yield performance metrics: racks/thermal conduction ft², rack unit positions, PDU power, breaker positions, communication ports, redundant sensible cooling capacity, floor loading
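To make the monitoring worksheet on the next slide concrete, here is a minimal sketch of a per-zone record for these metrics; the field names and sample numbers are illustrative, not from the original:

```python
from dataclasses import dataclass

@dataclass
class ZoneWorksheet:
    """Illustrative per-zone record for the IT Yield metrics above."""
    zone: str
    racks: int
    rack_unit_positions_free: int
    pdu_power_kw: float
    breaker_positions_free: int
    comm_ports_free: int
    sensible_cooling_kw: float        # redundant sensible capacity
    floor_loading_lb_per_ft2: float

    def cooling_headroom_kw(self, it_load_kw: float) -> float:
        """Sensible cooling left over after serving the IT load."""
        return self.sensible_cooling_kw - it_load_kw

bay1 = ZoneWorksheet("Bay 1", racks=24, rack_unit_positions_free=180,
                     pdu_power_kw=96.0, breaker_positions_free=12,
                     comm_ports_free=96, sensible_cooling_kw=120.0,
                     floor_loading_lb_per_ft2=250.0)
print(bay1.cooling_headroom_kw(it_load_kw=85.0))  # 35.0 kW of headroom
```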
MASTER PLAN MONITORING WORKSHEET
Cooling concerns No computer room master plan Failure to measure, monitor, and use installation best practices > Mechanical incapacity Bypass airflow Need for supplemental cooling
ORIGINS OF THERMAL INCAPACITY: DESIGN OR EQUIPMENT RELATED A gross "ton" is not a "sensible" ton DX system refrigerant partially charged "Dueling" dehumidification/humidification Insufficient airflow across cooling coils Chilled water temperature too low Computer room return temperature too low Too much cold air bypass through unmanaged openings (cable cutouts and penetrations to adjacent spaces)
ORIGINS OF THERMAL INCAPACITY: HUMAN FACTORS Lack of psychrometric chart knowledge Inappropriate computer room floor plan and equipment layouts Pre-cooling of returning hot air by incorrect perforated floor tile placement and unsealed cable openings Control sensors and instruments not calibrated Engineering consultants who do not yet understand the unique cooling dynamics of data centers and underfloor air distribution
CONSEQUENCES OF THERMAL INCAPACITY The following results are based on detailed measurements in 19 computer rooms totaling 204,400 ft² 10% of the racks had "hot spots" with intake air exceeding 77°F / 40% RH This occurred despite 2.6 times more cooling running than was required by the heat load Rooms with the greatest excess of cooling capacity had the worst percentage of hot spots 10% of the cooling units had failed
Cooling concerns No computer room master plan Failure to measure, monitor, and use installation best practices Mechanical incapacity > Bypass airflow Need for supplemental cooling
BYPASS AIRFLOW DEFINITION Conditioned air that never reaches the air intakes of the computer equipment, escaping instead through cable cutouts and holes under cabinets, through misplaced perforated tiles, and through holes in the computer room perimeter walls, ceiling, or floor
COMPUTER ROOM LAYOUT OPTIONS: EFFECT OF BYPASS AIRFLOW Cold air escapes through cable cutouts Escaping cold air reduces static pressure, resulting in insufficient cold aisle airflow The result is vertical and zone hot spots in high heat load areas
RAISED-FLOOR UTILIZATION: TRADITIONAL LAYOUT All aisles have elevated "mixed" temperature (starved supply airflow compounds the problem) Fails to deliver predictable air intake temperatures Reduces return air temperature, which reduces cooling unit capacity
TYPICAL BYPASS AIRFLOW CONDITION Reduces kW capacity per rack that can be effectively and predictably cooled.
TYPICAL BYPASS AIRFLOW CONDITION (CONTINUED)  This unnecessarily large raised-floor opening should be closed. The edges of the cutout must be dressed according to NFPA code.
TYPICAL BYPASS AIRFLOW CONDITION (CONTINUED) Unnecessarily large cable cutout under a server rack
TYPICAL BYPASS AIRFLOW CONDITION (CONTINUED) Both a bypass airflow problem and a safety hazard.
BYPASS AIRFLOW - IS IT A PROBLEM? Based on detailed measurements in 19 computer rooms totaling 204,400 ft² Despite 2.6 times more cooling running than was required by the heat load, 10% of racks had air intake temperatures exceeding ASHRAE maximum reliability guidelines (rooms with the greatest excess cooling capacity running had the worst hot spots) 60% of the available cold air is short-cycling back to the cooling units through perforated tiles in the hot aisle and unsealed cable openings
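Those two numbers combine into a telling estimate. A minimal sketch, assuming the bypass fraction applies uniformly to the running capacity:

```python
# 2.6x the required cooling is running, but 60% of the cold air
# short-cycles back without ever cooling IT equipment.
running_ratio = 2.6      # running capacity / required capacity
bypass_fraction = 0.60   # cold air lost to bypass and short-cycling

effective_ratio = running_ratio * (1 - bypass_fraction)
print(f"Effective cooling delivered: {effective_ratio:.2f}x the load")
# ~1.04x: almost no usable margin, which is why hot spots persist
# despite the apparent 2.6x surplus.
```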
BYPASS AIRFLOW REDUCES RELIABILITY, STABILITY, AND USABLE COOLING CAPACITY Reduces underfloor static pressure Reduces volume of conditioned air coming into the cold aisle Exacerbates problems with underfloor obstructions Creates environment where recirculation of hot exhaust air across the top of racks will  occur Reduces kW capacity per rack that can be effectively and predictably cooled
INTERNAL RECIRCULATION CAN REDUCE RELIABILITY Internal recirculation within cabinets is also a problem. Utilize blanking plates within cabinets.
BYPASS AIRFLOW: HOW IS IT FIXED? PERIMETER HOLES Seal all the holes in the computer room perimeter. Use permanently installed firestop materials for conduits, pipes, construction holes, etc., through walls; use removable fire pillows for floor or wall cable pass-throughs.
BEST PRACTICE PERIMETER PENETRATIONS ARE SEALED Good example of fire stopping through a sidewall.
BEST PRACTICE PERIMETER PENETRATIONS ARE SEALED (CONTINUED) Excellent fire stopping practices are evident throughout this site.
FLOOR OPENINGS ACCEPTABLE CABLE CUTOUT SOLUTIONS Seal all raised-floor cable openings, plus openings around PDUs and cooling units, using fire pillows, foam sheeting, or brush assemblies.
FLOOR OPENINGS FIRE PILLOW SOLUTION Difficult to achieve an effective level of sealing Often falls to the subfloor or is kicked out of the way Regular policing is required No static dissipative property for electrostatic charge
FLOOR OPENINGS FIRE PILLOW EXAMPLES This is one way to prevent air loss. Additional refinement is needed.
FLOOR OPENINGS FOAM SHEETING SOLUTION   Very labor intensive to achieve good sealing efficiency Every cabling change requires re-cutting foam Often tears, pulls out, or falls to subfloor when cable head is pulled through Requires regular policing  Special foam material is required to achieve static dissipation
FLOOR OPENINGS FOAM SEALING EXAMPLES Plugging the cable opening is a good practice; a better choice of materials would be more appropriate.
FLOOR OPENINGS PROBLEMS WITH FOAM SEALING (CONTINUED)   The foam in this picture was torn and hanging by a thread.  It was pieced back together for the picture.  Tearing occurs when the cable head passes through.  Foam is typically deformed or missing in 50% to 75% of openings after six months.
FLOOR OPENINGS PROBLEMS WITH FOAM SEALING (CONTINUED)   Foam sealing has not been reinstalled after re-cabling. Resulting opening allows significant air leakage.
FLOOR OPENINGS BRUSH SEALING ASSEMBLIES   Most expensive initially, but least life cycle cost because recurring policing labor is not required High sealing effectiveness both initially and after multiple recablings (100% sealing effectiveness in undisturbed opening area) Doesn’t require training or policing Can be static dissipative
FLOOR OPENINGS BRUSH SEALING FOR NEW OPENINGS   Brush grommet for sealing new holes in floor tiles.
FLOOR OPENINGS BRUSH SEALING FOR NEW OPENINGS   Brush grommet for sealing new holes in floor tiles.
FLOOR OPENINGS BRUSH SEALING FOR EXISTING OPENINGS   Separable brush grommet for sealing existing openings in floor tiles.
INTERNAL BYPASS AIRFLOW - HOW IS IT FIXED? Install blanking plates within cabinets to prevent open RU positions from recirculating hot exhaust air
INTERNAL BYPASS AIRFLOW   BLANKING PLATE INSTALLATION EXAMPLE   Proper use of blanking or filler plates exhibited.
Cooling concerns No computer room master plan Failure to measure, monitor, and use installation best practices Mechanical incapacity Bypass airflow > Need for supplemental cooling
NEED FOR SUPPLEMENTAL COOLING When the normal cooling system will not handle the load, especially in high density spot situations, supplemental cooling is necessary Dedicated to one cabinet Dedicated to an area of the room
SUPPLEMENTAL COOLING OPTIONS
In-line cooling (horizontal airflow): Hot exhaust air is drawn through a fan coil unit and cold air is blown into the cold aisle. Usually a chilled water installation.
Overhead cooling: A fan coil unit sits on top of the cabinet or is hung from the ceiling. Hot exhaust air is drawn through the fan coil unit, using a refrigerant rather than chilled water, and blown into the cold aisle.
Back cover cooling: A fan coil system replaces the back cover of a cabinet. Usually a chilled water system. Heat is neutralized before the air is blown into the hot aisle.
Dedicated cabinet: Air is recirculated within the cabinet, which contains its own fans and cooling coil.
All have redundant fan systems; only one has redundant cooling capability.
MORE INFORMATION koldlok.com – Koldlok Products upsite.com – White Papers How to Successfully Cool High-Density IT Hardware Seminar November 1 – 3,  Miami, FL  November 27 – 29, Santa Fe, NM Check upsite.com for details
AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
HISTORY OF INNOVATION
1985: MF tape racks
1989: Autotrieve, the first automated tape racks; first S.A.M.
1991: First high-density tape racks
1992: NER's first custom server cabinets
1998: Began distributing Cybex
2000: Began distributing NetBotz
2001: Began distributing ServerTech
2002: Introduced R3
2004: Ultimate Core; largest Avocent strategic distribution partner
2005: Launch of services business
2006: Infrastructure consulting, design, build, high density power and cooling; IT security audit, compliance services
INNOVATION ROADMAP: SOLUTIONS AND SERVICES FOR 2006 AND BEYOND
2005: Begin factory integration; on-site integration
2006: Data Center Health Check; CFD modeling and AdaptivCool cooling solutions; build-out / build-new assessment; enhanced centralized management solutions and training services; project management and implementation; asset & inventory service; data center construction; enhanced facility monitoring
THE LATEST FROM AFCOM SURVEYS "More than two thirds (66%) of 178 AFCOM data center professionals surveyed anticipate they'll have to expand their data centers or turn to outsourcing to meet demands in the next decade." Source: InformationWeek, August 2006
AFCOM SURVEY SAYS "By 2010, more than half (50%) of all data centers will have to relocate or outsource some applications."
CURRENT STATE OF DATA CENTER COOLING 19 data centers surveyed Average had 2.6x more cooling than IT load Still, 10% of racks were over 77°F 72% of cooling air bypasses racks or mixes before reaching the racks AFCOM 2006 Keynote Speech
ARE YOU PREPARED?? "AFCOM's Data Center Institute, a think tank dedicated to improving data center management and operations, predicts that over the next five years, power failures and limits on power availability will halt data center operations at more than 90% of companies." AFCOM 2006 Keynote Speech
AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
CIO'S PARADOX TODAY (SAME MESSAGE FOR 3 YEARS) The CIO's Paradox in action: what the CIO hears from other business leaders.
"You're a service. Why can't you respond to our division better? What are your people doing?" & "IT is strategic. How do you, the CIO, set investment priorities?"
"I have no trust in IT's ability to deliver measurable value" & "We need new, better solutions."
"Reduce your budget!" & "Keep our systems running 24x7!"
DATA CENTER FAULT TOLERANCE Are these photos from your data center?
THERMAL MANAGEMENT - IT'S A REALITY (WE HAVE BEEN USING THIS SLIDE FOR YEARS) New servers: one cabinet = 27,304 BTU/hr!!!! One ton of cooling = 12,000 BTU/hr. A fully configured cabinet can produce 35,000+ watts.* Power = heat! Projected thermal loads for servers & DASD: 100 W/ft² (2000), 150 W/ft² (2002), 200 W/ft² (2005). * IBM BladeCenter H with four 2,900-watt power supplies, 3 per cabinet
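The unit conversions behind those figures, as a minimal sketch using the standard factors (3.412 BTU/hr per watt; 12,000 BTU/hr per ton of cooling):

```python
BTU_PER_WATT = 3.412   # BTU/hr per watt of electrical load
BTU_PER_TON = 12_000   # BTU/hr per ton of cooling

cabinet_btu = 27_304
print(f"{cabinet_btu / BTU_PER_TON:.2f} tons")  # ~2.28 tons per cabinet

fully_loaded_w = 35_000
btu = fully_loaded_w * BTU_PER_WATT
print(f"{btu:,.0f} BTU/hr = {btu / BTU_PER_TON:.1f} tons")
# ~119,420 BTU/hr, i.e. roughly 10 tons of cooling for one cabinet
```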
CAN YOU GET 5 TONS OF AIR THROUGH  ONE PERFORATED TILE??  HOW ABOUT 9?
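Why the answer is almost certainly no: a minimal sketch using the standard sensible-heat relation (BTU/hr ≈ 1.08 × CFM × ΔT°F), assuming a 20°F supply-to-return rise; a typical 25%-open perforated tile passes only about 300-500 CFM:

```python
SENSIBLE_FACTOR = 1.08   # BTU/hr per (CFM · °F) for standard air
DELTA_T_F = 20.0         # assumed supply-to-return temperature rise
BTU_PER_TON = 12_000

def cfm_for_tons(tons: float) -> float:
    """Airflow needed to carry `tons` of sensible heat at DELTA_T_F."""
    return tons * BTU_PER_TON / (SENSIBLE_FACTOR * DELTA_T_F)

for tons in (5, 9):
    print(f"{tons} tons -> ~{cfm_for_tons(tons):,.0f} CFM")
# 5 tons -> ~2,778 CFM; 9 tons -> ~5,000 CFM. One perforated tile
# delivers an order of magnitude less than that.
```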
HOW WILL YOU GET 6 KW OR MORE TO A SINGLE CABINET?? Application-based circuits for a 5.9 kW load:
(4) 20 amp, 110 V circuits; (8) if redundant
(3) 30 amp, 110 V circuits; (6) if redundant
(2) 30 amp, 208 V circuits; (4) if redundant
(1) 60 amp, 208 V circuit; (2) if redundant
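A minimal sketch of the capacity arithmetic behind these options, assuming the common practice of loading branch circuits to 80% for continuous loads (an assumption; the original does not state its derating):

```python
DERATE = 0.80  # assumed continuous-load limit (NEC 80% rule)

def usable_kw(count: int, amps: float, volts: float) -> float:
    """Usable power from `count` branch circuits at the derated load."""
    return count * amps * volts * DERATE / 1000

options = [(4, 20, 110), (3, 30, 110), (2, 30, 208), (1, 60, 208)]
for count, amps, volts in options:
    print(f"({count}) {amps} A, {volts} V -> {usable_kw(count, amps, volts):.1f} kW")
# 7.0, 7.9, 10.0, and 10.0 kW respectively: each option clears the
# 5.9 kW target; doubling the circuit count gives A/B redundancy.
```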
HIGH DENSITY LEADS TO COMPLEXITY (we have been talking about this for years) Each manufacturer has its own management platform Higher-powered systems require a higher level of management Physical requirements change Supporting infrastructure is more complex (weight, power, heat, cables)
WHAT CAN YOU DO? Implement best practices where you can Focus on high value/high return Choose solutions that allow for standardization & automation Rely on Trusted Advisors to fill the gaps
AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
BACKGROUND - SOUND FAMILIAR?? Aging data center (7 years) Blades and other high density equipment implemented, creating power and heat problems Consolidating via VMware Management demanding higher reliability Management will not move the data center
DATA CENTER HEALTH CHECK-UP: WHAT AREAS DID WE INVESTIGATE? Thermal – airflow delivery, perforated tile placement Cable management – in-rack cabling, overhead ladder racking, sub-floor cable raceway Standardization – cabinet placement & orientation, data center layout, cabinet types Remote access & monitoring – KVM types, environmental monitoring, server room access Fault tolerance – redundant power, CRAC failure
ISSUE #1- THERMALS/HEAT Problem: Heat build-up in server room Leading Causes: Sub-floor obstructions – power & data cabling 12” raised floor is insufficient for size of server room Perforated tiles incorrectly placed Cabinet placement and orientation do not permit hot aisle return paths – cabinets intake hot air
HEAT BUILDUP IN SERVER ROOM
THERMAL MANAGEMENT RECOMMENDATIONS Clean up under-floor power and data cabling Move toward proper cable management: "Data Above, Power Below" Rework perforated floor tile positions Block cable cutouts with brush grommets Implement a hot-aisle/cold-aisle design Install an AdaptivCool airflow delivery system to maintain temperatures in and around high-density cabinets
RECOMMENDED PERFORATED TILE POSITIONING WITH EXISTING LAYOUT Move  this row!
RECOMMENDED HOT-AISLE/COLD-AISLE LAYOUT Note the placement of the CRACs relative to the hot aisles. Note that we shut off one of the CRACs!
THERMAL PATTERNS OF RECOMMENDED HOT-AISLE/COLD-AISLE LAYOUT Before / After
CABLE MANAGEMENT ISSUE #2 Problem: Power & data cabling causing clutter under the raised floor and inside cabinets Leading causes: No standard for in-rack cable management, causing waterfall cabling and heat buildup No overhead ladder racking No cable raceway under the floor Server depth reaching cabinet capacity
CABLE MANAGEMENT RECOMMENDATIONS Segregate power and data via cable management Eliminate swing arms Install patch panels to document rack density and per-port usage Use ladder racking Standardize placement of in-rack power supplies to consolidate and secure power cabling Before / After
STANDARDIZATION ISSUE #3 Problem: Lack of standardization across server room Leading Causes: Cabinets vary in size, shape, and vendor type Row lengths vary, gaps in rows due to tables Rack space unused No cable management standards In-rack power utilizing inefficient voltage/amperage
STANDARDIZATION RECOMMENDATIONS Set standards for cabinets Invest in a standard cabinet structure, power, cable mgmt High voltage/high amperage power will increase efficiency/decrease usage Separate power and data Consolidate servers into high density cabinets Arrange like cabinets in rows Remove tables from rows Arrange rows in equal lengths to improve airflow patterns Arrange layout into hot-aisle/cold-aisle design
REMOTE ACCESS & MONITORING ISSUE #4 Problem: Constant foot traffic in the server room due to lack of remote access Problem: Current KVM technology using legacy analog local-access methods Leading causes: Current remote access methods are costly and use up multiple network ports Environmental monitoring not adequate or providing remote notification Central alarm management located off-site Leak detection system inadequate for a lakeside location
REMOTE ACCESS & MONITORING RECOMMENDATIONS Invest in modern KVM technologies Reduce cabling and port usage Improve access to servers from NOC and remote sites Install and maintain centralized remote access & monitoring solution in-house Reduce foot traffic in and out of the server room Improve security of devices and data Install improved environmental monitoring system Configure to monitor CRAC, UPS, Generator Configure for remote notification services according to user-specified thresholds Install adequate leak detection solution
FAULT TOLERANCE ISSUES #5 Problem: Lack of redundant systems to prevent unscheduled downtime Cabinets utilizing non-redundant power sources Cable clutter putting devices at risk of being disconnected Three-phase power delivery system out of balance CRAC placement inhibits redundancy and recovery from failures (page 16) UPS and generator reaching maximum capacity, with no room for growth Risk of lakeside flooding requires a more adequate leak detection system and disaster notification services
WORST CASE SCENARIO – 15-TON CRAC FAILURE Failure in Current Layout Failure in Hot/Cold Layout
FAULT TOLERANCE RECOMMENDATIONS Plan in-rack power around a fully configured cabinet Proper planning will result in a balanced three-phase power delivery system Proper cable management will decrease the risk of disconnection of power/data cabling Consistent maintenance of aging CRAC units will extend their lifetime Improved airflow will help improve the efficiency of the units Cooling redundancy during CRAC failures is still not available Install an improved leak detection system Monitor the perimeter of the data center as well as specific points such as chilled water piping, overhead plumbing, and CRAC units
COST BENEFIT ANALYSIS (axes: benefit/return on investment vs. cost/effort)
Category 1: Low benefit / low return; low cost / minimal effort
Category 2: High benefit / high return; low cost / minimal effort
Category 3: Minimal benefit / low return; high cost / maximum effort
Category 4: High benefit / high return; high cost / maximum effort
AIRFLOW/CABLE MGMT COST/BENEFIT ANALYSIS MATRIX
Category 1: Manage cabling inside racks; position perforated tiles in the recommended layout; eliminate vendor cable mgmt
Category 2: Consolidate/rerun cabling; invest in proper perforated tiles and layout; implement hot/cold layout
Category 3: Add/replace CRAC units; add/replace UPS/generator; standardize racks/power
Category 4: Standardize racks/power in conjunction with hot/cold layout; install dynamic airflow system
REMOTE A&M/FAULT TOLERANCE COST/BENEFIT ANALYSIS MATRIX
Category 1: Manage cabling to eliminate risk of disconnections; improve remote access via current methods
Category 2: Install modern KVM / remote access & monitoring; maintain/upgrade CRACs; improve leak detection
Category 3: Add/replace CRAC/generator; standardize racks/power/data/access; improve leak detection
Category 4: Relocate data center; create cooling/disaster recovery plan from scratch; plan power/data/access
IN SUMMARY The recommendations will help mitigate the issues Resolutions may not provide room for growth or fault tolerance without the large cost associated with adding or replacing data center devices (CRAC, UPS, generator) Assistance in implementing the hot-aisle/cold-aisle setup and rack/power standardization will help improve the issues May cause downtime; requires effort toward rearrangement High cost/minimal effort resolutions may aid in resolving current issues Issues may arise again due to normal growth over the next several years
AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
BACKGROUND High-visibility name brand with a strong competitor New data center build (renovation) Very old multi-use building near the city center Short time frame from build-out completion to lights-on Well-known consulting firm doing project management
ISSUES 30 days to implement the new data center (post construction) Maintain enterprise standards while meeting the varying requirements of the new data center build Provide best practices at the rack level Maximize efficiencies of standardization and automation Cable management – cabinet placement & orientation Remote access & monitoring – KVM types Fault tolerance – redundant power
VARYING POWER REQUIREMENTS BY ROW WERE A PERCEIVED ISSUE FOR THE CUSTOMER 15 kW / 12 kW / 9 kW / 5 kW
TIME WAS NOT ON THEIR SIDE Build-out had a 90-day window for completion Vendor management was an issue for the project manager
THE SOLUTION (FACTORY-INTEGRATED CABINETS)
2 to 4 208 V, 60 A power strips
Pre-wired with two power cords; C13 every 4U (total of 14 cords)
Run two blue Cat5 copper cables and one green Cat5 copper cable from each server location
Two pairs of LC cable to the top of the rack
Install a 24-port Cat6 panel and three fiber cassettes for a total of 18 LC ports
Keyboard pull-out
IP remote console for each cabinet
Ganged and leveled at the customer site
THE RESULT? Customer saved 8 hours per cabinet of integration time $105,000 in hard cost savings after integration fees $50,000 savings from the electrical contractor because of fewer runs, with all runs being the same Maintained standards, as every cabinet was the same (look and feel) Funneled many tasks (and vendors) into a single source: one throat to choke Met the 30-day operational deadline
AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
BACKGROUND The Katrina disaster 2,000 miles away, a CRAC failure during an Indian summer, the temporary unavailability of rental spot coolers, and persistent hot spots spotlighted the need for better thermal management. The site also faced sharply rising energy costs it had not budgeted for; higher electricity bills cut deeply into the IT department's budget. The Technology Manager knew he needed a solution to both the thermal and cost problems and evaluated the available options.
ISSUES (OPTIONS CONSIDERED)
Use a hot aisle/cold aisle layout: This is the normal solution, but here the location of all the CRACs on one side of the room, and the inability to shut down the disaster recovery function's IT equipment during a move, made this option unattractive.
Add additional CRAC capacity: The room was almost full, and there was no capital budget for buying more CRACs. More CRACs would also drive utility bills higher, and the site already had more than enough cooling capacity with its 30-ton units.
OBJECTIVES OF THE PROJECT
Maintain a higher cooling margin during the summer months, when the environment immediately surrounding the data center exposes it to temperature extremes.
Fix hot spots in several areas.
Save money on electric utilities, with payback on any expense in less than a year.
Allow remote monitoring of the data center's thermal health.
Plan for the almost certain expansion of the center's thermal load.
Establish a support and maintenance plan to ensure continued thermal performance.
IMPLEMENTATION
Site audit to inventory and characterize the IT equipment heat sources and the facility
Simulation using CFD (Computational Fluid Dynamics) modeling to predict heat and airflow of the baseline data center
Verification of the CFD model against measurements taken during the audit
Iteration using the model to determine the optimum configuration of passive and active airflow elements
CFD MODELING Before / After
IMPLEMENTATION (CONTINUED)
Installation of a sensor network to monitor changes during the remaining installation
Installation and reconfiguration of passive and active airflow elements
Verification and recertification of room thermal performance
No downtime was incurred during any of the project phases.
COOLING MARGIN RESULTS The cooling margin of the room was improved by 7°F at the top of the racks.
SUMMARY OF RESULTS More dramatically, virtually all server intake temperatures dropped, some by as much as 14°F.
SUMMARY OF UTILITY SAVINGS This particular site cannot take full advantage of its potential energy savings today because it has constant speed components.  Facilities executives are considering adding low cost Variable Frequency Drives (VFDs) in the future to benefit fully.
SUMMARY Data center cooling/power technology has remained virtually unchanged for 30 years, while IT equipment refreshes every 3 years. Power requirements and heat densities are increasing with each new generation. More raw cooling capacity, or adding more low-voltage whips, is rarely the right answer.
FINAL WORD Meeting current and near-term power needs with high-voltage/high-amperage power, targeting the available cooling to where it is needed most, and controlling airflow precisely are the sensible approaches that will pay dividends in equipment uptime, energy costs, and real estate.

More Related Content

PPT
Eliminating Data Center Hot Spots
PPT
Data Center Cooling Study on Liquid Cooling
PPTX
Cooling Optimization 101: A Beginner's Guide to Data Center Cooling
PPTX
Clarifying ASHRAE's Recommended Vs. Allowable Temperature Envelopes and How t...
PDF
Ab bp warehouse hvac calculation
PDF
Slides: The Top 3 North America Data Center Trends for Cooling
PDF
How Row-based Data Center Cooling Works
PPTX
Gaining Data Center Cooling Efficiency Through Airflow Management
Eliminating Data Center Hot Spots
Data Center Cooling Study on Liquid Cooling
Cooling Optimization 101: A Beginner's Guide to Data Center Cooling
Clarifying ASHRAE's Recommended Vs. Allowable Temperature Envelopes and How t...
Ab bp warehouse hvac calculation
Slides: The Top 3 North America Data Center Trends for Cooling
How Row-based Data Center Cooling Works
Gaining Data Center Cooling Efficiency Through Airflow Management

What's hot (20)

PDF
Heat Load Calculation
PDF
Data center design standards for cabinet and floor loading
PPT
Data Center Lessons Learned
PDF
Apc cooling solutions
PPTX
Data Center Cooling Efficiency: Understanding the Science of the 4 Delta T's
PPTX
Utilization of Computer Room Cooling Infrastructure: Measurement Reveals Oppo...
PPTX
How IT Decisions Impact Facilities: The Importance of Mutual Understanding
PDF
Impact Of High Density Colo Hot Aisles on IT Personnel Work Conditions
PDF
The Use of Ceiling Ducted Air Containment in Data Centers
PPTX
The Science Behind Airflow Management Best Practices
PDF
Datwyler data center presentation info tech middle east
PDF
Data Center Floor Design - Your Layout Can Save of Kill Your PUE & Cooling Ef...
PPTX
Myths of Data Center Containment:Whats's True and What's Not
PPT
Csc.Cooling Audit Results.V1.12 Jul 08
PDF
Sb in row-gen.ii-1.0_en
PDF
Integrating Cost & Engineering Considerations in HVAC Design
PPTX
Liquid cooling hot water cooling
PDF
Implementing Hot and Cold Air Containment in Existing Data Centers
PPTX
Load estimation in Air Conditioning
PPT
Data Centre Efficiency
Heat Load Calculation
Data center design standards for cabinet and floor loading
Data Center Lessons Learned
Apc cooling solutions
Data Center Cooling Efficiency: Understanding the Science of the 4 Delta T's
Utilization of Computer Room Cooling Infrastructure: Measurement Reveals Oppo...
How IT Decisions Impact Facilities: The Importance of Mutual Understanding
Impact Of High Density Colo Hot Aisles on IT Personnel Work Conditions
The Use of Ceiling Ducted Air Containment in Data Centers
The Science Behind Airflow Management Best Practices
Datwyler data center presentation info tech middle east
Data Center Floor Design - Your Layout Can Save of Kill Your PUE & Cooling Ef...
Myths of Data Center Containment:Whats's True and What's Not
Csc.Cooling Audit Results.V1.12 Jul 08
Sb in row-gen.ii-1.0_en
Integrating Cost & Engineering Considerations in HVAC Design
Liquid cooling hot water cooling
Implementing Hot and Cold Air Containment in Existing Data Centers
Load estimation in Air Conditioning
Data Centre Efficiency
Ad

Viewers also liked (20)

PPS
Rio LLC District Cooling
PPT
example hydrogen seal oil presentation
PPT
Water Conservation - Cooling Tower Management Overview
PPSX
Cooling water (CW) system
PPTX
Generator cooling and sealing system 2
PDF
Turbine Inlet Air Cooling (TIAC) - Case Studies - Economics - Performance - C...
PPTX
Intelligent Cooling system
PPTX
Power Sector in Pakistan - 2017
PDF
Solar Thermal Cooling
PDF
District cooling design & case study
PDF
Hvac water chillers and cooling towers fundamentals application and operation
PDF
Cooling system for ic engines
PPTX
Methods Of Cooling Of Electrical Machines
PPT
Cooling system
PPTX
Topic 3 District Cooling System
PPT
Cooling towers
PPT
Engine systems diesel engine analyst - full
PPT
Engine systems diesel engine analyst - part 2
PPT
Cooling system ppt
PPTX
Cooling system
Rio LLC District Cooling
example hydrogen seal oil presentation
Water Conservation - Cooling Tower Management Overview
Cooling water (CW) system
Generator cooling and sealing system 2
Turbine Inlet Air Cooling (TIAC) - Case Studies - Economics - Performance - C...
Intelligent Cooling system
Power Sector in Pakistan - 2017
Solar Thermal Cooling
District cooling design & case study
Hvac water chillers and cooling towers fundamentals application and operation
Cooling system for ic engines
Methods Of Cooling Of Electrical Machines
Cooling system
Topic 3 District Cooling System
Cooling towers
Engine systems diesel engine analyst - full
Engine systems diesel engine analyst - part 2
Cooling system ppt
Cooling system
Ad

Similar to Cooling & Power Issues (20)

PPTX
Green Data Center
PPTX
What's Next in Cooling: Capacity, Containment, & More
PPT
Tia 942 Data Center Standards
PPTX
Proactively Managing Your Data Center Infrastructure
PPT
Ever Green Dc
PDF
Cirrascale forest container march 2011
PDF
Critical design elements for high power density data centers
PDF
Ashrae thermal guidelines svlg 2015 (1)
PPTX
4 steps to quickly improve pue through airflow management
PPTX
Virtualización para la Eficiencia
PDF
Thermal Management of Edge Data Centers
PPT
Afcom air control solution presentation
PPT
Hot Aisle & Cold Aisle Containment Solutions & Case Studies
PPTX
HVAC for Data Centers
PPTX
Competitive Assessment Updated
PPTX
Data center hvac
PPT
Thermal and airflow modeling methodology for Desktop PC
PPTX
2019 IEEE SPCE Austin presentation-1570590440.pptx
PDF
Energy recovery presentation
PDF
Crac precision climate_control_units_for_data_centres
Green Data Center
What's Next in Cooling: Capacity, Containment, & More
Tia 942 Data Center Standards
Proactively Managing Your Data Center Infrastructure
Ever Green Dc
Cirrascale forest container march 2011
Critical design elements for high power density data centers
Ashrae thermal guidelines svlg 2015 (1)
4 steps to quickly improve pue through airflow management
Virtualización para la Eficiencia
Thermal Management of Edge Data Centers
Afcom air control solution presentation
Hot Aisle & Cold Aisle Containment Solutions & Case Studies
HVAC for Data Centers
Competitive Assessment Updated
Data center hvac
Thermal and airflow modeling methodology for Desktop PC
2019 IEEE SPCE Austin presentation-1570590440.pptx
Energy recovery presentation
Crac precision climate_control_units_for_data_centres

Recently uploaded (20)

PPTX
Astra-Investor- business Presentation (1).pptx
PPTX
svnfcksanfskjcsnvvjknsnvsdscnsncxasxa saccacxsax
PPTX
Slide gioi thieu VietinBank Quy 2 - 2025
PDF
533158074-Saudi-Arabia-Companies-List-Contact.pdf
DOCX
Center Enamel A Strategic Partner for the Modernization of Georgia's Chemical...
PDF
Solaris Resources Presentation - Corporate August 2025.pdf
DOCX
80 DE ÔN VÀO 10 NĂM 2023vhkkkjjhhhhjjjj
PDF
Nante Industrial Plug Factory: Engineering Quality for Modern Power Applications
PDF
PMB 401-Identification-of-Potential-Biotechnological-Products.pdf
PDF
Keppel_Proposed Divestment of M1 Limited
PPTX
interschool scomp.pptxzdkjhdjvdjvdjdhjhieij
PPTX
chapter 2 entrepreneurship full lecture ppt
PPTX
2 - Self & Personality 587689213yiuedhwejbmansbeakjrk
PDF
TyAnn Osborn: A Visionary Leader Shaping Corporate Workforce Dynamics
PPT
Lecture 3344;;,,(,(((((((((((((((((((((((
PDF
THE COMPLETE GUIDE TO BUILDING PASSIVE INCOME ONLINE
PPTX
IITM - FINAL Option - 01 - 12.08.25.pptx
PDF
Chapter 2 - AI chatbots and prompt engineering.pdf
DOCX
Hand book of Entrepreneurship 4 Chapters.docx
PDF
Family Law: The Role of Communication in Mediation (www.kiu.ac.ug)
Astra-Investor- business Presentation (1).pptx
svnfcksanfskjcsnvvjknsnvsdscnsncxasxa saccacxsax
Slide gioi thieu VietinBank Quy 2 - 2025
533158074-Saudi-Arabia-Companies-List-Contact.pdf
Center Enamel A Strategic Partner for the Modernization of Georgia's Chemical...
Solaris Resources Presentation - Corporate August 2025.pdf
80 DE ÔN VÀO 10 NĂM 2023vhkkkjjhhhhjjjj
Nante Industrial Plug Factory: Engineering Quality for Modern Power Applications
PMB 401-Identification-of-Potential-Biotechnological-Products.pdf
Keppel_Proposed Divestment of M1 Limited
interschool scomp.pptxzdkjhdjvdjvdjdhjhieij
chapter 2 entrepreneurship full lecture ppt
2 - Self & Personality 587689213yiuedhwejbmansbeakjrk
TyAnn Osborn: A Visionary Leader Shaping Corporate Workforce Dynamics
Lecture 3344;;,,(,(((((((((((((((((((((((
THE COMPLETE GUIDE TO BUILDING PASSIVE INCOME ONLINE
IITM - FINAL Option - 01 - 12.08.25.pptx
Chapter 2 - AI chatbots and prompt engineering.pdf
Hand book of Entrepreneurship 4 Chapters.docx
Family Law: The Role of Communication in Mediation (www.kiu.ac.ug)

Cooling & Power Issues

  • 1. Data Center Forum: Power & Cooling Issues October 12, 2006 Presenters: Dr. Robert Sullivan, "Dr. Bob,” Triton Technology Systems Fritz Menchinger, NER
  • 2. AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
  • 3. MOORE’S LAW “The number of transistors on a chip doubles every 24 months and the Performance doubles every 18 months.” Intel cofounder Gordon Moore (1965)
  • 4. GROWTH IN TRANSISTORS PER DIE © 2003 Intel Corporation
  • 5. 2005 – 2010 PROJECTIONS PRODUCT HEAT DENSITY TREND CHART
  • 6. INTEL PROJECTED ACTUAL POWER CONSUMPTION FOR 42-1 RU SERVERS Actual product power consumption has lagged behind these projections by about 2 years * Product footprint 3,360 W/ft 2* 20.2 kW 480 W/RU Q3, 2004 2,900 W/ft 2* 17.6 kW 420 W/RU Q3, 2003 2,478 W/ft 2* 14.8 kW 354 W/RU Q3, 2002 1,890 W/ft 2* 11.3 kW 270 W/RU Q3, 2001
  • 7. 2004 HIGH-END PRODUCTS 2004 Trend chart mid-point projection was 1,800 W/ft 2 (Maximum configurations & options) * Based on product footprint 750 W/ft 2* 54.0 kW 32”x324” EMC DMX3 1,100 W/ft 2* 21.0 kW 32”x87” IBM DASD 1,300 W/ft 2* 10.0 kW 28”x40” HP Superdome 1,800 W/ft 2* 16.0 kW 36”x36” IBM Z-Series 1,700 W/ft 2* 24.0 kW 36”x56” Sun F15K
  • 8. 2004 BLADE AND 1U SERVERS Trend chart mid-point projection for 2004 is 3,000 W/ft 2 (Maximum configurations & options) * Product footprint W/ft2 based on actual cabinet size 3,000 W/ft 2 * 18.0 kW HP ProLiant Ble 2,000 W/ft 2 * 8.0 kW Electric Oven 2,200 W/ft 2 * 13.3 kW RLX ServerBlade 3000i 2,300 W/ft 2 * 14.0 kW Sun Sunfire 3,000 W/ft 2 * 18.0 kW IBM eServer Blade Center 4,000 W/ft 2 * 24.0 kW Dell PowerEdge 1850MC
  • 9. IMPLICATIONS OF THE CMOS POWER CRISIS Cost per processor is decreasing at 29% per year Constant dollars spent on high performance IT hardware three years from now will buy: 2.7 times more processors 12 times more processing power in the same or less floor space 3.3 times UPS power consumption increase Site power consumption will increase by at least 2x the UPS power consumption increase
  • 10. AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
  • 11. TEMPERATURE RISE UPON LOSS OF AIR COOLING Time to critical temperature rise 40 W/ft² - 10 minutes 100 W/ft² - 3 to 5 minutes 200 W/ft² - 1to 3 minutes 300 W/ft² - Less than a minute 300 W/ft²? 4.5 kW in 15 ft² Single cabinet in typical Cold Aisle / Hot Aisle arrangement
  • 12. HIGH DENSITIES REQUIRE A COOLING PARADIGM SHIFT A paradigm shift occurs when a previously loosely coupled site infrastructure system becomes tightly coupled Small changes have a big impact (often with unexpected results) Reliability of individual components, and fault tolerance if they malfunction, becomes critical
  • 13. AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
  • 14. MISMATCHED EXPECTATIONS “ Does it NEVER FAIL?” Versus “ Does it WORK?” (Always available versus normally available) Linguistic Choices Are Critical
  • 15. MISMATCHED EXPECTATIONS When expectations do not match reality IT demand = “24 by Forever” availability Infrastructure = Tier I or Tier II facility Failure to Define Expectations
  • 16. FUNCTIONALITY DEFINITIONS Multiple active paths, redundant Tier IV: Single active path, redundant Tier III: Single path, redundant components Tier II: Single path, no redundancy Tier I:
  • 17. SINGLE POWER PATH SINGLE POINTS-OF-FAILURE UPS system level failure Major circuit breakers (2-20) Minor circuit breakers (20-500) Plugs and receptacles (21-505) Electrical connections (258-6180) Human error False EPO Utility Battery Generator THREE POWER PATHS ONE POWER PATH COMPUTER HARDWARE
  • 18. DUAL POWER PATH SINGLE POINTS-OF-FAILURE UPS system level failure Major circuit breakers (2-20) Minor circuit breakers (20-500) Utility Utility Battery Generator THREE POWER PATHS TWO POWER PATHS COMPUTER HARDWARE Generator Battery 2 3 1 45
  • 19. MISMATCHED EXPECTATIONS “Match the required level of site infrastructure capacity, functionality, master planning, organizational charter and doctrine, staffing, processes, and training to availability expectations.” The Only Way to Assure Success
  • 20. AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
  • 21. Cooling concerns > No computer room master plan Failure to measure, monitor, and use installation best practices Mechanical incapacity Bypass airflow Need for supplemental cooling
  • 22. POOR MASTER PLANNING Circle the wagons approach is common Not good Cooling Unit Layout
  • 23. POOR MASTER PLANNING Random placement Hot spot solution to “circle the wagons” Worse Cooling Unit Layout
  • 24. RAISED-FLOOR UTILIZATION All aisles have elevated “mixed” temperature (starved supply airflow compounds problem) Fails to deliver predictable air intake temperatures Reduces return air temperature which reduces cooling unit capacity and removes moisture Traditional (Legacy) Layout
  • 25. MASTER PLANNING Static regain improves usable cooling unit redundancy Maximizes static pressure & CFM per perforated tile Minimizes effect of high discharge in velocity Achieves High IT Yield by Maximizing Cooling Delivery
  • 27. IT YIELD Cooling delivery is typically the constraining factor Manage cooling by zones of the overall room One to four building bays max Monitor and manage IT Yield performance metrics Racks/thermal conduction ft 2 Rack unit positions PDU power Breaker positions Redundant sensible cooling capacity Floor loading Maximize Site Investment Utilization
  • 28. Cooling concerns No computer room master plan > Failure to measure, monitor, and use installation best practices Mechanical incapacity Bypass airflow Need for supplemental cooling
  • 29. FAILURE TO MEASURE AND MONITOR If you do not measure and record You can not monitor If you do not monitor you can not control Without controls Chaos reigns
  • 30. MEASURING MONITORING AND CONTROL Cooling delivery is typically the constraining factor Manage cooling by zones of the overall room One to four building bays max Monitor and manage IT Yield performance metrics Racks/thermal conduction ft 2 Rack unit positions PDU power Breaker positions Communication ports Redundant sensible cooling capacity Floor loading
  • 32. Cooling concerns No computer room master plan Failure to measure, monitor, and use installation best practices > Mechanical incapacity Bypass airflow Need for supplemental cooling
  • 33. ORIGINS OF THERMAL INCAPACITY Design or Equipment Related A gross “ton” is not a “sensible” ton DX system refrigerant being partially charged “ Dueling” dehumidification/humidification Insufficient airflow across cooling coils Chilled water temperature too low Computer room return temperature too low Too much cold air bypass through unmanaged openings (cable cutouts and penetrations to adjacent spaces)
  • 34. ORIGINS OF THERMAL INCAPACITY Human Factors Lack of psychrometric chart knowledge Inappropriate computer room floor plan and equipment layouts Pre-cooling of returning hot air by incorrect perforated floor tile placement and unsealed cable openings Control sensors and instruments not calibrated Engineering consultants who do not yet understand the unique cooling dynamics of data centers and underfloor air distribution
  • 35. CONSEQUENCES OF THERMAL INCAPACITY The following results are based on detailed measurements in 19 computer rooms totaling 204,400 ft 2 10% of the racks had “hot spots” at the intake air exceeding 77°F/40% Rh This occurred despite having 2.6 times more cooling running than was required by the heat load Rooms with the greatest excess of cooling capacity had the worst % of hot spots 10% of the cooling units had failed
  • 36. Cooling concerns No computer room master plan Failure to measure, monitor, and use installation best practices Mechanical incapacity > Bypass airflow Need for supplemental cooling
  • 37. BYPASS AIRFLOW DEFINITION Escaping through cable cutouts and holes under cabinets Escaping through misplaced perforated tiles Escaping through holes in computer room perimeter walls, ceiling, or floor Conditioned air is not getting to the air intakes of computer equipment
  • 38. Cold air escapes through cable cutouts Escaping cold air reduces static pressure resulting in insufficient cold aisle airflow Result is vertical and zone hot spots in high heat load areas COMPUTER ROOM LAYOUT OPTIONS EFFECT OF BYPASS AIRFLOW
  • 39. RAISED-FLOOR UTILIZATION TRADITIONAL LAYOUT All aisles have elevated “mixed” temperature (starved supply airflow compounds problem) Fails to deliver predictable air intake temperatures Reduces return air temperature which reduces cooling unit capacity
  • 40. TYPICAL BYPASS AIRFLOW CONDITION Reduces kW capacity per rack that can be effectively and predictably cooled.
  • 41. TYPICAL BYPASS AIRFLOW CONDITION (CONTINUED) This unnecessarily large raised-floor opening should be closed. The edges of the cutout must be dressed according to NFPA code.
  • 42. TYPICAL BYPASS AIRFLOW CONDITION (CONTINUED) Unnecessarily large cable cutout under a server rack
  • 43. TYPICAL BYPASS AIRFLOW CONDITION (CONTINUED) Both a bypass airflow problem and a safety hazard.
  • 44. BYPASS AIRFLOW - IS IT A PROBLEM? Based on detailed measurements in 19 computer rooms totaling 204,400 ft 2 Despite 2.6 times more cooling running than was required by the heat load, 10% of racks had air intake temperatures exceeding ASHRAE maximum reliability guidelines (rooms with greatest excess cooling capacity running had worst hot spots) 60% of available cold air is short cycling back to cooling units through perforated tiles in the hot aisle and unsealed cable openings
  • 45. BYPASS AIRFLOW REDUCES RELIABILITY, STABILITY, AND USABLE COOLING CAPACITY Reduces underfloor static pressure Reduces volume of conditioned air coming into the cold aisle Exacerbates problems with underfloor obstructions Creates environment where recirculation of hot exhaust air across the top of racks will occur Reduces kW capacity per rack that can be effectively and predictably cooled
  • 46. INTERNAL RECIRCULATION CAN REDUCE RELIABILITY Utilize blanking plates within cabinets Internal recirculation is also a problem
  • 47. BYPASS AIRFLOW HOW IS IT FIXED? PERIMETER HOLES Use permanently installed firestop materials for conduits, pipes, construction holes, etc., through walls Removable fire pillows for floor or wall cable pass throughs Seal all the holes in the computer room perimeter
  • 48. BEST PRACTICE PERIMETER PENETRATIONS ARE SEALED Good example of fire stopping through a sidewall.
  • 49. BEST PRACTICE PERIMETER PENETRATIONS ARE SEALED (CONTINUED) Excellent fire stopping practices are evident throughout this site.
  • 50. FLOOR OPENINGS ACCEPTABLE CABLE CUTOUT SOLUTIONS Fire pillows Foam sheeting Brush assemblies Seal all raised-floor cable openings plus openings around PDUs and cooling units
  • 51. FLOOR OPENINGS FIRE PILLOW SOLUTION Difficult to achieve an effective level of sealing Often falls to subfloor or is kicked out of way Regular policing is required No static dissipative property for electrostatic char
  • 52. FLOOR OPENINGS FIRE PILLOW EXAMPLES This is one way to prevent air loss. Additional refinement is needed.
  • 53. FLOOR OPENINGS FOAM SHEETING SOLUTION Very labor intensive to achieve good sealing efficiency Every cabling change requires re-cutting foam Often tears, pulls out, or falls to subfloor when cable head is pulled through Requires regular policing Special foam material is required to achieve static dissipation
  • 54. FLOOR OPENINGS FOAM SEALING EXAMPLES Plugging cable opening is a good practice, a better choice of materials would be more appropriate.
  • 55. FLOOR OPENINGS PROBLEMS WITH FOAM SEALING (CONTINUED) The foam in this picture was torn and hanging by a thread. It was pieced back together for the picture. Tearing occurs when the cable head passes through. Foam is typically deformed or missing in 50% to 75% of openings after six months.
  • 56. FLOOR OPENINGS PROBLEMS WITH FOAM SEALING (CONTINUED) Foam sealing has not been reinstalled after re-cabling. Resulting opening allows significant air leakage.
  • 57. FLOOR OPENINGS BRUSH SEALING ASSEMBLIES Most expensive initially, but least life cycle cost because recurring policing labor is not required High sealing effectiveness both initially and after multiple recablings (100% sealing effectiveness in undisturbed opening area) Doesn’t require training or policing Can be static dissipative
  • 58. FLOOR OPENINGS BRUSH SEALING FOR NEW OPENINGS Brush grommet for sealing new holes in floor tiles.
  • 59. FLOOR OPENINGS BRUSH SEALING FOR NEW OPENINGS Brush grommet for sealing new holes in floor tiles.
  • 60. FLOOR OPENINGS BRUSH SEALING FOR EXISTING OPENINGS Separable brush grommet for sealing existing openings in floor tiles.
  • 61. INTERNAL BYPASS AIRFLOW - HOW IS IT FIXED? Install internal blanking plates within cabinets to prevent open RU openings from recirculating hot air exhaust
  • 62. INTERNAL BYPASS AIRFLOW BLANKING PLATE INSTALLATION EXAMPLE Proper use of blanking or filler plates exhibited.
  • 63. Cooling concerns No computer room master plan Failure to measure, monitor, and use installation best practices Mechanical incapacity Bypass airflow > Need for supplemental cooling
  • 64. NEED FOR SUPPLEMENTAL COOLING When the normal cooling system will not handle the load, especially in high density spot situations, supplemental cooling is necessary Dedicated to one cabinet Dedicated to an area of the room
  • 65. SUPPLEMENTAL COOLING OPTIONS In line cooling – horizontal airflow Hot exhaust air drawn through a fan coil unit and cold air blown into the Cold Aisle. Usually a chilled water installation Overhead cooling Fan coil unit sits on top of cabinet or is hung from ceiling. Hot exhaust air drawn through a fan coil unit, using a refrigerant not chilled water, and blown into the Cold Aisle. Back cover cooling Fan coil system replaces the back cover of a cabinet Usually a chilled water system Heat is neutralized before being blown into Hot Aisle Dedicated cabinet Air is recirculated within the cabinet Contains its own fans and cooling coil All have redundant fan systems Only one has redundant cooling capability
  • 66. MORE INFORMATION koldlok.com – Koldlok Products upsite.com – White Papers How to Successfully Cool High-Density IT Hardware Seminar November 1 – 3, Miami, FL November 27 – 29, Santa Fe, NM Check upsite.com for details
  • 67. AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
  • 68. HISTORY OF INNOVATION 1985 MF Tape Racks 1989 Autotrieve, first automated tape racks, first S.A.M. 1991 First high-density tape racks 1992 NER’s first custom server cabinets 1998 Began distributing Cybex 2000 Began distributing NetBotz 2001 Began distributing ServerTech 2002 Introduced R3 2004 Ultimate Core / largest Avocent strategic distribution partner 2005 Launch of services business 2006 Infrastructure consulting, design, build, high-density power and cooling 2006 IT security audit, compliance services
  • 69. INNOVATION ROADMAP Solutions and services for 2006 and beyond 2005 Began factory integration 2005 On-site integration 2006 Data Center Health Check CFD modeling and AdaptivCool cooling solutions Build-out / build-new assessment Enhanced centralized management solutions and training services Project management and implementation 2006 Asset & inventory service Data center construction Enhanced facility monitoring
  • 70. THE LATEST FROM AFCOM SURVEYS “More than two thirds (66%) of 178 AFCOM data center professionals surveyed anticipate they'll have to expand their data centers or turn to outsourcing to meet demands in the next decade.” Source: Information Week, August 2006
  • 71. AFCOM SURVEY SAYS “By 2010, more than half (50%) of all data centers will have to relocate or outsource some applications.”
  • 72. CURRENT STATE OF DATA CENTER COOLING 19 data centers surveyed On average, sites had 2.6× more cooling capacity than IT load Still, 10% of racks were over 77°F 72% of cooling air bypasses racks or mixes before reaching them Source: AFCOM 2006 keynote speech
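A back-of-the-envelope reading of those two numbers shows why surplus capacity and hot racks coexist. If the 72% bypass/mixing figure applied uniformly (our simplifying assumption):

```latex
% Effective cooling delivered to rack intakes, as a multiple of IT load
\[
2.6 \times (1 - 0.72) \approx 0.73
\]
```

Only about 0.73× of the IT load's worth of cold air would reach rack intakes unmixed, i.e. less useful cooling than the load itself despite 2.6× installed capacity.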
  • 73. ARE YOU PREPARED?? “AFCOM's Data Center Institute, a think tank dedicated to improving data center management and operations, predicts that over the next five years, power failures and limits on power availability will halt data center operations at more than 90% of companies.” Source: AFCOM 2006 keynote speech
  • 74. AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
  • 75. CIO’S PARADOX TODAY (SAME MESSAGE FOR 3 YEARS) The CIO’s Paradox in action: what the CIO hears from other business leaders. “You’re a service. Why can’t you respond to our division better? What are your people doing?” & “IT is strategic. How do you, the CIO, set investment priorities?” “I have no trust in IT’s ability to deliver measurable value” & “We need new, better solutions.” “Reduce your budget!” & “Keep our systems running 24x7!”
  • 76. DATA CENTER FAULT TOLERANCE Are these photos from your data center?
  • 77. THERMAL MANAGEMENT - IT’S A REALITY (WE HAVE BEEN USING THIS SLIDE FOR YEARS) Power = heat! New servers: one cabinet = 27,304 BTU/hr! One ton of cooling = 12,000 BTU/hr A fully configured cabinet can produce 35,000+ watts* *IBM BladeCenter H with four 2,900 W power supplies, three chassis per cabinet Projected thermal loads for servers and DASD: 100 W/ft² (2000), 150 W/ft² (2002), 200 W/ft² (2005)
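The conversions behind those slide figures are worth spelling out; this is straightforward unit arithmetic, not new data:

```latex
% Watts <-> BTU/hr <-> tons of cooling
\[
27{,}304\ \text{BTU/hr} \div 3.412\ \tfrac{\text{BTU/hr}}{\text{W}} \approx 8{,}000\ \text{W},
\qquad
27{,}304 \div 12{,}000 \approx 2.3\ \text{tons per cabinet}
\]
\[
35{,}000\ \text{W} \times 3.412 \approx 119{,}400\ \text{BTU/hr} \approx 10\ \text{tons for a fully configured cabinet}
\]
```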
  • 78. CAN YOU GET 5 TONS OF AIR THROUGH ONE PERFORATED TILE?? HOW ABOUT 9?
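With standard tiles, the answer is effectively no. A rough sketch, assuming a 20°F supply-to-return ΔT and the common rule of thumb that an ordinary perforated tile passes roughly 300 to 500 CFM (both assumptions are ours):

```latex
% Airflow required to deliver 5 tons of cooling at a 20 F delta-T
\[
5\ \text{tons} = 60{,}000\ \text{BTU/hr}
\quad\Rightarrow\quad
\text{CFM} = \frac{60{,}000}{1.08 \times 20} \approx 2{,}778
\]
```

That is roughly 6 to 9 ordinary tiles' worth of airflow for 5 tons, and proportionally more for 9, which is why high-density cabinets outrun raised-floor delivery.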
  • 79. HOW WILL YOU GET 6KW OR MORE TO A SINGLE CABINET?? Application-based circuits for a 5.9 kW load: (4) 20 amp, 110 V circuits, (8) redundant; (3) 30 amp, 110 V circuits, (6) redundant; (2) 30 amp, 208 V circuits, (4) redundant; (1) 60 amp, 208 V circuit, (2) redundant
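For sanity-checking which of those options actually covers the load, a minimal sketch, assuming the common NEC practice of loading branch circuits to 80% of breaker rating for continuous duty (the 0.8 factor is our assumption; the circuit options come from the slide):

```python
# Usable power per circuit option at an assumed 80% continuous-load derating.
OPTIONS = [
    (4, 20, 110),   # (count, amps, volts)
    (3, 30, 110),
    (2, 30, 208),
    (1, 60, 208),
]

REQUIRED_KW = 5.9  # application load from the slide

for count, amps, volts in OPTIONS:
    usable_kw = count * amps * volts * 0.8 / 1000.0
    verdict = "meets" if usable_kw >= REQUIRED_KW else "falls short of"
    print(f"({count}) {amps} A / {volts} V -> {usable_kw:.1f} kW usable, {verdict} {REQUIRED_KW} kW")
```

Under that assumption the options deliver roughly 7.0, 7.9, 10.0, and 10.0 kW respectively, so all four cover the 5.9 kW application.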
  • 80. HIGH DENSITY Leads to Complexity (we have been talking about this for years) Manufacturers have their own management platform Higher powered systems require a higher level of management Physical requirements change Supporting Infrastructure is more complex (weight, power, heat, cables)
  • 81. WHAT CAN YOU DO? Implement best practices where you can Focus on high value/high return Choose solutions that allow for standardization & automation Rely on Trusted Advisors to fill the gaps
  • 82. AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
  • 83. BACKGROUND - SOUND FAMILIAR?? Aging data center (7 years) Blades and other high-density equipment implemented, creating power and heat problems Consolidating via VMware Management demanding higher reliability Management will not move the data center
  • 84. DATA CENTER HEALTH CHECK-UP What areas did we investigate? Thermal – airflow delivery, perforated tile placement Cable management – in-rack cabling, overhead ladder racking, sub-floor cable raceway Standardization – cabinet placement & orientation, data center layout, cabinet types Remote access & monitoring – KVM types, environmental monitoring, server room access Fault tolerance – redundant power, CRAC failure
  • 85. ISSUE #1 - THERMALS/HEAT Problem: Heat build-up in server room Leading causes: Sub-floor obstructions – power & data cabling 12” raised floor is insufficient for the size of the server room Perforated tiles incorrectly placed Cabinet placement and orientation do not permit hot-aisle return paths – cabinets take in hot exhaust air
  • 86. HEAT BUILDUP IN SERVER ROOM
  • 87. THERMAL MANAGEMENT RECOMMENDATIONS Clean up under-floor power and data cabling; move toward proper cable management – “data above, power below” Re-work perforated floor tile positions Block cable cutouts with brush grommets Implement hot-aisle/cold-aisle design Install AdaptivCool airflow delivery system to maintain temperatures in and around high-density cabinets
  • 88. RECOMMENDED PERFORATED TILE POSITIONING WITH EXISTING LAYOUT Move this row!
  • 89. RECOMMENDED HOT-AISLE/COLD-AISLE LAYOUT Note the placement of the CRACs relative to the hot aisles. Note that we shut off one of the CRACs!
  • 90. THERMAL PATTERNS OF RECOMMENDED HOT-AISLE/COLD-AISLE LAYOUT Before and after.
  • 91. CABLE MANAGEMENT ISSUE #2 Problem: Power & data cabling causing clutter under the raised floor and inside cabinets Leading causes: No standard for in-rack cable management, causing waterfall cabling and heat buildup No overhead ladder racking No cable raceway under the floor Server depth reaching cabinet capacity
  • 92. CABLE MANAGEMENT RECOMMENDATIONS Segregate power and data via cable management Eliminate swing arms Install patch panels to document rack density and per-port usage Use ladder racking Standardize placement of in-rack power supplies to consolidate and secure power cabling Before and after.
  • 93. STANDARDIZATION ISSUE #3 Problem: Lack of standardization across server room Leading Causes: Cabinets vary in size, shape, and vendor type Row lengths vary, gaps in rows due to tables Rack space unused No cable management standards In-rack power utilizing inefficient voltage/amperage
  • 94. STANDARDIZATION RECOMMENDATIONS Set standards for cabinets Invest in a standard cabinet structure, power, cable mgmt High voltage/high amperage power will increase efficiency/decrease usage Separate power and data Consolidate servers into high density cabinets Arrange like cabinets in rows Remove tables from rows Arrange rows in equal lengths to improve airflow patterns Arrange layout into hot-aisle/cold-aisle design
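The voltage point in the recommendation above is simple Ohm's-law arithmetic. Taking the 5.9 kW figure used earlier purely as an illustration:

```latex
% Same load, two supply voltages
\[
I = \frac{P}{V}: \qquad
\frac{5{,}900\ \text{W}}{110\ \text{V}} \approx 54\ \text{A},
\qquad
\frac{5{,}900\ \text{W}}{208\ \text{V}} \approx 28\ \text{A}
\]
```

Higher voltage means roughly half the current for the same load, hence fewer circuits and whips per cabinet and lower resistive distribution losses.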
  • 95. REMOTE ACCESS & MONITORING ISSUE #4 Problems: Constant foot traffic in the server room due to lack of remote access; current KVM technology uses legacy analog local-access methods Leading causes: Current remote access methods are costly and use up multiple network ports Environmental monitoring is not adequate and does not provide remote notification Central alarm management located off-site Leak detection system inadequate for lakeside location
  • 96. REMOTE ACCESS & MONITORING RECOMMENDATIONS Invest in modern KVM technologies Reduce cabling and port usage Improve access to servers from NOC and remote sites Install and maintain centralized remote access & monitoring solution in-house Reduce foot traffic in and out of the server room Improve security of devices and data Install improved environmental monitoring system Configure to monitor CRAC, UPS, Generator Configure for remote notification services according to user-specified thresholds Install adequate leak detection solution
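As a minimal sketch of the threshold-driven remote notification the recommendation calls for, assuming a hypothetical readings feed (the sensor names, bounds, and check() helper are illustrative, not any specific monitoring product's API):

```python
# Threshold-based alarm check over a dictionary of sensor readings.
THRESHOLDS = {
    "crac_supply_f": (55, 75),   # (low, high) alarm bounds, deg F
    "ups_load_pct": (0, 80),
    "leak_detected": (0, 0),     # any nonzero reading alarms
}

def check(readings: dict) -> list[str]:
    """Return alarm messages for any reading outside its bounds."""
    alarms = []
    for name, (low, high) in THRESHOLDS.items():
        value = readings.get(name)
        if value is not None and not (low <= value <= high):
            alarms.append(f"{name}={value} outside [{low}, {high}]")
    return alarms

# Example: a high CRAC supply temperature triggers a notification.
print(check({"crac_supply_f": 82, "ups_load_pct": 64, "leak_detected": 0}))
```

In practice the alarm list would feed the user-specified remote notification service rather than a print statement.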
  • 97. FAULT TOLERANCE ISSUE #5 Problem: Lack of redundant systems to prevent unscheduled downtime Cabinets utilizing non-redundant power sources Cable clutter putting devices at risk of being disconnected Three-phase power delivery system out of balance CRAC placement inhibits redundancy and recovery from failures (page 16) UPS and generator reaching maximum capacity, no room for growth Risk of lakeside flood requires a more adequate leak detection system and disaster notification services
  • 98. WORST CASE SCENARIO – 15-TON CRAC FAILURE Failure in Current Layout Failure in Hot/Cold Layout
  • 99. FAULT TOLERANCE RECOMMENDATIONS Plan in-rack power around a fully configured cabinet Proper planning will result in a balanced three-phase power delivery system Proper cable management will decrease risk of disconnection of power/data cabling Consistent maintenance of aging CRAC units will extend their lifetime Improved airflow will help improve the efficiency of the units Cooling redundancy during CRAC failures is still not available Install an improved leak detection system Monitor the perimeter of the data center as well as specified points such as chilled-water piping, overhead plumbing, and CRAC units
  • 100. COST BENEFIT ANALYSIS (matrix axes: benefit/return on investment vs. cost/effort) Category 1: Low benefit / low return, low cost / minimal effort Category 2: High benefit / high return, low cost / minimal effort Category 3: Minimal benefit / low return, high cost / maximum effort Category 4: High benefit / high return, high cost / maximum effort
  • 101. AIRFLOW/CABLE MGMT COST/BENEFIT ANALYSIS MATRIX (axes: benefit/return on investment vs. cost/effort) Category 1: Manage cabling inside racks; position perforated tiles in recommended layout; eliminate vendor cable mgmt Category 2: Consolidate/rerun cabling; invest in proper perforated tiles and layout; implement hot/cold layout Category 3: Add/replace CRAC units; add/replace UPS/generator; standardize racks/power Category 4: Standardize racks/power in conjunction with hot/cold layout; install dynamic airflow system
  • 102. REMOTE A&M/FAULT TOLERANCE COST/BENEFIT ANALYSIS MATRIX (axes: benefit/return on investment vs. cost/effort) Category 1: Manage cabling to eliminate risk of disconnections; improve remote access via current methods Category 2: Install modern KVM/remote access & monitoring; maintain/upgrade CRACs; improve leak detection Category 3: Add/replace CRAC/generator; standardize racks/power/data/access; improve leak detection Category 4: Relocate data center; create cooling/disaster recovery plan from scratch; plan power/data/access
  • 103. IN SUMMARY Recommendations will help mitigate the issues Resolutions may not provide room for growth/fault tolerance without the large cost of adding or replacing data center devices (CRAC, UPS, generator) Assistance in implementing a hot-aisle/cold-aisle setup and rack/power standardization will help improve the situation May cause downtime; requires effort toward rearrangement Low cost/minimal effort resolutions may aid in resolving current issues Issues may arise again due to normal growth over the next several years
  • 104. AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
  • 105. BACKGROUND High-visibility name brand with a strong competitor New data center build (renovation) Very old multi-use building near the city center Short time frame from build-out completion to lights-on Well-known consulting firm doing project management
  • 106. ISSUES 30 days to implement the new data center (post-construction) Maintain enterprise standards while meeting the varying requirements of the new data center build Provide best practices at the rack level Maximize efficiencies of standardization and automation Cable management – cabinet placement & orientation Remote access & monitoring – KVM types Fault tolerance – redundant power
  • 107. VARYING POWER REQUIREMENTS BY ROW WAS A PERCEIVED ISSUE FOR THE CUSTOMER Rows at 15 kW, 12 kW, 9 kW, and 5 kW
  • 108. TIME WAS NOT ON THEIR SIDE Build-out had a 90-day window for completion Vendor management was an issue for the project manager
  • 109. THE SOLUTION (FACTORY-INTEGRATED CABINETS) 2 to 4 208 V, 60 A power strips, pre-wired with two power cords, C13 every 4U (total of 14 cords) Run two blue Cat5 copper cables and one green Cat5 copper cable from each server location Two pairs of LC cable to the top of the rack Install a 24-port Cat6 panel and three fiber cassettes for a total of 18 LC ports Keyboard pull-out IP remote console for each cabinet Ganged and leveled at the customer site
  • 110. THE RESULT? Customer saves 8 hours per cabinet of integration time $105,000 in hard-cost savings after integration fees $50,000 savings from the electrical contractor because of fewer runs, with all runs being the same Maintained standards, as every cabinet was the same (look and feel) Funneled many tasks (and vendors) into a single source: one throat to choke Met the 30-day operational deadline
  • 111. AGENDA Problem areas - power Cooling: Major areas that must be addressed when installing high density equipment Growth of power utilization and heat density: impacts, expectations, and reality Cooling concerns History and future of the data center Data center reliability and availability Case study #1: Data center health check Case study #2: Integrated cabinet solution Case study #3: Disaster recovery site
  • 112. BACKGROUND The Katrina disaster 2,000 miles away, a CRAC failure during Indian summer, the temporary unavailability of rental spot coolers, and persistent hotspots spotlighted the need for better thermal management. The site also faced sharply rising energy costs it had not budgeted for; higher electricity bills cut deeply into the IT department’s budget. The technology manager knew he needed a solution to both the thermal and cost problems and evaluated the available options.
  • 113. ISSUES (OPTIONS CONSIDERED) Use hot-aisle/cold-aisle layout – This is the normal solution, but here the location of all the CRACs at one side of the room and the inability to shut down the disaster recovery function’s IT equipment during a move made this option unattractive. Add additional CRAC capacity – The room was almost full, there was no capital budget for buying more CRACs, more CRACs would drive utility bills higher, and the site already had more than enough cooling capacity with its 30-ton units.
  • 114. OBJECTIVES OF THE PROJECT Maintain a higher cooling margin during the summer months, when the environment immediately surrounding the data center exposes it to temperature extremes Fix hotspots in several areas Save money on electric utilities, with payback on any expense in less than a year Allow remote monitoring of the data center’s thermal health Plan for almost certain expansion of the center’s thermal load Put in place a support and maintenance plan that would ensure continued thermal performance
  • 115. IMPLEMENTATION Site audit to inventory and characterize the IT equipment heat sources and the facility; Simulation using CFD (Computational Fluid Dynamics) modeling to predict heat and airflow of the baseline data center; Verification of the CFD model against measurements taken during the audit; Iteration to make improvements, using the model to determine the optimum configuration of passive and active airflow elements
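The audit-model-verify-iterate loop can be sketched schematically. Everything below is a toy stand-in under our own assumptions (a fictional model in which each airflow fix shaves a fixed 2°F off the worst intake temperature), not the real CFD tooling:

```python
# Schematic of the audit -> model -> verify -> iterate workflow.
def run_cfd(layout):
    """Toy model: worst server-intake temp falls as airflow fixes accumulate."""
    return 88.0 - 2.0 * layout["fixes"]

def optimize(target_f=77.0, max_iters=20):
    layout = {"fixes": 0}            # baseline layout from the site audit
    temp = run_cfd(layout)           # model verified against audit measurements
    while temp > target_f and layout["fixes"] < max_iters:
        layout["fixes"] += 1         # reposition tiles / add airflow elements
        temp = run_cfd(layout)       # re-simulate the candidate layout
    return layout, temp

print(optimize())   # -> ({'fixes': 6}, 76.0)
```

The real process ends the same way: once the model meets the target, the chosen configuration is installed and the room is recertified against live measurements.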
  • 117. IMPLEMENTATION (CONTINUED) Installation of a sensor network to monitor changes during the remaining installation; Installation and reconfiguration of passive and active airflow elements; Verification and recertification of room thermal performance No downtime was incurred during any of the project phases.
  • 118. COOLING MARGIN RESULTS The cooling margin of the room was improved by 7°F at the top of the racks.
  • 119. SUMMARY OF RESULTS More dramatically, virtually all server intake temperatures dropped, some by as much as 14°F.
  • 120. SUMMARY OF UTILITY SAVINGS This particular site cannot take full advantage of its potential energy savings today because it has constant-speed components. Facilities executives are considering adding low-cost Variable Frequency Drives (VFDs) in the future to benefit fully.
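The reason VFDs unlock the remaining savings is the standard fan affinity law: fan power scales roughly with the cube of speed. A quick illustration (the 80% speed figure is our assumption):

```latex
% Fan affinity law for power vs. speed
\[
\frac{P_2}{P_1} = \left(\frac{N_2}{N_1}\right)^{3},
\qquad
(0.8)^{3} \approx 0.51
\]
```

Trimming fan speed to 80% of nominal cuts fan power roughly in half, a saving that fixed-speed motors cannot capture.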
  • 121. SUMMARY Data center cooling/power technology has remained virtually unchanged for 30 years, while IT equipment refreshes every 3 years. Power requirements and heat densities are increasing with each new generation. More raw cooling capacity, or adding more low-voltage whips, is rarely the right answer.
  • 122. FINAL WORD Meeting current and near-term power needs with high-voltage/high-amperage power, targeting the available cooling to where it is needed most, and controlling airflow precisely are the sensible approaches that will pay dividends in equipment uptime, energy costs, and real estate.