The AI data center dilemma: Upgrade or start from scratch

Racks of servers inside a data center.
(Image credit: Future)

In today’s data-driven landscape, the demand for AI shows no signs of slowing as it becomes ingrained in everyday tasks for businesses and consumers. A Goldman Sachs report found that AI-focused tech giants will spend over $1 trillion in the coming years, with a significant portion of that capital targeted at data centers, AI infrastructure, and the power grid.

Increased AI investment is driving the need for more AI data centers, which can provide the added computing capacity but require significantly more power than traditional facilities. For example, in 2022, data centers consumed more than 4% of all electricity in the United States. Driven by AI’s power demands, that share is projected to more than double to 9% by 2030, according to the Electric Power Research Institute. Behind this energy consumption are higher-power GPUs deployed in configurations that, at the rack level, will push power consumption from the current sub-50 kW range up to a megawatt by 2030. In light of such forecasts, data center operators face a critical decision to ensure long-term AI readiness: upgrade existing infrastructure or build new facilities from scratch.
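As a rough illustration of that trajectory, a back-of-the-envelope calculation using the figures above (and a hypothetical rack count, not vendor data) shows how rack-level power growth compounds at the facility level:

```python
# Back-of-the-envelope sketch: how rack-level power growth scales a facility's demand.
# Figures are illustrative, based on the ranges cited above (sub-50 kW today, ~1 MW by 2030).

racks = 200                 # hypothetical rack count for a mid-sized facility
rack_kw_today = 40          # typical high-density rack today (kW)
rack_kw_2030 = 1000         # projected AI rack by 2030 (kW, i.e. 1 MW)

it_load_today_mw = racks * rack_kw_today / 1000
it_load_2030_mw = racks * rack_kw_2030 / 1000

print(f"IT load today:   {it_load_today_mw:.1f} MW")   # 8.0 MW
print(f"IT load by 2030: {it_load_2030_mw:.1f} MW")    # 200.0 MW
print(f"Growth factor:   {it_load_2030_mw / it_load_today_mw:.0f}x")  # 25x
```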

Vito Savino

Data center and wireline segment leader for OmniOn Power.

Each option presents unique advantages and challenges. Understanding these pros and cons is essential for making informed choices that align with business goals, budget constraints, and future needs. The factors to consider when deciding whether to update existing technology or start fresh and build a new AI data center include:

– The existing facility’s infrastructure: When updating an existing data center, HVAC systems, power infrastructure, energy distribution systems, and structural loading capability should all be inspected to ensure they can support the increased demand that will be placed on them when upgrading to an AI data center.

– The ability to upgrade to higher voltages, such as 48-volt (V) or even 800V power architecture: Implementing or upgrading to a higher bus voltage architecture can improve thermal management and efficiency while delivering the higher power required by AI servers and IT equipment.

– The potential to leverage three-phase power, conduction cooling, and liquid-immersion cooling: These technologies can help address power and heat challenges related to increased computing capacities and higher-powered networking equipment.

High-capacity requirements for HVAC, power infrastructure, and energy distribution

A data center’s HVAC, power infrastructure, and energy distribution systems must meet high-capacity requirements to support AI’s accelerated computing demands. If these systems are not up to the task, building a new data center from scratch could be a better option than paying for a complete overhaul.

HVAC systems

AI workloads are computationally intensive and generate significant heat. Advanced HVAC systems are crucial for maintaining optimal operating temperatures to prevent overheating, which can lead to data center hardware failures and reduced performance. Proper airflow design, hot aisle/cold aisle configurations, and effective cooling strategies can help ensure that cooling is efficiently directed where it’s needed most. This is particularly important for densely packed server environments typically found in AI data centers.

As AI applications grow in complexity and demand, HVAC systems must be scalable to handle increased heat loads and be flexible to adapt to changing needs. Efficient data center HVAC systems can contribute to lower operational costs, improve sustainability, and reduce a data center’s overall energy footprint.
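To see why heat loads drive HVAC sizing, a simple sensible-heat calculation relates IT power to the airflow needed to remove it. This is a first-order estimate using standard air properties and an assumed temperature rise, not a full airflow or CFD analysis:

```python
# First-order estimate of the airflow needed to remove a given IT heat load with air cooling.
# Assumes all electrical power becomes heat; the temperature rise across the equipment is illustrative.

def required_airflow_m3s(it_power_kw: float, delta_t_c: float = 12.0) -> float:
    rho_air = 1.2        # kg/m^3, air density at typical data center conditions
    cp_air = 1005.0      # J/(kg*K), specific heat of air
    heat_w = it_power_kw * 1000.0
    return heat_w / (rho_air * cp_air * delta_t_c)

for rack_kw in (10, 50, 100):
    flow = required_airflow_m3s(rack_kw)
    print(f"{rack_kw} kW rack -> {flow:.1f} m^3/s (~{flow * 2119:.0f} CFM)")
```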

Power infrastructure

AI workloads require a data center’s power conversion solutions to deliver consistent, reliable energy for optimal operation. Fluctuations in power quality can adversely affect sensitive AI hardware, so power infrastructure must provide clean, stable power to avoid costly downtime. According to a report from PCMag, the financial losses associated with network disruptions are substantial: just one minute of worldwide internet outage would cost the global economy $20 million.

When existing data center systems approach or reach their functional limits, operators will seek innovative ways to serve increased computing capacity and accommodate the associated power demands. Whether upgrading or building new, backup power systems will be essential to ensure continuous operation and data integrity. Backup systems and dual-redundant power sources can provide reliable, uninterrupted service during power outages and fluctuations, reducing the risk of downtime. In addition, well-designed power infrastructures can accommodate the dynamic loads created by AI demands.
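As a simple illustration of how redundancy choices drive backup sizing (a sketch with hypothetical module ratings and load, not a design rule), the number of UPS modules required differs between common N+1 and 2N schemes:

```python
# Sketch: sizing UPS modules for N+1 vs. 2N redundancy for a given critical IT load.
# Module rating and load are hypothetical; real designs also account for growth, derating, and battery runtime.

import math

def ups_modules(it_load_kw: float, module_kw: float, scheme: str) -> int:
    n = math.ceil(it_load_kw / module_kw)    # modules needed just to carry the load
    if scheme == "N+1":
        return n + 1                         # one spare module
    if scheme == "2N":
        return 2 * n                         # a fully duplicated second system
    raise ValueError("unknown scheme")

load_kw, module_kw = 2000, 500               # 2 MW critical load, 500 kW modules (illustrative)
for scheme in ("N+1", "2N"):
    print(f"{scheme}: {ups_modules(load_kw, module_kw, scheme)} x {module_kw} kW modules")
```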

Energy distribution systems

High-capacity data centers need energy distribution systems that efficiently manage and deliver power without significant losses. High-voltage DC power architectures can significantly improve energy efficiency by reducing power losses during transmission.

Direct utility feeds that connect power straight into a data center’s power and networking cabinets can also enhance energy efficiency by minimizing the distance electricity must travel. Additionally, implementing real-time monitoring and control functionality allows operators to proactively manage energy consumption as demand ebbs and flows, optimizing the performance of IT equipment and power systems.
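A minimal sketch of what real-time power monitoring might look like in practice is below. The endpoint URL, JSON fields, and alert threshold are hypothetical placeholders; an actual implementation would use whatever interface the site's DCIM or BMS platform exposes (SNMP, Modbus, Redfish, REST, and so on):

```python
# Minimal sketch of a real-time rack power-monitoring loop.
# The endpoint, response fields, and threshold are hypothetical placeholders.

import json
import time
import urllib.request

POWER_API = "http://dcim.example.internal/api/racks/power"   # hypothetical endpoint
ALERT_THRESHOLD_KW = 45.0                                    # illustrative per-rack limit

def poll_once() -> None:
    with urllib.request.urlopen(POWER_API, timeout=5) as resp:
        readings = json.load(resp)           # expected shape: [{"rack": "A01", "kw": 38.2}, ...]
    for r in readings:
        if r["kw"] > ALERT_THRESHOLD_KW:
            print(f"ALERT: rack {r['rack']} drawing {r['kw']:.1f} kW")

if __name__ == "__main__":
    while True:
        poll_once()
        time.sleep(60)   # poll once a minute
```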

Structural loading considerations

Increases in floor loading, along with roof loading, need to be considered in the overall architecture, as higher power density brings more copper and liquid cooling: direct-to-chip cooling, liquid-cooled busbars, and power supplies. The added weight per unit of floor area will require a reassessment of the loading capabilities of existing data centers.
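As a rough illustration of why loading needs reassessment, the weight a fully loaded liquid-cooled rack places on its footprint can be compared against a floor rating. The rack weight and floor rating below are hypothetical examples, not design values:

```python
# Rough floor-loading check for a single rack footprint; weight and rating are illustrative.

rack_weight_lb = 3500                  # hypothetical fully loaded, liquid-cooled AI rack
footprint_ft2 = (24 / 12) * (48 / 12)  # 600 mm x 1200 mm footprint ~ 2 ft x 4 ft = 8 ft^2
floor_rating_psf = 250                 # example uniform load rating for an older raised floor

load_psf = rack_weight_lb / footprint_ft2
print(f"Rack imposes ~{load_psf:.0f} lb/ft^2 vs. a {floor_rating_psf} lb/ft^2 rated floor")
if load_psf > floor_rating_psf:
    print("Structural reassessment (or load spreading) would be needed.")
```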

When deciding whether to upgrade an existing data center, operators may find that the costs of refurbishing infrastructure and energy distribution systems make building a new facility the most cost-effective strategy.

Upgrading to a 48V bus architecture

To improve efficiency and meet the increased power needs driven by AI and other technologies, operators are now leveraging 48V networking equipment and 48V bus architectures instead of traditional 12V IT equipment and DC buses. The transition to a 48V architecture allows lower current to be run to networking equipment, which in turn helps improve thermal management and efficiency, reduce energy costs, and enable higher-density power delivery. Moving to 48V, until recently merely a consideration but now a necessity, also reduces the number of conversion stages needed to power IT equipment, which can lower the overall cost of power distribution.

As AI drives computing demands higher, 48V systems can better support the high-density power needs of modern servers and networking equipment and allow for easier scaling when new equipment is added. Depending on how easily and cost-effectively power systems and DC bus architectures can be upgraded to 48V versions, building a new data center could be the most cost-effective long-term solution. Even higher voltages are on the way, both to the rack and within the rack.
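The efficiency argument behind these higher voltages comes down to Ohm's law: for the same delivered power, quadrupling the bus voltage cuts current to a quarter and resistive (I²R) losses to roughly one-sixteenth. The short calculation below uses an illustrative load and a hypothetical bus resistance; the same reasoning extends to the 800V distribution architectures mentioned earlier:

```python
# Illustrative comparison of distribution current and I^2*R loss at 12 V vs. 48 V
# for the same delivered power. Load and bus resistance are hypothetical values.

power_w = 3000          # power delivered to one server shelf (illustrative)
bus_resistance = 0.002  # ohms of busbar/cable resistance (hypothetical)

for bus_voltage in (12, 48):
    current = power_w / bus_voltage          # I = P / V
    loss_w = current ** 2 * bus_resistance   # P_loss = I^2 * R
    print(f"{bus_voltage} V bus: {current:.1f} A, {loss_w:.1f} W lost in distribution")

# 12 V: 250 A and 125 W lost; 48 V: 62.5 A and ~7.8 W lost, a 16x reduction.
```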

Addressing power and heat management challenges

Three-phase input power and liquid-immersion cooling are innovative solutions designed to address the power and heat management challenges of small-footprint data centers. If an existing data center can’t support the addition of these solutions, building a new data center may be the best option.

Three-phase input power systems

In data centers, three-phase power systems can help reduce the current needed for the same amount of power, which can lead to lower energy losses. With three-phase systems, the need for larger transformers and distribution solutions can be minimized, helping to save space and simplify power conversion. Three-phase power systems can evenly distribute electrical loads across three conductors, reducing the risk of overheating and allowing for more stable and reliable equipment operation.
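To make the current reduction concrete, the sketch below compares the line current needed to deliver the same power over single-phase and three-phase feeds at the same supply voltage. The load, voltage, and power factor are illustrative assumptions, not figures from the article:

```python
# Line current needed to deliver the same power over single-phase vs. three-phase feeds.
# Load, voltage, and power factor are illustrative assumptions.

import math

power_w = 50_000      # 50 kW load (illustrative)
voltage = 480         # supply voltage (line-to-line for the three-phase case)
pf = 0.95             # assumed power factor

i_single = power_w / (voltage * pf)                  # single-phase line current
i_three = power_w / (math.sqrt(3) * voltage * pf)    # three-phase current per conductor

print(f"Single-phase: {i_single:.0f} A per conductor")   # ~110 A
print(f"Three-phase:  {i_three:.0f} A per conductor")    # ~63 A
```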

Liquid-immersion cooling

Liquid-immersion cooling is particularly effective at handling the high thermal loads generated by densely packed servers in data centers. Liquid immersion can reduce the need for traditional cooling infrastructure, allowing more efficient use of available data center space. By reducing reliance on air cooling systems, liquid-immersion cooling can lower energy consumption and operational costs.
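The underlying advantage is volumetric heat capacity: per unit volume and per degree of temperature rise, liquids absorb orders of magnitude more heat than air, which is why a modest coolant flow can replace very large volumes of moving air. The quick comparison below uses water as the reference liquid; dielectric immersion fluids sit lower than water but still far above air:

```python
# Volumetric heat capacity (density x specific heat) of air vs. water,
# i.e., how much heat each absorbs per cubic meter per degree of temperature rise.

air = 1.2 * 1005      # kg/m^3 * J/(kg*K) ~= 1,200 J per m^3 per K
water = 998 * 4186    # kg/m^3 * J/(kg*K) ~= 4,180,000 J per m^3 per K

print(f"Air:   {air:,.0f} J per m^3 per K")
print(f"Water: {water:,.0f} J per m^3 per K")
print(f"Ratio: ~{water / air:,.0f}x")   # roughly 3,500x
```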

Analyzing specific needs to make the best choice

Whether an operator decides to upgrade an existing data center or build a new one from scratch, the key factors to consider include budget, strategic goals, the condition of existing infrastructure, and the ability to scale. As such, data center operators should thoroughly analyze their circumstances to determine the best path forward. In some cases, a hybrid approach that combines upgrades and new builds to maximize efficiency and capabilities may deliver the best possible ROI.

We’ve featured the best IT infrastructure management service.

This article was produced as part of TechRadarPro’s Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro

Vito Savino

Vito Savino is the data center and wireline segment leader for OmniOn Power.
