AI hardware dilemma: Balancing ambition with infrastructure reality
AI teams are navigating a fast-moving hardware landscape with growing pressure to balance performance needs, budget constraints, and long-term scalability.
As demand grows for high-performance computing, businesses are facing difficult choices about what infrastructure they want versus what they can realistically support.
Liquid Web’s AI hardware study surveyed 252 trained AI professionals, including machine learning engineers, data scientists, and AI infrastructure specialists, to uncover the real-world factors shaping GPU and infrastructure choices across industries.
The findings offer insight into how teams are prioritizing, compromising, and adapting as hybrid and cloud solutions become more central to AI operations.
Key findings and what they mean for your AI strategy
- NVIDIA dominates, but many teams skip comparisons.
  - Finding: 68% of teams use NVIDIA for AI workloads.
  - What’s missing: 28% haven’t formally compared alternatives.
  - Recommendation: Don’t rely on the brand alone. Conduct ROI and performance benchmarking to choose the right GPU.
- GPU and budget constraints are delaying progress.
  - Finding: 42% of teams scaled back projects, 39% delayed initiatives, and 14% canceled entirely.
  - Recommendation: Plan hardware investments proactively. Build in buffer budgets and test power/cooling demands early.
- Familiarity beats performance in GPU selection.
  - Finding: 43% choose GPUs based on past experience. Only 37% run performance tests before adoption, and 13% don’t test at all.
  - Recommendation: Standardize internal testing protocols before hardware adoption, especially for mission-critical AI systems.
- Hybrid cloud setups are the new normal.
  - Finding: 58% use hybrid setups, 47% run most AI workloads in the cloud, and 36% plan to increase cloud spending by 20% in 2025.
  - Recommendation: Embrace hybrid models for scalability. Prioritize cloud solutions that integrate with your on-prem infrastructure.
- Power and cooling needs are often underestimated.
  - Finding: 31% regret overlooking power/cooling needs, 30% overpaid due to supply chain issues, and 19% chose GPUs with insufficient VRAM.
  - Recommendation: Factor in operational costs, not just sticker price. Energy efficiency, cooling systems, and GPU longevity can make or break your infrastructure ROI.
- Energy efficiency isn’t yet a priority – but it should be.
  - Finding: 45% say efficiency matters but don’t prioritize it, and only 13% actively optimize for power efficiency.
  - Recommendation: Begin measuring and optimizing energy use now. It’s a sustainability issue, as well as a long-term cost and performance factor.
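The cost-performance benchmarking recommended above can be sketched as a simple ranking of candidates by work done per dollar. This is a minimal illustration, not a real benchmark: the GPU names, throughput figures, and hourly prices below are made-up placeholders that you would replace with your own measured numbers and vendor quotes.

```python
# Minimal cost-performance comparison sketch for GPU selection.
# All throughput and pricing figures are illustrative placeholders,
# not real benchmark results; substitute your own measurements.

candidates = {
    # name: (measured throughput in samples/sec, hourly cost in USD)
    "gpu_a": (1200.0, 2.50),
    "gpu_b": (900.0, 1.60),
    "gpu_c": (1500.0, 3.80),
}

def samples_per_dollar(throughput, hourly_cost):
    """Samples processed per dollar spent (throughput * 3600 s / cost)."""
    return throughput * 3600 / hourly_cost

# Rank candidates from best to worst cost-performance.
ranked = sorted(
    candidates.items(),
    key=lambda kv: samples_per_dollar(*kv[1]),
    reverse=True,
)

for name, (tput, cost) in ranked:
    print(f"{name}: {samples_per_dollar(tput, cost):,.0f} samples per dollar")
```

Note that the cheapest card per hour is not automatically the winner; the metric rewards whichever option processes the most work per dollar, which is why measuring throughput first matters.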
How teams evaluate NVIDIA GPUs against competitors
NVIDIA is the go-to GPU for AI workloads, preferred by 68% of teams. Even with its lead in the market, AI teams are taking a more strategic approach to hardware decisions.
Many are actively evaluating NVIDIA GPUs alongside other vendors to understand what sets each option apart.
More than 1 in 3 teams using NVIDIA GPUs (36%) reported conducting a pricing and ROI analysis to compare against alternative providers. A slightly smaller group (30%) said they performed formal performance testing to benchmark NVIDIA against other options.
However, not all teams have taken a data-driven approach. Over a quarter (28%) admitted they have not formally compared NVIDIA GPUs to competing solutions. That lack of structured validation can be costly, especially when shared, cloud-based GPUs introduce as much as 15–25% performance loss due to virtualization.
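To see why a 15–25% virtualization loss is costly, it helps to translate it into effective cost per unit of work. The sketch below uses hypothetical numbers (the hourly price and bare-metal throughput are example figures, not measurements of any real service) to show how lost throughput inflates the real price of a job.

```python
# How shared-GPU virtualization overhead inflates effective cost.
# Hourly price and baseline throughput are hypothetical examples.

hourly_cost = 3.00             # USD per GPU-hour (example figure)
bare_metal_throughput = 1000.0 # samples/sec on dedicated hardware (example)

def effective_cost_per_msamples(loss_fraction):
    """USD to process one million samples at a given throughput loss."""
    throughput = bare_metal_throughput * (1 - loss_fraction)
    seconds_needed = 1_000_000 / throughput
    return hourly_cost * seconds_needed / 3600

for loss in (0.0, 0.15, 0.25):  # 0%, 15%, 25% virtualization loss
    print(f"{loss:.0%} loss -> ${effective_cost_per_msamples(loss):.2f} per 1M samples")
```

Because cost scales with 1 / (1 - loss), a 25% throughput loss raises the effective price of the same job by a third, even though the hourly rate never changes.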
Performance comparisons aside, the survey also revealed which underlying factors ultimately sway GPU brand preferences.
When asked why they chose their preferred GPU brand, 43% cited familiarity or past experience, making it the most common influence.
Another 40% said they rely on cost-performance analysis, while 37% run internal performance tests as a key part of their evaluation process. Other GPU brand selection factors included cost-efficiency (35%), compatibility with existing models (31%), and availability through hosting or cloud providers (29%).
Notably, 13% of teams adopt GPUs without any pre-deployment testing. In fast-moving sectors like AI, that kind of brand-blind adoption often leads to mismatched infrastructure, especially when generic or consumer-grade GPU hosting lacks the throughput or security that enterprise workloads demand.
“AI teams must evaluate hardware not just on brand reputation, but on formal cost-performance testing and long-term scalability.”
Ryan MacDonald
Chief Technology Officer at Liquid Web
Cloud AI investments expected to grow in 2025
With AI workloads increasing, more teams are turning to the cloud to support them. They’re looking for faster performance, easier scaling, and less infrastructure to manage.
As a result, AI infrastructure spending is expected to grow in 2025.
More than half of AI teams (58%) now use a hybrid hardware setup, blending cloud and on-prem infrastructure. Among those with hybrid setups, nearly half (47%) run most of their AI workloads in the cloud, pointing to a clear shift toward scalable, cloud-based solutions.
However, not all cloud GPUs are created equal. Many public cloud services use fractional or shared GPUs, which sacrifice speed and stability.
Dedicated GPU hosting from a provider like Liquid Web combines cloud flexibility with the control and consistency of single-tenant hardware – ideal for teams that can’t afford to compromise on speed or data integrity.
Fittingly, cloud investments are expected to keep growing. Over 1 in 3 companies (36%) said they plan to increase cloud AI spending by 20% in 2025, and about 1 in 4 (24%) expect to boost it by 50% or more. Industries driving this growth include tech, ecommerce, and finance.
Common hardware challenges and shifting views on efficiency
As AI systems become more resource-intensive, earlier hardware decisions can carry lasting consequences. Here are the most frequently reported challenges and how teams are rethinking efficiency.
About 1 in 3 teams (31%) said underestimating power and cooling needs was their biggest AI hardware challenge. Another 30% cited overpaying due to supply chain issues, 29% regretted not adopting a hybrid cloud/on-prem strategy sooner, and 19% admitted to selecting GPUs with insufficient VRAM.
These missteps come with real consequences. Due to GPU or budget constraints, 42% of teams have scaled back project scope, 39% have delayed projects, and 14% have canceled initiatives altogether.
When it comes to energy use, many teams recognize its growing importance but don’t yet prioritize it. Nearly 1 in 2 (45%) said power efficiency is important but not a main driver in their decisions, while only 13% actively optimize for it. These findings suggest that while sustainability is on the radar, it’s not driving most hardware decisions.
Efficiency isn’t just about energy. It’s also about performance per dollar. That’s why many AI leaders are rethinking hardware needs in terms of total project value.
Dedicated GPU hosting gives teams predictable performance with none of the resource drag caused by shared environments.
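One way to start measuring efficiency as the findings suggest is a performance-per-watt metric alongside performance per dollar. The throughput and power-draw numbers below are hypothetical placeholders, not measurements of any real GPU; in practice you would take the wattage from your own telemetry (for example, a power-draw query on the device).

```python
# Performance-per-watt sketch: a starting point for "measuring and
# optimizing energy use". Throughput and power-draw figures are
# hypothetical placeholders, not measurements of any real GPU.

gpus = {
    # name: (throughput in samples/sec, average power draw in watts)
    "gpu_a": (1200.0, 350.0),
    "gpu_b": (900.0, 220.0),
}

def samples_per_joule(throughput, watts):
    """Work done per joule of energy (samples/sec divided by watts)."""
    return throughput / watts

best = max(gpus, key=lambda name: samples_per_joule(*gpus[name]))

for name, (tput, watts) in gpus.items():
    print(f"{name}: {samples_per_joule(tput, watts):.2f} samples per joule")
print(f"most energy-efficient of these examples: {best}")
```

As with cost-performance, the faster card is not automatically the more efficient one: a slower GPU that draws far less power can deliver more work per joule, which compounds into real savings over a model's training lifetime.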
Balancing cost, preparedness, and performance
While many teams still lean on brand familiarity when choosing GPUs, factors like cost, compatibility, and access are becoming more influential. The rise of hybrid infrastructure and growing cloud investment point to a clear focus on scalability.
Liquid Web’s dedicated GPU servers help teams avoid the performance bottlenecks of fractionalized cloud solutions. With full control, enterprise-grade NVIDIA hardware, and bundled AI tooling like CUDA, TensorFlow, and PyTorch, AI developers can train models faster—without sacrificing privacy, security, or speed.
By learning from common missteps and rethinking how they evaluate hardware, teams can overcome limitations and make smarter decisions that support long-term AI growth.
“Our research shows that skipping due diligence leads to delayed or canceled initiatives—a costly mistake in a fast-moving industry.”
Ryan MacDonald
Chief Technology Officer at Liquid Web
Fair use statement
This content is based on proprietary research conducted by Liquid Web and is shared here under fair use for educational and informational purposes. If you reference any part of this article, please provide proper attribution with a link to the original source.
Share this content
<a href="https://www.liquidweb.com/white-papers/ai-hardware-dilemma/" target="_blank" rel="noopener noreferrer">Liquid Web AI hardware dilemma study</a>
GPU servers optimized for AI workloads
From prototyping to production—run your models on fast, secure, and scalable NVIDIA GPU environments.
