The Rise of AI Neoclouds:
Your Gateway to Affordable, Scalable AI Innovation
Late on a rainy Tuesday, in a cramped startup office lit by the glow of idle monitors, Anika Patel was at her breaking point. Her small AI company’s breakthrough prototype — a model promising to transform agritech — sat trained to only 80% accuracy.
The culprit was not an algorithmic flaw, but a lack of sheer computing muscle. Traditional cloud giants had no GPUs available or quoted sky-high prices, and her team’s credit card was maxed out from weeks of Amazon Web Services rentals.
At one point, Patel had resorted to a DIY rig of gaming cards in her garage, fans whirring like a jet engine, just to eke out a few more training runs. It wasn’t enough. Then a fellow founder whispered a lifeline: a scrappy new cloud outfit that could rent her dozens of top-tier GPUs immediately, at a fraction of the cost. Skeptical but desperate, Patel logged on.
Hours later, as dawn broke, her model was finally training across a fleet of NVIDIA accelerators — an arsenal she never could have afforded or even accessed on the big-name clouds. The startup’s GPU crisis had found salvation in what’s known as an “AI Neocloud.”
Patel’s plight is hardly unique. In the frenzy of the AI boom, compute has become the defining bottleneck. A global shortage of high-end AI chips — the NVIDIA A100s and H100s of the world — left many startups (and even tech giants) scrambling in 2023 and 2024. The hyperscalers (Amazon, Microsoft, Google) simply couldn’t provision enough GPUs fast enough for the explosion in demand.
Even OpenAI, backed by Microsoft’s billions, ran into capacity roadblocks; Microsoft was forced to sign a multi-billion-dollar deal with CoreWeave, an AI-focused cloud provider, just to secure extra compute for ChatGPT’s voracious training runs. Amid this AI gold rush for computing power, a quirky new breed of cloud companies has emerged from obscurity to meet the demand. These are the AI Neoclouds — specialist cloud providers laser-focused on renting out GPU muscle, often at lower costs and greater availability than the household-name clouds.
Emergence of the AI Neoclouds
Not long ago, names like CoreWeave, Lambda Labs, Nebius, or Crusoe Cloud meant little outside of niche circles. Today, they’re at the forefront of a seismic shift in cloud computing. An AI Neocloud is essentially a GPU-as-a-service provider: a cloud built from the ground up to offer on-demand access to advanced accelerators (GPUs, and sometimes other AI chips) for AI and machine learning workloads. Unlike traditional clouds that offer hundreds of services — from databases to analytics — neoclouds largely stick to raw computing power. Need thousands of GPUs for a week to train a breakthrough language model? That’s their sweet spot. By focusing narrowly on AI infrastructure, these upstarts claim they can deliver better performance, price, and service for AI teams.
The value proposition became clear as generative AI took off. On CoreWeave’s platform, renting an NVIDIA A100 (40GB) GPU costs about $2.39 per hour ($1,200/month). On Microsoft Azure, the same unit runs about $3.40/hour ($2,482/month), and on Google Cloud about $3.67/hour. That’s roughly a 50% cost savings right off the bat on hardware that might be completely booked or backordered on the big clouds. Multiply those differences across clusters of dozens or hundreds of GPUs (which ambitious AI projects routinely require), and the economics become impossible to ignore. These neoclouds don’t burden customers with “enterprise” markups for unused services — you’re paying purely for the GPUs and associated essentials, often with transparent pricing by the hour. For cash-strapped startups or even Fortune 500 innovation labs watching their cloud bills balloon, the appeal is obvious.
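To make the arithmetic concrete, here is a minimal Python sketch using the on-demand rates quoted above. Treat the prices as illustrative; real rates change frequently and vary by region and commitment term.

```python
# Illustrative cluster-cost comparison using the on-demand A100 (40GB)
# rates quoted above; real prices vary by region, term, and availability.
HOURS_PER_MONTH = 730  # average hours in a month

rates_per_gpu_hour = {
    "CoreWeave (neocloud)": 2.39,
    "Microsoft Azure": 3.40,
    "Google Cloud": 3.67,
}

num_gpus = 64  # a mid-size training cluster

for provider, rate in rates_per_gpu_hour.items():
    monthly = rate * num_gpus * HOURS_PER_MONTH
    print(f"{provider:>22}: ${monthly:>10,.0f}/month for {num_gpus} GPUs")

# Gap between the cheapest and priciest option at this cluster size:
cheapest = min(rates_per_gpu_hour.values())
priciest = max(rates_per_gpu_hour.values())
gap = (priciest - cheapest) * num_gpus * HOURS_PER_MONTH
print(f"Monthly gap at {num_gpus} GPUs: ${gap:,.0f}")
```

At 64 GPUs, the hourly difference compounds to roughly $60,000 per month under these assumed rates, which is why the delta is hard to ignore at cluster scale.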
“Companies like CoreWeave participate in a market we call specialty ‘GPU-as-a-service’ providers,” explains Sid Nag, a VP of cloud research at Gartner. “Given the high demand for GPUs, they offer an alternate to the hyperscalers — another route to market and access to those GPUs.” In other words, the neoclouds unlocked a new supply channel for AI compute. Importantly, it’s not just startups taking this route. Nag and others note that even some big tech firms have quietly begun leaning on these alternative clouds when their primary data centers max out. The Microsoft–CoreWeave deal for OpenAI was a loud confirmation of the trend: cloud incumbents turning to upstarts to hedge against shortages and keep their AI missions on track. NVIDIA itself, which sells the coveted chips, has an interest in this diversification. The GPU maker has reportedly given certain neocloud providers preferential access to inventory — effectively greasing the skids for this new market — perhaps to avoid any one hyperscaler becoming too dominant a gatekeeper for AI compute.
Yet these newcomers aren’t simply charity cases of NVIDIA’s favor or short-term capacity plugs. They have architected their services to cater to AI developers in ways hyperscalers can be slow to match. CoreWeave, for instance, built its cloud on a Kubernetes-native platform optimized for multi-GPU distributed training jobs, enabling customers to spin up massive clusters in seconds with minimal ops overhead. Lambda Labs offers multi-GPU instances and even one-click managed Kubernetes for AI, aiming to make scaling large models easier for researchers and companies without deep infra expertise. These providers know their clientele: data scientists, ML engineers, and scrappy innovators who value speed, flexibility, and price transparency over the kind of one-stop-shop depth a generalist cloud provides.
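CoreWeave’s exact platform APIs aren’t detailed here, but the Kubernetes-native pattern described above looks roughly like this generic sketch using the official Kubernetes Python client: a pod that requests eight GPUs through the standard nvidia.com/gpu device-plugin resource. The container image and names are hypothetical, not CoreWeave specifics.

```python
# Generic Kubernetes GPU request, as used on Kubernetes-native GPU clouds.
# A sketch only, not CoreWeave's actual API; image and names are hypothetical.
from kubernetes import client, config

config.load_kube_config()  # assumes a kubeconfig for the target cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="train-job"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="trainer",
                image="example.registry/trainer:latest",  # hypothetical image
                command=["python", "train.py"],
                resources=client.V1ResourceRequirements(
                    # Standard device-plugin resource name for NVIDIA GPUs
                    limits={"nvidia.com/gpu": "8"},
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```

The appeal of this model is that scaling from 8 GPUs to 800 is a scheduling change, not an infrastructure project, which is exactly the pitch these providers make to AI teams without deep ops staff.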
The market has noticed. In 2023–24, investors poured an estimated $20 billion into roughly 25 of these GPU cloud startups. That includes about $8B in equity financing and over $12B in debt from heavyweight institutions like BlackRock and Carlyle. CoreWeave alone raised a headline-grabbing $1.1B round in mid-2024 at a nearly $19B valuation. Its revenue surged from just $16 million in 2022 to a staggering $1.9 billion in 2024 as the AI model frenzy drove unprecedented demand. Lambda Labs became a unicorn with a $320M Series C and later secured a special $500M financing vehicle to fuel expansion. And Nebius, a lesser-known player that emerged from the breakup of Russia’s Yandex, suddenly found itself with $2 billion in cash and a high-end Finnish data center, positioning to triple its capacity to serve European AI customers wary of U.S. or Chinese clouds. Even non-traditional actors jumped in: Voltage Park (backed by crypto billionaire Jed McCaleb) pledged $500M for GPU data centers, and Together AI (which not only rents GPUs but also open-sources AI research) raised over $100M to blend community-driven model development with cloud services.
In short, the AI neocloud boom is real, and it’s reshaping where and how cutting-edge AI gets built. But who are these companies, and where did they come from? The answer is perhaps the most colorful part of this story — many of these neocloud upstarts were born from the ashes of yesterday’s computing mania.
From Crypto Mines to AI Supercomputers
It turns out the 21st century’s first gold rush — the cryptocurrency mining craze — sowed the seeds for today’s AI cloud revolution. Many neocloud providers began as crypto mining operations, hoarding GPUs to mint Ethereum or Bitcoin, only to see their business models crumble when crypto markets cooled or technical shifts (like Ethereum’s move away from mining) rendered their GPU hoards redundant. Rather than sell off expensive hardware at a loss, the savviest of these operators pivoted to the next big computational demand: AI.
The chips that once verified blockchain transactions are now training transformer models.
CoreWeave is a prime example. Founded in 2017, the New Jersey company spent its early years mining Ethereum. When crypto prices gyrated and AI began to heat up, CoreWeave reinvented itself as a cloud provider, repurposing its mining rigs into an AI supercomputing platform. The bet paid off massively — today CoreWeave is the largest of the neoclouds, reportedly managing well over 100,000 GPUs and growing, with long-term contracts locking in 96% of its revenue. During the worst of the GPU shortage, NVIDIA funneled scarce high-end chips to CoreWeave, in part because it saw CoreWeave’s strategy as expanding the overall market for its GPUs. In essence, CoreWeave became an extension of NVIDIA’s distribution, renting H100s by the hour to anyone who could pay — a chip broker of sorts. “It’s kind of like chip arbitrage,” one analyst quipped of CoreWeave’s model. “They buy GPUs from NVIDIA and rent them out at a markup — playing middleman in the AI compute economy”. The arrangement has been incredibly lucrative thus far, though it ties CoreWeave’s fate tightly to NVIDIA’s pipeline and pricing (a point we’ll revisit).
Crusoe Energy Systems, now known for its Crusoe Cloud service, has an even unlikelier origin. Crusoe started out literally in the oilfields — using natural gas flares to power Bitcoin mining rigs. Co-founders Chase Lochmiller and Cully Cavness developed technology to capture wasted methane from oil drilling (which is usually burned off in flares) and convert it to electricity for data centers. Initially, that power fed crypto mining. But as generative AI’s rise created a new hunger for compute, Crusoe saw a greener opportunity: use their mobile, energy-efficient data centers to provide low-cost AI cloud computing while mitigating emissions. They pivoted hard into GPUs for AI, and now Crusoe’s cloud division leases NVIDIA chips just like CoreWeave — except running on otherwise wasted energy. In 2024, Crusoe raised $600M to expand this vision of “clean” AI infrastructure, claiming its flare-gas-powered approach had already avoided 680,000 metric tons of CO₂-equivalent emissions by generating electricity from waste gas.
“As the world’s only carbon-reducing computing company — we invest in…the environmental impact of our products,” says CEO Lochmiller, noting that Crusoe’s mission is to “align the future of computing with the future of the climate by reducing emissions [and] building a democratized AI cloud platform”. In an industry notorious for energy-hungry GPU farms, Crusoe’s approach is a novel merging of climate tech and AI — literally turning pollution into computation. It’s also a clever play to differentiate on sustainability, attracting clients who not only need GPUs but also care about their carbon footprint.
Other neocloud players have similarly colorful backstories. Northern Data, a German company, was a major crypto mining firm that nearly went bust and then snagged a $1.1B lifeline from stablecoin giant Tether to reinvent itself as an AI compute provider. It’s now trying to parlay its massive data center facilities (originally built for mining) into rentable AI supercomputers. Some traditional cloud and hosting companies like Vultr and OVHcloud (OVH) have also pivoted: originally they offered general VPS and hosting services, but seeing the GPU demand spike, they started renting high-end GPUs and branding themselves as AI-friendly clouds. And then there’s Nebius — arguably the poster child for how geopolitical upheaval can spawn an AI cloud. Nebius emerged from the wreckage of Yandex, the Russian tech giant often dubbed “Russia’s Google.” When Yandex’s core business was isolated by geopolitical sanctions, a chunk of its cloud division splintered off into Nebius, complete with a state-of-the-art Finnish data center and a war chest of cash. Now headquartered outside Russia, Nebius is leveraging that infrastructure to offer AI cloud services in Europe and the Middle East, pitching itself as a neutral, well-funded alternative for regions that want to keep data out of U.S. or Chinese hyperscalers. With $2B in cash and no debt, Nebius is ambitiously planning to triple its capacity to around 75 MW of data center power, aiming to host tens of thousands of GPUs to serve “sovereign AI” needs in secondary markets. In effect, Nebius is turning a stranded asset (a top-notch data center that can’t easily be used by its original Russian owner) into a globally available AI cloud resource. It exemplifies a broader trend: regions like Europe, India, and the Middle East — which have lagged in large-scale AI infrastructure — are keen on homegrown or region-specific AI clouds to keep their data local and comply with privacy regulations. Many emerging neoclouds are targeting exactly these markets, setting up GPU farms outside Silicon Valley’s shadow.
So, today’s AI neocloud ecosystem is a strange amalgam of crypto refugees, repurposed data centers, and opportunistic startups, all coalescing around one promise: to deliver affordable, available AI compute. They have the chips (often thanks to yesterday’s booms and busts), they have investor backing, and demand is certainly not in question. But can they truly compete with (or rather, complement) the cloud hyperscalers that currently dominate computing? To answer that, we need to examine both the economics and technology at play — and the formidable frenemy that is NVIDIA.
David vs. Goliath in the Cloud: Neoclouds and Hyperscalers
The rise of neoclouds comes at a fascinating juncture in the cloud market. For over a decade, AWS, Azure, and Google Cloud (and to a lesser extent IBM and Oracle) have enjoyed near-total dominance, investing tens of billions in global infrastructure, data services, and ecosystems. They are the Goliaths — with all the advantages and baggage that implies. The neoclouds are upstart Davids armed with a very specific slingshot: GPUs at lower cost. The two models are not mutually exclusive; indeed, many customers will use them side by side. But they do force a comparison.
In terms of raw pricing, the gap is striking. An Uptime Institute analysis in late 2024 compared the cost of renting NVIDIA’s flagship H100 GPU systems on hyperscalers versus neoclouds. The average on-demand price for a full H100-powered server on a top hyperscaler was about $98 per hour, whereas an equivalent setup on a neocloud averaged only $34 per hour — a 66% savings. Put another way, a company could triple its GPU resources on a neocloud for the same budget it would spend on a big provider. This pricing delta persists even as you scale. Hyperscalers of course offer volume discounts and reserved instance deals, but those often require long commitments. Neoclouds, by contrast, started off with aggressive pricing to lure customers (often operating barely above cost), knowing that many AI practitioners are startups who value elasticity without enterprise negotiation.
Figure: Average cost to rent a high-end GPU server, hyperscaler vs. neocloud, as utilization increases. Neoclouds offer dramatically lower on-demand prices (around $34/h on average, vs. $98/h on big clouds), but building one’s own cluster (“Dedicated”) can beat hyperscalers if usage exceeds ~22% utilization and beat neoclouds above ~66%. In practice, neoclouds are cost-effective for most AI workloads short of ultra-high, steady utilization.
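The break-even thresholds in the figure follow from simple arithmetic: a dedicated cluster costs about the same per hour whether busy or idle, while rented capacity is paid for only when used. A short sketch, assuming an amortized dedicated cost of roughly $22 per server-hour (an assumed input; Uptime’s exact figures aren’t given here), reproduces the ~22% and ~66% thresholds:

```python
# Break-even utilization: dedicated amortized cost vs. pay-per-use rental.
# dedicated_cost_per_hour is an assumed figure chosen to be consistent with
# the thresholds cited above; Uptime Institute's exact inputs are not given.
dedicated_cost_per_hour = 22.0   # amortized $/server-hour, busy or idle
hyperscaler_rate = 98.0          # $/server-hour, paid only while in use
neocloud_rate = 34.0             # $/server-hour, paid only while in use

# Renting costs (rate * utilization) per wall-clock hour; owning costs a
# flat amortized amount. Break-even is where the two curves cross.
for name, rate in [("a hyperscaler", hyperscaler_rate),
                   ("a neocloud", neocloud_rate)]:
    breakeven = dedicated_cost_per_hour / rate
    print(f"Owning beats {name} above {breakeven:.0%} utilization")
# -> roughly 22% vs. the hyperscaler, roughly 65% vs. the neocloud
```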
Why can’t the giants just slash their GPU prices to match? In simple terms, they don’t have to. The big three clouds have captive user bases and a wide, sticky moat of services. Large enterprises already deeply invested in AWS or Azure often find it easier to spin up an overpriced GPU there than to onboard a whole new provider. A single-provider setup means unified APIs, one invoice, integrated security and compliance, and no need to move data around. There’s an inertia and convenience factor that neoclouds can’t match if a team is heavily bought into a specific cloud’s ecosystem. For many organizations, paying somewhat more to use their existing cloud vendor is seen as an acceptable trade-off versus the hassle of contracts, data migration and new tooling that a secondary provider entails. Hyperscalers also aren’t entirely asleep at the wheel on cost — they negotiate bespoke deals with big customers and have started offering long-term commitments that bring prices closer to neocloud levels for those willing to pre-pay.
The result today is a kind of price umbrella: hyperscalers keep prices high for on-demand GPU rentals, which lets neoclouds undercut and still profit. Enterprises sticking with AWS/Azure may be effectively subsidizing the discount that startups get on CoreWeave or Lambda. That dynamic can continue only so long as the big clouds feel no competitive pressure to truly race to the bottom on GPU pricing. For now, they seem content to let the neoclouds serve the “price-sensitive” segment of the market — which includes nearly every AI startup — while they focus on full-stack offerings and their own mega-deals with the Fortune 500. It’s a classic innovator’s dilemma scenario: the incumbents are yielding a niche (albeit a fast-growing one) to specialists, because protecting their broader high-margin business is a bigger priority.
From a performance and capability standpoint, neoclouds have proven they can go toe-to-toe with the big guys for core AI tasks. A GPU is a GPU, after all — a CoreWeave H100 and an AWS H100 deliver the same FLOPs. Some neoclouds have even standardized on high-performance storage and networking tech (like VAST Data’s systems, reportedly used by CoreWeave, Lambda, etc.) to ensure data throughput to feed those GPUs at scale. In certain cases, neoclouds might have the edge in flexibility: for instance, Lambda Labs specializes in offering multi-node GPU training clusters with ultra-fast interconnects on demand, something that can be complex to configure on a generalist cloud. Neoclouds often emphasize features like bare-metal access, custom GPU partitioning, or niche GPU types that hyperscalers don’t yet offer widely. Some, like CoreWeave, even provide managed orchestration tools (their own Kubernetes-based platform) so customers can treat GPU pods like a serverless resource pool, abstracting away the VM-by-VM management. This focus on making massive GPU scale-up/scale-out easy has attracted visual effects studios (for rendering farms), biotech firms for simulations, and of course AI labs doing giant model training — groups for whom tens or hundreds of GPUs in one job is normal. As one analysis put it, “CoreWeave excels in rapid scaling and diverse GPU options…Lambda Labs offers powerful multi-GPU instances for intensive ML tasks”, highlighting how each tries to optimize for heavy workloads.
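For a sense of what such a multi-node job looks like in code, here is a minimal PyTorch DistributedDataParallel sketch with a toy model, not any particular provider’s setup. Launched with torchrun, the same script runs unchanged on one node or many:

```python
# Minimal multi-node data-parallel training loop (toy model, sketch only).
# Launch with, e.g.: torchrun --nnodes=2 --nproc_per_node=8 train_ddp.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")     # reads env vars set by torchrun
    local_rank = int(os.environ["LOCAL_RANK"])  # GPU index on this node
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(1024, 1024).to(f"cuda:{local_rank}")
    model = DDP(model, device_ids=[local_rank])  # syncs gradients across ranks
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    for step in range(10):  # stand-in for a real data loader
        x = torch.randn(32, 1024, device=f"cuda:{local_rank}")
        loss = model(x).pow(2).mean()
        opt.zero_grad()
        loss.backward()     # the cross-GPU all-reduce happens here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

The provider’s job is everything around this script: provisioning the nodes, the fast interconnect, and the scheduler, which is exactly the layer neoclouds compete on.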
However, breadth of services is where hyperscalers still hold a trump card. A typical enterprise application stack might use dozens of cloud services (databases, event queues, monitoring, etc.). Neoclouds generally provide compute only — you get GPUs (and requisite CPU, memory, storage to go with them), maybe some basic storage buckets, but not a rich menu of ancillary services. For AI model training or inference tasks, that’s often fine. But if you’re running a complex microservices architecture, heavy on compliance and integration needs, a neocloud alone might not cut it. For example, a fintech company dealing with sensitive customer data might be uncomfortable moving that data to a tiny vendor that lacks the compliance certifications and audited security of an AWS or Azure. Highly regulated industries (finance, healthcare, government) tend to favor providers with long-standing compliance programs (SOC2, FedRAMP, HIPAA, etc.) which the nascent neoclouds are only beginning to attain, if at all. Moreover, the lack of managed services means if your application needs a message queue or a content delivery network, you’d have to set that up separately or keep those parts on a traditional cloud. “The complexity of management, security, risk and compliance across multiple clouds” is not trivial, as Forrester analyst Lee Sustar notes. He observes that the customers most aggressively embracing neoclouds are typically those who are already multi-cloud — in other words, organizations with enough technical maturity to treat cloud providers somewhat interchangeably and pick the best tool for each job. Those comfortable with a bit of complexity will mix-and-match — say, use a CoreWeave cluster for training and then deploy the trained model on AWS for serving via its global network and managed APIs. But more conservative shops might just stick to a single cloud to avoid cross-platform headaches.
In essence, AI neoclouds shine for what they don’t offer: no-frills raw computing power without the premium of full-service cloud “baggage”. They are lean, mean, GPU machines. This makes them a godsend for certain workloads and a non-starter for others. A good rule of thumb emerging in the industry: use neoclouds for big, concentrated AI jobs (training a model, running a massive batch inference, rendering a movie) and use traditional clouds for the rest of your application environment, especially if it’s something like a SaaS product with dozens of microservices. It’s not an either/or — increasingly it’s both.
Hybrid Cloud Architectures: Best of Both Worlds?
This dynamic has led to a rise in hybrid strategies. Companies are now blending multiple clouds in their AI pipeline: e.g., train on a neocloud, deploy on a hyperscaler. Training an AI model is immensely compute-intensive but relatively self-contained — you need tons of GPUs for a few days or weeks, and then that job is done. Serving that trained model to users (inference) is a different challenge — it’s more continuous, often latency-sensitive, and has to integrate with user-facing applications. Many teams find it elegant to do the training phase on a cost-efficient neocloud, then export the model weights to their preferred big cloud for deployment (where they can use all the surrounding services and global infrastructure to run the product). Case study: A startup could spend $100k to train a new NLP model on Lambda Labs Cloud (saving perhaps 50% vs doing it on Azure), then host the resulting model on Azure in a scalable inference service close to their end-users. The data and model might live on both clouds, but that’s manageable with the right workflow.
Big cloud providers are aware of this pattern and not entirely thrilled — they’d rather keep all that workload (and revenue) on their platform. But it’s a tough proposition when a specialized rival can be so much cheaper for the heavy lifting. We’re seeing responses: AWS, for example, offers Snowball and Outposts (on-premises hardware deployments), which some companies use for local processing and then integrate results back into AWS. Microsoft and Oracle have partnered with NVIDIA to offer NVIDIA DGX Cloud, effectively renting clusters of NVIDIA’s own supercomputers hosted within Azure or Oracle Cloud data centers. These are ways the hyperscalers try to retain the training jobs. Yet even those offerings can come at premium pricing or limited availability, keeping the door open for independent neoclouds to attract customers willing to venture out of the walled garden.
One notable shift is the idea of cloud-agnostic AI tooling. Platforms like Run:AI and Determined.ai (now part of HPE) provide an abstraction layer for managing AI workloads across different clouds. An AI team might use such a tool so that their code can run wherever capacity is cheapest — say, whichever neocloud has a spot instance sale today — without changing the workflow. This kind of meta-orchestration could further erode loyalty to any single cloud for AI, essentially treating GPUs as a commodity marketplace. It’s not far-fetched: SemiAnalysis, an industry research group, tracks over a hundred GPU cloud providers globally (including regional players and marketplaces) competing to rent out the latest chips. In some cases, aggregators are emerging that stitch together capacity from many small operators and present one interface to customers — analogous to how one might buy from multiple electricity suppliers on an open grid.
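In spirit, that meta-orchestration reduces to something like the toy dispatcher below, which routes a job to whichever provider currently quotes the lowest price. The provider names, prices, and submit_job hook are hypothetical and not the actual APIs of Run:AI or Determined:

```python
# Toy cloud-agnostic dispatcher: route a job to the cheapest quoted GPU
# capacity. Provider names, prices, and submit_job are all hypothetical.
from typing import Callable, Dict

def pick_cheapest(quotes: Dict[str, float]) -> str:
    """Return the provider with the lowest current $/GPU-hour quote."""
    return min(quotes, key=quotes.get)

def dispatch(job: dict, quotes: Dict[str, float],
             submit_job: Callable[[str, dict], str]) -> str:
    provider = pick_cheapest(quotes)
    return submit_job(provider, job)  # provider-specific submission hook

if __name__ == "__main__":
    quotes = {"neocloud-a": 1.85, "neocloud-b": 2.10, "hyperscaler": 3.40}
    job = {"image": "trainer:latest", "gpus": 16}
    print(dispatch(job, quotes,
                   lambda p, j: f"submitted {j['gpus']}-GPU job to {p}"))
```

Real orchestration layers add queueing, data locality, and failure handling on top, but the core idea is the same: the workload stops caring which cloud it lands on.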
For now, hybrid cloud (multi-cloud) setups do require more technical chops — handling different consoles, moving data between clouds, etc. This is why Sustar emphasizes that neocloud adoption will be fastest among those who “can handle the complexity of management, security, risk and compliance across multiple clouds”. Startups (with small, nimble teams) ironically often find this easier than giant corporations tied up in red tape. In Patel’s case, her startup scripted their training jobs to pull data from an AWS S3 bucket, run on CoreWeave, then push the results back — a bit of one-time setup that ended up saving them hundreds of thousands of dollars in cloud costs. As tooling improves, expect hybrid cloud AI to become more seamless. The ultimate goal for many is to treat compute like a utility: dynamically use the best source available, whether that’s your own server, a neocloud, or a hyperscaler, much as a smartphone switches between Wi-Fi and cellular networks for the best signal.
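The one-time setup in Patel’s case amounts to a small glue script. A minimal sketch of that S3-in, S3-out pattern, with hypothetical bucket and key names and train() standing in for the real training code, might look like this:

```python
# Sketch of the S3-in, S3-out glue pattern described above: pull training
# data from AWS S3, train on rented GPUs, push artifacts back. Bucket and
# key names are hypothetical; train() stands in for the real training code.
import boto3

s3 = boto3.client("s3")  # credentials from the usual AWS env/config chain

def train(data_path: str) -> str:
    ...  # run the actual GPU training job here; returns path to weights
    return "model.ckpt"

s3.download_file("my-data-bucket", "datasets/train.tar", "/scratch/train.tar")
weights = train("/scratch/train.tar")
s3.upload_file(weights, "my-artifacts-bucket", "runs/latest/model.ckpt")
```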
Talent and Community: Who Will Build This Future?
Amid the excitement over hardware and cloud battles, there’s a quieter challenge looming: people. The demand for AI talent — from data scientists to machine learning engineers and MLOps specialists — far outstrips supply. Recent studies show nearly 700,000 AI-related positions are unfilled in the U.S. alone, and by 2027 as many as half of all global AI jobs could be vacant due to talent shortages. This talent crunch threatens to slow down AI innovation just as the compute bottleneck is being addressed by neoclouds. After all, renting 1,000 GPUs isn’t very useful if you don’t have skilled folks to effectively harness them for training sophisticated models.
Here, too, the neocloud phenomenon is having an impact — in some ways positive, in others complicated. On the positive side, many AI neocloud companies have taken it upon themselves to nurture communities of developers and researchers. Lambda Labs, for example, has long catered to the machine learning community (even before its cloud, it sold deep learning workstation hardware and maintained popular open-source tooling like Lambda Stack for AI software). Lambda hosts community forums and even an “ML News” hub to keep developers in the loop. Their philosophy, as stated by CEO Stephen Balaban, is to “delight AI developers and put AI in everyone’s hands.” They back this up by supporting open-source AI projects — Lambda’s platform hosts chatbots like Lambda Chat that let anyone tinker with open-source models, and they highlight breakthroughs like Meta’s LLaMA and other public models that lower barriers to entry. In 2025, Balaban noted that thanks to open-source models and tools, “AI has become more democratized…Open source accelerates progress…We envision a future of one person, one GPU” — essentially suggesting that widely available compute (which Lambda aims to provide) plus community-driven model development can address the talent gap by empowering more “regular” developers to do extraordinary things.
Neocloud providers also often offer hands-on support and services that can substitute, in part, for in-house expertise. For instance, some advertise that their solutions architects will help optimize your model training run or debug GPU memory errors — support that a generic cloud ticket system might not quickly provide. Community events like online workshops, hackathons, or Slack/Discord groups around these platforms create spaces for less-experienced practitioners to learn from experts. In effect, the neocloud ecosystem is fostering a knowledge network alongside its hardware network. It’s in their interest to do so: the easier they make it for a small startup or researcher to utilize 100 GPUs effectively, the more likely those customers will succeed (and come back for 1,000 GPUs later). We’re even seeing specialized forums (such as Lambda’s “Deep Talk” forums or CoreWeave’s community Slack) where engineers share tips on everything from multi-GPU distributed training to handling spot instance interruptions. This communal approach can’t single-handedly fill the global AI talent shortfall, but it helps unlock existing talent — enabling one good engineer to leverage powerful compute that might have required a whole team in the past.
There is, however, a flip side. The easy availability of massive compute can create an illusion of competency or encourage a brute-force approach over thoughtful innovation. Some industry veterans worry that companies might attempt ambitious AI projects simply by “renting a ton of GPUs” without truly understanding the research problem — effectively burning through cash on cloud bills without the desired results (a scenario perhaps exemplified by Stability AI, which reportedly spent exorbitantly on cloud GPUs and ran into financial troubles when the outputs didn’t justify the cost). In other words, compute is now cheap enough to be accessible, but not so cheap that misuse is painless. The best outcomes arise when strong talent meets strong infrastructure; one without the other tends to falter.
Interestingly, the talent war extends to the neocloud companies themselves. They are vying to hire the limited pool of experts who deeply understand GPU systems, networking at scale, and AI workload optimization. In that sense, they compete not just with each other but with the Microsofts and Googles of the world for those engineers. Some neoclouds have enticed top technical minds from larger firms by offering them a more focused mission (and likely generous equity in a fast-growing startup). There’s a bit of a cult appeal in working on the “Wild West” of AI infrastructure — you might get to design a 10,000-GPU supercluster architecture in a neocloud startup, whereas at a hyperscaler you’d be a cog in a very big machine. The outcome of the talent shortage — whether through training new people or smartly leveraging communities — will influence how far and fast neoclouds can go. If they can empower a wave of developers globally to build AI solutions without needing a PhD or a Fortune 500 budget, their impact on who gets to build the future could be profound.
NVIDIA’s Shadow and the Chip Challenger Future
No discussion of AI compute can avoid the looming presence of NVIDIA — currently the single most critical vendor for all these players. NVIDIA’s GPUs power the vast majority of AI training runs today, and by extension, they power the neoclouds’ businesses. In fact, one could cheekily describe many neoclouds as merely NVIDIA resellers with some value-add on top. CoreWeave’s IPO filings underscored this dependency: “Our business depends on getting the newest and best GPUs from NVIDIA. If NVIDIA decides to prioritize others or sell directly to AI companies, we’re in trouble,” the company essentially admitted. At present, NVIDIA has little incentive to cut out neocloud middlemen — they’re helping create more demand for NVIDIA’s chips and more channels to sell them. Jensen Huang, NVIDIA’s CEO, has publicly praised the growth of these specialized clouds. And tellingly, NVIDIA has invested in several of them (it put $100M into CoreWeave’s funding round and joined Lambda’s recent round). It’s reminiscent of how a gold mining supply company might support many small mining operations during a gold rush — NVIDIA benefits so long as people keep needing shovels, whether they dig on their own land or someone else’s.
However, the future of AI hardware is not a foregone conclusion. NVIDIA’s current dominance — estimated at 70–95% of the AI accelerator market for data centers — will inevitably face challenges. Alternative AI chips are emerging, and the neoclouds could be the very platforms that give those challengers a shot at wider adoption. After all, hyperscalers are making their own silicon (Google’s TPUs, Amazon’s Trainium/Inferentia, Microsoft’s rumored chips) largely to reduce reliance on NVIDIA. But what about independent chip startups? Enter names like Cerebras, Groq, Tenstorrent, Graphcore, SambaNova, and others — each with a radically different architecture aimed at AI workloads.
Take Cerebras Systems, for example. Cerebras built the largest chip in the world — essentially an entire silicon wafer etched as one colossal “Wafer-Scale Engine” (WSE) — to tackle AI training with a single chip that is orders of magnitude bigger than a GPU. The Cerebras WSE boasts 850,000 cores on one device, allowing certain large models to run without needing a multi-GPU cluster at all. This can yield tremendous speedups for specific tasks, and Cerebras markets it as ideal for things like natural language processing and scientific simulations that benefit from its massive shared memory and parallelism. Cerebras even operates a neocloud of its own — Cerebras Cloud — where you can rent time on a Cerebras CS-2 system instead of GPUs. It’s highly specialized — not every AI task will run best on a wafer-scale engine — but for those that do, it’s like having a Formula 1 car instead of a standard racecar. This shows the potential for neoclouds to offer alternative chips as a service. A hyperscaler might be slower to incorporate a startup’s novel chip (though Azure did experiment with Cray and Graphcore offerings), but a neocloud could decide to differentiate by hosting, say, Groq cores or Tenstorrent processors and attract workloads that value those chips’ unique strengths.
Groq, founded by ex-Google engineers, created a novel architecture known as a “Tensor Streaming Processor” that is extremely deterministic and fast for certain matrix operations. It has been touted for low-latency inference, e.g., serving large language model responses quicker than a GPU in some cases. A cloud provider could deploy Groq chips to carve out a niche in high-speed inference services (imagine a cloud that advertises: get your ChatGPT response 2x faster than anywhere else, by using Groq hardware under the hood).
Tenstorrent, backed by figures like Jim Keller (the legendary chip architect) and investors such as Samsung and even Jeff Bezos’s fund, is working on modular AI chips combined with open-source RISC-V CPUs. It recently raised $693M at a $2.6B valuation to challenge NVIDIA’s dominance with more open, licensable designs. Tenstorrent’s strategy includes partnering with cloud providers (there are hints of collaboration with AWS to reduce NVIDIA reliance) — but one can imagine an independent neocloud being an early adopter too, eager to diversify its supply chain.
At the same time, AMD is pushing hard with its MI300 accelerators, and Intel is in the mix with its Gaudi line (via Habana Labs acquisition). In fact, Microsoft’s Azure is already offering AMD’s MI250 instances and plans MI300, partly to keep NVIDIA honest on pricing. Neoclouds, if they can get their hands on these alternatives, might use them to offer even cheaper rates. One might envision a neocloud saying: “NVIDIA H100 at $2.50/hr, or AMD Instinct at $1.80/hr” — tempting some users to try the slightly less popular chip for cost savings. This could gradually chip away (pun intended) at NVIDIA’s stranglehold.
For now, though, NVIDIA remains the linchpin. The year 2024 saw the company’s valuation soar and its sales booked out for many quarters ahead, largely thanks to AI demand. Morningstar’s PitchBook projected that AI neocloud companies would collectively reach about $4 billion in revenue in 2024– a tiny fraction of the overall cloud market, but a figure likely to multiply in coming years. By 2025 or 2026, if that’s $10B+, NVIDIA will still likely be supplying the majority of that hardware. The neoclouds that succeed long-term might be those that can either secure a steady pipeline of NVIDIA GPUs at good prices (not easy when everyone wants them), or smartly incorporate alternative chips to avoid being bottlenecked by one supplier. Some observers caution that the current situation (where demand so outpaces supply) won’t last forever. If the “AI bubble” were to cool or if NVIDIA suddenly floods the market with chips, these neoclouds could find themselves with excess inventory — essentially, too many GPUs and not enough takers. It’s a boom-bust business risk: great when GPUs are scarce (they rent like hotcakes), dangerous if we hit a glut.
Cloud Independence and the Road Ahead
As AI neoclouds scale up and integrate into the fabric of AI development, a philosophical and strategic question hangs over the industry: will this remain a diverse, competitive landscape, or simply become an appendage of the existing cloud empires? The specter of cloud centralization looms. We’ve seen this movie before: innovative startups blaze a trail, only to be acquired or outcompeted by the giants who catch up. There is both optimism and caution in the air.
On one hand, the success of neoclouds has validated a new model of cloud service. It’s quite plausible that by 2030, the term “neocloud” will fade because every cloud will offer similar GPU-centric options, and some of today’s upstarts will themselves be large established players. By numbers alone, the opportunity is massive. The GPU-as-a-service market is projected to reach anywhere from $25–30 billion by 2030 (depending on the analysis source), growing at double-digit CAGRs as AI adoption permeates every industry. AI startups will continue to proliferate globally — not just in Silicon Valley, but in Lagos, Bangalore, and Buenos Aires — and they will need affordable compute. The global south in particular stands to gain from these competitive clouds, as they may offer more affordable access than U.S. or European-based hyperscalers with pricier offerings. We could see regional neocloud champions in South America or Africa catering to local AI firms with pricing in local currency and compliance with local norms. It’s analogous to how telecom had alternatives in different countries before the era of consolidation.
Sustainability will also be a driving force. As concerns about the carbon footprint of AI grow (training one large AI model can emit as much CO₂ as a car does over multiple years of driving), “green compute” could become a selling point. Neoclouds like Crusoe are already capitalizing on this, and by 2030 we may have sustainability-focused clouds that exclusively use renewable energy or recycle hardware. Governments or climate-conscious companies might even mandate that certain workloads run on carbon-neutral infrastructure. This is a space where nimble neoclouds can differentiate faster than the big incumbents bound by legacy data centers. The future of AI chips also means future clouds might not revolve around a single vendor’s GPUs. We might talk about AI clusters of many flavors — GPU, TPU, IPU (Graphcore’s Intelligence Processing Unit), neuromorphic chips, you name it — each optimized for different tasks. And the providers who manage these heterogeneous pools efficiently will lead.
But alongside these hopeful trajectories is the risk of consolidation. The hyperscalers are certainly not oblivious. There’s already a bit of “if you can’t beat ’em, join ’em (or buy ’em)” happening. Microsoft’s deep partnership with CoreWeave (and rumored investment interest) could herald a future where CoreWeave essentially becomes an extension of Azure — an Azure Neocloud, so to speak. If, hypothetically, Microsoft were to acquire a player like CoreWeave, it would gain a huge GPU farm and remove a competitor from the market, but it would also likely spell the end of CoreWeave’s neutral appeal. Similarly, Amazon could decide to scoop up one of the neocloud startups or dramatically cut its own prices to squeeze them. There’s precedent: cloud giants often cut prices or offer new incentives when a niche competitor starts gaining traction in a segment. Google, for instance, might leverage its TPUs (Tensor Processing Units) to offer bargain rates for AI training, undercutting GPU-centric providers (TPUs are Google’s proprietary chips, and they’ve continued advancing them — latest versions like TPUv4 are very powerful). If such tactics intensify, some weaker neocloud players will undoubtedly fold or sell off.
Another consolidation vector is customer concentration. If a neocloud’s capacity gets mostly rented to one or two huge clients (say, an AI research lab or a government project), that provider might end up tailoring itself to that client’s needs and effectively locking out smaller users. We already see a bit of this: CoreWeave’s revenue is largely driven by multi-year “take-or-pay” contracts — 96% of its revenue is locked-in contracts of 2–5 years. That provides stability, but it means a few deep-pocketed customers call the shots. If, for example, OpenAI (via Microsoft) has pre-booked a giant slice of CoreWeave’s future GPUs, then how independent is CoreWeave really? Some fear we might just be funneling money to these neoclouds now, only for them to become as oligopolistic as the big clouds, or to end up acquired by them in a few years — resulting in the same capacity bottleneck or pricing power issues all over again, just under different branding. The optimists counter that even if consolidation happens, the genie is out of the bottle: the idea of specialized clouds and the culture of transparent pricing they introduced will force more openness in the market. Indeed, we have more public benchmarks and knowledge of GPU pricing thanks to neoclouds — they’ve pressured even AWS to publish more info on their GPU performance and cost.
From Patel’s perspective in our opening story, what mattered was that compute access was no longer an ivory-tower privilege. The day she shifted her training to a neocloud, it changed the trajectory of her startup. That ability — for a small team with a big idea to access supercomputer-level resources on a whim — is something technologists have dreamt about for years. We’re closer than ever to it being reality. It’s a profound shift: who gets to build the future might no longer be gated by who has the biggest budget or the fanciest corporate partnership, but increasingly by who has the creativity and grit to leverage these abundant tools. Compute power, much like information on the internet, is trending toward democratization, albeit with bumps on the way.
In a human sense, this is also a story of innovation begetting more innovation. The neocloud founders saw an opening — some coming from failure or decline in other sectors — and seized it to solve a problem that was stifling countless others. Their success has enabled a new wave of AI entrepreneurs, researchers, and dreamers to push the envelope. Think of the scientist in a university in South America who can now rent 50 GPUs overnight to run experiments that previously would have taken months on the school’s limited servers, or the nonprofit in Africa developing an AI for crop disease detection that can train models without blowing their grant on Big Tech cloud fees. As compute becomes more accessible, we’ll hear more voices and see more perspectives shaping AI, not just those in the Silicon Valley echo chamber.
Looking ahead to 2030, the landscape of AI neoclouds will likely be very different, but their influence will be unmistakable. Analysts project the cohort could capture tens of billions in annual revenue by decade’s end, possibly claiming a significant double-digit percentage of AI infrastructure spend. The fastest growth will probably be among AI startups and mid-size enterprises that are born in this era and never think twice about using multiple clouds (they’ll pick whatever cloud gives them the best combo of price and performance for each task). We’ll also see growth in regions outside the U.S./Europe as local neoclouds or joint ventures fill the gap — imagine an “AI cloud for Southeast Asia” or one tailored for Latin America’s language and market needs, providing low-latency access and local support. And of course, sustainable compute will go from a niche to a mainstream requirement, with neoclouds pioneering everything from underwater data centers (for cooling) to 100% renewable-powered GPU farms in deserts and tundras.
Will the hyperscalers still dominate overall cloud spend in 2030? Almost certainly, yes. But it may be a bit like how the electric vehicle market evolved — initially dismissed by incumbents, then taken seriously as upstarts proved demand, and eventually leading the incumbents to significantly change their own models. By proving that a lean, GPU-focused approach works (and is profitable), the neoclouds have forced the broader industry to adapt. Microsoft, Google, and Amazon will doubtless still provide the majority of cloud services, but they might do so in a way that’s more modular, more open, and (hopefully) more cost-transparent for AI workloads, thanks to the competitive fire lit under them.
As we conclude this exploration, it’s worth zooming out to the big-picture philosophy: compute access shapes who gets to innovate. In past eras, it was access to capital or education that determined who could invent new things. In the AI era, you can have brilliant ideas and algorithms, but without sufficient compute, you’ll hit a wall. AI neoclouds are tearing down that wall. They didn’t set out to change the world; they set out to rent GPUs. Yet by doing so, they have lowered a key barrier to entry for the next generation of AI breakthroughs. They’ve turned GPU compute into a ubiquitous utility, much like cloud storage or broadband internet.
There is a new sense of possibility afoot. A small team with modest funds can now train a model that competes with those from trillion-dollar companies, simply by renting what they need on-demand. The creativity unleashed by that leveling of the playing field could bring solutions to long-standing global challenges: better climate modeling from researchers who gain access to supercomputing, new medical AI diagnostics from a startup in a country without massive data centers, or improved translation AI that finally bridges language gaps for billions. As one neocloud founder put it, we may approach a world of “one person, one GPU” — meaning anyone with the curiosity and skill can tap into the computing power necessary to turn their idea into reality.
Of course, the journey is ongoing. The neoclouds will face growing pains, competition, and perhaps some will fail. But even if specific companies rise or fall, the paradigm shift they championed seems here to stay. In the twilight of that rainy Tuesday, Anika Patel didn’t care about paradigms or market forces; she simply rejoiced that her startup’s destiny was back in her hands. The AI revolution may be driven by algorithms and data, but it runs on compute. Thanks to the rise of AI neoclouds, that compute is more affordable and scalable than ever — and increasingly in the hands of those daring enough to use it. In the new era of cloud, the keys to the future are hanging within reach of many more outstretched hands.
Key Citations
- AI Neocloud Playbook and Anatomy — SemiAnalysis
- AI: Rise of the AI data center ‘Neoclouds’ (RTZ #558) — Michael Parekh, Medium
- AI Neocloud — Fabrica Ventures
- The Rise of ‘Neoclouds’: Shaping the Future of AI Data Centers — TLC Creative Technology
- Neoclouds: a cost-effective AI infrastructure alternative — Uptime Institute Blog
- Revolutionizing the Cloud: An In-Depth Review of Neocloud — The USA Promoter, Medium
- Investors’ $20 Billion Bet On The ‘NeoClouds’ Driving The AI Arms Race — Forbes
- The Power of Neoclouds; OpenAI Builds Its Own AI Chip; Nuclear tech Ups AND Down; Next-Gen Data Protection; Hydrogen-Powered Jets Lift Off — The Scenarionist