The Cheapest GPU Cloud Providers in 2026: Where AI Compute Is Actually Lowest

The cheapest H100 GPU on the public cloud market in 2026 prices at roughly $1.99/hour on-demand and $1.30/hour at 3-year reserved capacity. The most expensive prices the same hardware at $5.00/hour on-demand. That 2.5× spread is real and persistent, and closing it is the single largest cost optimization available in cloud GPU spend.

This guide ranks where prices are actually lowest by GPU SKU, names specific providers in each tier, and explains the tradeoffs you accept to capture the savings. For the broader rent-side framework: Cloud GPU Pricing. For why prices vary so much: Why GPU Prices Differ 30%+.

TL;DR

Cheapest H100 cloud (May 2026):

On-demand: $1.99–$2.50/hour at long-tail providers
Reserved 3-year: $1.30–$1.80/hour at long-tail providers
Spot/preemptible: $0.80–$1.50/hour for interruption-tolerant workloads

Cheapest A100 cloud: $0.80–$1.10/hour on-demand at long-tail providers (vs $1.50–$3.00 at hyperscalers).

Cheapest H200 cloud: $2.50–$3.50/hour on-demand at long-tail providers (vs $4.50–$7.00 at hyperscalers).

For real-time tracking across all 22+ providers Mercatus monitors, plus historical pricing and provider quality metrics: Mercatus GPU Index.

How “cheapest” actually breaks down by tier

The provider landscape stratifies into four tiers with different cost structures:

Tier	H100 on-demand $/hr	What you get
Hyperscaler	$3.50 – $5.00	Premium ecosystem, enterprise sales motion
Specialty	$2.50 – $3.50	AI-optimized infrastructure, mid-tier reliability
Long-tail	$1.99 – $2.50	Lean operations, regional advantages
Decentralized	$1.50 – $2.50	Aggregated supply, variable reliability

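For pure pricing among providers with predictable reliability, long-tail providers consistently win; decentralized networks can list lower, but reliability varies. The tradeoffs at long-tail providers are smaller brand recognition, fewer bundled services, and the need to vet reliability case-by-case.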
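As a rough illustration of the tier table, the sketch below compares each tier's H100 on-demand range against the hyperscaler range. The rates are copied from the table; the function name and the choice to compare low end to low end and high end to high end are illustrative assumptions, not a description of how GPU Index computes anything.

```python
# H100 on-demand ranges in $/hr, copied from the tier table above.
TIERS = {
    "Hyperscaler": (3.50, 5.00),
    "Specialty": (2.50, 3.50),
    "Long-tail": (1.99, 2.50),
    "Decentralized": (1.50, 2.50),
}


def savings_vs(baseline: str, tier: str) -> tuple[float, float]:
    """Percent saved at the low and high end of `tier` relative to `baseline`.

    Assumption: low end is compared to low end and high end to high end.
    Real quotes can land anywhere inside a range.
    """
    base_lo, base_hi = TIERS[baseline]
    lo, hi = TIERS[tier]
    return (
        round(100 * (1 - lo / base_lo), 1),
        round(100 * (1 - hi / base_hi), 1),
    )


for name in TIERS:
    if name != "Hyperscaler":
        low_end, high_end = savings_vs("Hyperscaler", name)
        print(f"{name}: {low_end}% to {high_end}% below hyperscaler")
```

On these figures the long-tail range sits roughly 43% to 50% below the hyperscaler range, which is where the "about half the price" framing comes from.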

The cheapest providers by GPU SKU (2026)

H100 — cheapest on-demand

Long-tail and regional providers typically price at or below $2.50/hour. Specific providers known for aggressive pricing in 2026:

DataCrunch (Finland-based, EU/global): $1.99–$2.30/hour H100 SXM5
Vultr (global): $2.00–$2.50/hour
Hetzner Cloud (Germany, EU): $2.10–$2.50/hour
Vast.ai (marketplace model): $1.80–$2.40/hour, variable by listing
Crusoe Cloud (US, North American power): $2.20–$2.80/hour
Various regional providers in EU, APAC, LATAM: typically $1.99–$2.40/hour

For continuously updated rankings: GPU Index.

A100 — cheapest on-demand

The A100 is one generation behind H100, and pricing reflects it. Cheapest providers in 2026:

Long-tail and regional providers: $0.80–$1.10/hour
Specialty (CoreWeave, Lambda): $1.10–$1.50/hour
Spot pricing at long-tail providers: $0.50–$0.90/hour for interruption-tolerant work

A100 cloud pricing has come down meaningfully through 2025–2026 as the Blackwell ramp shifts demand patterns. For workloads that don’t need FP8 (most fine-tuning, smaller-model inference), A100 cloud is genuinely cheap.

H200 — cheapest on-demand

H200 launched in 2024 and reached broader availability in 2025–2026. Cheapest providers:

Long-tail providers: $2.50–$3.50/hour
Specialty providers: $3.50–$4.50/hour

H200 supply has been tighter than H100 supply throughout 2026, so long-tail providers offering competitive pricing are worth seeking out. For the H200 buy-vs-rent tradeoff: H200 Buy vs Rent.

Reserved capacity — the cheapest cloud option

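For predictable workloads, reservation drops effective hourly cost well below on-demand. Measured against the long-tail on-demand ranges above, the figures below work out to roughly 12–20% off for 1-year terms and 28–35% off for 3-year terms. The cheapest reserved options in 2026: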

GPU	Long-tail 1yr reserved	Long-tail 3yr reserved	Floor pricing
H100 SXM5	$1.60 – $2.10/hr	$1.30 – $1.80/hr	$1.30/hr
A100 80GB	$0.70 – $0.90/hr	$0.55 – $0.75/hr	$0.55/hr
H200 SXM5	$2.00 – $2.80/hr	$1.80 – $2.50/hr	$1.80/hr

Reserved 3-year H100 capacity at long-tail providers is the cheapest production-grade GPU access in 2026 short of owning — and lands within 15% of owned-cluster economics with none of the operational burden. For most institutional buyers running predictable workloads, this is the pricing-optimal answer.
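One way to check whether a reservation pays off is a break-even utilization. A reserved GPU is billed for every hour of the term, while on-demand is billed only for hours used, so reservation wins once the GPU is busy for more than (reserved rate ÷ on-demand rate) of the time. The sketch below applies that to the low-end long-tail H100 figures quoted in this article. It assumes a flat hourly rate, no upfront fee, and 730 hours per month; those are simplifying assumptions, not provider terms from the source.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month


def break_even_utilization(reserved_rate: float, on_demand_rate: float) -> float:
    """Fraction of hours a GPU must be busy for reservation to beat on-demand."""
    return reserved_rate / on_demand_rate


def monthly_cost(rate: float, utilization: float = 1.0) -> float:
    """Monthly cost of one GPU billed at `rate` for `utilization` of the month."""
    return rate * HOURS_PER_MONTH * utilization


# Low-end long-tail H100 rates from this article:
# $1.99/hr on-demand, $1.30/hr 3-year reserved.
threshold = break_even_utilization(1.30, 1.99)
print(f"Reserve if the GPU is busy more than {threshold:.0%} of the time")

# Reserved capacity is billed for the whole month regardless of use,
# so compare it against on-demand at the utilization you actually expect.
for utilization in (0.5, 0.9):
    on_demand = monthly_cost(1.99, utilization)
    reserved = monthly_cost(1.30)
    print(f"{utilization:.0%} busy: on-demand ${on_demand:.2f} vs reserved ${reserved:.2f}")
```

Under these assumptions the break-even sits at about 65% utilization: a GPU that is idle half the time is still cheaper on-demand, while one that is busy 90% of the time is cheaper reserved.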

Spot pricing — the cheapest absolute option

For interruption-tolerant workloads (batch training, fine-tuning, async processing, research experiments), spot/preemptible pricing cuts cost dramatically:

GPU	Long-tail spot $/hr	Notes
H100 SXM5	$0.80 – $1.50	Roughly 40–60% below long-tail on-demand
A100 80GB	$0.40 – $0.70	Roughly 35–50% below long-tail on-demand
H200 SXM5	$1.20 – $2.00	Limited availability

When spot pricing is the right answer:

Fine-tuning runs (resumable from checkpoint)
Batch processing (non-time-sensitive)
Research experiments (interruption tolerable)
Async pipelines (queue-based)

When spot pricing is wrong:

Production user-facing serving (eviction = customer impact)
Time-sensitive jobs with tight deadlines
Workloads where eviction would lose unrecoverable state

For interruption-tolerant workloads, spot pricing at decentralized networks (Akash, io.net, Bittensor subnets) sometimes goes even lower than long-tail providers, with corresponding reliability variance.
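The headline spot rate overstates the saving if interruptions force rework. A minimal sketch of the adjustment, assuming each interruption loses on average half a checkpoint interval of progress plus a fixed, billed restart period: divide the spot rate by the fraction of billed hours that produce useful work. The interruption rate, checkpoint interval, and restart time below are hypothetical inputs, not measurements from any provider.

```python
def effective_spot_rate(
    spot_rate: float,
    interruptions_per_day: float,
    checkpoint_interval_hr: float,
    restart_hr: float,
) -> float:
    """Cost per hour of useful work on spot capacity.

    Assumption: each interruption wastes half a checkpoint interval of
    already-billed progress, plus `restart_hr` of billed time spent
    reloading the checkpoint before useful work resumes.
    """
    lost_hr_per_day = interruptions_per_day * (checkpoint_interval_hr / 2 + restart_hr)
    useful_fraction = 1 - lost_hr_per_day / 24
    if useful_fraction <= 0:
        raise ValueError("job makes no net progress at this interruption rate")
    return spot_rate / useful_fraction


# High-end long-tail H100 spot rate from this article ($1.50/hr), with
# hypothetical inputs: 4 interruptions/day, hourly checkpoints, 15-minute restarts.
rate = effective_spot_rate(1.50, interruptions_per_day=4, checkpoint_interval_hr=1.0, restart_hr=0.25)
print(f"Effective spot cost: ${rate:.2f} per useful GPU-hour")
```

With these inputs, 3 of every 24 billed hours are wasted and the effective rate rises to about $1.71 per useful hour, still below the $1.99 on-demand floor. Checkpointing more often shrinks the gap; a workload that cannot checkpoint at all belongs in the "spot is wrong" list above.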

What you accept for the cheapest pricing

The 50–60% savings from going to long-tail providers come with tradeoffs. Honest assessment:

Things you typically don’t get at long-tail prices:

Comprehensive cloud platform (S3, BigQuery, IAM ecosystems)
Enterprise-grade compliance attestations
Dedicated account managers and solutions engineers
Multi-region failover with single-vendor coordination
Bundled ML platforms (SageMaker, Vertex AI equivalents)

Things you do typically get:

Same physical hardware (H100 SXM5 is H100 SXM5)
Reasonable reliability (top long-tail providers match specialty tier)
Standard cloud APIs for compute, networking, storage
Cross-provider portability (no lock-in)

For most teams using GPUs for actual AI workloads (not deeply integrated with hyperscaler ecosystems), the long-tail tradeoffs are acceptable in exchange for the savings.

If your workload is LLM inference: a different path

If you're renting GPUs specifically to serve LLM inference, there's a separate Mercatus product worth knowing about: Spot Market provides token-level API access to LLMs across providers (similar in shape to OpenRouter). Spot Market is for LLM token consumption, not GPU rental; it's a different product for a different market. For training, fine-tuning, custom inference, or non-LLM workloads, GPU rental is still the right path; use GPU Index to find the cheapest provider.

How to actually pick the cheapest provider

A practical framework for evaluation:

Step 1: Use GPU Index to identify the 3 cheapest providers for your target SKU and region.

Step 2: Verify reliability metrics: uptime, latency, customer reviews. Top long-tail providers match specialty tier; lesser-known operators may have reliability issues.

Step 3: Compare on real cost — egress, storage, support tier, hidden fees. Long-tail providers often have minimal egress charges and fewer hidden fees, compounding their pricing advantage.

Step 4: Match pricing model to workload — reserved for steady, on-demand for variable, spot for interruptible.

Step 5: For inference specifically, evaluate Mercatus Spot Market as an alternative to a single-provider commitment.

For the full evaluation framework: Cloud GPU Pricing pillar.
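Steps 3 and 4 can be sketched as a small comparison script. The provider labels, rates, and workload sizes below are made up for illustration and are not real quotes; the point is only that ranking by total monthly cost can reorder providers relative to ranking by GPU rate alone. The `pricing_model` rule is a simplification of Step 4, and treating an interruptible workload as spot even when it is also steady is a judgment call rather than something the framework prescribes.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    provider: str                # hypothetical label, not a real provider quote
    gpu_rate: float              # $ per GPU-hour
    egress_per_gb: float         # $ per GB transferred out
    storage_per_gb_month: float  # $ per GB-month stored


def monthly_total(q: Quote, gpu_hours: float, egress_gb: float, storage_gb: float) -> float:
    """Step 3: GPU cost plus egress and storage for one month."""
    return (
        q.gpu_rate * gpu_hours
        + q.egress_per_gb * egress_gb
        + q.storage_per_gb_month * storage_gb
    )


def pricing_model(steady: bool, interruptible: bool) -> str:
    """Step 4 as a rule of thumb: spot if interruptible, reserved if steady, else on-demand."""
    if interruptible:
        return "spot"
    return "reserved" if steady else "on-demand"


# Made-up quotes: provider-b has the lower GPU rate but charges for egress.
quotes = [
    Quote("provider-a", gpu_rate=1.99, egress_per_gb=0.00, storage_per_gb_month=0.05),
    Quote("provider-b", gpu_rate=1.90, egress_per_gb=0.09, storage_per_gb_month=0.02),
]
# Hypothetical workload: 8 GPUs for a 730-hour month, 20 TB egress, 5 TB stored.
workload = dict(gpu_hours=8 * 730, egress_gb=20_000, storage_gb=5_000)

ranked = sorted(quotes, key=lambda q: monthly_total(q, **workload))
for q in ranked:
    print(f"{q.provider}: ${monthly_total(q, **workload):,.2f}/month")
print("Suggested pricing model:", pricing_model(steady=True, interruptible=False))
```

In this made-up case the provider with the higher hourly rate comes out cheaper once egress is counted, which is the effect Step 3 is meant to catch.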

Frequently Asked Questions

What’s the cheapest H100 cloud provider in 2026?

Long-tail providers like DataCrunch, Vultr, Hetzner, and Vast.ai offer H100 on-demand at $1.99–$2.50/hour — roughly 50% below hyperscaler on-demand. Reserved 3-year capacity drops further to $1.30–$1.80/hour. GPU Index tracks live cross-provider pricing.

Are cheap GPU cloud providers reliable?

Top long-tail providers (DataCrunch, Vultr, Hetzner) match specialty-tier reliability while pricing roughly 20–30% below the specialty tier on H100 on-demand. Lesser-known operators vary, so vetting matters. The best long-tail providers are genuinely production-grade; the worst aren’t suitable for production.

Why are some providers so much cheaper?

The 2.5× cross-provider spread reflects sales overhead and margin structures, not infrastructure cost. Long-tail providers run leaner sales operations and accept lower margins for growth. Same hardware, very different cost structures. See Why GPU Prices Differ 30%+.

Should I use spot/preemptible for production?

For most production workloads, no. Spot eviction can break user-facing serving. For batch processing, fine-tuning, and other interruption-tolerant work, yes: H100 spot at $0.80–$1.50/hour runs roughly 40–60% below long-tail on-demand.

What’s the difference between this article and the Cloud GPU Pricing pillar?

This article ranks specific cheap providers; the Cloud GPU Pricing pillar covers the broader framework, the 4-tier provider landscape, and the structural reasons for cross-provider pricing variance. Both are useful — read this for who’s cheap, the pillar for why and how.

Does Mercatus offer a marketplace for cheap GPU rental?

No. Mercatus doesn't rent GPUs directly. GPU Index tracks cross-provider GPU rental pricing in real time so you can find the cheapest provider — but you transact with the provider directly. Mercatus's marketplace product (Spot Market) is for LLM token consumption, not GPU rental — different market, different product.

Methodology

Pricing data sourced from Mercatus GPU Index, May 2026 cross-provider snapshot. Provider names listed are illustrative of typical pricing in their tier; specific rates vary daily and by region. Last verified: 2026-05-12.