The cheapest on-demand H100 on the public cloud market in 2026 costs roughly $1.99/hour, or about $1.30/hour with a 3-year reservation. The most expensive providers charge $5.00/hour on-demand for the same hardware. That 2.5× on-demand spread is real and persistent, and closing it is the single largest cost optimization available in cloud GPU spend.
This guide ranks where prices are actually lowest by GPU SKU, names specific providers in each tier, and explains the tradeoffs you accept to capture the savings. For the broader rent-side framework: Cloud GPU Pricing. For why prices vary so much: Why GPU Prices Differ 30%+.
TL;DR
Cheapest H100 cloud (May 2026):
- On-demand: $1.99–$2.50/hour at long-tail providers
- Reserved 3-year: $1.30–$1.80/hour at long-tail providers
- Spot/preemptible: $0.80–$1.50/hour for interruption-tolerant workloads
Cheapest A100 cloud: $0.80–$1.10/hour on-demand at long-tail providers (vs $1.50–$3.00 at hyperscalers).
Cheapest H200 cloud: $2.50–$3.50/hour on-demand at long-tail providers (vs $4.50–$7.00 at hyperscalers).
For real-time tracking across all 22+ providers Mercatus monitors, plus historical pricing and provider quality metrics: Mercatus GPU Index.
How “cheapest” actually breaks down by tier
The provider landscape stratifies into four tiers with different cost structures:
| Tier | H100 on-demand $/hr | What you get |
|---|---|---|
| Hyperscaler | $3.50 – $5.00 | Premium ecosystem, enterprise sales motion |
| Specialty | $2.50 – $3.50 | AI-optimized infrastructure, mid-tier reliability |
| Long-tail | $1.99 – $2.50 | Lean operations, regional advantages |
| Decentralized | $1.50 – $2.50 | Aggregated supply, variable reliability |
For pure pricing, long-tail providers consistently win. The tradeoffs are smaller brand recognition, fewer bundled services, and the need to vet reliability case-by-case.
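The arithmetic behind the tier comparison can be sketched in a few lines. This is a minimal illustration, assuming the H100 on-demand ranges from the table above and comparing tiers at the midpoint of each range; the midpoint comparison and the function names are choices made for this example, not part of the GPU Index methodology.

```python
# H100 on-demand ranges in $/hour, copied from the tier table above.
TIERS = {
    "Hyperscaler": (3.50, 5.00),
    "Specialty": (2.50, 3.50),
    "Long-tail": (1.99, 2.50),
    "Decentralized": (1.50, 2.50),
}


def midpoint(price_range):
    """Midpoint of a (low, high) price range."""
    low, high = price_range
    return (low + high) / 2


def savings_vs(baseline, tier):
    """Fractional saving of `tier` relative to `baseline`, compared at range midpoints."""
    return 1 - midpoint(TIERS[tier]) / midpoint(TIERS[baseline])


# Spread between the highest hyperscaler price and the lowest long-tail price.
spread = TIERS["Hyperscaler"][1] / TIERS["Long-tail"][0]

if __name__ == "__main__":
    print(f"Top-to-bottom on-demand spread: {spread:.2f}x")
    for tier in TIERS:
        print(
            f"{tier:14s} midpoint ${midpoint(TIERS[tier]):.2f}/hr, "
            f"{savings_vs('Hyperscaler', tier):.0%} below the hyperscaler midpoint"
        )
```

Comparing midpoints is only one reasonable convention. Comparing the low end of one tier against the high end of another gives a wider gap, which is where the 2.5× figure comes from.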
The cheapest providers by GPU SKU (2026)
H100 — cheapest on-demand
Long-tail and regional providers mostly price at or below $2.50/hour, with a few listings running slightly higher. Specific providers known for aggressive pricing in 2026:
- DataCrunch (Finland-based, EU/global): $1.99–$2.30/hour H100 SXM5
- Vultr (global): $2.00–$2.50/hour
- Hetzner Cloud (Germany, EU): $2.10–$2.50/hour
- Vast.ai (marketplace model): $1.80–$2.40/hour, variable by listing
- Crusoe Cloud (US, North American power): $2.20–$2.80/hour
- Various regional providers in EU, APAC, LATAM: typically $1.99–$2.40/hour
For continuously updated rankings: GPU Index.
A100 — cheapest on-demand
The A100 is one generation behind H100, and pricing reflects it. Cheapest providers in 2026:
- Long-tail and regional providers: $0.80–$1.10/hour
- Specialty (CoreWeave, Lambda): $1.10–$1.50/hour
- Spot pricing at long-tail providers: $0.50–$0.90/hour for interruption-tolerant work
A100 cloud pricing has come down meaningfully through 2025–2026 as the Blackwell ramp shifts demand toward newer hardware. For workloads that don’t need FP8 (most fine-tuning, smaller-model inference), A100 cloud is genuinely cheap.
H200 — cheapest on-demand
The H200 launched in 2024 and reached broader availability in 2025–2026. Cheapest providers:
- Long-tail providers: $2.50–$3.50/hour
- Specialty providers: $3.50–$4.50/hour
H200 supply is tighter than H100 throughout 2026. Long-tail providers offering competitive pricing are worth seeking out. For the H200 buy-vs-rent tradeoff: H200 Buy vs Rent.
Reserved capacity — the cheapest cloud option
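For predictable workloads, reservation drops the effective hourly cost well below on-demand. At the long-tail rates listed in this guide, that works out to roughly 12–20% for a 1-year term and 28–35% for a 3-year term. The cheapest reserved options in 2026: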
| GPU | Long-tail 1yr reserved | Long-tail 3yr reserved | Floor pricing |
|---|---|---|---|
| H100 SXM5 | $1.60 – $2.10/hr | $1.30 – $1.80/hr | $1.30/hr |
| A100 80GB | $0.70 – $0.90/hr | $0.55 – $0.75/hr | $0.55/hr |
| H200 SXM5 | $2.00 – $2.80/hr | $1.80 – $2.50/hr | $1.80/hr |
Reserved 3-year H100 capacity at long-tail providers is the cheapest production-grade GPU access in 2026 short of owning — and lands within 15% of owned-cluster economics with none of the operational burden. For most institutional buyers running predictable workloads, this is the pricing-optimal answer.
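Whether a reservation actually beats on-demand depends on utilization. The sketch below assumes a reservation bills every hour of the term while on-demand bills only the hours you run; real contracts differ, so treat that billing model and the helper names as assumptions for illustration. The rates are the H100 SXM5 floor prices from the tables above.

```python
HOURS_PER_YEAR = 24 * 365  # 8760


def annual_cost_on_demand(rate_per_hour, utilization):
    """On-demand cost for one GPU, billed only for the fraction of hours used."""
    return rate_per_hour * HOURS_PER_YEAR * utilization


def annual_cost_reserved(rate_per_hour):
    """Reserved cost for one GPU, assuming every hour is billed, busy or idle."""
    return rate_per_hour * HOURS_PER_YEAR


def break_even_utilization(reserved_rate, on_demand_rate):
    """Utilization above which the reservation is cheaper than on-demand."""
    return reserved_rate / on_demand_rate


# H100 SXM5 floor prices at long-tail providers, from the tables above.
ON_DEMAND = 1.99
RESERVED_3YR = 1.30

if __name__ == "__main__":
    threshold = break_even_utilization(RESERVED_3YR, ON_DEMAND)
    print(f"Break-even utilization: {threshold:.0%}")
    for utilization in (0.4, 0.7, 1.0):
        on_demand = annual_cost_on_demand(ON_DEMAND, utilization)
        reserved = annual_cost_reserved(RESERVED_3YR)
        cheaper = "reserved" if reserved < on_demand else "on-demand"
        print(
            f"{utilization:.0%} utilized: on-demand ${on_demand:,.0f}/yr, "
            f"reserved ${reserved:,.0f}/yr, cheaper: {cheaper}"
        )
```

Under these assumptions the 3-year reservation only pays off if the GPU is busy about two-thirds of the time or more, which is why reservation suits steady workloads rather than bursty ones.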
Spot pricing — the cheapest absolute option
For interruption-tolerant workloads (batch training, fine-tuning, async processing, research experiments), spot/preemptible pricing cuts cost dramatically:
| GPU | Long-tail spot $/hr | Notes |
|---|---|---|
| H100 SXM5 | $0.80 – $1.50 | Roughly 40–60% below long-tail on-demand |
| A100 80GB | $0.40 – $0.70 | Roughly 35–50% below long-tail on-demand |
| H200 SXM5 | $1.20 – $2.00 | Limited availability |
When spot pricing is the right answer:
- Fine-tuning runs (resumable from checkpoint)
- Batch processing (non-time-sensitive)
- Research experiments (interruption tolerable)
- Async pipelines (queue-based)
When spot pricing is wrong:
- Production user-facing serving (eviction = customer impact)
- Time-sensitive jobs with tight deadlines
- Workloads where eviction would lose unrecoverable state
For interruption-tolerant workloads, spot pricing at decentralized networks (Akash, io.net, Bittensor subnets) sometimes goes even lower than long-tail providers, with corresponding reliability variance.
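Spot's headline rate overstates the saving when evictions force work to be redone. A rough way to account for that, assuming a fixed fraction of billed hours is lost to redone work and restarts, is to divide the spot rate by the useful fraction. The `wasted_fraction` parameter is a simplification introduced for this sketch; in practice it depends on checkpoint frequency and eviction rate, which you would need to measure.

```python
def effective_spot_rate(spot_rate, wasted_fraction):
    """Price per useful GPU-hour when `wasted_fraction` of billed hours is lost
    to evictions (work redone since the last checkpoint, plus restart time)."""
    if not 0 <= wasted_fraction < 1:
        raise ValueError("wasted_fraction must be in [0, 1)")
    return spot_rate / (1 - wasted_fraction)


def break_even_waste(spot_rate, on_demand_rate):
    """Wasted fraction at which spot costs the same per useful hour as on-demand."""
    return 1 - spot_rate / on_demand_rate


# H100 SXM5 at long-tail providers: the top of the spot range against the
# bottom of the on-demand range, the least favorable pairing for spot.
SPOT_HIGH = 1.50
ON_DEMAND_LOW = 1.99

if __name__ == "__main__":
    for waste in (0.0, 0.10, 0.25):
        rate = effective_spot_rate(SPOT_HIGH, waste)
        print(f"{waste:.0%} wasted: ${rate:.2f} per useful GPU-hour")
    limit = break_even_waste(SPOT_HIGH, ON_DEMAND_LOW)
    print(f"Spot stops paying off above {limit:.0%} wasted hours")
```

With frequent checkpoints the wasted fraction is usually small, which is why resumable fine-tuning and batch jobs are the natural fit for spot capacity.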
What you accept for the cheapest pricing
The 50–60% savings from going to long-tail providers come with tradeoffs. Honest assessment:
Things you typically don’t get at long-tail prices:
- Comprehensive cloud platform (S3, BigQuery, IAM ecosystems)
- Enterprise-grade compliance attestations
- Dedicated account managers and solutions engineers
- Multi-region failover with single-vendor coordination
- Bundled ML platforms (SageMaker, Vertex AI equivalents)
Things you do typically get:
- Same physical hardware (H100 SXM5 is H100 SXM5)
- Reasonable reliability (top long-tail providers match specialty tier)
- Standard cloud APIs for compute, networking, storage
- Cross-provider portability (no lock-in)
For most teams using GPUs for actual AI workloads (not deeply integrated with hyperscaler ecosystems), the long-tail tradeoffs are acceptable in exchange for the savings.
If your workload is LLM inference: a different path
If you're renting GPUs specifically to serve LLM inference, there's a separate Mercatus product worth knowing about: Spot Market provides token-level API access to LLMs across providers (similar in shape to OpenRouter). Spot Market is for LLM token consumption, not GPU rental, so it is a different product for a different market. For training, fine-tuning, custom inference, or non-LLM workloads, GPU rental is still the right path; use GPU Index to find the cheapest provider.
How to actually pick the cheapest provider
A practical framework for evaluation:
Step 1: Use GPU Index to identify the 3 cheapest providers for your target SKU and region.
Step 2: Verify reliability metrics: uptime, latency, and customer reviews. Top long-tail providers match the specialty tier; lesser-known operators may have reliability issues.
Step 3: Compare on real cost — egress, storage, support tier, hidden fees. Long-tail providers often have minimal egress charges and fewer hidden fees, compounding their pricing advantage.
Step 4: Match pricing model to workload — reserved for steady, on-demand for variable, spot for interruptible.
Step 5: For inference specifically, evaluate Mercatus Spot Market as an alternative to a single-provider commitment.
For the full evaluation framework: Cloud GPU Pricing pillar.
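Steps 3 and 4 can be sketched as a small comparison script. Everything below is illustrative: the two quotes are hypothetical placeholders rather than real provider rates, the cost model covers only GPU hours, egress, and storage, and the rule that spot takes precedence when a workload is both steady and interruptible is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass
class Quote:
    provider: str                # hypothetical label, not a real provider
    gpu_rate: float              # $ per GPU-hour
    egress_per_gb: float         # $ per GB transferred out
    storage_per_gb_month: float  # $ per GB-month


def monthly_cost(quote, gpu_hours, egress_gb, storage_gb):
    """Step 3: total monthly cost, not just the headline GPU rate."""
    return (
        quote.gpu_rate * gpu_hours
        + quote.egress_per_gb * egress_gb
        + quote.storage_per_gb_month * storage_gb
    )


def rank(quotes, gpu_hours, egress_gb, storage_gb):
    """Quotes sorted from cheapest to most expensive for this workload."""
    return sorted(
        quotes, key=lambda q: monthly_cost(q, gpu_hours, egress_gb, storage_gb)
    )


def pricing_model(interruptible, steady):
    """Step 4: spot for interruptible work, reserved for steady, else on-demand."""
    if interruptible:
        return "spot"
    return "reserved" if steady else "on-demand"


# Hypothetical quotes: a lower GPU rate with paid egress, and a higher GPU rate
# with free egress and cheaper storage.
QUOTES = [
    Quote("provider-a", gpu_rate=2.00, egress_per_gb=0.09, storage_per_gb_month=0.10),
    Quote("provider-b", gpu_rate=2.10, egress_per_gb=0.00, storage_per_gb_month=0.05),
]

if __name__ == "__main__":
    for q in rank(QUOTES, gpu_hours=720, egress_gb=2000, storage_gb=1000):
        total = monthly_cost(q, gpu_hours=720, egress_gb=2000, storage_gb=1000)
        print(f"{q.provider}: ${total:,.2f}/month")
    print(pricing_model(interruptible=False, steady=True))
```

In this made-up case the provider with the higher hourly rate comes out cheaper once egress and storage are counted, which is the reason Step 3 compares total cost rather than the listed GPU price.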
Frequently Asked Questions
What’s the cheapest H100 cloud provider in 2026?
Long-tail providers like DataCrunch, Vultr, Hetzner, and Vast.ai offer H100 on-demand at $1.99–$2.50/hour — roughly 50% below hyperscaler on-demand. Reserved 3-year capacity drops further to $1.30–$1.80/hour. GPU Index tracks live cross-provider pricing.
Are cheap GPU cloud providers reliable?
Top long-tail providers (DataCrunch, Vultr, Hetzner) match specialty-tier reliability while pricing roughly 20–30% below that tier on H100 on-demand. Lesser-known operators vary, so vetting matters. The best long-tail providers are genuinely production-grade; the worst aren’t suitable for production.
Why are some providers so much cheaper?
The 2.5× cross-provider spread reflects sales overhead and margin structures, not infrastructure cost. Long-tail providers run leaner sales operations and accept lower margins for growth. Same hardware, very different cost structures. See Why GPU Prices Differ 30%+.
Should I use spot/preemptible for production?
For most production workloads, no. Spot eviction can break user-facing serving. For batch processing, fine-tuning, and other interruption-tolerant work, yes: spot pricing runs roughly 40–60% below long-tail on-demand for H100, and further below hyperscaler on-demand.
What’s the difference between this article and the Cloud GPU Pricing pillar?
This article ranks specific cheap providers; the Cloud GPU Pricing pillar covers the broader framework, the 4-tier provider landscape, and the structural reasons for cross-provider pricing variance. Both are useful — read this for who’s cheap, the pillar for why and how.
Does Mercatus offer a marketplace for cheap GPU rental?
No. Mercatus doesn't rent GPUs directly. GPU Index tracks cross-provider GPU rental pricing in real time so you can find the cheapest provider — but you transact with the provider directly. Mercatus's marketplace product (Spot Market) is for LLM token consumption, not GPU rental — different market, different product.
Methodology
Pricing data sourced from Mercatus GPU Index, May 2026 cross-provider snapshot. Provider names listed are illustrative of typical pricing in their tier; specific rates vary daily and by region. Last verified: 2026-05-12.
