Google Kubernetes Engine (GKE) is one of the most popular managed Kubernetes services—but its pricing model can be confusing at first. Unlike traditional VM-based billing, GKE introduces two distinct modes—Autopilot and Standard—with completely different pricing approaches, resource controls, and cost implications.
Understanding how you’re charged across control planes, compute resources, networking, and storage is key to avoiding surprise bills and planning cost-efficient clusters.
In this guide, we’ll break down how GKE pricing works in 2025, including:
- What you pay for (and what you don’t)
- Differences between Autopilot and Standard mode
- Pricing examples, discounts, and hidden costs
- When each model makes the most sense
If you’re looking for strategies to lower your GKE bill, we’ve covered that separately in our GKE cost optimization guide. This post focuses specifically on how pricing works—so you can understand your bill before trying to reduce it.
For official rates and the latest regional pricing, check the GKE pricing documentation.
How GKE Pricing Works: The Basics
Google Kubernetes Engine offers two pricing models: Autopilot and Standard. Both give you a managed Kubernetes control plane, but differ in how compute is billed and who manages the underlying infrastructure.
At a high level, here’s what you pay for in GKE:
- A managed control plane (a flat hourly fee in Standard mode; included in Autopilot)
- Compute: either pod resource requests (Autopilot) or the underlying VMs (Standard)
- Persistent storage, networking, and load balancing, billed separately in both modes
Key Differences:
- Autopilot Mode: You’re billed for the CPU, memory, and ephemeral storage that your pods request. Google provisions and manages the nodes behind the scenes.
- Standard Mode: You manage your own node pools and are billed based on VM instance pricing (from Google Compute Engine), regardless of how much your workloads use.
Control Plane Billing
- In Standard Mode, GKE charges a flat rate of $0.10 per hour per cluster (~$72/month).
- In Autopilot Mode, this control plane cost is included in the per-pod pricing.
Why This Matters
- Underutilized pods in Autopilot can lead to wasted spend if requests are oversized.
- In Standard, overprovisioned nodes can lead to unused CPU/RAM that still gets billed.
GKE Pricing Model: Autopilot Mode
Autopilot is GKE’s fully managed mode, designed to abstract away infrastructure concerns. You don’t manage nodes—instead, you define what your workloads need (CPU, memory), and Google provisions and scales the underlying compute for you.
You’re billed only for the resources your pods request, not the VMs they run on.
How Autopilot Pricing Works
You pay for:
- vCPU: Billed per core requested per second
- Memory: Billed per GB requested per second
- Ephemeral storage: Optional, billed by GB per second
- Startup and system overhead: Included automatically per pod
As of early 2025, the approximate US-region rates (used in the example below) are:

| Resource | Approximate rate |
| --- | --- |
| vCPU | $0.055 per vCPU-hour |
| Memory | $0.006 per GiB-hour |
| Ephemeral storage | $0.0003 per GiB-hour |
Google also adds system overhead per pod (usually 180m CPU and 512Mi memory), so your bill includes that baseline even if your container uses less.
Pricing Example
Let’s say you run a pod requesting:
- 1 vCPU
- 2 GiB memory
- 1 GiB ephemeral storage
Your cost per hour:
- vCPU: $0.055 × 1 = $0.055
- Memory: $0.006 × 2 = $0.012
- Storage: $0.0003 × 1 = $0.0003
- Total: ~$0.0673/hour per pod
Multiply that by the number of pods and their uptime, and your costs scale linearly.
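The arithmetic above can be sketched as a small helper. The rates are the illustrative US-region figures from this example, not authoritative prices:

```python
# Estimate the hourly Autopilot cost of a single pod, using the
# illustrative US-region rates from the example above.
VCPU_PER_HOUR = 0.055      # $ per requested vCPU-hour (example rate)
MEM_PER_HOUR = 0.006       # $ per requested GiB-hour (example rate)
STORAGE_PER_HOUR = 0.0003  # $ per requested ephemeral GiB-hour (example rate)

def autopilot_pod_hourly_cost(vcpu: float, mem_gib: float, storage_gib: float = 0.0) -> float:
    """Hourly cost of one pod, based purely on its resource requests."""
    return vcpu * VCPU_PER_HOUR + mem_gib * MEM_PER_HOUR + storage_gib * STORAGE_PER_HOUR

# The pod from the example: 1 vCPU, 2 GiB memory, 1 GiB ephemeral storage
cost = autopilot_pod_hourly_cost(1, 2, 1)
print(f"${cost:.4f}/hour")  # ~$0.0673/hour, matching the example
```

Because Autopilot bills requests rather than usage, the same function also shows why oversized requests translate directly into higher bills.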
Pros
- No need to manage nodes or autoscaling
- Control plane is included in the price
- Scales down to zero cleanly for idle workloads
- Great for dev teams who want fast onboarding
Tradeoffs
- No support for Spot VMs
- Less control over node-level settings
- Can be more expensive if pod requests are overestimated
GKE Pricing Model: Standard Mode
Standard mode gives you full control over your cluster’s infrastructure. You manage the node pools, instance types, autoscaling, and lifecycle. That means more flexibility—but also more responsibility for cost management.
Unlike Autopilot, you’re billed for the actual VMs (nodes) running your workloads, not for pod-level resource requests.
How Standard Pricing Works
You pay for:
- Control plane: $0.10/hour per cluster (~$72/month)
- Compute resources: Based on Google Compute Engine (GCE) pricing for the VM types you choose
- Storage and networking: Charged separately
You can run your workloads on any GCE instance type:
- General-purpose (e.g., e2, n2)
- High-memory, high-CPU, or custom
- Spot and preemptible instances for cost savings
- GPU instances for ML/AI workloads
Example: a node pool of 3× n2-standard-4 VMs (4 vCPU and 16 GiB of memory each) is billed at standard GCE on-demand rates for as long as the nodes run, whether or not your pods use the capacity.
Add $72/month for the control plane, plus any attached disks, snapshots, or load balancers.
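A rough sketch of how this adds up, assuming an on-demand rate of about $0.19/hour for n2-standard-4 (a ballpark US-region figure, not an official price; check current GCE pricing for your region):

```python
# Rough monthly cost of a Standard-mode node pool.
# NODE_HOURLY is an assumed on-demand rate for n2-standard-4
# (~$0.19/hour in US regions; an assumption, varies by region).
NODE_HOURLY = 0.19          # $ per node-hour (assumption)
CONTROL_PLANE_MONTHLY = 72  # $ flat cluster fee (~$0.10/hour)
HOURS_PER_MONTH = 730

def standard_pool_monthly_cost(nodes: int) -> float:
    """Monthly cost: every node-hour is billed, used or not."""
    return nodes * NODE_HOURLY * HOURS_PER_MONTH + CONTROL_PLANE_MONTHLY

print(f"${standard_pool_monthly_cost(3):,.0f}/month")  # 3 nodes + control plane
```

Note that the nodes are billed whether they are 10% or 100% utilized, which is exactly where Standard-mode overprovisioning turns into waste.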
Pros
- Full access to GCE features: custom VMs, Spot, GPUs
- Greater flexibility for tuning workloads and autoscaling
- Supports cluster-level optimizations and cost-aware scheduling
Tradeoffs
- You’re responsible for provisioning, scaling, and optimizing node pools
- Overprovisioning = wasted spend
- Can lead to underutilized resources if not monitored
GKE Free Tier
GKE offers a modest free tier, but it only applies to Standard mode clusters — not Autopilot.
What’s Included
- You get $74.40/month in credits per billing account
- This fully covers the control plane fee for one zonal Standard cluster
- No credits are applied to compute (VMs), storage, or networking
Example: If you run a single zonal cluster in Standard mode, your $0.10/hour control plane charge is fully offset by the free tier credit. If you spin up additional clusters or use regional clusters, you’ll start accruing charges.
What’s Not Included
- Autopilot clusters are excluded — you’ll be billed from the first pod
- Any node usage (even in Standard) is still billed
- Free tier does not stack with committed or sustained use discounts
If you’re starting with GKE for dev/test purposes, the free tier can cover the base cost of one cluster—but actual workload usage is always billed separately.
Pricing Add-ons and Gotchas
Beyond compute and control plane fees, GKE bills you for a range of supporting services. These often go unnoticed until they show up in your invoice.
Persistent Storage
- Billed separately via Google Persistent Disks
- Standard PD: ~$0.04/GB/month
- SSD PD: ~$0.17/GB/month
- Snapshots are charged incrementally per GB
Network Egress
- Charged based on destination and traffic volume
- Traffic within the same zone is free
- Egress to other GCP regions or the internet can get expensive
- Pricing tiers apply (e.g., $0.12/GB for internet egress in the US)
Load Balancers
- Google Cloud Load Balancers are billed based on forwarding rules, capacity, and data usage
- Expect ~$0.025/hour per rule + bandwidth fees
Backups & Snapshots
- Not part of GKE pricing directly, but often used with workloads
- Backups via GKE Backup or Velero incur storage and API usage costs
Observability
- GKE offers built-in monitoring and logging with Cloud Operations
- Logging beyond the free tier (50 GiB/month) is billed per GiB
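Taken together, these add-ons can be ballparked with the approximate rates quoted above (all figures are the rough US-region numbers from this section, not official prices):

```python
# Ballpark the "hidden" monthly add-on costs, using the approximate
# US-region rates quoted in this section (not official prices).
PD_STANDARD_GB = 0.04   # $/GB/month, standard persistent disk
PD_SSD_GB = 0.17        # $/GB/month, SSD persistent disk
EGRESS_GB = 0.12        # $/GB, internet egress (US tier)
LB_RULE_HOURLY = 0.025  # $/hour per load balancer forwarding rule
HOURS_PER_MONTH = 730

def addon_monthly_cost(pd_std_gb=0, pd_ssd_gb=0, egress_gb=0, lb_rules=0):
    """Sum the monthly cost of the supporting services listed above."""
    return (pd_std_gb * PD_STANDARD_GB
            + pd_ssd_gb * PD_SSD_GB
            + egress_gb * EGRESS_GB
            + lb_rules * LB_RULE_HOURLY * HOURS_PER_MONTH)

# e.g. 100 GB standard PD, 200 GB internet egress, 1 forwarding rule
print(f"${addon_monthly_cost(pd_std_gb=100, egress_gb=200, lb_rules=1):.2f}/month")
```

Even this modest example lands in the tens of dollars per month, which is why these line items surprise teams who budgeted only for compute.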
Pro tip
Don’t just look at monthly totals—use Kubernetes cost monitoring to track per-pod or per-team costs and catch inefficiencies early. Idle nodes, oversized requests, and underutilized volumes are some of the most common budget leaks in GKE clusters.
To stay on top of these patterns, many teams use Kubernetes cost optimization tools that offer granular visibility and automate resource tuning based on real usage—helping prevent waste before it impacts your bill.
GKE Discounts and Commitments
Google Cloud offers several ways to reduce costs if you have predictable usage or can tolerate interruptions.
Sustained Use Discounts (SUDs)
- Applied automatically on eligible GCE VMs
- Kick in once an instance runs for more than 25% of the billing month
- Can reduce compute costs by up to 30%
- Applies to Standard mode only (not Autopilot)
Committed Use Discounts (CUDs)
- Reserve specific VM types and usage levels for 1 or 3 years
- Up to 70% discount vs pay-as-you-go pricing
- Available for most node types, including GPU VMs
- Best for production workloads with steady usage patterns
Spot VMs
- Temporary, interruptible VMs — up to 90% cheaper
- Ideal for stateless or batch workloads
- Only available in Standard mode
- Not supported in Autopilot
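To see how much these options matter, here is a sketch comparing a node's monthly cost under each model, using the maximum discounts cited above as illustrative figures (real discounts vary by machine type, region, and commitment term; the $100 baseline is a placeholder):

```python
# Compare one node's monthly cost under each pricing option, using the
# maximum discounts cited above as illustrative (real rates vary).
ON_DEMAND = 100.0  # $ per node per month (placeholder baseline)

options = {
    "on-demand": ON_DEMAND,
    "sustained use (up to 30% off)": ON_DEMAND * (1 - 0.30),
    "committed use (up to 70% off)": ON_DEMAND * (1 - 0.70),
    "spot (up to 90% off)": ON_DEMAND * (1 - 0.90),
}
for name, cost in options.items():
    print(f"{name}: ${cost:.2f}/month")
```

The spread between on-demand and Spot is the main reason cost-sensitive teams accept Standard mode's extra operational burden.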
Node Pool Strategy
You can isolate different workload types into separate node pools:
- Use CUDs for steady workloads
- Use Spot VMs for batch or bursty jobs
- Use labels and taints to control scheduling
Pricing Example: Autopilot vs Standard
To understand how GKE pricing plays out in real workloads, let’s compare the cost of running the same app in both Autopilot and Standard modes.
Example Workload
Let’s say you run 4 pods, each requesting:
- 500m vCPU
- 1 GiB memory
- Running 24/7 for 30 days
Autopilot Mode Pricing
In Autopilot, you’re charged for requested resources, including system overhead per pod (~180m vCPU + 512Mi memory):
Effective pod request:
- vCPU: 0.5 + 0.18 = 0.68 vCPU
- Memory: 1 + 0.5 = 1.5 GiB
Pricing breakdown per pod:
- vCPU: 0.68 × $0.055 = $0.0374/hour
- Memory: 1.5 × $0.006 = $0.009/hour
- Total: ~$0.0464/hour → ~$33.41/month per pod (720 hours)
Multiply by 4 pods → ~$133.64/month
✅ Control plane is included
❌ No discount options (Spot or CUDs)
Standard Mode Pricing
In Standard mode, let’s assume a 3-node pool of e2-standard-2 VMs (2 vCPU, 8 GiB RAM each), running continuously. At on-demand rates, the nodes account for roughly $220/month of the total.
Add $72/month for the control plane → ~$292/month
However:
- You can run other workloads on the same nodes
- You could cut costs using Spot VMs or CUDs
Summary: Monthly Cost Comparison

| Mode | Approx. monthly cost | Notes |
| --- | --- | --- |
| Autopilot | ~$134 | 4 pods, control plane included, no Spot or CUD discounts |
| Standard | ~$292 | 3× e2-standard-2 nodes + $72 control plane; discounts available |

In Autopilot, pricing scales tightly with pod resource requests. In Standard mode, savings depend on how well you pack workloads into your nodes — and whether you optimize with discounts.
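The two monthly figures can be reproduced with a short calculation (30 days = 720 hours; the Autopilot rates are the illustrative figures used earlier, and the per-node Standard rate is back-derived from the ~$292 total rather than being an official GCE price):

```python
# Recompute the two monthly totals from the comparison above.
HOURS = 720  # 30 days

# Autopilot: 4 pods, each billed for its request plus system overhead
vcpu = 0.5 + 0.18                        # effective vCPU request per pod
mem = 1.0 + 0.5                          # effective GiB request per pod
pod_hourly = vcpu * 0.055 + mem * 0.006  # illustrative US-region rates
autopilot = pod_hourly * HOURS * 4        # control plane included

# Standard: 3x e2-standard-2 plus the $72/month control plane fee.
# The per-node rate is back-derived from the ~$292 total above,
# not an official GCE price.
node_hourly = 0.102                       # assumption
standard = 3 * node_hourly * HOURS + 72

print(f"Autopilot: ~${autopilot:.2f}/month")  # ~$133.63
print(f"Standard:  ~${standard:.2f}/month")   # ~$292.32
```

The gap narrows quickly if the Standard nodes also host other workloads or run on discounted capacity, which is the bin-packing tradeoff described above.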
When to Choose Autopilot vs Standard
Choosing between Autopilot and Standard mode depends on your team’s priorities: operational simplicity vs infrastructure control, predictable usage vs bursty workloads, dev/test vs production.
Here’s a breakdown to help guide the decision:
Choose Autopilot if:
- You want to avoid managing nodes, scaling, or infra
- You prefer billing tied directly to pod requests
- Your workloads are short-lived, dev/test, or steady-state
- You’re a small team focused on velocity, not tuning nodes
- You don’t need Spot or GPU support
Autopilot is a great fit for:
- Internal tools and backend services
- Lightweight event-driven apps
- Teams without a dedicated infra/platform engineer
Choose Standard if:
- You want full control over VM types, node pools, and scheduling
- You need access to Spot, custom machine types, or GPUs
- You run bursty, compute-heavy, or ML/AI workloads
- You have large production deployments and want to tune cost
- You’re using multi-cluster strategies or node isolation patterns
Standard is ideal for:
- Cost-sensitive production teams
- High-performance or GPU-enabled workloads
- Teams building custom infra on top of Kubernetes
Not Sure Yet?
Some teams run both:
- Autopilot for dev/test and staging clusters
- Standard for prod, GPU, or Spot-heavy workloads
Final Thoughts
GKE gives you flexibility in how you run Kubernetes—but that flexibility comes with complexity in how you’re charged. Whether you choose Autopilot for simplicity or Standard for control, understanding how pricing works is the foundation for running clusters that are both reliable and cost-efficient.
But once you know how pricing works, the real challenge begins: keeping workloads right-sized, minimizing idle resource spend, and making infrastructure decisions that balance performance and cost.
We’ve put together a full GKE cost optimization guide that walks through practical strategies for reducing cloud waste—from tuning requests to using Spot VMs effectively.