Advertising disclosure: We earn commissions when you shop through the links below.
The Interruption
March 3rd, 6:47 AM. The AWS billing alert hits your Slack channel.
Your February bill: $8,010.
Your budget: $800.
You spend the next four hours in AWS Cost Explorer hunting for the culprit. EC2 spend looks normal. RDS is at baseline. Lambda is negligible. Then you find it under EC2-Other, accounting for almost the entire bill: NAT Gateway data processing.
A single resource you probably didn't even know was running is burning through your entire budget.
And AWS won't credit it back.
"We run ECS with ECR for container images. Every time a task restarts (which is 150+ times per day across our services), it pulls the image through the NAT Gateway. We didn't realize this until we looked at Cost Explorer in detail. AWS was charging $0.045 per GB processed through the NAT Gateway, plus data transfer egress. For 178,000 GB of image pulls, that's $8,010 in processing fees alone. AWS told us this was 'working as designed.'"
The Gripe: AWS Pricing Architecture Is Intentionally Opaque
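This isn't a bug. It's not even a surprise to AWS. It's the by-design cost structure of AWS's network architecture: traffic that leaves a private subnet through a NAT Gateway, crosses an Availability Zone boundary, or exits to the internet passes through a billable layer of your VPC (Virtual Private Cloud).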
You don't learn this in the AWS documentation overview. You discover it when the bill arrives.
And by then, you've architected your entire stack around a hidden cost lever that AWS controls.
The Discovery: VPC Endpoints vs Continued Bleeding ($8K/month)
After the bill shock, most DevOps teams do the same thing: Google "how to avoid NAT Gateway fees."
The answer: VPC Interface Endpoints for ECR (plus a free S3 Gateway Endpoint for image layers). They take 10 minutes to set up and eliminate the NAT Gateway route entirely for ECR pulls. Interface Endpoints cost roughly $0.01/hour per AZ plus a small per-GB processing fee — a fraction of NAT Gateway costs.
With VPC Endpoints configured for ECR, your container images travel over AWS's private network directly to your ECS tasks: no NAT Gateway, no NAT processing fees.
The NAT Gateway line item drops from $8,010/month to a few hundred dollars at most (the NAT Gateway's hourly charge, if you still need one for other traffic, plus modest endpoint fees).
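To make the comparison concrete, here is a rough sketch written as Terraform locals, so it can be evaluated without touching any AWS resources. The 178,000 GB, $0.045/GB and $0.01/hour figures come from this article; the 730 hours per month, the $0.01/GB endpoint processing rate, the single AZ, and the 1% share of bytes that travels through the Interface Endpoints are all assumptions to replace with your own values.

```hcl
# Sketch only: plain locals and outputs, no provider or AWS account required.
locals {
  # Figures quoted in this article
  image_pull_gb       = 178000 # GB pulled through the NAT Gateway per month
  nat_usd_per_gb      = 0.045  # NAT Gateway data processing rate
  endpoint_usd_per_hr = 0.01   # per Interface Endpoint, per AZ (approximate)

  # Assumptions: adjust to your own setup
  hours_per_month     = 730
  interface_endpoints = 2    # ecr.api and ecr.dkr
  az_count            = 1    # one private subnet
  endpoint_usd_per_gb = 0.01 # assumed Interface Endpoint processing rate
  interface_gb_share  = 0.01 # hypothetical: image layers use the free S3 Gateway Endpoint

  # What the NAT Gateway route costs in processing fees
  nat_processing_usd = local.image_pull_gb * local.nat_usd_per_gb

  # What the endpoint route costs: hourly fees plus per-GB processing
  endpoint_hourly_usd     = local.interface_endpoints * local.az_count * local.endpoint_usd_per_hr * local.hours_per_month
  endpoint_processing_usd = local.image_pull_gb * local.interface_gb_share * local.endpoint_usd_per_gb
  endpoint_total_usd      = local.endpoint_hourly_usd + local.endpoint_processing_usd
}

output "nat_processing_usd" {
  value = local.nat_processing_usd
}

output "endpoint_total_usd" {
  value = local.endpoint_total_usd
}
```

Under these assumptions the endpoint route comes to roughly $32/month against $8,010 in NAT processing fees. The result is most sensitive to `interface_gb_share`, so check your own traffic split before relying on it.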
The Quick Fix (10 minutes)
# ECR requires two Interface Endpoints plus an S3 Gateway Endpoint
resource "aws_vpc_endpoint" "ecr_api" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.us-east-1.ecr.api"
  vpc_endpoint_type   = "Interface"
  private_dns_enabled = true
  subnet_ids          = [aws_subnet.private.id]
  security_group_ids  = [aws_security_group.vpc_endpoint.id]
}

resource "aws_vpc_endpoint" "ecr_dkr" {
  vpc_id              = aws_vpc.main.id
  service_name        = "com.amazonaws.us-east-1.ecr.dkr"
  vpc_endpoint_type   = "Interface"
  private_dns_enabled = true
  subnet_ids          = [aws_subnet.private.id]
  security_group_ids  = [aws_security_group.vpc_endpoint.id]
}

# S3 Gateway Endpoint (free): required because ECR
# stores image layers in S3
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = aws_vpc.main.id
  service_name      = "com.amazonaws.us-east-1.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = [aws_route_table.private.id]
}
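The Interface Endpoints above reference `aws_security_group.vpc_endpoint`, which the snippet does not define. A minimal sketch of what that group might look like follows. It assumes `aws_vpc.main` is declared with a `cidr_block`, and the name prefix and descriptions are placeholders. Interface Endpoints accept HTTPS, so the only rule needed is inbound TCP 443 from inside the VPC.

```hcl
# Hypothetical security group for the ECR Interface Endpoints.
# Assumes aws_vpc.main exists, as in the snippet above.
resource "aws_security_group" "vpc_endpoint" {
  name_prefix = "vpc-endpoint-"
  description = "Allow HTTPS from inside the VPC to Interface Endpoints"
  vpc_id      = aws_vpc.main.id

  # ECS tasks and instances reach the endpoints over HTTPS
  ingress {
    description = "HTTPS from the VPC"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = [aws_vpc.main.cidr_block]
  }
}
```

If you prefer tighter rules, you could allow port 443 only from the security groups attached to your ECS tasks instead of the whole VPC CIDR.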
The Broader Discovery: Data Movement Accounts for 30–50% of AWS Bills
The NAT Gateway issue is only the most visible case. Teams that audit their bills find that AWS charges are driven heavily by data movement, not just computation:
- NAT Gateway processing: $0.045/GB
- NAT Gateway hourly: $0.045/hour (~$32.85/month, always running for most stacks)
- Internet egress (first 10TB): $0.09/GB
- Inter-AZ data transfer: $0.01/GB (each direction)
- CloudFront delivery to viewers (first 10TB): $0.085/GB
For teams doing bandwidth-heavy work (container registries, video streaming, high-traffic APIs), these data movement charges can exceed compute costs. AWS's pricing architecture is optimized to monetize your data layer, not to optimize your infrastructure.
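As a rough illustration of how these rates add up, the sketch below totals them for one month using Terraform locals. The rates are the ones listed above; the traffic volumes are hypothetical placeholders, and free tiers (such as the first 100GB of internet egress) are ignored for simplicity.

```hcl
# Sketch only: plain locals and outputs, no provider or AWS account required.
locals {
  # Per-GB rates from the list above
  rate_usd_per_gb = {
    nat_processing  = 0.045
    internet_egress = 0.09
    inter_az        = 0.02 # $0.01/GB charged in each direction
    cloudfront      = 0.085
  }

  # Hypothetical monthly traffic in GB: replace with your own numbers
  monthly_gb = {
    nat_processing  = 5000
    internet_egress = 3000
    inter_az        = 10000
    cloudfront      = 0
  }

  # One NAT Gateway running all month (730 hours)
  nat_hourly_usd = 0.045 * 730

  # Charge per category, then the monthly total
  charge_usd = {
    for category, gb in local.monthly_gb :
    category => gb * local.rate_usd_per_gb[category]
  }
  total_usd = sum(values(local.charge_usd)) + local.nat_hourly_usd
}

output "data_movement_charges_usd" {
  value = local.charge_usd
}

output "data_movement_total_usd" {
  value = local.total_usd
}
```

With these placeholder volumes the total is a little under $730/month before any compute is counted. Plugging in your own figures from Cost Explorer shows which category dominates.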
The Fascinations: What Surprised Migrating Teams
- The Cost Explorer Filter That Reveals Everything in 60 Seconds: In the AWS Console, open Billing → Cost Explorer, filter on Service = "EC2-Other", set the region and time range, and group by Usage Type. Sort by amount. NAT Gateway charges appear as usage types containing NatGateway-Bytes (data processing) and NatGateway-Hours, and they will jump out immediately if they're high. Most teams don't find this until after the bill shock.
- The Terraform VPC Endpoint Fix: VPC Interface Endpoints for ECR (plus a free S3 Gateway Endpoint for image layers) eliminate the NAT Gateway routing for container pulls. Interface Endpoints cost ~$0.01/hr per AZ — a fraction of what NAT Gateway charges. Most teams implement this in under 10 minutes and save thousands per month immediately.
- Why Cross-AZ RDS Read Replicas Are the Second-Biggest Silent Killer: If your database is in us-east-1a and your app servers are in us-east-1b, every query incurs $0.01/GB data transfer in each direction. The charge scales with bytes, not query count: a busy API hitting the database 100K times per day at 10KB per query moves only about 1GB/day, which is roughly $0.02/day or $0.60/month. At 1TB/day of cross-AZ traffic, the same rate works out to about $20/day, or $600/month. Solution: keep read replicas in the same AZ as the app servers that query them, or use a single-AZ RDS setup with a proper backup strategy.
- The Vultr Bare Metal Breakeven Point for Bandwidth-Heavy Workloads: Vultr includes 10TB/month outbound bandwidth free. AWS charges $0.09/GB after the first 100GB. Breakeven: ~1.1TB/month of egress, which is roughly $90/month in AWS egress charges at that rate. If you exceed 2TB/month, Vultr can be up to 10x cheaper. Most container-heavy or media-heavy workloads land here.
- What AWS Support Says When You Challenge a NAT Gateway Bill: "NAT Gateway processing fees are charged per GB processed. This is documented in the EC2 pricing page." No negotiation. No credit. It's working as designed.
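The cross-AZ example in the list above can be checked with a few lines of arithmetic, sketched here as Terraform locals. The query count, payload size and $0.01/GB rate come from the article; the 30-day month and the choice to bill every byte in both directions (an upper bound) are assumptions.

```hcl
# Sketch only: plain locals and outputs, no provider or AWS account required.
locals {
  queries_per_day = 100000
  kb_per_query    = 10

  # 100,000 queries x 10 KB = 1,000,000 KB, about 1 GB per day (decimal units)
  gb_per_day = local.queries_per_day * local.kb_per_query / 1000000

  inter_az_usd_per_gb = 0.01
  billed_directions   = 2  # assumption: every byte is billed on both sides
  days_per_month      = 30 # assumption

  cross_az_usd_per_month = local.gb_per_day * local.inter_az_usd_per_gb * local.billed_directions * local.days_per_month
}

output "cross_az_usd_per_month" {
  value = local.cross_az_usd_per_month
}
```

At this volume the charge is well under a dollar per month. Cross-AZ transfer only reaches hundreds of dollars per month at roughly a terabyte per day, so measure the actual bytes moved before re-architecting around it.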
The Architecture Decision Tree
If your AWS bill has NAT Gateway fees exceeding $500/month, you have two choices:
- Implement VPC Interface Endpoints for ECR + S3 Gateway Endpoint (10 min fix, saves thousands/month) — Best for teams already deep in AWS ecosystem
- Migrate to Vultr Bare Metal or Hetzner (permanent fix, saves $5-10K/month) — Better for teams doing container orchestration or media delivery at scale
Most teams do both: fix the NAT Gateway issue immediately, then gradually migrate bandwidth-heavy services to dedicated servers while keeping low-traffic services on AWS.