A comprehensive guide to migrating microservices from VMs to AWS Lambda, with cost optimization strategies.
- Understanding Lambda Instances
- Cold Starts Explained
- Provisioned Concurrency
- Code Changes Required
- Cost Comparison
- Break-Even Analysis
- Hidden Costs
- Recommendations
A Lambda instance (AWS calls it an execution environment) is a micro-container that AWS spins up to run your code.
┌─────────────────────────────────────────┐
│ Lambda Instance │
│ ┌─────────────────────────────────┐ │
│ │ Your Code (JAR/runtime) │ │
│ │ - Loaded into memory │ │
│ │ - Application context init │ │
│ │ - DB connections established │ │
│ └─────────────────────────────────┘ │
│ Memory: Configured | CPU: Proportional │
└─────────────────────────────────────────┘
| Property | Description |
|---|---|
| Concurrency | One instance handles one request at a time |
| Lifecycle | Created on-demand, destroyed after idle timeout (~5-15 min) |
| Isolation | Each instance is fully isolated |
| Scaling | AWS automatically creates more instances for concurrent requests |
A cold start occurs when AWS needs to create a new Lambda instance to handle a request. This involves:
- Provisioning a container
- Downloading your code
- Starting the runtime (JVM, Node.js, etc.)
- Initializing your application (Spring context, DB pools, etc.)
| Runtime | Typical Cold Start |
|---|---|
| Python | 200-500ms |
| Node.js | 200-500ms |
| Go | 100-200ms |
| Java (plain) | 3-5s |
| Java (Spring Boot) | 5-15s |
| Java (GraalVM native) | 200-500ms |
Without Provisioned Concurrency:
Request arrives
│
▼
┌─────────────────────────────────────────┐
│ COLD START (5-15 seconds for Spring) │
│ 1. AWS provisions container │
│ 2. Downloads your code │
│ 3. Starts JVM │
│ 4. Loads Spring Boot context │
│ 5. Initializes beans, DB pools │
└─────────────────────────────────────────┘
│
▼
Execute function (e.g., 500ms)
│
▼
Return response
│
▼
Instance stays WARM for ~5-15 min
│
▼
No requests → AWS kills instance
│
▼
Next request → COLD START again
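The lifecycle above can be sketched as a small simulation. This is a simplified model, not AWS behavior: it assumes a single instance, sequential requests, negligible execution time, and a fixed idle timeout (the 600-second default is an illustrative value inside the ~5-15 minute range). `count_cold_starts` is a hypothetical helper name.

```python
def count_cold_starts(request_times_s, idle_timeout_s=600.0):
    """Count cold starts for sequential requests served by one instance.

    request_times_s: request arrival times in seconds.
    idle_timeout_s: assumed idle period after which the instance is reclaimed.
    """
    cold_starts = 0
    last_request = None
    for t in sorted(request_times_s):
        # The first request, or any request arriving after the idle
        # timeout has elapsed, finds no warm instance and is cold.
        if last_request is None or t - last_request > idle_timeout_s:
            cold_starts += 1
        last_request = t
    return cold_starts


# Requests at 0 s, 5 min, and 30 min: the first and third are cold.
print(count_cold_starts([0, 300, 1800]))
```

Under this model, traffic that arrives less often than the idle timeout pays the cold start on almost every request.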
Provisioned Concurrency keeps a specified number of Lambda instances pre-initialized and ready to respond instantly.
With Provisioned Concurrency = 1:
You configure: "Keep 1 instance always ready"
│
▼
┌─────────────────────────────────────────┐
│ AWS pre-initializes 1 instance │
│ - Container provisioned │
│ - Code loaded │
│ - Runtime started │
│ - Application FULLY initialized │
│ - Waiting for requests │
│ - YOU PAY for this idle time │
└─────────────────────────────────────────┘
│
▼
Request arrives → INSTANT execution (no cold start)
AWS SAM/CloudFormation:
AutoPublishAlias: live
ProvisionedConcurrencyConfig:
  ProvisionedConcurrentExecutions: 1

AWS CLI:
aws lambda put-provisioned-concurrency-config \
--function-name my-service \
--qualifier live \
--provisioned-concurrent-executions 1

Even with Provisioned Concurrency, cold starts occur in these scenarios:
Provisioned Concurrency = 1
Time 0ms: Request A arrives → Uses warm instance ✓
Time 10ms: Request B arrives → Instance busy!
│
▼
┌─────────────────────┐
│ COLD START for B │
│ New instance spins │
│ up on-demand │
└─────────────────────┘
1 PC instance = 1 concurrent request capacity
AWS occasionally recycles instances for maintenance; brief cold starts are possible during replacement.
New code deployments require re-initialization of provisioned instances.
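The first scenario (more concurrent requests than provisioned instances) can be illustrated with a rough simulation. It assumes each instance handles one request at a time, every request takes the same duration, and on-demand instances stay warm once created (no idle timeout). The function name and parameters are hypothetical.

```python
def count_spillover_cold_starts(arrivals_ms, duration_ms, provisioned=1):
    """Count requests that trigger an on-demand cold start.

    Provisioned instances are warm from the start; a request that finds
    every existing instance busy causes a new instance to be created.
    """
    busy_until = [0] * provisioned  # time at which each instance is free
    cold_starts = 0
    for t in sorted(arrivals_ms):
        # Reuse any instance that has finished its previous request.
        free = next((i for i, end in enumerate(busy_until) if end <= t), None)
        if free is None:
            busy_until.append(t + duration_ms)  # new on-demand instance
            cold_starts += 1
        else:
            busy_until[free] = t + duration_ms
    return cold_starts


# Request A at 0 ms and request B at 10 ms, 500 ms each, with PC = 1.
print(count_spillover_cold_starts([0, 10], duration_ms=500, provisioned=1))
```

Raising `provisioned` to 2 in the same call removes the cold start, which is the sizing question the next formula addresses.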
Formula:
PC = ceil(requests_per_second × average_duration_in_seconds)
Examples:
| Traffic | Duration | PC Needed |
|---|---|---|
| 1 req/sec | 500ms | 1 |
| 10 req/sec | 500ms | 5 |
| 10 req/sec | 2s | 20 |
| 100 req/sec | 200ms | 20 |
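As a quick check of the sizing rule, the sketch below reproduces the table rows. It rounds up to a whole instance (as the 1 req/sec row implies) and takes the duration in milliseconds to keep the arithmetic exact. The function name is an assumption for illustration.

```python
import math


def provisioned_concurrency_needed(requests_per_second, avg_duration_ms):
    """Estimate PC as average concurrency, rounded up to a whole instance."""
    concurrent = requests_per_second * avg_duration_ms / 1000
    return max(1, math.ceil(concurrent))


# Rows from the table: (req/sec, duration in ms)
for rps, duration_ms in [(1, 500), (10, 500), (10, 2000), (100, 200)]:
    print(rps, duration_ms, provisioned_concurrency_needed(rps, duration_ms))
```

This is an average-load estimate; bursty traffic can exceed it and still trigger on-demand cold starts.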
| Approach | Effort | Cold Start | Description |
|---|---|---|---|
| AWS Lambda Web Adapter | 5-10% | High (5-15s) | Keep Spring Boot as-is |
| Spring Cloud Function | 20-30% | High (5-15s) | Refactor to Function beans |
| GraalVM Native Image | 30-50% | Low (<1s) | Compile to native binary |
| Rewrite (Go/Rust/Node) | 80-100% | Very Low | Complete rewrite |
Keep your Spring Boot app nearly unchanged:
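# SAM template
Runtime: java21
Handler: run.sh
MemorySize: 2048
# Zip deployments also set AWS_LAMBDA_EXEC_WRAPPER=/opt/bootstrap (see the adapter's README)
Layers:
  - arn:aws:lambda:region:753240598075:layer:LambdaAdapterLayerX86:1

// Replace @RestController with Function beans (declare inside a @Configuration class)
// Needs: java.util.function.Function, org.springframework.context.annotation.Bean,
// and the APIGatewayProxy*Event classes from com.amazonaws.services.lambda.runtime.events
@Bean
public Function<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> handleRequest() {
    return request -> {
        // your logic; build the response from the request
        APIGatewayProxyResponseEvent response = new APIGatewayProxyResponseEvent()
            .withStatusCode(200)
            .withBody("ok");
        return response;
    };
}

# Build native image
./gradlew nativeCompile

Requires reflection configuration and library compatibility testing.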
| Component | Monthly Cost |
|---|---|
| t3.medium (2 vCPU, 4GB) | $30 |
| ALB (base) | $16 |
| ALB (LCU estimate) | $5 |
| Total | $51 |
| 6-Month Total | $306 |
Monthly PC Cost = Memory (GB) × 2,628,000 seconds × $0.000004167
| Memory | Monthly Cost/Instance | 6-Month Cost/Instance |
|---|---|---|
| 128MB | $1.37 | $8.22 |
| 256MB | $2.74 | $16.44 |
| 512MB | $5.48 | $32.88 |
| 1GB | $10.95 | $65.70 |
| 2GB | $21.90 | $131.40 |
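The per-instance figures follow directly from the formula. A minimal sketch, using the rate and the 2,628,000-second (730-hour) month from the formula above; the constant and function names are illustrative:

```python
PC_RATE_PER_GB_SECOND = 0.000004167  # provisioned concurrency rate from the formula
SECONDS_PER_MONTH = 2_628_000        # 730 hours


def monthly_pc_cost(memory_mb, instances=1):
    """Monthly cost of keeping `instances` provisioned at `memory_mb`."""
    memory_gb = memory_mb / 1024
    return memory_gb * SECONDS_PER_MONTH * PC_RATE_PER_GB_SECOND * instances


for memory_mb in (128, 256, 512, 1024, 2048):
    print(f"{memory_mb} MB: ${monthly_pc_cost(memory_mb):.2f}/month")
```

The 6-month column multiplies the rounded monthly figure by six, so it can differ by a few cents from six times the unrounded value.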
| Memory | Framework Support |
|---|---|
| 128MB | Go, Rust, Python, Node.js only |
| 256MB | Lightweight frameworks only |
| 512MB | Micronaut, Quarkus native (barely) |
| 1GB+ | Spring Boot minimum |
| 2GB | Spring Boot recommended |
| PC Instances (128MB) | 6-Month Cost | Savings vs VM |
|---|---|---|
| 1 | $8.22 | $298 (97%) |
| 5 | $41.10 | $265 (87%) |
| 10 | $82.20 | $224 (73%) |
| 20 | $164.40 | $142 (46%) |
| 30 | $246.60 | $59 (19%) |
| 37 | $304.14 | Break-even |
| 50 | $411.00 | -$105 (loss) |
| PC Instances (256MB) | 6-Month Cost | Savings vs VM |
|---|---|---|
| 1 | $16.44 | $290 (95%) |
| 5 | $82.20 | $224 (73%) |
| 10 | $164.40 | $142 (46%) |
| 15 | $246.60 | $59 (19%) |
| 18 | $295.92 | Break-even |
| 25 | $411.00 | -$105 (loss) |
| PC Instances (2GB) | 6-Month Cost | Savings vs VM |
|---|---|---|
| 1 | $131.40 | $175 (57%) |
| 2 | $262.80 | $43 (14%) |
| 3 | $394.20 | -$88 (loss) |
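The break-even counts can be derived from the per-instance 6-month costs and the $306 VM total. The sketch below is one way to compute them; it covers only the provisioned concurrency charge (no execution or request fees), and the helper names are hypothetical.

```python
import math

VM_SIX_MONTH_COST = 306.00  # VM + ALB total from the cost comparison


def max_instances_below_vm(cost_per_instance, vm_cost=VM_SIX_MONTH_COST):
    """Largest PC instance count whose total stays at or below the VM cost."""
    return math.floor(vm_cost / cost_per_instance)


def savings(instances, cost_per_instance, vm_cost=VM_SIX_MONTH_COST):
    """Return (dollars saved, percent saved); negative values mean a loss."""
    saved = vm_cost - instances * cost_per_instance
    return saved, 100 * saved / vm_cost


# 6-month cost per instance for each memory size
for label, cost in (("128MB", 8.22), ("256MB", 16.44), ("2GB", 131.40)):
    print(label, max_instances_below_vm(cost))
```

Calling `savings(n, cost)` for any row gives the dollar and percentage figures in the "Savings vs VM" column.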
Maximum PC Instances While Saving Money (6 months, vs $306 VM)
══════════════════════════════════════════════════════════════
128MB: ████████████████████████████████████░░░░ 37 instances
│←────────── SAVE MONEY ──────────→│
256MB: ██████████████████░░░░░░░░░░░░░░░░░░░░░░ 18 instances
│←──── SAVE MONEY ────→│
2GB: ██░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 2 instances
│S│
Hidden Costs
| Service | Cost | Notes |
|---|---|---|
| API Gateway | $3.50/million requests | Use Function URLs instead (free) |
| NAT Gateway | $32/month + $0.045/GB | Needed when a VPC-attached Lambda must reach the internet |
| VPC Endpoints | $7.30/endpoint/month | Alternative to NAT Gateway |
| CloudWatch Logs | $0.50/GB ingested | Reduce log verbosity |
| Data Transfer | $0.09/GB out | Same as EC2 |
If your Lambda is attached to a VPC (for RDS, ElastiCache, etc.) and must also reach the internet or AWS services outside the VPC:
| Setup | Additional Monthly Cost |
|---|---|
| NAT Gateway | $32 + data transfer |
| VPC Endpoints | $7.30 per endpoint |
Warning: NAT Gateway costs often eliminate Lambda's cost advantage.
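To see how much the NAT Gateway base fee shifts the comparison, here is a rough monthly sketch. It assumes the $51/month VM figure and the $32/month NAT base fee from this guide, ignores per-GB processing, execution, and request charges, and assumes the VM itself does not need a NAT Gateway. The names are illustrative.

```python
VM_MONTHLY = 51.00           # VM + ALB from the cost comparison
NAT_GATEWAY_MONTHLY = 32.00  # base fee only; per-GB processing excluded


def lambda_monthly_with_nat(pc_cost_per_instance, instances, nat_gateways=1):
    """Provisioned concurrency cost plus the NAT Gateway base fee."""
    return pc_cost_per_instance * instances + NAT_GATEWAY_MONTHLY * nat_gateways


def still_cheaper_than_vm(pc_cost_per_instance, instances, nat_gateways=1):
    """True if Lambda plus NAT Gateway stays below the VM's monthly cost."""
    total = lambda_monthly_with_nat(pc_cost_per_instance, instances, nat_gateways)
    return total < VM_MONTHLY


print(still_cheaper_than_vm(21.90, 1))  # 2GB Spring Boot, PC=1
print(still_cheaper_than_vm(1.37, 1))   # 128MB Go/Rust, PC=1
```

With these inputs a single 2GB instance plus one NAT Gateway already costs more than the VM, while small-memory functions keep some margin.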
| Scenario | Recommendation | Why |
|---|---|---|
| Low traffic, sequential requests | Lambda PC=1 | Best savings (57-97%) |
| Moderate traffic, some concurrency | Lambda PC=2-5 | Good balance |
| Need VPC access (RDS/ElastiCache) | Stay on VM | NAT Gateway kills savings |
| High concurrency (>2 for Spring) | Stay on VM | VM is cheaper |
| Unpredictable bursts | Lambda PC=1 | Accept cold starts on bursts |
| Latency-critical + high traffic | ECS Fargate or App Runner | Better fit |
| Stack | Memory | Max PC (cost-effective) | Best Savings |
|---|---|---|---|
| Go/Rust | 128MB | 37 instances | 97% at PC=1 |
| Node.js/Python | 256MB | 18 instances | 95% at PC=1 |
| Quarkus/Micronaut native | 256MB | 18 instances | 95% at PC=1 |
| Spring Boot | 2GB | 2 instances | 57% at PC=1 |
| Memory | Break-Even PC | Sweet Spot | 6-Month Savings |
|---|---|---|---|
| 128MB | 37 | 10-20 | $142-$224 |
| 256MB | 18 | 5-10 | $142-$224 |
| 2GB | 2 | 1 | $175 |
# Provisioned Concurrency (idle cost)
Monthly = Memory_GB × 2,628,000 × $0.000004167
# Execution (when running on provisioned concurrency)
Cost = Memory_GB × Duration_sec × Requests × $0.000009722
# Requests
Cost = Requests × $0.20 / 1,000,000
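The three formulas can be combined into one monthly estimate. The sketch below uses the rates listed above and assumes every request is served by provisioned instances (requests that spill over to on-demand instances are billed differently and are not modeled). The function name and the example workload are hypothetical.

```python
PC_RATE = 0.000004167            # $/GB-second, provisioned (idle) rate
EXEC_RATE = 0.000009722          # $/GB-second, execution rate
REQUEST_RATE = 0.20 / 1_000_000  # $ per request
SECONDS_PER_MONTH = 2_628_000    # 730 hours


def monthly_lambda_cost(memory_gb, pc_instances, requests, avg_duration_s):
    """Sum of provisioned, execution, and request charges for one month."""
    provisioned = memory_gb * SECONDS_PER_MONTH * PC_RATE * pc_instances
    execution = memory_gb * avg_duration_s * requests * EXEC_RATE
    request_fee = requests * REQUEST_RATE
    return provisioned + execution + request_fee


# Hypothetical workload: 2 GB, PC=1, one million requests of 500 ms each.
print(f"${monthly_lambda_cost(2, 1, 1_000_000, 0.5):.2f}")
```

Execution and request charges scale with traffic, so the idle-only comparisons earlier in this guide are a lower bound on the Lambda bill.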
# Set provisioned concurrency
aws lambda put-provisioned-concurrency-config \
--function-name my-service \
--qualifier live \
--provisioned-concurrent-executions 2
# Check current configuration
aws lambda get-provisioned-concurrency-config \
--function-name my-service \
--qualifier live
# Remove provisioned concurrency
aws lambda delete-provisioned-concurrency-config \
--function-name my-service \
--qualifier live

For Spring Boot services: Lambda with Provisioned Concurrency makes sense only at PC=1 or PC=2, saving 14-57% over 6 months. Beyond that, stick with VM/ECS.
For lightweight runtimes (Go, Rust, Node.js): Lambda is significantly cheaper, supporting up to 18-37 concurrent instances while still saving money.
Key decision factor: If your VPC-attached Lambda also needs a NAT Gateway (for example, to reach the internet), that cost often makes Lambda more expensive than a VM. Consider VPC Endpoints or staying on VM/ECS.
Last updated: January 2026