Troubleshooting

Common errors and solutions when deploying MILU2 Stage Infrastructure

Common Errors

No valid credential sources found

Cause: The AWS SSO session has expired.
Solution: Run aws sso login and retry, or use deploy.ps1, which refreshes credentials automatically.
aws sso login
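
To avoid logging in when the session is still valid, the check can be scripted. This is a minimal sketch, not what deploy.ps1 does internally; the function name ensure_aws_login is hypothetical.

```shell
# Sketch: refresh AWS SSO credentials only when the current ones are unusable.
# ensure_aws_login is a hypothetical helper name, not part of deploy.ps1.
ensure_aws_login() {
  # get-caller-identity succeeds only when usable credentials are available.
  if aws sts get-caller-identity >/dev/null 2>&1; then
    echo "credentials ok"
  else
    echo "credentials missing or expired; running aws sso login"
    aws sso login
  fi
}
```

Call ensure_aws_login before terraform plan or terraform apply.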

RepositoryAlreadyExistsException (ECR)

Cause: The ECR repository exists in AWS but is not in the Terraform state.
Solution: Import the repository into state.
terraform import 'module.ecr.aws_ecr_repository.this["milu2/milu2-stage-api"]' milu2/milu2-stage-api
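
Re-running an import for an address that is already in state fails, so it can help to check the state first. The sketch below assumes a bash-compatible shell; import_if_missing is a hypothetical helper name.

```shell
# Sketch: import a resource only when its address is not already in state.
# import_if_missing is a hypothetical helper name.
import_if_missing() {
  address="$1"   # Terraform resource address
  id="$2"        # provider-specific import ID
  # -F matches the address literally; -x requires a whole-line match.
  if terraform state list | grep -Fxq -- "$address"; then
    echo "already in state: $address"
  else
    terraform import "$address" "$id"
  fi
}
```

Example: import_if_missing 'module.ecr.aws_ecr_repository.this["milu2/milu2-stage-api"]' milu2/milu2-stage-api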

DuplicateLoadBalancerName

Cause: The ALB/NLB already exists in AWS but is not tracked in the Terraform state.
Solution: Use deploy.ps1, which imports it automatically, or import it manually.
terraform import 'module.alb.aws_lb.internal' <ALB_ARN>
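
The manual import needs the load balancer ARN. One way to look it up by name is sketched below; lb_arn_by_name is a hypothetical helper, and the load balancer name is whatever your stage uses.

```shell
# Sketch: print the ARN of a load balancer given its name.
# lb_arn_by_name is a hypothetical helper name.
lb_arn_by_name() {
  aws elbv2 describe-load-balancers \
    --names "$1" \
    --query 'LoadBalancers[0].LoadBalancerArn' \
    --output text
}
```

Pass the printed ARN as <ALB_ARN> to the terraform import command.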

ACM apply hangs at aws_acm_certificate_validation

Cause: The DNS CNAME record used for validation has not been created.
Solution: Get the CNAME from ACM and create it at your DNS provider.
aws acm describe-certificate --certificate-arn <arn>
# Get the CNAME Name and Value from DomainValidationOptions
# Create the CNAME record on your DNS provider
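
To print only the validation records instead of the full certificate description, a --query filter can be used. This is a sketch; acm_validation_records is a hypothetical helper name.

```shell
# Sketch: print Name, Type and Value of each DNS validation record,
# one tab-separated row per domain on the certificate.
# acm_validation_records is a hypothetical helper name.
acm_validation_records() {
  aws acm describe-certificate \
    --certificate-arn "$1" \
    --query 'Certificate.DomainValidationOptions[].ResourceRecord.[Name,Type,Value]' \
    --output text
}
```

Each row gives the record name, record type, and value to create at the DNS provider.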

VpcPeeringConnectionAlreadyExists

Cause: A VPC peering connection between the same VPCs already exists.
Solution: Delete the old peering connection in the AWS Console or import it into state.
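
Either option requires the ID of the existing connection. The sketch below lists active and pending peerings; list_vpc_peerings is a hypothetical helper name.

```shell
# Sketch: list peering connections that are active or awaiting acceptance.
# Columns: connection ID, requester VPC, accepter VPC, status.
# list_vpc_peerings is a hypothetical helper name.
list_vpc_peerings() {
  aws ec2 describe-vpc-peering-connections \
    --filters "Name=status-code,Values=active,pending-acceptance" \
    --query 'VpcPeeringConnections[].[VpcPeeringConnectionId,RequesterVpcInfo.VpcId,AccepterVpcInfo.VpcId,Status.Code]' \
    --output text
}
```

To import, pass the pcx-... ID to terraform import together with the resource address; find that address with terraform plan, since it depends on the module layout.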

expected length of user_data to be in the range (0 - 16384)

Cause: The user_data script exceeds the 16 KB (16384-byte) limit.
Solution: The module already uses user_data_base64 with base64gzip(), so pull the latest code.
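
To estimate whether a script fits after compression, compare its raw and gzipped sizes against the limit. This is a rough sketch: the gzip CLI may compress slightly differently from Terraform's base64gzip(), and user_data_sizes is a hypothetical helper name.

```shell
# Sketch: report raw and gzipped size of a user_data script in bytes.
# Returns success when the gzipped size is within the 16384-byte limit.
# user_data_sizes is a hypothetical helper name.
user_data_sizes() {
  file="$1"
  raw=$(wc -c < "$file" | tr -d ' ')
  gz=$(gzip -c "$file" | wc -c | tr -d ' ')
  echo "raw=$raw gzip=$gz limit=16384"
  [ "$gz" -le 16384 ]
}
```

If the gzipped size is still over the limit, the script itself must be shortened, for example by downloading larger files at boot.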

EC2 cannot write to /mnt/s3

Cause: The mount-s3 command is missing the --allow-overwrite and --allow-delete flags.
Solution: SSH into the EC2 instance and fix mount-s3.service.
sudo sed -i 's|--gid 1000.*|--gid 1000 --dir-mode 0777 --file-mode 0666 --allow-overwrite --allow-delete|' /etc/systemd/system/mount-s3.service
sudo systemctl daemon-reload
sudo umount -l /mnt/s3
sudo systemctl restart mount-s3.service
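
After restarting the service, a quick probe can confirm that create, overwrite, and delete all work. This is a sketch; check_mount_writable is a hypothetical helper name.

```shell
# Sketch: create, overwrite and delete a probe file in the given directory.
# Overwrite and delete are the operations the two flags enable.
# check_mount_writable is a hypothetical helper name.
check_mount_writable() {
  dir="$1"
  probe="$dir/.write-test-$$"
  if echo one > "$probe" && echo two > "$probe" && rm "$probe"; then
    echo "ok: $dir is writable"
  else
    echo "fail: cannot write to $dir"
    return 1
  fi
}
```

Run check_mount_writable /mnt/s3 as the user the containers run as.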

MySQL not importing schema

Cause: The schema_only.sql file was not uploaded to S3.
Solution: Upload the file and redeploy, or run the import manually.
# Upload schema
aws s3 cp schema_only.sql s3://<bucket>/docker/milu2-mysql/

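# Or import manually on EC2.
# MYSQL_ROOT_PASSWORD is an assumed variable holding the root password; do not use $PWD, which the shell sets to the current directory.
docker exec -i milu2mysqld mysql -uroot -p"$MYSQL_ROOT_PASSWORD" < /home/ec2-user/milu2/milu2-mysql/docker/schema_only.sql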
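
Before redeploying, it can be worth confirming that the schema file is actually in the bucket. The sketch below uses the key path from the upload command above; schema_uploaded is a hypothetical helper name.

```shell
# Sketch: check that schema_only.sql exists at the expected S3 key.
# head-object fails when the object is missing or not accessible.
# schema_uploaded is a hypothetical helper name.
schema_uploaded() {
  bucket="$1"
  if aws s3api head-object --bucket "$bucket" \
       --key "docker/milu2-mysql/schema_only.sql" >/dev/null 2>&1; then
    echo "present"
  else
    echo "missing"
    return 1
  fi
}
```

A "missing" result can also mean the current credentials lack permission to read the object.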

Web service shows Docker daemon IP instead of client IP

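Cause: nginx is configured with set_real_ip_from 0.0.0.0; (missing /0). Without a prefix length, nginx treats the value as a single address, so no proxy is trusted.
Solution: Change it to set_real_ip_from 0.0.0.0/0; or, preferably, list only the CIDRs of the trusted proxies.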
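
To locate the faulty directive, the config can be searched for the bare address. This is a sketch; find_bare_real_ip is a hypothetical helper name, and the config path depends on how the web image is built.

```shell
# Sketch: print line numbers of set_real_ip_from 0.0.0.0; without a prefix length.
# Lines using 0.0.0.0/0 or a specific CIDR are not reported.
# find_bare_real_ip is a hypothetical helper name.
find_bare_real_ip() {
  grep -nE 'set_real_ip_from[[:space:]]+0\.0\.0\.0[[:space:]]*;' "$1"
}
```

The function returns a non-zero status when no such line is found.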

CIDR conflict when deploying multiple stages

Cause: The stage_index conflicts with that of another stage.
Solution: Run preflight-check.ps1 to scan for conflicts and suggest an available stage_index.
.\preflight-check.ps1
# Check 7b scans the build VPC route table and suggests a non-conflicting stage_index

Debug Tips

Steps to debug when encountering issues:

Debug Commands
# 1. Check Terraform state
terraform state list
terraform state show '<resource_address>'

# 2. Enable debug logging
$env:TF_LOG = "DEBUG"
terraform plan 2>&1 | Out-File debug.log

# 3. Check AWS credentials
aws sts get-caller-identity

# 4. Check EC2 logs (tail -f follows the log; press Ctrl+C to stop)
ssh ec2-user@<ip>
sudo tail -f /var/log/cloud-init-output.log
sudo tail -f /var/log/milu2-docker-bootstrap.log

# 5. Check docker containers
docker ps -a
docker logs <container_name>

# 6. Check systemd services
sudo systemctl status mount-s3.service
sudo journalctl -u mount-s3.service

Tip

When encountering unclear errors, run terraform plan with TF_LOG=DEBUG to see detailed API calls.
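
The debug-logging command above uses PowerShell syntax. On a bash-compatible shell, a rough equivalent is sketched below; tf_plan_debug is a hypothetical helper name, while TF_LOG and TF_LOG_PATH are standard Terraform environment variables.

```shell
# Sketch: run terraform plan with debug logging written to a file.
# The log file defaults to debug.log in the current directory.
# tf_plan_debug is a hypothetical helper name.
tf_plan_debug() {
  log="${1:-debug.log}"
  TF_LOG=DEBUG TF_LOG_PATH="$log" terraform plan
}
```

Debug logs can contain sensitive values, so review them before sharing.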