Operations Runbook

Day-to-day operational tasks: instance refresh, scaling, backup, and handling user_data changes.

Rolling Refresh ASG

When the Launch Template changes (user_data, AMI), the ASG refreshes its instances automatically:

Rolling Refresh Config
# ASG configuration in module 09-autoscaling
instance_refresh {
  strategy = "Rolling"
  preferences {
    min_healthy_percentage = var.asg_refresh_min_healthy_percentage  # default 50
    instance_warmup        = var.asg_refresh_instance_warmup         # default 300s
  }
}

# When you change user_data → terraform apply → ASG triggers rolling refresh
# Old instances are terminated gradually while new ones come up
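To make the trade-off behind min_healthy_percentage concrete, the following minimal sketch estimates how many instances stay in service and how many can be replaced at once. The function names are hypothetical, and rounding fractional results up is an assumption that should be checked against the AWS documentation.

```shell
#!/usr/bin/env bash
# Estimate how many instances must stay in service during a rolling refresh.
# ASSUMPTION: fractional results are rounded up; confirm with the AWS docs.
min_in_service() {
  local desired=$1 pct=$2
  # Integer ceiling of desired * pct / 100
  echo $(( (desired * pct + 99) / 100 ))
}

# Estimate how many instances may be replaced at the same time.
# At 100% the refresh still proceeds one instance at a time, so the
# result is never allowed to drop below 1.
max_replaced_at_once() {
  local desired=$1 pct=$2 batch
  batch=$(( desired - $(min_in_service "$desired" "$pct") ))
  if [ "$batch" -lt 1 ]; then batch=1; fi
  echo "$batch"
}
```

For example, `max_replaced_at_once 4 50` suggests that two of four instances can be replaced per batch, which is why a lower percentage finishes faster but leaves less capacity serving traffic.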
Force Refresh
# Force refresh immediately (without waiting for user_data change)
terraform apply -replace='module.autoscaling.aws_launch_template.api'

# Adjust refresh speed
# In terraform.tfvars:
asg_refresh_min_healthy_percentage = 50    # Lower = faster but less safe
asg_refresh_instance_warmup        = 300   # Seconds for new instance to warm up
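
Once a refresh has been triggered, its progress can be followed from the AWS CLI. The sketch below polls `aws autoscaling describe-instance-refreshes` until the refresh reaches a terminal state. The function names and the ASG name in the usage note are hypothetical, and treating the first returned entry as the most recent refresh is an assumption.

```shell
#!/usr/bin/env bash
# Return 0 when the given instance refresh status is terminal.
# "None" is what --output text prints when no refresh exists.
refresh_is_finished() {
  case "$1" in
    Successful|Failed|Cancelled|RollbackSuccessful|RollbackFailed|None) return 0 ;;
    *) return 1 ;;
  esac
}

# Poll the refresh status every $interval seconds (default 30).
# ASSUMPTION: InstanceRefreshes[0] is the most recently started refresh.
wait_for_refresh() {
  local asg_name=$1 interval=${2:-30} status
  while true; do
    status=$(aws autoscaling describe-instance-refreshes \
      --auto-scaling-group-name "$asg_name" \
      --query 'InstanceRefreshes[0].Status' --output text)
    echo "refresh status: $status"
    if refresh_is_finished "$status"; then break; fi
    sleep "$interval"
  done
  # Succeed only if the refresh itself succeeded
  [ "$status" = "Successful" ]
}
```

A typical call would be `wait_for_refresh <asg-name> 30`, where `<asg-name>` is the name of the API Auto Scaling group in your account.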

Scale Instances

Adjust the number of instances for each role:

Scaling Configuration
# terraform.tfvars

# ASG for API (module 09)
asg_min_size         = 2
asg_max_size         = 4
asg_desired_capacity = 2

# Fixed instances (module 17)
instance_counts = {
  web          = 2    # Scale web to 2 instances
  node_commu   = 3    # Scale node_commu to 3 instances
  node_world   = 2
  mysql        = 1    # Keep mysql at 1
  mysql_mirror = 1
  # Omit a key = default 1
  # Set 0 to skip the role
}
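
The two rules in the comments (an omitted key defaults to 1, and 0 skips the role) can be illustrated with a small lookup. This is only a sketch of the described behaviour in bash 4+, not the module's actual Terraform logic; `effective_count` and the role `node_gate` are hypothetical.

```shell
#!/usr/bin/env bash
# Mirror of the instance_counts map from terraform.tfvars (bash 4+).
declare -A instance_counts=(
  [web]=2 [node_commu]=3 [node_world]=2 [mysql]=1 [mysql_mirror]=1
)

# Print the number of instances a role would get:
# the configured value if present (including 0), otherwise 1.
effective_count() {
  local role=$1
  echo "${instance_counts[$role]:-1}"
}

# Return 0 when the role is skipped (count of 0).
role_is_skipped() {
  [ "$(effective_count "$1")" -eq 0 ]
}
```

Here `effective_count node_gate` falls back to 1 because the key is absent, while setting `instance_counts[web]=0` makes `role_is_skipped web` succeed.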

Updating User Data

Important notes when changing user_data:

Danger

Module 17 sets user_data_replace_on_change = true, which means any change to scripts/*.sh will DESTROY and re-CREATE the EC2 instance. Data on its EBS volumes will be lost!
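
Because a script change can silently turn into an instance replacement, it can help to scan the plan before applying it. The sketch below reads the text output of `terraform plan -no-color` on stdin and prints the address of every resource whose plan header mentions a replacement or a destroy. The function name is hypothetical, and the matching relies on the wording of Terraform's human-readable plan headers, so treat it as a convenience check rather than a guarantee.

```shell
#!/usr/bin/env bash
# Print the addresses of resources the plan would replace or destroy.
# Input: output of `terraform plan -no-color` on stdin.
# Matches header lines such as:
#   # aws_instance.example must be replaced
#   # aws_instance.example will be destroyed
list_destructive_changes() {
  grep -E '^[[:space:]]*# [^ ]+ .*(replaced|destroyed)' \
    | sed -E 's/^[[:space:]]*# ([^ ]+) .*/\1/'
}
```

Usage: `terraform plan -no-color | list_destructive_changes`. If a MySQL instance appears in the output, take a backup before running terraform apply.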

Backup MySQL

Before applying user_data changes to DB instances:

MySQL Backup Before Redeploy
# 1. SSH to MySQL instance
ssh ec2-user@<mysql-ip>

# 2. Backup all databases
# Note: despite the file name, this dump contains data as well as schema
mysqldump --all-databases > schema_only.sql

# 3. Upload to S3
aws s3 cp schema_only.sql s3://<bucket>/docker/milu2-mysql/

# 4. (Optional) Backup users
# Create mysql_users.sql with CREATE USER IF NOT EXISTS, GRANT statements
aws s3 cp mysql_users.sql s3://<bucket>/docker/milu2-mysql/

# 5. Now safe to apply changes
terraform apply -replace='module.ec2_instances.aws_instance.this["mysql-1"]'
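
Since step 5 destroys the instance, it is worth confirming that the dump finished before uploading it. mysqldump normally ends its output with a `-- Dump completed` comment, so a truncated file can be detected by checking the last line. The function name below is hypothetical, and the check assumes comments were not disabled (for example with --skip-comments).

```shell
#!/usr/bin/env bash
# Return 0 if the dump file is non-empty and ends with mysqldump's
# trailing "-- Dump completed" comment.
# ASSUMPTION: the dump was created without --skip-comments.
dump_looks_complete() {
  local file=$1
  [ -s "$file" ] || return 1
  tail -n 1 "$file" | grep -q '^-- Dump completed'
}
```

For example: `dump_looks_complete schema_only.sql && aws s3 cp schema_only.sql s3://<bucket>/docker/milu2-mysql/`.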

Hotfix Without Redeploying

When you need a quick fix without destroying the EC2 instance:

Hotfix on EC2
# SSH to EC2
ssh ec2-user@<instance-ip>

# Edit mount-s3 service
sudo vi /etc/systemd/system/mount-s3.service

# Edit docker run script
vi /home/ec2-user/milu2/<role>/docker/run.sh

# Reload and restart
sudo systemctl daemon-reload
sudo systemctl restart mount-s3.service

# For docker containers
cd /home/ec2-user/milu2/<role>/docker
./run.sh
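
Manual edits on a live instance are easy to get wrong, so keeping a timestamped copy of each file before editing gives a quick way back. The helper below is a hypothetical sketch and not part of the existing scripts.

```shell
#!/usr/bin/env bash
# Copy a file to <file>.bak-<timestamp> (preserving mode and mtime)
# and print the path of the backup.
backup_before_edit() {
  local file=$1 stamp backup
  stamp=$(date +%Y%m%d-%H%M%S)
  backup="$file.bak-$stamp"
  cp -p "$file" "$backup" && echo "$backup"
}
```

For example, run `backup_before_edit /home/ec2-user/milu2/<role>/docker/run.sh` before opening the file in vi. Files under /etc/systemd/system are owned by root, so the copy there needs sudo.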

Checking Logs

Important log files on the EC2 instance:

File                                   Description
/var/log/cloud-init-output.log         Full output of cloud-init (user_data)
/var/log/milu2-docker-bootstrap.log    S3 sync + docker run + mysql import
/var/log/messages                      System messages
docker logs <container>                Container logs
Log Commands
# Check cloud-init status
cloud-init status

# View cloud-init logs
sudo tail -100 /var/log/cloud-init-output.log

# View docker bootstrap logs
sudo tail -100 /var/log/milu2-docker-bootstrap.log

# Check mount-s3 status
sudo systemctl status mount-s3.service

# Check docker containers
docker ps -a
docker logs milu2-api
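
When a bootstrap log is long, filtering for likely failure lines is often faster than reading the tail. The function below is a hypothetical sketch; the keyword list is an assumption and may need tuning for the messages these scripts actually emit.

```shell
#!/usr/bin/env bash
# Print the last $max (default 20) lines of a log that mention a likely
# failure, prefixed with their line numbers.
# ASSUMPTION: "error", "fail" and "denied" are useful keywords here.
scan_log_errors() {
  local file=$1 max=${2:-20}
  grep -inE 'error|fail|denied' "$file" | tail -n "$max"
}
```

For example: `sudo bash -c "$(declare -f scan_log_errors); scan_log_errors /var/log/milu2-docker-bootstrap.log"`, or copy the grep pipeline directly into a root shell.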