From f52cfd1119186836059ff7df04fce3169238567f Mon Sep 17 00:00:00 2001
From: ZdenekSrotyr <139972147+ZdenekSrotyr@users.noreply.github.com>
Date: Thu, 7 May 2026 06:58:10 +0200
Subject: [PATCH] infra(customer-instance): allow stopping VMs for in-place updates (#211)

Add allow_stopping_for_update=true on google_compute_instance.vm.
Without it, a TF change to machine_type triggers ForceNew (destroy +
recreate); with it, the provider stops + mutates + restarts the VM in
place, which is what an operator resizing a running deployment expects.

Tag as infra-v1.7.0; consumers opt in by bumping the module ref.
---
 CHANGELOG.md                            | 4 ++++
 infra/modules/customer-instance/main.tf | 9 +++++++++
 2 files changed, 13 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index e02797d..4f43ff6 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,6 +10,10 @@ CalVer image tags (`stable-YYYY.MM.N`, `dev-YYYY.MM.N`) are produced for every C
 
 ## [Unreleased]
 
+### Internal
+
+- `infra/modules/customer-instance` (tag `infra-v1.7.0`): `google_compute_instance.vm` now sets `allow_stopping_for_update = true`. Without it, changing `machine_type` (or any other field GCP will only mutate on a stopped VM) caused Terraform to fall back to a destroy + recreate, churning VM-local state for what should be an in-place resize. Consumers do not need to update — the field is provider-side only — but bumping the module ref to `infra-v1.7.0` enables in-place machine-type bumps.
+
 ## [0.43.0] — 2026-05-06
 
 ### Added
diff --git a/infra/modules/customer-instance/main.tf b/infra/modules/customer-instance/main.tf
index 3666ad3..a162857 100644
--- a/infra/modules/customer-instance/main.tf
+++ b/infra/modules/customer-instance/main.tf
@@ -184,6 +184,15 @@ resource "google_compute_instance" "vm" {
   zone = var.zone
   tags = ["agnes-${var.customer_name}"]
 
+  # Without this, a `machine_type` change in TF triggers a full
+  # ForceNew (destroy + recreate) of the VM. The data disk would
+  # survive (it's a separate `attached_disk`), but VM-local state
+  # — fingerprints, journald, ephemeral caches — would not. With
+  # `true`, the provider stops the VM, mutates the field, and
+  # restarts it in place, which is what an operator resizing a
+  # running deployment actually wants.
+  allow_stopping_for_update = true
+
   boot_disk {
     initialize_params {
       image = "ubuntu-os-cloud/ubuntu-2404-lts-amd64"