The startup script runs on every boot but the metadata_startup_script
field is in lifecycle.ignore_changes — so a TF apply that changed
image_tag does NOT reach a long-lived VM until someone explicitly
recreates it. Meanwhile, operators commonly hand-edit /opt/agnes/.env
to pin a specific image (custom branch builds, staged rollouts).
Pre-fix, every boot rewrote .env from the baked-in template and
clobbered the operator's choice — concretely, a stop+start triggered
by a machine_type change would reset AGNES_TAG to whatever was in
the template at first provision, regardless of the operator's
intervening edit.
Now the script reads the existing .env (when present) for AGNES_TAG
and AGNES_TEMP_DIR; when those keys are set, the existing values
win over the template-computed ones. Logged on stdout when AGNES_TAG
disagrees with $IMAGE_TAG so an operator audit-trails the boot.
Fresh provisions are unchanged (no .env yet → template values land).
To force a TF-driven reset on an existing VM: rm /opt/agnes/.env
and reboot. Cut as infra-v1.8.0 — additive, downstream consumers
opt in by bumping the module ref.