
Tear down of a terraform test messes up juju unit count: no longer zero-based #564

Open
sed-i opened this issue on Sep 5, 2024 · 5 comments
Labels: area/application, kind/bug, priority/normal

Comments

sed-i commented Sep 5, 2024

Description

As part of a terraform test in canonical/mimir-worker-k8s-operator#74, when a test run ends and terraform tears everything down, I noticed the following:

  • Tear down is much quicker than what I usually see from a manual juju remove-application command.
  • After teardown, the next time I run the test, the unit names in juju status do not start at 0.

Urgency

Casually reporting

Terraform Juju Provider version

0.13.0

Terraform version

Terraform v1.9.5

Juju version

3.5.3

Terraform Configuration(s)

For full context, see canonical/mimir-worker-k8s-operator#74.

run "setup_tests" {
  module {
    source = "./tests/setup"
  }
}

run "deploy_app" {
  variables {
    app_name   = "worker-${run.setup_tests.app_name_suffix}"
    model_name = "mimir5"
    channel    = "latest/edge"
    units      = 3
    trust      = true
    config = {
      role-all = "true"
    }
  }
}

run "deploy_minimal_context" {
  module {
    source = "./tests/minimal"
  }

  variables {
    app_name = "worker-${run.setup_tests.app_name_suffix}"
    model_name = "mimir5"
  }

  assert {
    condition     = juju_application.mimir_worker.name == "worker-${run.setup_tests.app_name_suffix}"
    error_message = "App name mismatch"
  }
}
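
For context, the assertions above reference juju_application.mimir_worker, which the root module presumably declares roughly as follows. This is a minimal sketch based on the juju provider's juju_application schema; the charm name and attribute wiring are assumptions, not copied from the linked PR.

# Minimal sketch (assumed, not taken from the PR) of the resource the
# run blocks above exercise.
resource "juju_application" "mimir_worker" {
  name  = var.app_name
  model = var.model_name
  units = var.units
  trust = var.trust

  charm {
    name    = "mimir-worker-k8s"
    channel = var.channel
  }

  config = var.config
}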

Reproduce / Test

terraform test

Debug/Panic Output

# Juju status
# Note that minio scale is 1, but under "Units" we have minio/2 and nothing else

Model   Controller  Cloud/Region        Version  SLA          Timestamp
mimir5  uk8s        microk8s/localhost  3.5.3    unsupported  02:12:53-04:00

App         Version                Status   Scale  Charm                  Channel      Rev  Address         Exposed  Message
coord1                             blocked      1  mimir-coordinator-k8s  latest/edge   19  10.152.183.162  no       [consistency] Missing S3 integration.
minio       res:oci-image@220b31a  active       1  minio                  latest/edge  362  10.152.183.173  no       
worker-owl  2.12.0                 waiting      3  mimir-worker-k8s       latest/edge   37  10.152.183.61   no       installing agent

Unit           Workload  Agent      Address       Ports          Message
coord1/0*      blocked   executing  10.1.157.105                 [consistency] Missing S3 integration.
minio/2*       active    idle       10.1.157.115  9000-9001/TCP  
worker-owl/0*  waiting   idle       10.1.157.104                 Waiting for coordinator to publish a config
worker-owl/1   waiting   idle       10.1.157.109                 Waiting for coordinator to publish a config
worker-owl/2   waiting   idle       10.1.157.103                 Waiting for coordinator to publish a config


# K8s, however, shows minio-0

k get all -A | grep minio
mimir5                pod/minio-operator-0                         1/1     Running   0             83s
mimir5                pod/minio-0                                  1/1     Running   0             44s
mimir5                service/minio-operator         ClusterIP   10.152.183.254   <none>        30666/TCP                86s
mimir5                service/minio                  ClusterIP   10.152.183.173   <none>        9000/TCP,9001/TCP        45s
mimir5                service/minio-endpoints        ClusterIP   None             <none>        <none>                   44s
mimir5                statefulset.apps/minio-operator   1/1     83s
mimir5                statefulset.apps/minio            1/1     44s

Notes & References

No response

Aflynn50 commented Sep 5, 2024

Could you provide a minimal reproducer here?

Aflynn50 commented Sep 5, 2024

Also, do the resources disappear eventually or not at all? The removal of the application is asynchronous to the execution of the plan.

hmlanigan added the kind/bug and priority/normal labels on Sep 5, 2024
hmlanigan (Member) commented:

@sed-i

With juju in general, if you deploy an application with X units, then remove the application and deploy it again, the unit numbers will begin at X+1.

Tear down with the juju terraform provider appears quicker than with the juju CLI because the provider doesn't wait for tear down to complete before returning. We have a number of bugs around this topic which will be addressed in the future.

sed-i commented Sep 7, 2024

@hmlanigan this is news to me. I know VM charms behave that way, but on k8s I always expect unit numbering to start at zero.

hmlanigan (Member) commented:

The unit name is determined by juju, not by the provider; the provider has no control over it.

Please provide a small plan in the bug, along with the terraform commands to run, so we can easily reproduce and debug this.
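
A minimal plan along the following lines might serve as such a reproducer. This is only a sketch: the model name and charm are placeholders, and the empty provider block assumes credentials are picked up from the local juju CLI. The idea is to apply, destroy, apply again, and then check whether the new unit in juju status starts from 0.

# Hypothetical minimal reproducer (not taken from the original report).
terraform {
  required_providers {
    juju = {
      source  = "juju/juju"
      version = "0.13.0"
    }
  }
}

# Assumes credentials come from the local juju CLI configuration.
provider "juju" {}

resource "juju_application" "minio" {
  name  = "minio"
  model = "mimir5"  # placeholder: an existing model
  units = 1
  trust = true

  charm {
    name    = "minio"
    channel = "latest/edge"
  }
}

# Steps: terraform apply; terraform destroy; terraform apply again;
# then check whether `juju status` shows minio/0 or a higher unit number.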
