
Tear down of a terraform test messes up juju unit count: no longer zero-based #564

Open
sed-i opened this issue on Sep 5, 2024 · 5 comments
Labels: area/application, kind/bug, priority/normal

Comments

sed-i commented Sep 5, 2024

Description

As part of a terraform test in canonical/mimir-worker-k8s-operator#74, when a test run ends and terraform tears everything down, I noticed the following:

  • Tear down is much quicker than what I usually see from a manual juju remove-application command.
  • After teardown, the next time I run the test, the unit names in juju status do not start at 0.

Urgency

Casually reporting

Terraform Juju Provider version

0.13.0

Terraform version

Terraform v1.9.5

Juju version

3.5.3

Terraform Configuration(s)

For full context, see canonical/mimir-worker-k8s-operator#74.

run "setup_tests" {
  module {
    source = "./tests/setup"
  }
}

run "deploy_app" {
  variables {
    app_name   = "worker-${run.setup_tests.app_name_suffix}"
    model_name = "mimir5"
    channel    = "latest/edge"
    units      = 3
    trust      = true
    config = {
      role-all = "true"
    }
  }
}

run "deploy_minimal_context" {
  module {
    source = "./tests/minimal"
  }

  variables {
    app_name = "worker-${run.setup_tests.app_name_suffix}"
    model_name = "mimir5"
  }

  assert {
    condition     = juju_application.mimir_worker.name == "worker-${run.setup_tests.app_name_suffix}"
    error_message = "App name mismatch"
  }
}
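
For context, the assertions above reference juju_application.mimir_worker, which the root module presumably declares roughly as follows. This is a minimal sketch based on the juju provider's juju_application schema; the charm name and attribute wiring are assumptions, not copied from the linked PR.

# Minimal sketch (assumed, not taken from the PR) of the resource the
# run blocks above exercise.
resource "juju_application" "mimir_worker" {
  name  = var.app_name
  model = var.model_name
  units = var.units
  trust = var.trust

  charm {
    name    = "mimir-worker-k8s"
    channel = var.channel
  }

  config = var.config
}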

Reproduce / Test

terraform test

Debug/Panic Output

# Juju status
# Note that minio scale is 1, but under "Units" we have minio/2 and nothing else

Model   Controller  Cloud/Region        Version  SLA          Timestamp
mimir5  uk8s        microk8s/localhost  3.5.3    unsupported  02:12:53-04:00

App         Version                Status   Scale  Charm                  Channel      Rev  Address         Exposed  Message
coord1                             blocked      1  mimir-coordinator-k8s  latest/edge   19  10.152.183.162  no       [consistency] Missing S3 integration.
minio       res:oci-image@220b31a  active       1  minio                  latest/edge  362  10.152.183.173  no       
worker-owl  2.12.0                 waiting      3  mimir-worker-k8s       latest/edge   37  10.152.183.61   no       installing agent

Unit           Workload  Agent      Address       Ports          Message
coord1/0*      blocked   executing  10.1.157.105                 [consistency] Missing S3 integration.
minio/2*       active    idle       10.1.157.115  9000-9001/TCP  
worker-owl/0*  waiting   idle       10.1.157.104                 Waiting for coordinator to publish a config
worker-owl/1   waiting   idle       10.1.157.109                 Waiting for coordinator to publish a config
worker-owl/2   waiting   idle       10.1.157.103                 Waiting for coordinator to publish a config


# K8s, however, shows minio-0

k get all -A | grep minio
mimir5                pod/minio-operator-0                         1/1     Running   0             83s
mimir5                pod/minio-0                                  1/1     Running   0             44s
mimir5                service/minio-operator         ClusterIP   10.152.183.254   <none>        30666/TCP                86s
mimir5                service/minio                  ClusterIP   10.152.183.173   <none>        9000/TCP,9001/TCP        45s
mimir5                service/minio-endpoints        ClusterIP   None             <none>        <none>                   44s
mimir5                statefulset.apps/minio-operator   1/1     83s
mimir5                statefulset.apps/minio            1/1     44s

Notes & References

No response

Aflynn50 commented Sep 5, 2024

Could you provide a minimal reproducer here?

Aflynn50 commented Sep 5, 2024

Also, do the resources disappear eventually or not at all? The removal of the application is asynchronous to the execution of the plan.

hmlanigan added the kind/bug and priority/normal labels on Sep 5, 2024
hmlanigan (Member) commented:

@sed-i

With juju in general, if you deploy an application with X units, then remove the application and deploy it again, the unit numbers will begin at X+1.

Tear down with the juju terraform provider appears quicker than with the juju CLI because the provider doesn't wait for tear down to complete before returning. We have a number of bugs around this topic which will be addressed in the future.

sed-i commented Sep 7, 2024

@hmlanigan this is news to me. I know VM charms behave that way, but on k8s I always expect unit numbering to start at zero.

hmlanigan (Member) commented:

The unit name is determined by juju, not by the provider; the provider has no control over it.

Please provide a small plan in the bug, along with the terraform commands to run, so we can easily reproduce and debug this.
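
A minimal plan along the following lines might serve as such a reproducer. This is only a sketch: the model name and charm are placeholders, and the empty provider block assumes credentials are picked up from the local juju CLI. The idea is to apply, destroy, apply again, and then check whether the new unit in juju status starts from 0.

# Hypothetical minimal reproducer (not taken from the original report).
terraform {
  required_providers {
    juju = {
      source  = "juju/juju"
      version = "0.13.0"
    }
  }
}

# Assumes credentials come from the local juju CLI configuration.
provider "juju" {}

resource "juju_application" "minio" {
  name  = "minio"
  model = "mimir5"  # placeholder: an existing model
  units = 1
  trust = true

  charm {
    name    = "minio"
    channel = "latest/edge"
  }
}

# Steps: terraform apply; terraform destroy; terraform apply again;
# then check whether `juju status` shows minio/0 or a higher unit number.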
