Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removing default tags fails to untag resources on v6 #2633

Closed
Tracked by #2539
danielrbradley opened this issue Jul 24, 2023 · 16 comments
Closed
Tracked by #2539

Removing default tags fails to untag resources on v6 #2633

danielrbradley opened this issue Jul 24, 2023 · 16 comments
Assignees
Labels
kind/bug Some behavior is incorrect or out of spec p1 A bug severe enough to be the next item assigned to an engineer resolution/fixed This issue was fixed
Milestone

Comments

@danielrbradley
Copy link
Member

The final step (5) in the test TestAccDefaultTagsGo is currently failing in CI but passes locally.

We're disabling the test temporarily to unblock the release but should diagnose the failure as a matter of urgency post-release.

Step 5 is asserting that when all default tags are removed from the provider, then they're also removed from all resources. This fails and the previous default tags remain on the resources.

@danielrbradley danielrbradley added kind/bug Some behavior is incorrect or out of spec area/tests p1 A bug severe enough to be the next item assigned to an engineer impact/flaky-test A test that is unreliable labels Jul 24, 2023
danielrbradley added a commit that referenced this issue Jul 24, 2023
- Add more feedback to what we're doing and why.
- Explain how the default tags are applied by the bridge.
- Disable step 5 of the test until we resolve #2633
@danielrbradley
Copy link
Member Author

@mikhailshilkov
Copy link
Member

@t0yv0 I assigned to you since you were looking at it yesterday - please let me know if you don't have time to continue

@t0yv0
Copy link
Member

t0yv0 commented Jul 25, 2023

I can reproduce this issue locally on the latest v6 branch.

I have been digging into this issue.

The issue appears to be an incorrect resource Diff when moving from a state that defines some provider-default tags to a state where the state defines no provider-default tags.

Adding or removing some but not all default tags, or changing tags works as expected.

@t0yv0
Copy link
Member

t0yv0 commented Jul 25, 2023

When this feature does work, we get a non-empty InstanceDiff; the fix for default tags looked at how we used to suppress non-empty diffs for computed attributes and we stopped doing that for XComputedInput to make an exception for tags_all. However in this case that's not happening at all, we even don't get to this call site:

  • on v6 we use simpleDiffViaPlanState
  • as inputs to this we have tags_all in prior state set to {"thwomp": "pow"} but nil in rawConfig
  • when we compute objchange.ProposedNew(priorState, rawConfig) it copies {"thwomp": "pow"} to planned state; this seems bad
  • res.SimpleDiff returns empty diff because it's comparing {"thwomp": "pow"} to {"thwomp": "pow"}
  • therefore forceNew detection fails to find anything
  • therefore provider responds with Diff: None
  • therefore bucket does not get updated, Update does not execute

@t0yv0 t0yv0 removed impact/flaky-test A test that is unreliable area/tests labels Jul 25, 2023
@t0yv0
Copy link
Member

t0yv0 commented Jul 25, 2023

I'm removing flaky/test labels AFAIK this is a legit product bug.

@t0yv0
Copy link
Member

t0yv0 commented Jul 25, 2023

On second thought, when this does work, like adding a new tag, we also copy {"thwomp": "pow"} to the inputs of res.SimpleDiff however res.SimpleDiff responds with a correct diff. It figures out there's a new tag.. So somehow SimpleDiff engages TF hooks here. I need to dig deeper.

@t0yv0 t0yv0 changed the title v6 default tags test failing in CI Removing v6 default tags fails to untag resources Jul 25, 2023
@t0yv0 t0yv0 changed the title Removing v6 default tags fails to untag resources Removing default tags fails to untag resources on v6 Jul 25, 2023
@t0yv0 t0yv0 assigned iwahbe and unassigned t0yv0 Jul 26, 2023
@mnlumi mnlumi mentioned this issue Jul 26, 2023
34 tasks
@iwahbe
Copy link
Member

iwahbe commented Jul 27, 2023

res.SimpleDiff eventually calls res.CustomizeDiff which is SetTagsDiff. SetTagsDiff is responsible for utilizing the provider's state to determine what the diff should be. Unfortunately, it looks like SetTagsDiff only works when there is a non-empty tag set. I don't think the special TF hook is what we are missing.

@t0yv0
Copy link
Member

t0yv0 commented Jul 27, 2023

One laborious, but possible avenue of investigation is this. Does this example work when executed via TF CLI in TF world? If it does work, does that end up invoking SetTagsDiff, or how does it ensure the end result?

@mikhailshilkov
Copy link
Member

Before we look at TF directly - Do I remember correctly that this test works in 5.x of our provider? If so, do we know what changed?

@iwahbe
Copy link
Member

iwahbe commented Jul 28, 2023

@t0yv0 The example does work in the TF world. I confirmed that before investing too much time in fixing the bug on our end.

@mikhailshilkov hashicorp/terraform-provider-aws#30793 changed how SetTagsDiff works. We can revert, but their PR fixed other bugs so I'm hesitant to back it out.

@iwahbe
Copy link
Member

iwahbe commented Jul 28, 2023

I've been experimenting with running terraform-provider-aws directly against TF. I've been using this program:

provider "aws" {
 region = "us-east-1"
-  default_tags { tags = { "foo": "bar" } }
+  # default_tags { tags = { "foo": "bar" } }
}
resource "aws_s3_bucket" "example" {}

This calls into SetTagsDiff as expected, but follows a different execution path then expected. diff.HasChange("tags") returns true, so the diff is recorded here.

This is different to how pulumi-aws runs through SetTagsDiff with an equivalent program. For us diff.HasChange("tags") returns false. This is sufficient to cause the bug we are observing.

Investigating further, we don't appear to set tags at all:

Running this program against v5 or v6 of pulumi-aws:

name: secret-random-yaml
runtime: yaml
resources:
  p:
    type: pulumi:providers:aws
    properties:
      defaultTags:
        tags:
          foo: buz
      region: us-west-2
    defaultProvider: true
  b:
    type: aws:s3:BucketV2
outputs:
  tags: ${b.tags}

We get an empty tags output on both versions:

𝛌 pulumi stack output tags
{}

@t0yv0
Copy link
Member

t0yv0 commented Jul 28, 2023

Excellent sleuthing. This is very surprising and violates my mental model of what's happening. I thought that in the listed TF program "tags" is in fact not changing. I assume "tags" refers to the tags attribute of "example" bucket, not the provider. Do you have a test harness around to reproduce this result? Tracing the contents of diff *schema.ResourceDiff and/or the inputs into the diff computation can help understand how it's doing it.

@iwahbe
Copy link
Member

iwahbe commented Jul 28, 2023

More digging makes it even less cut and dry. Given the above program, the initial terraform apply gives this diff (tags_all but no tags):

  # aws_s3_bucket.example will be created
  + resource "aws_s3_bucket" "example" {
      + acceleration_status         = (known after apply)
      + acl                         = (known after apply)
      + arn                         = (known after apply)
      + bucket                      = (known after apply)
      + bucket_domain_name          = (known after apply)
      + bucket_prefix               = (known after apply)
      + bucket_regional_domain_name = (known after apply)
      + force_destroy               = false
      + hosted_zone_id              = (known after apply)
      + id                          = (known after apply)
      + object_lock_enabled         = (known after apply)
      + policy                      = (known after apply)
      + region                      = (known after apply)
      + request_payer               = (known after apply)
      + tags_all                    = {
          + "foo" = "bar"
        }
      + website_domain              = (known after apply)
      + website_endpoint            = (known after apply)
    }

However, running terraform plan directly after that (on the same program) does create a diff on tags:

𝛌 terraform plan 
aws_s3_bucket.example: Refreshing state... [id=terraform-20230728140713668600000001]

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the last "terraform apply" which may have affected this plan:

  # aws_s3_bucket.example has changed
  ~ resource "aws_s3_bucket" "example" {
        id                          = "terraform-20230728140713668600000001"
      + tags                        = {}
        # (11 unchanged attributes hidden)

        # (3 unchanged blocks hidden)
    }


Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan
may include actions to undo or respond to these changes.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Changes to Outputs:
  + tags = {}

You can apply this plan to save these new output values to the Terraform state, without changing any real infrastructure.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if you run "terraform
apply" now.

Commenting out default_tags { tags = { "foo": "bar" } } and running terraform plan we see this diff (note the out of band tags addition):

𝛌 terraform plan 
aws_s3_bucket.example: Refreshing state... [id=terraform-20230728140713668600000001]

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the last "terraform apply" which may have affected this plan:

  # aws_s3_bucket.example has changed
  ~ resource "aws_s3_bucket" "example" {
        id                          = "terraform-20230728140713668600000001"
      + tags                        = {
          + "foo" = "bar"
        }
        # (11 unchanged attributes hidden)

        # (3 unchanged blocks hidden)
    }


Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan
may include actions to undo or respond to these changes.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # aws_s3_bucket.example will be updated in-place
  ~ resource "aws_s3_bucket" "example" {
        id                          = "terraform-20230728140713668600000001"
      ~ tags                        = {
          - "foo" = "bar" -> null
        }
      ~ tags_all                    = {
          - "foo" = "bar"
        } -> (known after apply)
        # (10 unchanged attributes hidden)

        # (3 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

Changes to Outputs:
  + tags = {}

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if you run "terraform
apply" now.

@iwahbe
Copy link
Member

iwahbe commented Jul 28, 2023

Another interesting note, where terraform plan results in an out of bound tags change:

  # aws_s3_bucket.example has changed
  ~ resource "aws_s3_bucket" "example" {
        id                          = "terraform-20230728140713668600000001"
      + tags                        = {
          + "foo" = "bar"
        }
        # (11 unchanged attributes hidden)

        # (3 unchanged blocks hidden)
    }

And then leads to this diff:

  # aws_s3_bucket.example will be updated in-place
  ~ resource "aws_s3_bucket" "example" {
        id                          = "terraform-20230728140713668600000001"
      ~ tags                        = {
          - "foo" = "bar" -> null
        }
      ~ tags_all                    = {
          - "foo" = "bar"
        } -> (known after apply)
        # (10 unchanged attributes hidden)

        # (3 unchanged blocks hidden)
    }

If we run terraform plan -refresh=false, we see this instead:

𝛌 terraform plan -refresh=false 

No changes. Your infrastructure matches the configuration.

Terraform has compared your real infrastructure against your configuration and found no differences, so no changes are needed.

@iwahbe
Copy link
Member

iwahbe commented Jul 28, 2023

Running pulumi up --refresh=true does not duplicate terraforms behavior with regards to noticing a tags diff and then rectifying it.

@mikhailshilkov mikhailshilkov added this to the 0.92 milestone Aug 1, 2023
@iwahbe iwahbe added the resolution/fixed This issue was fixed label Aug 4, 2023
@iwahbe
Copy link
Member

iwahbe commented Aug 4, 2023

Closed by #2655

@iwahbe iwahbe closed this as completed Aug 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Some behavior is incorrect or out of spec p1 A bug severe enough to be the next item assigned to an engineer resolution/fixed This issue was fixed
Projects
None yet
Development

No branches or pull requests

4 participants