Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS Peering state mismatch after upgrade to 3.0.1 (UserConnect-7.0.1373) #1532

Open
dxciberaws opened this issue Jan 10, 2023 · 1 comment
Open

Comments

@dxciberaws
Copy link

dxciberaws commented Jan 10, 2023

After controller and gateways update to UserConnect-7.0.1373 and TF plugin upgraded to 3.0.1, the existing TF AWS peering resources that were previously defined are found to not match the exisiting configured peerings about their route table lists.

A terraform apply command results in all AWS peering resources requiring replacement to update their route table lists.

If the terraform apply is responded with yes the resources are destroyed and created again, but a next terraform apply command will again require to recreate all of them for the same mismatch of route table lists.

Steps to reproduce

  • Under UserConnect-6.9, define an AWS peering resource with route tables lists, particularly if one of them is defined with a single "all" element.
  • Run terraform apply so the resource is created
  • Run terraform apply again to check that no changes are required
  • Update controller to UserConnect-7.0 and Terraform plugin to 3.0.1
  • Execute terraform apply. The AWS peering resource route table lists are detected to not match. The "all" is requested to be added an each of the elements of the expanded list of route tables is requested to be removed.
  • Answer yes and let Terraform destroy the AWS peering and create it again
  • Run terraform apply again. The same mismatch of state and resources is reported and the resource is marked for replacement again.

Expected behavior

Running a terraform apply results in no changes required

Additional data / screenshots

Sample output for one of the defined AWS peering resources (all existing are affected by the issue):

  # aviatrix_aws_peer.sec-sbx must be replaced
-/+ resource "aviatrix_aws_peer" "sec-sbx" {
      ~ id            = "vpc-66b02c0d~vpc-01918fab5e6792eba" -> (known after apply)
      ~ rtb_list1     = [ # forces replacement
          - "rtb-db434eb0",
          - "rtb-3a424f51",
            "rtb-5c404d37",
            "rtb-86404ded",
          + "rtb-db434eb0",
          + "rtb-3a424f51",
        ]
      ~ rtb_list2     = [ # forces replacement
          - "rtb-086467e3ce4b50187",
          - "rtb-028657d2127b7458b",
          - "rtb-0b45932b5b6188479",
          - "rtb-0ffe6b88a7e9ae1d8",
          - "rtb-02a64f885b9b1a594",
          - "rtb-0296c8da84d8ec5c1",
          - "rtb-063ce2461a82cc3b0",
          - "rtb-00fb6b53c6aa41906",
          - "rtb-032b125694f120d75",
          + "all",
        ]
        # (6 unchanged attributes hidden)
    }

Plan: 7 to add, 0 to change, 7 to destroy.

NOTICE how two route table IDs are selected to be removed and also added, and that "all" is being compared against the expanded list of all route table IDs

Environment:

  • Provider version: 3.0.1
  • Terraform version: Terraform v1.0.3 on windows_amd64
  • provider registry.terraform.io/aviatrixsystems/aviatrix v3.0.1
  • provider registry.terraform.io/hashicorp/aws v4.49.0
  • Aviatrix Controller version: UserConnect-7.0.1373

Additional context

No updates to defined resources are possible without incurring in the destruction and re-creation of all the existing AWS peerings

A change in the list_aws_peering behaviour may be the root cause. As observed in the code, the route tables lists are obtained from the list_aws_peering API call. The terraform provider code does not sort or process the resulting list of values in any way, and does not treat the attribute as a list in the way that the comparison operation is performed (should be comparing all elements). A change in the way the API returns the values, such as ordering them differently would result in the provider incorrectly detecting that the values are different while they are in fact the same.

On the other hand, the provider is not updating the state with the new list_aws_peering API result, as in a next execution of terraform apply the issue happens again.

The resource creation API call seems to produce the list of elements in a different order than the list_aws_peering API, hence making the created state not match the refreshed state.

On creation of the resource a deferred read of the resource from the controller was introduced:

	flag := false
	defer resourceAviatrixAWSPeerReadIfRequired(d, meta, &flag)

	_, err := client.CreateAWSPeer(awsPeer)
	if err != nil {
		return fmt.Errorf("failed to create Aviatrix AWSPeer: %s", err)
	}

	return resourceAviatrixAWSPeerReadIfRequired(d, meta, &flag)
}

func resourceAviatrixAWSPeerReadIfRequired(d *schema.ResourceData, meta interface{}, flag *bool) error {
	if !(*flag) {
		*flag = true
		return resourceAviatrixAWSPeerRead(d, meta)
	}
	return nil
}

The logic there sets flag=false which causes that the state is then read. If the resource exists, its details are are read, if not nil is returned (

).

Then at the end of the block the resource read is repeated, but this time flag will always be true both if the resource existed or not, causing that this last invocation returns always nil and the resource's data is not read again, so the route table list attribute is not expanded and remains with the "all" value.

In our example, we verified that rtblist1 is affected by a change in the order of the elements. Changing the definition of the resource in the TF file to put the first two values last, fixed the issue for those, as seen in our test:

  # aviatrix_aws_peer.sec-sbx must be replaced
-/+ resource "aviatrix_aws_peer" "sec-sbx" {
      ~ id            = "vpc-66b02c0d~vpc-01918fab5e6792eba" -> (known after apply)
      ~ rtb_list2     = [ # forces replacement
          - "rtb-086467e3ce4b50187",
          - "rtb-028657d2127b7458b",
          - "rtb-0b45932b5b6188479",
          - "rtb-0ffe6b88a7e9ae1d8",
          - "rtb-02a64f885b9b1a594",
          - "rtb-0296c8da84d8ec5c1",
          - "rtb-063ce2461a82cc3b0",
          - "rtb-00fb6b53c6aa41906",
          - "rtb-032b125694f120d75",
          + "all",
        ]
        # (7 unchanged attributes hidden)
    }

The original state (from the previous version) contains the route table values in the order they are defined in the TF file, while now the controller returns them in a different order. Terraform does not apply any ordering to the list and assumes that the order may be significant to the resource's definition, causing the difference to be marked as a mismatch. It is the responsibility of the provider to handle or not a different order of values and provide Terraform with a consistent result to be stored in the state.

@dxciberaws dxciberaws changed the title Peering state mismatch after upgrade to 3.0.1 (UserConnect-7.0.1373) AWS Peering state mismatch after upgrade to 3.0.1 (UserConnect-7.0.1373) Jan 10, 2023
@dxciberaws
Copy link
Author

Bug is confirmed in support's lab. As of now a workaround is to include

lifecycle { ignore_changes = [rtb_list1, rtb_list2]}

In the resource's definition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant