
issue triggering upgrade #412

Open
ysineil opened this issue Aug 14, 2024 · 7 comments

Labels
bug Bug

Comments

@ysineil

ysineil commented Aug 14, 2024

Describe the bug

Issue upgrading a workload cluster from 1.25 to 1.26; the operation fails with the error below:

tanzu-mission-control_tanzu_kubernetes_cluster.tkgs_cluster[0]: Refreshing state... [id=pct-ha-a/pct-ha-a-1547/pct-qa-mlai]
Planning failed. Terraform encountered an error while generating this plan.

│ Error: Request cancelled

│ with tanzu-mission-control_tanzu_kubernetes_cluster.tkgs_cluster[0],
│ on main.tf line 219, in resource "tanzu-mission-control_tanzu_kubernetes_cluster" "tkgs_cluster":
│ 219: resource "tanzu-mission-control_tanzu_kubernetes_cluster" "tkgs_cluster" {

│ The plugin.(*GRPCProvider).ReadResource request was cancelled.

Stack trace from the terraform-provider-tanzu-mission-control_v1.4.5 plugin:
panic: runtime error: index out of range [3] with length 3
goroutine 44 [running]:
github.com/vmware/terraform-provider-tanzu-mission-control/internal/resources/tanzukubernetescluster.removeUnspecifiedNodePoolsOverrides({0xc00069bd80?, 0x4, 0x1c5e659?}, 0xc0005a0ff0)
github.com/vmware/terraform-provider-tanzu-mission-control/internal/resources/tanzukubernetescluster/helper.go:402 +0x3c5
github.com/vmware/terraform-provider-tanzu-mission-control/internal/resources/tanzukubernetescluster.resourceTanzuKubernetesClusterRead({0x1f711a0, 0xc000b08c00}, 0xc00092b580, {0x1b65c40?, 0xc000b04180})
github.com/vmware/terraform-provider-tanzu-mission-control/internal/resources/tanzukubernetescluster/resource_tanzu_kuberenetes_cluster.go:154 +0x57e
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).read(0x1f711a0?, {0x1f711a0?, 0xc000b08c00?}, 0xd?, {0x1b65c40?, 0xc000b04180?})
github.com/hashicorp/terraform-plugin-sdk/[email protected]/helper/schema/resource.go:719 +0x87
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*Resource).RefreshWithoutUpgrade(0xc00035db20, {0x1f711a0, 0xc000b08c00}, 0xc0004d9450, {0x1b65c40, 0xc000b04180})
github.com/hashicorp/terraform-plugin-sdk/[email protected]/helper/schema/resource.go:1015 +0x585
github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema.(*GRPCProviderServer).ReadResource(0xc000576570, {0x1f710f8?, 0xc0004b4480?}, 0xc0004b4500)
github.com/hashicorp/terraform-plugin-sdk/[email protected]/helper/schema/grpc_provider.go:613 +0x4a5
github.com/hashicorp/terraform-plugin-go/tfprotov5/tf5server.(*server).ReadResource(0xc00047e820, {0x1f711a0?, 0xc000b08210?}, 0xc000b040c0)
github.com/hashicorp/[email protected]/tfprotov5/tf5server/server.go:746 +0x43d
github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/tfplugin5._Provider_ReadResource_Handler({0x1b802a0?, 0xc00047e820}, {0x1f711a0, 0xc000b08210}, 0xc00015a1c0, 0x0)
github.com/hashicorp/[email protected]/tfprotov5/internal/tfplugin5/tfplugin5_grpc.pb.go:349 +0x170
google.golang.org/grpc.(*Server).processUnaryRPC(0xc0002241e0, {0x1f78078, 0xc000007380}, 0xc0004fa000, 0xc0005840c0, 0x2d70050, 0x0)
google.golang.org/[email protected]/server.go:1335 +0xdf0
google.golang.org/grpc.(*Server).handleStream(0xc0002241e0, {0x1f78078, 0xc000007380}, 0xc0004fa000, 0x0)
google.golang.org/[email protected]/server.go:1712 +0xa2f
google.golang.org/grpc.(*Server).serveStreams.func1.1()
google.golang.org/[email protected]/server.go:947 +0xca
created by google.golang.org/grpc.(*Server).serveStreams.func1
google.golang.org/[email protected]/server.go:958 +0x15c
Error: The terraform-provider-tanzu-mission-control_v1.4.5 plugin crashed!
This is always indicative of a bug within the plugin. It would be immensely
helpful if you could report the crash with the plugin's maintainers so that it
can be fixed. The output above should help diagnose the issue.
::debug::{"message":"command terminated with non-zero exit code: error executing command [sh -e /__w/_temp/7d55ac10-5a43-11ef-980e-43d815da43f2.sh], exit code 1","details":{"causes":[{"reason":"ExitCode","message":"1"}]}}
Error: Error: failed to run script step: command terminated with non-zero exit code: error executing command [sh -e /__w/_temp/7d55ac10-5a43-11ef-980e-43d815da43f2.sh], exit code 1
Error: Process completed with exit code 1.
Error: Executing the custom container implementation failed. Please contact your self hosted runner administrator.

Reproduction steps

  1. Attempt upgrade from 1.25 to 1.26

...

Expected behavior

The upgrade should complete successfully. This appears to be a new bug in provider version 1.4.5.

Additional context

No response

@ysineil ysineil added the bug Bug label Aug 14, 2024
@warroyo
Collaborator

warroyo commented Aug 14, 2024

Which TKr versions exactly are being used here? I'm trying to replicate the issue.

@ysineil
Author

ysineil commented Aug 14, 2024

Going from v1.25.7+vmware.3-fips.1-tkg.1 to v1.26.5+vmware.2-fips.1-tkg.1

@warroyo
Collaborator

warroyo commented Aug 14, 2024

OK, I was able to replicate it; it's not exactly upgrade related. Can you check whether this cluster has a node pool that was deleted outside of Terraform? I tried a few scenarios, and the crash seems to occur when the node pool list returned by the TMC API does not match the node pool list the provider expects. My suspicion is that Terraform thinks there's an extra node pool that doesn't actually exist in TMC.
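
Roughly, the failure mode looks like the minimal sketch below. This is only an illustration of the index-out-of-range panic (the `nodePool` type and `applyOverrides` function are hypothetical, not the provider's actual `removeUnspecifiedNodePoolsOverrides` code): the state carries four node pools while the API only returns three, and positional indexing into the shorter slice panics.

```go
package main

import "fmt"

// nodePool is a stand-in for the provider's node pool model (hypothetical).
type nodePool struct {
	Name string
}

// applyOverrides walks the node pools recorded in Terraform state and touches
// the corresponding entry returned by the API. Indexing by position assumes
// both slices have the same length, which is exactly the assumption that
// breaks when a node pool was deleted outside of Terraform.
func applyOverrides(stateNodePools, apiNodePools []nodePool) {
	for i := range stateNodePools {
		// Panics when len(apiNodePools) < len(stateNodePools).
		fmt.Printf("overriding %s\n", apiNodePools[i].Name)
	}
}

func main() {
	// Four node pools in state, but only three still exist in TMC.
	state := []nodePool{{"np-0"}, {"np-1"}, {"np-2"}, {"np-3"}}
	api := []nodePool{{"np-0"}, {"np-1"}, {"np-2"}}

	applyOverrides(state, api) // panic: runtime error: index out of range [3] with length 3
}
```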

@ysineil
Author

ysineil commented Aug 14, 2024

Interesting. Yes, you're correct: there's an extra node pool in the state file.

Thank you for responding so quickly!

@warroyo
Collaborator

warroyo commented Aug 14, 2024

No problem. I think this is something we should handle gracefully rather than panic 😄, so we will keep looking into it, but removing the extra node pool from the state should get you past this.
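
For context, the kind of defensive handling meant here could look roughly like the sketch below. This is an assumption about one possible approach, not the change actually made in the provider: match node pools by name and skip entries missing from the API response instead of indexing by position.

```go
package main

import "fmt"

// nodePool is a stand-in for the provider's node pool model (hypothetical).
type nodePool struct {
	Name string
}

// applyOverridesSafely indexes the API response by node pool name, so a node
// pool that exists only in state is skipped instead of causing a panic.
func applyOverridesSafely(stateNodePools, apiNodePools []nodePool) {
	byName := make(map[string]nodePool, len(apiNodePools))
	for _, np := range apiNodePools {
		byName[np.Name] = np
	}
	for _, np := range stateNodePools {
		apiNP, ok := byName[np.Name]
		if !ok {
			// Present in state but missing from the API response
			// (e.g. deleted outside of Terraform): skip it.
			fmt.Printf("node pool %s not found in API response, skipping\n", np.Name)
			continue
		}
		fmt.Printf("overriding %s\n", apiNP.Name)
	}
}

func main() {
	state := []nodePool{{"np-0"}, {"np-1"}, {"np-2"}, {"np-3"}}
	api := []nodePool{{"np-0"}, {"np-1"}, {"np-2"}}

	applyOverridesSafely(state, api) // completes without panicking
}
```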

@ysineil
Author

ysineil commented Aug 14, 2024

Absolutely. I deleted the reference in the state file and it's all happy now. Thanks again.

@warroyo
Collaborator

warroyo commented Aug 14, 2024

Tracking this here: #413
