Unable to create multiple node pools #1397

Open
krzkowalczyk opened this issue Sep 8, 2022 · 2 comments
Labels
bug (Something isn't working) · question (Further information is requested) · triaged (Scoped and ready for work)

Comments

@krzkowalczyk

TL;DR

It is not possible to create a GKE cluster with multiple node pools at once.
It only works if we create the GKE cluster with a single node pool and add the additional ones later.

Expected behavior

All defined node pools are created successfully.

Observed behavior

When I try to create a new cluster containing multiple node pools, terraform plan or terraform apply fails with the following error:


│ Error: Invalid for_each argument
│ 
│   on .terraform/modules/gke/modules/private-cluster/cluster.tf line 311, in resource "google_container_node_pool" "pools":
│  311:   for_each = local.node_pools
│     ├────────────────
│     │ local.node_pools is a map of map of string, known only after apply
│ 
│ The "for_each" map includes keys derived from resource attributes that cannot be determined until apply, and so Terraform cannot determine the full
│ set of keys that will identify the instances of this resource.
│ 
│ When working with unknown values in for_each, it's better to define the map keys statically in your configuration and place apply-time results only in
│ the map values.
│ 
│ Alternatively, you could use the -target planning option to first apply only the resources that the for_each value depends on, and then apply a second
│ time to fully converge.

If I define just one node pool, create the cluster, and later add a second pool, Terraform creates the missing pool without problems.
The problem occurs only when I try to create the GKE cluster and multiple node pools at once.
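Following the error message's own -target suggestion, one possible workaround (a sketch, assuming the computed dependency is the node pool service account referenced in the configuration below) would be a two-step apply:

terraform apply -target=google_service_account.node_pool_service_account
terraform apply

As the error notes, this only defers the problem for this plan; keeping the for_each keys derived from values known at plan time is the cleaner fix.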

Terraform Configuration

module "gke" {
  source                     = "terraform-google-modules/kubernetes-engine/google//modules/private-cluster"
  version                    = "23.0.0"
  project_id                 = var.project_id
  name                       = "${var.project_id}-gke-euwest1-main"
  region                     = var.region
  zones                      = ["${var.region}-d", "${var.region}-b", "${var.region}-c"]
  network                    = data.terraform_remote_state.net.outputs.vpc_name
  subnetwork                 = "${var.project_id}-euwest1-net-1"
  ip_range_pods              = "${var.project_id}-euwest1-net-1-sbt-pod-range"
  ip_range_services          = "${var.project_id}-euwest1-net-1-sbt-service-range"
  http_load_balancing        = true
  network_policy             = true
  horizontal_pod_autoscaling = false
  filestore_csi_driver       = false
  enable_private_endpoint    = false
  enable_private_nodes       = true
  master_ipv4_cidr_block     = "172.18.0.0/28"
  release_channel            = "STABLE"
  remove_default_node_pool   = true

  node_pools = [
    {
      name               = "node-pool-${var.region}-b"
      machine_type       = "e2-medium"
      node_locations     = "${var.region}-b"
      min_count          = 1
      max_count          = 3
      local_ssd_count    = 0
      spot               = false
      disk_size_gb       = 60
      disk_type          = "pd-standard"
      image_type         = "COS_CONTAINERD"
      enable_gcfs        = false
      enable_gvnic       = false
      auto_repair        = true
      auto_upgrade       = true
      service_account    = google_service_account.node_pool_service_account.email
      preemptible        = false
      initial_node_count = 1
    },
    {
      name               = "node-pool-${var.region}-c"
      machine_type       = "n2-standard-2"
      node_locations     = "${var.region}-c"
      min_count          = 1
      max_count          = 3
      local_ssd_count    = 0
      spot               = false
      disk_size_gb       = 60
      disk_type          = "pd-standard"
      image_type         = "COS_CONTAINERD"
      enable_gcfs        = false
      enable_gvnic       = false
      auto_repair        = true
      auto_upgrade       = true
      service_account    = google_service_account.node_pool_service_account.email
      preemptible        = false
      initial_node_count = 1
    },
  ]

  node_pools_oauth_scopes = {
    all = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]
  }

  node_pools_labels = {
    all = {}
    "node-pool-${var.region}-b" = {
      zone = "${var.region}-b"
    },
    "node-pool-${var.region}-c" = {
      zone = "${var.region}-c"
    }
  }

  node_pools_metadata = {
    all = {}

    "node-pool-${var.region}-b" = {
      zone = "${var.region}-b"
    },
    "node-pool-${var.region}-c" = {
      zone = "${var.region}-c"
    }
  }

  node_pools_taints = {
    all = []
  }

  node_pools_tags = {
    all = []
  }
}

Terraform Version

Terraform v1.2.9
on darwin_amd64
+ provider registry.terraform.io/hashicorp/google v4.33.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.13.1
+ provider registry.terraform.io/hashicorp/random v3.4.2

Additional information

No response

@krzkowalczyk krzkowalczyk added the bug label Sep 8, 2022
@bharathkkb
Member

Hi @krzkowalczyk,
I suspect this is because of google_service_account.node_pool_service_account.email, which is a computed attribute only known after google_service_account.node_pool_service_account has been applied. Setting create_service_account to true should create a dedicated SA for you. Alternatively, you can create the google_service_account resource in a separate configuration and pass in the email via remote state.
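A minimal sketch of the first suggestion, assuming the per-pool service_account reference is the only computed value involved and leaving the other module arguments unchanged from the original configuration:

module "gke" {
  source  = "terraform-google-modules/kubernetes-engine/google//modules/private-cluster"
  version = "23.0.0"
  # ... other arguments as in the original configuration ...

  # Let the module create a dedicated node service account instead of
  # referencing a computed attribute from this configuration.
  create_service_account = true

  node_pools = [
    {
      name         = "node-pool-${var.region}-b"
      machine_type = "e2-medium"
      # ... other pool settings, without a service_account pointing at
      # google_service_account.node_pool_service_account.email ...
    },
    {
      name         = "node-pool-${var.region}-c"
      machine_type = "n2-standard-2"
      # ...
    },
  ]
}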

@bharathkkb bharathkkb added the question and triaged labels Sep 8, 2022
@tunguyen9889

tunguyen9889 commented Dec 11, 2022

I'm facing exactly the same error, with the service account resource created in a different configuration. I had to comment out the second node pool, run terraform apply, then uncomment the code and re-run terraform apply to make it work.

Error:

╷
│ Error: Invalid for_each argument
│
│   on .terraform/modules/gke/modules/beta-private-cluster/cluster.tf line 432, in resource "google_container_node_pool" "pools":
│  432:   for_each = local.node_pools
│     ├────────────────
│     │ local.node_pools is a map of map of string, known only after apply
│
│ The "for_each" map includes keys derived from resource attributes that
│ cannot be determined until apply, and so Terraform cannot determine the
│ full set of keys that will identify the instances of this resource.
│
│ When working with unknown values in for_each, it's better to define the map
│ keys statically in your configuration and place apply-time results only in
│ the map values.
│
│ Alternatively, you could use the -target planning option to first apply
│ only the resources that the for_each value depends on, and then apply a
│ second time to fully converge.

My code:

resource "google_service_account" "gke" {
  project      = var.project_id
  account_id   = "gke-${var.cluster_name}"
  display_name = "Terraform-managed service account for cluster ${var.cluster_name}"
}

resource "google_project_iam_member" "gke" {
  for_each = local.service_account_roles

  member  = "serviceAccount:${google_service_account.gke.email}"
  role    = each.key
  project = var.project_id
}

module "gke" {
  source  = "terraform-google-modules/kubernetes-engine/google//modules/beta-private-cluster"
  version = "~> 24.0.0"

  project_id  = var.project_id
  name        = var.cluster_name
  description = var.cluster_description

  cluster_resource_labels = var.cluster_resource_labels

  region   = var.region
  regional = var.regional
  zones    = var.zones

  network            = data.terraform_remote_state.shared_vpc.outputs.network_name
  subnetwork         = var.subnetwork
  ip_range_pods      = var.ip_range_pods
  ip_range_services  = var.ip_range_services
  network_project_id = var.project_id

  ip_masq_link_local = "false"

  http_load_balancing        = true
  horizontal_pod_autoscaling = true
  network_policy             = var.enable_network_policy
  gce_pd_csi_driver          = var.enable_gce_pd_csi_driver

  kubernetes_version = var.kubernetes_version
  release_channel    = var.release_channel

  maintenance_start_time = var.maintenance_start_time
  maintenance_recurrence = var.maintenance_recurrence
  maintenance_end_time   = var.maintenance_end_time

  monitoring_service = var.monitoring_service
  logging_service    = var.logging_service

  enable_shielded_nodes   = var.enable_shielded_nodes
  enable_private_endpoint = false
  enable_private_nodes    = true
  master_ipv4_cidr_block  = var.master_ipv4_cidr_block
  istio                   = var.enable_istio

  database_encryption = [{
    key_name = google_kms_crypto_key_iam_member.gke.crypto_key_id,
    state    = "ENCRYPTED"
  }]

  master_authorized_networks = var.master_access_cidrs
  add_cluster_firewall_rules = var.enable_cluster_firewall_rules
  firewall_priority          = var.firewall_priority
  firewall_inbound_ports     = var.firewall_inbound_ports_from_master_to_nodes

  master_global_access_enabled = false

  cluster_dns_provider = var.cluster_dns_provider
  cluster_dns_scope    = var.cluster_dns_scope
  cluster_dns_domain   = var.cluster_dns_domain

  # We do not need to create the default service account
  create_service_account = false
  service_account        = google_service_account.gke.email

  remove_default_node_pool = true

  cluster_autoscaling = {
    enabled             = false # Disables node auto-provisioning (automatic node pool creation and deletion)
    autoscaling_profile = var.cluster_autoscaling_profile
    max_cpu_cores       = 0
    min_cpu_cores       = 0
    max_memory_gb       = 0
    min_memory_gb       = 0
    gpu_resources       = []
  }

  node_pools = [
    {
      name               = var.default_node_pool_name
      machine_type       = var.default_node_pool_machine_type
      min_count          = var.default_node_pool_min_count
      max_count          = var.default_node_pool_max_count
      initial_node_count = var.default_node_pool_min_count
      auto_repair        = true
      auto_upgrade       = true
      disk_size_gb       = var.default_node_pool_disk_size_gb
      disk_type          = var.default_node_pool_disk_type
      image_type         = var.node_pool_image_type
      enable_secure_boot = true
      enable_gcfs        = false
      preemptible        = var.default_node_pool_preemptible
      spot               = var.default_node_pool_spot
      boot_disk_kms_key  = google_kms_crypto_key_iam_member.gce.crypto_key_id
    },
    {
      name               = var.core_node_pool_name
      machine_type       = var.core_node_pool_machine_type
      min_count          = var.core_node_pool_min_count
      max_count          = var.core_node_pool_max_count
      initial_node_count = var.core_node_pool_min_count
      auto_repair        = true
      auto_upgrade       = true
      disk_size_gb       = var.core_node_pool_disk_size_gb
      disk_type          = var.core_node_pool_disk_type
      image_type         = var.node_pool_image_type
      enable_secure_boot = true
      enable_gcfs        = false
      preemptible        = var.core_node_pool_preemptible
      spot               = var.core_node_pool_spot
      boot_disk_kms_key  = google_kms_crypto_key_iam_member.gce.crypto_key_id
    },
  ]

  node_pools_labels = {
    all                          = local.all_node_pools_labels
    (var.default_node_pool_name) = var.default_node_pool_labels
    (var.core_node_pool_name)    = var.core_node_pool_labels
  }

  node_pools_metadata = {
    all                          = local.all_node_pools_metadata
    (var.default_node_pool_name) = local.default_node_pool_metadata
    (var.core_node_pool_name)    = local.core_node_pool_metadata
  }

  node_pools_taints = {
    all                          = []
    (var.default_node_pool_name) = var.default_node_pool_taints
    (var.core_node_pool_name)    = var.core_node_pool_taints
  }

  node_pools_tags = {
    all                          = local.all_node_pools_tags
    (var.default_node_pool_name) = var.default_node_pool_tags
    (var.core_node_pool_name)    = var.core_node_pool_tags
  }

  node_pools_oauth_scopes = {
    all                          = local.oauth_scopes
    (var.default_node_pool_name) = local.oauth_scopes
    (var.core_node_pool_name)    = local.oauth_scopes
  }

  identity_namespace = "${var.project_id}.svc.id.goog"

  node_metadata = "GKE_METADATA_SERVER"

  resource_usage_export_dataset_id   = ""
  enable_network_egress_export       = false
  enable_resource_consumption_export = false
}
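Since the service account here is described as living in a different configuration, a hedged sketch of the remote-state approach suggested above (backend, bucket, prefix, and output names are hypothetical) might look like:

data "terraform_remote_state" "iam" {
  backend = "gcs"
  config = {
    bucket = "my-terraform-state" # hypothetical bucket
    prefix = "iam"                # hypothetical prefix
  }
}

module "gke" {
  # ... arguments as above ...
  create_service_account = false
  # The output is read from already-applied state, so it is known at plan
  # time and does not make the node pool for_each keys unknown.
  service_account = data.terraform_remote_state.iam.outputs.gke_service_account_email # hypothetical output
}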
