fix: initial commit for deploy fix when having duplicate kinds #896

renescheepers · 2022-07-13T12:13:03Z

Dropped support for Ruby 2.7, has been EOL since 2019-12-25
Dropped support for Kubernetes 1.19, has been EOL since 2021-10-28
Fixed "Krane is unable to fully deploy resources when there are duplicate Kinds" (#895)

crds/crd.yaml

lib/krane/kubernetes_resource.rb

jpfourny · 2022-07-27T17:05:33Z

FYI: You will need to merge with this PR: #900
Version 2.4.9 will be cut from that, shortly.

timothysmith0609 · 2022-08-16T14:47:31Z

test/integration/krane_deploy_test.rb

+  def test_duplicate_kind_resource_definition
+    result = deploy_global_fixtures("crd", subset: ["deployment.yml"])
+    assert_deploy_success(result)
+
+    result = deploy_fixtures("crd", subset: ["web.yml"], global_timeout: 30)
+    assert_deploy_success(result)
+  end
+


This test passes just fine on the master branch; I don't think it's forcing the failure you think it is.

Should be fixed now. The testing code applies a prefix to the Kind, so in that case there was no duplicate.

KnVerey · 2022-08-16T20:31:57Z

lib/krane/cluster_resource_discovery.rb

@@ -37,12 +37,39 @@ def fetch_resources(namespaced: false)
      end.compact.uniq { |r| "#{r['apigroup']}/#{r['kind']}" }
    end

+    def fetch_group_kinds


Instead of parsing the tabular information from api-resources, can you collect this from the calls we make during fetch_resources above? I believe api-resources is re-making those exact same calls, and it can be very expensive in large clusters. Furthermore, upstream makes no stability guarantees on tabular data output (vs json or yaml).

I think it does indeed do separate calls in the background. Let me check if I can re-use the existing calls.
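For illustration, here is a minimal sketch of collecting group/kind pairs from a structured discovery response instead of parsing tabular `kubectl api-resources` output. The JSON is a hand-written sample in the shape of a Kubernetes `APIResourceList`, and `sample_discovery_json` and `group_kinds_from_discovery` are hypothetical names, not methods in Krane. It assumes the responses already fetched in `fetch_resources` have this shape.

```ruby
require "json"

# Hand-written sample shaped like an APIResourceList discovery response.
def sample_discovery_json
  <<~JSON
    {
      "kind": "APIResourceList",
      "groupVersion": "apps/v1",
      "resources": [
        { "name": "deployments", "kind": "Deployment", "namespaced": true },
        { "name": "deployments/status", "kind": "Deployment", "namespaced": true },
        { "name": "statefulsets", "kind": "StatefulSet", "namespaced": true }
      ]
    }
  JSON
end

# Hypothetical helper: returns unique { group, kind, namespaced } hashes.
def group_kinds_from_discovery(raw_json)
  list = JSON.parse(raw_json)
  # "apps/v1" -> "apps"; the core group ("v1") has no slash and maps to "".
  group_version = list.fetch("groupVersion")
  group = group_version.include?("/") ? group_version.split("/").first : ""

  list.fetch("resources", [])
    .reject { |r| r["name"].include?("/") } # skip subresources such as deployments/status
    .map { |r| { "group" => group, "kind" => r["kind"], "namespaced" => r["namespaced"] } }
    .uniq { |r| "#{r['group']}/#{r['kind']}" }
end

group_kinds_from_discovery(sample_discovery_json).each do |r|
  puts "#{r['group']}/#{r['kind']} namespaced=#{r['namespaced']}"
end
```

Because the input is JSON rather than column-aligned text, this approach does not depend on the layout of kubectl's table output, which is the stability concern raised above.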

KnVerey · 2022-08-16T20:54:44Z

lib/krane/kubernetes_resource.rb

      def kind
-        name.demodulize
+        # Converts Krane::ApiextensionsK8sIo::CustomResourceDefinition to CustomResourceDefinition


Doesn't name.demodulize still do what you want? I also have the same question about why we aren't using the information from the definition, like at L51... though in this case you're following what we (probably I) originally did. Something has gone very wrong during instantiation if the two don't match.

Sometimes the class method is called where no instance is present. For example, here:

krane/lib/krane/kubernetes_resource/service.rb

Line 9 in f0e0da7

SYNC_DEPENDENCIES = [::Krane::Pod, ::Krane::Apps::Deployment, ::Krane::Apps::StatefulSet]

krane/lib/krane/kubernetes_resource/apps/deployment.rb

Line 8 in f0e0da7

SYNC_DEPENDENCIES = [::Krane::Pod, ::Krane::Apps::ReplicaSet]
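To make the point about class-level access concrete, here is a small sketch of deriving a kind from the class name alone, with no instance or definition available. The `KindSketch` module and its classes are stand-ins that only mirror the namespacing quoted above; they are not Krane's real classes. `split("::").last` is a plain-Ruby equivalent of ActiveSupport's `demodulize` for these names.

```ruby
# Stand-in classes mirroring the namespacing quoted above (hypothetical).
module KindSketch
  class KubernetesResource
    # Class-level kind: usable without an instance, and therefore without a definition.
    def self.kind
      name.split("::").last
    end
  end

  class Pod < KubernetesResource; end

  module Apps
    class Deployment < KubernetesResource; end
    class ReplicaSet < KubernetesResource; end
  end

  # A constant list of classes, like SYNC_DEPENDENCIES, only has class names to work with.
  def self.sync_dependency_kinds
    [Pod, Apps::ReplicaSet].map(&:kind)
  end
end

puts KindSketch.sync_dependency_kinds.inspect
```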

KnVerey · 2022-08-16T20:58:23Z

lib/krane/kubernetes_resource.rb

@@ -503,6 +547,36 @@ def selected?(selector)
      selector.nil? || selector.to_h <= labels
    end

+    def self.group_from_api_version(input)


why add these separately instead of with the rest of the class methods in class << self above?
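As a sketch of what the suggestion could look like, the helper can sit inside the existing `class << self` block alongside the other class methods. `ResourceSketch` is a stand-in class and `version_from_api_version` is a hypothetical companion; only the name `group_from_api_version` comes from the diff, and its body here is an assumption.

```ruby
class ResourceSketch
  class << self
    # "apps/v1" -> "apps"; core resources use a bare version such as "v1" and have no group.
    def group_from_api_version(api_version)
      parts = api_version.to_s.split("/")
      parts.length > 1 ? parts.first : ""
    end

    # Hypothetical companion helper: "apps/v1" -> "v1".
    def version_from_api_version(api_version)
      api_version.to_s.split("/").last.to_s
    end
  end
end

puts ResourceSketch.group_from_api_version("apiextensions.k8s.io/v1")
```

Keeping every class method in one `class << self` block means a reader only has to look in one place to see the class-level API.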

KnVerey · 2022-08-16T21:11:00Z

lib/krane/kubernetes_resource.rb

        field_part = FIELDS.map { |f| "{{#{f}}}" }.join(%({{print "#{FIELD_SEPARATOR}"}}))
-        %({{range .items}}#{condition_start}#{field_part}{{print "#{EVENT_SEPARATOR}"}}{{end}}{{end}})
+        %({{range .items}}#{and_conditions_string}#{field_part}{{print "#{EVENT_SEPARATOR}"}}#{ends}{{end}})


was the change to nested syntax required? does it not lazy evaluate?

Yes. The older versions of kubectl (which we still support) are compiled with Go 1.17. Only since Go 1.18 are these statements evaluated correctly.

https://tip.golang.org/doc/go1.18#:~:text=test.fuzzminimizetime.-,text/template,-Within%20a%20range
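The nesting discussed here can be sketched as follows: each condition gets its own `{{if}}`, so a later condition is only evaluated once the earlier ones have passed, without relying on `and` being evaluated lazily. `nested_condition_template` and the conditions, fields, and separators in the example call are hypothetical and are not the values Krane uses.

```ruby
# Builds a go-template that prints the given fields for every item matching ALL
# conditions. One {{if}} is opened per condition and closed again after the fields.
def nested_condition_template(conditions, fields, field_separator:, event_separator:)
  ifs = conditions.map { |c| "{{if #{c}}}" }.join
  ends = "{{end}}" * conditions.length
  field_part = fields.map { |f| "{{#{f}}}" }.join(%({{print "#{field_separator}"}}))
  %({{range .items}}#{ifs}#{field_part}{{print "#{event_separator}"}}#{ends}{{end}})
end

puts nested_condition_template(
  ['eq .involvedObject.kind "Deployment"', 'eq .involvedObject.name "web"'],
  [".count", ".message"],
  field_separator: "|",
  event_separator: ";"
)
```

The final `{{end}}` closes the `range`; the ones before it close the nested `if` blocks in reverse order.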

KnVerey · 2022-08-16T21:21:41Z

lib/krane/api_resource.rb

+# frozen_string_literal: true
+
+module Krane
+  class APIResource


What is the purpose of this class? Krane::Resource also represents API resources

It's only for containing the response of the api resources call, so I don't have to return a hash. Could think of a better name though.

Ah, in that case we shouldn't need it, because per my other comment we should be using the discovery information directly instead of re-making the calls via kubectl and parsing tabular output. But if that really isn't possible for some reason, I'd make this class private to the class that needs it, to avoid confusion.
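If such a value class were kept, one way to make it private to its owner is Ruby's `private_constant`. This is only a sketch: `DiscoverySketch` and its field names are hypothetical and do not reflect the `APIResource` class in the PR.

```ruby
class DiscoverySketch
  # Small value object for one discovered API resource, so code inside this
  # class does not have to pass raw hashes around.
  APIResource = Struct.new(:group, :kind, :namespaced)
  private_constant :APIResource

  def initialize(raw_resources)
    @resources = raw_resources.map do |r|
      APIResource.new(r["group"], r["kind"], r["namespaced"])
    end
  end

  # Kinds of all cluster-scoped (non-namespaced) resources.
  def global_kinds
    @resources.reject(&:namespaced).map(&:kind)
  end
end

discovery = DiscoverySketch.new([
  { "group" => "", "kind" => "Namespace", "namespaced" => false },
  { "group" => "apps", "kind" => "Deployment", "namespaced" => true },
])
puts discovery.global_kinds.inspect
```

Referencing `DiscoverySketch::APIResource` from outside the class raises `NameError`, so the type cannot be confused with other resource classes elsewhere in the codebase.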

KnVerey · 2022-08-16T21:31:13Z

lib/krane/kubernetes_resource.rb

+          return ""
+        end
+
+        # Converts Krane::ApiextensionsK8sIo::CustomResourceDefinition to apiextensions.k8s.io


Why extract from the class name instead of the apiVersion? In fact, could we save the extraction at L50 in an instance variable and use an accessor?

The API version isn't always available, for example here. There is no instance available to fetch this information from.

ah right we're still in the class methods here 🤦‍♀️
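For readers following along, a class-name-to-group conversion like the one described in the code comment could look roughly like this. It is an assumption about the approach, not the PR's actual code, and both helper names are hypothetical. Note that the mapping is lossy: it only round-trips for group names made of dot-separated lowercase alphanumeric segments.

```ruby
# Hypothetical conversion: "ApiextensionsK8sIo" -> "apiextensions.k8s.io".
def group_from_module_name(module_name)
  module_name
    .gsub(/([a-z\d])([A-Z])/, '\1_\2') # insert "_" at each lowercase-or-digit to uppercase boundary
    .downcase
    .tr("_", ".")
end

# Hypothetical: derives the API group from a fully qualified class name.
def group_from_class_name(class_name, root: "Krane")
  parts = class_name.split("::")
  parts.shift if parts.first == root
  parts.pop # drop the kind, e.g. "CustomResourceDefinition"
  parts.empty? ? "" : group_from_module_name(parts.last)
end

puts group_from_class_name("Krane::ApiextensionsK8sIo::CustomResourceDefinition")
```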

KnVerey · 2022-08-16T21:53:41Z

lib/krane/global_deploy_task.rb

        r = KubernetesResource.build(context: context, logger: logger, definition: r_def,
-          crd: crd, global_names: global_kinds, statsd_tags: statsd_tags)
+          crd: crd, group_kinds: group_kinds, statsd_tags: statsd_tags)


It seems like this is only used for the calculation of whether the resource is global. Could it make sense to do that here instead and make more use of cluster_resource_discoverer? E.g. have a cluster_resource_discoverer.global_resource?(group, kind).

Yes, that does make sense. I tried to make the minimum amount of changes, so that most stays the same.
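A rough sketch of the proposed `global_resource?(group, kind)` query follows. `ResourceDiscovererSketch` is a stand-in, not Krane's `ClusterResourceDiscovery`, and the block passed to it is a hypothetical substitute for the real, expensive discovery calls. The result is memoized so that a single shared instance only fetches once.

```ruby
class ResourceDiscovererSketch
  # fetcher: a block returning an array of { "group", "kind", "namespaced" } hashes.
  def initialize(&fetcher)
    @fetcher = fetcher
  end

  # True only when the group/kind pair is known and is cluster-scoped.
  def global_resource?(group, kind)
    resource = index[[group.to_s, kind]]
    !resource.nil? && !resource["namespaced"]
  end

  private

  # Keyed by [group, kind], so two groups may define the same Kind without clashing.
  def index
    @index ||= @fetcher.call.to_h { |r| [[r["group"].to_s, r["kind"]], r] }
  end
end

discoverer = ResourceDiscovererSketch.new do
  [
    { "group" => "", "kind" => "Namespace", "namespaced" => false },
    { "group" => "example.com", "kind" => "Cluster", "namespaced" => true },
    { "group" => "other.example.com", "kind" => "Cluster", "namespaced" => false },
  ]
end

puts discoverer.global_resource?("example.com", "Cluster")
puts discoverer.global_resource?("other.example.com", "Cluster")
```

Keying on the group as well as the kind is what lets the two `Cluster` entries above resolve to different answers, which is the duplicate-Kind situation this PR targets.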

KnVerey · 2022-08-16T22:03:02Z

lib/krane/restart_task.rb

@@ -167,6 +167,12 @@ def identify_target_deployments(selector: nil)
        apps_v1_kubeclient.get_deployments(namespace: @namespace, label_selector: selector_string)
      end
      deployments.select { |d| d.dig(:metadata, :annotations, ANNOTATION) }
+      .map do |d|
+        d["apiVersion"] = "apps/v1"
+        d["kind"] = "Deployment"


why do we need to set/override these now?

The response doesn't contain these values and we now depend on these values being set.

Ah right, we got them from Kubeclient instead of kubectl here. Related issue: ManageIQ/kubeclient#368
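The workaround in the diff can be sketched with plain hashes. This assumes, as the thread states, that the listed items arrive without `apiVersion` and `kind`; `with_type_meta` is a hypothetical helper, and real Kubeclient results are resource objects rather than hashes.

```ruby
# Hypothetical helper: returns copies of the items with apiVersion and kind filled in,
# leaving the original items untouched.
def with_type_meta(items, api_version:, kind:)
  items.map { |item| item.merge("apiVersion" => api_version, "kind" => kind) }
end

listed = [
  { "metadata" => { "name" => "web" } },
  { "metadata" => { "name" => "worker" } },
]

with_type_meta(listed, api_version: "apps/v1", kind: "Deployment").each do |d|
  puts "#{d['apiVersion']} #{d['kind']} #{d.dig('metadata', 'name')}"
end
```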

KnVerey · 2022-08-16T22:05:45Z

lib/krane/deploy_task.rb

@@ -238,18 +210,22 @@ def secrets_from_ejson
    def discover_resources
      @logger.info("Discovering resources:")
      resources = []
-      crds_by_kind = cluster_resource_discoverer.crds.group_by(&:kind)
+      crds_grouped = cluster_resource_discoverer.crds.group_by(&:cr_group_kind)
+      group_kinds = @task_config.group_kinds


Within task config, you get this from an instance of cluster_resource_discoverer. Why not call it directly here? Since that class controls expensive operations, I'd think we should reuse a single instance as much as possible.

The old code did somewhat the same; that's why I didn't change it at first. Changing it now.

KnVerey · 2022-08-16T22:12:19Z

lib/krane/deploy_task.rb

        r = KubernetesResource.build(namespace: @namespace, context: @context, logger: @logger, definition: r_def,
-          statsd_tags: @namespace_tags, crd: crd, global_names: @task_config.global_kinds)
+          statsd_tags: @namespace_tags, crd: crd, group_kinds: group_kinds)


Same comment as in global_deploy_task about using cluster_resource_discoverer here instead of passing the list of group kinds to build.

KnVerey · 2022-08-16T22:16:30Z

fix: all tests serial

I don't see this actually changing tests to run differently, but the commit message caught my eye. It's important that Krane be able to run multiple deploys concurrently, even when used as a gem. So I just wanted to make super sure we are aware of that. 🙏

renescheepers · 2022-08-17T12:01:50Z

Yes. Because the tests take a long time, I went through each test suite separately. One of those is integration-serial, so while I didn't specifically touch any files there, I needed to make these fixes to get those tests to pass.

renescheepers · 2022-11-30T15:01:09Z

Not planning on merging this pull request. It was needed to resolve some issues that occur when Krane is used in combination with Crossplane. However, we stopped using Crossplane. Even though it would still be nice to fix, it is way too risky to deploy since this is used everywhere.

gmarx-shopify reviewed Jul 13, 2022

crds/crd.yaml (outdated, resolved)

renescheepers force-pushed the renescheepers/fix_deploy_duplicate_kind branch from efd35c5 to 741c133 on July 13, 2022 13:41

timothysmith0609 reviewed Jul 13, 2022

lib/krane/kubernetes_resource.rb (outdated, resolved)

renescheepers force-pushed the renescheepers/fix_deploy_duplicate_kind branch 20 times, most recently from 052d972 to 2edd0b3 on July 21, 2022 07:08

renescheepers self-assigned this Jul 21, 2022

renescheepers added the 🪲 bug label Jul 21, 2022

renescheepers marked this pull request as ready for review July 21, 2022 08:08

renescheepers requested a review from a team as a code owner July 21, 2022 08:08

renescheepers requested review from ethanaubuchon and camteasdale143 and removed request for a team July 21, 2022 08:08

renescheepers requested a review from a team July 26, 2022 18:58

timothysmith0609 reviewed Jul 27, 2022

lib/krane/kubernetes_resource.rb (outdated, resolved)

renescheepers force-pushed the renescheepers/fix_deploy_duplicate_kind branch 2 times, most recently from cfa48a9 to eb4971c on July 27, 2022 18:39

renescheepers added 13 commits July 29, 2022 14:40

feat: moved Kubernetes resources into module matching their group

f7cc29f

feat: use group kind instead of only kind for identifying resources

09b9afe

fix: moved CustomResourceDefinition in correct module

afc6cc4

feat: renamed gvk to group_kinds

d5e29d5

fix: unit tests and integration tests

4af4a4f

fix: add error handling to fetch api-resources call

b4afb91

fix: code clean up

0f17171

fix: use CR group kind instead of own CRD group kind

150b9a9

fix: all tests serial

55e59fd

feat: drop support for Ruby 2.7 and Kubernetes 1.19

fd22158

feat: add test for duplicate kinds

d0b5bb9

feat: separate class for API resource

1965fd6

fix: return empty hash of events when request failed

f0e0da7

renescheepers force-pushed the renescheepers/fix_deploy_duplicate_kind branch from eb4971c to f0e0da7 on July 29, 2022 12:40

renescheepers requested a review from timothysmith0609 August 15, 2022 19:45

timothysmith0609 reviewed Aug 16, 2022

KnVerey reviewed Aug 16, 2022

renescheepers force-pushed the renescheepers/fix_deploy_duplicate_kind branch from 5ca1ff6 to 5868680 on August 18, 2022 20:31

rewrite me

a956fac

renescheepers force-pushed the renescheepers/fix_deploy_duplicate_kind branch from 5868680 to a956fac on August 23, 2022 14:44

gmarx-shopify removed their request for review November 21, 2022 12:37

renescheepers closed this Nov 30, 2022
