fix deep equal check failure on objects with runtime.RawExtension. #5940

Open
wants to merge 1 commit into
base: master
from

Conversation

zach593
Contributor

@zach593 zach593 commented Dec 11, 2024

What type of PR is this?
/kind bug

What this PR does / why we need it:
see #5938

Which issue(s) this PR fixes:
Fixes #5938

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NONE

@karmada-bot karmada-bot added the kind/bug Categorizes issue or PR as related to a bug. label Dec 11, 2024
@karmada-bot karmada-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Dec 11, 2024
@codecov-commenter

codecov-commenter commented Dec 11, 2024

Codecov Report

Attention: Patch coverage is 41.66667% with 35 lines in your changes missing coverage. Please review.

Project coverage is 48.35%. Comparing base (72b6bd7) to head (c467408).
Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
pkg/util/equality.go 48.07% 20 Missing and 7 partials ⚠️
cmd/agent/app/agent.go 0.00% 4 Missing ⚠️
cmd/controller-manager/app/controllermanager.go 0.00% 4 Missing ⚠️

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5940      +/-   ##
==========================================
- Coverage   48.37%   48.35%   -0.02%     
==========================================
  Files         665      666       +1     
  Lines       54836    54891      +55     
==========================================
+ Hits        26526    26544      +18     
- Misses      26592    26622      +30     
- Partials     1718     1725       +7     
Flag Coverage Δ
unittests 48.35% <41.66%> (-0.02%) ⬇️


@zach593 zach593 changed the title fix deep equal check failure in CreateOrUpdateWork(), by change the way we check it. [WIP]fix deep equal check failure in CreateOrUpdateWork(), by change the way we check it. Dec 11, 2024
@karmada-bot karmada-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 11, 2024
@zach593 zach593 changed the title [WIP]fix deep equal check failure in CreateOrUpdateWork(), by change the way we check it. [WIP]fix deep equal check failure on objects with runtime.RawExtension. Dec 12, 2024
@karmada-bot
Collaborator

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign whitewindmills for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@karmada-bot karmada-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Dec 12, 2024
@zach593 zach593 changed the title [WIP]fix deep equal check failure on objects with runtime.RawExtension. fix deep equal check failure on objects with runtime.RawExtension. Dec 12, 2024
@karmada-bot karmada-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 12, 2024
Comment on lines 63 to 85
var aObj, bObj unstructured.Unstructured
err := aObj.UnmarshalJSON(a.Raw)
if err != nil {
return false
}
err = bObj.UnmarshalJSON(b.Raw)
if err != nil {
Contributor

Should we consider the case where a.Raw is nil or an empty byte array?

	obj1 := &workv1alpha1.Work{
		Spec: workv1alpha1.WorkSpec{
			Workload: workv1alpha1.WorkloadTemplate{Manifests: []workv1alpha1.Manifest{{RawExtension: runtime.RawExtension{Raw: []byte{}}}}},
		},
	}
	obj2 := &workv1alpha1.Work{
		Spec: workv1alpha1.WorkSpec{
			Workload: workv1alpha1.WorkloadTemplate{Manifests: []workv1alpha1.Manifest{{RawExtension: runtime.RawExtension{Raw: nil}}}},
		},
	}
	checker := equality.Semantic.Copy()
	_ = RegisterEqualityCheckFunctions(&checker)
	fmt.Println(equality.Semantic.DeepEqual(obj1, obj2)) // true
	fmt.Println(checker.DeepEqual(obj1, obj2))           // false

Contributor

How about

Suggested change
- var aObj, bObj unstructured.Unstructured
- err := aObj.UnmarshalJSON(a.Raw)
- if err != nil {
- 	return false
- }
- err = bObj.UnmarshalJSON(b.Raw)
- if err != nil {
+ if (a.Raw == nil || len(a.Raw) == 0) != (b.Raw == nil || len(b.Raw) == 0) {
+ 	return false
+ }
+ if a.Raw == nil || len(a.Raw) == 0 {
+ 	return true
+ }
+ var aObj, bObj unstructured.Unstructured
+ err := aObj.UnmarshalJSON(a.Raw)
+ if err != nil {
+ 	return false
+ }
+ err = bObj.UnmarshalJSON(b.Raw)
+ if err != nil {

Contributor Author

The code still needs improvement, but I think we should focus on semantic equality rather than memory equality.

I just updated the code (though not in the way you suggested). Could you please review it again?

Contributor

I noticed that runtime.RawExtension was previously converted to unstructured.Unstructured before the deep-equal check. Now it is converted to map[string]any instead. Is there a particular reason for this change?

Contributor Author

@zach593 zach593 Jan 5, 2025

There's a GVK check when unmarshaling JSON into unstructured.Unstructured, see https://github.com/karmada-io/karmada/blob/72b6bd7ddc887744585f9ccb48b76f9c1f84ba24/vendor/k8s.io/apimachinery/pkg/apis/meta/v1/unstructured/helpers.go#L384C1-L403C2

func (s unstructuredJSONScheme) Decode(data []byte, _ *schema.GroupVersionKind, obj runtime.Object) (runtime.Object, *schema.GroupVersionKind, error) {
	var err error
	if obj != nil {
		err = s.decodeInto(data, obj)
	} else {
		obj, err = s.decode(data)
	}

	if err != nil {
		return nil, nil, err
	}

	gvk := obj.GetObjectKind().GroupVersionKind()
	if len(gvk.Kind) == 0 {
		return nil, &gvk, runtime.NewMissingKindErr(string(data))
	}
	// TODO(109023): require apiVersion here as well

	return obj, &gvk, nil
}

In work-status-controller and binding-status-controller, we run the deep-equal check on the workload's status field. That field carries no GVK information, so unmarshaling it into unstructured.Unstructured fails on that GVK check, which is not what we want.
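
To make the failure mode concrete, here is a rough sketch that only imitates the kind check with the standard library; it does not call apimachinery. `decodeRequiringKind`, `decodePlain`, and `errMissingKind` are hypothetical names introduced for illustration, and the status payload is made up.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// errMissingKind is a hypothetical error standing in for a missing-kind error.
var errMissingKind = errors.New("object has no kind")

// decodeRequiringKind imitates a decoder that rejects objects without a
// non-empty "kind" field. It is an illustration, not the apimachinery code.
func decodeRequiringKind(data []byte) (map[string]any, error) {
	obj := map[string]any{}
	if err := json.Unmarshal(data, &obj); err != nil {
		return nil, err
	}
	if kind, _ := obj["kind"].(string); kind == "" {
		return nil, errMissingKind
	}
	return obj, nil
}

// decodePlain decodes into a generic map and imposes no kind requirement.
func decodePlain(data []byte) (map[string]any, error) {
	obj := map[string]any{}
	if err := json.Unmarshal(data, &obj); err != nil {
		return nil, err
	}
	return obj, nil
}

func main() {
	// A status fragment: valid JSON, but with no apiVersion or kind.
	status := []byte(`{"readyReplicas":2,"replicas":2}`)

	_, err := decodeRequiringKind(status)
	fmt.Println(err != nil) // true

	obj, err := decodePlain(status)
	fmt.Println(err == nil, len(obj)) // true 2
}
```

If a comparison function returns false whenever decoding fails, two identical status fragments would be reported as different under the kind-checking decoder, which is the situation described above.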

Contributor

In work-status-controller and binding-status-controller, we run the deep-equal check on the workload's status field. That field carries no GVK information, so unmarshaling it into unstructured.Unstructured fails on that GVK check, which is not what we want.

Makes sense. Can we add test cases accordingly?

Contributor Author

sure, updated.

Contributor

Can we add test cases accordingly?

The current test case performs a deep-equal check on the workload's status field and expects false. That result does not show whether the mismatch comes from a JSON unmarshaling error or from a real difference in the status, so the previous code would also pass this test. Ideally, add a test case in which identical status fields must compare as equal (true); that would catch the unmarshaling-error scenario. For example:

		{
			name: "return true when status fields are equal",
			objFn1: func() (runtime.Object, error) {
				obj := obj.DeepCopy()
				j, err := json.Marshal(obj.Object["status"])
				return &workv1alpha2.ResourceBinding{
					Status: workv1alpha2.ResourceBindingStatus{
						AggregatedStatus: []workv1alpha2.AggregatedStatusItem{
							{
								Status: &runtime.RawExtension{
									Raw: j,
								},
							},
						},
					},
				}, err
			},
			objFn2: func() (runtime.Object, error) {
				obj := obj.DeepCopy()
				j, err := json.Marshal(obj.Object["status"])
				return &workv1alpha2.ResourceBinding{
					Status: workv1alpha2.ResourceBindingStatus{
						AggregatedStatus: []workv1alpha2.AggregatedStatusItem{
							{
								Status: &runtime.RawExtension{
									Raw: j,
								},
							},
						},
					},
				}, err
			},
			addCheckFunc: true,
			wantEqual:    true,
			wantErr:      false,
		},

Contributor Author

Makes sense. I updated it and also added a test for equal specs.

@zhzhuang-zju
Contributor

[FAIL] [CronFederatedHPA] CronFederatedHPA testing Scale FederatedHPA [It] Test scale FederatedHPA testing
https://github.com/karmada-io/karmada/actions/runs/12634482962/job/35202637637?pr=5940#step:6:3946

/retest

Comment on lines +75 to +84
j, err := json.Marshal(e)
if err != nil {
return nil, err
}
if string(j) == "null" {
return nil, nil
}

obj := make(map[string]any)
err = json.Unmarshal(j, &obj)
Contributor

Can you remind me why not just Unmarshal e.Raw?

Contributor Author

One reason is that I don't think we should limit runtime.RawExtension to the Raw field only. We may start using the Object field one day, and whatever obstacles come up then, this code should not be one of them.
For the same reason, I think we should prefer semantic checks over memory checks. From e.MarshalJSON(), we can see that only one of the Raw and Object fields takes effect, and the result is whatever JSON it generates. So, from the semantic-check perspective, I convert both sides back to map[string]any before comparing. This is not concise, but I think it is necessary.

@karmada-bot karmada-bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jan 7, 2025
@zhzhuang-zju
Contributor

Good job! Asking @XiShanYongYe-Chang and @chaunceyjiang for another look.

Member

@XiShanYongYe-Chang XiShanYongYe-Chang left a comment

I have no questions about the other logic.

In addition, as I understand it, both #5939 and the current PR can solve the deep-equal failure problem. Compared with the current PR, I prefer #5939. The current PR introduces additional marshal and unmarshal steps, so the performance impact needs further testing.

If the performance impact is small, I'm fine with both prs.

// RegisterEqualityCheckFunctions registers custom check functions to the equality checker.
// These functions help avoid performing deep-equality checks on workloads represented as byte slices.
func RegisterEqualityCheckFunctions(e *conversion.Equalities) error {
return e.AddFuncs(
Member

One question, can we just add a comparison function for runtime.RawExtension:

		func(a, b runtime.RawExtension) bool {
			return rawExtensionDeepEqual(&a, &b, e)
		},

Contributor Author

If we did that, any other code (even outside Karmada) that uses deepEqual on runtime.RawExtension would also be affected by the new check functions. I think the impact would be larger and more unpredictable.

Member

Thanks for explaining!

@zach593
Contributor Author

zach593 commented Jan 16, 2025

@RainbowMango Since #5939 is merged, what should we do about this one? close it?

@RainbowMango
Member

What's the relationship between #5939 and #5940? Are they solving the same issue? What do you prefer?

@zach593
Contributor Author

zach593 commented Jan 16, 2025

What's the relationship between #5939 and #5940? Are they solving the same issue? What do you prefer?

They both solve #5938, but #5939 is more of a workaround. In our practice we prefer #5940: it does reduce performance, but we think semantic checks are better than fighting \n everywhere.

@RainbowMango
Member

OK, I agree with you! I don't like workarounds either. I don't think performance drops meaningfully, as the benchmark report shows that a single operation is still very fast.

Labels
kind/bug Categorizes issue or PR as related to a bug. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
6 participants