OPA gatekeeper audit externaldata error: dial tcp: connect: cannot assign requested address #423

Open
mannbiher opened this issue Apr 25, 2024 · 3 comments · May be fixed by #424

Comments

@mannbiher

What happened?

I am using Ratify to verify image signatures through OPA Gatekeeper external data. The Ratify chart was installed following the documentation. After running for a few days, the OPA Gatekeeper audit controller can no longer open new connections to Ratify, and the ratifyconstraint reports the error below in its violations.

- enforcementAction: warn
  group: ""
  kind: Pod
  message: 'System error calling external data provider: failed to send external
    data request: Post "https://ratify.gatekeeper-system:6001/ratify/gatekeeper/v1/verify":
    dial tcp 172.20.146.228:6001: connect: cannot assign requested address'
  name: abc-767bb47d54-kvb79
  namespace: abc
  version: v1

What should happen?

The ratifyconstraint should report the actual policy violations.

Versions

Kubernetes version 1.27 (EKS)
OPA gatekeeper 3.15.0
Ratify 1.1.0

Analysis

I looked at the external data provider code, and the error appears to happen because a new client is created for every request. No IdleConnTimeout is set on the transport, so old connections stay open. Eventually no new connections can be opened and the error above appears. The client should be created once per external data provider and reused. It should also set defaults that limit idle connections, e.g. MaxIdleConnsPerHost and IdleConnTimeout.

https://github.com/open-policy-agent/frameworks/blob/master/constraint/pkg/externaldata/request.go#L104

The Go net/http documentation also recommends reusing Clients and Transports.

@houdini91

Hi, I’m fairly certain I’m experiencing the same issue. Do you have any suggestions for a workaround?

I was considering wrapping the external data call in the ConstraintTemplate's Rego section with a retry mechanism that makes a few more attempts when this error is detected.

@ritazh
Member

ritazh commented Oct 7, 2024

Thanks for the bump! @mannbiher Thanks for the PR! Sorry for the delay. Are you still interested in getting that PR merged?

@mannbiher
Author

Hi @ritazh, yes. It has been a while, and I still need to test whether the PR works. I will work on it and update you once it is ready.
