OCPBUGS-14914: Set tunnel and server timeouts at backend level #536
base: master
Conversation
@alebedev87: This pull request references Jira Issue OCPBUGS-14914, which is invalid. The bug has been updated to refer to the pull request using the external bug tracker.
[APPROVALNOTIFIER] This PR is NOT APPROVED.
Force-pushed 6a1044e to ef08ed2 (compare).
/jira refresh
@alebedev87: This pull request references Jira Issue OCPBUGS-14914, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug. Requesting review from QA contact.
Tested it with 4.15.0-0.ci.test-2023-11-16-093712-ci-ln-dr45512-latest:

% oc get route
% oc annotate route myedge haproxy.router.openshift.io/timeout-tunnel=70m
% oc -n openshift-ingress rsh router-default-d78ddc6f-bprk5
sh-4.4# wget --no-check-certificate --limit-rate=1000 --delete-after https://myedge-default.apps.ci-ln-9r4m3fk-72292.origin-ci-int-gce.dev.rhcloud.com/oc
--2023-11-17 04:38:15-- (try:20) https://myedge-default.apps.ci-ln-9r4m3fk-72292.origin-ci-int-gce.dev.rhcloud.com/oc
2023-11-17 04:42:09 (1000 B/s) - Read error at byte 4847481/121359408 (The TLS connection was non-properly terminated.). Giving up.

04:41:53.345283 IP (tos 0x0, ttl 62, id 24165, offset 0, flags [DF], proto TCP (6), length 628)
Back to WIP to address the config warning about the missing server timeout.
Force-pushed 4113831 to 6096a39 (compare).
/test e2e-agnostic
/test e2e-metal-ipi-ovn-ipv6
I still see these errors in the most recent test results:
Did you mean to remove the WIP label?
#536 (comment) addressed.
/test e2e-upgrade
Tested it with default, 10s, 90s, 70m on 4.16.0-0.test-2024-02-01-013554-ci-ln-g418bqb-latest:

% oc get clusterversion
sh-4.4# ./run-cli.sh
real 0m15.130s
real 0m10.132s
real 0m30.128s
sh-4.4# ./run-cli.sh
real 1m30.138s
sh-4.4# ./run-cli.sh
real 70m0.140s
/label qe-approved
timeout server-fin {{ firstMatch $timeSpecPattern (env "ROUTER_DEFAULT_SERVER_FIN_TIMEOUT") "1s" }}
timeout http-request {{ firstMatch $timeSpecPattern (env "ROUTER_SLOWLORIS_TIMEOUT") "10s" }}
timeout http-keep-alive {{ firstMatch $timeSpecPattern (env "ROUTER_SLOWLORIS_HTTP_KEEPALIVE") "300s" }}

# Long timeout for WebSocket connections.
timeout tunnel {{ firstMatch $timeSpecPattern (env "ROUTER_DEFAULT_TUNNEL_TIMEOUT") "1h" }}
Why don't you need to specify `timeout tunnel` on `be_sni` or `be_no_sni`? Is HAProxy more lenient about not specifying `timeout tunnel` than it is about not specifying `timeout server`?

Does omitting `timeout tunnel` result in an infinite timeout, similarly to omitting `timeout server`, or does it fall back to using the server timeout as the tunnel timeout? Crucially, if you have a route that has a tunnel timeout that is greater than any route's server timeout, is the tunnel timeout respected, or does the server timeout in `be_sni`/`be_no_sni` cause the connection to be terminated before the route's tunnel timeout?
> Why don't you need to specify `timeout tunnel` on `be_sni` or `be_no_sni`? Is HAProxy more lenient about not specifying `timeout tunnel` than it is about not specifying `timeout server`?
Yes. Only the missing server timeout produces a warning message. I'm not sure the warning reports a real problem, though, given that the bottom backends will still have the server timeout set.
> Does omitting `timeout tunnel` result in an infinite timeout, similarly to omitting `timeout server`, or does it fall back to using the server timeout as the tunnel timeout? Crucially, if you have a route that has a tunnel timeout that is greater than any route's server timeout, is the tunnel timeout respected, or does the server timeout in `be_sni`/`be_no_sni` cause the connection to be terminated before the route's tunnel timeout?
Regarding the link between the server and tunnel timeouts: I discovered that the tunnel timeout supersedes the server timeout in TCP proxies. Note that the middle backends in the router configuration are TCP proxies, so setting the tunnel timeout in the `defaults` section or on the middle backends (`be_sni`/`be_no_sni`) interferes with the server timeout behavior set at the bottom backend level. I didn't see the opposite, though: the server timeout set in `defaults` or on the middle backends didn't influence the tunnel timeout set on the bottom backends.
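A minimal HAProxy sketch of the layout described above (illustrative timeout values and a hypothetical bottom-backend name; this is not the generated router config verbatim):

```haproxy
defaults
    # No "timeout tunnel"/"timeout server" here: with this PR they move to the backends.
    timeout connect 5s

backend be_sni
    # Middle backend (TCP proxy). The server timeout is the maximum across all
    # routes, which silences the reload warning without capping any route below
    # its own value. No "timeout tunnel" here, so it cannot supersede the
    # server timeouts set on the bottom backends.
    timeout server 90s

backend be_edge_http:demo:myroute
    # Bottom backend: per-route values from annotations, or the defaults
    # (ROUTER_DEFAULT_SERVER_TIMEOUT / ROUTER_DEFAULT_TUNNEL_TIMEOUT).
    timeout server 30s
    timeout tunnel 1h
```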
@@ -581,10 +583,10 @@ backend {{ genBackendNamePrefix $cfg.TLSTermination }}:{{ $cfgIdx }}
 {{- end }}
   tcp-request content reject if !whitelist
 {{- end }}
-{{- with $value := clipHAProxyTimeoutValue (firstMatch $timeSpecPattern (index $cfg.Annotations "haproxy.router.openshift.io/timeout")) }}
+{{- with $value := clipHAProxyTimeoutValue (firstMatch $timeSpecPattern (index $cfg.Annotations "haproxy.router.openshift.io/timeout") (env "ROUTER_DEFAULT_SERVER_TIMEOUT") "30s") }}
All these additional `env` calls add some overhead. As Andy says, "Trips across the Go/Template boundary can be expensive, particularly when you have many, many routes (10,000+)." Consider adding `{{- $defaultServerTimeout := firstMatch $timeSpecPattern (env "ROUTER_DEFAULT_SERVER_TIMEOUT") "30s" }}` at the top of the template and using `{{- with $value := clipHAProxyTimeoutValue (firstMatch $timeSpecPattern (index $cfg.Annotations "haproxy.router.openshift.io/timeout") $defaultServerTimeout) }}` here.
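The suggested hoisting could look like this (a sketch assembled from the lines quoted above; the `timeout server` line is illustrative, not the template verbatim):

```
{{- $defaultServerTimeout := firstMatch $timeSpecPattern (env "ROUTER_DEFAULT_SERVER_TIMEOUT") "30s" }}

...

{{- with $value := clipHAProxyTimeoutValue (firstMatch $timeSpecPattern (index $cfg.Annotations "haproxy.router.openshift.io/timeout") $defaultServerTimeout) }}
  timeout server  {{ $value }}
{{- end }}
```

The `env` call now runs once per template execution instead of once per route.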
As noted in #483 (review), `clipHAProxyTimeoutValue (firstMatch $timeSpecPattern foo)` could be simplified as `clipHAProxyTimeoutValue foo`. However, that might be better left as a separate change.

...And actually, now that I look at it again, I believe `clipHAProxyTimeoutValue (firstMatch $timeSpecPattern foo)` will silently ignore invalid values while `clipHAProxyTimeoutValue foo` will log them. Logging errors for invalid values might be a positive change, but it's an additional behavioral change that you might not want to tie to OCPBUGS-14914.

...Although thinking about it still further, I wonder: is `maxTimeoutFirstMatchedAndClipped` going to cause `clipHAProxyTimeoutValue` to log an error for each route that has an invalid annotation value, where before this change those invalid values get filtered out by `firstMatch` before `clipHAProxyTimeoutValue` has a chance to report any errors?
> Consider adding `{{- $defaultServerTimeout := firstMatch $timeSpecPattern (env "ROUTER_DEFAULT_SERVER_TIMEOUT") "30s" }}` at the top of the template and using `{{- with $value := clipHAProxyTimeoutValue (firstMatch $timeSpecPattern (index $cfg.Annotations "haproxy.router.openshift.io/timeout") $defaultServerTimeout) }}` here.
Done.
timeout := clipHAProxyTimeoutValue(firstMatch(pattern, append([]string{cfg.Annotations[annotation]}, values...)...))
if timeout != "" {
	// No error handling because clipHAProxyTimeoutValue returns
	// a valid timeout or an empty string. The latter is already handled.
	timeoutDuration, _ := time.ParseDuration(timeout)
This is a lot of redundant computation:

- `maxTimeoutFirstMatchedAndClipped` parses each route's timeout annotation value using `firstMatch`, then (if the value is valid) `clipHAProxyTimeoutValue`, then `time.ParseDuration`.
- `fe_sni` calls `maxTimeoutFirstMatchedAndClipped`, so you're parsing each valid value 3 times.
- `fe_no_sni` also calls `maxTimeoutFirstMatchedAndClipped`, so you're parsing each valid value 3 more times.
- The backend calls `clipHAProxyTimeoutValue (firstMatch $timeSpecPattern ...)`, so you're parsing each valid value 2 more times.

Granted, the regexp cache that Andy added in #268 avoids the cost of compiling the same regexp repeatedly, but you're still repeatedly doing a regexp match and then `time.ParseDuration` on the same value, right?
The goal of the function was to determine which timeout value will be set on the lowest-layer backends. That's why I used exactly the same combination of functions as is used in the template.

After some time spent on this I didn't find a lot of room for improvement. However, I implemented a benchmark test which:

- allowed me to find a faster way to do the same thing: the max annotation timeout is now found first, and the values are computed only once afterwards.
- showed that I don't get any visible performance gain if I try to get the `time.Duration` from `clipHAProxyTimeoutValue` to avoid another call to `ParseDuration`.

Also, as you advised, I moved the calculation of the max server timeout out of the backends so that it's done only once.
"clipHAProxyTimeoutValue": clipHAProxyTimeoutValue, //clips extraordinarily high timeout values to be below the maximum allowed timeout value
"parseIPList": parseIPList, //parses the list of IPs/CIDRs (IPv4/IPv6)
"maxTimeoutFirstMatchedAndClipped": maxTimeoutFirstMatchedAndClipped, //finds the maximum timeout managed by a given annotation among all the routes, matches it against a given pattern, and clips it
Keep in mind that this list of helper functions is effectively an API. We cannot rename or remove a helper function without potentially breaking third-party templates, so we tend to prefer defining general and composable helpers. That said, I think it is worth putting this functionality in a single specialized helper to avoid excessive performance impact.
> I think it is worth putting this functionality in a single specialized helper to avoid excessive performance impact.

I'm not sure I understand how putting the function into a different helper is supposed to remove the API contract, given that the function will still be used in the config template.
My point was that there is a tradeoff here, and we should be aware of it.
- Pro: Simpler, more efficient template.
- Con: Risk of breaking unsupported customizations if we ever change or remove the helper.
However, it is probably worth making that tradeoff here. I'm not asking you to change anything; I'm just trying to be deliberate and explicit about the design choice.
annotation string
pattern    string
Could you add test cases with different values for `annotation` and `pattern`?
I had a test case with 2 routes, one of which didn't match the pattern (`One route timeout doesn't match pattern`). I added another one in which no route matches the pattern (`No route timeout matches pattern`). Let me know if this doesn't address your comment.
name: "Default timeout is maximum",
state: map[ServiceAliasConfigKey]ServiceAliasConfig{
	"test:route1": {},
	"test:route2": {},
	"test:route3": {},
},
annotation: "haproxy.router.openshift.io/timeout",
pattern:    `[1-9][0-9]*(us|ms|s|m|h|d)?`,
values:     []string{"30s"},
expected:   "30s",
Should at least one of the routes specify a value? It makes sense to me to have a test case for "some routes specify timeouts, but the default timeout is maximum".
By the way, what happens if you have 0 routes?
> Should at least one of the routes specify a value? It makes sense to me to have a test case for "some routes specify timeouts, but the default timeout is maximum".

The goal of `maxTimeoutFirstMatchedAndClipped` is to get the maximum timeout among the ones set on the lowest-layer backends, rather than the maximum of all values provided to the function. For instance, if a route has a timeout annotation set, it will take precedence over the default timeout, even if the default timeout is greater. I updated the godoc for the function to highlight this.
> By the way, what happens if you have 0 routes?

Unit test case added to cover this case (`Empty state`).
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting `/remove-lifecycle stale`. If this issue is safe to close now please do so. /lifecycle stale
Force-pushed e165421 to 05a37e0 (compare).
Reanimating the PR: starting to address Miciah's review.
/remove-lifecycle stale
Force-pushed c1b2a01 to c67ef49 (compare).
High level perf test: baseline (CI image == 4.17) vs. the change (the image with this PR).
BENCH_PKGS ?= $(shell \grep -lR '^func Benchmark' | xargs -I {} dirname {} | sed 's|^|./|' | paste -sd ' ')
BENCH_TEST ?= .
Would you mind adding a couple of comments about what these are for? I see below that `BENCH_PKGS` is a value for flag `-benchmem`, so it could use a clarification.
Done.
If I understand correctly, this represents a major change for our users: whatever is the max connect timeout in ANY route becomes the max server and tunnel timeout for all the routes on the backend. I strongly feel that this needs to be configurable, and off by default, in order for past users to see the same behavior when they upgrade to a version that has this fix.
@candita: 1) only the server timeout is set to the max value on the middle backend; 2) this is only one part of the change. The other part is the server and tunnel timeouts set on all the route backends. Note that the default timeout values are added to the route-level timeouts. We should not have any change in behavior, because for routes which don't set any value for the timeouts (in the annotations), the default value will be set on the route backend.

"Configurable" may put us in the same situation as right now, where a user can set a global (almost) timeout and override the timeouts set at the route level. "Off by default" would lead us to this warning during the config reloads. Overall, I understand the concern: the change may have the potential to hide something I didn't manage to test manually and with our test suites. However, I don't see any other solution for this bug. Let me try to organize a discussion about this PR outside of GitHub so that we can come to a conclusion.
The middle backends (`be_sni`, `be_no_sni`) are updated with the server timeout which is set to the maximum value among all routes from the configuration. This prevents a warning message during config reloads. A benchmark test is added for the template helper function which finds the maximum timeout value among all routes (`maxTimeoutFirstMatchedAndClipped`). A new make target is introduced to run all benchmark tests (`bench`).
Force-pushed c67ef49 to d70581e (compare).
Squashed the commits.
/test e2e-metal-ipi-ovn-ipv6
@alebedev87: all tests passed!
This PR removes the tunnel and server timeouts from the global `defaults` section and sets them on all the backends. The route annotations for the timeouts continue to work as they did before.

As suggested in the upstream issue, the middle backends (`be_sni` and `be_no_sni`) are set with the maximum of all the route timeouts to avoid a warning message during config reloads.

Test using the haproxy-timeout-tunnel repository.

The e2e test: openshift/cluster-ingress-operator#1013.

Presentation which summarizes the issue: link (Red Hat only).