-
Notifications
You must be signed in to change notification settings - Fork 381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
std.cache_req_body(BYTES size, BOOL partial = 0) #3798
base: master
Are you sure you want to change the base?
Conversation
Because |
bugwash: separate partial caching from request body status. |
This morning I replaced c8ce29f with 99f21e5...fc0a497, then I added more coverage in the test case. Basically:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally 👍🏽
Some of my comments should probably be addressed before merging...
AZ(http_GetHdr(req->http0, H_Content_Length, NULL)); | ||
assert(l0 < 0); | ||
http_Unset(req->http0, H_Transfer_Encoding); | ||
http_PrintfHeader(req->http0, "Content-Length: %ju", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This problem existed before the proposed change:
I think this conflicts with Req_Rollback(), because the header is created on the rolled back part of the client workspace.
Because the case is likely not particularly relevant and given that a variant of the problem already exists, I think we might just veto rollback with a cached body.
bin/varnishd/cache/cache_req_body.c
Outdated
assert(req->req_body_status->avail > 0); | ||
if (req->req_body_status->length_known) { | ||
assert(req_bodybytes == l0); | ||
assert(req_bodybytes == l); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As req.http.Content-Length
can be modified from VCL, I think we should rather either fix the header or fail the request rather than panicking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#3810 was my attempt at addressing this, but considering the consensus on why not to prevent this from happening I now think a panic is the appropriate response here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have my sympathies if you are saying this from a feeling of frustration.
But I actually think we should not not have any panics which can be triggered from VCL.
Also I wanted to take note of an idea from bugwash: For the rollback case, can we make the code DTRT even with the unmodified headers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have my sympathies if you are saying this from a feeling of frustration.
But I actually think we should not not have any panics which can be triggered from VCL.
Actually no, I'm genuinely thinking that if we leave dragons in VCL, we can't afford a gentle failure here. If you both mess with framing headers and request body caching things are going horribly wrong and should blow up.
Also I wanted to take note of an idea from bugwash: For the rollback case, can we make the code DTRT even with the unmodified headers?
One thought I didn't discuss during bugwash is that body status shouldn't be the responsibility of the HTTP connection. We miss some separation of concerns in struct http_conn
that was OK when HTTP/1 was the only game in town. There are more things to disentangle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the course of my continuation of this work I replaced the code with just setting the respective headers because, with filters on the request body (but before caching), we can not make any assumptions on the relation between the content-length before/after caching.
assert(req_bodybytes == l0); | ||
assert(req_bodybytes == l); | ||
AZ(http_GetHdr(req->http0, H_Transfer_Encoding, NULL)); | ||
AZ(http_GetHdr(req->http, H_Transfer_Encoding, NULL)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same ⬆️
assert(req_bodybytes < l0); | ||
assert(req_bodybytes < l); | ||
} else { | ||
assert(req_bodybytes == l0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment on previous commit: we should rather not panic
With the following test that i partly stole from vmod_bodyaccess I can assert varnish with "Condition((bo->req->body_oc) != 0) not true."
Not 100% sure why but someone probably has an idea |
--- bin/varnishd/cache/cache_req_fsm.c
+++ bin/varnishd/cache/cache_req_fsm.c
@@ -893,6 +893,7 @@ cnt_recv_prep(struct req *req, const char *ci)
req->hash_always_miss = 0;
req->hash_ignore_busy = 0;
req->hash_ignore_vary = 0;
+ req->req_body_cached = 0;
req->client_identity = NULL;
req->storage = NULL;
} We lack coverage of keepalive after a cached req body for HTTP/1. |
How does this work with H2 ? I guess it causes head-of-line-blocking, since I see nothing tweaking the H2 windows to stop the client from pushing ahead ? Do we care ? |
I don't think we need to care about h2, the rxbuf from #3661 is here to prevent head of line blocking. FWIW, I plan to address the panic reported by @simonvik like this: --- i/bin/varnishd/cache/cache_req_body.c
+++ w/bin/varnishd/cache/cache_req_body.c
@@ -125,6 +125,7 @@ vrb_pull(struct req *req, ssize_t maxsize, unsigned partial,
AN(oa_len);
req_bodybytes = oa_len;
AZ(HSH_DerefObjCore(req->wrk, &req->body_oc, 0));
+ req->req_body_cached = 0;
req->req_body_partial = 0;
}
@@ -345,10 +346,14 @@ VRB_Free(struct req *req)
CHECK_OBJ_NOTNULL(req, REQ_MAGIC);
- if (req->body_oc == NULL)
+ AZ(req->req_body_partial);
+ if (req->body_oc == NULL) {
+ AZ(req->req_body_cached);
return;
+ }
r = HSH_DerefObjCore(req->wrk, &req->body_oc, 0);
+ req->req_body_cached = 0;
// each busyobj may have gained a reference
assert (r >= 0);
diff --git i/bin/varnishd/cache/cache_req_fsm.c w/bin/varnishd/cache/cache_req_fsm.c
index de2a4b8584..37840ccd14 100644
--- i/bin/varnishd/cache/cache_req_fsm.c
+++ w/bin/varnishd/cache/cache_req_fsm.c
@@ -895,6 +895,8 @@ cnt_recv_prep(struct req *req, const char *ci)
req->hash_ignore_vary = 0;
req->client_identity = NULL;
req->storage = NULL;
+ AZ(req->req_body_cached);
+ AZ(req->req_body_partial);
}
req->is_hit = 0; The idea is to clear the |
It is otherwise very challenging to coordinate certain behaviors between a backend fetch and a VTC server for example. The slow_acceptor delay is 2s, which is probably more than needed for a bereq, hence the 1s delay.
It's either guaranteed to fail a fetch or cache nothing, in the sense of accessing to the request body through a VMOD. It is still possible to end up with an empty body cached, if req.body turned out to be empty and couldn't be predicted. It is no longer possible to actively try caching an empty one.
This way we can keep track of how we got the body (until it is taken by a busyobj) separately from the fact that we cached it.
With more checks.
To make it accessible from VRB_Iterate() in the next commit.
Partial caching allow VMODs to fetch and inspect part of the body. This gives the same properties as req_body_cached but allows the request body to be larger than what is stored. If a bereq send failure happens within the partial cached body delivery, it also becomes eligible to retries. This enables a more robust caching and retry policy where not knowing the complete body size (chunked or h2 without content-length) or on the other hand knowing that a body is larger (content-length) does not lead to VCL failures. It can also be valuable to inspect the beginning of a request body. For example an image upload service could parse images metadata directly in Varnish to enforce rules such as specific file formats or how large in pixels an image may be. Better diff with the --ignore-all-space option.
When a chunked req.body is partially cached the header would be added a second time upon a retry.
This parameter defaults to false to avoid breaking existing code.
Ensure that a request with a body does not break a subsequent request without a body on the same client HTTP/1 session. Spotted by @simonvik
Rebased against current master:
|
http_PrintfHeader(req->http0, "Content-Length: %ju", | ||
(uintmax_t)req_bodybytes); | ||
|
||
AZ(http_GetHdr(req->http, H_Content_Length, NULL)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also FTR:
The req.http0
changes are logged, which confused me at first thinking that the vrb_pull()
might be called twice:
**** v1 vsl| 1004 ReqUnset c Transfer-encoding: chunked
**** v1 vsl| 1004 ReqHeader c Content-Length: 110
**** v1 vsl| 1004 ReqUnset c Transfer-encoding: chunked
**** v1 vsl| 1004 ReqHeader c Content-Length: 110
24fe377 exposes that we have an issue with chunked encoding: When the body cache size exactly matches the length, we fail caching. The trivial options are:
But this is only the symptom of another, possibly more relevant issue: the v1f chunked vfp currently can not send |
This commit contains only the init/fini mechanics for the state/buffer, it is not used yet. The state is allocated on the heap to be prepared for partial caching (see varnishcache#3798), where the V1F layer is left and control returns to VCL: - the worker workspace aws is not an option because it must be freed to its original state before returning to VCL. - the task (req) workspace is not an option because of rollbacks. While I would have preferred to avoid a heap allocation, it will be payed off by the reduction of system calls in follow up commits.
I now have four branches in total with fixes and improvements around this PR. To reduce confusion, I will reference these in this single comment and delete some other references: The mentions are in proposed order to be applied to trunk:
|
This commit contains only the init/fini mechanics for the state/buffer, it is not used yet. The state is allocated on the heap to be prepared for partial caching (see varnishcache#3798), where the V1F layer is left and control returns to VCL: - the worker workspace aws is not an option because it must be freed to its original state before returning to VCL. - the task (req) workspace is not an option because of rollbacks. While I would have preferred to avoid a heap allocation, it will be payed off by the reduction of system calls in follow up commits.
This commit contains only the init/fini mechanics for the state/buffer, it is not used yet. The state is allocated on the heap to be prepared for partial caching (see varnishcache#3798), where the V1F layer is left and control returns to VCL: - the worker workspace aws is not an option because it must be freed to its original state before returning to VCL. - the task (req) workspace is not an option because of rollbacks. While I would have preferred to avoid a heap allocation, it will be payed off by the reduction of system calls in follow up commits.
This commit contains only the init/fini mechanics for the state/buffer, it is not used yet. The state is allocated on the heap to be prepared for partial caching (see varnishcache#3798), where the V1F layer is left and control returns to VCL: - the worker workspace aws is not an option because it must be freed to its original state before returning to VCL. - the task (req) workspace is not an option because of rollbacks. While I would have preferred to avoid a heap allocation, it will be payed off by the reduction of system calls in follow up commits.
See varnishcache/varnish-cache#3798 (comment) Changes: * need std.cache_req_body() now Unless the request body is only used by the soap vmod (that is, whenever it is to be sent to the backend also), std.cache_req_body(), optionally with the partial argument, needs to be used to cache at least as much of the request body to fulfil any xml lookup used. Partial caching can be used if, for example, only the XML header is read and the header is known to be placed at the top. * VCL is now responsible for checking Content-Encoding ... because it could push filters to support other Content-Encodings * Varnish Cache supports chunked Encoding for bodies The test for a body with no Content-Length and no Chunked-Encoding did not make sense, request bodies could never be closed with "EOL" semantics (write side shutdown), not even in HTTP/1.0 https://www.rfc-editor.org/rfc/rfc1945.html#section-8.3 A valid Content-Length is required on all HTTP/1.0 POST requests. * Polish vcc file $ABI is implicitly required by varnish-cache as of af87f0cd1a7b8be8a124dca355f04dd641bc5512 The $Module quotes are currently not required.
Some changes originating from this pull request (but going one step further) were submitted in #4125 and pushed upstream (ce3e446...f0e9df8). Partial request body caching should be revisited once trailers are supported. |
Last week my employer organized among other things a hackathon day (thanks!) and I made this seemingly work. I revisited the code and reorganized the commits the next day during my flight, but didn't find much to say in an area of the code base not too familiar to me. I guess it's time to throw more eyeballs at it.
Quoting from the main patch of the series:
See individual commits for more details.