Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

std.cache_req_body(BYTES size, BOOL partial = 0) #3798

Draft
wants to merge 13 commits into
base: master
Choose a base branch
from

Conversation

dridi
Copy link
Member

@dridi dridi commented Apr 11, 2022

Last week my employer organized among other things a hackathon day (thanks!) and I made this seemingly work. I revisited the code and reorganized the commits the next day during my flight, but didn't find much to say in an area of the code base not too familiar to me. I guess it's time to throw more eyeballs at it.

Quoting from the main patch of the series:

req.body: New BS_PARTIAL body status for caching

Partial caching allow VMODs to fetch and inspect part of the body. This
gives the same properties as BS_CACHED, but allows the request body to
be larger than what is stored. If a bereq send failures happens within
the partial cached body delivery, it also becomes eligible to retries.

This enables a more robust caching and retry policy where not knowing
the complete body size (chunked or h2 without content-length) or on the
other hand knowing that a body is larger (content-length) does not lead
to VCL failures.

It can also be valuable to inspect the beginning of a request body. For
example an image upload service could parse images metadata directly in
Varnish to enforce rules such as specific file formats or how large in
pixels an image may be.

One limitation left on purpose is that BS_PARTIAL objects will be sent
with chunked encoding to the backend, even when we know the length.

See individual commits for more details.

@nigoroll
Copy link
Member

Because Transfer-Encoding: chunked is uncommon for request bodies, I would expect some backends to choke on it and would thus think a mode to foward the client's Content-Length would be good. On IRC, @bsdphk rightly pointed out the need for extra care and attention due to smuggling potential, and I agree that we should look into how to handle, for example, the case when a client fails to send the announced length.

@dridi
Copy link
Member Author

dridi commented Apr 11, 2022

bugwash: separate partial caching from request body status.

@dridi dridi force-pushed the partial_req_body branch from 80e5d59 to e5d75ad Compare April 12, 2022 15:16
@dridi
Copy link
Member Author

dridi commented Apr 12, 2022

This morning I replaced c8ce29f with 99f21e5...fc0a497, then I added more coverage in the test case.

Basically:

  • BS_PARTIAL is replaced by a req_body_cached flag
  • another req_body_partial flag is added

Copy link
Member

@nigoroll nigoroll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally 👍🏽
Some of my comments should probably be addressed before merging...

AZ(http_GetHdr(req->http0, H_Content_Length, NULL));
assert(l0 < 0);
http_Unset(req->http0, H_Transfer_Encoding);
http_PrintfHeader(req->http0, "Content-Length: %ju",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This problem existed before the proposed change:
I think this conflicts with Req_Rollback(), because the header is created on the rolled back part of the client workspace.
Because the case is likely not particularly relevant and given that a variant of the problem already exists, I think we might just veto rollback with a cached body.

assert(req->req_body_status->avail > 0);
if (req->req_body_status->length_known) {
assert(req_bodybytes == l0);
assert(req_bodybytes == l);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As req.http.Content-Length can be modified from VCL, I think we should rather either fix the header or fail the request rather than panicking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#3810 was my attempt at addressing this, but considering the consensus on why not to prevent this from happening I now think a panic is the appropriate response here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have my sympathies if you are saying this from a feeling of frustration.
But I actually think we should not not have any panics which can be triggered from VCL.

Also I wanted to take note of an idea from bugwash: For the rollback case, can we make the code DTRT even with the unmodified headers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have my sympathies if you are saying this from a feeling of frustration.
But I actually think we should not not have any panics which can be triggered from VCL.

Actually no, I'm genuinely thinking that if we leave dragons in VCL, we can't afford a gentle failure here. If you both mess with framing headers and request body caching things are going horribly wrong and should blow up.

Also I wanted to take note of an idea from bugwash: For the rollback case, can we make the code DTRT even with the unmodified headers?

One thought I didn't discuss during bugwash is that body status shouldn't be the responsibility of the HTTP connection. We miss some separation of concerns in struct http_conn that was OK when HTTP/1 was the only game in town. There are more things to disentangle.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the course of my continuation of this work I replaced the code with just setting the respective headers because, with filters on the request body (but before caching), we can not make any assumptions on the relation between the content-length before/after caching.

assert(req_bodybytes == l0);
assert(req_bodybytes == l);
AZ(http_GetHdr(req->http0, H_Transfer_Encoding, NULL));
AZ(http_GetHdr(req->http, H_Transfer_Encoding, NULL));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same ⬆️

assert(req_bodybytes < l0);
assert(req_bodybytes < l);
} else {
assert(req_bodybytes == l0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment on previous commit: we should rather not panic

bin/varnishd/cache/cache_req_body.c Show resolved Hide resolved
@simonvik
Copy link
Contributor

simonvik commented May 23, 2022

With the following test that i partly stole from vmod_bodyaccess I can assert varnish with "Condition((bo->req->body_oc) != 0) not true."

varnishtest "Assert"

server s1 {
    rxreq
    txresp -hdr "OK: yes"
    rxreq
    txresp -hdr "OK: yes"
} -start

varnish v1 -vcl+backend {
    import std;

    sub vcl_recv {
        if(req.method == "POST"){
                std.cache_req_body(10KB);
        }
    }

} -start

client c1 {
    txreq -req POST -body "BANANE"
    rxresp

    txreq
    rxresp
} -run

Not 100% sure why but someone probably has an idea

@dridi
Copy link
Member Author

dridi commented May 23, 2022

--- bin/varnishd/cache/cache_req_fsm.c
+++ bin/varnishd/cache/cache_req_fsm.c
@@ -893,6 +893,7 @@ cnt_recv_prep(struct req *req, const char *ci)
                req->hash_always_miss = 0;
                req->hash_ignore_busy = 0;
                req->hash_ignore_vary = 0;
+               req->req_body_cached = 0;
                req->client_identity = NULL;
                req->storage = NULL;
        }

We lack coverage of keepalive after a cached req body for HTTP/1.

@bsdphk
Copy link
Contributor

bsdphk commented May 23, 2022

How does this work with H2 ?

I guess it causes head-of-line-blocking, since I see nothing tweaking the H2 windows to stop the client from pushing ahead ?

Do we care ?

@dridi
Copy link
Member Author

dridi commented May 23, 2022

I don't think we need to care about h2, the rxbuf from #3661 is here to prevent head of line blocking.

FWIW, I plan to address the panic reported by @simonvik like this:

--- i/bin/varnishd/cache/cache_req_body.c
+++ w/bin/varnishd/cache/cache_req_body.c
@@ -125,6 +125,7 @@ vrb_pull(struct req *req, ssize_t maxsize, unsigned partial,
                AN(oa_len);
                req_bodybytes = oa_len;
                AZ(HSH_DerefObjCore(req->wrk, &req->body_oc, 0));
+               req->req_body_cached = 0;
                req->req_body_partial = 0;
        }
 
@@ -345,10 +346,14 @@ VRB_Free(struct req *req)
 
        CHECK_OBJ_NOTNULL(req, REQ_MAGIC);
 
-       if (req->body_oc == NULL)
+       AZ(req->req_body_partial);
+       if (req->body_oc == NULL) {
+               AZ(req->req_body_cached);
                return;
+       }
 
        r = HSH_DerefObjCore(req->wrk, &req->body_oc, 0);
+       req->req_body_cached = 0;
 
        // each busyobj may have gained a reference
        assert (r >= 0);
diff --git i/bin/varnishd/cache/cache_req_fsm.c w/bin/varnishd/cache/cache_req_fsm.c
index de2a4b8584..37840ccd14 100644
--- i/bin/varnishd/cache/cache_req_fsm.c
+++ w/bin/varnishd/cache/cache_req_fsm.c
@@ -895,6 +895,8 @@ cnt_recv_prep(struct req *req, const char *ci)
                req->hash_ignore_vary = 0;
                req->client_identity = NULL;
                req->storage = NULL;
+               AZ(req->req_body_cached);
+               AZ(req->req_body_partial);
        }
 
        req->is_hit = 0;

The idea is to clear the req_body_cached flag when req->body_oc's reference is dropped, something that used to be inherent to the assignment of req_body_status. I'll try to find some time to carefully review this aspect of the change.

dridi added 13 commits May 25, 2022 15:41
It is otherwise very challenging to coordinate certain behaviors between
a backend fetch and a VTC server for example. The slow_acceptor delay is
2s, which is probably more than needed for a bereq, hence the 1s delay.
It's either guaranteed to fail a fetch or cache nothing, in the sense of
accessing to the request body through a VMOD. It is still possible to
end up with an empty body cached, if req.body turned out to be empty and
couldn't be predicted. It is no longer possible to actively try caching
an empty one.
This way we can keep track of how we got the body (until it is taken by
a busyobj) separately from the fact that we cached it.
To make it accessible from VRB_Iterate() in the next commit.
Partial caching allow VMODs to fetch and inspect part of the body. This
gives the same properties as req_body_cached but allows the request body
to be larger than what is stored. If a bereq send failure happens within
the partial cached body delivery, it also becomes eligible to retries.

This enables a more robust caching and retry policy where not knowing
the complete body size (chunked or h2 without content-length) or on the
other hand knowing that a body is larger (content-length) does not lead
to VCL failures.

It can also be valuable to inspect the beginning of a request body. For
example an image upload service could parse images metadata directly in
Varnish to enforce rules such as specific file formats or how large in
pixels an image may be.

Better diff with the --ignore-all-space option.
When a chunked req.body is partially cached the header would be added a
second time upon a retry.
This parameter defaults to false to avoid breaking existing code.
Ensure that a request with a body does not break a subsequent request
without a body on the same client HTTP/1 session.

Spotted by @simonvik
@dridi dridi force-pushed the partial_req_body branch from e5d75ad to 706844a Compare May 25, 2022 13:58
@dridi
Copy link
Member Author

dridi commented May 25, 2022

Rebased against current master:

  • flags are now cleared consistently and asserted
  • test coverage added in 706844a

@nigoroll nigoroll self-requested a review May 27, 2022 12:43
http_PrintfHeader(req->http0, "Content-Length: %ju",
(uintmax_t)req_bodybytes);

AZ(http_GetHdr(req->http, H_Content_Length, NULL));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also FTR:
The req.http0 changes are logged, which confused me at first thinking that the vrb_pull() might be called twice:

**** v1    vsl|       1004 ReqUnset        c Transfer-encoding: chunked
**** v1    vsl|       1004 ReqHeader       c Content-Length: 110
**** v1    vsl|       1004 ReqUnset        c Transfer-encoding: chunked
**** v1    vsl|       1004 ReqHeader       c Content-Length: 110

@nigoroll
Copy link
Member

nigoroll commented May 28, 2022

24fe377 exposes that we have an issue with chunked encoding: When the body cache size exactly matches the length, we fail caching.

The trivial options are:

  • ignore (and probably docfix) that, for chunked encoding, the buffer needs to be at least one byte larger than the actual length
  • make the buffer one byte larger than specified

But this is only the symptom of another, possibly more relevant issue: the v1f chunked vfp currently can not send VFP_END with the last bytes. To solve this issue, we need to peek into the next chunk header before returning VFP_OK. See next comment...

@dridi dridi mentioned this pull request May 30, 2022
nigoroll added a commit to nigoroll/varnish-cache that referenced this pull request Jun 3, 2022
This commit contains only the init/fini mechanics for the
state/buffer, it is not used yet.

The state is allocated on the heap to be prepared for partial caching
(see varnishcache#3798), where the V1F layer is left and control returns to VCL:

- the worker workspace aws is not an option because it must be freed
  to its original state before returning to VCL.

- the task (req) workspace is not an option because of rollbacks.

While I would have preferred to avoid a heap allocation, it will be
payed off by the reduction of system calls in follow up commits.
@nigoroll
Copy link
Member

nigoroll commented Jun 7, 2022

I now have four branches in total with fixes and improvements around this PR. To reduce confusion, I will reference these in this single comment and delete some other references:

The mentions are in proposed order to be applied to trunk:

nigoroll added a commit to nigoroll/varnish-cache that referenced this pull request Jun 24, 2022
This commit contains only the init/fini mechanics for the
state/buffer, it is not used yet.

The state is allocated on the heap to be prepared for partial caching
(see varnishcache#3798), where the V1F layer is left and control returns to VCL:

- the worker workspace aws is not an option because it must be freed
  to its original state before returning to VCL.

- the task (req) workspace is not an option because of rollbacks.

While I would have preferred to avoid a heap allocation, it will be
payed off by the reduction of system calls in follow up commits.
nigoroll added a commit to nigoroll/varnish-cache that referenced this pull request Jun 24, 2022
This commit contains only the init/fini mechanics for the
state/buffer, it is not used yet.

The state is allocated on the heap to be prepared for partial caching
(see varnishcache#3798), where the V1F layer is left and control returns to VCL:

- the worker workspace aws is not an option because it must be freed
  to its original state before returning to VCL.

- the task (req) workspace is not an option because of rollbacks.

While I would have preferred to avoid a heap allocation, it will be
payed off by the reduction of system calls in follow up commits.
nigoroll added a commit to nigoroll/varnish-cache that referenced this pull request Jun 24, 2022
This commit contains only the init/fini mechanics for the
state/buffer, it is not used yet.

The state is allocated on the heap to be prepared for partial caching
(see varnishcache#3798), where the V1F layer is left and control returns to VCL:

- the worker workspace aws is not an option because it must be freed
  to its original state before returning to VCL.

- the task (req) workspace is not an option because of rollbacks.

While I would have preferred to avoid a heap allocation, it will be
payed off by the reduction of system calls in follow up commits.
nigoroll added a commit to nigoroll/libvmod-soap that referenced this pull request Jul 4, 2022
See varnishcache/varnish-cache#3798 (comment)

Changes:

* need std.cache_req_body() now

Unless the request body is only used by the soap vmod (that is, whenever it is
to be sent to the backend also), std.cache_req_body(), optionally with the partial
argument, needs to be used to cache at least as much of the request body
to fulfil any xml lookup used.

Partial caching can be used if, for example, only the XML header is read and the
header is known to be placed at the top.

* VCL is now responsible for checking Content-Encoding

... because it could push filters to support other Content-Encodings

* Varnish Cache supports chunked Encoding for bodies

The test for a body with no Content-Length and no Chunked-Encoding did
not make sense, request bodies could never be closed with "EOL" semantics
(write side shutdown), not even in HTTP/1.0

https://www.rfc-editor.org/rfc/rfc1945.html#section-8.3

   A valid Content-Length is required on all HTTP/1.0 POST requests.

* Polish vcc file

$ABI is implicitly required by varnish-cache as of af87f0cd1a7b8be8a124dca355f04dd641bc5512

The $Module quotes are currently not required.
@dridi dridi marked this pull request as draft July 3, 2023 09:42
@dridi dridi marked this pull request as ready for review July 3, 2023 09:55
@dridi dridi marked this pull request as draft July 3, 2023 09:55
@dridi
Copy link
Member Author

dridi commented Jul 3, 2024

Some changes originating from this pull request (but going one step further) were submitted in #4125 and pushed upstream (ce3e446...f0e9df8). Partial request body caching should be revisited once trailers are supported.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants