TCP/UDP: new function is_listening: t -> ~port:int -> callback option #508

Draft
wants to merge 2 commits into
base: main

hannesm
Member

@hannesm hannesm commented Apr 5, 2023

This is useful for proxies, middleware, and interception of requests. A running example is Let's Encrypt and its HTTP challenge.

The methodology is as follows:

  • the unikernel requests a challenge (via HTTPS from Let's Encrypt) and solves it (using a private key and some cryptographic computations)
  • the Let's Encrypt server (which wants proof of ownership of the hostname in the certificate signing request) requests a specific resource via HTTP on port 80 (http://example.com/.well-known/acme-challenge/...)
  • the unikernel needs to reply properly to that challenge

Now, one path (the one we have taken until now) is to special-case .well-known/acme-challenge in every unikernel we write.

Another path is to create a Let's Encrypt HTTP challenge library that takes a stack and, whenever it needs to, registers itself for port 80, serving .well-known/acme-challenge itself and proxying everything it is not interested in to the old handler (hence is_listening).
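The interception idea above can be sketched with a toy model. This is not the mirage-tcpip API: the callback type is simplified to `string -> string` (request path to response body) so that it runs with the standard library alone, and `intercept`, `token_response`, and `challenge_prefix` are hypothetical names introduced for illustration. Only the shape of `is_listening` (`t -> port:int -> callback option`) follows the PR title.

```ocaml
(* Toy model of a listener table keyed by port. In the real stack the
   callback would work on a flow inside Lwt; here it is [string -> string]
   (assumption, for a self-contained sketch). *)
type callback = string -> string

type t = { listeners : (int, callback) Hashtbl.t }

let create () = { listeners = Hashtbl.create 7 }

let listen t ~port cb = Hashtbl.replace t.listeners port cb
let unlisten t ~port = Hashtbl.remove t.listeners port
let is_listening t ~port = Hashtbl.find_opt t.listeners port

let challenge_prefix = "/.well-known/acme-challenge/"

let starts_with ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* Hypothetical helper: take over [port], answer challenge requests with
   [token_response], and forward everything else to the handler that was
   registered before. Returns a function that restores the previous state. *)
let intercept t ~port ~token_response =
  let previous = is_listening t ~port in
  let handler path =
    if starts_with ~prefix:challenge_prefix path then token_response
    else match previous with
      | Some old -> old path
      | None -> "404"
  in
  listen t ~port handler;
  fun () ->
    match previous with
    | Some old -> listen t ~port old
    | None -> unlisten t ~port
```

Calling the returned function once the challenge is done puts the old handler back, so other clients of port 80 see no interruption in this model.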

Concurrent updates to the "listen" hashtable are dangerous, of course, and great care has to be taken if other parts of the application also re-register listeners. But I'm confident that, since listen, unlisten, and is_listening are pure (not in the Lwt monad), this is fine and can be dealt with. Another option would be to implement a real protocol/locking around the shared global resource of listening ports (but I'd first see whether we run into such trouble).

Another example is the Let's Encrypt ALPN challenge, where the process is as follows:

  • the unikernel requests a challenge (via HTTPS from Let's Encrypt) and solves it (using a private key and some cryptographic computations)
  • the Let's Encrypt server (which wants proof of ownership of the hostname in the signing request) connects via TLS on port 443 with a specific ALPN string
  • the unikernel needs to reply with a specially crafted self-signed certificate

This can, as above, be implemented by a temporary proxy while the challenge is in progress -- without service interruptions for other parties (web browsers, ...).

@hannesm hannesm requested a review from dinosaure April 5, 2023 13:13
@hannesm
Member Author

hannesm commented Apr 11, 2023

I added a second function, TCP.unread : flow -> Cstruct.t -> unit, whose purpose is to push some data back into the flow.

I'm not convinced this is the right thing to do, though it is very convenient for my use case. The implementation is rather basic: it works fine for my use case, but not in general, where a task may already be blocked on read when unread is called.
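A minimal sketch of the intended semantics, assuming a toy flow rather than the real TCP flow type: `pending`, `source`, and this `read` are illustrative names, and strings stand in for `Cstruct.t`. Like the basic implementation described above, it does not handle a reader that is already blocked.

```ocaml
(* Toy flow: [pending] holds bytes pushed back by [unread]; [source] stands
   in for the network and yields chunks on demand. *)
type flow = {
  mutable pending : string;
  source : string Queue.t;
}

(* Return pushed-back data first, then fall back to the source. *)
let read flow =
  if flow.pending <> "" then begin
    let data = flow.pending in
    flow.pending <- "";
    Some data
  end else
    Queue.take_opt flow.source

(* Push [data] back so that the next [read] returns it first. *)
let unread flow data =
  flow.pending <- data ^ flow.pending
```

A proxy could use this to inspect the first bytes of a request, decide that it is not interested, call `unread`, and hand the flow to the previous handler, which then sees the stream from the beginning.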

I'd like to finish and evaluate the prototype I have before merging this.

@hannesm hannesm marked this pull request as draft April 11, 2023 10:19
@@ -17,7 +17,6 @@

include Tcpip.Tcp.S
with type ipaddr = Ipaddr.t
and type flow = Lwt_unix.file_descr
Member Author

@hannesm hannesm Apr 11, 2023


any insight whether this is needed somewhere?

@@ -59,6 +59,9 @@ module Rx = struct
    | None -> 0
    | Some b -> Cstruct.length b

  let add_l t s =
    ignore(Lwt_dllist.add_l (Some s) t.q)
Member Author


any idea whether any other things must be updated? I frankly don't understand much of the add_r below, but it deals with various cur_size and max_size.

Member Author


also, do t.readers need to be notified?

Member


cur_size and max_size seem to be related to a window of available data in the buffer: cur_size is what is currently available and max_size is the bound on available data (though it should be possible to exceed that limit with the linked-list data structure).
To keep things going, I think it's best to update cur_size and call notify_size_watcher to signal that the data is available. The first comparison in add_r seems to be there to avoid exceeding max_size; I'm not sure the problem could be anything other than higher memory consumption, but it may be best to take care of that. Something like:

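  let add_l t s =
     match Lwt_dllist.take_opt_l t.readers with
     | None ->
         (* no reader is waiting: account for the data and queue it at the front *)
         t.cur_size <- Int32.(add t.cur_size (of_int (Cstruct.length s)));
         ignore(Lwt_dllist.add_l (Some s) t.q);
         notify_size_watcher t
     | Some u ->
         (* a reader is waiting: hand the data over directly
            (wrapped in Some, assuming readers expect an option as in add_r) *)
         Lwt.return (Lwt.wakeup u (Some s))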
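The same idea can be modelled without Lwt, as a rough sketch only: `rx`, `waiters`, and the callback-based `read` below are illustrative stand-ins (not the `User_buffer.Rx` types), with plain callbacks in place of Lwt wakeners and strings in place of `Cstruct.t`. It shows the two cases: hand the data to a blocked reader if there is one, otherwise put it at the front of the queue and update `cur_size`.

```ocaml
(* Toy receive buffer. [waiters] models tasks blocked in a read; each waiter
   is a callback to run when data arrives. All names are illustrative. *)
type rx = {
  q : string Queue.t;                 (* buffered segments, oldest first *)
  waiters : (string -> unit) Queue.t; (* blocked readers, oldest first *)
  mutable cur_size : int;             (* bytes currently buffered *)
}

let create () =
  { q = Queue.create (); waiters = Queue.create (); cur_size = 0 }

(* Push [s] back to the front of the buffer, or hand it directly to a
   blocked reader if there is one. *)
let unread t s =
  match Queue.take_opt t.waiters with
  | Some wake -> wake s
  | None ->
    (* Stdlib queues only append, so rebuild the queue with [s] first. *)
    let rest = Queue.copy t.q in
    Queue.clear t.q;
    Queue.add s t.q;
    Queue.transfer rest t.q;
    t.cur_size <- t.cur_size + String.length s

(* Deliver buffered data to [on_data], or register it as a waiter. *)
let read t ~on_data =
  match Queue.take_opt t.q with
  | Some s ->
    t.cur_size <- t.cur_size - String.length s;
    on_data s
  | None -> Queue.add on_data t.waiters
```

In this model the size accounting only changes when data is actually buffered, which mirrors the None branch of the suggestion above.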

@@ -78,6 +149,10 @@ let dst fd =
in
ip, port

let unread fd buf =
  let buf = Cstruct.append buf fd.buf in
  fd.buf <- buf
Member Author


What still needs to be handled (for a complete, general API) is the case where an Lwt task is already in Lwt_cstruct.read: the read should be cancelled and the buf provided here returned to the caller.
