Merge pull request #1811 from private-octopus/try-network-order
Ports in network order in sockaddr
huitema authored Dec 18, 2024
2 parents 7dd57d9 + 6cd1989 commit 811fd16
Showing 13 changed files with 293 additions and 51 deletions.
2 changes: 1 addition & 1 deletion ci/build_picotls.ps1
@@ -1,5 +1,5 @@
# Build at a known-good commit
$COMMIT_ID=" 5a4461d8a3948d9d26bf861e7d90cb80d8093515"
$COMMIT_ID= "5a4461d8a3948d9d26bf861e7d90cb80d8093515"

# Match expectations of picotlsvs project.
mkdir $dir\include\
15 changes: 14 additions & 1 deletion doc/architecture.md
@@ -73,7 +73,9 @@ The API enables applications to:
# Networking API

The networking API enables processes to submit incoming packets to a QUIC context,
and to poll a QUIC context for new packets to send. A typical process using
and to poll a QUIC context for new packets to send. There could be multiple
implementations of that API. Most applications will provide an interface to
classic UDP sockets. A typical process using
picoquic will be organized around the networking API:

1. Create a QUIC context
@@ -91,6 +93,17 @@ picoquic will be organized around the networking API:
server process needs to close.
6. Close the QUIC context

The default "socket loop"{{socket_loop.md}} provides an implementation of
this logic. Applications may opt to use that, but may also develop their own
code, for example because they implement multiple protocols and need to manage
multiple sockets, or because they want to integrate with a specific
event handling library like `libevent`, or maybe because they want to
use a high performance API like `io_uring` or DPDK. The Picoquic test code
uses that API to send messages through a simple simulator instead of
actual sockets.

The networking API is detailed in the next sections.

## Polling API

The polling API allows a process to learn how long the QUIC context can wait until the next
238 changes: 238 additions & 0 deletions doc/socket_loop.md
@@ -0,0 +1,238 @@
# Default Socket Loop

The architecture document {{architecture.md}} describes how picoquic exposes
both an application API, on top of which applications implement their logic, and
a networking API, under which implementations provide code to send or receive
messages. The library provides a default implementation of the networking services,
which is suitable for simple applications.

![Socket loop under the network API](socket_loop.png)

The socket loop code is suitable for applications that manage a single
UDP socket. It can be operated in synchronous mode for single threaded
applications, or in asynchronous mode for multithreaded applications.

## Launching the socket loop

The socket loop APIs are defined in `picoquic_packet_loop.h`. Applications
that want to use this component will need to include that header file
and launch the loop by calling the function `picoquic_packet_loop_v2`
for synchronous operation, or `picoquic_start_network_thread` for
multithreaded operation.

If an application provides its own implementation of the packet loop, it should
call its own functions instead of the packet loop, and it should not
include the header file `picoquic_packet_loop.h`. In that case, the code of the
packet loop will not be linked in the application's binary.

## Synchronous operation

In synchronous operation, the application prepares a QUIC context and an application
context and then calls `picoquic_packet_loop_v2`:

```
int picoquic_packet_loop_v2(picoquic_quic_t* quic,
picoquic_packet_loop_param_t * param,
picoquic_packet_loop_cb_fn loop_callback,
void * loop_callback_ctx);
```

The loop will execute,
calling the Picoquic Networking API functions `picoquic_prepare_next_packet_ex`
to ask the stack whether packets are ready to be sent and
`picoquic_incoming_packet_ex` when packets are received from the network.

The code expects that the `quic` context has already been created by the
application, setting transport parameters and other options as seen fit by the
application.

The `param` argument contains data to parameterize the packet loop:

* `local_port`: the value of the local port to which the socket will be
bound, in host order (e.g., if choosing port 443, the value 443). If
that value is set to zero, the socket will be opened on a random port,
as chosen by the system.

* `local_af`: the value of the Address Family that should be selected for
the local socket. If the value is left to `AF_UNSPEC`, two sockets
will be created, one for `AF_INET` (IPv4), and one for `AF_INET6` (IPv6).

* `dest_if`: the interface identifier that should be associated with the local
socket, or 0.

* `socket_buffer_size`: if specified, the size of the socket send and
receive buffers, set with the socket option SO_SNDBUF and SO_RCVBUF.

* `do_not_use_gso`: by default, the socket loop tries to send several
UDP packets in a single call to `sendmsg`, in order to improve
performance. Setting this flag forces the code to send exactly
one message per call.

* `extra_socket_required`: request to create a secondary socket, used
for example to test or simulate migration or multipath functions.
That socket will be set to a random port number, chosen by the
system. If the parameter `local_af` is left to `AF_UNSPEC`, two sockets
will be created, one for `AF_INET` (IPv4), and one for `AF_INET6` (IPv6).

* `prefer_extra_socket`: if the source address and source port are set,
outgoing packets will be sent on a socket with a matching address and port.

* `simulate_eio`: simulate an EIO socket error. This error happens when
the socket does not support UDP GSO. The simulation enables us to
test the automatic fallback to one packet per sendmsg call.

* `send_length_max`: the largest buffer size used by the calls to
`picoquic_prepare_next_packet_ex`. This is used in debugging,
to verify that the `UDP_GSO` implementation is functional.
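
The distinction between the host-order `local_port` parameter and the network-order ports carried in socket address structures can be illustrated with a minimal sketch. It uses only the standard POSIX socket headers (Windows would need `winsock2.h` instead), and the helper names `example_set_ipv4_loopback` and `example_get_host_port` are hypothetical, not part of picoquic:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: fill an IPv4 loopback address.
 * host_port is in host order (e.g. 443); the sockaddr stores it
 * in network order, which is why htons() is applied. */
static void example_set_ipv4_loopback(struct sockaddr_in* addr, uint16_t host_port)
{
    assert(addr != NULL);
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = htons(host_port);
}

/* Hypothetical helper: read the port of an IPv4 or IPv6 socket
 * address and return it in host order, or 0 for other families. */
static uint16_t example_get_host_port(const struct sockaddr* addr)
{
    assert(addr != NULL);
    if (addr->sa_family == AF_INET) {
        return ntohs(((const struct sockaddr_in*)addr)->sin_port);
    }
    else if (addr->sa_family == AF_INET6) {
        return ntohs(((const struct sockaddr_in6*)addr)->sin6_port);
    }
    return 0;
}
```

An application that logs or compares a port taken from a socket address would convert it with `ntohs` first, as the second helper does, while a value such as `local_port` can be used directly.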


In addition, the packet loop exposes a network level callback API, to handle
network level events that are not directly linked to the QUIC connections.
The callback API is defined by the function prototype:

```
typedef int (*picoquic_packet_loop_cb_fn)(picoquic_quic_t * quic, picoquic_packet_loop_cb_enum cb_mode, void * callback_ctx, void * callback_argv);
```

It exposes a series of callback events:

* `picoquic_packet_loop_ready`: Indicates that the required sockets are properly open. Passes an argument of type
`picoquic_packet_loop_options_t`, enabling the application to set the corresponding flags if
it wants to be called for a time check before the loop waits for timers or incoming packets.
* `picoquic_packet_loop_after_receive`: Called after packets have been received, enabling the application
to perform picoquic API calls triggered by the received data.
* `picoquic_packet_loop_after_send`: Called after packets have been sent, enabling the application
to perform picoquic API calls at that point.
* `picoquic_packet_loop_port_update`: Provides a "loopback" socket address corresponding to the main
socket. Can be used to learn the port number associated with that socket.
* `picoquic_packet_loop_time_check`: Called before the packet loop starts waiting for a new packet or a
timer. The calling argument of type `packet_loop_time_check_arg_t` provides the current time and
the expected value of the timer. The application may set a lower timer value.
* `picoquic_packet_loop_system_call_duration`: If the application has opted to monitor system call duration,
the packet loop will compute and update statistics on the duration of calls, and pass them in
an argument of type `packet_loop_system_call_duration_t`. This could be used during performance
tuning, to check whether system load slows down the packet loop.
* `picoquic_packet_loop_wake_up`: called when the packet loop has been awakened by a call to
`picoquic_wake_up_network_thread`, enabling the application to perform picoquic API calls.
(Only useful in asynchronous mode.)
* `picoquic_packet_loop_alt_port`: Provides the port number associated with the alternate socket.
This is used for simulations and tests of the migration and multipath capabilities,
creating alternate paths for the alternate port number.

If the processing of the callback is successful, the return code should be set to 0.
If the application wants to terminate the packet loop, it can set the
return value to `PICOQUIC_NO_ERROR_TERMINATE_PACKET_LOOP`. A couple of
other error codes, `PICOQUIC_NO_ERROR_SIMULATE_NAT` and
`PICOQUIC_NO_ERROR_SIMULATE_MIGRATION` are used to manage
simulations of migration and multipath -- but could be removed in
future versions. Other returned values will cause the packet loop to terminate,
returning the error value to the application.

## Single process constraints

When running in synchronous mode, the packet loop reacts only to timers and arrival of
packets. This is generally adequate for a small server that simply serves data files,
such as the basic HTTP server used in `picoquicdemo` or the simple P2P server presented
in the `sample` code. Such servers will receive a command from the networked peer,
prepare a response and schedule the required packets, all in a single process.

The synchronous mode can support limited clients that are launched once and execute
a programmed scenario. For example, `picoquicdemo` takes as a parameter
a scenario that specifies a series of requests to post or download pages. The `sample`
client takes as parameter a list of files to acquire from the peer. For supporting these
scenarios, the client code will initiate a connection in the selected `quic`
context before starting the packet loop. When the packet loop starts, the initial
packets for that connection will be sent on the socket, and the connection will
continue until the end of the programmed scenario.

The synchronous mode will not easily support interactive scenarios, in which
requests are sent after UI interactions. It will also not easily support multimedia
scenarios, such as for example a video conference. The application will want to
use multiple threads, typically one for media capture, one for rendering, another for
managing the UI, and of course an independent thread for managing the QUIC
connections. For that, the application needs to start the packet loop
in asynchronous mode.

## Asynchronous operation

Applications that operate in asynchronous mode will want to start the packet loop
using the `picoquic_start_network_thread` API:

```
picoquic_network_thread_ctx_t* picoquic_start_network_thread(
picoquic_quic_t* quic,
picoquic_packet_loop_param_t* param,
picoquic_packet_loop_cb_fn loop_callback,
void* loop_callback_ctx,
int * ret);
```

The parameters are the same as the call to `picoquic_packet_loop_v2`, with two differences:
the call returns a thread context of type `picoquic_network_thread_ctx_t` describing
the thread that was just created, and upon exit of the packet loop the variable
`ret` will contain the exit code of the loop -- the same value that would be
returned by a synchronous loop.

The API `picoquic_start_network_thread` is designed to be simple. It uses the default
thread handling corresponding to the OS, such as `pthread` on Unix variants and
`CreateThread` on Windows. Developers can substitute their own thread management functions
by calling:
```
picoquic_network_thread_ctx_t* picoquic_start_custom_network_thread(
picoquic_quic_t* quic,
picoquic_packet_loop_param_t* param,
picoquic_custom_thread_create_fn thread_create_fn,
picoquic_custom_thread_delete_fn thread_delete_fn,
picoquic_custom_thread_setname_fn thread_setname_fn,
char const* thread_name,
picoquic_packet_loop_cb_fn loop_callback,
void* loop_callback_ctx,
int * ret);
```
This call lets applications supply their own functions for creating and deleting threads, and
also for naming threads.

## Picoquic APIs are not thread safe

When operating in asynchronous mode, developers should constantly remember that
the Picoquic APIs are not designed to be thread safe. For example, if two
threads were to call `picoquic_create_cnx` in parallel, it is entirely
possible that the internal state of the `quic` context will become
incoherent. The recommended solution is to implement some kind of synchronization
between the application thread and the background thread running Picoquic.

A typical operation would be:

* the application prepares a new message, to be sent for example on
a QUIC stream in a specified connection,

* once the message is ready, the application calls `picoquic_wake_up_network_thread`

* in the next `picoquic_packet_loop_wake_up` callback, the application
calls the necessary Picoquic API.

This structure ensures that the Picoquic API is called from within the
networking thread, and that the `quic` context will remain coherent.
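
The message passing described above can be sketched with a small mutex-protected queue. This is only an illustration under stated assumptions: the type `example_msg_queue_t` and the functions `example_queue_init`, `example_queue_push` and `example_queue_drain` are hypothetical, the queue sizes are arbitrary, and the sketch does not call any picoquic function. In a real application, the push would be followed by a call to `picoquic_wake_up_network_thread`, and the drain would run inside the `picoquic_packet_loop_wake_up` callback.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_QUEUE_MAX 16 /* arbitrary capacity for this sketch */
#define EXAMPLE_MSG_MAX 64   /* arbitrary maximum message size */

typedef struct example_msg_queue_t {
    pthread_mutex_t lock;
    size_t count;
    char msgs[EXAMPLE_QUEUE_MAX][EXAMPLE_MSG_MAX];
} example_msg_queue_t;

static void example_queue_init(example_msg_queue_t* q)
{
    assert(q != NULL);
    pthread_mutex_init(&q->lock, NULL);
    q->count = 0;
}

/* Application thread: copy the message into the queue under the lock.
 * Returns 0 on success, -1 if the queue is full or the text too long.
 * On success the application would then wake up the network thread. */
static int example_queue_push(example_msg_queue_t* q, const char* text)
{
    int ret = -1;
    size_t len = strlen(text);

    pthread_mutex_lock(&q->lock);
    if (len < EXAMPLE_MSG_MAX && q->count < EXAMPLE_QUEUE_MAX) {
        memcpy(q->msgs[q->count], text, len + 1);
        q->count++;
        ret = 0;
    }
    pthread_mutex_unlock(&q->lock);
    return ret;
}

/* Network thread, from the wake up callback: move up to out_max pending
 * messages to a local array. The lock is held only for the copy, so the
 * critical section stays short. Returns the number of messages moved. */
static size_t example_queue_drain(example_msg_queue_t* q,
    char out[][EXAMPLE_MSG_MAX], size_t out_max)
{
    size_t n;

    pthread_mutex_lock(&q->lock);
    n = (q->count < out_max) ? q->count : out_max;
    memcpy(out, q->msgs, n * EXAMPLE_MSG_MAX);
    /* Keep any messages that did not fit, preserving their order. */
    memmove(q->msgs, q->msgs + n, (q->count - n) * EXAMPLE_MSG_MAX);
    q->count -= n;
    pthread_mutex_unlock(&q->lock);
    return n;
}
```

After draining, the network thread owns the local copies and can pass them to the Picoquic API without holding the lock, so the application thread is never blocked for longer than a memory copy.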

## Don't sit on a callback

Callback APIs like those used by Picoquic are simple to understand, but they have
one well known drawback: a badly designed application could start a
lengthy operation within a callback, during which time the entire
networking thread would become unresponsive. Don't do that! It is OK
to copy data from memory, call picoquic APIs, maybe read or write
a packet worth of data to a local storage, but doing much more
than that is asking for trouble.

One specific form of trouble is waiting too long for semaphores or other locks
from within a callback. Some kind of locking may be needed to synchronize
multiple threads, as in the message passing described in the previous
section, but it should be carefully designed so that critical sections
remain very short and contentions are resolved quickly.

Remember that network timers are generally
proportional to the network latency, which can be a few milliseconds on
a local network or a few tens of milliseconds in a typical Internet
connection. Waiting even a fraction of that in a callback can delay
the processing of packets, cause spurious packet losses, and generally
affect the performance of the connection.
Binary file added doc/socket_loop.png
14 changes: 8 additions & 6 deletions picoquic/picoquic.h
@@ -856,7 +856,7 @@ void picoquic_set_rejected_version(picoquic_cnx_t* cnx, uint32_t rejected_versio
* path will be demoted after a short delay.
*
* Like all user-level networking API, the "probe new path" API assumes that the
* port numbers in the address fields are expressed in "host" format.
* port numbers in the socket addresses structures are expressed in network order.
*
* Path event callbacks can be enabled by calling "picoquic_enable_path_callbacks".
* This can be set as the default for new connections by calling
@@ -921,7 +921,8 @@ int picoquic_set_path_status(picoquic_cnx_t* cnx, uint64_t unique_path_id, picoq
/* The get path addr API provides the IP addresses used by a specific path.
* The "local" argument determines whether the APi returns the local address
* (local == 1), the address of the peer (local == 2) or the address observed by the peer (local == 3).
* The port value in the returned addresses is always in "host" format.
* Like all user-level networking API, the "picoquic_get_path_addr" API assumes that the
* port numbers in the socket addresses structures are expressed in network order.
*/
int picoquic_get_path_addr(picoquic_cnx_t* cnx, uint64_t unique_path_id, int local, struct sockaddr_storage* addr);

@@ -1021,7 +1022,8 @@ void picoquic_cnx_set_pmtud_required(picoquic_cnx_t* cnx, int is_pmtud_required)
int picoquic_tls_is_psk_handshake(picoquic_cnx_t* cnx);

/* Manage addresses
* The port value in the set or returned addresses is always in "host" format.
* The port value in the set or returned socket addresses structures are
* always expressed in network order.
*/
void picoquic_get_peer_addr(picoquic_cnx_t* cnx, struct sockaddr** addr);
void picoquic_get_local_addr(picoquic_cnx_t* cnx, struct sockaddr** addr);
@@ -1069,7 +1071,7 @@ int picoquic_queue_datagram_frame(picoquic_cnx_t* cnx, size_t length, const uint
* Quic context. The API handles the decryption of the packets
* and their processing in the context of connections.
*
* The port value in the socket addresses is always in "host" format.
* The port numbers in the socket addresses structures are expressed in network order.
*/

int picoquic_incoming_packet(
@@ -1099,7 +1101,7 @@ int picoquic_incoming_packet_ex(
* The API "picoquic_prepare_packet" does the same but for just one
* connection at a time.
*
* The port value in the returned addresses is always in "host" format.
* The port numbers in the socket addresses structures are expressed in network order.
*/

int picoquic_prepare_next_packet_ex(picoquic_quic_t* quic,
@@ -1137,7 +1139,7 @@ int picoquic_prepare_packet(picoquic_cnx_t* cnx,
* suggested by the stack. The socket_err parameter may be used by the stack for logging
* purposes.
*
* The port value in the returned addresses is always in "host" format.
* The port numbers in the socket addresses structures are expressed in network order.
*/
void picoquic_notify_destination_unreachable(picoquic_cnx_t* cnx,
uint64_t current_time, struct sockaddr* addr_peer, struct sockaddr* addr_local, int if_index, int socket_err);
3 changes: 2 additions & 1 deletion picoquic/picoquic_packet_loop.h
@@ -38,7 +38,8 @@ extern "C" {
typedef struct st_picoquic_socket_ctx_t {
SOCKET_TYPE fd;
int af;
uint16_t port;
uint16_t port; /* Port number to which the socket is bound */
uint16_t n_port; /* value of the port number in network order htons(port) */

/* Flags */
unsigned int is_started : 1;
8 changes: 4 additions & 4 deletions picoquic/picosocks.c
@@ -1269,11 +1269,11 @@ int picoquic_get_server_address(const char* ip_address_text, int server_port,
if (inet_pton(AF_INET, ip_address_text, &ipv4_dest->sin_addr) == 1) {
/* Valid IPv4 address */
ipv4_dest->sin_family = AF_INET;
ipv4_dest->sin_port = (unsigned short)server_port;
ipv4_dest->sin_port = htons((unsigned short)server_port);
} else if (inet_pton(AF_INET6, ip_address_text, &ipv6_dest->sin6_addr) == 1) {
/* Valid IPv6 address */
ipv6_dest->sin6_family = AF_INET6;
ipv6_dest->sin6_port = (unsigned short)server_port;
ipv6_dest->sin6_port = htons((unsigned short)server_port);
} else {
/* Server is described by name. Do a lookup for the IP address,
* and then use the name as SNI parameter */
@@ -1299,7 +1299,7 @@ int picoquic_get_server_address(const char* ip_address_text, int server_port,
switch (result->ai_family) {
case AF_INET:
ipv4_dest->sin_family = AF_INET;
ipv4_dest->sin_port = (unsigned short)server_port;
ipv4_dest->sin_port = htons((unsigned short)server_port);
#ifdef _WINDOWS
ipv4_dest->sin_addr.S_un.S_addr = ((struct sockaddr_in*)result->ai_addr)->sin_addr.S_un.S_addr;
#else
@@ -1308,7 +1308,7 @@ int picoquic_get_server_address(const char* ip_address_text, int server_port,
break;
case AF_INET6:
ipv6_dest->sin6_family = AF_INET6;
ipv6_dest->sin6_port = (unsigned short)server_port;
ipv6_dest->sin6_port = htons((unsigned short)server_port);
memcpy(&ipv6_dest->sin6_addr,
&((struct sockaddr_in6*)result->ai_addr)->sin6_addr,
sizeof(ipv6_dest->sin6_addr));
6 changes: 4 additions & 2 deletions picoquic/port_blocking.c
@@ -145,14 +145,16 @@ int picoquic_check_port_blocked(uint16_t port)

int picoquic_check_addr_blocked(const struct sockaddr* addr_from)
{
/* The sockaddr is always in network order. We must translate to
* host order before performing the check */
uint16_t port = UINT16_MAX;

if (addr_from->sa_family == AF_INET) {
port = ((struct sockaddr_in*)addr_from)->sin_port;
port = ntohs(((struct sockaddr_in*)addr_from)->sin_port);
}
else if (addr_from->sa_family == AF_INET6) {
/* configure an IPv6 sockaddr */
port = ((struct sockaddr_in6*)addr_from)->sin6_port;
port = ntohs(((struct sockaddr_in6*)addr_from)->sin6_port);
}
return picoquic_check_port_blocked(port);
}