To make postmortem analysis of unexpected failures at customer sites easier, let's introduce request IDs that are attached to every log message whenever one is available.
This will make it easier to search across the various daemon logs when we visit customer sites. We could also ask end users to report the request IDs they see.
The webserver, the storage-proxy's client-facing API, and the app-proxy generate a new request ID for each API call from the frontend.
* If the request fails (5xx), show an error message like "The server encountered a problem. Please report this to the administrator or support staff. (request ID: XXXXXXX)".
They pass this request context (including the URL, source IP, etc.) when calling the manager API.
The manager passes this request context when calling the agent RPC function / storage-proxy's manager-facing APIs.
All webserver/manager/agent/storage-proxy daemons should include the request context information in every log message whenever available.
* Inside each daemon, we could use `contextvars` to keep track of the request context. `common.logging` should be updated to be aware of this.
For internally invoked tasks such as timers, we could generate a different type of request ID for the same purpose.
* Note that the scheduler should keep the request context used in `enqueue_session()` and `destroy_session()`, so that any actions triggered by a specific session creation or destruction request are logged with the same request ID. We could probably attach both the user-request ID and the system-request ID (e.g., a specific invocation of the dispatcher) to the same log message.
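The `contextvars` idea above could be sketched roughly as follows. This is only an illustration using the Python standard library, not the actual `common.logging` implementation: `RequestContext`, `user_request_ctx`, `system_request_id`, `RequestContextFilter`, `new_request_id()`, and `build_logger()` are hypothetical names, and the log format and ID format are assumptions.

```python
import contextvars
import logging
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RequestContext:
    """Hypothetical container for what is passed between daemons."""
    request_id: str
    url: Optional[str] = None
    source_ip: Optional[str] = None


# One variable for the user-originated request, one for internally
# invoked tasks (timers, a dispatcher invocation, etc.).
user_request_ctx = contextvars.ContextVar("user_request_ctx", default=None)
system_request_id = contextvars.ContextVar("system_request_id", default=None)


def new_request_id(prefix: str = "req") -> str:
    """Generate a short random ID; the format is an assumption."""
    return f"{prefix}-{uuid.uuid4().hex[:12]}"


class RequestContextFilter(logging.Filter):
    """Copy the current context values onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        ctx = user_request_ctx.get()
        record.request_id = ctx.request_id if ctx is not None else "-"
        record.system_request_id = system_request_id.get() or "-"
        return True  # never drop the record, only annotate it


def build_logger(name: str, stream=None) -> logging.Logger:
    """Create a logger whose output includes both request IDs."""
    handler = logging.StreamHandler(stream)
    handler.addFilter(RequestContextFilter())
    handler.setFormatter(logging.Formatter(
        "%(levelname)s [req:%(request_id)s sys:%(system_request_id)s] %(message)s"
    ))
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

In this sketch, an API handler would call `user_request_ctx.set(RequestContext(request_id=new_request_id(), ...))` at the start of a request and `reset()` the returned token at the end, while a timer would set `system_request_id` instead. Because asyncio tasks copy the current context when they are created, tasks spawned while handling a request would carry the same IDs in their log messages.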
achimnol changed the title from "Introduce request ID and request context passed through webserver, manager, and agent" to "Introduce request ID and request context passed through webserver, manager, storage-proxy, and agent" on Oct 31, 2023