This repository has been archived by the owner on Sep 26, 2022. It is now read-only.

Pare down logs output from publication and tests #64

Open
jgraettinger opened this issue Aug 30, 2022 · 2 comments
@jgraettinger (Member)

No description provided.

@jshearer (Contributor) commented Sep 6, 2022

So @jgraettinger, @psFried and I talked about this today, and here is what we ended up coming up with. As a preface, we could absolutely decide not to do this, just squelch most of the NPM output, and move on -- this solution is more an answer to "how should this flow look/work?", and what would be worth doing in order to allow that to happen.

NOTE: This is written in an authoritative voice, like a decision being logged. My intent is to put it out there before starting work in order to get feedback, so don't treat it as set in stone.


Leading the UX of discover, publish, materialize, etc. with a box of streaming logs sends the message that we're not confident things will work, so we preemptively give you the raw system logs "in case something goes wrong". Logs should remain an option for developers/power users, but the default UX should be better.

One benefit of streaming logs is that they give the user a feeling of movement. Given that these operations frequently take tens of seconds to minutes to run, it's important to convey to users that something is happening. Rather than logs, we want to present a checklist of sub-tasks that must complete before the operation is finished, presenting any errors/failures alongside the sub-task during which they occurred. Roughly, these sub-tasks are:

  • Validating YAML/specs
  • Calling the connector validate RPC
  • Generating TypeScript
  • Compiling TypeScript
  • Running the TypeScript linter
  • Spinning up a temporary data plane and running tasks/tests
  • Applying captures/materializations to their connectors
  • Activating into the data plane (updating journals, shards, etc.)
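The checklist above could be modeled as a small set of sub-task identifiers with a per-step status, so the frontend renders a step list instead of raw logs. A minimal dependency-free sketch; all names here are illustrative, not existing flow APIs:

```rust
// Hypothetical model of the publication checklist; names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SubTask {
    ValidateSpecs,
    ConnectorValidate,
    GenerateTypescript,
    CompileTypescript,
    LintTypescript,
    RunTests,
    ApplyConnectors,
    Activate,
}

#[derive(Debug, Clone, PartialEq)]
enum Status {
    Pending,
    Running,
    Done,
    Failed(String), // error surfaced alongside the sub-task it occurred during
}

// The frontend renders one row per sub-task and updates it as
// structured status events arrive over the log stream.
fn initial_checklist() -> Vec<(SubTask, Status)> {
    use SubTask::*;
    [ValidateSpecs, ConnectorValidate, GenerateTypescript, CompileTypescript,
     LintTypescript, RunTests, ApplyConnectors, Activate]
        .into_iter()
        .map(|t| (t, Status::Pending))
        .collect()
}

fn main() {
    let mut checklist = initial_checklist();
    checklist[0].1 = Status::Done;
    // A failure is attached to the step it occurred during:
    checklist[3].1 = Status::Failed("tsc exited with status 2".into());
    for (task, status) in &checklist {
        println!("{:?}: {:?}", task, status);
    }
}
```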

This is all well and good, but how do we communicate this structured information? We already have a streaming communication mechanism in the form of logs, so we should use that. Ideally we'd convert logs to structured JSON and just use that structure to communicate these semantic state updates, but for various reasons that's not immediately tenable (something something Rust calling Go calling Rust, maybe @jgraettinger can give a better explanation).

The solution we came to is to include the control-plane-agent's logs (captured via Rust's tracing crate, which supports structured logging) as JSON log lines in the existing log stream for each task. Then all we need to do is add specific log statements with associated metadata when each of the above sub-tasks starts/ends/errors, and the frontend can consume them. This is additionally beneficial because it lets future work consolidate the whole spec application process into Rust and take advantage of this existing logging infrastructure.
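As a hypothetical illustration, a mixed stream might interleave plain NPM output with JSON status lines shaped something like this (field names are illustrative, not a committed schema):

```json
{"ts":"2022-09-06T17:03:11Z","level":"info","step":"compile_typescript","event":"started","message":"compiling typescript modules"}
{"ts":"2022-09-06T17:03:19Z","level":"error","step":"compile_typescript","event":"failed","error":"tsc exited with status 2"}
```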

One issue to consider is how we make sure that only the logs for a specific task get sent to that task's log_lines. The solution is the tracing crate's Span feature (the link describes usage in async contexts via #[instrument]): any log lines emitted inside a task-specific Span get sent in JSON form to that task's log_lines.
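In tracing terms, the Span carries the task identity and a subscriber routes events to that task's buffer. A dependency-free sketch of just the routing idea (hypothetical names; the real mechanism would be tracing's #[instrument] plus a subscriber, not this hand-rolled router):

```rust
use std::collections::HashMap;

// Hypothetical router: lines logged within a task's "span" are appended
// to that task's own log_lines buffer, never another task's.
struct LogRouter {
    log_lines: HashMap<String, Vec<String>>,
}

impl LogRouter {
    fn new() -> Self {
        LogRouter { log_lines: HashMap::new() }
    }

    // Stand-in for entering a task-scoped Span: every line emitted by
    // `f` is tagged with and routed to this task's log_lines.
    fn in_task_span<F: FnOnce(&mut dyn FnMut(&str))>(&mut self, task: &str, f: F) {
        let lines = self.log_lines.entry(task.to_string()).or_default();
        let mut emit = |line: &str| {
            lines.push(format!("{{\"task\":\"{}\",\"msg\":\"{}\"}}", task, line));
        };
        f(&mut emit);
    }
}

fn main() {
    let mut router = LogRouter::new();
    router.in_task_span("acmeCo/capture", |log| {
        log("validate started");
        log("validate finished");
    });
    router.in_task_span("acmeCo/materialize", |log| log("apply started"));
    // Each task sees only its own lines.
    assert_eq!(router.log_lines["acmeCo/capture"].len(), 2);
    assert_eq!(router.log_lines["acmeCo/materialize"].len(), 1);
}
```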

The other issue to consider is how the frontend will consume this semi-structured logging. One possibility was to capture all logs as JSON (with the unstructured ones shoved into something like {"body": "...", "unstructured": true}). The other is to continue writing unstructured logs as plain text lines, and only write JSON lines for logs coming from a structured (i.e. tracing) source. Filtering for JSON log lines is fairly trivial (does the line start with { and end with }?), and from there we can use the well-known structure of tracing to filter by log level, component, additional metadata, etc.

jshearer added a commit to estuary/flow that referenced this issue Sep 12, 2022
This is a pragmatic first step towards estuary/animated-carnival#64. Ultimately we'll have structured logging and can filter logs after the fact, but in the meantime this does a lot to clean up the log output.
jshearer added a commit that referenced this issue Sep 14, 2022
This is a short-term solution for #64, where the longer term solution would be emitting structured logs and filtering elsewhere
jshearer added a commit to estuary/flow that referenced this issue Sep 21, 2022
* refactor: Only output NPM logs if there was an error

This is a pragmatic first step towards estuary/animated-carnival#64. Ultimately we'll have structured logging and can filter logs after the fact, but in the meantime this does a lot to clean up the log output.

* refactor: Quiet docker pull output

* refactor: review comments
@jgraettinger (Member, Author)

Can we close this as "we've done what we're going to do right now"?
