You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This is similar to #4, but it's more related to child process output content rather than output/buffer sizes.
Child processes that output a lot of data, or run long enough, but don't emit line breaks will eventually cause buffer overflows and therefore broken pipe errors.
You can notice that in each server loop iteration, it doesn't emit a line break to the output, it uses a ANSI control sequence (\033[2K) to clear the line and return the cursor to the start of the line (\r).
go-cmd will keep accumulating output, because it doesn't see a \n and will never send a message to the output channel. This causes two issues.
The overflow issue
Eventually go-cmd's internal buffer overflows:
> go run streaming.go2024/05/31 14:17:18 [client] launching server: ["/tmp/go-build653253379/b001/exe/streaming" "server"]2024/05/31 14:17:18 [client] wait server[server] start (running for 10s)2024/05/31 14:17:24 signal: broken pipeexit status 1
I believe there are no buffer size adjustments that could solve this.
The issue with buffering based on assumptions about the data format
go-cmd assumes the output will be line oriented and sends channel messages for each line.
I imagine this is done for convenience, but it breaks streaming flows that could be important, like showing progress status (e.g.: rclone, aria2c, wget, curl, etc).
Even for simpler cases (ANSI control sequences are not used) or smaller cases (all of the output would fit in the buffer), for example when a process prints progress on the same line (like a series of dots or percentages), go-cmd's implementation breaks those flows (e.g.: some mkfs implementations do this if I remember correctly).
Another broken use case would be processes that emit very long lines. It could be the case for minifiers/uglyfiers/obfuscators, or language servers (LSPs) that are known to emit long JSON lines to stdio, for example. go-cmd would have to keep increasing buffer size to accomodate for this.
Proposed solution
I believe the solution to both issues is to just do byte oriented streaming with buffering: don't make assumptions about data format and emit chunks downstream whenever the buffer is filled. The client is responsible to reassemble the data if needed.
The text was updated successfully, but these errors were encountered:
Right, go-cmd is inherently line-oriented/buffered. It's original (and still primary) use case is standard Linux command output.
A byte-oriented streaming buffer would be a nice addition. Off top of my head, the design probably needs both a buffer size and a flush timeout/interval. Let's say buff size is 1k but only 900B are received and upstream pauses for a long time. Should go-cmd wait for the remaining 100B to flush? Or flush what it's go so far (900B) after some time?
You or someone else want to write a PR? Or, if others want this new capability, 👍 the original post.
This is similar to #4, but it's more related to child process output content rather than output/buffer sizes.
Child processes that output a lot of data, or run long enough, but don't emit line breaks will eventually cause buffer overflows and therefore broken pipe errors.
This is easily reproducible:
You can notice that in each server loop iteration, it doesn't emit a line break to the output, it uses a ANSI control sequence (
\033[2K
) to clear the line and return the cursor to the start of the line (\r
).go-cmd
will keep accumulating output, because it doesn't see a\n
and will never send a message to the output channel. This causes two issues.The overflow issue
Eventually
go-cmd
's internal buffer overflows:I believe there are no buffer size adjustments that could solve this.
The issue with buffering based on assumptions about the data format
go-cmd
assumes the output will be line oriented and sends channel messages for each line.I imagine this is done for convenience, but it breaks streaming flows that could be important, like showing progress status (e.g.:
rclone
,aria2c
,wget
,curl
, etc).Even for simpler cases (ANSI control sequences are not used) or smaller cases (all of the output would fit in the buffer), for example when a process prints progress on the same line (like a series of dots or percentages),
go-cmd
's implementation breaks those flows (e.g.: somemkfs
implementations do this if I remember correctly).Another broken use case would be processes that emit very long lines. It could be the case for minifiers/uglyfiers/obfuscators, or language servers (LSPs) that are known to emit long JSON lines to stdio, for example.
go-cmd
would have to keep increasing buffer size to accomodate for this.Proposed solution
I believe the solution to both issues is to just do byte oriented streaming with buffering: don't make assumptions about data format and emit chunks downstream whenever the buffer is filled. The client is responsible to reassemble the data if needed.
The text was updated successfully, but these errors were encountered: