[streaming] Line breaks and buffer overflows (broken pipes) #104

13k · 2024-05-31T17:51:49Z

This is similar to #4, but it's more related to child process output content rather than output/buffer sizes.

Child processes that output a lot of data, or run long enough, but don't emit line breaks will eventually cause buffer overflows and therefore broken pipe errors.

This is easily reproducible:

package main

import (
	"fmt"
	"log"
	"math/rand/v2"
	"os"
	"time"

	"github.com/go-cmd/cmd"
)

func main() {
	if len(os.Args) > 1 {
		server()
	} else {
		client()
	}
}

func server() {
	ttl := time.Second * 10
	timeout := time.After(ttl)

	fmt.Printf("[server] start (running for %s)\n", ttl)

	for {
		n := time.Duration(rand.Int64N(10))
		step := time.After(time.Millisecond * n)

		select {
		case <-step:
			fmt.Printf("\033[2K\r[server] %d", n)
		case <-timeout:
			fmt.Println("\033[2K\r[server] exit")
			return
		}
	}

}

func client() {
	selfExe, err := os.Executable()

	if err != nil {
		panic(err)
	}

	svCmd := []string{selfExe, "server"}

	log.Printf("[client] launching server: %q", svCmd)

	cmd := cmd.NewCmdOptions(cmd.Options{Streaming: true}, selfExe, "server")

	go func() {
		for output := range cmd.Stdout {
			fmt.Fprintln(os.Stdout, output)
		}
	}()

	go func() {
		for output := range cmd.Stderr {
			fmt.Fprintln(os.Stderr, output)
		}
	}()

	log.Println("[client] wait server")

	status := <-cmd.Start()

	if status.Error != nil {
		log.Fatal(status.Error)
	}

	log.Printf("[client] server status: %v", status)
	log.Println("[client] exit")
}

You can notice that in each server loop iteration, it doesn't emit a line break to the output, it uses a ANSI control sequence (\033[2K) to clear the line and return the cursor to the start of the line (\r).

go-cmd will keep accumulating output, because it doesn't see a \n and will never send a message to the output channel. This causes two issues.

The overflow issue

Eventually go-cmd's internal buffer overflows:

> go run streaming.go
2024/05/31 14:17:18 [client] launching server: ["/tmp/go-build653253379/b001/exe/streaming" "server"]
2024/05/31 14:17:18 [client] wait server
[server] start (running for 10s)
2024/05/31 14:17:24 signal: broken pipe
exit status 1

I believe there are no buffer size adjustments that could solve this.

The issue with buffering based on assumptions about the data format

go-cmd assumes the output will be line oriented and sends channel messages for each line.

I imagine this is done for convenience, but it breaks streaming flows that could be important, like showing progress status (e.g.: rclone, aria2c, wget, curl, etc).

Even for simpler cases (ANSI control sequences are not used) or smaller cases (all of the output would fit in the buffer), for example when a process prints progress on the same line (like a series of dots or percentages), go-cmd's implementation breaks those flows (e.g.: some mkfs implementations do this if I remember correctly).

Another broken use case would be processes that emit very long lines. It could be the case for minifiers/uglyfiers/obfuscators, or language servers (LSPs) that are known to emit long JSON lines to stdio, for example. go-cmd would have to keep increasing buffer size to accomodate for this.

Proposed solution

I believe the solution to both issues is to just do byte oriented streaming with buffering: don't make assumptions about data format and emit chunks downstream whenever the buffer is filled. The client is responsible to reassemble the data if needed.

The text was updated successfully, but these errors were encountered:

daniel-nichter · 2024-06-23T17:14:59Z

Right, go-cmd is inherently line-oriented/buffered. It's original (and still primary) use case is standard Linux command output.

A byte-oriented streaming buffer would be a nice addition. Off top of my head, the design probably needs both a buffer size and a flush timeout/interval. Let's say buff size is 1k but only 900B are received and upstream pauses for a long time. Should go-cmd wait for the remaining 100B to flush? Or flush what it's go so far (900B) after some time?

You or someone else want to write a PR? Or, if others want this new capability, 👍 the original post.

daniel-nichter added the enhancement label Jun 23, 2024

xmapst linked a pull request Jun 25, 2024 that will close this issue

fix: Line breaks and buffer overflows (broken pipes) #106

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[streaming] Line breaks and buffer overflows (broken pipes) #104

[streaming] Line breaks and buffer overflows (broken pipes) #104

13k commented May 31, 2024 •

edited

Loading

daniel-nichter commented Jun 23, 2024 •

edited

Loading

[streaming] Line breaks and buffer overflows (broken pipes) #104

[streaming] Line breaks and buffer overflows (broken pipes) #104

Comments

13k commented May 31, 2024 • edited Loading

The overflow issue

The issue with buffering based on assumptions about the data format

Proposed solution

daniel-nichter commented Jun 23, 2024 • edited Loading

13k commented May 31, 2024 •

edited

Loading

daniel-nichter commented Jun 23, 2024 •

edited

Loading