Decode line-delimited JSON #34

tsutsu · 2018-03-15T20:52:51Z

JSONL documents, à la http://jsonlines.org, are streams of valid JSON documents separated by newlines ("\n").

I know it's pretty easy to parse these in Elixir itself (e.g. StringIO.open(jsonl) |> elem(1) |> IO.binstream(:line) |> Stream.map(&Jason.decode!/1)), but a convenience function would be nice.

As well, there may be an appreciable amount of redundant work done by the above approach, compared to having Jason manage document-splitting while parsing. Embedding the logic for handling top-level newline tokens in the document may be quite a lot faster. Something to benchmark!
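For illustration, here is a minimal sketch of what such a convenience function could look like if written outside the library. The module name `JSONLines` and the function `decode_stream/2` are hypothetical and not part of Jason; the default decoder assumes the `:jason` dependency is available. It still decodes line by line, so it does not address the redundant-work concern above.

```elixir
defmodule JSONLines do
  @moduledoc "Hypothetical helper for illustration; not part of Jason."

  # `lines` is any enumerable of binaries, one JSON document per element,
  # such as the stream returned by IO.binstream(device, :line).
  # `decoder` defaults to Jason.decode!/1 (assumes :jason is installed);
  # any one-argument function can be swapped in.
  def decode_stream(lines, decoder \\ &Jason.decode!/1) do
    lines
    # Drop the line terminator ("\n" or "\r\n") and surrounding whitespace.
    |> Stream.map(&String.trim/1)
    # Skip blank lines, e.g. the one produced by a trailing newline.
    |> Stream.reject(&(&1 == ""))
    # Decode lazily, one document at a time.
    |> Stream.map(decoder)
  end
end
```

With Jason available, something like `StringIO.open(jsonl) |> elem(1) |> IO.binstream(:line) |> JSONLines.decode_stream() |> Enum.to_list()` would decode the same input as the one-liner above.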


michalmuskala · 2018-03-19T13:26:04Z

I think this could also be solved with something like:

jsonl
|> String.splitter("\n", trim: true)
|> Stream.map(&Jason.decode!/1)
...

In general, I would say this is a subset of issue #25. Here it's easy to do manually, though, because the separator is a newline and JSONL keeps each document on a single line (literal newlines cannot appear inside JSON strings, only the escaped form \n). I'll close this in favour of #25.
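As a small sketch of why splitting on "\n" is safe, and of one caveat: an escaped newline inside a JSON string does not split the document, but a trailing newline leaves a final empty chunk unless `trim: true` is passed to `String.splitter/3`. The sample data is made up for the example.

```elixir
# Two JSONL documents; the first contains an escaped newline (backslash + n)
# inside a string. ~S keeps the backslash literal instead of interpreting it.
jsonl =
  ~S({"msg": "line one\nline two"}) <> "\n" <>
    ~S({"msg": "ok"}) <> "\n"

# trim: true drops the empty chunk that follows the trailing newline.
lines = jsonl |> String.splitter("\n", trim: true) |> Enum.to_list()

# Without trim, the last element is an empty string, which is not valid JSON.
untrimmed = jsonl |> String.splitter("\n") |> Enum.to_list()

IO.inspect(length(lines), label: "documents")
IO.inspect(List.last(untrimmed), label: "last chunk without trim")
```

Each element of `lines` can then be passed to `Jason.decode!/1` as in the snippet above.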

michalmuskala closed this as completed Mar 19, 2018

michalmuskala mentioned this issue May 18, 2018

Streaming support #25

Open
