Decode line-delimited JSON #34

Closed
tsutsu opened this issue Mar 15, 2018 · 1 comment

tsutsu commented Mar 15, 2018

JSONL documents, à la http://jsonlines.org, are streams of valid JSON documents separated by newlines ("\n").

I know it's pretty easy to parse these in Elixir itself (e.g. StringIO.open(jsonl) |> elem(1) |> IO.binstream(:line) |> Stream.map(&Jason.decode!/1)), but a convenience function would be nice.

The above approach may also do an appreciable amount of redundant work, since the input is scanned once to split it into lines and again to parse each line. Having Jason handle top-level newline tokens while parsing, so that it splits documents itself, may be quite a lot faster. Something to benchmark!
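For illustration, the convenience function asked for above could be sketched in user code roughly as follows. The module name JSONLSketch and its functions decode_lines/1 and decode_string/1 are hypothetical and not part of Jason; the script assumes Jason can be fetched with Mix.install. It still splits first and parses second, so it does not address the performance question.

```elixir
# Assumption: run as a standalone script; Mix.install fetches Jason.
Mix.install([{:jason, "~> 1.4"}])

defmodule JSONLSketch do
  @moduledoc "Hypothetical helper for line-delimited JSON; not part of Jason."

  # Accepts any enumerable of lines (e.g. File.stream!/1 or IO.binstream/2)
  # and lazily decodes each non-blank line.
  def decode_lines(lines) do
    lines
    |> Stream.map(&String.trim/1)
    |> Stream.reject(&(&1 == ""))
    |> Stream.map(&Jason.decode!/1)
  end

  # Convenience wrapper for a JSONL document held in memory as a binary.
  def decode_string(jsonl) when is_binary(jsonl) do
    jsonl
    |> String.splitter("\n")
    |> decode_lines()
  end
end

jsonl = ~s({"id": 1}\n{"id": 2}\n)
docs = jsonl |> JSONLSketch.decode_string() |> Enum.to_list()
IO.inspect(docs)
```

Because decode_lines/1 takes any enumerable, the same function would work on File.stream!/1 output without loading the whole file into memory.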

@michalmuskala
Owner

I think this could also be solved with something like:

jsonl
# trim: true drops the empty string produced by a trailing newline
|> String.splitter("\n", trim: true)
|> Stream.map(&Jason.decode!/1)
...

In general, I would say this is a subset of issue #25. Here, though, it's easy to do manually, because the separator is a newline and raw newlines are not allowed inside a single JSON Lines document. I'll close this in favour of #25.
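A small sketch of why splitting on "\n" is safe here, assuming the documents were produced by a compact encoder such as Jason.encode!/1: a newline inside a string value is escaped in the encoded output, so each document still occupies exactly one physical line. The sample data is made up for illustration, and trim: true is used so a trailing newline does not yield an empty string that Jason.decode!/1 would reject.

```elixir
# Assumption: run as a standalone script; Mix.install fetches Jason.
Mix.install([{:jason, "~> 1.4"}])

# The newline inside the value is written as a backslash followed by "n",
# so the encoded document contains no raw newline character.
encoded = Jason.encode!(%{"text" => "line one\nline two"})
has_raw_newline = String.contains?(encoded, "\n")

# Build a two-document JSONL payload with a trailing newline.
jsonl = Enum.join([encoded, Jason.encode!(%{"text" => "plain"})], "\n") <> "\n"

decoded =
  jsonl
  |> String.splitter("\n", trim: true)
  |> Enum.map(&Jason.decode!/1)

IO.inspect(has_raw_newline)
IO.inspect(decoded)
```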
