Set of features for Thoth.Json v5 #56
Should we add the possibility to support different representations for the types? (see #57) It should already be possible to do this by overriding the standard decoder with a custom one (a sketch follows this list).
Add a way to not crash the encoding/decoding process if no coder is found for a type. This is needed for Elmish.Debugger, because people can store anything in their model and it's not really friendly to ask them to provide encoders/decoders just to use a dev-tool. (see #59) Make a strong note in the documentation that this feature is really risky, because it weakens Thoth.Json's security when the flag is active.
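As a hedged sketch of the first item, here is what overriding a standard decoder could look like with today's manual API. The Unix-timestamp representation is a hypothetical example, not something prescribed by the issue:

```fsharp
open System
open Thoth.Json

// Hypothetical override: decode DateTime from a Unix timestamp in
// milliseconds instead of the default ISO-8601 string handling.
let unixDateTimeDecoder : Decoder<DateTime> =
    Decode.float
    |> Decode.map (fun ms ->
        DateTimeOffset.FromUnixTimeMilliseconds(int64 ms).UtcDateTime)

// Usage: Decode.fromString unixDateTimeDecoder "1588291200000"
```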
+1 for not unifying the parser into one. I had issues where JSON parsers implemented on the JS side were not as performant as the native `JSON.parse`.
@njlr It will never be as performant as the native `JSON.parse`.
Today I read this article and it made me think once again about how we deal with JSON. I never was a big fan of the auto coders. The implementation is hard to maintain, and people want complete control over what is generated, customizing just the things they don't like by default or that need to be handled differently depending on the case.
Unfortunately, people seem to prefer the auto decoders over the manual decoders because they hide the complexity and/or the boilerplate code. At least, that was the reason that led to their creation. I think I will remove the auto module in a future release, probably the one where I refactor the whole package and try to mutualize the code. If I am to remove it, I do agree that writing the coders manually is not fun. For this reason, I will seriously explore the code generation that I left on the side for a long time. Code generation will allow people to not write "dummy" manual coders when the default generation does the job for them. And if they need to customize it, they can deactivate the code generation and copy/paste/modify the generated code to fit their needs.
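For illustration, this is the kind of "dummy" manual coder pair that such code generation could emit for a simple record, written against the existing manual API (the `User` type is made up for the example):

```fsharp
open Thoth.Json

type User =
    { Id : int
      Name : string }

module User =
    // What a generated decoder could look like: one field decoder per record field.
    let decoder : Decoder<User> =
        Decode.object (fun get ->
            { Id = get.Required.Field "id" Decode.int
              Name = get.Required.Field "name" Decode.string })

    // The matching generated encoder.
    let encoder (user : User) =
        Encode.object [
            "id", Encode.int user.Id
            "name", Encode.string user.Name
        ]
```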
I wanted to briefly weigh in on this, as I am currently using both approaches for different reasons.

For serializing to/from persistent stores, the manual encoders are the only way imo. They give you total control, and this has enabled us to use a versioning system so that historic JSON payloads with different shapes can be automatically transposed onto the new model structure, saving the need to mess around running data migrations and all that pain.

However, going from server to client, we have quite a complicated domain model, and using auto coders is an extremely easy way to pass symmetrical payloads to and from the client without doing any grunt work. This is especially handy when you have shared request/response typed payloads that are forever changing. I totally appreciate that if one of the two goes out of sync, you're going to get catastrophic explosions, but our release cadence mandates that the client always goes out with the server, so this is just not a problem for us (and in my experience this is not uncommon unless you are running microservices).

It would be a shame to entirely lose auto coders, even though they are probably abused far too often. Perhaps they could exist in a standalone package?
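As an aside, the versioning approach described above can be sketched with `Decode.oneOf`, trying the newest payload shape first and falling back to older ones. The `Customer` type and field names here are invented for the example:

```fsharp
open Thoth.Json

type Customer =
    { Name : string
      Email : string option }

// Decoder for the historic V1 payload shape.
let decoderV1 : Decoder<Customer> =
    Decode.object (fun get ->
        { Name = get.Required.Field "fullName" Decode.string
          Email = None })

// Decoder for the current V2 payload shape.
let decoderV2 : Decoder<Customer> =
    Decode.object (fun get ->
        { Name = get.Required.Field "name" Decode.string
          Email = get.Optional.Field "email" Decode.string })

// Try the newest shape first, then transparently fall back to V1,
// transposing old payloads onto the current model.
let decoder : Decoder<Customer> =
    Decode.oneOf [ decoderV2; decoderV1 ]
```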
Thank you for sharing @alexswan10k. And I do agree: if auto coders are to be kept during the big rewrite, they will have to go in a separate package.

Edit: I think I need to accept that auto coders are a thing and something that sells the library. But I need to re-think the implementation, the set of features, and also the documentation, in order to explain why/when to use them and what is supported, so people know exactly what they can do with them.
Today, I started prototyping the next version of Thoth.Json. I wanted to check if my idea of using an interface to abstract the implementation details of which JSON parser is being used could work. And it seems like yes 🎉 🎉 Here are the implementations for the different parsers. As we can see, the code required to add a new parser is quite small :) My theory is that we will have this architecture:
The next step for me is to mutualize the tests between the different implementations, because right now I duplicated them locally for testing purposes. Then, I will take the time to re-activate all the decoders, because I disabled some of them as they were a bit more complex to port and I preferred to focus on building a small prototype first. Mutualizing the tests should show us what it will look like to consume Thoth.Json from different parsers. In theory, we should no longer see the compiler directives :D I am also introducing another repository; I think it will become more generic to the whole Thoth ecosystem at some point, because Thoth.Fetch could probably be added too. The idea behind this "meta" repository is to solve the problem of sharing the code/tests/etc. between the different implementations.
Hum, I am facing a problem... I had to use a generic `'JsonValue` parameter on the decoders. This led to code like this:

```fsharp
let levelOfAwesomenessDecoder<'JsonValue> =
    Decode.string<'JsonValue>
    |> Decode.andThen (fun value ->
        match value with
        | "MaxiAwesome" -> Decode.succeed MaxiAwesome
        | "MegaAwesome" -> Decode.succeed MegaAwesome
        | _ ->
            sprintf "`%s` is an invalid value for LevelOfAwesomeness" value
            |> Decode.fail
    )
```
Otherwise, the decoder cannot be used across the different parser implementations. And this code doesn't work (at least I don't know how to make the generic propagate correctly). A solution could be to not use a module/function to declare the decoders:

```fsharp
let levelOfAwesomenessDecoder<'JsonValue> =
    Decode<_>.string
    |> Decode<_>.andThen (fun value ->
        match value with
        | "MaxiAwesome" -> Decode.succeed MaxiAwesome
        | "MegaAwesome" -> Decode.succeed MegaAwesome
        | _ ->
            sprintf "`%s` is an invalid value for LevelOfAwesomeness" value
            |> Decode.fail
    )
```

which is really ugly. I think I need to explore new ideas in order to make the implementation parser-agnostic and cross-platform.
New idea: Disclaimer: I am writing my thoughts down because it helps me shape my ideas and not forget them, so perhaps it is not always well explained, sorry about that. As soon as I have something working, I will make a proper post to ask for feedback etc., as I do in general. I think my current rewrite is trying to solve several things at the same time:
The goal of using generics was to have a contract:

```fsharp
type IDecoderHelpers<'JsonValue> =
    abstract GetField : FieldName : string -> value : 'JsonValue -> 'JsonValue
    abstract IsString : value : 'JsonValue -> bool
    abstract IsBoolean : value : 'JsonValue -> bool
    abstract IsNumber : value : 'JsonValue -> bool
    abstract IsArray : value : 'JsonValue -> bool
    abstract IsObject : value : 'JsonValue -> bool
    abstract IsNullValue : value : 'JsonValue -> bool
    abstract IsIntegralValue : value : 'JsonValue -> bool
    abstract IsUndefined : value : 'JsonValue -> bool
    abstract AnyToString : value : 'JsonValue -> string
    abstract ObjectKeys : value : 'JsonValue -> string seq
    abstract AsBool : value : 'JsonValue -> bool
    abstract AsInt : value : 'JsonValue -> int
    abstract AsFloat : value : 'JsonValue -> float
    abstract AsFloat32 : value : 'JsonValue -> float32
    abstract AsString : value : 'JsonValue -> string
    abstract AsArray : value : 'JsonValue -> 'JsonValue[]
```

where each implementation could implement the contract. The problem, as explained in my previous message, is that the generics are escaping the bounds of the "parser related library". If we think of it in terms of macros, a generic is just a feature of the compiler which allows us to re-use code. In my case, the goal was to have a single shared implementation.

What if we went back a bit and integrated the parser directly? The idea would be, in that case, to share the code not by using generic code but by referencing the implementation files directly, and just having the "contract" written in another file. Something like that:
```xml
<!-- Thoth.Json.Fable -->
<?xml version="1.0" encoding="utf-8"?>
<Project Sdk="Microsoft.NET.Sdk">
    <ItemGroup>
        <Compile Include="Contract.fs" /> <!-- The file would probably be called Helpers.fs -->
        <Compile Include="Thoth.Json.Core/Shared/Decode.fs" />
        <Compile Include="Thoth.Json.Core/Shared/Encode.fs" />
    </ItemGroup>
</Project>
```

```xml
<!-- Thoth.Json.Newtonsoft -->
<?xml version="1.0" encoding="utf-8"?>
<Project Sdk="Microsoft.NET.Sdk">
    <ItemGroup>
        <Compile Include="Contract.fs" /> <!-- The file would probably be called Helpers.fs -->
        <Compile Include="Thoth.Json.Core/Shared/Decode.fs" />
        <Compile Include="Thoth.Json.Core/Shared/Encode.fs" />
    </ItemGroup>
</Project>
```

The idea is that the files under Thoth.Json.Core/Shared would be shared between the implementations. We could always use a few compiler directives inside the shared implementation when there are runtime-specific requirements. This approach is similar to how it is done today. But today, we kind of copy/paste the code between the repos and adapt the code in Thoth.Json.Net.
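To make the contract idea concrete, here is a hypothetical sketch of what a Newtonsoft-backed implementation of the `IDecoderHelpers` contract above could look like, with `'JsonValue` pinned to `JToken`. This is illustrative only, not the actual Thoth.Json.Net code:

```fsharp
open Newtonsoft.Json.Linq

// Hypothetical: one possible implementation of the prototype contract,
// mapping each helper onto Newtonsoft's JToken API.
type NewtonsoftHelpers() =
    interface IDecoderHelpers<JToken> with
        member _.GetField fieldName value = value.[fieldName]
        member _.IsString value = value.Type = JTokenType.String
        member _.IsBoolean value = value.Type = JTokenType.Boolean
        member _.IsNumber value =
            value.Type = JTokenType.Integer || value.Type = JTokenType.Float
        member _.IsArray value = value.Type = JTokenType.Array
        member _.IsObject value = value.Type = JTokenType.Object
        member _.IsNullValue value = value.Type = JTokenType.Null
        member _.IsIntegralValue value = value.Type = JTokenType.Integer
        member _.IsUndefined value = value.Type = JTokenType.Undefined
        member _.AnyToString value = value.ToString(Newtonsoft.Json.Formatting.Indented)
        member _.ObjectKeys value =
            (value :?> JObject).Properties() |> Seq.map (fun p -> p.Name)
        member _.AsBool value = value.ToObject<bool>()
        member _.AsInt value = value.ToObject<int>()
        member _.AsFloat value = value.ToObject<float>()
        member _.AsFloat32 value = value.ToObject<float32>()
        member _.AsString value = value.ToObject<string>()
        member _.AsArray value = value :?> JArray |> Seq.toArray
```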
Work for the next version of Thoth.Json has started in #76. The goal of this work is to mutualize as much of the code as possible between the different implementations. Right now, I have ported the code for:
The goal of mutualizing the code is that it should be possible to add a new target like JSON.Net if someone prefers this library, or for performance reasons. These packages are already working and follow the same philosophy as before for sharing the code: users currently need to use compiler directives.

Current thought: one goal of the new rewrite was also to make it easier to share code. For this reason, I plan on converting the Thoth.Json package to be platform-independent (#38), so people can share code using NuGet and avoid compiler directives. However, the performance will probably not be as good as when using the native `JSON.parse`, Newtonsoft, or any specialized JSON parser. This one will provide easy-to-use code sharing. If people really need a highly performant parser, they will need to use a dedicated package with compiler directives. I think that for most users it should be ok to use Thoth.Json.
Hi, I'm using a web worker to transmit JS objects. Is there a way I could avoid these unneeded JS to JSON + JSON to JS transformations?
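A hedged aside: on the Fable side, the manual API exposes `Decode.fromValue`, which runs a decoder directly against an existing JS value instead of a JSON string, so no stringify/parse round-trip should be needed. The `Message` type and field names below are invented for the example:

```fsharp
open Thoth.Json

type Message =
    { Kind : string
      Payload : float }

let messageDecoder : Decoder<Message> =
    Decode.object (fun get ->
        { Kind = get.Required.Field "kind" Decode.string
          Payload = get.Required.Field "payload" Decode.float })

// `data` would be the raw JS object received from the worker's message event.
let handleWorkerMessage (data : obj) =
    match Decode.fromValue "$" messageDecoder data with
    | Ok message -> printfn "Received %s" message.Kind
    | Error error -> printfn "Decoding failed: %s" error
```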
Perhaps worth opening a fresh issue for this?
@MangelMaxime Thanks for your hard work on Thoth. I'm one of the founders at https://stacktracehq.com/ (small software consultancy) and https://cubcare.com.au/ (my main job; stack is F# + Elm), and we use Thoth quite heavily (always from dotnet, never Fable). The vast majority of our encoders/decoders are manual (human-authored + code-gen), and we use them to serialize over the wire to Elm, and to store our domain events into PostgreSQL.

To rehydrate the domain without snapshotting (i.e. replaying all of history), we might deserialize up to 10M rows, and we regularly transcribe all of these events into a new DB table/schema for a few reasons, one of which is so that we can migrate JSON schema versions over time.

I'd like to understand what performance improvements might be possible in our systems if we keep Thoth, but swap Newtonsoft for System.Text.Json, Utf8Json, SpanJson, etc. Ideally, I'd prefer to test this first (on the assumptions that it wouldn't require changes to the encoding schemas, and that it would be helpful to other folks using the library). If it's not enough of an improvement, we'd try alternative JSON approaches or something binary, but would be doing so having explored the cheaper option first (and we could share the performance implications for anyone else here working with large datasets).

We'd be open to either contributing (if you're happy to give us some pointers on how you'd like things laid out) or providing some funding to you or someone you nominate, in order to see if these can be reconciled so we can answer the performance questions. Happy to chat here or elsewhere.

Thanks again - it's a great library, especially for folks targeting Elm.
Hello @dansowter, sorry for the late answer, I was taking a few days' break. Thank you for using Thoth.Json and trusting it.
In general, I always expect Thoth.Json to be slower than using Newtonsoft or System.Text.Json, etc. directly. This is because Thoth.Json does more validation work under the hood compared to these frameworks; this is what gives us the good error messages that Thoth.Json provides. But I never confirmed it with a benchmark, and because we are using the manual API and not reflection, perhaps things are different in reality. Regarding performance, the new version of Thoth.Json (the unified implementation) should be faster than the previous version, because we moved some of the error-related computation to the error path only.
With the new version of Thoth.Json, supporting a new JSON library is much easier. In the past, you needed to fork the library, making it difficult to maintain and also introducing other problems for people using both Fable and .NET. Now all library targets have a unified API provided by Thoth.Json.Core, and they just need to implement 2 interfaces and 2-3 functions. You can see it done for Thoth.Json.Newtonsoft, for example. And the API between Thoth.Json.Net and Thoth.Json.Core + Thoth.Json.Newtonsoft should be the same, at least with regard to the manual API. The Auto API is not yet supported, but you don't use it, so you should be fine. Thanks to the new architecture, testing out different frameworks and benchmarking them should not be too difficult.
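For anyone wanting to try this, a minimal sketch of consuming the unified API through the Newtonsoft backend could look like this (assuming the Thoth.Json.Core and Thoth.Json.Newtonsoft packages):

```fsharp
open Thoth.Json.Core
open Thoth.Json.Newtonsoft

// Decoders are written once against the backend-agnostic API from Core...
let decoder : Decoder<int list> = Decode.list Decode.int

// ...and Decode.fromString here comes from Thoth.Json.Newtonsoft, which
// plugs Newtonsoft's parser into those backend-agnostic decoders.
let result = Decode.fromString decoder "[1, 2, 3]"
// => Ok [1; 2; 3]
```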
You can see an example of how to implement a new library in PR #210. It implements System.Text.Json; I will release it later this weekend.
The Thoth.Json.System.Text.Json package has been released. I didn't do any benchmark on its performance yet.
I did some basic benchmarks and, as expected, Thoth.Json is slower than the standard libraries. But thanks to the benchmark, I was able to make some optimisations. You can see the whole benchmark results here.
Hi @MangelMaxime, thank you so much for all your recent efforts, both with the code changes and the thoughtful replies you've shared. The benchmarks you provided, along with your explanations, have been incredibly helpful in steering our next steps.

After reviewing everything, we've decided to explore code generation targeting SpanJson directly, rather than continuing to use Thoth in our specific context. Your work has made it clear that while Thoth offers excellent validation and error messages, direct use of SpanJson aligns better with our performance requirements for large-scale deserialization tasks.

That said, I want to express how much we've appreciated working with Thoth and the effort you've put into making it such a great library. It's been a critical part of our workflow for a long time, and I'll continue recommending it to others where the use case fits. Thank you again for everything! If there's anything that would be helpful for us to share (some of the code-gen or performance gains, maybe?), please don't hesitate to ask.
Indeed, your use case with over 10M rows is special and definitely falls under the exception.
I think Thoth.Json could be made faster if, instead of being an extensible library as done now with Thoth.Json.Core and Thoth.Json.JavaScript, Thoth.Json.Newtonsoft, etc., everything was inlined into a single package; but this would make the code harder to work with, with a lot of compiler directives, and there could be issues with the dependency list. If Thoth.Json was to use its own parser, it would solve the issue of the dependency list and would allow the reader/parser to be optimised for Thoth.Json's approach to handling JSON. For example, it would not need to parse the JSON structure twice, which is kind of what is happening now. But I don't think I would be able to compete with the performance of other libraries, as I don't have a lot of experience in optimising code.
Yes, for sure. If you have something to share about your experience with the pros/cons of using Thoth.Json, you can write a blog post somewhere or open an issue, and we can add it to the list of linked blog posts in the readme.