Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add NerdbankMessagePackFormatter #1100

Draft
wants to merge 25 commits into
base: main
Choose a base branch
from

Conversation

trippwill
Copy link
Member

@trippwill trippwill commented Dec 20, 2024

Add an IJsonRpcMessageFormatter implementation that uses Nerdbank.MessagePack. This serializer has better startup time and is AOT-safe. This alone won't make StreamJsonRpc as a whole AOT safe, but it's a step in that direction.

Charles Willis added 5 commits December 18, 2024 15:03
- Upgraded several package versions in `Directory.Packages.props`, including `Microsoft.Bcl.AsyncInterfaces`, `System.Collections.Immutable`, `System.IO.Pipelines`, and `System.Text.Json`. Added `Nerdbank.MessagePack`.
- Modified `nuget.config` to add a new package source for `nuget.org` and included package source mappings.
- Updated `StreamJsonRpc.csproj` to target only `net8.0`, removing `net6.0`, and added a reference to `Nerdbank.MessagePack`.
- Changed `Benchmarks.csproj` to target `net8.0` instead of `net6.0`.
- Adjusted `StreamJsonRpc.Tests.csproj` to exclusively target `net8.0`.
Replaces the `ProgressFormatterResolver` and `AsyncEnumerableFormatterResolver` with `ProgressConverterResolver` and `AsyncEnumerableConverterResolver`.
Significant modifications to the `NerdbankMessagePackFormatter` and related classes to improve serialization context and type shape provider functionalities.

- Introduced new methods in `ISerializationContextBuilder.cs` for registering various type converters, enhancing flexibility for asynchronous enumerables, duplex pipes, and streams.
- Replaced `PipeFormatterResolver` with `PipeConverterResolver` in `NerdbankMessagePackFormatter` to align with new converter registration methods.
- Updated `FormatterContextBuilder` to utilize new context and type shape provider structures for a more modular design.
Copy link
Member

@AArnott AArnott left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing all this work. You're providing invaluable feedback for the Nerdbank.MessagePack library's feature set as well.

Directory.Packages.props Outdated Show resolved Hide resolved
Directory.Packages.props Outdated Show resolved Hide resolved
/// <summary>
/// Provides a builder interface for adding type shape providers.
/// </summary>
public interface ICompositeTypeShapeProviderBuilder
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully this whole interface will be unnecessary when eiriktsarpalis/PolyType#91 is fulfilled.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simplified this further by removing this interface and merging the methods in the FormatterContectBuilder. But yes, it would be nice to rely on built-in support from PolyType for composing type shape providers.

src/StreamJsonRpc/Protocol/JsonRpcRequest.cs Outdated Show resolved Hide resolved
src/StreamJsonRpc/NerdbankMessagePackFormatter.cs Outdated Show resolved Hide resolved
src/StreamJsonRpc/NerdbankMessagePackFormatter.cs Outdated Show resolved Hide resolved
src/StreamJsonRpc/NerdbankMessagePackFormatter.cs Outdated Show resolved Hide resolved
Charles Willis added 6 commits December 21, 2024 17:30
…rmatterContextBuilder` class.

The `NerdbankMessagePackFormatter` class has been modified to eliminate the `NBMP` namespace prefix, favoring direct usage of `MessagePack` types.
@trippwill
Copy link
Member Author

@AArnott : Current test stats:

 2157 Passed
 195 Failed
 13 Skipped

@trippwill
Copy link
Member Author

trippwill commented Dec 28, 2024

Current test results:
 2182 Passed
 170 Failed
 13 Skipped

@AArnott - Currently seeing a lot of test failures around exceptions and having difficulty tracing the issues. Many errors are caused by an NB.MP throwing b/c an exception is not serializable.

While adding additional formatter configuration for the tests to pass, I'm seeing some
opportunities to make the builder API a bit nicer if that's the pattern we want to go with.

The need to pre-register converters in the serializer without the ability to resolve and cache at runtime like legacy MessagePack-CSharp resolvers, is currently the cause of the most repeated builder calls. I suppose we could make a converter act similar to a resolver, or make some converters more open to supporting a wider set of types?

Edit
Unrelated thought: Do we really need two different serializer instances? Maybe it would better to register all the envelope and support type shapes and converters in the default user data configuration, and let the user add additional converters and shapes via the builder. Instead of switching back and forth between an "rpc" configuration and a "userData" configuration.

@trippwill
Copy link
Member Author

Benchmarks:


BenchmarkDotNet v0.13.10, Windows 11 (10.0.26100.2605) (Hyper-V)
Intel Xeon Platinum 8370C CPU 2.80GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK 8.0.402
  [Host]     : .NET 8.0.11 (8.0.1124.51707), X64 RyuJIT AVX2
  Job-GLSONT : .NET 8.0.11 (8.0.1124.51707), X64 RyuJIT AVX2
  Job-VLZGJF : .NET Framework 4.8.1 (4.8.9282.0), X64 RyuJIT VectorSize=256


Method Runtime Formatter Mean Error StdDev Ratio RatioSD Gen0 Gen1 Allocated Alloc Ratio
InvokeAsync_NoArgs .NET 8.0 JSON 77.92 μs 1.471 μs 1.913 μs 0.88 0.03 0.4883 - 13.09 KB 0.84
InvokeAsync_NoArgs .NET Framework 4.7.2 JSON 88.86 μs 0.743 μs 0.659 μs 1.00 0.00 2.4414 0.1221 15.67 KB 1.00
InvokeAsync_NoArgs .NET 8.0 MessagePack 46.29 μs 0.631 μs 0.591 μs 1.09 0.02 0.1221 - 4.07 KB 0.63
InvokeAsync_NoArgs .NET Framework 4.7.2 MessagePack 42.43 μs 0.611 μs 0.571 μs 1.00 0.00 1.0376 0.0610 6.43 KB 1.00
InvokeAsync_NoArgs .NET 8.0 NerdbankMessagePack 51.66 μs 0.589 μs 0.551 μs 0.88 0.01 - - 4.06 KB 0.63
InvokeAsync_NoArgs .NET Framework 4.7.2 NerdbankMessagePack 58.85 μs 0.218 μs 0.170 μs 1.00 0.00 1.0376 0.0610 6.44 KB 1.00

BenchmarkDotNet v0.13.10, Windows 11 (10.0.26100.2605) (Hyper-V)
Intel Xeon Platinum 8370C CPU 2.80GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK 8.0.402
  [Host]     : .NET 8.0.11 (8.0.1124.51707), X64 RyuJIT AVX2
  Job-GLSONT : .NET 8.0.11 (8.0.1124.51707), X64 RyuJIT AVX2
  Job-VLZGJF : .NET Framework 4.8.1 (4.8.9282.0), X64 RyuJIT VectorSize=256


Method Runtime Formatter Mean Error StdDev Ratio Gen0 Gen1 Allocated Alloc Ratio
NotifyAsync_NoArgs .NET 8.0 JSON 2.821 μs 0.0177 μs 0.0157 μs 0.44 0.0877 - 2216 B 0.94
NotifyAsync_NoArgs .NET Framework 4.7.2 JSON 6.347 μs 0.0327 μs 0.0306 μs 1.00 0.3738 0.0076 2359 B 1.00
NotifyAsync_NoArgs .NET 8.0 MessagePack 1.877 μs 0.0122 μs 0.0114 μs 0.63 0.0038 - 168 B 0.68
NotifyAsync_NoArgs .NET Framework 4.7.2 MessagePack 2.961 μs 0.0345 μs 0.0306 μs 1.00 0.0381 0.0038 248 B 1.00
NotifyAsync_NoArgs .NET 8.0 NerdbankMessagePack 1.891 μs 0.0134 μs 0.0125 μs 0.56 0.0057 - 168 B 0.67
NotifyAsync_NoArgs .NET Framework 4.7.2 NerdbankMessagePack 3.347 μs 0.0272 μs 0.0255 μs 1.00 0.0381 0.0038 250 B 1.00

BenchmarkDotNet v0.13.10, Windows 11 (10.0.26100.2605) (Hyper-V)
Intel Xeon Platinum 8370C CPU 2.80GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK 8.0.402
  [Host]     : .NET 8.0.11 (8.0.1124.51707), X64 RyuJIT AVX2
  Job-GLSONT : .NET 8.0.11 (8.0.1124.51707), X64 RyuJIT AVX2
  Job-VLZGJF : .NET Framework 4.8.1 (4.8.9282.0), X64 RyuJIT VectorSize=256


Method Runtime Formatter Mean Error StdDev Ratio RatioSD Gen0 Gen1 Allocated Alloc Ratio
CreateConnectionAndInvokeOnce .NET 8.0 JSON 68.88 μs 0.650 μs 0.576 μs 0.39 0.00 2.0000 - 49.85 KB 0.86
CreateConnectionAndInvokeOnce .NET Framework 4.7.2 JSON 175.96 μs 2.712 μs 2.537 μs 1.00 0.00 9.3333 - 58.21 KB 1.00
CreateConnectionAndInvokeOnce .NET 8.0 MessagePack 49.62 μs 0.977 μs 1.711 μs 0.37 0.01 1.5000 - 36.55 KB 0.83
CreateConnectionAndInvokeOnce .NET Framework 4.7.2 MessagePack 133.71 μs 2.591 μs 2.983 μs 1.00 0.00 7.0000 - 44.21 KB 1.00
CreateConnectionAndInvokeOnce .NET 8.0 NerdbankMessagePack 184.48 μs 2.625 μs 2.455 μs 0.40 0.02 2.0000 - 54.96 KB 0.88
CreateConnectionAndInvokeOnce .NET Framework 4.7.2 NerdbankMessagePack 484.65 μs 14.761 μs 43.523 μs 1.00 0.00 10.0000 3.0000 62.52 KB 1.00

@AArnott
Copy link
Member

AArnott commented Dec 29, 2024

Currently seeing a lot of test failures around exceptions and having difficulty tracing the issues. Many errors are caused by an NB.MP throwing b/c an exception is not serializable.

Seeing the exception type, message and stack trace usually helps me understand and suggest fixes for issues like this.
I don't think either msgpack library includes built-in support for exception serialization, so I'm not sure what NB.MP failure would be to blame here.

I'm seeing some opportunities to make the builder API a bit nicer if that's the pattern we want to go with.

FYI I haven't looked closely at the core of your pull request yet, since I'm trying to stay 'OOF' 😉 so I can't make a firm recommendation at this point.

The need to pre-register converters in the serializer without the ability to resolve and cache at runtime like legacy MessagePack-CSharp resolvers, is currently the cause of the most repeated builder calls. I suppose we could make a converter act similar to a resolver, or make some converters more open to supporting a wider set of types?

I recently removed the exception that is thrown when you register more converters after serialization has happened. But it still resets the converter cache so it's a runtime hit that should be avoided if possible.

Is your concern mostly around generics? I'm still working on a better solution for that from NB.MP that I hope will make this simpler for you.

Unrelated thought: Do we really need two different serializer instances? Maybe it would better to register all the envelope and support type shapes and converters in the default user data configuration, and let the user add additional converters and shapes via the builder. Instead of switching back and forth between an "rpc" configuration and a "userData" configuration.

Historically, separating envelope from user data has been useful in protecting the integrity of the JSON-RPC protocol and our extensions, while allowing the user to have flexibility in how types are serialized. If we have just one serializer object, it seems the user will either be disallowed to customize how types are serialized that we internally must serialize, or the user's customizations could invalidate the objects we serialize for the envelope. And this set of sensitive types may change over time as we change what or how we serialize objects as part of the core protocol or the exotic types.

If I'm missing the mark of what you're thinking of, please correct me.
Otherwise, I'm curious what cost (complexity, run time, etc.) you're thinking would benefit from consolidating the serializer instances.

... benchmarks...

Most of those look pretty good. The last one seems to mix in making the connection itself, which seems likely to be noisy. But I have no idea why NB.MP would make the process of connecting so much slower than the other libraries. Do you?

@trippwill
Copy link
Member Author

Historically, separating envelope from user data has been useful in protecting the integrity of the JSON-RPC protocol and our extensions, while allowing the user to have flexibility in how types are serialized.

Totally makes sense.

Is your concern mostly around generics?

Mostly, yes. With MP-CS resolvers, the right formatter is found by working through an ordered list of resolvers for any given type. This allows for behaviors like a resolver intercepting intrinsic rpc-marhalables before they matched to a more conventional formatter. Same for things like Pipes and Streams as arguments or return types. And IAsyncEnumerable as a property on a class. With NB.MP, a type like MyClass : IObserver<int> has to have an RpcMarhsalable converter registered explicitly for it. Currently, I have this done as part of builder. Maybe the better solution would be to expose an ObserverConverter<T> from StreamJsonRpc that the user can apply to a custom type like MyClass?

Basically, not being able to intercept the exotic types as they are seen (for best perf) means the StreamJsonRpc consumer needs to be very explicit up-front.

But I have no idea why NB.MP would make the process of connecting so much slower than the other libraries. Do you?

Not yet :-)

FYI I haven't looked closely at the core of your pull request yet, since I'm trying to stay 'OOF' 😉

Totally understandable. I should probably step away from this myself for awhile.

@trippwill
Copy link
Member Author

Latest Test Results:
 2205 Passed
 147 Failed
 13 Skipped

Updated serialization and deserialization processes in the
NerdbankMessagePackFormatter to utilize ReadOnlySequence<byte>
instead of RawMessagePack.
Introduced new classes for handling enumerator results and
enhanced comments for clarity. Updated tests to ensure
compatibility with the new serialization methods and added
data contract attributes for better interoperability.
@trippwill
Copy link
Member Author

Latest Test Results:
 2270 Passed
 82 Failed
 13 Skipped

Added new type shape providers and registered exception types in the NerdbankMessagePackFormatter class. Updated AddTypeShapeProvider to initialize with a capacity of 3. Removed methods for duplex pipes and streams, reflecting a shift in supported types. Registered new converters for handling exceptions like RemoteInvocationException. Updated CommonErrorData with shape generation attributes and modified test classes to align with these changes.
@trippwill
Copy link
Member Author

Test Results:

 2275 Passed
 77 Failed
 13 Skipped

@trippwill
Copy link
Member Author

Method Runtime Formatter Mean Error StdDev Ratio RatioSD Gen0 Gen1 Allocated Alloc Ratio
CreateConnectionAndInvokeOnce .NET 8.0 JSON 75.15 μs 1.493 μs 2.767 μs 0.41 0.02 2.0000 - 49.83 KB 0.86
CreateConnectionAndInvokeOnce .NET Framework 4.7.2 JSON 182.33 μs 3.626 μs 7.805 μs 1.00 0.00 9.3333 - 58.28 KB 1.00
CreateConnectionAndInvokeOnce .NET 8.0 MessagePack 49.67 μs 0.737 μs 1.126 μs 0.38 0.01 1.3333 - 36.58 KB 0.83
CreateConnectionAndInvokeOnce .NET Framework 4.7.2 MessagePack 130.67 μs 2.109 μs 1.973 μs 1.00 0.00 7.0000 - 44.25 KB 1.00
CreateConnectionAndInvokeOnce .NET 8.0 NerdbankMessagePack 243.71 μs 4.817 μs 5.735 μs 0.50 0.02 2.0000 - 66.71 KB 0.79
CreateConnectionAndInvokeOnce .NET Framework 4.7.2 NerdbankMessagePack 492.03 μs 9.718 μs 15.413 μs 1.00 0.00 13.0000 3.0000 84.54 KB 1.00

Latest benchmarks hint that there is a correlation between time and number of serializer runtime register calls. In the latest commit, many more converters and new sub type maps are added in the ctor of NerdbankMessagePackFormatter. The time could also be a result of the calls to converter resolvers.

@AArnott
Copy link
Member

AArnott commented Jan 2, 2025

Does this benchmark include the RegisterConverter calls themselves? If so, we should probably isolate that in a "connection creation" benchmark, leaving a benchmark that measures only the incremental cost of an RPC call.

- Bump `Nerdbank.MessagePack` package version to `0.3.98-beta`.
- Refactor `Profile.Builder` to register async enumerable converters and known subtypes.
- Update `NerdbankMessagePackFormatter` to include new converter registration methods.
- Update multiple test files to reflect new converter registrations and functionality.

public override JsonObject? GetJsonSchema(JsonSchemaContext context, ITypeShape typeShape) => JsonNode.Parse("""
{
"type": ["string", { "type": "integer", "format": "int64" }]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should probably find a way to make this more discoverable, but this isn't a valid JSON schema.
image

A valid form for this would be:

{
  "$schema": "http://json-schema.org/draft-04/schema",
	"oneOf": [
		{
			"type": "string"
		},
		{ "type": "integer", "format": "int64" }
	]
}

You don't need the $schema property of course, but including it while you write the schema out in VS Code gives you the validation on your json schema.

In JSON though, integers default to 64-bit I believe, so you could probably get away with just this:

{
	"type": ["string", "integer"]
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants