Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add serialization/deserialization support for ClassifierId, Split and Pages. #47619

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jaliyaudagedara
Copy link

Add serialization/deserialization support for ClassifierId, Split and Pages.

Addresses #47612

@github-actions github-actions bot added Cognitive - Form Recognizer Community Contribution Community members are working on the issue customer-reported Issues that are reported by GitHub users external to the Azure organization. labels Dec 19, 2024
Copy link

Thank you for your contribution @jaliyaudagedara! We will review the pull request and get back to you soon.

@jaliyaudagedara
Copy link
Author

@kinelski, appreciate if you can let me know whether I am on the correct track. Didn't see any specific tests for Serialization?

@azure-sdk
Copy link
Collaborator

API change check

API changes are not detected in this pull request.

Copy link
Member

@kinelski kinelski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you're on the right track, but there's one issue that could be tricky to address.

My understanding is that we use the method JsonModelWriteCore in two scenarios:

  1. When the client method ClassifyDocument is called: we serialize the object to send it as part of the request to the service.
  2. When the JsonModelConverter is used: the customer triggers the serialization method to store the serialized object.

In the first case, the additional properties (classifierId, pages, split) are added separately at a later point when building the request:

uri.AppendPath(classifierId, true);
uri.AppendPath(":analyze", false);
uri.AppendQuery("api-version", _apiVersion, true);
if (stringIndexType != null)
{
uri.AppendQuery("stringIndexType", stringIndexType, true);
}
if (split != null)
{
uri.AppendQuery("split", split, true);
}
if (pages != null)
{
uri.AppendQuery("pages", pages, true);
}

Note that classifierId is added as part of the endpoint's path, and properties split and pages are added as part of the endpoint's query.

Your changes as they are right now may fix the JsonModelConverter serialization issue, but these additional properties will be also included as part of the JSON body when sending a request to the service, which is not desirable.

@@ -44,13 +44,28 @@ protected virtual void JsonModelWriteCore(Utf8JsonWriter writer, ModelReaderWrit
writer.WritePropertyName("base64Source"u8);
writer.WriteBase64StringValue(BytesSource.ToArray(), "D");
}
if (Optional.IsDefined(ClassifierId))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You must be careful when modifying this file. As you can see, it's inside our Generated folder. Once we run the code generator again, these changes will be overwritten.

The right way of adding changes to this method would be to move it to the other ClassifyDocumentOptions file, located outside of the Generated folder (ClassifyDocumentOptions). This should prevent a method with the same signature from being generated when we run the generator again.

Copy link
Author

@jaliyaudagedara jaliyaudagedara Dec 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing.

I am sorry, I am a bit confused here.

I have already introduced this to the ClassifyDocumentOptions file located outside of the Generated folder. So when the code generator runs again it would create a new ClassifyDocumentOptions.Serialization.cs and JsonModelWriteCore/DeserializeClassifyDocumentOptions will have new code, so we don't have to manually modify ClassifyDocumentOptions.Serialization.cs. Is it?

internal ClassifyDocumentOptions(string classifierId,
    Uri uriSource,
    BinaryData bytesSource,
    SplitMode? split,
    string pages,
    IDictionary<string, BinaryData> serializedAdditionalRawData)
{
    ClassifierId = classifierId;
    UriSource = uriSource;
    BytesSource = bytesSource;
    Split = split;
    Pages = pages;
    _serializedAdditionalRawData = serializedAdditionalRawData;
}

@kinelski
Copy link
Member

Didn't see any specific tests for Serialization?

The serialization code is direct output of our code generator (autorest.csharp), so we don't test it on the SDK client side.

That said, the serialization issues we're seeing are caused because of custom code we added on top of the generated layer, so it would be ideal if we added serialization/deserialization tests for the customized models.

@jaliyaudagedara
Copy link
Author

Your changes as they are right now may fix the JsonModelConverter serialization issue, but these additional properties will be also included as part of the JSON body when sending a request to the service, which is not desirable.

Yes, additional properties are included in the JSON body, will have a look at better way to get rid of it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Cognitive - Form Recognizer Community Contribution Community members are working on the issue customer-reported Issues that are reported by GitHub users external to the Azure organization.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants