[RKOTLIN-1089] Guard decoding issues when core logs invalid utf-8 messages #1792

rorbech · 2024-07-03T10:00:09Z

nhachicha · 2024-07-04T09:31:23Z

packages/cinterop/src/jvm/jni/utils.cpp

    for (std::string::size_type i = 0; i < str.size(); ++i)
        ret << " 0x" << std::hex << std::setfill('0') << std::setw(2) << (int)s[i];
    ret << "; ";
-    ret << "in_begin = " << in_begin << "; ";
-    ret << "in_end = " << in_end << "; ";
+    ret << "in_begin = " << (void*) in_begin << "; ";


Why cast to an opaque pointer?

If in_begin and in_end are streamed as char *, they are treated as 0-terminated strings and their contents are written to the stream. The message would then include the invalid UTF-8 bytes, just as with ret << "StringData.data = " << str << "; ";. I want to avoid invalid Unicode in the exception message, as spelled out in #1792 (comment). The pointers are only included to give some insight if we ever need to understand and reproduce an issue from one of these exceptions.
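To illustrate the point about stream overloads, here is a minimal standalone sketch. The helper name describe_pointer is hypothetical and not part of utils.cpp; it only shows that casting to const void* selects the address-printing overload of operator<<, so the bytes behind the pointer never reach the message.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper, not from the PR: formats a pointer as an address.
// Streaming `p` directly would pick the const char* overload and copy the
// pointed-to bytes (possibly invalid UTF-8) into the output up to the first 0.
std::string describe_pointer(const char* label, const char* p) {
    std::ostringstream out;
    out << label << " = " << static_cast<const void*>(p) << "; ";
    return out.str();
}
```

The exact address format is implementation-defined, but the raw bytes of the buffer are not part of it.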

nhachicha · 2024-07-04T09:32:11Z

packages/cinterop/src/jvm/jni/utils.cpp

@@ -82,13 +82,12 @@ static std::string string_to_hex(const std::string& message, realm::StringData&
    ret << "error_code = " << error_code << "; ";
    ret << "retcode = " << retcode << "; ";
    ret << "StringData.size = " << str.size() << "; ";
-    ret << "StringData.data = " << str << "; ";


Are we no longer printing the data?

string_to_hex is only used to produce a hex dump of the data when we fail to decode it from UTF-8, so I don't think it makes sense to include the non-decodable string itself. The result of this method is used as the message of an exception, and we cannot pass that on to the JVM if the message itself is not valid Unicode.
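As a rough sketch of why a hex dump is safe to pass on, the following standalone function (hypothetical name hex_dump, not the actual string_to_hex) renders every byte as ASCII hex digits. It casts through unsigned char, which is an assumption of this sketch and not taken from the PR, so that bytes above 0x7f print as two digits regardless of whether char is signed.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: renders each byte as " 0xNN". The result contains only
// ASCII characters, so it is valid UTF-8 whatever the input bytes are.
std::string hex_dump(const std::string& bytes) {
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (unsigned char c : bytes) {
        // setw applies only to the next insertion, so set it per byte.
        out << " 0x" << std::setw(2) << static_cast<int>(c);
    }
    return out.str();
}
```

For example, the two bytes 'A' and 0xff are rendered as " 0x41 0xff".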

nhachicha · 2024-07-04T09:33:24Z

packages/jni-swig-stub/src/main/jni/realm_api_helpers.cpp

+                               jstring j_message = NULL;
+                               try {
+                                   j_message = to_jstring(jenv, message);
+                               } catch (RuntimeError exception) {


Did you observe this on Android & JVM?

It happens on both platforms.

nhachicha

I think we should skip printing the memory addresses of begin/end if there's no plan for how to use them.

rorbech · 2024-07-05T12:40:16Z

I think we should skip printing the memory addresses of begin/end if there's no plan for how to use them.

We can actually derive some context from them. in_begin will point to the byte that we cannot decode, so we can derive its index in the buffer from in_begin and in_end. I haven't traced it, but I assume there is a similar correspondence for the out-arguments in the encoding to UTF-16. So I will leave them in for now.
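A minimal sketch of that derivation, assuming in_end points one past the last input byte and in_begin has been advanced to the byte that failed to decode, as described above. The function name failing_index is hypothetical; the size argument corresponds to the StringData.size value that is already part of the message.

```cpp
#include <cstddef>

// Hypothetical helper: index of the undecodable byte within the input buffer.
// Assumes in_begin <= in_end and that in_end is one past the end of a buffer
// of `size` bytes.
std::size_t failing_index(std::size_t size, const char* in_begin, const char* in_end) {
    const std::size_t remaining = static_cast<std::size_t>(in_end - in_begin);
    return size - remaining;
}
```

With the logged addresses and size, the same subtraction can be done by hand when reading an exception message.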

rorbech requested review from nhachicha and clementetb July 3, 2024 10:00

cla-bot bot added the cla: yes label Jul 3, 2024

github-actions bot assigned rorbech Jul 3, 2024

rorbech marked this pull request as ready for review July 3, 2024 10:03

nhachicha reviewed Jul 4, 2024


Guard encoding issues when core logs invalid utf-8 messages

7af0a91

rorbech force-pushed the cr/fix-logger-encoding-issues branch from db2e957 to 7af0a91 Compare July 5, 2024 06:37

nhachicha approved these changes Jul 5, 2024


rorbech changed the title ~~[RKOTLIN-1089] Guard encoding issues when core logs invalid utf-8 messages~~ [RKOTLIN-1089] Guard decoding issues when core logs invalid utf-8 messages Jul 5, 2024

rorbech merged commit 2eaaf92 into releases Jul 5, 2024
56 checks passed

rorbech deleted the cr/fix-logger-encoding-issues branch July 5, 2024 12:42

github-actions bot locked as resolved and limited conversation to collaborators Aug 4, 2024
