[Native Image] Does not honor LANG or LC_ALL and fails to set sun.jnu.encoding #9879

Open
2 tasks done
Schaka opened this issue Oct 14, 2024 · 4 comments

@Schaka

Schaka commented Oct 14, 2024

Describe the Issue

I hope I'm in the right place. So please excuse me if I'm wasting your time.
You can find all the code I'm talking about right here: https://github.com/Schaka/janitorr/tree/bazarr-support

Edit: I built some smaller projects to reproduce the issue.

This one works with the Java 23 image, but not with the 21 image. The shell files inside can be used to try different things.
graalvm-test.zip

This one is a full project to be built with Gradle. I basically just followed a GraalVM blog post.
native-image-error.zip

The image is built using the Spring Boot bootBuildImage task via Gradle, and I'm passing these ENV variables.

"BPE_DEFAULT_LANG" to "en_US.UTF-8",
"BPE_LANG" to "en_US.UTF-8",
"BPE_LC_ALL" to "en_US.UTF-8",
"JAVA_TOOL_OPTIONS" to """
    -Dsun.jnu.encoding=UTF-8
    -Dfile.encoding=UTF-8
""".trimIndent(),
"BP_NATIVE_IMAGE_BUILD_ARGUMENTS" to """
    -march=compatibility
    -H:+AddAllCharsets
    -Dsun.jnu.encoding=UTF-8
    -Dfile.encoding=UTF-8
""".trimIndent()

The result of the build command is:

native-image --no-fallback -H:+StaticExecutableWithDynamicLibC -march=compatibility -H:+AddAllCharsets -Dsun.jnu.encoding=UTF-8 -Dfile.encoding=UTF-8 -H:Name=/layers/paketo-buildpacks_native-image/native-image/com.github.schaka.janitorr.JanitorrApplicationKt -cp /workspace:/workspace/BOOT-INF/classes:/workspace/BOOT-INF/lib/feign-jackson-13.1.jar:/workspace/BOOT-INF/lib/spring-boot-actuator-autoconfigure-3.4.0-M3.jar:/workspace/BOOT-INF/lib/jackson-datatype-jdk8-2.17.2.jar:/workspace/BOOT-INF/lib/jackson-datatype-jsr310-2.17.2.jar:/workspace/BOOT-INF/lib/jackson-module-parameter-names-2.17.2.jar:/workspace/BOOT-INF/lib/jackson-databind-2.17.2.jar:/workspace/BOOT-INF/lib/jackson-annotations-2.17.2.jar:/workspace/BOOT-INF/lib/jackson-core-2.17.2.jar:/workspace/BOOT-INF/lib/jackson-module-kotlin-2.17.2.jar:/workspace/BOOT-INF/lib/kotlin-reflect-2.0.20.jar:/workspace/BOOT-INF/lib/kotlinx-coroutines-core-jvm-1.8.1.jar:/workspace/BOOT-INF/lib/kotlin-stdlib-2.0.20.jar:/workspace/BOOT-INF/lib/caffeine-3.1.8.jar:/workspace/BOOT-INF/lib/feign-httpclient-13.1.jar:/workspace/BOOT-INF/lib/feign-core-13.1.jar:/workspace/BOOT-INF/lib/jcl-over-slf4j-2.0.16.jar:/workspace/BOOT-INF/lib/annotations-23.0.0.jar:/workspace/BOOT-INF/lib/spring-webmvc-6.2.0-RC1.jar:/workspace/BOOT-INF/lib/spring-web-6.2.0-RC1.jar:/workspace/BOOT-INF/lib/spring-context-support-6.2.0-RC1.jar:/workspace/BOOT-INF/lib/micrometer-jakarta9-1.14.0-M3.jar:/workspace/BOOT-INF/lib/spring-boot-autoconfigure-3.4.0-M3.jar:/workspace/BOOT-INF/lib/spring-boot-actuator-3.4.0-M3.jar:/workspace/BOOT-INF/lib/spring-boot-3.4.0-M3.jar:/workspace/BOOT-INF/lib/spring-context-6.2.0-RC1.jar:/workspace/BOOT-INF/lib/micrometer-core-1.14.0-M3.jar:/workspace/BOOT-INF/lib/micrometer-observation-1.14.0-M3.jar:/workspace/BOOT-INF/lib/checker-qual-3.37.0.jar:/workspace/BOOT-INF/lib/error_prone_annotations-2.21.1.jar:/workspace/BOOT-INF/lib/httpclient-4.5.14.jar:/workspace/BOOT-INF/lib/logback-classic-1.5.8.jar:/workspace/BOOT-INF/lib/log4j-to-slf4j-2.24.0.jar:/workspace/BOOT-INF/lib/jul-to-slf4j-2.0.16.jar:/workspace/BOOT-INF/lib/slf4j-api-2.0.16.jar:/workspace/BOOT-INF/lib/jakarta.annotation-api-2.1.1.jar:/workspace/BOOT-INF/lib/spring-aop-6.2.0-RC1.jar:/workspace/BOOT-INF/lib/spring-beans-6.2.0-RC1.jar:/workspace/BOOT-INF/lib/spring-expression-6.2.0-RC1.jar:/workspace/BOOT-INF/lib/spring-core-6.2.0-RC1.jar:/workspace/BOOT-INF/lib/snakeyaml-2.3.jar:/workspace/BOOT-INF/lib/tomcat-embed-websocket-10.1.30.jar:/workspace/BOOT-INF/lib/tomcat-embed-core-10.1.30.jar:/workspace/BOOT-INF/lib/tomcat-embed-el-10.1.30.jar:/workspace/BOOT-INF/lib/micrometer-commons-1.14.0-M3.jar:/workspace/BOOT-INF/lib/httpcore-4.4.16.jar:/workspace/BOOT-INF/lib/commons-logging-1.2.jar:/workspace/BOOT-INF/lib/commons-codec-1.17.1.jar:/workspace/BOOT-INF/lib/spring-jcl-6.2.0-RC1.jar:/workspace/BOOT-INF/lib/HdrHistogram-2.2.2.jar:/workspace/BOOT-INF/lib/LatencyUtils-2.0.3.jar:/workspace/BOOT-INF/lib/logback-core-1.5.8.jar:/workspace/BOOT-INF/lib/log4j-api-2.24.0.jar com.github.schaka.janitorr.JanitorrApplicationKt

In my desperation, I've set the ENV variables both at build time and for the image at runtime, but neither seems to make a difference. Setting them when starting the resulting container does not make a difference either.

services:
  janitorr:
    container_name: janitorr
    image: ghcr.io/schaka/janitorr:native-amd64-80-merge
    user: 1000:1000
    ports:
      - 8978:8978 # Technically, we don't publish any endpoints, so this isn't strictly required
    volumes:
      - /appdata/janitorr/config/application.yml:/workspace/application.yml
      - /share_media:/data
    environment:
      - LC_ALL=en_US.UTF-8
      - LANG=en_US.UTF-8

Yet, the second I use Path.of("a path with an ümläüt"), I run into the following exception:

java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /data/media/anime-movies/Nausicaä of the Valley of the Wind (1984) [imdbid-tt0087544]/Nausicaä of the Valley of the Wind (1984) [imdbid-tt0087544] - [Bluray-1080p][FLAC 2.0][x264].mkv
	at java.base@23/sun.nio.fs.UnixPath.encode(UnixPath.java:131) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at java.base@23/sun.nio.fs.UnixPath.<init>(UnixPath.java:77) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at java.base@23/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:312) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at java.base@23/java.nio.file.Path.of(Path.java:148) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at com.github.schaka.janitorr.mediaserver.AbstractMediaServerService.pathStructure$janitorr(AbstractMediaServerService.kt:71) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at com.github.schaka.janitorr.mediaserver.AbstractMediaServerService.createLinks(AbstractMediaServerService.kt:99) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]

Is there something I'm missing here, or could this be a bug in GraalVM somehow?
Looking at the code, UnixFileSystem and UnixPath definitely read sun.jnu.encoding.
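
For illustration, here is a minimal sketch of why an ASCII value for sun.jnu.encoding would produce this exception. It assumes that UnixPath.encode runs the path string through a strict encoder for that charset, and that ANSI_X3.4-1968 is simply another name for US-ASCII. The class and method names are hypothetical and not taken from the linked repository.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of janitorr or the JDK.
final class PathEncodingCheck {

    // True if every character of the path can be encoded in the given charset.
    static boolean encodable(String path, Charset charset) {
        return charset.newEncoder().canEncode(path);
    }

    public static void main(String[] args) {
        // \u00e4 is the character ä, written as an escape so the source file
        // encoding does not matter.
        String path = "/home/schaka/nausica\u00e4";
        System.out.println("US-ASCII: " + encodable(path, StandardCharsets.US_ASCII));
        System.out.println("UTF-8:    " + encodable(path, StandardCharsets.UTF_8));
    }
}
```

The first check returns false and the second returns true, which matches the symptom: the path is valid, but it cannot be represented in the charset the file system layer appears to be using.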

Logging from within the image provides:

2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : Default charset UTF-8
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : sun.jnu.encoding ANSI_X3.4-1968
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : sun.stdout.encoding null
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : sun.stderr.encoding null
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : ENV JAVA_TOOL_OPTIONS null
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : ENV LANG en_US.UTF-8
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : ENV LANGUAGE null
2024-10-14T09:56:37.360Z  INFO 1 --- [           main] c.g.s.j.config.RuntimeEnvironment        : ENV LC_ALL en_US.UTF-8
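
Output like the above can be gathered with a few standard JDK calls. The following is a rough sketch rather than the actual RuntimeEnvironment class from the repository; the class name EncodingReport and its layout are assumptions made for illustration.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the logging shown above.
final class EncodingReport {

    // Collects the same values as the log lines, in the same order.
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<>();
        report.put("Default charset", Charset.defaultCharset().name());
        for (String key : new String[] {"sun.jnu.encoding", "sun.stdout.encoding", "sun.stderr.encoding"}) {
            // String.valueOf turns a missing property into the text "null".
            report.put(key, String.valueOf(System.getProperty(key)));
        }
        for (String key : new String[] {"JAVA_TOOL_OPTIONS", "LANG", "LANGUAGE", "LC_ALL"}) {
            report.put("ENV " + key, String.valueOf(System.getenv(key)));
        }
        return report;
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " " + value));
    }
}
```

Running the same class on a regular JVM and inside the native image makes it easy to compare which values differ between the two.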

Using the latest version of GraalVM can resolve many issues.

GraalVM Version

23

Operating System and Version

Debian 12, Docker

Diagnostic Flag Confirmation

  • I tried the -H:ThrowMissingRegistrationErrors= flag.

Run Command

The default for the image created with paketo builders, completely unchanged.

Expected Behavior

I would expect the encoding to be set correctly so I can use the nio.Path API.

Actual Behavior

java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /data/media/anime-movies/Nausicaä of the Valley of the Wind (1984) [imdbid-tt0087544]/Nausicaä of the Valley of the Wind (1984) [imdbid-tt0087544] - [Bluray-1080p][FLAC 2.0][x264].mkv
	at java.base@23/sun.nio.fs.UnixPath.encode(UnixPath.java:131) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at java.base@23/sun.nio.fs.UnixPath.<init>(UnixPath.java:77) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at java.base@23/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:312) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at java.base@23/java.nio.file.Path.of(Path.java:148) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at com.github.schaka.janitorr.mediaserver.AbstractMediaServerService.pathStructure$janitorr(AbstractMediaServerService.kt:71) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]
	at com.github.schaka.janitorr.mediaserver.AbstractMediaServerService.createLinks(AbstractMediaServerService.kt:99) ~[com.github.schaka.janitorr.JanitorrApplicationKt:na]

Steps to Reproduce

  1. Check out my linked repository.
  2. Use Path.of("/any/directory/with/ümläut")
  3. Observe the error when used inside a native image
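
The steps above can be reduced to a small stand-alone probe. This is only a sketch: the class name UmlautPathProbe and the default path are placeholders, and whether it fails depends on the environment it runs in.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

// Hypothetical reproducer, not taken from the linked repository.
final class UmlautPathProbe {

    // Tries to build a Path and reports the outcome instead of throwing.
    static String probe(String raw) {
        try {
            Path path = Path.of(raw);
            return "OK " + path;
        } catch (InvalidPathException e) {
            return "FAILED " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Placeholder path with ü and ä written as escapes.
        String raw = args.length > 0 ? args[0] : "/any/directory/with/\u00fcml\u00e4ut";
        System.out.println("sun.jnu.encoding " + System.getProperty("sun.jnu.encoding"));
        System.out.println(probe(raw));
    }
}
```

Compiling this once and running it both on the JVM and as a native image should show whether the two runtimes agree on the encoding used for paths.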

Additional Context

Important to note:
Even when running it like this, with the correct charset passed explicitly, it does not seem to work:

schaka@hp-laptop:~$ docker run ghcr.io/schaka/janitorr:native-amd64-bazarr-support -Dsun.jnu.encoding=UTF-8
Exception in thread "main" sun.jnu.encoding UTF-8
java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /home/schaka/nausicaä
        at java.base@23/sun.nio.fs.UnixPath.encode(UnixPath.java:131)
        at java.base@23/sun.nio.fs.UnixPath.<init>(UnixPath.java:77)
        at java.base@23/sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:312)
        at java.base@23/java.nio.file.Path.of(Path.java:148)
        at com.github.schaka.janitorr.JanitorrApplicationKt.main(JanitorrApplication.kt:52)
        at java.base@23/java.lang.invoke.LambdaForm$DMH/sa346b79c.invokeStaticInit(LambdaForm$DMH)

No response

Run-Time Log Output and Error Messages

No response

@fernando-valdez
Member

Hello @Schaka, this is the right place.

Can you install this package glibc-all-langpacks and try again?

@Schaka
Author

Schaka commented Oct 15, 2024

It doesn't seem available on my host, which runs Debian.

Is this something that I would need in my container that runs the resulting native-image binary or in the container used to build the binary?

@fernando-valdez
Member

fernando-valdez commented Oct 15, 2024

This was a solution for the same error reported here: #8792 (comment)

@Schaka
Author

Schaka commented Oct 15, 2024

I stumbled across that issue too but wasn't really sure how it relates to my problem.
What does GraalVM actually need to run?

Debian/Ubuntu have the locales packages. I doubt there's anything wrong with them in particular.
In my first example, you can see that in the image provided by GraalVM (community:21), the setup was incorrect too, leading to the same error. The binaries built inside the images for GraalVM 22/23 work fine.

As you can see in my example project with Spring Boot code, even following official GraalVM blog posts using paketo buildpacks (they build on Ubuntu), Path.of runs into the error I posted.

Somehow, a RHEL/Fedora-based image picks up sun.jnu.encoding correctly from the locale settings, whereas on Debian and Ubuntu it does not.

I also opened an issue over at paketo, maybe you can tell them what exactly they need?
