Unicode support on Windows? #68

Open
windowsair opened this issue Sep 22, 2022 · 9 comments

Comments

@windowsair

windowsair commented Sep 22, 2022

Working with character encoding on Windows is really annoying.

Passing UTF-8 characters on the command line with CreateProcessA seems to be impossible. Local code pages can handle characters such as Chinese and Japanese, but they don't seem to handle emoji.

In my case, I convert your commandLineCombined to UTF-16LE and then call CreateProcessW. My original input is UTF-8, and this modification seems to handle it properly.

What do you think of this? Thanks.

@windowsair
Author

windowsair commented Sep 22, 2022

CreateProcessA seems to convert the command line internally using the process's legacy (ANSI) code page, which cannot represent characters like emoji.

@sheredom
Owner

What version of Windows were you running on? The reason I ask is that I thought CreateProcessA supported UTF-8 on later Windows versions!

@windowsair
Author

Oh, and I don't seem to be receiving messages from GitHub.

Windows 10 build 19044; I think that's new enough.

@jlaumon

jlaumon commented Jan 3, 2024

Apparently there's a manifest setting that enables that: https://learn.microsoft.com/en-us/windows/apps/design/globalizing/use-utf8-code-page

Did you happen to try it? I'd be curious to know if it works.

@jlaumon

jlaumon commented Jan 3, 2024

Ok, reading that page again I'm not really sure what that manifest is for. It seems to say the default code page is UTF8 without manifest for recent enough Windows (and that's the case for me, but of course I first tested with OutputDebugStringA which doesn't support UTF8).

Update: the manifest is indeed needed; I'm not sure how I managed to fumble my previous test. GetACP() returns CP_UTF8 when it works.

I'll give subprocess with UTF8 command lines a try in the following days and let you know how it went.
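
The GetACP() check mentioned above could be wrapped roughly as follows. This is only a sketch: `ansi_code_page_is_utf8` is a hypothetical helper name, and the `-1` return off Windows is an assumption made here so the snippet compiles on other platforms.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper (not part of subprocess.h).
   Returns 1 if the process ANSI code page is UTF-8, 0 if it is not,
   and -1 when not built for Windows (where the question does not apply). */
int ansi_code_page_is_utf8(void) {
#ifdef _WIN32
  /* CP_UTF8 is code page 65001; GetACP() reports it when the
     UTF-8 manifest setting has taken effect for this process. */
  return GetACP() == CP_UTF8 ? 1 : 0;
#else
  return -1;
#endif
}
```

A program could call this once at startup and log a warning when the result is 0, since `A`-suffixed APIs would then interpret strings in a non-UTF-8 code page.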

@windowsair
Author

Hi, @jlaumon

Thank you for following up on this. This issue has been open for some time now. I have since replaced CreateProcessA with CreateProcessW and it works fine. I always use CP_UTF8 in the parent process, but I'm not sure whether the child process inherits this.

@sheredom
Owner

sheredom commented Jan 5, 2024

If someone wants to put together a PR that passes CI I'd happily accept it.

@jlaumon

jlaumon commented Jan 5, 2024

I just tried subprocess with xcopy.exe to copy files and directories whose names contain non-ASCII characters in UTF-8 and, as long as that magic manifest is there, it just works!

I am now the proud owner of 🍌.txt and 🍌_copy.txt.

@matyalatte
Contributor

UTF-8 support on Windows is still in beta and does not work by default (at least in localized environments for some Asian languages).
You need a manifest file to use CreateProcessA with UTF-8, as jlaumon said.

But for a cross-platform single-header library, I think it's undesirable for behavior to change depending on external factors such as manifest files.

As far as I know, common cross-platform libraries use the wide-character versions of the APIs together with MultiByteToWideChar, as windowsair did, or they support UTF-16 strings directly.
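
For illustration, the wide-character approach could look roughly like the sketch below. It is not the code windowsair used and not part of subprocess.h: `utf8_to_utf16` and `spawn_utf8` are hypothetical names. On Windows the conversion goes through MultiByteToWideChar; the simplified manual decoder in the `#else` branch is an assumption added only so the conversion can be compiled and exercised on other platforms.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: converts a NUL-terminated UTF-8 string to a newly
   allocated, NUL-terminated UTF-16 string. Returns NULL on invalid input
   or allocation failure. The caller frees the result. */
uint16_t *utf8_to_utf16(const char *utf8) {
#ifdef _WIN32
  uint16_t *out;
  /* The first call only asks for the required length, terminator included. */
  int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, utf8, -1, NULL, 0);
  if (n <= 0) return NULL;
  out = (uint16_t *)malloc((size_t)n * sizeof(uint16_t));
  if (out == NULL) return NULL;
  if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, utf8, -1,
                          (wchar_t *)out, n) <= 0) {
    free(out);
    return NULL;
  }
  return out;
#else
  /* Simplified fallback: checks sequence structure and range, but does not
     reject overlong encodings or encoded surrogates. */
  const unsigned char *s = (const unsigned char *)utf8;
  /* The output never needs more UTF-16 units than there are input bytes. */
  uint16_t *out = (uint16_t *)malloc((strlen(utf8) + 1) * sizeof(uint16_t));
  size_t o = 0;
  if (out == NULL) return NULL;
  while (*s) {
    uint32_t cp;
    int extra;
    if (*s < 0x80) { cp = *s; extra = 0; }
    else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
    else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
    else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
    else { free(out); return NULL; }
    s++;
    while (extra-- > 0) {
      /* A truncated sequence hits the terminator here and is rejected. */
      if ((*s & 0xC0) != 0x80) { free(out); return NULL; }
      cp = (cp << 6) | (uint32_t)(*s++ & 0x3F);
    }
    if (cp > 0x10FFFF) { free(out); return NULL; }
    if (cp >= 0x10000) {
      /* Code points beyond the BMP (emoji, for example) need a surrogate pair. */
      cp -= 0x10000;
      out[o++] = (uint16_t)(0xD800 | (cp >> 10));
      out[o++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
    } else {
      out[o++] = (uint16_t)cp;
    }
  }
  out[o] = 0;
  return out;
#endif
}

#ifdef _WIN32
/* Hypothetical wrapper: launches a UTF-8 command line via CreateProcessW.
   Returns 1 on success, 0 on failure. */
int spawn_utf8(const char *command_line_utf8) {
  STARTUPINFOW si;
  PROCESS_INFORMATION pi;
  BOOL ok;
  wchar_t *wide = (wchar_t *)utf8_to_utf16(command_line_utf8);
  if (wide == NULL) return 0;
  memset(&si, 0, sizeof(si));
  si.cb = sizeof(si);
  /* CreateProcessW may modify the command line, so the buffer must be writable. */
  ok = CreateProcessW(NULL, wide, NULL, NULL, FALSE, 0, NULL, NULL, &si, &pi);
  free(wide);
  if (!ok) return 0;
  CloseHandle(pi.hThread);
  CloseHandle(pi.hProcess);
  return 1;
}
#endif
```

The point of this design is that the result no longer depends on the active code page or on a manifest: the caller always supplies UTF-8, and the wide API always receives UTF-16.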
