Python 3 UnicodeDecodeError #136

Open
usrbinsam opened this issue Nov 16, 2017 · 4 comments

Comments

@usrbinsam

I get an exception in my core buffer whenever someone sends Unicode characters to a channel, as in the screenshot below. I'm on Python 3.6.3.

https://puu.sh/yntz6/1e4f54f40d.png

@koolfy
Collaborator

koolfy commented Nov 16, 2017

Can you try with the latest Potr (pure-python-otr https://github.com/python-otr/pure-python-otr) manually built from master?

We did some Unicode fixes IIRC

@tribut
Collaborator

tribut commented Aug 6, 2019

After blackbit reported the same problem on IRC, I did a bit of digging. It seems that Python scripts cannot handle non-UTF-8 input in WeeChat at all when running under Python 3:

16:01:41 <@FlashCode> when invalid utf-8 is sent to a python 3 callback, there are problems
16:01:54 <@FlashCode> I have no easy solution for this problem
16:02:07 <@FlashCode> for now you should use only signals that return only utf-8

For example, this script triggers a UnicodeDecodeError when it sees non-utf8 input:

python: stdout/stderr (test): UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 67: invalid start byte
python: error in function "message_in_cb"

This has to be fixed in the Python plugin for WeeChat. The upstream bug is weechat/weechat#1389.
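
The decode failure itself can be reproduced outside WeeChat. Below is a minimal sketch, assuming the offending byte 0xfc comes from a Latin-1 encoded "ü". The sample IRC line and the helper name `decode_lenient` are hypothetical and not taken from the plugin.

```python
# Hypothetical raw IRC line containing Latin-1 bytes (0xfc is "ü" in Latin-1).
raw = b":nick!user@host PRIVMSG #channel :gr\xfc\xdfe"


def decode_lenient(data: bytes) -> str:
    """Try strict UTF-8 first; on failure, replace undecodable bytes."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # Each undecodable byte becomes U+FFFD instead of raising.
        return data.decode("utf-8", errors="replace")


# Strict decoding raises, just like the callback in the log above.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError as exc:
    strict_failed = True
    failing_byte = raw[exc.start]  # the byte the codec rejected

lenient = decode_lenient(raw)
```

This only illustrates the codec behaviour. Inside WeeChat the string is decoded before the script's callback runs, so a script cannot apply such a fallback itself, which is why the fix has to happen in the Python plugin.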

@tribut
Collaborator

tribut commented Aug 6, 2019

It appears, however, that this does not prevent OTR from working, except that invisible tags do not work for messages with non-UTF-8 content. Otherwise it just spams the log.

@tribut
Collaborator

tribut commented Oct 14, 2019

The upstream bug is now fixed in master: weechat/weechat@513f5a1

Also, it appears that the irc_in2_* callback is triggered after decoding, but before the message is used. So maybe we should just be using that. If that works, it could mean we can remove a lot of the charset handling code.
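
A rough sketch of what hooking the decoded message could look like, assuming the `irc_in2_privmsg` modifier is the right hook. The script name and metadata are hypothetical, the callback name is borrowed from the error log above, and whether this actually avoids the error is untested here.

```python
try:
    import weechat  # only importable inside WeeChat's Python plugin
except ImportError:
    weechat = None  # allows the callback to be exercised outside WeeChat


def message_in_cb(data, modifier, modifier_data, string):
    """Modifier callback: receives the IRC line after charset decoding.

    Returning the string unchanged leaves the message untouched.
    """
    # A real plugin would inspect or rewrite `string` here.
    return string


# Hypothetical script metadata; registration only happens inside WeeChat.
if weechat is not None and weechat.register(
    "in2_demo", "example", "0.1", "MIT",
    "Sketch: hook irc_in2_privmsg", "", ""
):
    weechat.hook_modifier("irc_in2_privmsg", "message_in_cb", "")
```

Since the string passed to the callback has already been decoded by WeeChat, the script would not need its own charset handling on this path.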
