This repository has been archived by the owner on Jan 12, 2022. It is now read-only.

sanitize_word can fail on special characters #2

Open
svisser opened this issue May 27, 2014 · 1 comment

Comments

@svisser
Contributor

svisser commented May 27, 2014

I'm using Python 2.6.

If I run crosswords ?sunción, the program fails with a UnicodeDecodeError instead of displaying asuncion. This happens because .decode() called without arguments uses the default encoding, ASCII, which can't decode the non-ASCII bytes of ó:

>>> 'Asunción'.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 6: ordinal not in range(128)

It may be worth passing the encoding explicitly to .decode() rather than relying on the default encoding. It may also be a good idea to inform the user, though I'm not sure where it's best to catch this error (perhaps in compile_pattern, letting that function return None instead?).
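The suggestion above could look roughly like the following sketch. Note that decode_word is a hypothetical helper, not a function from the crosswords code, and the UTF-8 default is an assumption about the terminal's encoding. It returns None on failure so the caller can decide how to inform the user.

```python
def decode_word(raw, encoding="utf-8"):
    """Decode a byte string with an explicit encoding.

    Hypothetical helper: the name and the UTF-8 default are assumptions,
    not part of the crosswords code. Returns None if the bytes are not
    valid in the given encoding, instead of raising UnicodeDecodeError.
    """
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# The UTF-8 bytes of 'Asunción' decode once the encoding is named explicitly.
word = decode_word(b"Asunci\xc3\xb3n")

# Forcing ASCII reproduces the failure from the traceback, reported as None.
failed = decode_word(b"Asunci\xc3\xb3n", "ascii")
```

A caller could then print a short message and exit when the result is None, rather than showing a traceback.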

@bfontaine
Owner

Right. When I wrote the script some years ago, I used non-accented words on the CLI to avoid that, but it should be fixed. We also have a lot of accented words in French, but accents are not needed in crosswords.
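Since accents are not needed in crosswords, one possible approach is to drop them after decoding. The sketch below is an assumption about how that might be done, not the script's actual code: strip_accents is a hypothetical name, and it relies on the standard unicodedata module to decompose each character and discard the combining marks.

```python
import unicodedata


def strip_accents(text):
    """Return text with combining accent marks removed.

    Hypothetical helper, not taken from the crosswords code.
    Expects already-decoded (unicode) text.
    """
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return u"".join(ch for ch in decomposed if not unicodedata.combining(ch))


plain = strip_accents(u"Asunci\xf3n").lower()
```

Characters that have no decomposition are left unchanged by this approach.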
