UnicodeEncodeError: 'charmap' codec can't encode characters in position 578-694: character maps to <undefined> #429

ryntml · 2024-10-04T18:38:35Z

I am trying to train a model on Ottoman Turkish, which is written in a script that mixes the Arabic and Persian alphabets. I created all the datasets, but as soon as I run train.py I get the following error:

A small example from labels.txt:

Even though I encode the files as UTF-8, I still get the error.
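For context, a minimal sketch of how this class of error usually arises, assuming the failure comes from text being encoded with a Windows code page (cp1252 here is an assumption, as are the sample string and the temporary `labels.txt` path) rather than UTF-8. It is not taken from train.py; it only shows that the same text fails under a charmap-based codec and round-trips when `encoding="utf-8"` is passed explicitly to `open`.

```python
import os
import tempfile

# Hypothetical sample text: the Arabic letters noon, waw, noon.
sample = "\u0646\u0648\u0646"

# Assumption: the platform default is a charmap-based code page such as cp1252.
# Encoding Arabic-script text with it raises the same kind of error.
try:
    sample.encode("cp1252")
    error_text = ""
except UnicodeEncodeError as exc:
    error_text = str(exc)

# Passing encoding="utf-8" explicitly avoids depending on the platform default.
path = os.path.join(tempfile.mkdtemp(), "labels.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write(sample)
with open(path, encoding="utf-8") as f:
    round_trip = f.read()

print(error_text)
print(round_trip == sample)
```

If the error persists even though the data files themselves are UTF-8, it may be worth checking every `open(...)` call and every `print` of label text in the training code, since any of them can fall back to the platform default encoding.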

There is this problem with the characters:

Like Arabic, this language writes each letter differently at the beginning, middle, and end of a word, so I included every form of each character.
For example, I added three forms of the letter Noon.
Could this be causing the problem? Does anyone know?
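To illustrate the question about letter forms, here is a small sketch, assuming the positional forms were entered as Unicode Arabic presentation-form code points (U+FEE5 to U+FEE8 for Noon). That is an assumption about the character list, not something confirmed from the dataset. Under NFKC normalization, all of these forms map to the single base letter U+0646.

```python
import unicodedata

# Assumed code points: Arabic presentation forms of the letter Noon.
noon_forms = {
    "isolated": "\uFEE5",
    "final": "\uFEE6",
    "initial": "\uFEE7",
    "medial": "\uFEE8",
}

# NFKC applies compatibility decomposition, which folds each presentation
# form back to the base letter ARABIC LETTER NOON (U+0646).
normalized = {
    name: unicodedata.normalize("NFKC", ch) for name, ch in noon_forms.items()
}
distinct = set(normalized.values())

for name, ch in normalized.items():
    print(name, f"U+{ord(ch):04X}")
```

Whether the character set should keep the separate forms or only the base letters depends on how the labels are written, so this is only a way to check which code points are actually present.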
Thank you.
