-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement support for getting plural categories #167
base: master
Are you sure you want to change the base?
Conversation
i think this belongs in format module, dont think we should create a new essentially if it outputs text -> formatting this clearly belongs in the format module i would also prefer if this came from a single dict, if we create a .json file for each category it quickly gets out of hand, connectors.json only has 2 entries for example.... I would suggest this becomes a simple pluralization.json file, if you feel the need add categories as top level keys inside that single json file a .json file also does not scale well, so logic should be available, but thats for another future PR #37 #36 |
8e1fc52
to
9174cce
Compare
I moved functions into However, there is some problem. Some tests in I also added |
There's some boilerplate at the top of the other tests, which is required. Your tests pass once you paste it in there. It's this part: class TestPronounceNumber(unittest.TestCase):
def setUp(self):
self.old_lang = get_default_lang()
set_default_lang("sl-si")
def tearDown(self):
if self.old_lang:
set_default_lang(self.old_lang) Basically, there's no way to guarantee the order in which pytest will run tests. In order to be absolutely, 100% certain that the language you're testing is the default language at the moment the test runs, we set and then reset it in each test. That is, when a given test class is finished, the default language goes back to what it had been before the test ran. It's not a perfect solution, but it keeps the paradigm from breaking Pytest. Hence, at the moment, with that boilerplate missing, your new tests are trying to run with some other default language, one in which the function really hasn't been localized. In a perfect world, we'd just unload all languages after every test, so that this problem would throw a slightly more straightforward "no language module loaded." However, that didn't resolve the problem when I was writing the localizer. I should revisit that... |
9174cce
to
c28ffe1
Compare
return "other" | ||
|
||
elif type == "ordinal": | ||
if amount % 10 == 1 and amount % 100 != 11: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not the end of the world, but if you store amount % 10
and amount % 100
and use variables for these comparisons, it'll shave several operations off of the later cases.
Looks like some of the English and Slovenian res files got nuked when you renamed that file (hello, fellow regex person! 😁) |
nitpick, can |
Now having more time to review, I'm not sure Perhaps something like However, (@JarbasAl @jmontane) I wonder what implications this approach might have down the road for languages that need the "variant" framework. |
@JarbasAl bump re: compatibility with variants, possible need for integration (plural variants?) |
Let's chime in. As is you'd need to pass an english singular to get the localized plural. I guess there are some good cross-language arguments. The local function isn't used widely, so not a deep code. Why not step away from Please tell me if i'm missing something. |
This PR has been inactive for quite some time, so I kinda forgot about it, but I think it would still be useful. I think one of the things that weren't determined yet was how to name
A similar function to translate words already exists (and has been renamed to
In this case, you should use There is also
If you meant why If you meant why |
Hey there, I haven't looked at this deeply but definitely agree with the premise. We're very close to launching the Mark II so keeping our focus quite specifically on that, but I'd like to come back to this once we hit code freeze on the Mark II. I won't try to present an opinion on the questions raised until I've properly digested this. I would however say that I'm not opposed to more modules in LF beyond parse and format. Where the functions from this PR live is another question but very open to a |
I approve these changes and cherry picked these commits for ovos, so far i only did minor changes like adding an enum and porting the old singularize/pluralize prs for english/portuguese, but i still want to change a few more things like renaming "type" to something else to avoid the namespace collision with the builtin type etc. might be interesting to keep an eye on OpenVoiceOS#28 , for what its worth i think this PR is looking great already! |
Description
This implementes support for getting plural categories and prepared for future full pliralization functions. It also changes how
_translate_word
works to make it possible to get correct plural forms innice_duration
.I added those functions to new
utils
module, because I think they don't fit completely into existingparse
andformat
. If you think adding them to existing modules would be better, I can move them. Because of #160 I added emptyutils
modules for for all other languages.Format of translations in
res/text
is now changed to use JSON files that contain all translations for all plural forms. However, I kept existing format in most languages (except English and Slovenian) and added "legacy" translation function. That languages need to be converted in the future and legacy function can be deleted.Currently the only languages that implement plural categories are English and Slovenian, but there is a fallback to English-style pluralization for all other languages for backwards compatibility.
Type of PR
Testing
There are tests for English and Slovenian language.
Documentation
I added/fixed docstrings where needed and created documentation in readme and project structure files.