feature/pronounce_digits #150

ChanceNCounter · 2020-11-03T18:16:12Z

works for smaller numbers, but sucks as soon as they're big. *massive* WIP

JarbasAl · 2020-12-16T00:29:09Z

lingua_franca/format.py

+    treating each pair as a single number.
+
+    Examples:
+        >>> pronounce_number(127, all_digits=False)


This docstring example should call pronounce_digits, not pronounce_number.

JarbasAl · 2020-12-16T00:29:28Z

lingua_franca/lang/format_en.py

+        no_no_words = list(_SHORT_SCALE_EN.values())[:5]
+        no_no_words.append('and')
+        print(no_no_words)
+        print(result)


debug prints

JarbasAl · 2020-12-16T00:31:03Z

Add unit tests and this should be good. I'm not sure about the behaviour for all_digits=False; I bet there is some language where we will run into an issue, like Catalan having 4 ways to pronounce time.

ChanceNCounter · 2020-12-16T03:21:06Z

I expect the variant decorator will be able to deal with that, if and when the time comes. I'm still getting my arms all the way around it, but it seems like you covered all the bases.

ChanceNCounter · 2021-06-04T21:21:38Z

@krisgesling This obviously needs a rebase, but the code itself is merged and working at Chatterbox, so I think it's good to go if you agree. Just need to get around to the rebase.

krisgesling

Hey, it looks like a nice addition - thanks.

Before the rebase, can we address Jarbas's two comments above? I've also raised a few questions below.

I wrote a short suite of tests to help me think through different cases too - feel free to steal them or write new ones: e750f39
The commented-out tests show the output I expected; the passing version of each should be directly above it.

krisgesling · 2021-06-09T05:41:47Z

lingua_franca/lang/format_en.py

+            result.insert(0, pronounce_number_en(int(op_val)))
+        if is_float:
+            result.append(decimal_part)
+        no_no_words = list(_SHORT_SCALE_EN.values())[:5]


Why do we specifically care about the first 5 values? Is this just an optimisation because the chances of the rest being there are so slim?

ChanceNCounter · 2021-06-11

Because it slices 2 or 3 digits at a time, the rest can't be there. Right now, I'm trying to remember why I included anything but 'hundred'.

krisgesling · 2021-06-09T05:48:50Z

lingua_franca/lang/format_en.py

+        result = " ".join(result)
+    else:
+        while len(op_val) > 1:
+            idx = -2 if len(op_val) in [2, 4] else -3


Without first reading this code I wrote the following tests:

    self.assertEqual(pronounce_digits(238513096), "twenty three eighty five thirteen zero ninety six")
    self.assertEqual(pronounce_digits(238513696), "twenty three eighty five thirteen sixty nine six")

I like that you go from the end rather than beginning so the final numbers can be read closer to what they actually are - "ninety six".

However being a longer number, it ends up getting broken down into multiple groups of three so we get:

self.assertEqual(pronounce_digits(238513096), "two thirty eight five thirteen ninety six")

What's the intended outcome here?

If we're aiming for speaking in two digit numbers, should we check for an odd number length, speak the first digit and then speak all remaining pairs? Something like:

    if len(op_val) % 2 == 1:
        result.append(pronounce_number(op_val[0]))
        op_val = op_val[1:]
    remaining_pairs = # some code
    for pair in remaining_pairs:
        result.append(pronounce_number(pair))
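For reference, the "some code" step in that suggestion could look roughly like the sketch below. It only covers the splitting, not the pronunciation, and `split_into_pairs` is a hypothetical name that does not exist in lingua_franca. It works on the digit string rather than the number.

```python
def split_into_pairs(digits: str) -> list:
    """Split a digit string into two-digit groups.

    Hypothetical helper, not part of lingua_franca. If the length is odd,
    the first digit stands alone and the rest is read in pairs.
    """
    groups = []
    if len(digits) % 2 == 1:
        groups.append(digits[0])
        digits = digits[1:]
    # Walk the remaining (even-length) string two characters at a time.
    groups.extend(digits[i:i + 2] for i in range(0, len(digits), 2))
    return groups


print(split_into_pairs("238513096"))  # ['2', '38', '51', '30', '96']
```

A pair such as "09" would still need special handling before being passed to a number pronouncer, since converting it to an int drops the leading zero.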

ChanceNCounter · 2021-06-10 (edited)

It seems to be speaking in pairs slightly more often than intended. It doesn't really work on large numbers, but my intention was to "end with" three digit groupings in most cases, which just sounded most natural to me.

I'm gonna go over the code again top to bottom tomorrow, but the gist is:

123 -> "one twenty three"
1234 -> "twelve thirty four"
12345 -> "twelve three forty five"
123456 -> "one twenty three four fifty six"

It's definitely bugged on large numbers atm. The above should be followed by "one two thirty four five sixty seven", but I'm getting "twelve thirty four five sixty seven".
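The groupings above follow from the slicing rule in the diff (`idx = -2 if len(op_val) in [2, 4] else -3`). Here is a minimal sketch of just that rule, assuming the loop keeps peeling groups off the end of the digit string. `group_from_end` is a hypothetical name, and the handling of a leftover single digit is an assumption, since that part of the function is not shown in this thread.

```python
def group_from_end(digits: str) -> list:
    """Peel 2- or 3-digit groups off the end of a digit string.

    Hypothetical sketch of the slicing rule quoted in the diff; not the
    actual pronounce_digits implementation.
    """
    groups = []
    while len(digits) > 1:
        # Take a pair when exactly 2 or 4 digits remain, otherwise a triple.
        idx = -2 if len(digits) in (2, 4) else -3
        groups.insert(0, digits[idx:])
        digits = digits[:idx]
    if digits:
        # Assumption: a single leftover digit is spoken on its own.
        groups.insert(0, digits)
    return groups


for sample in ("123", "1234", "12345", "123456", "1234567"):
    print(sample, group_from_end(sample))
```

For "1234567" this yields ['12', '34', '567'], which corresponds to the "twelve thirty four five sixty seven" output reported above rather than the intended "one two thirty four five sixty seven".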

Once you're looking at 9+ digits, I don't think the function is much use without all_digits:

>>> assert(format.pronounce_digits(238513096, all_digits=True) == "two three eight five one three zero nine six")

(edit: "tomorrow" to commence mid-afternoon UTC)
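For comparison, the all_digits=True behaviour asserted above amounts to reading each digit independently. A minimal English-only sketch, assuming a non-negative integer input; `speak_all_digits` and `DIGIT_WORDS` are hypothetical names, not lingua_franca API:

```python
DIGIT_WORDS = ("zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine")


def speak_all_digits(number: int) -> str:
    """Read a non-negative integer one digit at a time (hypothetical helper)."""
    return " ".join(DIGIT_WORDS[int(ch)] for ch in str(number))


print(speak_all_digits(238513096))
# two three eight five one three zero nine six
```

Because no grouping is involved, zeroes are never dropped, which is why this mode stays usable for 9+ digit inputs.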

krisgesling · 2021-06-09T06:01:48Z

lingua_franca/lang/format_en.py

+        no_no_words.append('and')
+        print(no_no_words)
+        print(result)
+        result = [word for word in result if word.strip() not in no_no_words]


Is there any case where you think this might happen that we can test for? Or is it just a safety measure?

ChanceNCounter · 2021-06-11

This happens anytime the input is longer than two digits. The algorithm acts by running pronounce_number() on 2-3 digits at a time. This often returns the words 'hundred' and 'and'.

The latter stray debug print (=P) is the result prior to this operation:

>>> pronounce_digits(234534)
['two', 'hundred', 'and', 'thirty', 'four', 'five', 'hundred', 'and', 'thirty', 'four']
'two thirty four five thirty four'

pronounce_number(534), prepended with pronounce_number(234), sanitized.
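A self-contained sketch of that pronounce-then-sanitize step follows. `say_under_1000` is a hypothetical stand-in for pronounce_number_en, limited to 0..999 and assumed to emit "hundred" and "and" the way the debug output above shows. `speak_chunks` and `FILLER` are likewise illustrative names.

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]
FILLER = {"hundred", "and"}  # words stripped from the combined result


def say_under_1000(n: int) -> list:
    """Hypothetical stand-in for pronounce_number_en, for 0 <= n <= 999."""
    words = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        words += [ONES[hundreds], "hundred"]
        if rest:
            words.append("and")
    if rest or not hundreds:
        if rest < 20:
            words.append(ONES[rest])
        else:
            tens, ones = divmod(rest, 10)
            words.append(TENS[tens])
            if ones:
                words.append(ONES[ones])
    return words


def speak_chunks(chunks: list) -> str:
    """Pronounce each digit group, then drop the filler words."""
    words = []
    for chunk in chunks:
        words += say_under_1000(int(chunk))
    return " ".join(word for word in words if word not in FILLER)


print(speak_chunks(["234", "534"]))  # two thirty four five thirty four
```

Note that int("096") is 96, so a group with a leading zero loses it in this sketch, consistent with the "ninety six" ending seen for 238513096 earlier in the review.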

The strip, on the other hand, is probably unneeded.

ChanceNCounter added 2 commits November 3, 2020 10:13

create pronounce_digits

1ff2255

works for smaller numbers, but sucks as soon as they're big. *massive* WIP

gitignore more editor cruft

a58a716

devs-mycroft added the CLA: Yes Contributor License Agreement exists (see https://github.com/MycroftAI/contributors) label Nov 3, 2020

JarbasAl reviewed Dec 16, 2020

ChanceNCounter mentioned this pull request Dec 16, 2020

Proposed function: format.pronounce_currency() #166

Open

JarbasAl added a commit to HelloChatterbox/lingua-nostra that referenced this pull request May 9, 2021

https://github.com/MycroftAI/lingua-franca/pull/150

34d3f7d

krisgesling reviewed Jun 9, 2021

ChanceNCounter and others added 5 commits June 10, 2021 17:41

begin addressing review

bc14dae

Add tests for pronounce_digits

42c0aa5

rm rstrip (don't need with float())

06386dc

fix dropped zeroes, impr output, add 'casual' flag

fee093c

switch default to all_digits=True

2dff502

JarbasAl mentioned this pull request Jul 12, 2021

Handling Preceding Zeroes #204

Open

JarbasAl mentioned this pull request Nov 27, 2022

Feat/pronounce digits OpenVoiceOS/ovos-lingua-franca#41

Open
