Release v1.0.20230412
github-actions released this 12 Apr 22:08
1.0.20230412 (2023-04-12)
⚠ BREAKING CHANGES
- put network_to_echart where we can test it properly
Features
- add -a/--substring-alignments argument to cli (6b41213)
- add accessors for useful things like the input and output languages (cacce3b)
- add aligned cmudict and lexicon transducer type (596ab82)
- add alignments method to get textual alignments (e2303f4)
- add edges for alignments in lexicon (f2c9f6c)
- add proper typing to compose_indices (7bbfb6d)
- add type checking and use Tuples (as they can be type checked) (4780702)
- language name for spelling variants describe the variant (ffba389)
- make the use of None explicit and limited (97aaed5)
- make TransductionGraph and CompositeTransductionGraph compatible (e00790a)
- output monotonic alignments for deletions and reorderings (126aa83)
- properly normalize edges on concatenation (f37897c)
- shrink pickle by optimizing alignment storage (0860ad6)
- support lexicon mappings in Studio (but they are slow) (c824f6b)
- switch script to use phonetisaurus from PyPI (bb91b12)
Bug Fixes
- add spaces and avoid formatting (a5c2894)
- avoid crashing on empty edges (8d57e68)
- avoid creating None in input position (404306d)
- comment and clean up substring_alignments (9cd84d8)
- disable the utf8 fix for windows when running in pytest (bd5690a), closes #241
- do not call logging.basicConfig, just config the logger itself (8ff314f)
- emit input unchanged when no transducers exist (b0db10e)
- fix doctor (0b0f2ed)
- fix speed issues by not deep-copying alignments (56e933b)
- make pretty_edges consistent and fix tests to expect tuples (065fa23)
- make sure we do not output bogus edges (fab9f0a)
- most sensible possible behaviour, keep spaces if user wanted them (70ab1e6)
- remove impossible try/catch (2db239a)
- remove spaces in sanitize_unidecode_output as suggested by @littell (bd1b1ec)
- remove spontaneous extraneous spaces from und-ipa (9e64b7f)
- remove unnecessary default value (722215a)
- restore original edges API and rename alignments (c054256)
- switching back to Custom did not actually work (7f0f640)
- the only special character we want to escape is ? (7af2f0b)
- update treatment of deletions in lexicon to match rules (18bdc6b)
- use OrderedDict explicitly for clarity (d2ef567)
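
The logging fix listed above (8ff314f) reflects a common pattern for libraries: configure a named logger directly rather than calling logging.basicConfig, which changes the root logger of whatever application imports the library. A minimal sketch of that pattern follows; the logger name "g2p_example", the helper function, and the message format are hypothetical illustrations, not taken from the project's code.

```python
import logging


def get_library_logger(name: str = "g2p_example") -> logging.Logger:
    """Return a named logger configured locally, leaving the root logger alone.

    The name and format here are assumptions made for illustration only.
    """
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


LOGGER = get_library_logger()
LOGGER.info("configured without calling logging.basicConfig")
```

Because only the named logger is touched, an application that imports the library stays free to set up its own root logging however it likes.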
Documentation
- add documentation for lexicon mappings (dcf5973)
- add links to non-packaged files (9d6275c)
- clarify use of generic type (7bb7df6)
- clean up docstrings (91aa3b3)
Tests
- add alignment tests and improve coverage for transducers (76f85dd)
- add coverage of invalid regex in rule (bd81a70)
- add coverage to studio tests and app (0945336)
- add test of lexicon loading from config file (22de19b)
- fix studio test (31c9e48)
- long delay no longer necessary (33efc1e)
- make test_tokenizer.py exercise tce and unknown lang and default (1da815b)
- run the expensive doctor test because it can catch errors (bb60f55)
- update lexicon test for eng ipa (f05a513)
Code Refactoring
- add explicit b, m, p, u rules to moh for borrowed words (2dc5e42)
- put network_to_echart where we can test it properly (970e358)
- remove superfluous list comprehension (dd8f5df)
- test: when a mapping fails, show test case filename:lineno (fb309ec)
- tests: quiet yappy test suites (c6423b6)
Styles
- all other badges are rounded, why not the readme one? (ba76f57)
- rewrite moh_equiv and moh_to_ipa in compact form (c781cbe)