docs: add a basic migration guide to v2
Squash of:
fix(docs): info about removal of `g2p run`
docs: add notes about API
docs: more editor notes
docs: oh there is a new Emacs LSP client it seems
fix(docs): clarify commands, versions, and APIs
fix: where did that whitespace come from
fix: small tweaks
dhdaines authored and joanise committed Mar 19, 2024
1 parent 48ae50d commit cbfd608
Showing 3 changed files with 133 additions and 1 deletion.
12 changes: 11 additions & 1 deletion README.md
@@ -201,7 +201,17 @@ Once you have the Compiled DB, it is then possible to use the `g2p convert` comm

You can also run the `g2p Studio` which is a web interface for creating custom lookup tables to be used with g2p. To run the `g2p Studio` either visit https://g2p-studio.herokuapp.com/ or run it locally using `python run_studio.py`.

Alternatively, you can run the app from the command line: `g2p run`
## API for Developers

There is also a REST API available for use in your own applications.
To launch it from the command line, use `python run_studio.py` or
`flask --app g2p.app run`. The API documentation will be viewable
(with the ability to use it interactively) at
http://localhost:5000/docs and an OpenAPI definition is also available
at http://localhost:5000/static/swagger.json.

You can see the list of URLs served by the API using `flask --app
g2p.app routes`.

## Maintainers

121 changes: 121 additions & 0 deletions docs/migration-2.md
@@ -0,0 +1,121 @@
---
comments: true
---

# Migrating from `g2p` 1.x

The `g2p` 2.0 release introduces a number of improvements and changes
which, unfortunately, are incompatible with mappings and Python code
written for the previous version. We have tried to describe them here
with the changes you will need to make to your code and data.

## Mapping configurations have changed (for the better)

The configurations for mappings (which you'll find in
`g2p/mappings/langs/*/config-g2p.yaml`) are now validated with a [JSON
Schema](https://raw.githubusercontent.com/roedoejet/g2p/main/g2p/mappings/.schema/g2p-config-schema-2.0.json).
If you use an editor like [Visual Studio
Code](https://code.visualstudio.com/) with the [YAML
extension](https://marketplace.visualstudio.com/items?itemName=redhat.vscode-yaml),
the names of fields will be autocompleted and some warnings will be
shown for possible values. This also works with [GNU
Emacs](https://www.gnu.org/software/emacs/) using
[Eglot](https://joaotavora.github.io/eglot/) or
[lsp-mode](https://emacs-lsp.github.io/lsp-mode/) and any other editor
that supports the [Language Server
Protocol](https://microsoft.github.io/language-server-protocol/)
and/or [SchemaStore](https://www.schemastore.org/json/). Some
varieties of Vim are known to work, for instance.

In order for this magic to work, we needed to give the configuration
files a somewhat more meaningful name than `config.yaml`, so they must
now be called `config-g2p.yaml`. In addition, some fields have changed
names to reflect the fact that they refer to *files* and not the
actual rules themselves:

- `mapping` is now `rules_path`
- `abbreviations` is now `abbreviations_path`

The mappings themselves should be compatible with the previous
version. Please let us know if you encounter any problems.
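
If you have many configuration files to update, the key renames listed
above can be scripted. The following is only a minimal sketch and is not
part of `g2p`: the helper name `migrate_config_text` and the sample
configuration are made up for illustration, and it assumes that each key
sits on its own line of a block-style YAML file. It edits the text
directly instead of parsing the YAML, so comments and layout are kept.

```python
import re

# Old 1.x key -> new 2.0 key, as listed in this guide.
RENAMED_KEYS = {"mapping": "rules_path", "abbreviations": "abbreviations_path"}

# A key at the start of a line, optionally preceded by a YAML list dash.
# `mappings:` and the already-renamed keys do not match this pattern.
_KEY_RE = re.compile(
    r"^([ \t]*(?:-[ \t]+)?)(mapping|abbreviations)([ \t]*:)", re.MULTILINE
)


def migrate_config_text(text: str) -> str:
    """Rename the 1.x keys in the text of a config file (hypothetical helper)."""
    return _KEY_RE.sub(
        lambda m: m.group(1) + RENAMED_KEYS[m.group(2)] + m.group(3), text
    )


# Hypothetical sample configuration, for illustration only.
old = """\
mappings:
  - mapping: example.csv
    abbreviations: example_abbs.csv
"""
new = migrate_config_text(old)
print(new)
```

Running the helper on an already migrated file leaves it unchanged.
Remember that the file itself must also be renamed from `config.yaml`
to `config-g2p.yaml`.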

## Submodules of `g2p` must be imported explicitly

Previously, when you called `import g2p`, it imported absolutely
everything, which caused the command-line interface (and probably your
program too) to start up very, very slowly.

If you simply use the public and documented `make_g2p` and
`make_tokenizer` APIs, this will not change anything, but if you
relied on internal classes and functions from `g2p.mappings`,
`g2p.transducer`, etc., then you can no longer depend on them also
being accessible in the top-level `g2p` package. For example, you will
need to make this sort of change:

```diff
- from g2p import Mapping, Transducer
+ from g2p.mappings import Mapping
+ from g2p.transducer import Transducer
```

> **NOTE**: These are not public APIs, and are subject to further
> changes. This guide is provided as a courtesy to anyone who may have
> been using them and should not be construed as public API documentation.

## Mappings and rules use properties to access their fields

Along the same lines, access to the internal structure of rule-based
mappings has changed considerably (and for the better) due to the use
of [Pydantic](https://docs.pydantic.dev/latest/). This means,
however, that you can no longer treat them as the simple dictionaries
that they used to be. Instead, use properties, which correspond to the
names used in `config-g2p.yaml`.

For example, you can access the `case_sensitive` flag using the
property of the same name (note also that you can no longer construct
a `Mapping` by simply passing the name of the file):

```python
from g2p.mappings import Mapping

mapping = Mapping.load_from_file("path/to/some/config-g2p.yaml")
print("Case sensitive?", mapping.case_sensitive)
```

To iterate over the rules in a mapping, you now use the `rules`
property instead of the `mapping_data` field. The rules themselves
now also use properties for access, which do not entirely correspond
to the names used in the JSON/YAML definition, because `in`, for example,
is a reserved word in Python. So for instance you would make this
change:

```diff
- for rule in mapping["mapping_data"]:
-     print("Rule maps", rule["in"], "to", rule["out"])
+ for rule in mapping.rules:
+     print("Rule maps", rule.rule_input, "to", rule.rule_output)
```
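
If you have a lot of code that still expects the old dictionaries, a
small adapter may ease the transition. This is a sketch under
assumptions, not something provided by `g2p`: the helper name
`rules_as_dicts` is made up, and the `SimpleNamespace` objects are
stand-ins for a real `Mapping` so that the example runs without `g2p`
installed. It relies only on the `rules`, `rule_input` and `rule_output`
properties described above.

```python
from types import SimpleNamespace


def rules_as_dicts(mapping):
    """Return the rules of a 2.0-style mapping as 1.x-style dictionaries.

    Hypothetical helper: only the "in" and "out" keys are reproduced.
    """
    return [
        {"in": rule.rule_input, "out": rule.rule_output} for rule in mapping.rules
    ]


# Stand-in for a real Mapping object (an assumption for illustration).
fake_mapping = SimpleNamespace(
    rules=[
        SimpleNamespace(rule_input="a", rule_output="b"),
        SimpleNamespace(rule_input="c", rule_output="d"),
    ]
)

# Old-style code can keep indexing with "in" and "out".
for rule in rules_as_dicts(fake_mapping):
    print("Rule maps", rule["in"], "to", rule["out"])
```

With a real mapping, you would pass the object returned by
`Mapping.load_from_file` instead of `fake_mapping`.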

> **NOTE**: These are not public APIs, and are subject to further
> changes. This guide is provided as a courtesy to anyone who may have
> been using them and should not be construed as public API documentation.

## Some CLI commands no longer exist

Several commands for the `g2p` command-line interface have been removed
as they were duplicates of other functionality:

- `run`
- `routes`
- `shell`

To run the `g2p` API server, you can use:

    flask --app g2p.app run

Likewise, for `routes` and `shell`, you can use:

    flask --app g2p.app routes
    flask --app g2p.app shell

To run G2P Studio, use:

    python run_studio.py
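
If you have scripts that launch the removed commands, the replacements
above can be looked up programmatically. This is a minimal sketch, not
part of `g2p`: the names `REPLACEMENTS` and `replacement_for` are made
up for illustration, and it only builds the argument list without
running anything. Any extra options that your scripts passed to the old
commands are not translated here.

```python
from typing import List, Optional

# Removed `g2p` subcommands and the Flask commands given in this guide.
REPLACEMENTS = {
    "run": ["flask", "--app", "g2p.app", "run"],
    "routes": ["flask", "--app", "g2p.app", "routes"],
    "shell": ["flask", "--app", "g2p.app", "shell"],
}


def replacement_for(subcommand: str) -> Optional[List[str]]:
    """Return the new argument list for a removed subcommand, else None."""
    args = REPLACEMENTS.get(subcommand)
    # Copy the list so that callers cannot modify the table by accident.
    return list(args) if args is not None else None


for name in ("run", "routes", "shell", "convert"):
    print(name, "->", replacement_for(name))
```

The resulting list can be passed to `subprocess.run` as is. A result of
`None` means that the subcommand is not one of the removed ones.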
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -35,6 +35,7 @@ nav:
- Getting started: start.md
- How to contribute: contributing.md
- Using the g2p studio: studio.md
- Migrating from g2p 1.x: migration-2.md
- Reference:
- Package: package.md
- Command Line: cli.md
