Echo translation engine doesn't work for Paratext projects #396

pmachapman · 2024-05-30T00:14:55Z

This is likely low priority.

The echo translation engine only works with text files. As Scripture Forge now uploads Paratext zip files, we are no longer able to use Echo for testing purposes.

In addition, the language checks 'https://qa.serval-api.org/api/v1/translation/engine-types/echo/languages/en always return isNative: false for echo. Should these always return true, as echo supports any language?

The text was updated successfully, but these errors were encountered:

Enkidu93 · 2024-06-10T18:26:09Z

The most straightforward way to accomplish this is to abstract the alignment and inclusion/exclusion of data functionality from NmtPreprocessBuildJob into a separate class and share that with the Echo engine (or have both inherit/implement the same pipeline). This is something that would be more broadly useful, I think. This way the Echo engine could properly reflect the logic in NMT jobs and would be generally more useful for testing. Should we wait on this until the Machine ASP.NET code is moved over to the Serval repo? @ddaspit @johnml1135 What do you guys think? How important is this?

The switch to a default of true is trivial and I think that would make more sense.

ddaspit · 2024-06-10T19:48:49Z

My plan was that once we move the Machine engine to the Serval repo, we could create a library for shared engine functionality. So, yes, we should wait until we move the Machine engine to the Serval repo.

Enkidu93 · 2024-06-28T16:25:42Z

@johnml1135 Doesn't look like the machine engine has been moved over yet; can we move this back to another release?

Enkidu93 · 2024-08-12T16:47:09Z

Waiting on #451

Enkidu93 · 2024-10-11T15:41:21Z

@ddaspit, do you think it would be appropriate to move some of the logic from PreprocessBuildJob into the service toolkit so that the Echo engine can use it? I'm just not sure how to make that work in a way that maintains a separation between the toolkit and Serval/Machine - or is that degree of separation not a priority? I can just duplicate code, but if we want the Echo engine to be a usable testing engine, we'll need to be conscientious about updating the Echo engine as we make updates to preprocessing particularly. It would be nice if they could just share code so we're always certain that we're testing something as realistic as possible.

ddaspit · 2024-10-14T16:32:11Z

Yes, it would be good to move the preprocessing logic to the service toolkit. In what way would it be difficult to main a separation between the toolkit and Serval?

Enkidu93 · 2024-10-14T16:42:05Z

Yes, it would be good to move the preprocessing logic to the service toolkit. In what way would it be difficult to main a separation between the toolkit and Serval?

OK. I guess more the machine engine than Serval - I'm thinking that it'd be difficult in regard to the machine engine corpus data models. Is it OK to have those data models be shared and if so, where should they live? I could also just move as much logic as possible to the toolkit without sharing those data models (e.g., things like the AlignScripture functions), but that would still leave a good bit of logic for the Echo engine to have to duplicate.

ddaspit · 2024-10-14T16:47:57Z

I believe that those models are not database models. Is that correct? If I remember correctly, they are basically models that are used to pass the configuration to the preprocessing logic, so it would make sense to move them to the toolkit as config models.

Enkidu93 · 2024-10-14T17:43:37Z

I believe that those models are not database models. Is that correct? If I remember correctly, they are basically models that are used to pass the configuration to the preprocessing logic, so it would make sense to move them to the toolkit as config models.

No, they're not; that's correct. OK. And then we can have the machine engine import those models 👍 .

pmachapman added the bug Something isn't working label May 30, 2024

johnml1135 assigned johnml1135 and Enkidu93 and unassigned johnml1135 Jun 3, 2024

Enkidu93 mentioned this issue Jul 29, 2024

Fix Echo issues #444

Merged

Enkidu93 linked a pull request Oct 16, 2024 that will close this issue

Move preprocess logic to toolkit #512

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Echo translation engine doesn't work for Paratext projects #396

Echo translation engine doesn't work for Paratext projects #396

pmachapman commented May 30, 2024

Enkidu93 commented Jun 10, 2024

ddaspit commented Jun 10, 2024

Enkidu93 commented Jun 28, 2024

Enkidu93 commented Aug 12, 2024

Enkidu93 commented Oct 11, 2024

ddaspit commented Oct 14, 2024

Enkidu93 commented Oct 14, 2024

ddaspit commented Oct 14, 2024

Enkidu93 commented Oct 14, 2024

Echo translation engine doesn't work for Paratext projects #396

Echo translation engine doesn't work for Paratext projects #396

Comments

pmachapman commented May 30, 2024

Enkidu93 commented Jun 10, 2024

ddaspit commented Jun 10, 2024

Enkidu93 commented Jun 28, 2024

Enkidu93 commented Aug 12, 2024

Enkidu93 commented Oct 11, 2024

ddaspit commented Oct 14, 2024

Enkidu93 commented Oct 14, 2024

ddaspit commented Oct 14, 2024

Enkidu93 commented Oct 14, 2024