Finalize data contracts for all languages #548

Open
2 tasks done
andrewtavis opened this issue Jan 9, 2025 · 1 comment
Labels
feature New feature or request help wanted Extra attention is needed

Comments

@andrewtavis
Member

Terms

Description

This issue is a continuation of scribe-org/Scribe-Android#269, where we discussed the possibility of using data contracts in the end applications. The goal now is to finalize the contracts so that they can be transferred to the end applications via an update of Scribe-Data: the version of Scribe-Data on Scribe-Server is updated, and the contracts are then exported and sent to the end applications.

Scribe-Data's exact role in this process still needs to be finalized.

Note that the current version of the data contracts can be found in src/scribe_data/wikidata/data-contracts. We now need to match the English, Swedish and German verb contracts with the end application data. This requires the data process for German and English to be expanded a bit, as they don't yet have all of their end application conjugations. What's needed is to expand the contract to include the various parts of the data that we get in the appropriate places in the same way as the data was formatted in the past.

Contribution

Happy to discuss how Scribe-Data will fit into this process and then help with implementation and review! 😊

@andrewtavis added the "feature" and "help wanted" labels Jan 9, 2025
@andrewtavis
Member Author

CC @angrezichatterbox and @axif0 👋 Specifically I want to draw your attention to the following line:

What's needed is to expand the contract to include the various parts of the data that we get in the appropriate places in the same way as the data was formatted in the past.

We used to have really expansive and hard-to-maintain Python formatting processes in Scribe-Data, and we should avoid going back to those at all costs. The problem is that we need conjugations for English verbs, but language_data_extraction/english/verbs/query_verbs.sparql only has the simple forms that are actually required. We could write the English verbs contract like the following, though, and use it as a template for further contracts that expand the data we get to cover all of its use cases:

{
    ...
    "conjugations": {
        "1": {
            "title": "Present",
            "1": { "I": "simplePresent" },
            "2": { "you": "simplePresent" },
            "3": { "he/she/it": "simplePresentThirdPersonSingular" },
            "4": { "we": "simplePresent" },
            "5": { "you all": "simplePresent" },
            "6": { "they": "simplePresent" }
        },
        "2": {
            "title": "Pr. Perfect",
            "1": { "I": "have pastParticiple" },
            "2": { "you": "have pastParticiple" },
            "3": { "he/she/it": "has pastParticiple" },
            "4": { "we": "have pastParticiple" },
            "5": { "you all": "have pastParticiple" },
            "6": { "they": "have pastParticiple" }
        },
        ...
    }
}

The Pr. Perfect entries above produce forms like "I have gone", etc. Basically, we'd split each contract value on spaces and then try to convert each of the resulting parts into the matching form from the data. We could also use a list for the final value, like ["have", "pastParticiple"], but I find the version above a bit more readable 🤔
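To make the split-on-spaces idea concrete, here's a minimal sketch of how a contract value could be resolved. The function names and the `GO_FORMS` data are hypothetical and not part of Scribe-Data. It also assumes that any part that doesn't name a form in the data (like "have") is a literal word to be kept as is.

```python
from typing import Any, Dict, List, Union

# Hypothetical form data for one verb, keyed by the form names used in the contract.
GO_FORMS: Dict[str, str] = {
    "simplePresent": "go",
    "simplePresentThirdPersonSingular": "goes",
    "pastParticiple": "gone",
}


def resolve_template(template: Union[str, List[str]], forms: Dict[str, str]) -> str:
    """Replace each part that names a known form; keep all other parts as literals."""
    # Accept both the space-separated string and the list variant of a contract value.
    parts = template.split() if isinstance(template, str) else template
    return " ".join(forms.get(part, part) for part in parts)


def resolve_tense(tense: Dict[str, Any], forms: Dict[str, str]) -> Dict[str, str]:
    """Map each pronoun in one tense block of the contract to its conjugated form."""
    resolved: Dict[str, str] = {}
    for key, value in tense.items():
        if key == "title":  # The title is display text, not a conjugation entry.
            continue
        for pronoun, template in value.items():
            resolved[pronoun] = resolve_template(template, forms)
    return resolved


present_perfect = {
    "title": "Pr. Perfect",
    "1": {"I": "have pastParticiple"},
    "3": {"he/she/it": "has pastParticiple"},
}

print(resolve_tense(present_perfect, GO_FORMS))
# {'I': 'have gone', 'he/she/it': 'has gone'}
```

One caveat with this approach: a form that's missing from the data passes through silently as the literal name (e.g. "have pastParticiple"), so a real implementation would probably want to validate the parts against the known form names.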

Let me know what your thoughts are!

@andrewtavis changed the title from "Finalize data contracts and" to "Finalize data contracts for all languages" Jan 11, 2025
Projects
Status: Todo