Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Another multilingual approach #177

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

richard-rogers
Copy link
Contributor

@richard-rogers richard-rogers commented Oct 30, 2023

  • Dictionary mapping language code to config
  • Disable metrics for unsupported languages
  • Metric modules no longer init() on import!
  • Separate prompt/response configuration parameters
  • Translator interface for polyglotting English-only metrics

@richard-rogers richard-rogers force-pushed the dev/richard/multilingual3 branch 2 times, most recently from dd6269b to 824c027 Compare October 31, 2023 18:29
.bumpversion.cfg Outdated Show resolved Hide resolved

def sentiment_nltk(text: str) -> float:
if _sentiment_analyzer is None:
def sentiment_nltk(text: str, sentiment_analyzer) -> float:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should continue supporting the use of the modules as standalone functions. It looks like this change would prevent the user from simply calling sentiment_nltk("I am very sad"), right?

I think this comment applies to other modules. At least for toxicity, I think it's the same case.

Can we require only the text values for the standalone functions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a way to have a single function support multiple languages without the extra parameter. There's a few ways it could be curried, but logically it needs to be there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having an extra parameter is ok, but requiring the user to pass a SentimentIntensityAnalyzer, for example, looks very complex. Couldn't we have the user be able to pass an optional language parameter that defaults to english?

we can choose to drop the single function support, but I personally find it very useful.

pass


init()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal of the init() call in the import of these modules: (textstat, sentiment, regexes, etc) is a significant breaking change in behavior and will break existing LangKit integrations that imported these modules directly.

Can we provide a back-compat friendly version of the language support changes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants