Accept char iterator as input #4

Open
wants to merge 1 commit into
base: master

Conversation

tmfink
Contributor

@tmfink tmfink commented Sep 16, 2020

Work-in-progress to handle #3.

Please let me know your thoughts on this style. So far I have only worked on katakana_to_hiragana(). Once we figure out the best way to do this, I will work on the other modules.

Add iterator input APIs to

  • is_hiragana
    • is_hiragana()
  • is_japanese
    • is_japanese()
  • is_kana
    • is_kana()
  • is_kanji
    • contains_kanji()
    • is_kanji()
  • is_katakana
    • is_katakana()
  • is_mixed
    • is_mixed()
    • is_mixed_pass_kanji()
  • is_romaji
    • is_romaji()
  • to_hiragana
    • to_hiragana()
    • to_hiragana_with_opt()
  • to_kana
    • to_kana()
    • to_kana_with_opt()
  • to_katakana
    • to_katakana()
    • to_katakana_with_opt()
  • to_romaji
    • to_romaji()
    • to_romaji_with_opt()
  • tokenize
    • tokenize()
    • tokenize_detailed()
    • tokenize_with_opt()
  • trim_okurigana
    • is_invalid_matcher()
    • is_leading_without_initial_kana()
    • is_trailing_without_final_kana()
    • trim_okurigana()
    • trim_okurigana_with_opt()
  • utils
    • hiragana_to_katakana()
    • katakana_to_hiragana()
    • romaji_to_hiragana()

Pull out body of katakana_to_hiragana_with_opt() into separate function
that takes a char iterator as input.
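
As a rough illustration of the split described in this commit message, the string-based function can become a thin wrapper around a function that accepts any char iterator. This is a minimal sketch, not the code from this PR: the names below are simplified stand-ins, the options argument is left out, and the conversion only shifts code points (it ignores the long vowel mark).

```rust
/// Hypothetical iterator-based core (simplified; no options, no long vowel handling).
fn katakana_iter_to_hiragana(input: impl Iterator<Item = char>) -> String {
    input
        .map(|c| match c {
            // Katakana U+30A1..=U+30F6 sit exactly 0x60 above the matching hiragana.
            'ァ'..='ヶ' => char::from_u32(c as u32 - 0x60).unwrap_or(c),
            // Everything else passes through unchanged.
            _ => c,
        })
        .collect()
}

/// The existing &str entry point only forwards to the iterator version.
fn katakana_to_hiragana(input: &str) -> String {
    katakana_iter_to_hiragana(input.chars())
}
```

With this shape, callers that already hold a `&str` keep the old call, while callers with any other `Iterator<Item = char>` (for example a filtered or chained iterator) can use the core function directly without building an intermediate `String`.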
@PSeitz
Owner

PSeitz commented Sep 16, 2020

There is a place where the next char is peeked:

    } else if is_romaji(input) || input.chars().next().map(|c| is_char_english_punctuation(c)).unwrap_or(false) {
        // TODO: is it correct to check only the first char (see src\utils\isCharEnglishPunctuation.js)
        romaji_to_hiragana(input, config)

So it would require a peekable iterator I think
https://doc.rust-lang.org/std/iter/struct.Peekable.html

There may be other similar cases.
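
For reference, a minimal sketch of how `std::iter::Peekable` allows the first-char check above without consuming the char. The function names here are hypothetical, and `is_english_punctuation` is only a stand-in (using `char::is_ascii_punctuation`) for the crate's `is_char_english_punctuation`, whose exact rules may differ.

```rust
use std::iter::Peekable;

// Hypothetical stand-in for the crate's is_char_english_punctuation().
fn is_english_punctuation(c: char) -> bool {
    c.is_ascii_punctuation()
}

/// Checks the first char without consuming it, so the same iterator
/// can still be handed to the conversion function afterwards.
fn starts_with_english_punctuation<I>(input: &mut Peekable<I>) -> bool
where
    I: Iterator<Item = char>,
{
    // peek() returns Option<&char>; an empty input yields false.
    input.peek().map_or(false, |&c| is_english_punctuation(c))
}
```

A caller would wrap its iterator once with `.peekable()` and pass `&mut` references to it; the peeked char is still returned by the next call to `next()`.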

@PSeitz
Owner

PSeitz commented Sep 30, 2020

On second thought, we could always just use the minimal api instead of a uniform one, which would not require Peekable here. What do you think?

@tmfink
Contributor Author

tmfink commented Oct 1, 2020

> On second thought, we could always just use the minimal api instead of a uniform one, which would not require Peekable here. What do you think?

@PSeitz What do you mean by minimal vs. uniform API?

In some cases, we can get away with avoiding the "direct indexing" approach. For katakana_iter_to_hiragana_with_opt(), I added a previous_char variable to track the previous character (instead of indexing into the previous index).
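
The previous_char idea can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the function name is simplified, `trailing_vowel` is a hypothetical and deliberately incomplete lookup, and the rule that a long vowel mark repeats the vowel of the preceding kana is assumed here only to show why the previous character is needed.

```rust
/// Illustrative, incomplete lookup: the vowel that ends a hiragana syllable.
fn trailing_vowel(hiragana: char) -> Option<char> {
    match hiragana {
        'あ' | 'か' | 'さ' | 'た' | 'な' | 'は' | 'ま' | 'や' | 'ら' | 'わ' | 'ぱ' => Some('あ'),
        'い' | 'き' | 'し' | 'ち' | 'に' | 'ひ' | 'み' | 'り' => Some('い'),
        'う' | 'く' | 'す' | 'つ' | 'ぬ' | 'ふ' | 'む' | 'ゆ' | 'る' => Some('う'),
        'え' | 'け' | 'せ' | 'て' | 'ね' | 'へ' | 'め' | 'れ' => Some('え'),
        'お' | 'こ' | 'そ' | 'と' | 'の' | 'ほ' | 'も' | 'よ' | 'ろ' => Some('お'),
        _ => None,
    }
}

fn katakana_iter_to_hiragana(input: impl Iterator<Item = char>) -> String {
    let mut output = String::new();
    // Remembers the last converted char, replacing a lookup like chars[index - 1].
    let mut previous_char: Option<char> = None;
    for c in input {
        let converted = match c {
            // Long vowel mark: reuse the previous char's vowel, or keep the mark
            // when there is no previous char or no known vowel for it.
            'ー' => previous_char.and_then(trailing_vowel).unwrap_or(c),
            // Katakana U+30A1..=U+30F6 sit 0x60 above the matching hiragana.
            'ァ'..='ヶ' => char::from_u32(c as u32 - 0x60).unwrap_or(c),
            _ => c,
        };
        output.push(converted);
        previous_char = Some(converted);
    }
    output
}
```

Because the loop only looks backwards, a plain `Iterator<Item = char>` is enough here; no indexing and no `Peekable` wrapper is required.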

@PSeitz
Owner

PSeitz commented Oct 1, 2020

> @PSeitz What do you mean by minimal vs. uniform API?

I mean defining a single iterator API that is used by all methods, vs. a minimal API everywhere.
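
One possible reading of that distinction, sketched with hypothetical function names that are not part of the crate: in the uniform style every function accepts the same, most capable input type (here `&mut Peekable<I>`), while in the minimal style each function asks only for the capability it actually uses.

```rust
use std::iter::Peekable;

// Uniform style: every function takes the same input type,
// even when it never needs to peek.
fn is_ascii_uniform<I: Iterator<Item = char>>(input: &mut Peekable<I>) -> bool {
    input.all(|c| c.is_ascii())
}

fn starts_with_punctuation_uniform<I: Iterator<Item = char>>(input: &mut Peekable<I>) -> bool {
    input.peek().map_or(false, |c| c.is_ascii_punctuation())
}

// Minimal style: a function that only walks forward accepts any char iterator.
fn is_ascii_minimal(mut input: impl Iterator<Item = char>) -> bool {
    input.all(|c| c.is_ascii())
}
```

The trade-off in this sketch is that the uniform signatures are predictable but force callers to wrap every input in `.peekable()`, whereas the minimal signatures accept more inputs at the cost of differing from function to function.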
