-
Notifications
You must be signed in to change notification settings - Fork 168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Korean Revised Romanization to hangeul IME #716
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few silly, nitpicking comments about code style for now. We'll go over the code later.
[ 'b', 'ᄇ' ], | ||
[ 'ᄇp', 'ᄈ' ], | ||
[ 'ᄉs', 'ᄊ' ], | ||
// [ '\'', 'ᄋ'], // Apostrophe can be written to represent silent ᄋ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks unnecessary.
// This version does not support context rules, but we don't need them | ||
patterns: function(input, context) { | ||
var patterns, regex, rule, replacement, i, result; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too many empty lines. There should be one.
|
||
// This regex matches jamo that form a syllable so they can be combined | ||
var jamoRegex = /([ᄀ-ᄒ])([ᅡ-ᅵ])([ᆨ-ᇂ])?([ᄀ-ᄒ]|[\- '])(.*)$/; | ||
if (jamoRegex.test(result)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We require spaces inside parentheses.
// This regex matches jamo that form a syllable so they can be combined | ||
var jamoRegex = /([ᄀ-ᄒ])([ᅡ-ᅵ])([ᆨ-ᇂ])?([ᄀ-ᄒ]|[\- '])(.*)$/; | ||
if (jamoRegex.test(result)) { | ||
return { noop: false, output: result.replace(jamoRegex, combineJamo) }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, too, spaces inside parentheses.
// Conjoining jamo behavior is defined by this Unicode standard | ||
// https://www.unicode.org/versions/Unicode13.0.0/ch03.pdf#G24646 | ||
// parameter `final` is optional | ||
function combineJamo(substring, initial, vowel, final, nextSyllableInitial, otherChars) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, too, spaces inside parentheses.
var syllable = String.fromCharCode(syllableNo); | ||
|
||
const disambig = /[\- ']/; | ||
if (nextSyllableInitial.match(disambig)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, too, spaces inside parentheses.
const disambig = /[\- ']/; | ||
if (nextSyllableInitial.match(disambig)) { | ||
return syllable; | ||
} else if (otherChars.match(disambig)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here.
@@ -1274,6 +1278,10 @@ | |||
autonym: 'ಕನ್ನಡ', | |||
inputmethods: [ 'kn-transliteration', 'kn-inscript', 'kn-kgp', 'kn-inscript2' ] | |||
}, | |||
kor: { | |||
autonym: '한국어', | |||
inputmethods: [ 'kor-rr' ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually give them an identified that is based on the shortest language code, so it should be "ko" and not "kor".
description: 'Korean RR test', | ||
inputmethod: 'kor-rr', | ||
tests: [ | ||
// Note that RR is meant to romanize from hangul to latin script, but not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Capitalize "Hangul" and "Latin".
]; | ||
|
||
var koreanRR = { | ||
id: 'kor-rr', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually give them an identified that is based on the shortest language code, so it should be "ko" and not "kor".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this patch and it works as intended. @annedrewhu Thanks a lot for your contributions and apologies for the delay in reviewing!
Follow-up to wikimedia#716. * Rename the "kor-rr" identifier to "ko-rr" and "kor" to "ko": We use two-letter language codes when they are available. * Whitespace clean-up.
Follow-up to wikimedia#716. * Rename the "kor-rr" identifier to "ko-rr" and "kor" to "ko": We use two-letter language codes when they are available. * Whitespace clean-up.
Follow-up to #716. * Rename the "kor-rr" identifier to "ko-rr" and "kor" to "ko": We use two-letter language codes when they are available. * Whitespace clean-up. Co-authored-by: SrishAkaTux <[email protected]>
Any Korean transliteration keyboard will have a lot of edge cases to check because of differences in romanization between revised romanization, the older McCune–Reischauer, and even a different system included in the RR standard. I've added a modest amount of tests based on this official RR document, but I'm open to adding more before merging.