Add Korean Revised Romanization to hangeul IME #716

annedrewhu · 2023-02-06T03:42:54Z

Any Korean transliteration keyboard will have a lot of edge cases to check because of differences in romanization between Revised Romanization (RR), the older McCune–Reischauer, and even a different system included in the RR standard. I've added a modest number of tests based on this official RR document, but I'm open to adding more before merging.

amire80

Just a few silly, nitpicking comments about code style for now. We'll go over the code later.

amire80 · 2023-02-06T06:39:28Z

rules/kor/kor-rr.js

+		[ 'b', 'ᄇ' ],
+		[ 'ᄇp', 'ᄈ' ],
+		[ 'ᄉs', 'ᄊ' ],
+		// [ '\'', 'ᄋ'],  // Apostrophe can be written to represent silent ᄋ


Looks unnecessary.

amire80 · 2023-02-06T06:40:31Z

rules/kor/kor-rr.js

+		// This version does not support context rules, but we don't need them
+		patterns: function(input, context) {
+			var patterns, regex, rule, replacement, i, result;
+


Too many empty lines. There should be one.

amire80 · 2023-02-06T06:40:59Z

rules/kor/kor-rr.js

+
+					// This regex matches jamo that form a syllable so they can be combined
+					var jamoRegex = /([ᄀ-ᄒ])([ᅡ-ᅵ])([ᆨ-ᇂ])?([ᄀ-ᄒ]|[\- '])(.*)$/;
+					if (jamoRegex.test(result)) {


We require spaces inside parentheses.
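
For reference, here is a minimal sketch of what the pattern above captures, written with the requested spaces inside parentheses. It assumes the ranges in the diff are the Unicode conjoining jamo blocks (initials U+1100 to U+1112, vowels U+1161 to U+1175, finals U+11A8 to U+11C2) and spells them as `\u` escapes. `describeMatch` is a hypothetical helper for illustration and is not part of the patch.

```javascript
// Same shape as the pattern in the diff, with the ranges written as \u escapes
// (assumption: the literal characters in the diff are these code points).
var jamoRegex = /([\u1100-\u1112])([\u1161-\u1175])([\u11A8-\u11C2])?([\u1100-\u1112]|[\- '])(.*)$/;

// Hypothetical helper: name the capture groups so they are easier to read.
function describeMatch( text ) {
	var match = jamoRegex.exec( text );
	if ( !match ) {
		return null;
	}
	return {
		initial: match[ 1 ],
		vowel: match[ 2 ],
		final: match[ 3 ], // undefined when the syllable has no final consonant
		next: match[ 4 ], // initial of the next syllable, or a disambiguator
		rest: match[ 5 ]
	};
}

// Initial + vowel + final, followed by the next syllable's initial.
var complete = describeMatch( '\u1112\u1161\u11AB\u1100' );

// Initial + vowel only: group 4 is mandatory, so nothing matches yet.
var pending = describeMatch( '\u1112\u1161' );
```

Because the fourth group is not optional, the pattern matches only once the next initial (or a hyphen, space, or apostrophe) is present.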

amire80 · 2023-02-06T06:41:24Z

rules/kor/kor-rr.js

+					// This regex matches jamo that form a syllable so they can be combined
+					var jamoRegex = /([ᄀ-ᄒ])([ᅡ-ᅵ])([ᆨ-ᇂ])?([ᄀ-ᄒ]|[\- '])(.*)$/;
+					if (jamoRegex.test(result)) {
+						return { noop: false, output: result.replace(jamoRegex, combineJamo) };


Here, too, spaces inside parentheses.

amire80 · 2023-02-06T06:41:42Z

rules/kor/kor-rr.js

+	// Conjoining jamo behavior is defined by this Unicode standard
+	// https://www.unicode.org/versions/Unicode13.0.0/ch03.pdf#G24646
+	// parameter `final` is optional
+	function combineJamo(substring, initial, vowel, final, nextSyllableInitial, otherChars) {


Here, too, spaces inside parentheses.
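
For readers unfamiliar with the Unicode chapter linked in the comment, here is a minimal sketch of the arithmetic it defines for composing conjoining jamo into a precomposed Hangul syllable. This is not the PR's `combineJamo`: `composeSyllable` and the constant names are hypothetical, and the sketch assumes its inputs are already valid conjoining jamo.

```javascript
// Constants from the Unicode Standard, chapter 3 (Hangul syllable composition).
var S_BASE = 0xAC00, // first precomposed syllable
	L_BASE = 0x1100, // first leading consonant (initial)
	V_BASE = 0x1161, // first vowel
	T_BASE = 0x11A7, // one before the first trailing consonant (final)
	V_COUNT = 21,
	T_COUNT = 28;

// Hypothetical helper; `final` is optional, mirroring the comment in the diff.
function composeSyllable( initial, vowel, final ) {
	var lIndex = initial.charCodeAt( 0 ) - L_BASE,
		vIndex = vowel.charCodeAt( 0 ) - V_BASE,
		tIndex = final ? final.charCodeAt( 0 ) - T_BASE : 0;

	return String.fromCharCode(
		S_BASE + ( lIndex * V_COUNT + vIndex ) * T_COUNT + tIndex
	);
}

// U+1112 + U+1161 + U+11AB compose to U+D55C.
var han = composeSyllable( '\u1112', '\u1161', '\u11AB' );
// U+1100 + U+1161 with no final compose to U+AC00.
var ga = composeSyllable( '\u1100', '\u1161' );
```

All precomposed syllables are in the Basic Multilingual Plane, so `String.fromCharCode` is sufficient here.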

amire80 · 2023-02-06T06:42:54Z

rules/kor/kor-rr.js

+		var syllable = String.fromCharCode(syllableNo);
+
+		const disambig = /[\- ']/;
+		if (nextSyllableInitial.match(disambig)) {


Here, too, spaces inside parentheses.

amire80 · 2023-02-06T06:43:07Z

rules/kor/kor-rr.js

+		const disambig = /[\- ']/;
+		if (nextSyllableInitial.match(disambig)) {
+			return syllable;
+		} else if (otherChars.match(disambig)) {


amire80 · 2023-02-06T06:44:15Z

src/jquery.ime.inputmethods.js

@@ -1274,6 +1278,10 @@
 			autonym: 'ಕನ್ನಡ',
 			inputmethods: [ 'kn-transliteration', 'kn-inscript', 'kn-kgp', 'kn-inscript2' ]
 		},
+		kor: {
+			autonym: '한국어',
+			inputmethods: [ 'kor-rr' ]


We usually give them an identifier that is based on the shortest language code, so it should be "ko" and not "kor".
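
As a sketch of the requested rename, the entry would look roughly like this with the two-letter code. The `languages` wrapper object is hypothetical and only makes the fragment self-contained, and the `ko-rr` method identifier assumes the input method is renamed to match.

```javascript
// Hypothetical standalone object; in the patch this entry sits in the
// language list in src/jquery.ime.inputmethods.js.
var languages = {
	ko: {
		autonym: '한국어',
		inputmethods: [ 'ko-rr' ]
	}
};
```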

amire80 · 2023-02-06T06:44:41Z

test/jquery.ime.test.fixtures.js

+		description: 'Korean RR test',
+		inputmethod: 'kor-rr',
+		tests: [
+			// Note that RR is meant to romanize from hangul to latin script, but not


Capitalize "Hangul" and "Latin".

amire80 · 2023-02-06T06:45:54Z

rules/kor/kor-rr.js

+	];
+
+	var koreanRR = {
+		id: 'kor-rr',


We usually give them an identifier that is based on the shortest language code, so it should be "ko" and not "kor".

srish

I tested this patch and it works as intended. @annedrewhu Thanks a lot for your contributions and apologies for the delay in reviewing!

Follow-up to #716.
* Rename the "kor-rr" identifier to "ko-rr" and "kor" to "ko": We use two-letter language codes when they are available.
* Whitespace clean-up.

Co-authored-by: SrishAkaTux <[email protected]>

amire80 reviewed Feb 6, 2023

annedrewhu added 9 commits February 26, 2023 16:31

Create kor-rr.js

3af46c7

Initial Korean RR broken code and tests

cf65587

Fixed off by one bug

d5b3741

Fix bugs

8454adc

Fix tests, add capital T and K for stressed consonants

0df0fd5

Allow dd initial aspirated t

e4f9ca4

Style fixes

7f08d37

More style fixes

cb32895

Fix combineJamo bug

2b2c06a

amire80 force-pushed the korean-rr branch from 2efe5d0 to 2b2c06a February 26, 2023 14:31

srish added 2 commits October 2, 2024 16:21

Merge branch 'master' into korean-rr

26114fc

Merge branch 'master' into korean-rr

cf5cf10

srish approved these changes Oct 3, 2024

srish merged commit 7150436 into wikimedia:master Oct 3, 2024
3 checks passed

amire80 mentioned this pull request Oct 22, 2024

Clean up Korean input method #802

Merged
