Commit

Update regex in semantic.ts to include number matching
shtse8 committed Apr 17, 2024
1 parent 7b96129 commit 800c1ff
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/semantic.ts
@@ -185,7 +185,7 @@ export function semanticWords(text: string, concatCjk = false): string[] {
// Construct the regular expression dynamically based on the concatCjk flag.
// This regex pattern aims to match Latin vocabulary words or CJK characters (grouped or not based on concatCjk).
// The use of non-capturing groups (?:) and 'ug' flags ensures global matching of all occurrences in Unicode mode.
- const regex = new RegExp(`${regexMap.latinVocab}|${regexMap.cjk}${concatCjk ? '+' : ''}`, 'ug');
+ const regex = new RegExp(`${regexMap.latinVocab}|${regexMap.number}+|${regexMap.cjk}${concatCjk ? '+' : ''}`, 'ug');

// Use matchAll to find all matches for the regex in the text, then map to extract the matched strings.
// This approach is streamlined for clarity and performance, directly converting the iterable from matchAll into an array of strings.
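
To illustrate the effect of the added `${regexMap.number}+` alternative, here is a minimal sketch. The contents of `regexMap` are not shown in this diff, so the three patterns below are hypothetical stand-ins, and `semanticWordsSketch` is an illustrative name, not the function in src/semantic.ts. The sketch also uses `String.prototype.match` with the `g` flag rather than `matchAll`; for a pattern without capture-group handling, both give the same list of matched strings.

```typescript
// Hypothetical stand-ins for regexMap; the real patterns in src/semantic.ts
// are not visible in this commit and may differ.
const regexMap = {
  latinVocab: "\\p{Script=Latin}+", // assumption: a run of Latin letters
  number: "\\p{N}", // assumption: a single numeric character
  cjk: "[\\p{Script=Han}\\p{Script=Hiragana}\\p{Script=Katakana}]", // assumption
};

// Illustrative re-creation of the changed line, not the project's function.
function semanticWordsSketch(text: string, concatCjk = false): string[] {
  // Alternation order mirrors the commit: Latin words, then number runs,
  // then CJK characters (grouped into runs only when concatCjk is true).
  const regex = new RegExp(
    `${regexMap.latinVocab}|${regexMap.number}+|${regexMap.cjk}${concatCjk ? "+" : ""}`,
    "ug"
  );
  // With the 'g' flag, match() returns every full match, or null if none.
  return text.match(regex) || [];
}

console.log(semanticWordsSketch("abc 123 漢字"));
console.log(semanticWordsSketch("abc 123 漢字", true));
```

Under these assumed patterns, a digit run such as `123` comes back as one token. Without the number alternative, digits would match neither the Latin nor the CJK branch and would be dropped from the result.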