Using Ack with UTF data is now a FAQ, as we have had a second query (one UTF-16, one UTF-8 multibyte / wide character). See ack3:153 and ack3:222.
AFAIK,
Ack does not (yet) honor $LOCALE etc. for the files it scans; the assumed problem domain is ASCII program source code, not natural language.
Ack3 is serendipitously able to process UTF-8 files whose non-ASCII content is limited to Latin-1 characters (European accented characters), which covers most cases of UTF-8 in e.g. Perl source code.
Ack cannot process UTF-8 multibyte / wide character data (everything non-European), and cannot process files saved as UCS/UTF-16/UTF-32 (even if they contain only Latin-1 characters).
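To make the UTF-16/UTF-32 limitation concrete, here is a minimal Python sketch of what byte-oriented matching sees when a file is stored in a wide encoding. It illustrates byte-level searching in general and is not a model of Ack's internals; `byte_search` is a hypothetical helper name.

```python
import re


def byte_search(pattern: str, text: str, file_encoding: str) -> bool:
    """Search for an ASCII pattern in `text` as it would sit on disk, byte for byte."""
    raw = text.encode(file_encoding)  # the bytes a byte-oriented scanner would read
    return re.search(pattern.encode("ascii"), raw) is not None


if __name__ == "__main__":
    line = "print 'hello';"
    for enc in ("ascii", "utf-8", "utf-16-le", "utf-32-be"):
        # ASCII and UTF-8 store this line as identical bytes, so the pattern is found.
        # UTF-16 and UTF-32 pad each character with NUL bytes, so it is not.
        print(enc, byte_search("print", line, enc))
```

Even pure ASCII text stops matching once it is saved as UTF-16 or UTF-32, because the pattern's bytes are no longer adjacent in the file.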
We have a workaround (in the issues cited above) for processing files with multibyte characters as the appropriate UTF, provided all files can be processed the same way (ASCII and UTF-8 intermingle OK, but UTF-16LE and UTF-16BE do not unless they have a BOM). However, it requires Perl $OLD_PERL_VERSION < 5.029000: the use of Encoding on sysread is fatally deprecated in 5.30 (5.29+), which defeats the workaround, and it already warns in 5.24-5.28.
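The "intermingle" condition can be illustrated with a short Python sketch. It is not the Perl workaround from the cited issues; `decode_utf16` and its `fallback` parameter are hypothetical names used only to show why a BOM matters when little-endian and big-endian files are mixed.

```python
import codecs


def decode_utf16(raw: bytes, fallback: str = "utf-16-le") -> str:
    """Decode UTF-16 bytes, trusting a BOM when present, else a fixed byte order."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The generic "utf-16" codec reads the BOM, picks the byte order, and strips it.
        return raw.decode("utf-16")
    # Without a BOM there is nothing to go on, so one byte order is assumed for all files.
    return raw.decode(fallback)


if __name__ == "__main__":
    # ASCII text is already valid UTF-8, so the two can share one setting.
    print("ack".encode("ascii") == "ack".encode("utf-8"))

    le = codecs.BOM_UTF16_LE + "ack".encode("utf-16-le")
    be = codecs.BOM_UTF16_BE + "ack".encode("utf-16-be")
    print(decode_utf16(le), decode_utf16(be))  # both readable thanks to the BOM

    # A BOM-less big-endian file read as little-endian decodes without error
    # but yields unrelated characters instead of "ack".
    print(decode_utf16("ack".encode("utf-16-be")) == "ack")
```

With a BOM, both byte orders can pass through the same code path; without one, a single assumed byte order silently mangles the files that use the other.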
Linux does not accept UTF-16 as a global locale. Weird but true.
The reason for not immediately adding UTF decoding after our sysread, selected by global locale, command-line flag, or file BOM (byte order mark), is the combinatorial explosion of test cases for our test suite. We won't ship the feature unless we know it does not harm the functionality people rely on.
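For discussion, the three inputs named above could be combined roughly as in the Python sketch below. Everything here is an assumption made for illustration: the precedence order (flag, then BOM, then locale), the Latin-1 fallback, and the name `choose_encoding` are not taken from Ack.

```python
import codecs
from typing import Optional

# UTF-32 BOMs are checked first: the UTF-32LE BOM begins with the UTF-16LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def choose_encoding(
    head: bytes,
    flag: Optional[str] = None,
    locale_encoding: Optional[str] = None,
) -> str:
    """Pick a decoder name from an explicit flag, the file's BOM, or the locale."""
    if flag:  # assumed: an explicit user choice wins
        return flag
    for bom, name in _BOMS:  # assumed: the file's own marker comes next
        if head.startswith(bom):
            return name
    # Assumed fallback: Latin-1 maps every byte to a character, so decoding never fails.
    return locale_encoding or "latin-1"


if __name__ == "__main__":
    print(choose_encoding(codecs.BOM_UTF16_BE + b"\x00a"))
    print(choose_encoding(b"plain text", locale_encoding="utf-8"))
    print(choose_encoding(codecs.BOM_UTF8 + b"x", flag="utf-16-le"))
```

Even this small function has three interacting inputs and five BOM variants, and each combination would need coverage against files that decode correctly and files that do not, which gives a sense of how quickly the test matrix grows.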