Corrections and additions for Punjabi Shahmukhi (Arabic script) #57

Open
wants to merge 1 commit into
base: main

Conversation


@bgo-eiu bgo-eiu commented Dec 21, 2022

These updates correct the set of characters used for the language. When I have time, I will also add sample text that mirrors the Punjabi Gurmukhi file. Thank you.


google-cla bot commented Dec 21, 2022

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Contributor

moyogo commented Aug 22, 2023

@bgo-eiu The data originally comes from the CLDR https://github.com/unicode-org/cldr/blob/aa0dd5acb1e68876c730e90ef6cb43cfc4d7006f/common/main/pa_Arab.xml#L32-L36.

These changes look like they are replacing ڻ, used in Panjabi [pa], with ݨ, which is used in Saraiki [skr] or Western Panjabi [pnb], and adding ࣇ, which is used in Western Panjabi [pnb].
(Note that Panjabi [pa] in ISO 639-1 is called "Eastern Panjabi" in Ethnologue and Panjabi [pan] in ISO 639-3.)
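
Because these letters look alike and are easy to confuse in a diff, it may help to print their code points and Unicode names. The following is a minimal sketch, assuming the glyphs discussed here are U+06BB, U+0768, U+08C7, and U+0646 (my reading of the thread, not something taken from the data files); `describe` is a hypothetical helper name.

```python
import unicodedata

# Characters mentioned in this thread, written as escapes to avoid
# bidirectional-rendering confusion. The mapping from the glyphs in the
# discussion to these code points is an assumption.
CHARS = ["\u06BB", "\u0768", "\u08C7", "\u0646"]


def describe(ch):
    """Return (code point, Unicode name) for a single character."""
    # Fall back to a placeholder when this Python's Unicode database
    # predates the character (U+08C7 was added in Unicode 13.0).
    name = unicodedata.name(ch, "<not in this Unicode database>")
    return "U+{:04X}".format(ord(ch)), name


for ch in CHARS:
    code_point, name = describe(ch)
    print(code_point, name)
```

Running this against the exemplar sets in the old and new versions of the file would show exactly which code points the pull request adds or removes.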

Should these changes go in a separate language data file for Western Panjabi, pnb_Arab.textproto?

If this is indeed for Eastern Panjabi [pan], do you have a reference?
Elena Bashir and Thomas J. Conners, with Brook Hefright, A Descriptive Grammar of Hindko, Panjabi, and Saraiki, 2019, p. 65 indicates ݨ is used in Saraiki and ن is used in Panjabi.

Lorna Priest Evans and M. G. Abbas Malik, Proposal to encode ARABIC LETTER LAM WITH SMALL ARABIC LETTER TAH ABOVE in the UCS, 2019 shows ݨ being used in Punjabi, but it does indicate that Western Punjabi uses the Arabic script (Shahmukhi) and that it hasn’t been standardized.

I’m leaning towards creating a separate file for Western Panjabi [pnb].
