Showing 28 changed files with 56,596 additions and 7,832 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
name: CLDF-validation | ||
|
||
on: | ||
push: | ||
branches: [ main ] | ||
pull_request: | ||
branches: [ main ] | ||
|
||
jobs: | ||
build: | ||
|
||
runs-on: ubuntu-latest | ||
strategy: | ||
matrix: | ||
python-version: ["3.10"] | ||
|
||
steps: | ||
- uses: actions/checkout@v4 | ||
- name: Set up Python ${{ matrix.python-version }} | ||
uses: actions/setup-python@v5 | ||
with: | ||
python-version: ${{ matrix.python-version }} | ||
- name: Install dependencies | ||
run: | | ||
python -m pip install --upgrade pip | ||
pip install pytest-cldf | ||
- name: Test with pytest | ||
run: | | ||
pytest --cldf-metadata=cldf/StructureDataset-metadata.json test.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
{ | ||
"creators": [ | ||
{ | ||
"name": "Louise Baird" | ||
}, | ||
{ | ||
"name": "Nicholas Evans" | ||
}, | ||
{ | ||
"name": "Simon J. Greenhill" | ||
} | ||
], | ||
"contributors": [ | ||
{ | ||
"name": "Tiago Tresoldi", | ||
"type": "Other" | ||
}, | ||
{ | ||
"name": "Johann-Mattis List", | ||
"type": "Other" | ||
}, | ||
{ | ||
"name": "Robert Forkel", | ||
"type": "Other" | ||
} | ||
], | ||
"title": "CLDF dataset with phoneme inventories from the \"Journal of the IPA\", aggregated by Baird et al. (2021)", | ||
"access_right": "open", | ||
"keywords": [ | ||
"cldf:StructureDataset", | ||
"linguistics" | ||
], | ||
"upload_type": "dataset", | ||
"description": "<p>Cite the source of the dataset as:</p>\n\n<blockquote>\n<p>Baird, L., Evans, N., & Greenhill, S. J. (2021). Blowing in the wind: Using 'North Wind and the Sun' texts to sample phoneme inventories. Journal of the International Phonetic Association, 1\u201342. doi:10.1017/s002510032000033x</p>\n</blockquote>", | ||
"license": { | ||
"id": "CC0-1.0" | ||
} | ||
} |
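The metadata above lists the same six people as the contributors table in this commit. As a minimal sketch of how the two relate, the snippet below derives (name, role) pairs from the `creators` and `contributors` keys. The helper name `people` and the mapping of creators to the role "Author" are assumptions made for illustration; this is not necessarily how the table was generated.

```python
import json

# Excerpt of the Zenodo metadata shown above (only the keys used here).
ZENODO_JSON = """
{
  "creators": [
    {"name": "Louise Baird"},
    {"name": "Nicholas Evans"},
    {"name": "Simon J. Greenhill"}
  ],
  "contributors": [
    {"name": "Tiago Tresoldi", "type": "Other"},
    {"name": "Johann-Mattis List", "type": "Other"},
    {"name": "Robert Forkel", "type": "Other"}
  ]
}
"""


def people(metadata):
    """Return (name, role) pairs: creators first, then contributors.

    Hypothetical helper: creators are labelled "Author" (an assumption),
    contributors keep the value of their "type" key.
    """
    rows = [(c["name"], "Author") for c in metadata.get("creators", [])]
    rows += [(c["name"], c.get("type", "")) for c in metadata.get("contributors", [])]
    return rows


PEOPLE = people(json.loads(ZENODO_JSON))
for name, role in PEOPLE:
    print(f"| {name} | {role} |")
```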
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
# Contributors | ||
|
||
| Name | Role | | ||
|:-------------------|:-------| | ||
| Louise Baird | Author | | ||
| Nicholas Evans | Author | | ||
| Simon J. Greenhill | Author | | ||
| Tiago Tresoldi | other | | ||
| Johann-Mattis List | other | | ||
| Robert Forkel | other | |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,29 @@ | ||
# JIPA | ||
# CLDF dataset with phoneme inventories from the "Journal of the IPA", aggregated by Baird et al. (2021) | ||
|
||
CLDF dataset with phoneme inventories from the *Journal of the International Phonetic Association*. Aggregated by Baird et al. 2021. | ||
[![CLDF validation](https://github.com/cldf-datasets/jipa/workflows/CLDF-validation/badge.svg)](https://github.com/cldf-datasets/jipa/actions?query=workflow%3ACLDF-validation) | ||
|
||
* Baird L, Evans N, & Greenhill SJ. 2021. Blowing in the wind: Using 'North Wind and the Sun' texts to sample phoneme inventories. *Journal of the International Phonetic Association*, 1–42. [doi:10.1017/s002510032000033x](https://doi.org/10.1017/s002510032000033x) | ||
## How to cite | ||
|
||
If you use these data please cite | ||
- the original source | ||
> Baird, L., Evans, N., & Greenhill, S. J. (2021). Blowing in the wind: Using 'North Wind and the Sun' texts to sample phoneme inventories. Journal of the International Phonetic Association, 1–42. doi:10.1017/s002510032000033x | ||
- the derived dataset using the DOI of the [particular released version](../../releases/) you were using | ||
|
||
## Description | ||
|
||
|
||
This dataset is licensed under a CC0-1.0 license | ||
|
||
Available online at https://doi.org/10.1017/S002510032000033x | ||
|
||
|
||
|
||
Languages represented in the dataset, color-coded by language family. | ||
|
||
![](map.svg) | ||
|
||
## CLDF Datasets | ||
|
||
The following CLDF datasets are available in [cldf](cldf): | ||
|
||
- CLDF [StructureDataset](https://github.com/cldf/cldf/tree/master/modules/StructureDataset) at [cldf/StructureDataset-metadata.json](cldf/StructureDataset-metadata.json) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
# Releasing the JIPA CLDF dataset | ||
|
||
- Install requirements: | ||
```shell | ||
pip install cldfviz[cartopy] | ||
``` | ||
- Re-create the CLDF dataset by running | ||
```shell | ||
cldfbench makecldf cldfbench_jipa.py --glottolog-version v5.0 --with-cldfreadme --with-zenodo | ||
cldfbench readme cldfbench_jipa.py | ||
``` | ||
- Make sure the data is valid by running | ||
```shell | ||
pytest | ||
``` | ||
- Make sure data can be loaded into SQLite | ||
```shell | ||
rm -f jipa.sqlite | ||
cldf createdb cldf/StructureDataset-metadata.json jipa.sqlite | ||
``` | ||
- Recreate the coverage map | ||
```shell | ||
cldfbench cldfviz.map cldf --format svg --width 20 --output map.svg --with-ocean --language-properties Family --no-legend --pacific-centered | ||
``` | ||
- Recreate the ER diagram | ||
```shell | ||
cldferd --format compact.svg cldf > erd.svg | ||
``` | ||
- Commit all changes, tag the release, push code and tags. | ||
- Create a release on GitHub and make sure it is picked up by Zenodo. |
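The SQLite step above only confirms that loading does not fail. A possible follow-up check, sketched here with Python's standard `sqlite3` module, is to count the rows in every table of the resulting database. The helper names `table_row_counts` and `empty_tables` are hypothetical. Table names are read from `sqlite_master` rather than assumed, since the exact schema is determined by `cldf createdb`.

```python
import sqlite3


def table_row_counts(conn):
    """Return {table_name: row_count} for every table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    counts = {}
    for name in names:
        # Quote the identifier so unusual table names are handled safely.
        quoted = '"' + name.replace('"', '""') + '"'
        counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
    return counts


def empty_tables(conn):
    """Return the names of tables that contain no rows."""
    return [name for name, n in table_row_counts(conn).items() if n == 0]
```

After the `cldf createdb` step, calling `table_row_counts(sqlite3.connect("jipa.sqlite"))` would give counts that can be compared by eye with the `dc:extent` values in the CLDF README.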
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,109 @@ | ||
<a name="ds-structuredatasetmetadatajson"> </a> | ||
|
||
# StructureDataset CLDF dataset with phoneme inventories from the "Journal of the IPA", aggregated by Baird et al. (2021) | ||
|
||
**CLDF Metadata**: [StructureDataset-metadata.json](./StructureDataset-metadata.json) | ||
|
||
**Sources**: [sources.bib](./sources.bib) | ||
|
||
property | value | ||
--- | --- | ||
[dc:bibliographicCitation](http://purl.org/dc/terms/bibliographicCitation) | Baird, L., Evans, N., & Greenhill, S. J. (2021). Blowing in the wind: Using 'North Wind and the Sun' texts to sample phoneme inventories. Journal of the International Phonetic Association, 1–42. doi:10.1017/s002510032000033x | ||
[dc:conformsTo](http://purl.org/dc/terms/conformsTo) | [CLDF StructureDataset](http://cldf.clld.org/v1.0/terms.rdf#StructureDataset) | ||
[dc:identifier](http://purl.org/dc/terms/identifier) | https://doi.org/10.1017/S002510032000033x | ||
[dc:license](http://purl.org/dc/terms/license) | https://creativecommons.org/publicdomain/zero/1.0/ | ||
[dcat:accessURL](http://www.w3.org/ns/dcat#accessURL) | https://github.com/cldf-datasets/jipa | ||
[prov:wasDerivedFrom](http://www.w3.org/ns/prov#wasDerivedFrom) | <ol><li><a href="https://github.com/cldf-clts/clts/tree/v2.3.0">Catalog v2.3.0</a></li><li><a href="https://github.com/cldf-datasets/jipa/tree/a1c4dcf">cldf-datasets/jipa a1c4dcf</a></li><li><a href="https://github.com/glottolog/glottolog/tree/v5.0">Glottolog v5.0</a></li></ol> | ||
[prov:wasGeneratedBy](http://www.w3.org/ns/prov#wasGeneratedBy) | <ol><li><strong>python</strong>: 3.10.12</li><li><strong>python-packages</strong>: <a href="./requirements.txt">requirements.txt</a></li></ol> | ||
[rdf:ID](http://www.w3.org/1999/02/22-rdf-syntax-ns#ID) | jipa | ||
[rdf:type](http://www.w3.org/1999/02/22-rdf-syntax-ns#type) | http://www.w3.org/ns/dcat#Distribution | ||
|
||
|
||
## <a name="table-valuescsv"></a>Table [values.csv](./values.csv) | ||
|
||
property | value | ||
--- | --- | ||
[dc:conformsTo](http://purl.org/dc/terms/conformsTo) | [CLDF ValueTable](http://cldf.clld.org/v1.0/terms.rdf#ValueTable) | ||
[dc:extent](http://purl.org/dc/terms/extent) | 6660 | ||
|
||
|
||
### Columns | ||
|
||
Name/Property | Datatype | Description | ||
--- | --- | --- | ||
[ID](http://cldf.clld.org/v1.0/terms.rdf#id) | `string`<br>Regex: `[a-zA-Z0-9_\-]+` | Primary key | ||
[Language_ID](http://cldf.clld.org/v1.0/terms.rdf#languageReference) | `string` | References [languages.csv::ID](#table-languagescsv) | ||
[Parameter_ID](http://cldf.clld.org/v1.0/terms.rdf#parameterReference) | `string` | References [features.csv::ID](#table-featurescsv) | ||
[Value](http://cldf.clld.org/v1.0/terms.rdf#value) | `string` | | ||
[Code_ID](http://cldf.clld.org/v1.0/terms.rdf#codeReference) | `string` | | ||
[Comment](http://cldf.clld.org/v1.0/terms.rdf#comment) | `string` | | ||
[Source](http://cldf.clld.org/v1.0/terms.rdf#source) | list of `string` (separated by `;`) | References [sources.bib::BibTeX-key](./sources.bib) | ||
[Contribution_ID](http://cldf.clld.org/v1.0/terms.rdf#contributionReference) | `string` | References [contributions.csv::ID](#table-contributionscsv) | ||
`Marginal` | `boolean` | | ||
`Allophones` | list of `string` (separated by ` `) | | ||
`InventorySize` | `integer` | | ||
`Value_in_Source` | `string` | | ||
|
||
## <a name="table-featurescsv"></a>Table [features.csv](./features.csv) | ||
|
||
property | value | ||
--- | --- | ||
[dc:conformsTo](http://purl.org/dc/terms/conformsTo) | [CLDF ParameterTable](http://cldf.clld.org/v1.0/terms.rdf#ParameterTable) | ||
[dc:extent](http://purl.org/dc/terms/extent) | 956 | ||
|
||
|
||
### Columns | ||
|
||
Name/Property | Datatype | Description | ||
--- | --- | --- | ||
[ID](http://cldf.clld.org/v1.0/terms.rdf#id) | `string`<br>Regex: `[a-zA-Z0-9_\-]+` | Primary key | ||
[Name](http://cldf.clld.org/v1.0/terms.rdf#name) | `string` | | ||
[Description](http://cldf.clld.org/v1.0/terms.rdf#description) | `string` | | ||
`CLTS_BIPA` | `string` | | ||
`CLTS_Name` | `string` | | ||
|
||
## <a name="table-languagescsv"></a>Table [languages.csv](./languages.csv) | ||
|
||
property | value | ||
--- | --- | ||
[dc:conformsTo](http://purl.org/dc/terms/conformsTo) | [CLDF LanguageTable](http://cldf.clld.org/v1.0/terms.rdf#LanguageTable) | ||
[dc:extent](http://purl.org/dc/terms/extent) | 159 | ||
|
||
|
||
### Columns | ||
|
||
Name/Property | Datatype | Description | ||
--- | --- | --- | ||
[ID](http://cldf.clld.org/v1.0/terms.rdf#id) | `string`<br>Regex: `[a-zA-Z0-9_\-]+` | Primary key | ||
[Name](http://cldf.clld.org/v1.0/terms.rdf#name) | `string` | | ||
[Macroarea](http://cldf.clld.org/v1.0/terms.rdf#macroarea) | `string` | | ||
[Latitude](http://cldf.clld.org/v1.0/terms.rdf#latitude) | `decimal`<br>≥ -90<br>≤ 90 | | ||
[Longitude](http://cldf.clld.org/v1.0/terms.rdf#longitude) | `decimal`<br>≥ -180<br>≤ 180 | | ||
[Glottocode](http://cldf.clld.org/v1.0/terms.rdf#glottocode) | `string`<br>Regex: `[a-z0-9]{4}[1-9][0-9]{3}` | | ||
[ISO639P3code](http://cldf.clld.org/v1.0/terms.rdf#iso639P3code) | `string`<br>Regex: `[a-z]{3}` | | ||
`Family` | `string` | | ||
`Glottolog_Name` | `string` | | ||
|
||
## <a name="table-contributionscsv"></a>Table [contributions.csv](./contributions.csv) | ||
|
||
property | value | ||
--- | --- | ||
[dc:conformsTo](http://purl.org/dc/terms/conformsTo) | [CLDF ContributionTable](http://cldf.clld.org/v1.0/terms.rdf#ContributionTable) | ||
[dc:extent](http://purl.org/dc/terms/extent) | 159 | ||
|
||
|
||
### Columns | ||
|
||
Name/Property | Datatype | Description | ||
--- | --- | --- | ||
[ID](http://cldf.clld.org/v1.0/terms.rdf#id) | `string`<br>Regex: `[a-zA-Z0-9_\-]+` | Primary key | ||
[Name](http://cldf.clld.org/v1.0/terms.rdf#name) | `string` | | ||
[Description](http://cldf.clld.org/v1.0/terms.rdf#description) | `string` | | ||
[Contributor](http://cldf.clld.org/v1.0/terms.rdf#contributor) | `string` | | ||
[Citation](http://cldf.clld.org/v1.0/terms.rdf#citation) | `string` | | ||
`URL` | `string` | | ||
[Source](http://cldf.clld.org/v1.0/terms.rdf#source) | list of `string` (separated by `;`) | References [sources.bib::BibTeX-key](./sources.bib) | ||
[Comment](http://cldf.clld.org/v1.0/terms.rdf#comment) | `string` | | ||
`Metadata` | `json` | | ||
`Minimal_Pairs` | `json` | | ||
|
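The column descriptions for `values.csv` above imply some type conversion when the file is read as plain CSV: `Marginal` is a boolean, `Allophones` is a space-separated list, `InventorySize` is an integer, and `Source` is a `;`-separated list. The following is a minimal sketch using only the standard library. The sample rows and the helper name `parse_value_row` are hypothetical, and the `true`/`false` spelling of booleans is an assumption (the CSVW default), not something stated in this commit.

```python
import csv
import io

# Hypothetical sample rows with a subset of the columns of values.csv.
SAMPLE = """ID,Language_ID,Parameter_ID,Value,Marginal,Allophones,InventorySize,Source
lang1_p,lang1,p,p,false,p pʰ,25,key2021
lang1_x,lang1,x,x,true,,25,
"""


def parse_value_row(row):
    """Convert one raw CSV row (a dict of strings) into typed fields."""
    return {
        "ID": row["ID"],
        "Language_ID": row["Language_ID"],
        "Parameter_ID": row["Parameter_ID"],
        "Value": row["Value"],
        # Assumed serialization: "true"/"false"; anything else becomes None.
        "Marginal": {"true": True, "false": False}.get(row["Marginal"]),
        # Whitespace-separated list; an empty cell yields an empty list.
        "Allophones": row["Allophones"].split(),
        "InventorySize": int(row["InventorySize"]) if row["InventorySize"] else None,
        # Semicolon-separated list of BibTeX keys; empty entries are dropped.
        "Source": [key for key in row["Source"].split(";") if key],
    }


rows = [parse_value_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
```

For the real dataset, a CLDF-aware reader such as `pycldf` would apply these conversions from the metadata file; the sketch only makes the datatypes in the table concrete.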