dictionaries

Collection of normalized and installable hunspell dictionaries.

What is this?

This monorepo is a collection of scripts that crawl dictionaries from several sources, normalize them, and pack them so that each one can be installed and used in the same way. The dictionaries are not maintained here, but they can be used from here.

When should I use this?

These packages are mainly useful to programmers who integrate dictionaries with other tools (such as nodehun or nspell) or who make such tools.

Install

These packages are ESM only. In Node.js (version 16+), install with npm:

npm install dictionary-en

👉 Note: replace en with the language code you want.
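As the names in the list of dictionaries suggest, each package name is `dictionary-` followed by the lowercased language code. A minimal sketch of that mapping follows. `packageName` is a hypothetical helper written for illustration and is not part of these packages.

```javascript
// Hypothetical helper: map a language code to the npm package name.
// Assumption: names are `dictionary-` plus the lowercased code, as in the
// list of dictionaries (for example `dictionary-sr-latn`).
function packageName(code) {
  if (typeof code !== 'string' || code.trim() === '') {
    throw new TypeError('Expected a non-empty language code')
  }

  return 'dictionary-' + code.trim().toLowerCase()
}

console.log(packageName('en')) // dictionary-en
console.log(packageName('sr-Latn')) // dictionary-sr-latn
```

If the package is installed, the resulting name could be passed to a dynamic `import()` to load a dictionary chosen at runtime.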

⚠️ Important: this project itself is MIT, but each index.dic and index.aff file still has its original license!

Use

import en from 'dictionary-en'

console.log(en)
// To do: use `en` somehow

Yields:

{aff: <Buffer>, dic: <Buffer>}
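Both fields hold the raw bytes of the hunspell files. As a rough sketch of what can be read from them without a spell checker: in the hunspell file format, the affix file can name its character encoding on a `SET` line, and the first line of the word list is an approximate entry count. The `describe` helper and the tiny sample buffers below are made up for illustration; in practice you would pass the imported `en` object instead.

```javascript
// Hypothetical helper: summarize a `{aff, dic}` dictionary object.
function describe(dictionary) {
  // A `SET <encoding>` line in the affix file names the character encoding.
  const aff = dictionary.aff.toString('latin1')
  const match = /^SET[ \t]+(\S+)/m.exec(aff)

  // The first line of the word list is an approximate entry count.
  const newline = dictionary.dic.indexOf(0x0a)
  const end = newline === -1 ? dictionary.dic.length : newline
  const firstLine = dictionary.dic.subarray(0, end).toString('latin1')

  return {
    encoding: match ? match[1] : undefined,
    entries: Number.parseInt(firstLine, 10)
  }
}

// Made-up sample data standing in for `import en from 'dictionary-en'`.
const sample = {
  aff: Buffer.from('SET UTF-8\nSFX S Y 1\nSFX S 0 s .\n'),
  dic: Buffer.from('2\ncat/S\ndog/S\n')
}

console.log(describe(sample))
```

Only the first line of the word list is decoded, so the sketch stays cheap even for large dictionaries.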

List of dictionaries

👉 Note: preferred BCP-47 codes are used (according to Unicode CLDR). To illustrate, as American English and Brazilian Portuguese are the most common types of English and Portuguese respectively, they get the codes en and pt.

In total 92 dictionaries are provided.

Name Description License
dictionary-bg Bulgarian (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-br Breton (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-ca Catalan (GPL-2.0 OR LGPL-2.1)
dictionary-ca-valencia Catalan (Valencia) (GPL-2.0 OR LGPL-2.1)
dictionary-cs Czech GPL-2.0
dictionary-cy Welsh LGPL-3.0
dictionary-da Danish (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-de German (GPL-2.0 OR GPL-3.0)
dictionary-de-at German (Austria) (GPL-2.0 OR GPL-3.0)
dictionary-de-ch German (Switzerland) (GPL-2.0 OR GPL-3.0)
dictionary-el Greek (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-el-polyton Greek (Polyton) GPL-3.0
dictionary-en English (MIT AND BSD)
dictionary-en-au English (Australia) (MIT AND BSD)
dictionary-en-ca English (Canada) (MIT AND BSD)
dictionary-en-gb English (United Kingdom) (MIT AND BSD)
dictionary-en-za English (South Africa) LGPL-2.1
dictionary-eo Esperanto GPL-2.0
dictionary-es Spanish (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-ar Spanish (Argentina) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-bo Spanish (Bolivia) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-cl Spanish (Chile) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-co Spanish (Colombia) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-cr Spanish (Costa Rica) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-cu Spanish (Cuba) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-do Spanish (Dominican Republic) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-ec Spanish (Ecuador) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-gt Spanish (Guatemala) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-hn Spanish (Honduras) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-mx Spanish (Mexico) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-ni Spanish (Nicaragua) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-pa Spanish (Panama) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-pe Spanish (Peru) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-ph Spanish (Philippines) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-pr Spanish (Puerto Rico) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-py Spanish (Paraguay) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-sv Spanish (El Salvador) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-us Spanish (United States of America) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-uy Spanish (Uruguay) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-es-ve Spanish (Venezuela) (GPL-3.0 OR LGPL-3.0 OR MPL-1.1)
dictionary-et Estonian LGPL-2.1
dictionary-eu Basque GPL-2.0
dictionary-fa Persian Apache-2.0
dictionary-fo Faroese (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-fr French MPL-2.0
dictionary-fur Friulian GPL-2.0
dictionary-fy Western Frisian GPL-3.0
dictionary-ga Irish GPL-2.0
dictionary-gd Scottish Gaelic GPL-3.0
dictionary-gl Galician GPL-3.0
dictionary-he Hebrew AGPL-3.0
dictionary-hr Croatian (LGPL-2.1 OR SISSL)
dictionary-hu Hungarian (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-hy Armenian (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-hyw Western Armenian (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-ia Interlingua GPL-3.0
dictionary-ie Interlingue Apache-2.0
dictionary-is Icelandic CC-BY-SA-3.0
dictionary-it Italian GPL-3.0
dictionary-ka Georgian MIT
dictionary-ko Korean (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-la Latin GPL-2.0
dictionary-lb Luxembourgish EUPL-1.1
dictionary-lt Lithuanian BSD-3-Clause
dictionary-ltg Latgalian LGPL-2.1
dictionary-lv Latvian LGPL-2.1
dictionary-mk Macedonian GPL-3.0
dictionary-mn Mongolian LPPL-1.3c
dictionary-nb Norwegian Bokmål GPL-2.0
dictionary-nds Low German GPL-3.0
dictionary-ne Nepali LGPL-2.1
dictionary-nl Dutch (BSD-3-Clause OR CC-BY-3.0)
dictionary-nn Norwegian Nynorsk GPL-2.0
dictionary-oc Occitan GPL-2.0
dictionary-pl Polish (GPL-3.0 OR LGPL-3.0 OR MPL-2.0)
dictionary-pt Portuguese (LGPL-3.0 OR MPL-2.0)
dictionary-pt-pt Portuguese (Portugal) (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-ro Romanian (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-ru Russian BSD-3-Clause
dictionary-rw Kinyarwanda GPL-3.0
dictionary-sk Slovak (GPL-2.0 OR LGPL-2.1 OR MPL-1.1)
dictionary-sl Slovenian (GPL-3.0 OR LGPL-2.1)
dictionary-sr Serbian (GPL-2.0 OR LGPL-2.1 OR MPL-1.1 OR CC-BY-SA-3.0)
dictionary-sr-latn Serbian (Latin script) (GPL-2.0 OR LGPL-2.1 OR MPL-1.1 OR CC-BY-SA-3.0)
dictionary-sv Swedish LGPL-3.0
dictionary-sv-fi Swedish (Finland) LGPL-3.0
dictionary-tk Turkmen Apache-2.0
dictionary-tlh Klingon Apache-2.0
dictionary-tlh-latn Klingon (Latin script) Apache-2.0
dictionary-tr Turkish MIT
dictionary-uk Ukrainian GPL-3.0
dictionary-vi Vietnamese GPL-2.0

Examples

Example: use with nspell

This example uses dictionary-en in combination with nspell.

Install the dependencies for this example:

npm install dictionary-en nspell

Then use them from a module:

import en from 'dictionary-en'
import nspell from 'nspell'

const spell = nspell(en)
console.log(spell.correct('color'))
console.log(spell.correct('colour'))

Yields:

true
false

Example: load files

This example loads the index.dic and index.aff files located in dictionary-hyw (Western Armenian) from a Node.js JavaScript module (ESM).

It uses a ponyfill (import-meta-resolve) for an experimental Node API.

Install the dependencies for this example:

npm install dictionary-hyw import-meta-resolve

Then use them from a module:

import fs from 'node:fs/promises'
import {resolve} from 'import-meta-resolve'

const base = await resolve('dictionary-hyw', import.meta.url)
const aff = await fs.readFile(new URL('index.aff', base))
const dic = await fs.readFile(new URL('index.dic', base))
console.log(aff, dic)

Example: use with macOS

Follow these steps to use a dictionary on macOS:

  1. navigate to the dictionary you want on GitHub, such as dictionaries/$code (replace $code with the language code you want)
  2. download the index.aff and index.dic files (open each file, right-click “Raw”, and choose to download the linked file)
  3. rename the downloaded files to $code.aff and $code.dic
  4. move $code.aff and $code.dic into the folder ~/Library/Spelling/
  5. go to System Preferences > Keyboard > Text > Spelling and select your added language (it should come with the (Library) suffix and is situated at the bottom)
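Steps 3 and 4 above can also be scripted. The following is a sketch under stated assumptions, not an official installer: `installDictionary` is a hypothetical helper that copies already downloaded `index.aff` and `index.dic` files into a target folder under their new names. The demo runs against throwaway folders so that nothing in `~/Library/Spelling/` is touched.

```javascript
import fs from 'node:fs'
import os from 'node:os'
import path from 'node:path'

// Hypothetical helper: copy `index.aff` and `index.dic` from `sourceDir` to
// `targetDir` as `$code.aff` and `$code.dic`.
function installDictionary(sourceDir, code, targetDir) {
  fs.mkdirSync(targetDir, {recursive: true})

  return ['aff', 'dic'].map(function (extension) {
    const to = path.join(targetDir, code + '.' + extension)
    fs.copyFileSync(path.join(sourceDir, 'index.' + extension), to)
    return to
  })
}

// Demo: made-up files in a temporary folder stand in for the real download.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'dictionary-'))
const source = path.join(root, 'download')
fs.mkdirSync(source)
fs.writeFileSync(path.join(source, 'index.aff'), 'SET UTF-8\n')
fs.writeFileSync(path.join(source, 'index.dic'), '1\nhello\n')

const written = installDictionary(source, 'en', path.join(root, 'Spelling'))
console.log(written.map((file) => path.basename(file)))
```

For real use, the target folder would be `path.join(os.homedir(), 'Library', 'Spelling')`, and the language still has to be selected by hand as described in step 5.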

Types

The packages are typed with TypeScript.

Security

These packages are safe.

Contribute

Yes please! See How to Contribute to Open Source.

Build

To build this project, on macOS, you at least need to install:

  • wget: brew install wget (crawling)
  • hunspell: brew install hunspell (many dictionaries)
  • sed: brew install gnu-sed (crawling, many dictionaries)
  • coreutils: brew install coreutils (many dictionaries)
  • ispell: brew install ispell (German)

👉 Note: GNU sed and the other GNU replacements should be set up in PATH so that they override the macOS defaults.

Updating a dictionary

Dictionaries are not maintained here. Report problems upstream.

Adding a new dictionary

Dictionaries are not maintained here. Most languages have a small community or institute that maintains a dictionary, often on GitHub or a similar platform. To request that such a dictionary be included here, please ask in the issues.

👉 Note: acceptable dictionaries must:

  • have a significant affix file (not just a .dic file)
  • have an open source license
  • have recent contributions

License

MIT © Titus Wormer

See license files in each dictionary for the licensing of index.dic and index.aff files.