modularity_metric

An ad hoc tool and metric for diagnosing whether a resulting cross-lingual word embedding is "mixed" with respect to language.
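To make the idea of "mixed" concrete, the sketch below computes the standard Newman modularity of a small undirected graph whose nodes are partitioned by language tag. It is only an illustration under stated assumptions: the function name `modularity`, the edge-list input, and the toy words are hypothetical, and the actual graph construction and weighting in src/modularity.py may differ. Higher values mean words mostly connect to words of the same language; lower values mean the languages are more mixed.

```python
from collections import defaultdict


def modularity(edges, labels):
    """Newman modularity of an undirected, unweighted graph.

    edges  -- list of (u, v) pairs, each undirected edge listed once,
              no self-loops (assumption of this sketch)
    labels -- dict mapping each node to its group (here: a language tag)
    """
    m = len(edges)
    if m == 0:
        return 0.0

    intra = defaultdict(int)    # edges with both endpoints in the same group
    deg_sum = defaultdict(int)  # total degree of the nodes in each group
    for u, v in edges:
        deg_sum[labels[u]] += 1
        deg_sum[labels[v]] += 1
        if labels[u] == labels[v]:
            intra[labels[u]] += 1

    # Q = sum over groups c of  e_c / m - (d_c / 2m)^2
    return sum(intra[c] / m - (deg_sum[c] / (2 * m)) ** 2 for c in deg_sum)


# Hypothetical toy data: two English and two Japanese words.
labels = {"eng:dog": "eng", "eng:cat": "eng", "jpn:inu": "jpn", "jpn:neko": "jpn"}

# Words only link within their own language: clustered by language.
clustered = [("eng:dog", "eng:cat"), ("jpn:inu", "jpn:neko")]
# Words only link across languages: fully mixed.
mixed = [("eng:dog", "jpn:inu"), ("eng:cat", "jpn:neko")]

q_clustered = modularity(clustered, labels)
q_mixed = modularity(mixed, labels)
print(q_clustered, q_mixed)
```

In this toy case the language-clustered graph scores higher than the fully mixed one, which is the contrast the metric is meant to expose.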

Requirements

gensim
(optional) annoy

    pip3 install -r requirements.txt

Confirmed to run on:

Python 3.6.5
gensim 3.4.0
annoy 1.8.3

Usage

    python3 src/modularity.py --w2v YOUR_VECTOR --src_lang SRC_LANG --tgt_lang TGT_LANG

Currently, the input is assumed to be a single concatenated cross-lingual embedding file in which each word carries a three-character language prefix (i.e., an ISO 639-2 code), e.g.,

    python3 src/modularity.py --w2v $WORD_VEC --src_lang eng --tgt_lang jpn

where an entry in the vector file looks like eng:the 0.123988 -0.0562252....
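As a rough illustration of the expected input format, the following sketch splits one line of such a file into its language tag, word, and vector. The helper name `parse_vector_line` is hypothetical and not part of this repository; the colon separator is inferred from the eng:the example, and the script itself may read the file differently (for instance through gensim).

```python
def parse_vector_line(line, sep=":"):
    """Split 'eng:the 0.12 -0.05' into ('eng', 'the', [0.12, -0.05]).

    Assumes whitespace-separated fields and a three-character language
    tag followed by `sep` in front of the word (assumption of this sketch).
    """
    fields = line.split()
    if not fields:
        raise ValueError("empty line")

    # Split on the first separator only, so words containing ':' survive.
    lang, found, word = fields[0].partition(sep)
    if not found or len(lang) != 3 or not word:
        raise ValueError("missing or malformed language prefix: %r" % fields[0])

    vector = [float(x) for x in fields[1:]]
    return lang, word, vector


lang, word, vector = parse_vector_line("eng:the 0.123988 -0.0562252")
print(lang, word, vector)
```

A check like this can be useful before running the metric, since a file without the language prefixes would not match the --src_lang and --tgt_lang arguments.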

Run tests

    sh scripts/run_test.sh

Example usage

    sh scripts/run_sample.sh

Example usage with annoy (approximate nearest neighbors)

    sh scripts/run_sample_annoy.sh

Reproduce Figure 1 in the paper

    sh scripts/get_sample_embedding.sh
    sh scripts/run_eat.sh
    sh scripts/run_firefox.sh

References

If you use this code, please cite our paper.

Yoshinari Fujinuma, Jordan Boyd-Graber, and Michael J. Paul, A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings based on Graph Modularity, ACL 2019

@inproceedings{clwe_modularity,
   title = "A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings based on Graph Modularity",
   author = "Fujinuma, Yoshinari and Boyd-Graber, Jordan and Paul, Michael J.",
   booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
   year = "2019",
}
