We have cleaned up all experimental code and packaged it as a unified tool in the following repository:
https://github.com/aaaasssddf/word-embedding-dimensionality-selection
This is the source code for all experiments in the paper "Understand Functionality and Dimensionality of Vector Embeddings: the Distributional Hypothesis, the Pairwise Inner Product Loss and Its Bias-Variance Trade-off". The paper is publicly available at
https://arxiv.org/abs/1803.00502
The NIPS paper can be found at
https://nips.cc/Conferences/2018/Schedule?showEvent=12567

If you use this code, please cite:
@inproceedings{yin2018dimensionality,
  title     = {On the Dimensionality of Word Embedding},
  author    = {Yin, Zi and Shen, Yuanyuan},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2018}
}

and

@article{yin2018pairwise,
  title   = {Understand Functionality and Dimensionality of Vector Embeddings: the Distributional Hypothesis, the Pairwise Inner Product Loss and Its Bias-Variance Trade-off},
  author  = {Yin, Zi},
  journal = {arXiv preprint arXiv:1803.00502},
  year    = {2018}
}
All experiments in the paper are included, in particular:
- Matrix perturbation theoretical PIP loss upper bound (a minimal sketch of the PIP loss itself is given after this list);
- Robustness to over-parametrization with respect to the exponent parameter;
- Forward stability experiments;
- Optimal dimensionality for LSA/LSI and empirical optimal dimensionalities;
- Optimal dimensionality for Word2Vec/GloVe and empirical optimal dimensionalities.
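For concreteness, here is a minimal sketch of the PIP loss itself in plain NumPy. This is not the repo's API; the function name `pip_loss` is ours, chosen for illustration. For an embedding matrix E, the PIP matrix is E E^T, and the PIP loss between two embeddings of the same vocabulary is the Frobenius norm of the difference of their PIP matrices:

```python
import numpy as np

def pip_loss(E1: np.ndarray, E2: np.ndarray) -> float:
    """Frobenius norm ||E1 E1^T - E2 E2^T||_F.

    E1 and E2 are (vocab_size, dim) embedding matrices over the same
    vocabulary; their dimensions may differ, since the PIP matrix
    E @ E.T is always (vocab_size, vocab_size) and is invariant to
    unitary transformations of the embedding.
    """
    return float(np.linalg.norm(E1 @ E1.T - E2 @ E2.T, ord="fro"))
```

Because the PIP matrix is invariant to rotations of the embedding space, the PIP loss compares embeddings up to a unitary transformation, which is what makes it a meaningful dissimilarity between embeddings of different dimensionalities.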
The `optimal_dimensionality` directory contains sample code for dimensionality selection for word2vec (PMI) and GloVe (log-count); a sketch of the underlying idea follows.
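The sketch below is a hypothetical, self-contained illustration of the selection idea on synthetic data, not the repo's actual pipeline or API (the matrix construction, noise level `sigma`, and exponent `alpha` are all made up for the example). It embeds a noisy estimate of a signal matrix by truncated SVD at each dimensionality k and picks the k minimizing the PIP loss against the oracle embedding of the clean matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n, true_rank, alpha, sigma = 200, 20, 0.5, 0.3  # illustrative values only

# Synthetic "signal" matrix, a stand-in for a PMI or log-count matrix.
A = rng.standard_normal((n, true_rank))
M = A @ A.T / true_rank                           # approximately rank-20
M_hat = M + sigma * rng.standard_normal((n, n))   # noisy empirical estimate

def embed(X: np.ndarray, k: int, alpha: float) -> np.ndarray:
    """Rank-k embedding U_k diag(d_k)^alpha from a truncated SVD."""
    U, d, _ = np.linalg.svd(X)
    return U[:, :k] * d[:k] ** alpha

E_oracle = embed(M, true_rank, alpha)
losses = []
for k in range(1, 60):
    E_k = embed(M_hat, k, alpha)
    losses.append(np.linalg.norm(E_k @ E_k.T - E_oracle @ E_oracle.T, "fro"))

best_k = int(np.argmin(losses)) + 1
print(f"PIP-loss-optimal dimensionality: {best_k}")
```

The shape of `losses` over k exhibits the bias-variance trade-off from the paper: too few dimensions discard signal (bias), while too many pick up noise (variance), so the PIP loss is minimized at an intermediate k.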