Question about learnbpe #36

Open
dreamingo opened this issue Dec 3, 2019 · 1 comment
Comments

@dreamingo

Hi,

I have a question about the learnbpe operation. The example in the README.md learns the BPE codes jointly for en and de, and then applies the codes to en and de separately:

./fast learnbpe 40000 train.de train.en > codes
./fast applybpe train.de.40000 train.de codes
./fast applybpe train.en.40000 train.en codes

Here is my question:

  1. What's the purpose of jointly learning the BPE codes for en and de? If the NMT system does not share embeddings between en and de, is it more reasonable to learn the BPE codes for en and de separately?

  2. What's the difference between the number 40000 in learnbpe and in applybpe?

Thanks~

@glample
Owner

glample commented Dec 3, 2019

  1. Jointly learning the codes is mostly useful when you share the embeddings. It's good because it helps the model handle rare words like named entities very easily. Even if you don't share, I would still learn them jointly. At the very least you save GPU memory.

  2. I'm not sure I understand your question. In learnbpe, 40000 is the number of codes you want to learn. In applybpe you don't have to provide 40000, just the codes file.
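
To make both points concrete, here is a minimal sketch of BPE in Python. It is a toy reimplementation for illustration, not fastBPE's actual code: the function names `learn_bpe`, `apply_bpe` and `merge_pair`, the `</w>` end-of-word marker, and the two one-line corpora are all assumptions. It shows that the number of codes is only an argument of the learning step, and that jointly learned codes segment a shared named entity the same way on both sides, while separately learned codes may not.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` by its concatenation."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(corpora, num_codes):
    """Learn at most `num_codes` merges from the pooled word counts of all corpora."""
    vocab = Counter()
    for corpus in corpora:
        for line in corpus:
            for word in line.split():
                vocab[tuple(word) + ("</w>",)] += 1
    codes = []
    for _ in range(num_codes):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:  # every word is already a single symbol
            break
        # Most frequent pair; ties are broken lexicographically to stay deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        codes.append(best)
        merged = Counter()
        for symbols, freq in vocab.items():
            merged[merge_pair(symbols, best)] += freq
        vocab = merged
    return codes


def apply_bpe(word, codes):
    """Segment one word by replaying the learned merges in order (no count needed)."""
    symbols = tuple(word) + ("</w>",)
    for pair in codes:
        symbols = merge_pair(symbols, pair)
    return symbols


# Hypothetical toy corpora: "Berlin" occurs only on the English side.
en = ["Berlin Berlin Berlin"]
de = ["Haus Haus"]

en_codes = learn_bpe([en], 10)         # separate codes for en
de_codes = learn_bpe([de], 10)         # separate codes for de
joint_codes = learn_bpe([en, de], 10)  # joint codes, as in the README example

print(apply_bpe("Berlin", en_codes))     # ('Berlin</w>',)
print(apply_bpe("Berlin", de_codes))     # ('B', 'e', 'r', 'l', 'i', 'n', '</w>')
print(apply_bpe("Berlin", joint_codes))  # ('Berlin</w>',) for both languages
```

With separate codes, the same name becomes one token on the source side and seven on the target side; with joint codes both sides see the same token. Note also that `apply_bpe` takes only the learned codes, which mirrors the answer to the second question. In the README commands, `train.de.40000` looks like nothing more than the name chosen for the output file (assuming the argument order is output, input, codes), so the `40000` there is a naming convention rather than a parameter.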
