I have a question about the learnbpe operation. The example in the README.md learns the BPE codes jointly for en and de, and then applies the codes to en and de separately.
What is the purpose of jointly learning the BPE codes for en and de? If en and de do not share embeddings in the NMT system, is it more reasonable to learn the BPE codes for en and de separately?
What is the difference between the number 40000 in learnbpe and in applybpe?
Thanks~
Jointly learning the codes is mostly useful when you share the embeddings. It helps the model handle rare words such as named entities, because they are segmented the same way in both languages. Even if you don't share embeddings, I would still learn the codes jointly. At the very least you save GPU memory.
I'm not sure I understand your question. In learnbpe, 40000 is the number of codes you want to learn. In applybpe you don't have to provide 40000, just the codes file.
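To make the two roles concrete, here is a toy sketch of BPE in Python. It is not the tool's actual implementation, and the names `learn_bpe`, `apply_bpe` and `merge` are hypothetical, chosen only for this illustration. It shows that the number of codes matters only when learning, and that applying needs nothing but the learned codes.

```python
from collections import Counter


def merge(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(corpora, num_codes):
    """Learn at most `num_codes` merges from all corpora pooled together (joint learning)."""
    word_freqs = Counter(
        word for corpus in corpora for line in corpus for word in line.split()
    )
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    codes = []
    for _ in range(num_codes):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:  # every word is already a single symbol
            break
        # Most frequent pair; ties are broken lexicographically to stay deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        codes.append(best)
        vocab = {merge(symbols, best): freq for symbols, freq in vocab.items()}
    return codes


def apply_bpe(word, codes):
    """Segment one word by replaying the learned merges in order. No count is needed."""
    symbols = tuple(word) + ("</w>",)
    for pair in codes:
        symbols = merge(symbols, pair)
    return list(symbols)


# Toy data (made up for this example): the named entity appears on both sides.
en = ["berlin is big", "berlin is old"]
de = ["berlin ist gross", "berlin ist alt"]

codes = learn_bpe([en, de], num_codes=3)  # the number is given here only
en_segmented = [apply_bpe(w, codes) for w in en[0].split()]
de_segmented = [apply_bpe(w, codes) for w in de[0].split()]
```

Because both sides are segmented with the same codes, a word that occurs in both languages is split identically on each side, which is the benefit of joint learning mentioned above.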