In the original dev data, every datapoint is distinct. The training set, however, contains many repetitions. This makes a 90-10-10 split infeasible, since repeated datapoints would end up on both sides of the split.
Potential solutions:
One reasonable option would be to separate out a non-overlapping dev set and set up a cross-validation experiment.
Another would be to use a fraction of the original dev set as the internal dev set for ANLI.
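
As a rough illustration of the first option, here is a minimal sketch of a grouped k-fold split in which repeated items never straddle train and dev. It assumes repetitions can be identified by a shared key; the `context` field name, the `grouped_folds` helper, and the toy data are all hypothetical and not taken from this repository.

```python
import hashlib
from collections import defaultdict


def fold_of(key: str, k: int) -> int:
    """Deterministically map a grouping key to a fold index in [0, k)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % k


def grouped_folds(examples, key_fn, k=10):
    """Yield (train, dev) pairs such that no grouping key appears on both sides."""
    buckets = defaultdict(list)
    for ex in examples:
        # Every example sharing a key lands in the same bucket.
        buckets[fold_of(key_fn(ex), k)].append(ex)
    for held_out in range(k):
        dev = buckets[held_out]
        train = [ex for i in range(k) if i != held_out for ex in buckets[i]]
        yield train, dev


# Toy data; "context" as the grouping field is an assumption for illustration.
toy = [
    {"context": "c1", "hypothesis": "h1"},
    {"context": "c1", "hypothesis": "h2"},
    {"context": "c2", "hypothesis": "h3"},
    {"context": "c3", "hypothesis": "h4"},
    {"context": "c3", "hypothesis": "h5"},
]

for train, dev in grouped_folds(toy, lambda ex: ex["context"], k=3):
    train_keys = {ex["context"] for ex in train}
    dev_keys = {ex["context"] for ex in dev}
    print(len(train), len(dev), train_keys & dev_keys)
```

Hashing the key keeps the assignment stable across runs, at the cost of uneven fold sizes when a few keys account for most of the repetitions. If scikit-learn is available, `GroupKFold` gives more balanced folds for the same purpose.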
denizbeser changed the title from "ANLI train distribution makes it hard to create internal dev - so it's temporarily ignored" to "ANLI data distribution makes it hard to create internal dev - so it's temporarily ignored" on Jun 9, 2020