Some questions about the data (Each line corresponds to the line of test.rating, containing 99 negative samples. But why 99?) #62

Open
yifannir opened this issue Dec 18, 2019 · 3 comments

@yifannir

According to the description "Each line corresponds to the line of test.rating, containing 99 negative samples.", why is the number of negative samples 99? When the number of negatives is small, the measured performance will of course look good. Should the test item be ranked against all the negative items instead? Thank you.
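
The effect of the candidate-set size can be made concrete with a rough sketch. It assumes a scorer that ranks the candidates uniformly at random (an assumption for illustration only, not any model from the repository), so the single test item lands in the top k with probability min(k, n_neg + 1) / (n_neg + 1). The function name `random_hit_ratio` and the example sizes are hypothetical.

```python
def random_hit_ratio(n_neg: int, k: int = 10) -> float:
    """Chance that a uniformly random ranking puts the single test item in the top k.

    Assumption: scores carry no information, so every position is equally likely.
    """
    n_candidates = n_neg + 1  # sampled negatives plus the one held-out test item
    return min(k, n_candidates) / n_candidates


if __name__ == "__main__":
    # Hypothetical candidate-set sizes; fewer negatives means a higher chance-level HR@10.
    for n_neg in (9, 99, 999):
        print(n_neg, random_hit_ratio(n_neg))
```

Under this assumption the chance-level hit ratio shrinks as more negatives are added, which is why a score measured against 99 negatives is not directly comparable to one measured against all items.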

@KylinA1

KylinA1 commented Dec 27, 2019

Ideally, the test item should be ranked against all the items. But as the authors explain in the paper,

since it is too time consuming to rank all items for every user during evaluation, we ... randomly samples 100 items that are not interacted by the user

Although the ml-1m and Pinterest datasets are actually pretty small .....
PS: almost all related papers say they sample 100 negative items, but in fact it is 99.
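
A minimal sketch of this kind of sampled evaluation for one user, assuming the 99 negatives are ranked together with the held-out test item (100 candidates in total) and that HR@k and NDCG@k are the metrics of interest. The names `evaluate_one` and `score_fn` and the toy scores are hypothetical and are not taken from the repository's code.

```python
import math


def evaluate_one(test_item, negatives, score_fn, k=10):
    """Rank one held-out test item against sampled negatives.

    Returns (hit_ratio, ndcg) at cutoff k for a single user.
    score_fn is a hypothetical callable mapping an item id to a predicted score.
    """
    candidates = list(negatives) + [test_item]  # e.g. 99 negatives + 1 positive
    top_k = sorted(candidates, key=score_fn, reverse=True)[:k]
    if test_item not in top_k:
        return 0.0, 0.0
    position = top_k.index(test_item)  # 0-based rank of the test item
    return 1.0, 1.0 / math.log2(position + 2)


if __name__ == "__main__":
    # Toy data (assumed): item ids 0..99, with the score equal to the item id.
    test_item = 95
    negatives = [i for i in range(100) if i != test_item]  # 99 negatives
    hr, ndcg = evaluate_one(test_item, negatives, score_fn=lambda item: item)
    print(len(negatives) + 1, hr, ndcg)
```

Averaging the two values over all users gives the reported HR@k and NDCG@k. The candidate list has 100 entries only when 99 negatives are sampled, which is one way to reconcile "99 negative samples" with "ranking the test item among the 100 items".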

@Chuan1997

> It should be compared among all the items. But as the author demonstrated in paper,
>
> since it is too time consuming to rank all items for every user during evaluation, we ... randomly samples 100 items that are not interacted by the user
>
> Although the ml-1m and Pinterest datasets are actually pretty small .....
> PS: almost all related paper said they sample 100 negative items, but in fact that is 99.

So, in order to fully reproduce the results of the paper, do we need to set num_neg to 99?

@beathahahaha

I don't think the data description is right. The number of negative items should be 100 instead of 99. (Original article: "we followed the common strategy that randomly samples 100 items that are not interacted by the user, ranking the test item among the 100 items.") I also suggest reading the paper Self-Attentive Sequential Recommendation (ICDM 2018), which states this clearly.