Analyse precision recall curve #59

Open
KoenLoeffen opened this issue May 26, 2023 · 1 comment

KoenLoeffen commented May 26, 2023

I have two questions:

  1. The precision-recall curve is a trade-off between the minimum similarity and the percentage matched. So in the ideal case you want both the precision and the recall to be as high as possible. However, I found in my results that the model with the highest precision and recall isn't always the best. Am I missing something?
  2. How would I set the optimal threshold for the similarity? Is this also based on the precision-recall curve?

MaartenGr (Owner) commented May 28, 2023

> The precision-recall curve is a trade-off between the minimum similarity and the percentage matched. So in the ideal case you want both the precision and the recall to be as high as possible. However, I found in my results that the model with the highest precision and recall isn't always the best. Am I missing something?

The precision-recall curve is an approximation because we do not have the ground truth available. Ideally we still want both values to be as high as possible, but the curve remains an approximation, so the model that scores highest on it is not guaranteed to be the best.

> How would I set the optimal threshold for the similarity? Is this also based on the precision-recall curve?

Yes, that is the main purpose of the precision-recall curve as defined in PolyFuzz. It helps you understand which threshold you would need to get a certain number of matches, and the relative accuracy of the results at that threshold.
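
To make the threshold question concrete, here is a minimal sketch in plain Python of the trade-off described in this thread: for each minimum similarity, compute the share of pairs that would still count as matched, then pick the highest threshold that keeps a chosen share. This is not PolyFuzz's implementation. The helper names `matched_fraction` and `highest_threshold_for_coverage` are hypothetical, the scores are made up, and it assumes you have already extracted the similarity scores of your matches into a list.

```python
def matched_fraction(similarities, threshold):
    """Share of pairs whose similarity is at least `threshold`."""
    if not similarities:
        return 0.0
    kept = sum(1 for s in similarities if s >= threshold)
    return kept / len(similarities)


def highest_threshold_for_coverage(similarities, target_fraction):
    """Highest observed similarity that still keeps `target_fraction` of the pairs.

    Hypothetical helper for illustration; returns None if no threshold qualifies.
    """
    for threshold in sorted(set(similarities), reverse=True):
        if matched_fraction(similarities, threshold) >= target_fraction:
            return threshold
    return None


# Made-up similarity scores, for illustration only.
similarities = [0.95, 0.90, 0.80, 0.60, 0.40]

# Points on the approximate curve: (minimum similarity, share matched).
curve = [(t, matched_fraction(similarities, t)) for t in (0.5, 0.75, 0.9)]

# Strictest threshold that still keeps at least 60% of the pairs.
threshold = highest_threshold_for_coverage(similarities, 0.6)
```

Because no ground truth is involved, a threshold chosen this way only controls how many pairs are kept. Spot-checking the matches just above and below the chosen value is still the way to judge whether they are actually correct.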
