zero-shot keyword extraction #7
Comments
I am not sure whether this will work zero-shot. I will note, though, that the training corpus contains examples at the end of abstracts and articles with "Keywords: ...", so you could try appending "Keywords: " to the end of your example and see what happens. I think the key to making this work is finding a format that appears repeatedly in the PubMed training data. Another direction is to fine-tune briefly: I think that if you came up with several hundred examples and fine-tuned on those, it should produce reasonable results. We could help if you have trouble getting access to compute for the fine-tuning. I'll try to explore this for keywords as well and let you know what I see!
I am not sure how many training examples are needed for fine-tuning, but the MeQSum task has only 500 examples, so it is possible that the model could be fine-tuned to do the right thing with a relatively small training set.
My advice would be to look through the PubMed abstracts and articles and see which patterns involve "Keywords". I see things like "Keywords used", "Keywords: ", and "Keywords included ". I think that if this works zero-shot, it will be because there is a common pattern in the PubMed abstracts and articles.
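The "append a cue and let the model continue" idea above can be sketched roughly as follows. This is only an illustration: `build_keyword_prompt` and `parse_keywords` are hypothetical helper names, not part of this repository, and the assumption that the model answers with a single comma- or semicolon-separated line is exactly the thing you would need to check against real generations.

```python
from typing import List


def build_keyword_prompt(text: str, suffix: str = "Keywords: ") -> str:
    """Append a keyword cue so the model's continuation is (hopefully) a keyword list.

    `suffix` is the pattern to try; other cues seen in the corpus can be passed instead.
    """
    return text.strip() + "\n" + suffix


def parse_keywords(continuation: str) -> List[str]:
    """Split the first line of a generated continuation into keywords.

    Assumes (unverified) that keywords are separated by commas or semicolons.
    """
    lines = continuation.strip().splitlines()
    first_line = lines[0] if lines else ""
    parts = first_line.replace(";", ",").split(",")
    # Drop surrounding spaces and trailing periods, and skip empty pieces.
    return [p.strip(" .") for p in parts if p.strip(" .")]
```

The prompt string would be passed to whatever generation call you already use, and only the newly generated text (not the prompt) should be handed to `parse_keywords`.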
Here is a real PubMed example for instance:
And remember that our model only has a context length of 1024 tokens, so you need to break the input up into blocks of at most 1024 tokens. If you have long input, it would be better to get keywords for each section and then combine them at the end.
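The split-then-combine approach described above might look something like this minimal sketch. Two assumptions are baked in: whitespace-separated words stand in for real tokens (the model's actual tokenizer will usually produce more tokens than words, so `max_tokens` should be set well below 1024 to leave room for the prompt cue and the generated keywords), and `extract` is a hypothetical callable that wraps your own model call and returns a list of keywords for one block.

```python
from typing import Callable, List


def split_into_blocks(text: str, max_tokens: int = 1024) -> List[str]:
    """Split text into consecutive blocks of at most `max_tokens` whitespace words.

    Whitespace words are only a rough proxy for model tokens.
    """
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]


def keywords_for_long_text(
    text: str,
    extract: Callable[[str], List[str]],
    max_tokens: int = 1024,
) -> List[str]:
    """Run `extract` on each block and merge the results, dropping duplicates.

    Duplicates are detected case-insensitively; first-seen order is kept.
    """
    seen = set()
    combined: List[str] = []
    for block in split_into_blocks(text, max_tokens):
        for keyword in extract(block):
            key = keyword.lower()
            if key not in seen:
                seen.add(key)
                combined.append(keyword)
    return combined
```

Splitting on section boundaries rather than at a fixed word count would probably give cleaner per-block keywords, at the cost of needing to know where the sections are.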
Hi @J38, many thanks for your reply; it was extremely helpful. You are right about the out-of-the-box zero-shot performance: the initial results don't look very promising. I still have to search for more prompt patterns, so maybe the results will improve. Unfortunately, we only have (very) limited data, which is why we were hoping a zero-shot setup would work out. In the worst case, we will try to get additional data and have it annotated.
Hello,
I am planning to use pubmedgpt for zero-shot keyword extraction on biomedical text. On my (proprietary) dataset, GPT-3 has demonstrated pretty decent performance for keyword extraction. I wanted to get your thoughts on the zero-shot generalization capabilities of pubmedgpt, especially for tasks such as keyword extraction. Also, can you point to helpful prompt format(s) optimized for pubmedgpt?
Many thanks!