__init__() got an unexpected keyword argument 'spacy_component' #65

Open
eunsuk-c opened this issue Jan 16, 2021 · 7 comments

eunsuk-c commented Jan 16, 2021

Describe the bug

When I run the following code:

from quickumls.spacy_component import SpacyQuickUMLS

nlp = spacy.load('en_core_web_sm')
quickumls_component = SpacyQuickUMLS(nlp, '/home/silverock/umls/quickUMLS')
nlp.add_pipe(quickumls_component)

I got this message:


TypeError Traceback (most recent call last)
in
2
3 nlp = spacy.load('en_core_web_sm')
----> 4 quickumls_component = SpacyQuickUMLS(nlp, '/home/silverock/umls/quickUMLS')
5 nlp.add_pipe(quickumls_component)

~/anaconda3/lib/python3.8/site-packages/quickumls/spacy_component.py in __init__(self, nlp, quickumls_fp, best_match, ignore_syntax, **kwargs)
23 """
24
---> 25 self.quickumls = QuickUMLS(quickumls_fp,
26 # By default, the QuickUMLS objects creates its own internal spacy pipeline but this is not needed
27 # when we're using it as a component in a pipeline

TypeError: __init__() got an unexpected keyword argument 'spacy_component'

Could you solve this issue?

My Environment

  • OS: Ubuntu 20.04
  • QuickUMLS version 1.4
  • UMLS version: 2020AB
  • spacy 2.4.5
  • Python 3.8.5
@jimhavrilla

I think I'm getting a similar error to this. It tells me there is "no such module" as spacy_component.

@pokarats

@eunsuk-c I got this error too. Apparently pip install quickumls did not install the latest commit in this repo; at least that was the case for me, and I don't know why. When I compared the core.py in my environment against the core.py in the repo, I got the differences below. You might want to check whether the same is true in your environment.

In short, without reading through all of the diff output below: the spacy_component = False argument is missing from the __init__ function of the QuickUMLS class in the core.py that got installed. So I simply replaced the core.py in my environment with the latest version from this repo, which has the spacy_component argument.

After updating core.py, I was able to instantiate SpacyQuickUMLS(nlp, '<path to quick umls data>'). That said, this whole pipeline does not work with spaCy 3.0, but since you're using spaCy 2.4.5, I assume this is not an issue for you.
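
For anyone who wants to confirm this without diffing files, here is a minimal sketch that checks whether a class constructor accepts a given keyword. The helper name accepts_keyword and the two stand-in classes are hypothetical and only mimic the old and new __init__ signatures shown in the diff; they are not the library's code.

```python
import inspect


def accepts_keyword(callable_obj, name):
    """Return True if callable_obj can be called with the keyword `name`."""
    params = inspect.signature(callable_obj).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    # A **kwargs catch-all accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for the installed (old) and repo (new) signatures.
class OldQuickUMLS:
    def __init__(self, quickumls_fp, verbose=False, keep_uppercase=False):
        self.quickumls_fp = quickumls_fp


class NewQuickUMLS:
    def __init__(self, quickumls_fp, verbose=False, keep_uppercase=False,
                 spacy_component=False):
        self.quickumls_fp = quickumls_fp


print(accepts_keyword(OldQuickUMLS, "spacy_component"))  # False
print(accepts_keyword(NewQuickUMLS, "spacy_component"))  # True
```

In an affected environment, the same helper could be pointed at the real class (from quickumls import QuickUMLS) instead of the stand-ins; a False result would suggest the installed core.py predates the spacy_component argument.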

$ diff -w /Users/<username>/opt/anaconda3/envs/quickUMLS/lib/python3.7/site-packages/quickumls/core.py QuickUMLS/quickumls/core.py 

29c29,30
<             verbose=False, keep_uppercase=False):
---
>             verbose=False, keep_uppercase=False,
>             spacy_component = False):
148a150,154
>         # if this is not being executed as as spacy component, then it must be standalone
>         if spacy_component:
>             # In this case, the pipeline is external to this current class
>             self.nlp = None
>         else:
325d330
<                 for cui, semtypes, preferred in cuisem_match:
331a337
> 
334a341,342
>                 for cui, semtypes, preferred in cuisem_match:
> 
438a447,468
>         # pass in parsed spacy doc to get concept matches
>         matches = self._match(parsed)
> 
>         return matches
>         
>     def _match(self, doc, best_match=True, ignore_syntax=False):
>         """Gathers ngram matches given a spaCy document object.
> 
>         [extended_summary]
> 
>         Args:
>             text (Document): spaCy Document object to be used for extracting ngrams
> 
>             best_match (bool, optional): Whether to return only the top match or all overlapping candidates. Defaults to True.
>             ignore_syntax (bool, optional): Wether to use the heuristcs introduced in the paper (Soldaini and Goharian, 2016). TODO: clarify,. Defaults to False
> 
>         Returns:
>             List: List of all matches in the text
>             TODO: Describe format
>         """
>         
>         ngrams = None
440c470
<             ngrams = self._make_token_sequences(parsed)
---
>             ngrams = self._make_token_sequences(doc)
442c472
<             ngrams = self._make_ngrams(parsed)
---
>             ngrams = self._make_ngrams(doc)
449c479
<         self._print_verbose_status(parsed, matches)
---
>         self._print_verbose_status(doc, matches)

My Environment

  • OS: Mac OS X Big Sur
  • QuickUMLS version 1.4 (I installed this Feb 2021)
  • UMLS version: 2019AB
  • spaCy 3.0 (Note that there are other issues with spaCy 3.0 and this library)
  • Python 3.7

@pokarats

I think I'm getting a similar error to this. It tells me there is "no such module" as spacy_component.

@jimhavrilla I had this error as well. For some reason, when I installed this library with pip install quickumls, spacy_component.py was not installed in my environment's site-packages. Perhaps that is also what happened with your install?
I had to manually put spacy_component.py alongside the rest of the module files for this to work.
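
A rough way to check for the missing file from Python, rather than browsing site-packages by hand, is sketched below. The helper name has_submodule is hypothetical, and the demonstration uses the standard-library json package so that it runs anywhere.

```python
import importlib.util


def has_submodule(package, submodule):
    """Return True if `package.submodule` can be located by the import system."""
    try:
        return importlib.util.find_spec(f"{package}.{submodule}") is not None
    except ModuleNotFoundError:
        # Raised when the parent package itself is missing or is not a package.
        return False


# Demonstration with a package that is always available.
print(has_submodule("json", "decoder"))          # True
print(has_submodule("json", "spacy_component"))  # False
```

In an affected environment, has_submodule("quickumls", "spacy_component") returning False would match the "no such module" symptom described above.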

My Environment

  • OS: Mac OS X Big Sur
  • QuickUMLS version 1.4 (I installed this Feb 2021)
  • UMLS version: 2019AB
  • spaCy 3.0 (Note that there are other issues with spaCy 3.0 and this library)
  • Python 3.7
  • anaconda environment

@jimhavrilla

jimhavrilla commented Feb 26, 2021 via email

@jmugan

jmugan commented Aug 16, 2021

I had to overwrite core.py with the version from the repo.

@DSLituiev

Can someone push this to PyPI?

@soldni @galtay @ldorigo @burgersmoke

@burgersmoke
Contributor

@DSLituiev As part of medspacy, we have started our own fork of QuickUMLS. In this fork, we support spaCy v3.x, and we have addressed this issue in version 2.5 of medspacy_quickumls.

If you decide to use any of the options below, please note that the medspacy team has elected to no longer support leveldb as a database backend: we have encountered problems with it, and we do not have the time or resources to troubleshoot and fix them.

That repo is here:
https://github.com/medspacy/QuickUMLS

It's also now pip-installable here:
https://pypi.org/project/medspacy-quickumls/
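
To summarize the thread's advice as code, here is a small sketch that picks a PyPI package name from the installed spaCy major version. The function names are hypothetical, and the rule is only an assumption drawn from the comments above (the fork supports spaCy v3.x; the original pipeline was reported not to work with spaCy 3.0). Whether the fork also works with spaCy 2.x is not stated here.

```python
from importlib import metadata


def spacy_major(version):
    """Extract the major version number from a string such as '3.0.1'."""
    return int(version.split(".")[0])


def suggest_package(spacy_version):
    """Suggest a PyPI package name based on the spaCy major version."""
    if spacy_major(spacy_version) >= 3:
        return "medspacy-quickumls"
    # For spaCy 2.x the original package is the starting point, although the
    # PyPI build may lack the spacy_component changes discussed in this issue.
    return "quickumls"


try:
    installed = metadata.version("spacy")
    print(installed, "->", suggest_package(installed))
except metadata.PackageNotFoundError:
    print("spaCy is not installed in this environment")
```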
