Cannot download pretrained model (GCSFS failed) #61

Open
1 task done
jwasswa2023 opened this issue Jun 15, 2023 · 18 comments
Labels: bug (Something isn't working)

Comments


jwasswa2023 commented Jun 15, 2023

Is there an existing issue for this?

  • I have searched the existing issues and found nothing

Bug description

Hello here, thank you for your efforts.

I have been using Molfeat, but I could not get through the tutorial on fine-tuning a pre-trained model. When I define the featurizer and load a transformer model with the code below, I get an error:

from molfeat.trans.pretrained import PretrainedHFTransformer
featurizer = PretrainedHFTransformer(kind="ChemBERTa-77M-MLM", pooling="bert", preload=True)

This is the error:

ERROR:gcsfs:_request out of retries on exception: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7f583491b8b0>)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/google/auth/compute_engine/credentials.py", line 111, in refresh
    self._retrieve_info(request)
  File "/usr/local/lib/python3.10/dist-packages/google/auth/compute_engine/credentials.py", line 87, in _retrieve_info
    info = _metadata.get_service_account_info(
  File "/usr/local/lib/python3.10/dist-packages/google/auth/compute_engine/_metadata.py", line 234, in get_service_account_info
    return get(request, path, params={"recursive": "true"})
  File "/usr/local/lib/python3.10/dist-packages/google/auth/compute_engine/_metadata.py", line 182, in get
    raise exceptions.TransportError(
google.auth.exceptions.TransportError: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7f583491b8b0>)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/gcsfs/retry.py", line 114, in retry_request
    return await func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/gcsfs/core.py", line 414, in _request
    headers=self._get_headers(headers),
  File "/usr/local/lib/python3.10/site-packages/gcsfs/core.py", line 393, in _get_headers
    self.credentials.apply(out)
  File "/usr/local/lib/python3.10/site-packages/gcsfs/credentials.py", line 187, in apply
    self.maybe_refresh()
  File "/usr/local/lib/python3.10/site-packages/gcsfs/credentials.py", line 182, in maybe_refresh
    self.credentials.refresh(req)
  File "/usr/local/lib/python3.10/dist-packages/google/auth/compute_engine/credentials.py", line 117, in refresh
    six.raise_from(new_exc, caught_exc)
  File "<string>", line 3, in raise_from
google.auth.exceptions.RefreshError: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7f583491b8b0>)
---------------------------------------------------------------------------
TransportError                            Traceback (most recent call last)
[/usr/local/lib/python3.10/dist-packages/google/auth/compute_engine/credentials.py](https://localhost:8080/#) in refresh(self, request)
    110         try:
--> 111             self._retrieve_info(request)
    112             self.token, self.expiry = _metadata.get_service_account_token(

25 frames
TransportError: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7f583491b8b0>)
The above exception was the direct cause of the following exception:

RefreshError                              Traceback (most recent call last)
<decorator-gen-121> in _request(self, method, path, headers, json, data, *args, **kwargs)

/usr/local/lib/python3.10/dist-packages/six.py in raise_from(value, from_value)

RefreshError: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7f583491b8b0>)


@jwasswa2023 added the bug (Something isn't working) label Jun 15, 2023
@maclandrol changed the title from "ERROR:gcsfs:_request out of retries on exception: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service." to "Cannot download pretrained model (GCSFS failed)" Jun 15, 2023
@jstlaurent self-assigned this Jun 15, 2023
@jstlaurent
Contributor

@jwasswa2023 : Thanks for reporting this! I'm taking a look.

@jstlaurent
Contributor

@jwasswa2023: It seems to be a transient issue with Google Cloud Platform, which hosts the bucket where the files are located. Could you try again and tell me if you still have the issue? If so, are you running your Python code from a terminal where your user is logged in to GCP?

Thank you.

@jwasswa2023
Author

Hello Julien,
Sorry for the delayed response. I am running my code in Colab, where I am logged in. I ran the tutorial again and I get a similar error, but with a modified warning message. See below.

2023-06-16 01:48:32 | WARNING | google.auth._default | No project ID could be determined. Consider running gcloud config set project or setting the GOOGLE_CLOUD_PROJECT environment variable
2023-06-16 01:49:08 | ERROR | gcsfs | _request out of retries on exception: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7fd861c2d5d0>)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/google/auth/compute_engine/credentials.py", line 111, in refresh
    self._retrieve_info(request)
  File "/usr/local/lib/python3.10/dist-packages/google/auth/compute_engine/credentials.py", line 87, in _retrieve_info
    info = _metadata.get_service_account_info(
  File "/usr/local/lib/python3.10/dist-packages/google/auth/compute_engine/_metadata.py", line 234, in get_service_account_info
    return get(request, path, params={"recursive": "true"})
  File "/usr/local/lib/python3.10/dist-packages/google/auth/compute_engine/_metadata.py", line 182, in get
    raise exceptions.TransportError(
google.auth.exceptions.TransportError: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7fd861c2d5d0>)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/gcsfs/retry.py", line 114, in retry_request
    return await func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/gcsfs/core.py", line 414, in _request
    headers=self._get_headers(headers),
  File "/usr/local/lib/python3.10/site-packages/gcsfs/core.py", line 393, in _get_headers
    self.credentials.apply(out)
  File "/usr/local/lib/python3.10/site-packages/gcsfs/credentials.py", line 187, in apply
    self.maybe_refresh()
  File "/usr/local/lib/python3.10/site-packages/gcsfs/credentials.py", line 182, in maybe_refresh
    self.credentials.refresh(req)
  File "/usr/local/lib/python3.10/dist-packages/google/auth/compute_engine/credentials.py", line 117, in refresh
    six.raise_from(new_exc, caught_exc)
  File "<string>", line 3, in raise_from
google.auth.exceptions.RefreshError: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7fd861c2d5d0>)

TransportError                            Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/google/auth/compute_engine/credentials.py in refresh(self, request)
    110         try:
--> 111             self._retrieve_info(request)
    112             self.token, self.expiry = _metadata.get_service_account_token(

25 frames
TransportError: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7fd861c2d5d0>)

The above exception was the direct cause of the following exception:

RefreshError Traceback (most recent call last)
in _request(self, method, path, headers, json, data, *args, **kwargs)

/usr/local/lib/python3.10/dist-packages/six.py in raise_from(value, from_value)

RefreshError: ("Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Engine metadata service. Status: 404 Response:\nb''", <google.auth.transport.requests._Response object at 0x7fd861c2d5d0>)


azkalot1 commented Sep 1, 2023

Hi all!

Having the same problem here. Is there a solution?

@maclandrol
Member

Hi @azkalot1, from what we have gathered so far, this is often one of the following:

  • a temporary issue with GCP
  • existing credentials on the user's side that interfere with access to the bucket, or a misconfigured gcloud setup.

Can you list the bucket with a tool like gsutil? (gsutil ls gs://molfeat-store-prod/)

Also, how did you install molfeat: through pip or conda?


azkalot1 commented Sep 1, 2023

installed from pip

gsutil ls gs://molfeat-store-prod/
gs://molfeat-store-prod/artifacts/
gs://molfeat-store-prod/checkpoints/

@maclandrol
Member

So it seems that you do have access through gsutil, but your installed gcsfs can't access it.

Two other questions:

  1. Could you confirm the above hypothesis by running the following code in the same environment as your molfeat installation?
import datamol as dm

mapper = dm.fs.get_mapper("gs://molfeat-store-prod/")
mapper.fs.ls("gs://molfeat-store-prod/")
  2. Are you on a local machine?


azkalot1 commented Sep 1, 2023

import datamol as dm

mapper = dm.fs.get_mapper("gs://molfeat-store-prod/")
mapper.fs.ls("gs://molfeat-store-prod/")

gives

['molfeat-store-prod/artifacts', 'molfeat-store-prod/checkpoints']

2 - no, instance from Saturn Cloud (AWS instance)

@maclandrol
Member

Everything seems to indicate that it should work, in theory.

I will investigate over the weekend. In the meantime, you might want to try temporarily removing any existing GCP credentials you have and check again.
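Before removing anything, it can help to see which credential sources are present. The sketch below checks only the two file-based sources that Google's Application Default Credentials lookup consults before falling back to the Compute Engine metadata server: the GOOGLE_APPLICATION_CREDENTIALS environment variable and the gcloud file under ~/.config/gcloud (the Linux/macOS location; Windows uses a different directory). The function name find_credential_sources is hypothetical and is not part of molfeat, datamol, or gcsfs.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_credential_sources(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> dict:
    """Report file-based Google credential sources visible to this process.

    Hypothetical helper for diagnosis only; it reads nothing from the files.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home

    # Service-account key file, if the variable is set.
    key_file = env.get("GOOGLE_APPLICATION_CREDENTIALS")

    # File written by `gcloud auth application-default login`
    # (default Linux/macOS location).
    adc_file = home / ".config" / "gcloud" / "application_default_credentials.json"

    return {
        "GOOGLE_APPLICATION_CREDENTIALS": key_file,
        "gcloud_adc_file": str(adc_file) if adc_file.exists() else None,
    }


if __name__ == "__main__":
    for source, value in find_credential_sources().items():
        print(f"{source}: {value or 'not found'}")
```

If both entries come back empty, the client library has no local credentials to use, which would be consistent with the metadata-server requests seen in the traceback.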


azkalot1 commented Sep 4, 2023

I think the issue is different and the exception is a bit misleading.

from molfeat.store.modelstore import ModelStore
from molfeat.store.modelstore import ModelStoreError

store = ModelStore()
store.search(name='GPT2-Zinc480M-87M')[0]

will give

     83 def match(self, new_card: Union["ModelInfo", dict], match_only: Optional[List[str]] = None):
     84     """Compare two model card information and returns True if they are the same
     85 
     86     Args:
     87         new_card: card to search for in the modelstore
     88         match_only: list of minimum attribute that should match between the two model information
     89     """
---> 91     self_content = self.model_dump().copy()
     92     if not isinstance(new_card, dict):
     93         new_card = new_card.model_dump()

AttributeError: 'ModelInfo' object has no attribute 'model_dump'

This is the actual error, not a failure to load from the cloud.

This happens on molfeat 0.9.2.

When I install 0.8.9,

features = transformer(['CCCCC'])

works!
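For context on the AttributeError above: pydantic 2.x models expose model_dump(), while pydantic 1.x models expose dict() instead, so a call to model_dump() would fail in this way if pydantic 1.x were installed. A minimal sketch of a version-tolerant helper follows. dump_model and the two stand-in classes are hypothetical, used only for illustration, and are not molfeat code.

```python
def dump_model(obj) -> dict:
    """Return a model's fields as a dict on either pydantic major version.

    Hypothetical helper: prefers the 2.x method, falls back to the 1.x one.
    """
    if hasattr(obj, "model_dump"):
        return obj.model_dump()  # pydantic 2.x API
    return obj.dict()  # pydantic 1.x API


class V1StyleCard:
    """Stand-in mimicking a pydantic 1.x model (only has .dict())."""

    def dict(self) -> dict:
        return {"name": "GPT2-Zinc480M-87M"}


class V2StyleCard:
    """Stand-in mimicking a pydantic 2.x model (has .model_dump())."""

    def model_dump(self) -> dict:
        return {"name": "GPT2-Zinc480M-87M"}


if __name__ == "__main__":
    print(dump_model(V1StyleCard()))
    print(dump_model(V2StyleCard()))
```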

@maclandrol
Member

I haven’t had time to look yet, sorry.
We recently moved to pydantic 2.0, so maybe something in the migration didn’t work. If the error is in the model store, then not forwarding the proper error is a bug that needs fixing.
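One general way to forward the underlying error is Python's exception chaining (raise ... from err), which keeps the original exception attached as __cause__ and shows it in the traceback. The sketch below is not molfeat's code: DownloadError and load_card are hypothetical names standing in for a library-level error and the call that wraps the model store lookup.

```python
class DownloadError(Exception):
    """Hypothetical library-level error, used here for illustration only."""


def load_card(search):
    """Call `search` and re-raise any failure without hiding its cause."""
    try:
        return search()
    except Exception as err:
        # Include the original error in the message and chain it, so the
        # traceback shows both the wrapper and the real failure.
        raise DownloadError(f"could not load model card: {err!r}") from err


def broken_search():
    """Stand-in that fails the same way as the traceback in this thread."""
    raise AttributeError("'ModelInfo' object has no attribute 'model_dump'")


if __name__ == "__main__":
    try:
        load_card(broken_search)
    except DownloadError as exc:
        print("wrapper:", exc)
        print("cause:  ", repr(exc.__cause__))
```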

@maclandrol maclandrol self-assigned this Sep 4, 2023
@cwognum
Contributor

cwognum commented Sep 4, 2023

@azkalot1 Could you share your Pydantic version? Besides an overly broad except statement (here?), it seems to me that we are lacking a lower bound for the pydantic version in pyproject.toml, which causes issues when molfeat is not installed with a conda-like dependency manager.
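A quick way to report the installed versions from the same environment is the standard-library importlib.metadata module. This is a small sketch; installed_version and major_version are hypothetical helper names, and the package list is only an example.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_version(version: str) -> int:
    """Extract the major component from a version string such as '2.0.3'."""
    return int(version.split(".")[0])


if __name__ == "__main__":
    for name in ("molfeat", "pydantic", "gcsfs"):
        print(f"{name}: {installed_version(name) or 'not installed'}")
```

A pydantic major version of 1 alongside a molfeat release that calls model_dump() would match the failure mode discussed here.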

@miladrayka

Has anyone found a solution for this bug?

@peiyaoli

Same issue here.


Jessyjias commented Jun 27, 2024

Any update on this? I tried to run molfeat on Colab but I am still having this issue. I would appreciate any advice on using molfeat on Colab.

@maclandrol
Member

I will investigate this further. It seems to be something around colab notebooks and authentication in gcsfs.

@maclandrol
Member

So this is really an issue with Google Colab and gcsfs.

When running in Google Colab, it seems that authentication does not work (even though the bucket accepts anonymous requests by default). You therefore need to be authenticated to access the bucket.

The following works for me

import google.auth
from google.colab import auth

auth.authenticate_user()
credentials, project_id = google.auth.default()

then

from molfeat.trans.pretrained import PretrainedHFTransformer
featurizer = PretrainedHFTransformer(kind="ChemBERTa-77M-MLM", pooling="bert", preload=True)

Does this address the issue for everyone else? I will try to find a definitive solution so that users do not need to run the authentication step in Colab.
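For notebooks that also run outside Colab, the authentication step can be guarded so that it only runs where google.colab is importable. This is a minimal sketch of that idea, not an official molfeat workaround. authenticate_if_colab is a hypothetical name, and the import_module parameter exists only so the function can be exercised without Colab.

```python
import importlib
from typing import Callable


def authenticate_if_colab(import_module: Callable = importlib.import_module) -> bool:
    """Run Colab's user authentication when available.

    Returns True if authentication was triggered, False when the
    google.colab package cannot be imported (i.e. outside Colab).
    """
    try:
        auth = import_module("google.colab.auth")
    except ImportError:
        return False
    auth.authenticate_user()
    return True


if __name__ == "__main__":
    if authenticate_if_colab():
        print("Authenticated through Colab.")
    else:
        print("Not running in Colab; skipping authentication.")
```

Call it once at the top of the notebook, before constructing PretrainedHFTransformer.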


orgw commented Jan 16, 2025

Hi, I'm having issues:

WARNING:google.auth.compute_engine._metadata:Compute Engine Metadata server unavailable on attempt 1 of 3. Reason: timed out
