Support for Google Cloud Storage #176

Open
wants to merge 4 commits into
base: master

Conversation

perlman
Contributor

@perlman perlman commented Nov 20, 2022

This is a small experiment for writing output directly to GCS.

This is done by including google-cloud-nio. outputOptions needs to be non-null for newFileSystem().
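As a rough illustration of the non-null requirement, here is a minimal sketch that opens a file system for a gs:// bucket through the standard NIO entry point. The class and method names (GcsFileSystemSketch, bucketUri, open) are hypothetical and not taken from this PR; the sketch assumes google-cloud-nio is on the classpath at run time and that a bucket-root URI is acceptable to its provider.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper for illustration only; not code from this PR.
class GcsFileSystemSketch {

  /** Builds a bucket-root URI such as gs://my-bucket. */
  static URI bucketUri(String bucket) {
    return URI.create("gs://" + bucket);
  }

  /**
   * Opens a file system for the bucket. The env map must be non-null;
   * pass an empty map when there are no options to set.
   * If no installed provider handles the "gs" scheme (for example, when
   * google-cloud-nio is missing from the classpath), the JDK throws
   * java.nio.file.ProviderNotFoundException.
   */
  static FileSystem open(String bucket, Map<String, ?> env) throws IOException {
    Objects.requireNonNull(env, "env must be non-null; use an empty map instead");
    return FileSystems.newFileSystem(bucketUri(bucket), env);
  }
}
```

A caller with no options would pass Collections.emptyMap() rather than null.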

While functional, this is still incomplete:

  • Values in outputOptions need to be coerced into Integer/Boolean/etc. to work with java-storage-nio.
  • Some mechanism is needed for specifying alternate credentials (link to Google docs for credentials).
  • Add a test using com.google.cloud.storage.contrib.nio.testing.
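The value coercion mentioned in the first item could look roughly like the sketch below. It is only one possible approach: the class OutputOptionCoercion and its methods are hypothetical, the option keys in the usage note are made up, and which keys java-storage-nio actually expects as Integer or Boolean would need to be checked against that library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of bioformats2raw
// or google-cloud-nio.
class OutputOptionCoercion {

  /**
   * Converts a string to Boolean, Integer, or Long when it parses cleanly;
   * otherwise returns the original string unchanged.
   */
  static Object coerce(String value) {
    if (value == null) {
      return null;
    }
    String v = value.trim();
    if (v.equalsIgnoreCase("true") || v.equalsIgnoreCase("false")) {
      return Boolean.valueOf(v);
    }
    try {
      return Integer.valueOf(v);
    } catch (NumberFormatException e) {
      // not an int; fall through and try long
    }
    try {
      return Long.valueOf(v);
    } catch (NumberFormatException e) {
      // not a long either; keep it as a string
    }
    return value;
  }

  /** Applies coerce() to every value, preserving key order. */
  static Map<String, Object> coerceAll(Map<String, String> raw) {
    Map<String, Object> typed = new LinkedHashMap<>();
    for (Map.Entry<String, String> e : raw.entrySet()) {
      typed.put(e.getKey(), coerce(e.getValue()));
    }
    return typed;
  }
}
```

With made-up keys, a map of {"exampleFlag": "true", "exampleSize": "2048"} would come back holding a Boolean and an Integer. Guessing types from the string form is a shortcut; a per-key table of expected types would be stricter.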

(As a side note, we tried using Google's S3 interface. It fails on a permissions check in JZarr before writing data.)

@melissalinkert
Member

@perlman: did you have more work planned here?

@perlman
Contributor Author

perlman commented Apr 26, 2023

I've been using this unmodified for a while now. I'll bring it up-to-date with main and see where we're at.

@melissalinkert
Member

Thanks for the update, @perlman. I'm fine with taking this out of draft status, but adding a usage example to the README would be helpful for testing.

@melissalinkert
Member

@perlman, is there a simple example of how to use this feature?

@perlman
Contributor Author

perlman commented Jun 14, 2023

Whoops, I let this slip. I'll get to this today or tomorrow! (or Monday, sorry about that.)

@perlman
Contributor Author

perlman commented Jun 22, 2023

@melissalinkert I'm wondering where the right place to put an example is. I had started to modify the --help text, but it seems that it may be a bit too verbose?

The usage is very straightforward, e.g.:

bioformats2raw-0.7.1-SNAPSHOT/bin/bioformats2raw --tile_width 2048 A_2202_20_ApoB.ndpi gs://jax-zarr-playpen/data/A_2202_20_ApoB.zarr

That's it. The access credentials will come from the environment, e.g., gcloud auth login or inherited from a service account (Application Default Credentials).

The credentials must allow for read/write on the bucket. (Minimally, this can be Storage Object Creator, Storage Object Viewer and Storage Object Delete).

--output-options does not currently work. Google NIO does not seem happy with the Map<String, String> and throws an exception related to the value type. I've punted temporarily on digging into this, as it would probably require some special-case type conversion of the values.

@melissalinkert
Member

Sorry for dropping this. I was really just thinking that a few lines in the README.md with exactly what you've already noted in #176 (comment) would be sufficient documentation.

@melissalinkert
Member

@perlman: that's great, thanks. Do you want to take this out of draft so we can consider for 0.8.0? Or did you have more work planned before this is ready for review?

@perlman
Contributor Author

perlman commented Oct 24, 2023

I think this meets MVP! I've been using it to convert a bunch of NDPI files to Zarr.

At minimum, I think we should add an example of using S3 to the README (& the suggested flags used for Cloudian deployments).

"Nice to have" would be working flags for GCS (which require correct value types) and a test using com.google.cloud.storage.contrib.nio.testing, which would show functional NIO2 integration.

@perlman perlman marked this pull request as ready for review October 24, 2023 17:18

2 participants