-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SOMA ingestion integration #589
SOMA ingestion integration #589
Conversation
5b55fa2
to
333132f
Compare
@@ -355,12 +230,97 @@ def __getattr__(self, name: str) -> object: | |||
logging.info("Successfully wrote data from %s to %s", input_uri, output_uri) | |||
|
|||
|
|||
# Until we fully get this version of tiledb.cloud deployed server-side, we must |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing this makes it impossible for me to locally test -- unless as_batch
does this for me -- ?
@JohnMoutafis please confirm ... 🙏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@johnkerl as_batch
will run a function as a batch UDF on TileDB Cloud.
Have you tried running either run_ingest_workflow
or ingest
from your local?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JohnMoutafis I need to be able to edit a function and run it and have the remote be running my changed code, not the latest tagged version.
I haven't yet tried locally cloning this PR of yours as an experiment. I'll do so, before approving this PR, to verify that editable execution hasn't been broken by this PR.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did pip install -v -e .
, edited locally (adding a print statement), and ran an ingest -- and confirmed that I did not see my mods being picked up by the remote executor. (More details in Slack.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM -- and thank you!!!
$ pwd
/Users/johnkerl/git/TileDB-Inc/TileDB-Cloud-Py
$ git branch
* johnmoutafis/sc-42537/enable-soma-ingestion
kerl/512-comment
kerl/append
kerl/mapper
kerl/mapper-bad
kerl/mapper2
kerl/multisoma
kerl/probe
kerl/probe-factor
kerl/soma-ingest-append-mode
kerl/vfs-factor
kerl/vfs-refactor
main
$ git diff
diff --git a/src/tiledb/cloud/soma/ingest.py b/src/tiledb/cloud/soma/ingest.py
index d4bb891c..4e536cb2 100644
--- a/src/tiledb/cloud/soma/ingest.py
+++ b/src/tiledb/cloud/soma/ingest.py
@@ -215,6 +215,8 @@ def ingest_h5ad(
with tiledb.VFS(ctx=soma_ctx.tiledb_ctx).open(input_uri) as input_file:
if dry_run:
+ logging.info("LOGGING.INFO HERE")
+ print("PRINT STATEMENT HERE")
logging.info("Dry run for %s to %s", input_uri, output_uri)
return
and https://cloud.tiledb.com/tasks/38c3e021-bf5b-4bd4-86af-2d70799d5511 correctly has
Logs
PRINT STATEMENT HERE
INFO:root:LOGGING.INFO HERE
INFO:root:Dry run for s3://tiledb-johnkerl/s/a/pbmc3k_processed.h5ad to tiledb://johnkerl-tiledb/s3://tiledb-johnkerl/scratch/csi-p3kp-20240709-111429
💥
Thank you John! |
SOMA ingest implementation according to VCF standard.
Also includes some minor refactoring