Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOMA ingestion integration #589

Merged
merged 7 commits into from
Jul 10, 2024

Conversation

JohnMoutafis
Copy link
Contributor

@JohnMoutafis JohnMoutafis commented Jun 13, 2024

SOMA ingest implementation according to VCF standard.

Also includes some minor refactoring

@JohnMoutafis JohnMoutafis force-pushed the johnmoutafis/sc-42537/enable-soma-ingestion branch from 5b55fa2 to 333132f Compare June 13, 2024 09:13
@JohnMoutafis JohnMoutafis self-assigned this Jun 13, 2024
@JohnMoutafis JohnMoutafis marked this pull request as ready for review June 13, 2024 10:05
@JohnMoutafis JohnMoutafis changed the title SOMA ingestion: Attempt to unittest. SOMA ingestion integration Jun 17, 2024
@@ -355,12 +230,97 @@ def __getattr__(self, name: str) -> object:
logging.info("Successfully wrote data from %s to %s", input_uri, output_uri)


# Until we fully get this version of tiledb.cloud deployed server-side, we must
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this makes it impossible for me to locally test -- unless as_batch does this for me -- ?

@JohnMoutafis please confirm ... 🙏

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@johnkerl as_batch will run a function as a batch UDF on TileDB Cloud.
Have you tried running either run_ingest_workflow or ingest from your local?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JohnMoutafis I need to be able to edit a function and run it and have the remote be running my changed code, not the latest tagged version.

I haven't yet tried locally cloning this PR of yours as an experiment. I'll do so, before approving this PR, to verify that editable execution hasn't been broken by this PR.

Thanks!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did pip install -v -e ., edited locally (adding a print statement), and ran an ingest -- and confirmed that I did not see my mods being picked up by the remote executor. (More details in Slack.)

Copy link
Collaborator

@johnkerl johnkerl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM -- and thank you!!!

$ pwd
/Users/johnkerl/git/TileDB-Inc/TileDB-Cloud-Py
$ git branch
* johnmoutafis/sc-42537/enable-soma-ingestion
  kerl/512-comment
  kerl/append
  kerl/mapper
  kerl/mapper-bad
  kerl/mapper2
  kerl/multisoma
  kerl/probe
  kerl/probe-factor
  kerl/soma-ingest-append-mode
  kerl/vfs-factor
  kerl/vfs-refactor
  main
$ git diff
diff --git a/src/tiledb/cloud/soma/ingest.py b/src/tiledb/cloud/soma/ingest.py
index d4bb891c..4e536cb2 100644
--- a/src/tiledb/cloud/soma/ingest.py
+++ b/src/tiledb/cloud/soma/ingest.py
@@ -215,6 +215,8 @@ def ingest_h5ad(

     with tiledb.VFS(ctx=soma_ctx.tiledb_ctx).open(input_uri) as input_file:
         if dry_run:
+            logging.info("LOGGING.INFO HERE")
+            print("PRINT STATEMENT HERE")
             logging.info("Dry run for %s to %s", input_uri, output_uri)
             return

and https://cloud.tiledb.com/tasks/38c3e021-bf5b-4bd4-86af-2d70799d5511 correctly has

Logs
PRINT STATEMENT HERE
INFO:root:LOGGING.INFO HERE
INFO:root:Dry run for s3://tiledb-johnkerl/s/a/pbmc3k_processed.h5ad to tiledb://johnkerl-tiledb/s3://tiledb-johnkerl/scratch/csi-p3kp-20240709-111429

💥

@JohnMoutafis JohnMoutafis merged commit d3bef52 into main Jul 10, 2024
18 checks passed
@JohnMoutafis JohnMoutafis deleted the johnmoutafis/sc-42537/enable-soma-ingestion branch July 10, 2024 07:13
@JohnMoutafis
Copy link
Contributor Author

Thank you John!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants