Expose chunk requests to python WIP #173

Open
wants to merge 7 commits into
base: master


constantinpape
Owner

Chunks corresponding to a request can be read from python:

chunk_ids = ds.get_chunks_in_request(np.s_[10:20, 35:70])

The bounding box of the overlap between the request and each chunk can also be computed:

chunk_ids, chunk_slices = ds.get_chunks_in_request(np.s_[10:20, 35:70], return_chunk_slices=True)
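
To illustrate what such a query computes, here is a minimal pure-Python sketch of the chunk-grid arithmetic. It is not the z5 implementation: the helper name `chunks_in_request`, the dataset shape `(100, 100)` and the chunk shape `(10, 10)` are assumptions made for this example, and it returns grid positions rather than whatever chunk id format z5py uses.

```python
import itertools
import numpy as np


def chunks_in_request(request, shape, chunks):
    """Hypothetical helper (not part of z5py): list the grid position of every
    chunk touched by `request`, plus the overlap of the request with that
    chunk, expressed as slices in dataset coordinates."""
    bounds = []
    for sl, dim in zip(request, shape):
        # resolve None / negative bounds against the axis length
        start, stop, step = sl.indices(dim)
        if step != 1 or stop <= start:
            raise ValueError("only non-empty slices with step 1 are handled")
        bounds.append((start, stop))

    # first and last chunk index touched along each axis
    per_axis = [range(b // c, (e - 1) // c + 1)
                for (b, e), c in zip(bounds, chunks)]

    chunk_ids, chunk_slices = [], []
    for pos in itertools.product(*per_axis):
        # clip the chunk's extent to the requested region
        overlap = tuple(slice(max(b, p * c), min(e, (p + 1) * c))
                        for (b, e), p, c in zip(bounds, pos, chunks))
        chunk_ids.append(pos)
        chunk_slices.append(overlap)
    return chunk_ids, chunk_slices


# assumed dataset geometry, for illustration only
ids, slices = chunks_in_request(np.s_[10:20, 35:70],
                                shape=(100, 100), chunks=(10, 10))
```

With these assumed shapes the request touches one chunk row and four chunk columns, so `ids` holds four grid positions and `slices` the matching clipped regions.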

@MatthewBM

MatthewBM commented Nov 8, 2020

Does this run and look like a good test script for you?

import z5py
import numpy as np
import os
from shutil import copyfile

filename = '/tmp/test_chunk_requests' + '.n5'
f = z5py.File(filename)

test_shape = (30, 10, 5, 20, 3, 32)
test_data = np.random.rand(*test_shape)
chunk_shape = (10, 1, 1, 12, 1, 6) 


# remove the dataset if it already exists before creating it
if os.path.isdir(filename + '/dim_test'):
    os.system('rm -r ' + filename + '/dim_test')

dset = f.create_dataset('dim_test',shape = test_shape,chunks = chunk_shape,
                        data = test_data)

read = [2, 8, 3, 4, 0, 1, 16, 19, 2, 3, 11, 19]
subset_test = test_data[read[0]:read[1],
                           read[2]:read[3],
                           read[4]:read[5],
                           read[6]:read[7],
                           read[8]:read[9],
                           read[10]:read[11]]


chunk_ids = dset.get_chunks_in_request(np.s_[read[0]:read[1],
                                           read[2]:read[3],
                                           read[4]:read[5],
                                           read[6]:read[7],
                                           read[8]:read[9],
                                           read[10]:read[11]])

compare_file = filename + '_compare' + '.n5'
if os.path.isdir(compare_file):
    os.system('rm -r ' + compare_file)
# create a new empty data container
f = z5py.File(compare_file)
# pass the dtype explicitly, since no data is given here to infer it from
dset = f.create_dataset('dim_test', shape=test_shape, chunks=chunk_shape,
                        dtype=test_data.dtype)
# copy just those chunks into container
for ids in chunk_ids:
    # make the chunk dir if it's not there
    os.makedirs(os.path.dirname(compare_file + '/' + ids), exist_ok=True)
    copyfile(filename + '/' + ids, compare_file + '/' + ids)

# perform the subset chunk read
compare_data = dset[read[0]:read[1],
                           read[2]:read[3],
                           read[4]:read[5],
                           read[6]:read[7],
                           read[8]:read[9],
                           read[10]:read[11]]    
    
if not np.all(subset_test==compare_data):
    raise ValueError('The chunk id list did not produce an array that matched the source')
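
The same six-axis request is spelled out three times in the script above. As a possible simplification, the request can be built once from the `read` list and reused. This is only a sketch: the name `request` is introduced here for illustration, and it assumes that `read` holds consecutive (begin, end) pairs, one pair per axis, as it does in the script.

```python
import numpy as np

read = [2, 8, 3, 4, 0, 1, 16, 19, 2, 3, 11, 19]

# pair up consecutive (begin, end) entries: one slice per axis
request = tuple(slice(b, e) for b, e in zip(read[::2], read[1::2]))

# same array shape as test_shape in the script
test_data = np.random.rand(30, 10, 5, 20, 3, 32)

# indexing with the tuple is equivalent to writing the slices out by hand
subset_test = test_data[request]
```

Since `request` is an ordinary tuple of slices, it can be passed wherever the script uses the hand-written `np.s_[...]` expression or index.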

@constantinpape
Owner Author

constantinpape commented Nov 8, 2020

Yes, it works with some minor changes and I think this is a good test.

@MatthewBM are you building z5 yourself or are you using the version from conda-forge?

I need to fix a few things before merging into master and then some more before updating conda-forge, I will try to work on this in the next week or two.
