Add file create data appending #1163
Open
t-b wants to merge 2 commits into NeurodataWithoutBorders:dev from t-b:add-file-create-data-appending
Please also test the second round trip, i.e., close the file, re-open it in read mode, and confirm that the change to `file_create_date` is still present. I am concerned that the `file_create_date` dataset is not chunked and therefore cannot grow, or that the change is not saved for some reason.
@rly I've pushed something but I need to review that again tomorrow.
@rly You were right. The additional entry does not reach the file.
Questions:
To fix that, the dataset has to be chunked. @ajtritt, is there a way to chunk only the `NWBFile.file_create_date` dataset? I am also in favor of blanket chunking all datasets in NWB...
To use changes in a newer hdmf version, the changes must have been released on PyPI. The recent "mode" function addition isn't released yet, but we could do that this week if these issues are pressing.
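To illustrate why chunking matters here, the following is a minimal sketch that uses h5py directly rather than pynwb or hdmf. The dataset names `fixed` and `growable`, the file name, and the timestamps are made up for the demonstration; it only shows the HDF5 behavior under discussion (a contiguous dataset cannot be resized, a chunked one with an unlimited `maxshape` can) and then checks the second round trip by re-opening the file in read mode.

```python
import os
import tempfile

import h5py


def read_strings(path, name):
    """Re-open the file read-only and return the dataset as a list of str."""
    with h5py.File(path, "r") as f:
        return [v.decode("utf-8") if isinstance(v, bytes) else v for v in f[name][:]]


# Hypothetical file and values, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "resizable_demo.h5")
str_dtype = h5py.string_dtype(encoding="utf-8")
first, second = "2019-01-01T00:00:00", "2019-06-01T12:00:00"

with h5py.File(path, "w") as f:
    # Contiguous layout (no chunks): the shape is fixed at creation time.
    fixed = f.create_dataset("fixed", shape=(1,), dtype=str_dtype)
    fixed[0] = first
    # Chunked layout with an unlimited first dimension: the dataset can grow.
    growable = f.create_dataset(
        "growable", shape=(1,), maxshape=(None,), chunks=True, dtype=str_dtype
    )
    growable[0] = first

# Append mode: try to grow both datasets by one entry.
with h5py.File(path, "a") as f:
    try:
        f["fixed"].resize((2,))
        fixed_can_grow = True
    except TypeError:
        # h5py refuses to resize datasets that are not chunked.
        fixed_can_grow = False
    dset = f["growable"]
    new_len = dset.shape[0] + 1
    dset.resize((new_len,))
    dset[new_len - 1] = second

# Second round trip: close, re-open in read mode, confirm the new entry.
stored = read_strings(path, "growable")
```

How pynwb would request this layout for `file_create_date` through the object mapper is exactly the open question in this thread, so the sketch does not attempt to answer it.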
A new hdmf would be nice!
How do I force the stored dataset to be chunked?
I tried
but that does not work.
Not sure if I have the right solution for you, but a couple of thoughts:
Ultimately, I think the core issue is that we want specific datasets to be written in a resizable fashion (so they can grow). In the case of HDF5 that requires chunking, but for other backends that may or may not be the case. In that vein, I think what we may need is a generic (backend-agnostic) way to provide write hints, which in this case would say "make this dataset resizable". I'm wondering whether we could add I/O hints on the builder for this and, in the object mapper, a way to ask for I/O hints for fields. It would then be up to the backend to decide what to do with those I/O hints.
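A rough sketch of what such backend-agnostic write hints could look like, in plain Python. Everything here is hypothetical: `IOHints`, `HintedDatasetBuilder`, and the two `*_create_kwargs` functions are invented for illustration and are not part of hdmf or pynwb. The idea is only that the builder carries an abstract hint and each backend translates it into its own terms.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class IOHints:
    """Hypothetical, backend-agnostic write hints attached to a builder."""
    resizable: bool = False


@dataclass
class HintedDatasetBuilder:
    """Hypothetical stand-in for a dataset builder that carries I/O hints."""
    name: str
    data: Any
    hints: IOHints = field(default_factory=IOHints)


def hdf5_create_kwargs(shape: Tuple[int, ...], hints: IOHints) -> Dict[str, Any]:
    """Translate hints into HDF5 terms: resizable means chunked with an
    unlimited first dimension."""
    if not hints.resizable:
        return {}
    return {"chunks": True, "maxshape": (None,) + tuple(shape[1:])}


def always_growable_create_kwargs(shape: Tuple[int, ...], hints: IOHints) -> Dict[str, Any]:
    """A hypothetical backend whose arrays can always grow ignores the hint."""
    return {}


# The object mapper would decide which fields get which hints; here the
# hint is set by hand for a field that is expected to grow over time.
builder = HintedDatasetBuilder(
    name="file_create_date",
    data=["2019-01-01T00:00:00"],
    hints=IOHints(resizable=True),
)
hdf5_kwargs = hdf5_create_kwargs((len(builder.data),), builder.hints)
other_kwargs = always_growable_create_kwargs((len(builder.data),), builder.hints)
```

The point of keeping the hint abstract is that the schema-level intent ("this dataset grows") stays out of backend-specific code, which is the concern raised above about an HDF5-only fix.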
@oruebel I totally agree that an HDF5-specific solution is the wrong thing to do here. But up to now I don't have any solution at all.
I'm starting to work on this again.
@oruebel
Which implicit part are you concerned about: "making the dataset chunked" or "adding new entries to the `file_create_date` dataset"? The latter is how nwb-schema says `file_create_date` should be handled.
Yes, that would be required. Of course, my hack above is a hack and cannot be merged as is, but I first wanted to get something working and then generalize the solution. I just saw that hdmf.builders.DatasetBuilder has a `chunks` argument as well.
I seem to not understand how the object mappers work. According to https://pynwb.readthedocs.io/en/stable/overview_software_architecture.html?highlight=architecture#objectmapper I would think that
should work, but it doesn't. Any hints?