regarding multiple opens of the same h5 file #15
Comments
mikejiang changed the title from "regarding multiple read access to the same h5 file" to "regarding multiple opens of the same h5 file" on Sep 27, 2019
mikejiang pushed a commit that referenced this issue on Nov 8, 2019
We also need to keep concurrent reads of the same h5 file from failing when it is loaded from parallel workers:
> data("GvHD")
> gs <- GatingSet(GvHD[1:4])
> tmp <- tempfile()
> save_gs(gs, tmp)
Done
To reload it, use 'load_gs' function
> f <- function(i,path){
+ gs <- load_gs(path)
+ nrow(gh_pop_get_data(gs[[i]]))
+ }
> mclapply(1:4, f, path = tmp)
error #000: in H5Fopen(): line 509
major: File accessibilty
minor: Unable to open file
error #001: in H5F_open(): line 1567
major: File accessibilty
minor: Unable to open file
error #002: in H5FD_lock(): line 1640
major: Virtual File Layer
minor: Can't update object
error #003: in H5FD_sec2_lock(): line 959
major: File accessibilty
minor: Bad file ID accessed
Basically, delay loading all the metadata from the h5 file until it is requested.
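As a rough illustration of what delaying the metadata load until it is requested could look like, here is a minimal sketch in base R. It is not the actual cytolib or flowWorkspace code: `lazy_meta` is a hypothetical helper, and an `.rds` file stands in for the h5 file so the example runs without any HDF5 dependency.

```r
# Hypothetical helper: wraps a file path and defers reading until get() is called.
lazy_meta <- function(path) {
  cache <- NULL
  list(
    is_loaded = function() !is.null(cache),
    get = function() {
      if (is.null(cache)) {
        # readRDS opens the file, reads it, and closes it again
        cache <<- readRDS(path)
      }
      cache
    }
  )
}

# Stand-in for on-disk metadata (an .rds file, not a real h5 file).
meta_path <- tempfile(fileext = ".rds")
saveRDS(list(sample = "s1", channels = c("FSC-H", "SSC-H")), meta_path)

m <- lazy_meta(meta_path)
m$is_loaded()      # nothing has been read yet
m$get()$channels   # first access triggers the read
m$is_loaded()      # the result is now cached in memory
```

Because creating the wrapper does not touch the file, several workers can each construct one without any of them opening the file until a value is actually needed.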
mikejiang pushed a commit that referenced this issue on Dec 6, 2019
mikejiang pushed a commit to RGLab/flowWorkspace that referenced this issue on Dec 6, 2019
With 50da439:
> mclapply(1:4, function(i){
+ gs <- load_gs(tmp, sel = i)
+ nrow(gh_pop_get_data(gs[[1]]))
+ })
[[1]]
[1] 3420
[[2]]
[1] 3405
[[3]]
[1] 3435
[[4]]
[1] 8550
beautiful..
In `ncdfFlow`, we simply close the `h5` file handler immediately after each read/write operation, so we never experienced any issues. Now `cytolib` keeps the `h5` file handler open during the life cycle of the `H5CytoFrame` object to maintain the h5 cache and speed up subsequent IO.

This worked fine for multiple opens within the same process (e.g. the same R session), even if the file is opened with the `write` flag.

However, when a separate process (e.g. another R session or a command line tool) tries to open the same file, it fails with either the `H5F_ACC_RDONLY` or the `H5F_ACC_RDWR` flag, since the file was locked by another `h5 lib` instance.

If the initial open was `H5F_ACC_RDONLY`, then it seems to succeed for both processes. So I guess the inter-process lock is only applied when the file is opened with `write` permission.

Even though such a locking mechanism makes sense, the behavior of allowing multiple opens within the same process is somewhat misleading. It could be that `h5 lib` is inherently designed for `single-process` applications and thus no concurrent IO is expected within the same process. Yet the same assumption can't hold in a multi-process scenario, which is why `h5` prohibits it.

Given the statement from the `H5` API specs, and since in our use cases we can't guarantee that the data (i.e. `GatingSet`) is always initially opened as read-only, the best we can do is follow `ncdfFlow`'s practice of not maintaining the state of the `H5File` handler.
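The `ncdfFlow` practice of closing the file handler right after each operation can be sketched in base R as follows. This is only an illustration under stated assumptions: `read_values` is a hypothetical helper, and a plain binary file stands in for the h5 file, so no HDF5 locking is actually exercised here.

```r
library(parallel)

# Stand-in data file: ten integers in a plain binary file (not a real h5 file).
data_path <- tempfile()
writeBin(1:10, data_path)

# Hypothetical reader: the connection lives only for the duration of one call.
read_values <- function(path, n) {
  con <- file(path, open = "rb")
  on.exit(close(con))        # always release the handle, even on error
  readBin(con, what = "integer", n = n)
}

# Each forked worker opens, reads, and closes independently.
# mclapply relies on forking, so run this on a Unix-like system.
res <- mclapply(1:4, function(i) sum(read_values(data_path, n = 10)))
```

The trade-off is the one described above: no handle outlives a single read, so nothing stays locked between operations, but the cache that a long-lived handle provides for subsequent IO is given up.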