Describe the bug
An exception, cudaErrorInitializationError: initialization error, occurs within the multiprocessing pool when using GPU/CUDA on two or more files. This happens in the feature_finding step but could potentially affect any point in the workflow where CuPy is used.
To Reproduce
Environment: nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Nov__3_21:07:56_CDT_2017
Cuda compilation tools, release 9.1, V9.1.85
Using cupy-cuda115==10.2.0.
Script: following the convention described by test_gpu_.py,
def main():
    global alphapept
    alphapept.performance.set_compilation_mode('cuda')
    alphapept.performance.set_worker_count(30)
    importlib.reload(alphapept.feature_finding)
    settings = load_settings('/home/ubuntu/apps/alphapept/test_settings.yaml')
    r = alphapept.interface.import_raw_data(settings)
    r = alphapept.interface.feature_finding(settings)
where test_settings.yaml uses all default settings, with two or more files listed under experiment/file_paths
Error
For three separate files
2022-03-08 18:57:55> No *.hdf file with features found for /mnt/EXP21155/EXP21155_2021ms0603X7_A.ms_data.hdf. Adding to feature finding list.
2022-03-08 18:57:55> Feature finding on /mnt/EXP21155/EXP21155_2021ms0603X7_A.raw
2022-03-08 18:57:55> Hill extraction with centroid_tol 8 and max_gap 2
2022-03-08 18:57:55> Feature finding of file /mnt/EXP21155/EXP21155_2021ms0603X7_A.raw failed. Exception cudaErrorInitializationError: initialization error
2022-03-08 18:57:55> Processing of /mnt/EXP21155/EXP21155_2021ms0603X7_A.raw for step find_features failed. Exception cudaErrorInitializationError: initialization error
2022-03-08 18:57:55> No *.hdf file with features found for /mnt/EXP21155/EXP21155_2021ms0609X26_A.ms_data.hdf. Adding to feature finding list.
2022-03-08 18:57:56> Feature finding on /mnt/EXP21155/EXP21155_2021ms0609X26_A.raw
2022-03-08 18:57:56> Hill extraction with centroid_tol 8 and max_gap 2
2022-03-08 18:57:56> Feature finding of file /mnt/EXP21155/EXP21155_2021ms0609X26_A.raw failed. Exception cudaErrorInitializationError: initialization error
2022-03-08 18:57:56> Processing of /mnt/EXP21155/EXP21155_2021ms0609X26_A.raw for step find_features failed. Exception cudaErrorInitializationError: initialization error
A Solution?
After some research, I was able to find the source of the problem. The combination of multiprocessing pools and CUDA is a little tricky. In short, we cannot use the CuPy API in the parent process before the worker processes are forked, because a CUDA context initialized in the parent cannot be reused in a forked child. I'm not exactly sure where this happens in the code, but I expect it's in some of the settings management. The solution I found was to call multiprocessing.set_start_method('spawn') ('forkserver' also works).
The relative speed and stability of the three start methods (fork, spawn, and forkserver) are up for debate, and I'm not sure whether we can still obtain a performance advantage from the GPU if we cannot fork processes. I'm not an expert on multiprocessing, though.
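To make the workaround concrete, here is a minimal sketch of a pool whose workers are started with 'spawn', so each worker begins in a fresh interpreter and initializes CUDA on its own. The names get_cuda_safe_context, process_file, and run_on_files are hypothetical and are not part of alphapept; process_file only stands in for the real per-file feature finding.

```python
import multiprocessing as mp


def get_cuda_safe_context():
    """Return a multiprocessing context that starts workers with 'spawn'.

    Using a context avoids changing the global start method, which
    multiprocessing.set_start_method() would do.
    """
    return mp.get_context("spawn")


def process_file(path):
    """Hypothetical per-file worker (placeholder, not alphapept code)."""
    # Import CuPy inside the worker so that nothing CUDA-related is
    # touched in the parent process before the workers exist.
    try:
        import cupy  # noqa: F401
        backend = "cupy"
    except ImportError:
        backend = "cpu"
    # Real GPU work on `path` would go here.
    return path, backend


def run_on_files(paths, n_workers=2):
    """Map process_file over paths using spawned worker processes."""
    ctx = get_cuda_safe_context()
    with ctx.Pool(processes=n_workers) as pool:
        return pool.map(process_file, paths)
```

With 'spawn', run_on_files should be called from under an `if __name__ == "__main__":` guard in a script, since each worker re-imports the main module on startup. Whether this costs more than it gains on the GPU would still need to be measured.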
I would like to know whether you can reproduce this problem and suggest a fix. Thank you.
Hi,
I had never tested analyzing multiple files on GPU, so this could indeed be an issue, and it may well not work out of the box. Historically, the GPU part started as a way to improve performance on a single file. A workaround here could be to launch multiple Docker instances, each on a single file, and then combine the results later in another instance.
However, if anyone has good ideas to get the multiprocessing to work or wants to tackle this, I am all ears.
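As a rough illustration of the one-file-per-instance idea, the sketch below splits one settings dictionary into several, each listing a single file. It assumes the loaded settings are a nested dict with the file list under experiment/file_paths, as in the report above; split_settings_per_file is a hypothetical helper, not an alphapept function.

```python
import copy


def split_settings_per_file(settings):
    """Return one deep-copied settings dict per entry in file_paths."""
    per_file = []
    for path in settings["experiment"]["file_paths"]:
        single = copy.deepcopy(settings)  # leave the original untouched
        single["experiment"]["file_paths"] = [path]
        per_file.append(single)
    return per_file
```

Each resulting dict could then be written to its own YAML file and handed to a separate instance; combining the per-file results afterwards is not covered here.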