Replies: 1 comment
-
TL;DR: Yes, if you are using BP5, which reads multiple blocks of data in threads when possible.

The granularity of a single read operation is one block of a variable on disk (a global array written by multiple writers consists of multiple blocks of data on disk, where each block is a logically contiguous block in a file). If your single read request ( If you read a large array produced by many writers, BP5 will use threads to read its blocks faster than if the array were a single block.

You can make sure that more data is read at once if you use Deferred read mode for multiple variables and call The gain is not large, though.

The number of threads is set to the number of logical cores of the compute node divided by the number of MPI ranks of the reading application on that node, and is capped at 16 by default (because we never saw an improvement from using more threads, even on a local NVMe drive). You can manually set the parameter
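To make the default thread-count rule above concrete, here is a minimal sketch of the arithmetic in plain Python. It does not call ADIOS2. The function name `estimate_read_threads`, the use of integer division, and the floor of one thread are assumptions made for illustration; only the "cores divided by ranks on the node, capped at 16" rule comes from the answer.

```python
import os

DEFAULT_THREAD_CAP = 16  # default upper limit quoted in the answer above


def estimate_read_threads(logical_cores, ranks_on_node, cap=DEFAULT_THREAD_CAP):
    """Estimate the default number of read threads per reader rank.

    Hypothetical helper: mirrors the rule described in the answer,
    not the actual BP5 source code.
    """
    if logical_cores < 1 or ranks_on_node < 1:
        raise ValueError("logical_cores and ranks_on_node must be >= 1")
    # Assumption: the division is truncated to an integer.
    per_rank = logical_cores // ranks_on_node
    # Assumption: a rank always gets at least one thread.
    return max(1, min(cap, per_rank))


if __name__ == "__main__":
    cores = os.cpu_count() or 1  # logical cores visible on this node
    for ranks in (1, 4, 32):
        print(f"{ranks} reader rank(s) on node -> "
              f"{estimate_read_threads(cores, ranks)} thread(s) per rank")
```

Under these assumptions, a 128-core node with 4 reader ranks hits the cap of 16 threads per rank, while a 64-core node with 8 ranks gets 8 threads per rank, so packing more reader ranks onto a node leaves fewer threads for each.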
-
Aggregation etc.