ParticleID #452
Add 1Byte H5Part support
initialize centers_ to valid state in FocusedOctree
round up GPU buffers to page size; add device sync after device realloc for ROCm 5.x; compute ns stack size based on maxNumActiveBlocks; make host-particlesdata allocation behave like the gpu version
Move groups into separate compilation unit; launch constant grid sizes for SPH kernels
allow nullptr groupDt in momentum_energy store groupDt in bdt prop compute minGroupDt find minDt by sorting groupDt added missing sqrt in acc timestep convert turbulence prop to block-dt scheme compute number of rungs and rung boundaries implemented extractGroup added butterfly pattern changed tsRung to activeRung added rung counters to ve-bdt prop added activeGroups allow positionUpdate to back-propagate added driftPositions converted integrator to group indexing converted updateSmoothingLength to group indexing rename driftPositions to driftPositionsGpu print diagnostic info Add rung field to bdt prop create groups for all rungs store particle rungs before domain sync added rungs to integrator add rungs to driftPositions and fix compilation initialize timesteps in ve-bdt prop activate rungs in integrator activate force-groups fix rungRanges grouping change numRungs convention only output observables when all rungs are synced made energyUpdate time reversible added u and temp to drift kernel fix file output for rho,p,curlv load/store block timestep prevent file output when rungs are not in sync update c.o.m on partial sync guard against empty groups replicate rungs in file init with particle splitting restore dt_m1 when splitting particles limit maxDtIncrease and set rungs to zero after splitting add turbulence prop without block-ts store rung groups as views rename forceGroups to time sorted groups update group time steps of active rungs ts renaming rearranged ts/rung/group files pass time steps for all rungs to drift and integrate remove init/restoration of prevTimestep prepare Timestep for variable length substeps removed references to dt * 2^rung eliminate bk variable renamed minDt to nextDt limit each substep to the global particle limit add substep groupDt sort store rungs at each substep allow groups to change rungs update sliced views only once simplify rung updates reuse global minDt function reuse findRungRanges fix init dt in file split init add 
safetySteps after splitting
only update h for active rungs
warp-coalesced computePositionsKernel; warp-coalesced driftPositionsKernel; warp-coalesced stirring kernel; warp-coalesced store rung kernel
Please rebase on develop and squash into a single commit once done.
main/src/init/evrard_init.hpp
Outdated
@@ -179,6 +179,8 @@ class EvrardGlassSphere : public ISimInitializer<Dataset>

initEvrardFields(d, settings_);

generateParticleIDs(d, rank, numRanks);
generateParticleIDs just initializes an additional field. It can be called in initEvrardFields, where all the other fields are initialized as well. Same for the other test cases.
This requires passing rank and numRanks to the init*Fields functions. Should I pass these parameters as well?
You're already using the MPI communicator in generateParticleIDs. Might as well use it to obtain rank and numRanks as well. You can take the communicator as an argument to generateParticleIDs and pass MPI_COMM_WORLD at the call site. (One less function to remove MPI_COMM_WORLD from later).
main/src/init/utils.hpp
Outdated
@@ -120,6 +120,29 @@ void readFileAttributes(InitSettings& settings, const std::string& settingsFile,
}
}

//! @brief generate particle IDs at the beginning of the simulation initialization
template<class Dataset>
void generateParticleIDs(Dataset& d, int rank, int numRanks)
This function doesn't need the full dataset, only a std::span<uint64_t> id. At the call site d.id can be passed, which will either have size 0 if ids are (temporarily) disabled or have size numLocalParticles.
main/src/init/utils.hpp
Outdated
template<class Dataset>
void generateParticleIDs(Dataset& d, int rank, int numRanks)
{
std::vector<size_t> ranksLocalParticles(numRanks);
Use uint64_t here to avoid compatibility issues with macOS.
main/src/init/utils.hpp
Outdated
std::vector<size_t> ranksLocalParticles(numRanks);
size_t localNumRanks = d.x.size();
// fill ranksLocalParticles with the number of particles per rank
MPI_Allgather(&localNumRanks, 1, MPI_UNSIGNED_LONG, ranksLocalParticles.data(), 1, MPI_UNSIGNED_LONG,
Use MpiType<uint64_t>{} here instead of MPI_UNSIGNED_LONG.
main/src/init/utils.hpp
Outdated
for (size_t i = 0; i < d.x.size(); i++)
{
d.id[i] = offset + i;
}
These for loops are equivalent to:
std::exclusive_scan(ranksLocalParticles.begin(), ranksLocalParticles.end(), ranksLocalParticles.begin(), uint64_t(0));
std::iota(id.begin(), id.end(), ranksLocalParticles[rank]);
This will also work if id is not actually allocated, which is needed if we want to make it an optional feature later or temporarily disable it by removing id from the list of conserved fields.
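To make the suggestion above concrete, here is a minimal sketch of the scan-plus-iota approach without any MPI dependency. The helper name `assignParticleIDs` and the parameter `particlesPerRank` are hypothetical; in the PR the per-rank counts would come from `MPI_Allgather`. The review proposes a `std::span<uint64_t>` argument; this sketch takes a `std::vector<uint64_t>&` instead only so that it compiles as C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of the PR): assigns globally unique,
// contiguous IDs to the local particles of one rank.
// particlesPerRank[r] is assumed to hold the local particle count of rank r.
// It is taken by value because the scan below overwrites it.
inline void assignParticleIDs(std::vector<uint64_t>& id,
                              std::vector<uint64_t> particlesPerRank,
                              int rank)
{
    assert(rank >= 0 && static_cast<std::size_t>(rank) < particlesPerRank.size());

    // In-place exclusive prefix sum: element r becomes the total number of
    // particles on ranks 0..r-1, i.e. the first ID owned by rank r.
    std::exclusive_scan(particlesPerRank.begin(), particlesPerRank.end(),
                        particlesPerRank.begin(), uint64_t(0));

    // Fill id with consecutive values starting at this rank's offset.
    // If id is empty (IDs disabled), this writes nothing.
    std::iota(id.begin(), id.end(), particlesPerRank[rank]);
}
```

With counts of 3, 2 and 4 particles on ranks 0, 1 and 2, rank 1 would receive IDs 3 and 4, and rank 2 would receive IDs 5 through 8. Passing an empty `id` leaves it untouched, which is the property the review asks for.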
Initialise the particle IDs at the beginning of the simulation. The id field is conserved throughout the simulation and does not change during execution.