Mantevo miniAMR reference proxy application
arm-hpc/miniAMR
miniAMR mini-application

--------------------------------------
Contents of this README file:
1. miniAMR overview
2. miniAMR versions
3. building miniAMR
4. running miniAMR
5. notes about the code
--------------------------------------

--------------------------------------
1. miniAMR overview

miniAMR applies a stencil calculation on a unit cube computational domain,
which is divided into blocks. The blocks all have the same number of cells
in each direction and communicate ghost values with neighboring blocks. With
adaptive mesh refinement, the blocks can represent different levels of
refinement in the larger mesh. Neighboring blocks can be at the same level
or one level different, which means that the length of cells in neighboring
blocks can differ by at most a factor of two in each direction. The
calculation on the variables in each cell is an averaging of the values in
the chosen stencil. The refinement and coarsening of the blocks is driven by
objects that are pushed through the mesh. If a block intersects with the
surface or the volume of an object, then that block can be refined. There is
also an option to uniformly refine the mesh. Each cell contains a number of
variables, each of which is evaluated independently.

--------------------------------------
2. miniAMR versions:

- miniAMR_ref: reference version: self-contained MPI-parallel.
- miniAMR_serial: serial version of the reference version.

-------------------
3. Building miniAMR:

To make the code, type 'make' in the directory containing the source. The
enclosed Makefile.mpi is configured for a general MPI installation. Other
compilers or machines will need changes to the CFLAGS variable to match the
flags available for the compiler being used.

-------------------
4. Running miniAMR:

miniAMR can be run like this:

   % <mpi-run-command> ./miniAMR.x

where <mpi-run-command> varies from system to system but usually looks
something like 'mpirun -np 4' or similar.
Execution is then driven entirely by the default settings, as configured in
default-settings.h. Options may be listed using

   % ./miniAMR.x --help

The program accepts several arguments on the command line. The list of
arguments and their defaults is as follows:

   --nx - block size in x
   --ny - block size in y
   --nz - block size in z
      These control the size of the blocks in the mesh. All of these need
      to be even and greater than zero. The default is 10 for each variable.

   --init_x - initial blocks in x
   --init_y - initial blocks in y
   --init_z - initial blocks in z
      These control the number of blocks on each processor in the initial
      mesh. These need to be greater than zero. The default is 1 block in
      each direction per processor. The initial mesh is a unit cube
      regardless of the number of blocks.

   --reorder - ordering of blocks
      This controls whether the blocks are ordered by the RCB algorithm or
      by a natural ordering of the processors. The default is 1, which
      selects the RCB ordering; 0 selects the natural ordering.

   --npx - number of processors in the x direction
   --npy - number of processors in the y direction
   --npz - number of processors in the z direction
      These control the number of processors in each direction. The product
      of these numbers has to equal the number of processors being used.
      The default is 1 processor in each direction.

   --max_blocks - maximum number of blocks per processor
      The maximum number of blocks used per processor. This is the number
      of blocks that will be allocated at the start of the run, and the code
      will fail if this number is exceeded. The default is 500 blocks.

   --num_refine - number of levels of refinement
      This is the number of levels of refinement that blocks which are
      refined will be refined to. If it is zero then the mesh will not be
      refined. The default is 5 levels of refinement.

   --block_change - number of levels a block can change during refinement
      This parameter controls the number of levels that a block can change
      (either refining or coarsening) during a refinement step. The default
      is the number of levels of refinement.

   --uniform_refine - if 1, then grid is uniformly refined
      This controls whether the mesh is uniformly refined. If it is 1 then
      the mesh will be uniformly refined, while if it is zero, the
      refinement will be controlled by objects in the mesh. The default
      is 1.

   --refine_freq - frequency (in timesteps) of checking for refinement
      This determines the frequency (in timesteps) between checking if
      refinement is needed. The default is every 5 timesteps.

   --target_active - target number of blocks per processor
   --target_max - max number of blocks per processor
   --target_min - min number of blocks per processor
      These allow the user to control the number of blocks per processor.
      If these are zero, then no adjustment is made. If target_active is
      greater than zero, then the code will adjust the number of blocks to
      that target after the refinement step. If target_max is greater than
      zero, then the number of blocks will be reduced if it exceeds this
      number. Likewise, if target_min is greater than zero, then the number
      of blocks will be raised if there are fewer than that number after the
      refinement step. The default for all of these is zero.

   --inbalance - percentage imbalance to trigger load balancing
      This parameter allows the user to set a percentage threshold above
      which the load will be balanced among the processors. The value that
      this is checked against is the difference between the maximum and the
      minimum number of blocks on a processor, divided by the average. The
      default is zero, which means to always load balance at each
      refinement step.

   --lb_opt - (0, 1, 2) determine load balance strategy
      If set to 0, then load balancing is not performed. The default is 1,
      which load balances at each refinement step.
      Setting the parameter to 2 results in load balancing at each stage of
      the refinement step. If a processor has a large number of blocks which
      are refined several steps, this allows the work (and space needed) to
      be shared among more processors.

   --num_vars - number of variables (> 0)
      The number of variables that will be calculated on and communicated.
      The default is 40 variables.

   --comm_vars - number of vars to communicate together
      The number of variables that will be communicated together. Setting
      it to something less than the total number of variables results in
      shorter but more numerous messages. The default is zero, which will
      communicate all of the variables at once.

   --num_tsteps - number of timesteps (> 0)
      The number of timesteps for which the simulation will be run. The
      default is 20.

   --stages_per_ts - number of comm/calc stages per timestep
      The number of calculate/communicate stages per timestep. The default
      is 20.

   --permute - (no argument) permute communication directions
      If this is set, then the order of the communication directions will
      be permuted through the six options available. The default is to send
      messages in the x direction first, then y, and then z.

   --blocking_send - (no argument)
      Use blocking sends in the communication routine instead of the
      default nonblocking sends.

   --code - change the way communication is done
      The default is 0, which communicates only the ghost values that are
      needed. Setting this to 1 sends all of the ghost values, and setting
      this to 2 also causes all of the message processing (refinement or
      unrefinement) to be done on the sending side. This allows us to more
      closely mimic the communication behaviour of codes.

   --checksum_freq - number of stages between checksums
      The number of stages between calculating checksums on the variables.
      The default is 5. If it is zero, no checks are performed.

   --stencil - 7 or 27 point 3D stencil
      The 3D stencil used for the calculations.
      It can be either 7 or 27 and the default is 7, since the 27 point
      calculation will not conserve the sum of the variables except for the
      case of uniform refinement.

   --error_tol - (10^{-error_tol} ; >= 0)
      This determines the error tolerance for the checksums for the
      variables. The tolerance is 10 to the negative power of error_tol.
      The default is 8, so the default tolerance is 10^(-8).

   --report_diffusion - (>= 0) none if 0
      This determines if the checksums are printed when they are
      calculated. The default is 0, which is no printing.

   --report_perf - (0 .. 15)
      This determines how the performance output is displayed. The default
      is YAML output (value of 1). There are four output modes and each is
      controlled by a bit in the value:
         report_perf & 1 - YAML output (to a file called results.yaml)
         report_perf & 2 - text output file (results.txt)
         report_perf & 4 - output to standard out
         report_perf & 8 - output of block decomposition at each refine step
      These options can be combined in any way desired, and zero to four of
      them can be used in any run. Setting report_perf to 0 will result in
      no output.

   --refine_freq - frequency (timesteps) of refinement (0 for none)
      This determines how frequently (in timesteps) the mesh is checked and
      refinement is done. The default is every 5 timesteps. If uniform
      refinement is turned on, the setting of refine_freq does not matter
      and the mesh will be refined before the first timestep.

   --refine_ghosts - (no argument)
      The default is to not use the ghost cells of a block to determine if
      that block will be refined. Specifying this flag will allow those
      ghost cells to be used.

   --num_objects - (>= 0) number of objects to cause refinement
      The number of objects on which refinement is based. The default is
      zero.

   --object - type, position, movement, size, size rate of change
      The object keyword has 14 arguments. The first two are integers and
      the rest are floating point numbers. They are:
         type - The type of object. There are 16 types of objects. They
                include the surface of a rectangle (0), a solid rectangle
                (1), the surface of a spheroid (2), a solid spheroid (3),
                the surface of a hemispheroid (+/- with 3 cutting planes)
                (4, 6, 8, 10, 12, 14), a solid hemispheroid (+/- with 3
                cutting planes) (5, 7, 9, 11, 13, 15), the surface of a
                cylinder (20, 22, 24), and the volume of a cylinder
                (21, 23, 25).
         bounce - If this is 1 then an object will bounce off of the walls
                when the center hits an edge of the unit cube. If it is
                zero, then the object can leave the mesh.
         center - Three doubles that determine the center of the object in
                the x, y, and z directions.
         move - Three doubles that determine the rate of movement of the
                center of the object in the x, y, and z directions. The
                object moves this far at each timestep.
         size - The initial size of the object in the x, y, and z
                directions. If any of these become negative, the object
                will not be used in the calculations to determine
                refinement. These sizes are from the center to the edge in
                the specified direction.
         inc - The change in size of the object in the x, y, and z
                directions.
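
The argument order described above can be illustrated with a short Python
sketch that assembles the 14 values following --object. The ObjectSpec class
and its method names are hypothetical helpers written for this README, not
part of miniAMR. The center_after method assumes bounce is 0 and only applies
the documented per-timestep move; it does not model bouncing off the walls.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class ObjectSpec:
    """Hypothetical container for the 14 values that follow --object."""
    obj_type: int  # object type code, e.g. 2 = surface of a spheroid
    bounce: int    # 1 = bounce off the walls, 0 = object may leave the mesh
    center: Vec3   # center of the object in x, y, z
    move: Vec3     # movement of the center per timestep in x, y, z
    size: Vec3     # initial size (center to edge) in x, y, z
    inc: Vec3      # change in size in x, y, z

    def to_args(self) -> List[str]:
        """Return ['--object', <14 values>] in the documented order."""
        for name in ("center", "move", "size", "inc"):
            if len(getattr(self, name)) != 3:
                raise ValueError(f"{name} needs exactly three values")
        values = [self.obj_type, self.bounce,
                  *self.center, *self.move, *self.size, *self.inc]
        return ["--object"] + [str(v) for v in values]

    def center_after(self, steps: int) -> Vec3:
        """Center after `steps` timesteps, ignoring bounces (assumption)."""
        return tuple(c + steps * m for c, m in zip(self.center, self.move))


if __name__ == "__main__":
    # Values taken from the "one sphere moving diagonally" example.
    sphere = ObjectSpec(2, 0,
                        (-1.71, -1.71, -1.71),
                        (0.04, 0.04, 0.04),
                        (1.7, 1.7, 1.7),
                        (0.0, 0.0, 0.0))
    print(" ".join(sphere.to_args()))
```

Keeping the four triples as separate fields makes it harder to misplace a
value in the flat 14-argument list that the command line expects.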

Examples of run scripts for a Cray XE6 that illustrate several of the options:

One sphere moving diagonally on 27 processors:

   mpirun -np 27 -N 7 miniAMR.x --num_refine 4 --max_blocks 9000 --npx 3 --npy 3 --npz 3 --nx 8 --ny 8 --nz 8 --num_objects 1 --object 2 0 -1.71 -1.71 -1.71 0.04 0.04 0.04 1.7 1.7 1.7 0.0 0.0 0.0 --num_tsteps 100 --checksum_freq 1

An expanding sphere on 64 processors:

   mpirun -np 64 miniAMR.x --num_refine 4 --max_blocks 6000 --init_x 1 --init_y 1 --init_z 1 --npx 4 --npy 4 --npz 4 --nx 8 --ny 8 --nz 8 --num_objects 1 --object 2 0 -0.01 -0.01 -0.01 0.0 0.0 0.0 0.0 0.0 0.0 0.0009 0.0009 0.0009 --num_tsteps 200 --comm_vars 2

Two moving spheres on 16 processors:

   mpirun -np 16 miniAMR.x --num_refine 4 --max_blocks 4000 --init_x 1 --init_y 1 --init_z 1 --npx 4 --npy 2 --npz 2 --nx 8 --ny 8 --nz 8 --num_objects 2 --object 2 0 -1.10 -1.10 -1.10 0.030 0.030 0.030 1.5 1.5 1.5 0.0 0.0 0.0 --object 2 0 0.5 0.5 1.76 0.0 0.0 -0.025 0.75 0.75 0.75 0.0 0.0 0.0 --num_tsteps 100 --checksum_freq 4 --stages_per_ts 16

-------------------
5. The code:

   block.c         Routines to split and recombine blocks
   check_sum.c     Calculates check_sum for the arrays
   comm_block.c    Communicate new location for block during refine
   comm.c          General routine to do interblock communication
   comm_parent.c   Communicate refine/unrefine information to parents/children
   comm_refine.c   Communicate block refine/unrefine to neighbors during refine
   comm_util.c     Utilities to manage communication lists
   driver.c        Main driver
   init.c          Initialization routine
   main.c          Main routine that reads command line and launches program
   move.c          Routines that check overlap of objects and blocks
   pack.c          Pack and unpack blocks to move
   plot.c          Write out block information for plotting
   profile.c       Write out performance data
   rcb.c           Load balancing routines
   refine.c        Routines to direct refinement step
   stencil.c       Perform stencil calculations
   target.c        Add/subtract blocks to reach a target number
   util.c          Utility routines for timing and allocation

-- End README file.

Courtenay T. Vaughan ([email protected])