Split SQL into $JOBS input files using round robin distribution
Using the chunks option with round robin distribution should create as many input files as there will be jobs and ensure the projected regeneration times are as evenly distributed as possible.
sbesson committed Sep 10, 2024
1 parent 8b3c9d7 commit 442341d
Showing 1 changed file with 1 addition and 1 deletion.
src/dist/regen-memo-files.sh: 1 addition & 1 deletion
@@ -45,7 +45,7 @@ run_split_parallel_os_dep() {
 set -x
 export JAVA_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=rslt.${DATESTR} -Xmx2g -Dlogback.configurationFile=${MEMOIZER_HOME}/logback-memoizer.xml -Dprocessname=memoizer"
 cd rslt.${DATESTR}
-split -a 3 -l ${BATCH_SIZE} ${FULL_CSV} -d input.
+split -a 3 -n r/$JOBS ${FULL_CSV} -d input.
 PARALLEL_OPTS="--halt now,fail=1 --eta --jobs ${JOBS} --joblog parallel-${JOBS}cpus.log --files --use-cpus-instead-of-cores --results . ${DRYRUN}"
 set -x
 /usr/bin/time -p -o timed parallel ${PARALLEL_OPTS} \
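For reference, GNU coreutils split's -n r/N mode distributes input lines round robin across N output files, so every job receives roughly the same number of entries regardless of the total line count, whereas the previous -l ${BATCH_SIZE} line-count split produced a variable number of files. A minimal sketch of the behaviour, assuming GNU split and hypothetical file names:

    # Generate a 10-line sample and split it round robin across 4 output files
    seq 10 > full.csv
    split -a 3 -n r/4 full.csv -d input.
    # input.000 holds lines 1,5,9; input.001 holds 2,6,10; input.002 holds 3,7; input.003 holds 4,8
    wc -l input.*

With exactly $JOBS input files, GNU parallel can then start one worker per file, keeping the per-job workload roughly balanced.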
