[WIP] plugin: add "age" factor to priority calculation #297

Closed
wants to merge 8 commits into from

cmoussa1
Member

Background

As mentioned in #291, the priority calculation of a job could take another factor into account: the job's "age," or how long the job has been submitted and waiting to run. This further refines a job's priority and can raise a low-priority job's position in the queue when all jobs are reprioritized.


This is a [WIP] PR built on top of #295 that adds this new "age" factor to the priority calculation of a job. It unpacks t_submit from the job and subtracts it from the current time to calculate a job's age. This factor is then added to the other two factors that are currently used in the multi-factor priority calculation.

With the changes proposed in #295, it is possible to "disable" this factor (or any factor for that matter) by setting its weight to 0 in the TOML config file:

[priority_factors]
fshare_weight = 1000
queue_weight = 100
age_weight = 0

which is what I have done in the existing set of sharness tests, to avoid having to adjust every test that checks for a specific priority number.
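
To illustrate why a weight of 0 disables a factor, here is a minimal sketch, assuming the factors are combined as a weighted sum (each factor multiplied by its weight, then added together). `PriorityWeights` and `weighted_priority` are hypothetical names used only for illustration and are not the plugin's actual code.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical container for the three weights in [priority_factors].
struct PriorityWeights {
    int fshare_weight;
    int queue_weight;
    int age_weight;
};

// Assumed combination: multiply each factor by its weight and sum the
// products. A weight of 0 removes that factor's contribution entirely.
int64_t weighted_priority (const PriorityWeights &w,
                           double fshare_factor,
                           double queue_factor,
                           double age_factor)
{
    double sum = (w.fshare_weight * fshare_factor)
                 + (w.queue_weight * queue_factor)
                 + (w.age_weight * age_factor);
    return static_cast<int64_t> (std::llround (sum));
}
```

With `age_weight = 0`, the result is the same no matter how old the job is, which is why the existing tests keep their expected priority values.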

TODO
  • add tests for age factor

Add a new callback to the plugin that extracts the weight of each factor used
in the multi-factor priority calculation. Place the weights in a map where the
key is the factor's name (a string) and the value is the factor's integer weight.

Instead of using hard-coded integer weights for each factor in the multi-factor
priority calculation, use the values in the priority_weights map, which is
populated during a conf.update.

If no values are found in a conf.update, the values for the factors in the
priority_weights map will be -1. In that case, fall back to default values so
that a job does not get held in PRIORITY state.
Add the age factor's integer weight to the list of weights unpacked during a
conf.update and place it in the plugin's priority_weights map.

Fetch t_submit in priority_calculation () and use it to calculate the "age"
of a job by subtracting it from the current time in seconds since epoch.
This factor will then be used to further refine the integer priority returned
for a submitted job.

Add the "age" factor to the priority calculation of a job.
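
A small sketch of the age computation, assuming t_submit is a floating-point timestamp in seconds since epoch. `job_age` and `job_age_now` are hypothetical helpers, and clamping negative values to zero is an added assumption (to guard against clock skew), not something taken from the plugin.

```cpp
#include <ctime>

// Hypothetical helper: age in seconds given the job's t_submit and a
// "now" timestamp, both in seconds since epoch. Negative results are
// clamped to 0 (an assumption, not confirmed plugin behavior).
double job_age (double t_submit, double now)
{
    double age = now - t_submit;
    return age > 0.0 ? age : 0.0;
}

// Convenience wrapper that reads the current wall-clock time.
double job_age_now (double t_submit)
{
    return job_age (t_submit, static_cast<double> (std::time (nullptr)));
}
```

Taking `now` as a parameter keeps the calculation testable without depending on the real clock.
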

@cmoussa1 added the new feature label Nov 28, 2022

codecov bot commented Nov 29, 2022

Codecov Report

Merging #297 (5a3603c) into master (43daaf9) will increase coverage by 0.25%.
The diff coverage is 93.10%.

@@            Coverage Diff             @@
##           master     #297      +/-   ##
==========================================
+ Coverage   83.76%   84.02%   +0.25%     
==========================================
  Files          23       23              
  Lines        1226     1252      +26     
==========================================
+ Hits         1027     1052      +25     
- Misses        199      200       +1     
| Impacted Files | Coverage Δ |
| src/plugins/mf_priority.cpp | 85.59% <93.10%> (+0.83%) ⬆️ |

Member Author

cmoussa1 commented Nov 8, 2023

I don't expect to get back to this until higher priority stuff has eventually landed in flux-accounting, but I wanted to jot some notes down from an offline discussion so that I don't forget about it when I circle back to this.

Something to keep in mind when considering the age factor is to calculate it from when a job is released, not from when it is submitted. In other words, count only the time during which the job was available to be scheduled but could not run because of another constraint (like a resource constraint) that could be satisfied if it were the next job up in the queue.

This is to prevent users from purposely submitting a held job, leaving it held for a long time, and then getting a very large priority bump when they release it.
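
One way to picture this is a clock that only runs while the job is released. The sketch below is hypothetical (`EligibleClock` is not part of the plugin), and accumulating time across repeated hold/release cycles is an assumption about how the idea might be extended.

```cpp
// Hypothetical record of how long a job has been eligible for scheduling.
struct EligibleClock {
    double accumulated = 0.0;   // eligible seconds from earlier intervals
    double released_at = -1.0;  // start of current interval, -1 while held

    // Start counting when the job is released (no-op if already released).
    void release (double now)
    {
        if (released_at < 0.0)
            released_at = now;
    }

    // Stop counting when the job is held, banking the elapsed interval.
    void hold (double now)
    {
        if (released_at >= 0.0) {
            accumulated += now - released_at;
            released_at = -1.0;
        }
    }

    // Age counts only time spent released; held time contributes nothing.
    double age (double now) const
    {
        double current = released_at >= 0.0 ? now - released_at : 0.0;
        return accumulated + current;
    }
};
```

A job submitted in a held state would start with an age of 0 and only begin aging at its first release, so holding a job for a long time earns no priority bump.
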

@cmoussa1
Member Author

Going to close this PR and re-open a hopefully cleaner implementation. :-)

@cmoussa1 cmoussa1 closed this May 30, 2024