Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

job-list: support "hostlist" constraint to allow jobs to be filtered by nodes #5656

Merged
merged 6 commits into from
Jun 11, 2024

Conversation

chu11
Copy link
Member

@chu11 chu11 commented Jan 5, 2024

TSIA.

Maybe the only interesting side note is that until we solve #5367 (which should probably follow this one up) we don't yet have a friendly user interface for this. The tests send constraints objects straight to the job-list module.

@chu11 chu11 force-pushed the issue4186_flux_jobs_filter_nodes branch from 2ea42eb to cb9e7b5 Compare January 5, 2024 06:14
@grondo
Copy link
Contributor

grondo commented Jan 8, 2024

I wonder if we should limit the size of hostlist allowed in a constraint query for job-list. This probably goes for job constraints too. I was basically able to hang job-list with a query like, where no jobs ran on >= pi4:

{
  "or": [
    {
      "and": [
        {
          "states": [
            62
          ]
        },
        {
          "hostlist": [
            "pi[4-1000000]"
          ]
        }
      ]
    },
    {
      "hostlist": [
        "pi[4-10000000]"
      ]
    }
  ]
}

Just add more or to get a longer hang...

@chu11
Copy link
Member Author

chu11 commented Jan 9, 2024

I wonder if we should limit the size of hostlist allowed in a constraint query

Good catch! Hmmm not sure what the limit shouild be ... 1024 should be more than enough?

This probably goes for job constraints too

Yeah, b/c one could make a semi-infinite sized one. Although I'm unsure how to limit the size. Let me create an issue for that.

@grondo
Copy link
Contributor

grondo commented Jan 9, 2024

1024 might be a bit small on a cluster with 10K nodes. Something dynamic might work if job-list can get the maximum expected instance size. However, this doesn't prevent DoS from submitting a complex query (e.g. or host:foo[1-100] x1000). There should be some way to break out of a match operation that is taking too long... 🤔

@chu11
Copy link
Member Author

chu11 commented Jan 9, 2024

Something dynamic might work if job-list can get the maximum expected instance size.

Hmmm, I guess it wouldn't be too hard to keep a running track of largest node count seen so far.

However, this doesn't prevent DoS from submitting a complex query

Yeah, I brought up in #5669, perhaps need some hard caps just to avoid DoS attempts.

@chu11
Copy link
Member Author

chu11 commented Jan 15, 2024

Something dynamic might work if job-list can get the maximum expected instance size.

Was looking into this and going back on pros on cons of various approaches:

  • calling sched.resource-status to get nodelist count
  • getting the number of brokers via flux_get_size()
  • keep running max number of nodes in nodelist of all jobs

the problems with some of the above

  • sched module may not be loaded, then what do you do
  • number of brokers is good upper bound but not accurate (i.e. multiple brokers per node), but this is probably ok
  • assumes cluster size stays the same, which I'm not sure is accurate to assume (e.g. common to shrink cluster as hardware dies off towards EOL .. and there's all the stuff the kubernetes people doing with cluster sizes)
  • keep tabs on running max number of nodes for all jobs is ok for everything in memory, but not when there's a job db
    • this maybe could be solved with a side table with the job-db for "max counts" and stuff like that ... although I'd prefer to not do that.

Sooo, I'm not sure.. I'm beginning to wonder if a heuristic might be best. Like number of brokers OR max job count, whatever is bigger? then max is a multiple of it? That's a lot of work for this. This also makes me wonder if we should hard code some max instead.

Edit: hmmm here's a compromise idea, number of brokers or some-defined N, whichever is larger? That way there's always some decent minimum that is allowed, like 1024 or something.

@chu11 chu11 force-pushed the issue4186_flux_jobs_filter_nodes branch 2 times, most recently from cc950de to 8596055 Compare January 16, 2024 07:13
@chu11
Copy link
Member Author

chu11 commented Jan 16, 2024

re-pushed using the following idea to limit hostlist constraint sizes, the limit is instance size (ie number of brokers) or 1024, whatever is bigger. the 1024 minimum is to give the constraint some decent minimum, in the event the size of the cluster has shrunk over time.

good idea? bad idea?

@grondo
Copy link
Contributor

grondo commented Jan 16, 2024

the limit is instance size (ie number of brokers) or 1024, whatever is bigger. the 1024 minimum is to give the constraint some decent minimum

This seems reasonable to me.

Another idea I had would be for job-list to keep a single hostlist (really a set) of all hosts encountered for all jobs. When a hostlist constraint is encountered, the job-list module could take the intersection of this set and the "all hosts" set to limit the constraint hostlist to only those hosts which could possibly be matched. A similar process could be used to optimize matching of ranks as well.

The drawback is extra work when processing every job to maintain the set of possible hosts, which may not end up being worth it. 🤔

@chu11
Copy link
Member Author

chu11 commented Jan 16, 2024

Another idea I had would be for job-list to keep a single hostlist (really a set) of all hosts encountered for all jobs. When a hostlist constraint is encountered, the job-list module could take the intersection of this set and the "all hosts" set to limit the constraint hostlist to only those hosts which could possibly be matched. A similar process could be used to optimize matching of ranks as well.

In the current implementation this would work, but it'd immediately become a problem if we have older job data stored in a database, which we wouldn't scan upon a instance restart. So I was trying to avoid doing any optimization based on what we've seen so far.

As mentioned above, I guess this could be solved with a side part of the database of extra stuffs (like "max hosts" or "hosts we've seen so far"). Hmmm, let me think about this more. Maybe doing that in the DB would be inevitable for optimization purposes.

Or as you say ... maybe it's not worth the energy to do this optimization.

@grondo
Copy link
Contributor

grondo commented Jan 16, 2024

Just curious, how are you going to match on hosts with older jobs in a database. Maybe the optimization won't be needed in that case if the hosts are indexed or something. (I haven't seen an implementation, so difficult to reason about). I assume a good database will have already handled the query optimization and "DoS" problem...

@chu11
Copy link
Member Author

chu11 commented Feb 1, 2024

re-pushed, updating for conflicts, now based on #5681

@chu11 chu11 force-pushed the issue4186_flux_jobs_filter_nodes branch 2 times, most recently from 1010788 to c80c4c6 Compare February 2, 2024 00:33
@chu11 chu11 force-pushed the issue4186_flux_jobs_filter_nodes branch from c80c4c6 to 27f8b5a Compare March 20, 2024 17:11
Copy link
Contributor

@grondo grondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a quick first pass, found some places where error.text is not set on error and thus garbage would be returned in the error response.

Comment on lines 441 to 442
if (inc_check_comparison (c->mctx, comparisons, errp) < 0)
return -1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The count of comparisons is incremented here and also for each host. Doesn't that mean one host comparison is counted twice?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my logic was there was a comparison simply to see if we should bother to check hosts (i.e. if the job didn't run, we don't check anything, and that should count for at least 1 comparison). But now that you mention it, perhaps that is not the most sensible way to think about it. I'll remove the first check.

match_hostlist,
wrap_hostlist_destroy,
errp)))
return NULL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errp->text not set here.

Copy link
Member Author

@chu11 chu11 Apr 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this is ok, errp is set in the call to list_constraint_new()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, but looks like it is missing in the EINVAL path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nevermind, I think I was looking at the wrong function in the diff.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, I think I was looking at the wrong function in the diff.

/* Create a single hostlist if user specifies multiple nodes or
* RFC29 hostlist range */
if (!(hl = hostlist_create ()))
goto error;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errp->text not set here.

}
if (!zlistx_add_end (c->values, hl)) {
hostlist_destroy (hl);
goto error;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errp->text not set here.

@@ -743,6 +831,28 @@ struct match_ctx *match_ctx_create (flux_t *h)
goto error;
}

if (flux_get_size (mctx->h, &mctx->max_hostlist) < 0)
goto error;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errp->text not set here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think you mean I forgot a flux_log in this case, but correct!!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, heh. Yeah I must have been blindly checking goto error without errprintf() 🤦

@chu11 chu11 force-pushed the issue4186_flux_jobs_filter_nodes branch 3 times, most recently from 19ff734 to eb1fcf5 Compare April 17, 2024 05:21
@chu11
Copy link
Member Author

chu11 commented Apr 17, 2024

re-pushed with tweaks per comments above

@chu11 chu11 force-pushed the issue4186_flux_jobs_filter_nodes branch from eb1fcf5 to e1fef43 Compare April 19, 2024 17:37
@chu11
Copy link
Member Author

chu11 commented Apr 19, 2024

rebased now that #5681 is merged

Copy link
Contributor

@grondo grondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested this one several times as part of development of #5711 and LGTM!

@grondo grondo added this to the flux-core-0.64.0 milestone Jun 5, 2024
@chu11
Copy link
Member Author

chu11 commented Jun 11, 2024

@Mergifyio rebase

chu11 added 6 commits June 11, 2024 15:23
Problem: In the near future it'd be convenient to do calculations
on the job nodelist, but it often needs to be in a hostlist
data structure for processing.  We'd like to avoid converting the
nodelist to hostlist struct over and over again.

Add a field into the job_data struct to hold a cached version of the
nodelist in a hostlist struct.
Problem: It would be convenient to filter jobs based on the nodes
they ran on.

Add a constraint operator "hostlist" to filter on nodes within the job
nodelist.  Multiple nodes can be specified.  Hostlists represented in
RFC29 format are acceptable for input to the constraint.

Fixes flux-framework#4186
Problem: There are no unit tests for the new 'hostlist' constraint
operator.

Add tests in match and state_match tests.
Problem: In t2260-job-list.t, some test jobs are submitted using
"flux job submit".  This makes it inconvenient to
set job requirements on those jobs, such as specific nodes
the jobs should run on.

Solution: Convert all uses of "flux job submit" to use "flux submit".
Problem: In the near future we would like to filter jobs based
on nodes in a job's nodelist.  This would be an issue in the current
test setup because test brokers all run under the same host and
it is unknown which nodes jobs actually ran on.

Solution: Setup fake hostnames for the test brokers in
t2260-job-list.t and when necessary, run test jobs on specific
hosts.
Problem: There is no testing / coverage for the new hostlist constraint
in the job-list module.

Add some tests to t2260-job-list.t.
Copy link
Contributor

mergify bot commented Jun 11, 2024

rebase

✅ Branch has been successfully rebased

@chu11 chu11 force-pushed the issue4186_flux_jobs_filter_nodes branch from e1fef43 to d381bf3 Compare June 11, 2024 15:23
@chu11
Copy link
Member Author

chu11 commented Jun 11, 2024

oops, missed that this had been approved last week. rebased and setting MWP. thanks

@mergify mergify bot merged commit 6462184 into flux-framework:master Jun 11, 2024
34 of 35 checks passed
@chu11 chu11 deleted the issue4186_flux_jobs_filter_nodes branch June 11, 2024 18:22
trws pushed a commit to trws/flux-core that referenced this pull request Jun 14, 2024
…_filter_nodes

job-list: support "hostlist" constraint to allow jobs to be filtered by nodes
Copy link

codecov bot commented Sep 9, 2024

Codecov Report

Attention: Patch coverage is 78.18182% with 12 lines in your changes missing coverage. Please review.

Project coverage is 83.29%. Comparing base (67fc412) to head (d381bf3).
Report is 451 commits behind head on master.

Files with missing lines Patch % Lines
src/modules/job-list/match.c 76.92% 12 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5656      +/-   ##
==========================================
- Coverage   83.30%   83.29%   -0.01%     
==========================================
  Files         519      519              
  Lines       83680    83734      +54     
==========================================
+ Hits        69707    69747      +40     
- Misses      13973    13987      +14     
Files with missing lines Coverage Δ
src/modules/job-list/job_data.c 93.64% <100.00%> (+0.02%) ⬆️
src/modules/job-list/state_match.c 92.81% <100.00%> (+0.03%) ⬆️
src/modules/job-list/match.c 89.08% <76.92%> (-1.81%) ⬇️

... and 9 files with indirect coverage changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants