Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rfc43: add new RFC for job list service #408

Merged
merged 1 commit into from
Jan 13, 2024

Conversation

chu11
Copy link
Member

@chu11 chu11 commented Jan 10, 2024

Problem: job-list services are not documented.

Add an RFC to document current RPCs.


first run through, I might have some rst-isms wrong. But mostly went back and forth on various wording and various layout. Lets see what initial reaction is :P

@chu11 chu11 force-pushed the rfc43_job_list branch 2 times, most recently from 9781f9d to 4aed692 Compare January 10, 2024 06:04
@garlick
Copy link
Member

garlick commented Jan 10, 2024

Nice! Very good to get this protocol documented IMHO. Just a couple of quick thoughts on a skim:

  • a table might be better presentation for attributes and/or constraint operators
  • no need to require the service to be loaded on rank 0. perhaps a future implementation using the same protocol will have distributed caching of some sort.
  • I'd reference the job info RFC when you mention the job info service

spec_43.rst Outdated
Comment on lines 143 to 147
- ``states``: The set of *integer* values SHALL designate one or more job states
and match jobs in those job states.

- ``results``: The set of *integer* values SHALL designate one or more job results
and match jobs with those results.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Can the states and results values be a bitmask of states to match? I.e. is {"states": [48, 14]} equivalent to {"states": [62]}? If so, it might be useful to call that out specifically here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! and i forgot that string inputs are ok too (i.e. "depend")

@chu11
Copy link
Member Author

chu11 commented Jan 11, 2024

re-pushed based on comments above, minor tweaks forthcoming once i see how it looks on read the docs

Edit: ok, I went back to the listing for constraint operators, b/c the table was really large and suddenly trying to get things to fit and get multiline cells wasn't working out. But I need to use something other than a list, like a "term list" or something (dunno if that's the right term ...).

@chu11 chu11 force-pushed the rfc43_job_list branch 4 times, most recently from 93ee9cd to 9e396c3 Compare January 11, 2024 06:09
@chu11 chu11 changed the title WIP: rfc43: add new RFC for job list service rfc43: add new RFC for job list service Jan 11, 2024
@chu11
Copy link
Member Author

chu11 commented Jan 11, 2024

removed WIP, I think it's at an ok point to read with more seriousness :-)

Copy link
Contributor

@grondo grondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice! First pass up through the Constraint Operators section. Just some minor suggestions so far.

index.rst Outdated
Comment on lines 284 to 286
The Flux Job List Service provides summary information for jobs in the
system. It provides read-only access. Several ways to find / filter
jobs is also supported.
Copy link
Contributor

@grondo grondo Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: Could simplify the first two sentences to "The Flux Job List Service provides read-only summary information for jobs."

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, IMO the slash (/) should be avoided in the more formal prose of RFCs. An "and" could work fine here.

https://style.mla.org/slashed-terms-mla-style/

spec_43.rst Outdated
Comment on lines 8 to 10
The Flux Job List Service provides summary information for jobs in the
system. It provides read-only access. Several ways to find / filter
jobs is also supported.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above comment.

spec_43.rst Outdated
Comment on lines 43 to 44
Users are interested in seeing jobs that have been submitted to the
scheduler. Some reason may include:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor, but jobs are technically submitted to the job manager not the scheduler (a scheduler need not be loaded to submit and query or list jobs).

Also "Some reason may include:" ➡️ "Some reasons may include"

spec_43.rst Outdated
While the job info service described in RFC41 is capable of providing job owners information about their
own jobs, it has several limitations:

- job information may not be easily parsed / collated from multiple sources into one easily parsable format
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I'd replace / with or here.

spec_43.rst Outdated

- job information may not be easily parsed / collated from multiple sources into one easily parsable format
- information from multiple jobs is not collated into a simple to parse list
- information about non-owned jobs is not available
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: something like?

Suggested change
- information about non-owned jobs is not available
- information about the jobs of other users in a multi-user instance is not available

spec_43.rst Outdated

- Hide the complexity of parsing or collating data from multiple sources for commonly accessed information.

- Provide ways to find and/or filter jobs callers are interested in.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference between find and filter may not be clear. Maybe replace "find and/or filter" with just "filter" or "query"? I don't have a strong opinion on this one though

spec_43.rst Outdated
- ranks a job ran on
- integer
* - nodelist
- nodes a job ran on, may accept RFC29 Hostlist
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- nodes a job ran on, may accept RFC29 Hostlist
- nodes assigned to a job in RFC29 Hostlist format

Unless I'm misunderstanding what is meant by "may accept" here (maybe this was meant to describe the constraint not the attribute?)

Copy link
Member Author

@chu11 chu11 Jan 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahhh yeah I messed up meaning the constraint input (which is of course non-existent in the present version of this RFC).

spec_43.rst Outdated
Designate one or more job results (*string* or *integer*) and match jobs with those results. Both bitmasks (including multiple results) and string names of the results SHALL be accepted.

``t_submit``, ``t_depend``, ``t_run``, ``t_cleanup``, ``t_inactive``
Designate one timestamp with a prefixed comparison operator (*string*). The accepted comparison operators SHALL be `>`, `<`, `>=`, and `<=`, for greater than, less than, greater than or equal, or less than or equal. Match jobs where the respective timestamp matches against the submitted timestamp and comparison operator.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion:

Match jobs where the respective timestamp matches against the submitted timestamp and comparison operator.

Maybe reword with RFC syntax ("A timestamp operator SHALL match...") and replace "submitted timestamp" (could be confused with t_submit or job submission) with "provided timestamp"?

@chu11 chu11 force-pushed the rfc43_job_list branch 2 times, most recently from 3a2ec91 to b28f548 Compare January 11, 2024 19:32
@chu11
Copy link
Member Author

chu11 commented Jan 11, 2024

thanks @grondo, re-pushed with some fixups based on your comments. Also fixed a few minor English issues.

Copy link
Contributor

@grondo grondo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just one more suggestion. Not sure if @garlick wants to have another pass too.

spec_43.rst Outdated
Designate one or more job results (*string* or *integer*) and match jobs with those results. Both bitmasks (including multiple results) and string names of the results SHALL be accepted.

``t_submit``, ``t_depend``, ``t_run``, ``t_cleanup``, ``t_inactive``
Designate one timestamp with a prefixed comparison operator (*string*). The accepted comparison operators SHALL be `>`, `<`, `>=`, and `<=`, for greater than, less than, greater than or equal, or less than or equal. A timestamp operator SHALL match jobs where the respective timestamp matches against the provided timestamp.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: "Designate one timestamp with an optional prefixed comparison operator" ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually it's not optional ... I think we decided to go with that to avoid confusion of "what is the default?"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense since equality with a floating point wouldn't work out too well would it? Sorry I had forgotten that.

Maybe reword to add the RFC keyword REQUIRED then to make it very explicit that this was a design choice, e.g. "Designate a timestamp and REQUIRED prefix comparison operator". Though I guess it is fine as is if you'd prefer it that way.

@garlick
Copy link
Member

garlick commented Jan 12, 2024

Yes I'll have a run through, thanks :-)

Copy link
Member

@garlick garlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good - thanks for doing this!

I just had a few cosmetic comments that aren't super important. Up to you what you want to take or leave.

spec_43.rst Outdated
@@ -0,0 +1,360 @@
.. github display
GitHub is NOT the preferred viewer for this file. Please visit
https://flux-framework.rtfd.io/projects/flux-rfc/en/latest/spec_41.html
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/spec_41/spec_43/

spec_43.rst Outdated
Comment on lines 41 to 42
Users are interested in seeing jobs that have been submitted to the
job-manager. Some reasons may include:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: reword intro to:

The job list service provides a simple job query service that collates information from several sources, including the job info service described in RFC 41. It supports the following use cases:

Then drop the job info stuff after the use cases

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, although I think your wording suggests this is a complete set of use cases. I used the phrase "Some use cases include:" instead.

spec_43.rst Outdated
Goals
*****

- Provide read-only access to non-sensitive information for all jobs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: split the first goal

  • provide read-only access to job information
  • limit guest access to sensitive job information such as ... (example?)

If we are mentioning sensitive information we need to define it IMHO.

spec_43.rst Outdated
Implementation
**************

The job list service SHALL provide callers the ability to read job information via identifier keys, which will be called *attributes*. See `Job Attributes` below for details.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: instead of

read job information via identifier keys, which will be called attributes.

how about

request specific job attributes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might want to make the section references actual links with :ref:

See https://www.sphinx-doc.org/en/master/usage/referencing.html#cross-referencing-arbitrary-locations

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm I tried the section references earlier but it didn’t work. Lemme try again.

spec_43.rst Outdated

The job list service SHALL provide callers the ability to read job information via identifier keys, which will be called *attributes*. See `Job Attributes` below for details.

The job list service SHALL provide a RFC31 constraint syntax for filtering jobs. See `Constraint Operators` below for details.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a space between RFC and 31.

spec_43.rst Outdated
Job Attributes
==============

Job information is defined by the following *attribute* keys listed below.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete sentence? Not sure it's helping anything.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ehhh, it feels empty without something before the table. Perhaps a simpler sentence: The job list service SHALL support the following attributes.

spec_43.rst Outdated
Comment on lines 180 to 179
The *attribute* *id* SHALL always be returned for each job. Every other attribute is optional.

Copy link
Member

@garlick garlick Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This renders as attribute id which is confusing. Could drop the itialicization of "attribute" here.

Maybe attribute names should be made into sphinx literals with double backtics?

See https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#inline-markup

spec_43.rst Outdated
Constraint Operators
====================

Using the constraint syntax described by RFC31, jobs can be filtered
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Space between RFC and 31

@chu11
Copy link
Member Author

chu11 commented Jan 12, 2024

re-pushed with tweaks per recommendations above. possible tweaks coming depending on how the final output looks. (oops, I see make check already failed)

@chu11 chu11 force-pushed the rfc43_job_list branch 2 times, most recently from 5bebe7c to 59df57b Compare January 12, 2024 23:29
@chu11
Copy link
Member Author

chu11 commented Jan 12, 2024

fixed a few minor typos, but overall looks good to me if you guys wanna take one more skim

@garlick
Copy link
Member

garlick commented Jan 12, 2024

It's looking good! So what do we do with attributes that are not available, just omit them or set the attribute to NULL or? Might be good to clarify that.

The only other thing I can think of that might be helpful is to show a JSON example of the list request and response like we've done in some of the other RFCs.

Great work!

@chu11
Copy link
Member Author

chu11 commented Jan 12, 2024

It's looking good! So what do we do with attributes that are not available, just omit them or set the attribute to NULL or? Might be good to clarify that.

Yup, the text I put in is "If an attribute has not been set for a job, it SHALL NOT be returned in the returned data object."

The only other thing I can think of that might be helpful is to show a JSON example of the list request and response like we've done in some of the other RFCs.

Good idea, I'll stick one in at the end.

Problem: job-list services are not documented.

Add an RFC to document current RPCs.
@chu11
Copy link
Member Author

chu11 commented Jan 13, 2024

repushed, stuck an example at the end of the rfc, just one for job-list.list, didn't think necessary to have one example for each rpc

@garlick
Copy link
Member

garlick commented Jan 13, 2024

Looks good!

@mergify mergify bot merged commit 9734d88 into flux-framework:master Jan 13, 2024
7 checks passed
@chu11 chu11 deleted the rfc43_job_list branch January 13, 2024 22:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants