Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

job-info: return job state as number encourages hardcoding values in shell scripts #2635

Open
garlick opened this issue Jan 15, 2020 · 11 comments
Labels

Comments

@garlick
Copy link
Member

garlick commented Jan 15, 2020

Problem: job-info.list returns job states as numerical values. This encourages people parsing the output in a bash script with jq to hard code the enum values.

It would be better IMHO to return the state names (strings) from RFC 21.

That might work out better in the python bindings too...

@chu11
Copy link
Member

chu11 commented Jan 15, 2020

Seems like a good idea, instead of the user converting numeral to string.

@grondo
Copy link
Contributor

grondo commented Jan 15, 2020

Another approach would be to add flux job statestr N utility that wraps flux_job_statetostr() and allow scripts to use that. I'm ambivalent either way. (though if states ever start getting other flags masked in, sending an integer would start to be much more compact, instead of just a little more compact)

@garlick
Copy link
Member Author

garlick commented Jan 15, 2020

That's maybe a better idea, thanks!

@chu11
Copy link
Member

chu11 commented Jan 15, 2020

Granted flux job is a plumbing tool, but I just have trouble seeing the use case where someone needs the number of a job state returned from job-info.list, so I like the idea of returning the string.

If we really need a numeral someday, we could just add a new attribute in the job-info.list service, like state-number or something.

@grondo
Copy link
Contributor

grondo commented Jan 15, 2020

I wasn't considering a requirement for a state number instead of string, but instead was considering the size of underlying json strings. Probably not an issue at this point since it would just add a few characters per job.

@grondo
Copy link
Contributor

grondo commented Jan 18, 2020

It occurred to me this morning that switching to a state string may be a little more inconvenient in C/C++ code. E.g. @dongahn's MPIR PR we have:

    if (state > FLUX_JOB_RUN)

We'd likely at least want to add flux_job_stringtostate (s) to the API if state was returned as a string, though the state as integer seems much more convenient here.

@garlick
Copy link
Member Author

garlick commented Jan 18, 2020

OK, I'm thinking now it may be less bad to keep the numbers. It will be irritating to have to convert in order to use it it in masks and switch statements in C.

if (state > FLUX_JOB_RUN)

This is probably a bit unsafe though, in case we ever add a state and tack it on the end numerically to avoid breaking ABI (and it doesn't "come after" RUN).

Would it be appropriate to formalize the numbers and maybe the single character abbreviations in RFC 21?

@grondo
Copy link
Contributor

grondo commented Jan 18, 2020

This is probably a bit unsafe though, in case we ever add a state and tack it on the end numerically to avoid breaking ABI (and it doesn't "come after" RUN).

Good point!

Would it be appropriate to formalize the numbers and maybe the single character abbreviations in RFC 21?

Good idea. I wonder if we'll also want to explicitly state that the order of the state numbers is meaningless, and provide some convenience functions to satisfy use cases like this. Actually maybe in the case pasted above (where flux job attach wants to detect if a job has not proceeded past the RUN state), the code should use the virtual states (FLUX_JOB_PENDING | FLUX_JOB_RUNNING).

I'll comment in #2654 to that effect.

Note that I'm still open to switching job states away from state numbers. As long as we've got the use cases easily covered. (e.g. even a python script may want to test for a virtual state like "pending" . This could be done with state strings just as easily as state numbers, just a lot more string comparisons)

@grondo
Copy link
Contributor

grondo commented Jan 18, 2020

Actually maybe in the case pasted above (where flux job attach wants to detect if a job has not proceeded past the RUN state), the code should use the virtual states (FLUX_JOB_PENDING | FLUX_JOB_RUNNING).

Uh I guess that should have been FLUX_JOB_PENDING | FLUX_JOB_RUN, because the CLEANUP state would also be invalid for the debugger attach case.

@stale
Copy link

stale bot commented Mar 5, 2021

This issue has been automatically marked as stale because it has not had activity for 365 days. It will be closed if no further activity occurs within 14 days. Thank you for your contributions.

@stale stale bot added the wontfix label Mar 5, 2021
@chu11
Copy link
Member

chu11 commented May 5, 2022

i'm wondering if this issue should be resurrected given discussion in #4273, #4313, and #4234

  • we may want to dump large amounts of job data into the archive in 1 big json blob, so that's a lot of hard coded values going into the archive
  • i have a WIP branch that adds a new "states_mask" (bitmask of states that have been seen by a job) that would probably be needed for archiving, so thats more hardcoded stuff going into the archive.
  • additional job states may be coming: rfc21: add offline job state rfc#306

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants