idea: add a method to determine the context within Flux in which a process is currently running #3817

Open
grondo opened this issue Aug 7, 2021 · 12 comments
Assignees
Labels
design don't expect this to ever be closed... enhancement

Comments

@grondo
Contributor

grondo commented Aug 7, 2021

As noted in #3744, it would be useful to have some way for a process to determine the context in which it is running as it relates to Flux jobs, initial program, etc. Off the top of my head, I can think of a few different contexts we might want to delineate:

  1. Not within any Flux instance (flux_open () fails with ENOENT)
  2. Enclosing instance is the multi-user system instance (instance-level attribute is 0, security.owner != current UID, jobid attribute not set)
  3. Enclosing instance is a job in a foreign RM or flux start --test-size session, and process is running as part of initial program (same as above, but security.owner == current UID, instance-level is 0)
  4. Enclosing instance is a Flux job and process is running as part of initial program (jobid is set, instance-level > 0)
  5. Enclosing instance is a Flux job and process is part of a job within that instance

AFAICT, there is not a good way to easily determine the difference between 4 and 5 above. Perhaps less importantly, there is not a clean way to tell the difference between 2 and 3 either (for example, when a process is running with the UID of the flux user).

It might be nice if we could add a function that would return "something" to allow a process to differentiate between these different contexts. Since "context" is actually a bit of an overloaded term, we might need something different, but the only idea I've come up with so far is to have a set of named process "scopes".
This should be just considered an early idea at this point and we can iterate as much as people desire, or even throw out this idea as unnecessary if it will cause too much confusion.

Here's a first cut at names for the "scopes" outlined above:

  • 1: none
  • 2: system
  • 3,4: initial program (I suppose instance-level could be used to differentiate these two)
  • 5: job

We could add an API call flux_get_process_scope(3) which would return one of these strings, and would allow programs to alter behavior based on their current context. For the example of flux bcast it could abort with a warning if run in job scope since it likely doesn't make sense to run that command as a job.

A flux scope command could simply print the result of flux_get_process_scope(3) for use in scripts, etc.
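
As a rough illustration of the classification described above, here is a minimal sketch in C. It is not an actual flux-core API: `classify_scope`, `scope_name`, and `struct scope_inputs` are hypothetical names, the inputs are taken as plain values rather than fetched with flux_open() and broker attributes, and the order of the checks is an assumption. In particular, how `in_job` would be determined is exactly the open question in this issue.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The four scope names proposed above. */
enum scope { SCOPE_NONE, SCOPE_SYSTEM, SCOPE_INITIAL_PROGRAM, SCOPE_JOB };

/* Hypothetical inputs. A real implementation would obtain these from
 * flux_open() and broker attributes; this sketch takes them as given. */
struct scope_inputs {
    bool connected;      /* false if flux_open() failed (context 1) */
    int instance_level;  /* value of the instance-level attribute */
    bool owner_is_me;    /* security.owner equals the current UID */
    bool jobid_set;      /* the instance has a jobid attribute */
    bool in_job;         /* process was launched as a job task (context 5) */
};

/* Map the inputs onto one of the proposed scopes. */
enum scope classify_scope (const struct scope_inputs *in)
{
    assert (in != NULL);
    if (!in->connected)
        return SCOPE_NONE;
    /* Context 2: level 0, owned by another user, no jobid attribute. */
    if (in->instance_level == 0 && !in->owner_is_me && !in->jobid_set)
        return SCOPE_SYSTEM;
    /* Context 5: detecting this reliably is the unsolved part. */
    if (in->in_job)
        return SCOPE_JOB;
    /* Contexts 3 and 4 share one scope name. */
    return SCOPE_INITIAL_PROGRAM;
}

/* Return the string a flux scope command might print. */
const char *scope_name (enum scope s)
{
    switch (s) {
        case SCOPE_NONE:            return "none";
        case SCOPE_SYSTEM:          return "system";
        case SCOPE_INITIAL_PROGRAM: return "initial program";
        case SCOPE_JOB:             return "job";
    }
    return "none";
}
```

A caller such as flux bcast could then compare the result against `SCOPE_JOB` and abort with a warning, as suggested above.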

@grondo grondo added enhancement design don't expect this to ever be closed... labels Aug 7, 2021
@grondo
Contributor Author

grondo commented Dec 8, 2021

In discussing the repercussions of our inability to determine if a process is running in the "scope" of a Flux instance or a job within a Flux instance with @ofaaland, we had the idea to use a simple environment variable set by the job shell, but cleared by the flux-broker. Keying off this environment variable would allow flux_get_process_scope() or similar to determine whether the scope is job or initial program (perhaps instance is a better name for that one, I don't know)

This would be trivial to implement and would assist @ofaaland's use case immediately.

For now, the FLUX_KVS_NAMESPACE environment variable could be used as a stand-in for any future environment variable, since it is set only for jobs and cleared for the initial program.
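
A minimal sketch of that stand-in check, assuming only what is stated above (the variable is set for jobs and cleared for the initial program). The helper names `env_is_set` and `in_job_scope` are hypothetical and not part of flux-core, and the check says nothing about the "none" or "system" scopes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* True if NAME is present in the environment, whatever its value. */
static bool env_is_set (const char *name)
{
    assert (name != NULL);
    return getenv (name) != NULL;
}

/* Stand-in heuristic: FLUX_KVS_NAMESPACE is set only for jobs and
 * cleared for the initial program, so its presence implies job scope. */
bool in_job_scope (void)
{
    return env_is_set ("FLUX_KVS_NAMESPACE");
}
```

If a dedicated environment variable is added later, only the name passed to `env_is_set` would need to change.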

@garlick
Member

garlick commented Dec 8, 2021

It sounds like this could be helpful. Were you thinking the prototype would be something like this?

const char *flux_get_process_scope (void);

Maybe init would be OK as an abbreviation for initial program? A short, one word scope would be a little nicer popping out of a flux scope command.

@ofaaland

ofaaland commented Dec 8, 2021 via email

ofaaland added a commit to ofaaland/libyogrt that referenced this issue Dec 12, 2021
Look in the environment for FLUX_JOB_ID.  Parse it to obtain
the 64-bit unsigned integer representation and store it.

Determine whether to query the current flux instance or the parent
for the expiration time of the allocation.  Note that this
currently works by checking the environment for FLUX_KVS_NAMESPACE,
but flux will provide a more explicit mechanism in the future.
See flux-framework/flux-core#3817
for details and status.

Fetch the expiration time and calculate remaining time based on that.
ofaaland added a commit to ofaaland/libyogrt that referenced this issue Dec 12, 2021
Look in the environment for FLUX_JOB_ID.  Parse it to obtain
the 64-bit unsigned integer representation and store it.

Determine whether to query the current flux instance or the parent
for the expiration time of the allocation.  Note that this
currently works by checking the environment for FLUX_KVS_NAMESPACE,
but flux will provide a more explicit mechanism in the future.
See flux-framework/flux-core#3817
for details and status.

Fetch the expiration time and calculate remaining time based on that.

Support get_rank() by looking in the environment for FLUX_TASK_RANK.
ofaaland added a commit to ofaaland/libyogrt that referenced this issue Dec 14, 2021
ofaaland added a commit to ofaaland/libyogrt that referenced this issue Dec 14, 2021
ofaaland added a commit to ofaaland/libyogrt that referenced this issue Dec 14, 2021
ofaaland added a commit to ofaaland/libyogrt that referenced this issue Jan 23, 2022
Add configure check X_AC_FLUX based on X_AC_LSF.

When the user does not specify a location, use pkg-config to determine
whether flux-core is installed and where.  Otherwise look for
flux-core.h and attempt to link to flux_open().

At runtime, look in the environment for FLUX_JOB_ID.

Determine whether to query the current flux instance or the parent
for the expiration time of the allocation.  Note that this
currently works by checking the environment for FLUX_KVS_NAMESPACE,
but flux will provide a more explicit mechanism in the future.
See flux-framework/flux-core#3817
for details and status.

Fetch the expiration time and calculate remaining time based on that.

Support get_rank() by looking in the environment for FLUX_TASK_RANK.
ofaaland added a commit to ofaaland/libyogrt that referenced this issue Jan 24, 2022
Add configure check X_AC_FLUX based on X_AC_LSF.

When the user does not specify a location, use pkg-config to determine
whether flux-core is installed and where.  Otherwise look for
flux/core.h and attempt to link against flux-core.so to use flux_open().

At runtime, look in the environment for FLUX_JOB_ID.

Determine whether to query the current flux instance or the parent
for the expiration time of the allocation.  Note that this
currently works by checking the environment for FLUX_KVS_NAMESPACE,
but flux will provide a more explicit mechanism in the future.
See flux-framework/flux-core#3817
for details and status.

Fetch the expiration time and calculate remaining time based on that.

Support get_rank() by looking in the environment for FLUX_TASK_RANK.
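
The steps these commit messages describe can be sketched roughly as follows. This is not the libyogrt code. `parse_jobid` and `choose_target` are hypothetical helpers, the job ID is assumed to be a plain decimal string (other encodings are not handled), and the mapping from FLUX_KVS_NAMESPACE to "current" or "parent" is inferred from the explanation of initial program versus job given further down in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal job ID string (e.g. the value of FLUX_JOB_ID) into a
 * 64-bit unsigned integer. Returns 0 on success, -1 on failure. */
int parse_jobid (const char *s, uint64_t *id)
{
    char *end;
    unsigned long long v;

    assert (id != NULL);
    /* Reject NULL, empty strings, signs, and leading whitespace. */
    if (s == NULL || *s < '0' || *s > '9')
        return -1;
    errno = 0;
    v = strtoull (s, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;
    *id = (uint64_t)v;
    return 0;
}

enum query_target { QUERY_CURRENT, QUERY_PARENT };

/* Assumption: if FLUX_KVS_NAMESPACE is set, the process is a job and its
 * job ID belongs to the current instance. Otherwise the process is an
 * initial program and the job ID belongs to the enclosing instance. */
enum query_target choose_target (bool kvs_namespace_set)
{
    return kvs_namespace_set ? QUERY_CURRENT : QUERY_PARENT;
}
```

A caller would pass `getenv ("FLUX_JOB_ID")` to `parse_jobid` and `getenv ("FLUX_KVS_NAMESPACE") != NULL` to `choose_target`, then ask the chosen instance for the job's expiration time.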
@jameshcorbett
Member

The ability to distinguish between 1, 2, and 3/4 would be very useful for some workflow systems I either know about or work on directly.

@chu11
Member

chu11 commented Oct 17, 2022

apologies, what is the difference between #4 and #5 above? There's a subtlety I'm missing.

@garlick
Member

garlick commented Oct 17, 2022

The initial program (4) is not running as a job in its instance. It's just spawned directly by the broker. If there's a FLUX_JOB_ID set in its environment, it's the job ID of the flux instance in its enclosing instance.

The job (5) on the other hand is spawned by the flux shell and has a job ID in the flux instance.

Edit: confusing hence the need for tools :-)

@chu11
Member

chu11 commented Oct 17, 2022

@garlick ahh, so basically "flux start foo.sh" vs "flux start flux mini run foo.sh"

@grondo
Contributor Author

grondo commented Oct 17, 2022

In real-world terms, (4) is a batch script and associated processes (including the flux mini run in your example), while (5) is the actual parallel job tasks.

@ofaaland

ofaaland commented Oct 17, 2022 via email

@chu11 chu11 self-assigned this Oct 17, 2022
@chu11
Member

chu11 commented Oct 18, 2022

slowly beginning to work on this and amongst the contexts listed above, it was hard to distinguish between a few of them in my head. As I thought about it, I think there's two different things trying to be differentiated:

  1. what "flux instance" am I running under, i.e. system instance, user instance (i.e. flux start --test-size), job instance (i.e. flux mini submit flux start)

  2. am i the initial program or a job

would two separate functions for these two things be better? It seems like we're mixing two things together into one.

Aside, I guess for me, when I started "permutating" things i couldn't understand why the potential scopes weren't

1 - none
2A - enclosing instance system instance, process is initial program
2B - enclosing instance system instance, process is job
3A - enclosing instance job in foreign RM / flux start --test-size, process is initial program
3B - enclosing instance job in foreign RM / flux start --test-size, process is a job
4 - enclosing instance is flux job, process is initial program
5 - enclosing instance is flux job, process is a job

I suppose 2A is only conceptually possible??? although practically stupid

Edit: Oh wait, system instance is started via systemd, so I think impossible?
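
One way to picture the two-axis split described above is a pair of enums, one per question, with the numbered permutations falling out of their combination. This is only a sketch: the type and function names are hypothetical, and it does not show how either value would actually be detected.

```c
#include <assert.h>
#include <stddef.h>

/* Axis 1: what kind of instance encloses the process. */
enum instance_kind {
    INSTANCE_NONE,    /* not within any Flux instance */
    INSTANCE_SYSTEM,  /* multi-user system instance */
    INSTANCE_USER,    /* foreign RM job or flux start --test-size */
    INSTANCE_JOB      /* instance running as a Flux job */
};

/* Axis 2: is the process the initial program or a job. */
enum process_role { ROLE_INITIAL_PROGRAM, ROLE_JOB };

struct flux_context {
    enum instance_kind kind;
    enum process_role role;
};

/* Return the label used in the permutation list above. */
const char *context_label (struct flux_context c)
{
    static const char *const labels[4][2] = {
        { "1",  "1"  },  /* no instance: the role is meaningless */
        { "2A", "2B" },
        { "3A", "3B" },
        { "4",  "5"  },
    };
    assert ((int)c.kind >= 0 && (int)c.kind <= INSTANCE_JOB);
    assert (c.role == ROLE_INITIAL_PROGRAM || c.role == ROLE_JOB);
    return labels[c.kind][c.role];
}
```

With this shape, a single-scope function could be a thin layer that collapses several of these pairs into one name.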

@grondo
Contributor Author

grondo commented Oct 19, 2022

would two separate functions for these two things be better? It seems like we're mixing two things together into one.

There are already simple ways to determine if the enclosing instance is a system instance vs single-user instance, or if you are in an initial program or a job (actually we have hit a problem here since FLUX_JOB_ID is set for the initial program, but that can be fixed).

I think the purpose of this issue is to add a single function that makes it easy for a caller to determine their rough "context", so that callers can make simple decisions with a single call to the Flux API.

Aside, I guess for me, when I started "permutating" things i couldn't understand why the potential scopes weren't

I think the initial scopes listed above were drawn from the particular use cases we had in mind, i.e. these were the 3 or 4 cases that were important to differentiate. There is a balance between adding every permutation and keeping the call useful, i.e. we don't want every caller to need a long conditional to match every case where the current process is part of an initial program (i.e. batch script). It is better IMO to keep the interface simple and cater to the common use case.

Edit: But I meant to say if we find a need to differentiate a couple other cases then that is fine too, but we should err on the side of simplicity. (e.g. I can't think of a reason a process would need to know whether it was in the "initial program" of a job that was running in a system instance, vs a single user Flux instance, vs a foreign RM, vs flux start --test-size. The whole point of Flux is that it shouldn't matter, and if it does (i.e. you need to talk to the parent), then you can further refine by checking attributes...)

@garlick
Member

garlick commented Oct 25, 2022

I think if we create a flux_get_process_scope() API call, we should be sure it returns something sensible no matter where it is used. Looking over the current PR it would seem to fall short when called from a flux-proxy environment, or from anything running as instance owner in the system instance (cron jobs, perilog scripts, rc scripts).

Also, if we need a broker connection to obtain attributes to make the determination, it seems like we should allow that to be passed in to the API call so that a user doesn't have to connect to the broker twice (assuming they want to do more fluxish stuff), but then how do we know that the broker connection is the correct one?

IMHO it might be wise at this stage to provide a flux_get_remaining_time() call or similar, to constrain any heuristics to this one use case.
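
For the narrower use case, the arithmetic behind a remaining-time call might look like the sketch below. `remaining_seconds` is a hypothetical helper, not the proposed flux_get_remaining_time(). It takes the expiration and the current time as plain seconds since the epoch, and treating an expiration of 0 as "no time limit" is an assumption. Deciding which instance to ask for the expiration is the part the heuristics would have to cover.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Compute the time left before EXPIRATION, given NOW (both in seconds
 * since the epoch). Returns false if there is no time limit (assumed to
 * be signalled by an expiration of 0); otherwise stores the remaining
 * seconds, clamped at zero, in *remaining and returns true. */
bool remaining_seconds (double expiration, double now, double *remaining)
{
    assert (remaining != NULL);
    if (expiration <= 0.0)
        return false;
    *remaining = expiration > now ? expiration - now : 0.0;
    return true;
}
```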

Sorry @chu11 to make this discouraging comment after a PR is already posted. I find this problem confusing to think about and the PR actually helped me make more sense of it than when we were discussing it here in the abstract.

5 participants