fix: Handle all possible exceptions when scheduling single node session #2411

@fregataa fregataa commented Jul 9, 2024

follow-up #643 #2155
refs https://github.com/lablup/giftbox/issues/691

The manager's scheduler does not handle some of the GenericBadRequest exceptions raised in _schedule_single_node_session(), which leaves the status information empty for some pending sessions.
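As a rough illustration of the idea (not the actual patch), the sketch below wraps a scheduling call so that any exception it raises is recorded on the session instead of being lost. Everything here is an assumption made for the example: `PendingSession`, `try_schedule`, the `"scheduler-error"` and `"scheduled"` status strings, and the local `GenericBadRequest` class are hypothetical stand-ins, not the manager's real types or values.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


class GenericBadRequest(Exception):
    """Hypothetical stand-in for the manager's exception of the same name."""


@dataclass
class PendingSession:
    """Hypothetical, simplified model of a pending session row."""
    id: str
    status_info: str = ""
    status_data: Dict[str, Any] = field(default_factory=dict)


def try_schedule(
    session: PendingSession,
    schedule_fn: Callable[[PendingSession], None],
) -> bool:
    """Run schedule_fn and record why it failed, if it fails.

    Returns True when scheduling succeeded, False otherwise.
    """
    try:
        schedule_fn(session)
    except Exception as exc:
        # Catch broadly so that no failure path leaves status_info empty.
        # The status string is an assumed placeholder value.
        session.status_info = "scheduler-error"
        session.status_data = {
            "exception": type(exc).__name__,
            "message": str(exc),
        }
        return False
    session.status_info = "scheduled"
    return True
```

With a wrapper of this shape, a session that stays pending always carries a reason, whichever exception type the scheduling step raised.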

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Mention to the original issue

@fregataa fregataa added the urgency:blocker IT SHOULD BE RESOLVED BEFORE NEXT RELEASE! label Jul 9, 2024
@fregataa fregataa added this to the 24.03 milestone Jul 9, 2024
@fregataa fregataa requested a review from adrysn July 9, 2024 14:56
@fregataa fregataa self-assigned this Jul 9, 2024
@github-actions github-actions bot added comp:manager Related to Manager component size:S 10~30 LoC labels Jul 9, 2024

graphite-app bot commented Jul 9, 2024

Your org has enabled the Graphite merge queue for merging into main

Add the label “flow:merge-queue” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge. Or use the label “flow:hotfix” to add to the merge queue as a hot fix.

You must have a Graphite account and log in to Graphite in order to use the merge queue.

@fregataa fregataa removed the urgency:blocker IT SHOULD BE RESOLVED BEFORE NEXT RELEASE! label Jul 9, 2024
@achimnol

I don't think this will fix all "pending-but-no-reason" cases, but the logic of the fix is correct. Thanks!

@achimnol achimnol enabled auto-merge July 15, 2024 14:18
@achimnol achimnol added this pull request to the merge queue Jul 15, 2024
Merged via the queue into main with commit 735f737 Jul 15, 2024
25 of 26 checks passed
@achimnol achimnol deleted the fix/handle-all-possible-exceptions-when-scheduling-singlenode-session branch July 15, 2024 14:23
lablup-octodog pushed a commit that referenced this pull request Jul 15, 2024
…on (#2411)

Co-authored-by: Joongi Kim <[email protected]>
Backported-from: main (24.09)
Backported-to: 24.03
Backport-of: 2411
github-merge-queue bot pushed a commit that referenced this pull request Jul 15, 2024