feature: Set timeout for DB Advisory lock #1826

Merged 11 commits on Feb 13, 2024

@fregataa (Member) commented Jan 8, 2024

Postgres connections that are waiting to acquire an advisory lock stay in the idle_in_transaction state.

  1. Postgres closes a connection after a certain time. When the connection that holds a distributed lock is closed, another process can obtain the lock. We should therefore release the lock explicitly before the Postgres connection timeout expires.
  2. It is recommended to clean up sessions that open a transaction and then remain idle.
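The idea in the two points above can be sketched as follows. This is a minimal illustration, not the code in this PR: an asyncio.Lock stands in for the Postgres advisory lock, and run_with_lock, work, and demo are hypothetical names. The holder is given a bounded time, and the lock is released explicitly when that time runs out so that another holder can take it.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def run_with_lock(
    lock: asyncio.Lock,
    work: Callable[[], Awaitable[T]],
    timeout: Optional[float] = None,
) -> T:
    """Run work() while holding the lock, for at most `timeout` seconds.

    timeout=None means no limit. The lock is released on every exit path,
    including when the timeout fires.
    """
    async with lock:
        if timeout is None:
            return await work()
        # wait_for cancels work() and raises TimeoutError once the limit passes.
        return await asyncio.wait_for(work(), timeout)


async def demo() -> dict:
    lock = asyncio.Lock()  # stand-in for a Postgres advisory lock (assumption)
    result: dict = {}

    async def slow() -> str:
        await asyncio.sleep(0.2)  # holds the lock longer than the limit below
        return "done"

    async def quick() -> str:
        return "ok"

    try:
        await run_with_lock(lock, slow, timeout=0.05)
        result["timed_out"] = False
    except asyncio.TimeoutError:
        result["timed_out"] = True

    # The lock was released explicitly, so the next holder can proceed.
    result["locked_after"] = lock.locked()
    result["second"] = await run_with_lock(lock, quick, timeout=0.05)
    return result
```

Running asyncio.run(demo()) exercises both cases: the slow holder is cut off by the timeout, and a second holder then acquires the same lock.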

How to test "Connection closed"

  1. Set the lock-conn-timeout value in the [db] section of manager.toml. The default value is 0, which means no timeout.
  2. Insert await asyncio.sleep(61) inside any lock context, for example:
# src/ai/backend/manager/scheduler/dispatcher.py
# line 285
async with self.lock_factory(LockID.LOCKID_SCHEDULE, 60):
    await asyncio.sleep(61)
  3. Run the manager and check the logs.

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Installer updates including:
    • Fixtures for db schema changes
    • New mandatory config options
  • Update of end-to-end CLI integration tests in ai.backend.test
  • API server-client counterparts (e.g., manager API -> client SDK)
  • Test case(s) to:
    • Demonstrate the difference of before/after
    • Demonstrate the flow of abstract/conceptual models with a concrete implementation

@fregataa fregataa added this to the 24.03 milestone Jan 8, 2024
@fregataa fregataa self-assigned this Jan 8, 2024
@github-actions github-actions bot added comp:manager Related to Manager component size:M 30~100 LoC labels Jan 8, 2024
@fregataa fregataa added the urgency:3 Must be finished within a certain time frame. label Jan 9, 2024
@achimnol achimnol added this pull request to the merge queue Feb 13, 2024
Merged via the queue into main with commit 7c2ce24 Feb 13, 2024
26 checks passed
@achimnol achimnol deleted the fix/timeout-pg-advisory-lock branch February 13, 2024 02:52