Improved monitoring of leased connections #15

renatomassaro · 2024-11-24T15:06:00Z

Any process can fetch a connection. When they do, this connection is locked to this process, which is expected to release it once he's done. What happens if the process dies in the meantime? If it crashes? If it forgets to release?

That's where this PR comes in. If a process terminates without releasing the connection, we'll get notified and we'll clean up their mess.

If the caller process terminated normally (without an exception) and yet the connection was not released, emit a warning: it is likely the application forgot to commit or rollback their transaction.

Finally, in another commit I should handle the scenario where a caller process goes on to live forever with a locked connection. That might be okay, but only if explicitly requested by the caller process. I'm thinking timeout intervals with lease renewals.

EDIT: The above is now handled, except for "lease renewals". That's not needed now and it may never be needed. AFAIK not even Ecto has a similar feature. If the caller needs a big timeout, just set it from the start.

Summary by CodeRabbit

New Features
- Introduced a timeout mechanism for connection requests to enhance connection management.
- Added a new function to notify the repository upon release, improving transaction handling.
- Introduced a new module for managing repository configuration, including a struct for required fields.
Bug Fixes
- Enhanced error handling for connection management and transaction rollbacks.
Tests
- Expanded tests for connection management to cover new timeout scenarios and state validations after connection releases.
- Updated assertions in existing tests to validate additional connection state fields.
- Introduced a new function to block processes indefinitely for more reliable testing.
Documentation
- Added documentation for the new block_forever function to clarify its purpose in tests.

Any process can fetch a connection. When they do, this connection is locked to this process, which is expected to release it once he's done. What happens if the process dies in the meantime? If it crashes? If it forgets to release? That's where this monitoring comes in. If a process terminates without releasing the connection, we'll get notified and we'll clean up their mess. If the caller process terminated normally (without an exception) and yet the connection was not released, emit a warning: it is likely the application forgot to commit or rollback their transaction.

Now we automatically enqueue the requests; there is no polling

Now a request / caller / leasee can no longer hold the connection open indefinitely -- unless explicitly stated.

This allows the Repo to perform internal clean-up in order to be ready to serve new requests.

The implicit "if manager absent then it is nil" logic was causing bugs and it's always better to be explicit

coderabbitai · 2024-11-28T01:15:41Z

Walkthrough

The changes in this pull request enhance the functionality and error handling of the Feeb.DB.Repo and Feeb.DB.Repo.Manager modules. Key modifications include the addition of a notify_release/1 function for managing release notifications, updates to the initialization functions to accommodate a new manager_pid argument, and improvements to connection management logic with timeout mechanisms. Tests have been expanded to validate these changes, ensuring robustness in connection handling and transaction management across various scenarios.

Changes

File	Change Summary
`lib/feeb/db/repo.ex`	- Added method: `notify_release/1` - Updated method signatures for `start_link/1` and `init/1` to include `manager_pid` - Enhanced `handle_call/3` to manage `:mgt_connection_released` messages and rollback transactions if active.
`lib/feeb/db/repo/manager.ex`	- Added `@default_repo_timeout` constant - Updated `fetch_or_create_connection` to accept `caller` and `opts` - Enhanced `handle_info` to manage timeouts and dead processes. - Updated connection management functions to track caller processes and timeouts.
`test/db/repo/manager/registry_test.exs`	- Expanded assertions in `fetch_or_create/2` test to include new fields in manager state.
`test/db/repo/manager_test.exs`	- Added tests for connection release and timeout scenarios, ensuring robustness in connection management.
`test/db/repo_test.exs`	- Updated initialization of `Repo` to include `nil` in test cases. - Added tests for handling connection release scenarios.
`test/support/db.ex`	- Updated `ensure_migrated/2` to include `nil` in `Repo.start_link/1` call.
`test/support/db/prop.ex`	- Added alias for `Feeb.DB` for concise references in the code.
`test/support/utils.ex`	- Added new function `block_forever` to keep spawned processes active during tests.
`test/db/schema_test.exs`	- Updated assertions for `repo_config` to check against `%DB.Repo.RepoConfig{}` struct instead of a plain map.
`test/support/db/schemas/friend.ex`	- Updated `get_repo_config/2` to require `%DB.Repo.RepoConfig{}` struct as second argument.
`lib/feeb/db/repo/manager/repo_entry.ex`	- Introduced new module `RepoEntry` for managing repository entries with a focus on their busy state.
`lib/feeb/db/repo/repo_config.ex`	- Introduced new module `RepoConfig` to manage repository configuration with required fields.

Poem

In the burrow deep, where the code does flow,
A rabbit hops in, with changes to show.
With connections that timeout, and releases that sing,
The Repo now dances, with each little thing.
So let’s celebrate, with a hop and a cheer,
For the code is now better, let’s all give a cheer! 🐇✨

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 8

🧹 Outside diff range and nitpick comments (14)

test/support/db.ex (1)

Line range hint 52-71: Consider updating documentation to reflect new connection monitoring

The existing documentation thoroughly explains the current behavior regarding locked connections and test isolation. Given this PR's improvements to connection monitoring, consider adding a note about how the new connection monitoring features might affect or benefit the test infrastructure.
test/db/repo/manager/registry_test.exs (1)
21-43: Reduce duplication in connection state assertions

The assertions for write_1, read_1, and read_2 are identical. Consider extracting the expected state to a helper function or setup block to make the tests more maintainable.

Example approach:
+ defp expected_connection_state do
+   %{
+     pid: nil,
+     busy?: false,
+     caller_pid: nil,
+     monitor_ref: nil,
+     timer_ref: nil
+   }
+ end

  test "creates a previously non-existent Manager", %{shard_id: shard_id} do
    # ...
-   assert manager_state.write_1 == %{
-     pid: nil,
-     busy?: false,
-     caller_pid: nil,
-     monitor_ref: nil,
-     timer_ref: nil
-   }
+   assert manager_state.write_1 == expected_connection_state()
-   assert manager_state.read_1 == %{
-     pid: nil,
-     busy?: false,
-     caller_pid: nil,
-     monitor_ref: nil,
-     timer_ref: nil
-   }
+   assert manager_state.read_1 == expected_connection_state()
-   assert manager_state.read_2 == %{
-     pid: nil,
-     busy?: false,
-     caller_pid: nil,
-     monitor_ref: nil,
-     timer_ref: nil
-   }
+   assert manager_state.read_2 == expected_connection_state()
test/db/repo_test.exs (2)

12-12: LGTM! Consider adding documentation for the new parameter.

The addition of manager_pid parameter (set to nil) in the initialization tuples is consistent with the PR objectives. However, it would be helpful to document why this parameter is typically set to nil in tests.

Consider adding a comment explaining that nil indicates no manager validation is needed for these test cases.

Also applies to: 26-26, 51-51, 77-77

96-99: LGTM! Good addition of SQLite-level transaction verification.

The new assertions effectively verify the actual transaction state at the SQLite level, which strengthens the tests.

There's a typo in the comment on line 109: "colurse" should be "course".

Also applies to: 109-112
test/db/repo/manager_test.exs (4)
333-334: Ensure proper cleanup after transaction begins

In the test where a transaction is begun but not completed, ensure that the transaction is appropriately rolled back or cleaned up after the test to prevent side effects on other tests.

274-277: Consider asserting on monitor reference for completeness

After fetching the connection and before killing request_pid, you might want to assert that a monitor reference exists to confirm that the process is being monitored.
assert state_before.write_1.monitor_ref
307-310: Ensure the queue length reflects the number of waiting processes

In the assertion assert :queue.len(state_before.write_queue) == 1, confirm that the queue length accurately reflects the number of processes waiting for a connection. This adds clarity to the test and ensures that the state is as expected.

330-332: Handle potential timing issues in timeout test

The timeout value of 25 milliseconds may cause flakiness in test environments with varying performance. Consider increasing the timeout slightly or using a more robust method to trigger the timeout condition reliably.
lib/feeb/db/repo.ex (3)
20-21: Use named parameters in start_link/1 for improved readability

The function start_link/1 currently matches a tuple with anonymous elements {_, _, _, _, _}, which can reduce code clarity and maintainability. Using named variables enhances readability and makes the code easier to understand.

Consider refactoring the function signature as follows:
-def start_link({_, _, _, _, _} = args),
+def start_link({context, shard_id, path, mode, manager_pid} = args),
30-31: Enhance documentation for notify_release/1

The docstring for notify_release/1 can be improved for clarity and completeness. Providing more precise documentation helps other developers understand the purpose and usage of the function.

Consider updating the documentation:
 @doc """
-  Used by the Repo.Manager to notify once the Repo has been released. Useful to resetting internal
-  counters, transaction_id etc.
+  Notifies the `Repo` that it has been released by the `Repo.Manager`.
+  Resets internal counters and the `transaction_id`.
  """
 def notify_release(pid),
396-405: Simplify assert_release_signal_from_manager!/2 function clauses

The multiple clauses of assert_release_signal_from_manager!/2 can be consolidated for better readability and maintainability.

Combine the clauses and handle the test environment condition within the function:
-defp assert_release_signal_from_manager!(manager_pid, manager_pid), do: :ok

-if @env == :test do
-  defp assert_release_signal_from_manager!(nil, _caller_pid), do: :ok
-end

-defp assert_release_signal_from_manager!(manager_pid, other_pid) do
-  "Repo can only be released by its Manager (#{inspect(manager_pid)}, got #{inspect(other_pid)})"
-  |> raise()
-end
+defp assert_release_signal_from_manager!(expected_manager_pid, caller_pid) do
+  cond do
+    expected_manager_pid == caller_pid ->
+      :ok
+    @env == :test and is_nil(expected_manager_pid) ->
+      :ok
+    true ->
+      raise "Repo can only be released by its Manager (#{inspect(expected_manager_pid)}, got #{inspect(caller_pid)})"
+  end
+end
lib/feeb/db/repo/manager.ex (3)
81-87: Consider using a struct for connection entries to improve maintainability

The current implementation initializes each connection entry with similar keys. Converting each connection entry to a struct would enhance code readability, reduce duplication, and simplify future modifications.

288-288: Resolve the review comment regarding linking GenServers

There's a comment questioning whether to link both GenServers:
# REVIEW: Do I really want to link both genservers?
Please decide whether it's appropriate to link the Repo process to the Manager process and update the code accordingly. Unresolved comments can be confusing and should be addressed before merging.

360-360: Adjust log level for dead caller information

In the process_enqueued_callers/2 function, when a caller has died while waiting for a connection, an info-level log is generated:
Logger.info("Caller #{inspect(caller_pid)} waiting for a connection has died; skipping it")
Consider changing the log level to :debug, as this may be a normal occurrence and might not require attention at the :info level.

Apply this diff to adjust the log level:
- Logger.info("Caller #{inspect(caller_pid)} waiting for a connection has died; skipping it")
+ Logger.debug("Caller #{inspect(caller_pid)} waiting for a connection has died; skipping it")

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 0e9f398 and 72d82b8.

📒 Files selected for processing (8)

lib/feeb/db/repo.ex (4 hunks)
lib/feeb/db/repo/manager.ex (10 hunks)
test/db/repo/manager/registry_test.exs (1 hunks)
test/db/repo/manager_test.exs (3 hunks)
test/db/repo_test.exs (7 hunks)
test/support/db.ex (1 hunks)
test/support/db/prop.ex (2 hunks)
test/support/utils.ex (1 hunks)

🔇 Additional comments (11)

test/support/utils.ex (2)

27-30: Documentation clearly explains the purpose and usage

The documentation effectively communicates the function's purpose and its role in preventing test flakiness.

31-35: Verify cleanup in test teardown

While the implementation is correct for keeping processes alive during tests, ensure that any processes using block_forever/0 are properly terminated in test teardown to prevent resource leaks.

Consider adding a helper function for safer process management in tests:

def spawn_blocking_process(fun) do
  pid = spawn_and_wait(fun)
  ExUnit.Callbacks.on_exit(fn ->
    Process.exit(pid, :kill)
  end)
  pid
end

This would ensure automatic cleanup while maintaining the same blocking behavior.

✅ Verification successful

No cleanup needed - test process management is appropriate

The block_forever usage is actually part of a test that deliberately spawns a process to verify error handling behavior. The process is intentionally spawned to trigger a specific error condition, and the test properly:

Uses Process.flag(:trap_exit, true) to handle the process termination
Verifies the error condition with assert_receive {:EXIT, ^repo, ...}
The spawned process is automatically cleaned up when the repo process exits

The process blocking is an intentional part of the test's error handling verification, and the test properly manages and verifies process termination. No additional cleanup is necessary in this case.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if there are tests using block_forever without proper cleanup

# Look for test files using block_forever
echo "Checking for tests using block_forever..."
rg -l "block_forever" test/**/*_test.exs | while read -r file; do
  echo "Analyzing $file..."
  # Check if the file has proper cleanup mechanisms
  rg -A 10 "block_forever" "$file"
  # Look for cleanup patterns in the same file
  echo "Checking for cleanup patterns..."
  rg "on_exit|ExUnit.Callbacks|teardown" "$file"
done

Length of output: 986

Script:

#!/bin/bash
# Let's check for process management patterns around block_forever usage

# First, let's get more context around the block_forever usage
echo "Checking broader context of block_forever usage in repo_test.exs..."
rg -B 10 -A 10 "block_forever" test/db/repo_test.exs

# Let's also check for process management related setup/teardown in the test file
echo "\nChecking test setup and process management..."
rg "setup|describe|test" test/db/repo_test.exs

# Check if there are any process monitoring or cleanup patterns
echo "\nChecking process management patterns..."
rg "Process\.(monitor|exit|flag)|spawn(_link)?" test/db/repo_test.exs

Length of output: 2854

test/support/db/prop.ex (2)

3-3: LGTM: Clean alias addition

The new alias improves code readability by allowing direct references to DB functions.

25-29: Verify connection cleanup in test setup

Given the PR's focus on connection leasing and monitoring, we should verify that test connections are properly managed. The immediate commit after begin is fine, but let's ensure we're not leaving any connections in a leased state during test setup.

✅ Verification successful

Let me analyze the results and generate additional verification scripts to check for proper connection cleanup patterns.

Let me try one more verification to specifically look at the test support modules and their connection handling patterns.

Based on the verification results, I can now provide a final response:

Connection handling is properly managed in test setup

The test database setup in test/support/db/prop.ex correctly manages connections:

The DB.begin/commit sequence is properly paired
The SQLite connection is explicitly closed with SQLite.close(db_conn)
The connection is only used for initialization and not kept open
The special test transaction ID (-99) is used consistently

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for proper connection handling in test setup
# Look for connection cleanup patterns in test files

# Check for connection cleanup patterns in test files
echo "Checking for connection cleanup patterns in tests..."
rg -A 5 "DB\.(begin|commit|rollback)" "test/"

# Look for potential connection leaks in test setup
echo "Checking for potential connection leaks in test setup..."
ast-grep --pattern 'def setup do
  $$$
  DB.begin($$$)
  $$$
end'

Length of output: 31795

Script:

#!/bin/bash
# Look for connection cleanup patterns in test setup/teardown

# Check for cleanup in setup/teardown blocks
echo "Checking setup/teardown blocks for connection cleanup..."
rg -A 10 "setup do|setup_all do" "test/"

# Check for DB.commit/rollback in test helper modules
echo "Checking test helpers for connection cleanup..."
rg -A 5 "def (setup|teardown|after_setup|before_setup)" "test/support/"

# Look for on_exit cleanup patterns
echo "Checking for on_exit cleanup..."
rg "on_exit" "test/"

Length of output: 1226

Script:

#!/bin/bash
# Check test support modules for connection handling patterns

# Look at the specific test support file content
echo "Checking test support files..."
cat test/support/db/prop.ex

# Look for connection cleanup patterns in test support
echo "Checking for connection cleanup in test support..."
rg -B 5 -A 5 "DB\.(commit|rollback)" "test/support/"

# Check for any connection cleanup helpers
echo "Checking for cleanup helpers..."
rg -l "def cleanup|def reset|def clear" "test/support/"

Length of output: 2384

test/support/db.ex (2)

28-33: LGTM! Good connection management pattern

The function follows a robust pattern by explicitly closing the connection after migration, which aligns well with the PR's focus on proper connection management. The synchronous close via GenServer.call ensures the migration completes before proceeding.

30-30: Verify consistent usage of manager_pid parameter across test files

The addition of nil as the manager_pid parameter aligns with the PR's connection monitoring improvements. However, we should verify that this pattern is consistently applied across all test files that initialize the Repo.

✅ Verification successful

Consistent usage of manager_pid parameter confirmed

The verification shows that manager_pid parameter is consistently used across test files:

In test/support/db.ex: Uses nil for migration setup
In test/db/repo_test.exs: Uses self() for testing connection management

The different values (nil vs self()) are intentional and appropriate for their contexts:

Migration setup doesn't need connection management, hence nil
Repository tests verify connection management, hence self()

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for other Repo.start_link calls in test files to ensure consistent parameter usage
# Expected: All calls should include the manager_pid parameter

# Search for Repo.start_link calls in test files
rg "Repo\.start_link" test/ -A 2

Length of output: 490

test/db/repo_test.exs (1)

204-254: LGTM! Excellent test coverage for connection release scenarios.

The new test block comprehensively covers the connection release functionality:

Regular lifecycle verification
Automatic transaction rollback for abandoned connections
Manager validation for release signals

This aligns perfectly with the PR objectives of improving connection lease monitoring.

test/db/repo/manager_test.exs (2)

364-364: Check the initial value of timer_ref for consistency

When setting timer_ref to :no_timer, ensure that this value is consistently used throughout the codebase to represent the absence of a timer. Verify that there are no cases where nil or another sentinel value is used instead.

342-342: 🛠️ Refactor suggestion

Clarify the source of the {:EXIT, manager, :feebdb_repo_timeout} message

In the test, the process is trapping exits and expects to receive an {:EXIT, manager, :feebdb_repo_timeout} message. However, the manager process does not appear to be linked to the test process, so this message may not be received as expected. Ensure that the test process is properly linked to the manager or adjust the test to assert on the correct message.

Verify that the manager process is linked:

# Add this before fetching the connection
Process.link(manager)

Alternatively, if the intention is to receive a message when the leasee process exits due to timeout, consider monitoring the leasee process instead.

lib/feeb/db/repo.ex (1)

35-35: Ensure consistent parameter naming between start_link/1 and init/1

After refactoring start_link/1 to use named parameters, ensure that init/1 also uses the same parameter names for consistency and clarity.

lib/feeb/db/repo/manager.ex (1)

387-395: Verify handling of :infinity values in repository timeouts

The start_repo_timeout_timer/3 function checks if repo_timeout is an integer before setting up a timeout:

if is_integer(repo_timeout) do
  Process.send_after(self(), {:repo_timeout, key, caller_pid}, repo_timeout)
else
  :no_timer
end

Ensure that passing :infinity as a timeout value behaves as intended, and that the absence of a timer does not cause any unintended issues.

Run the following script to check for any usage of :infinity in repo_timeout and verify that it's handled correctly:

✅ Verification successful

Let me gather more information about the usage of timeouts in the codebase.

:infinity timeout handling is correctly implemented

The codebase shows proper handling of :infinity timeout values:

The test "allows for infinite timeout" explicitly verifies that timeout: :infinity works as expected and confirms no timer is created (assert manager_state.write_1.timer_ref == :no_timer)
The implementation in start_repo_timeout_timer/3 correctly handles non-integer values (including :infinity) by returning :no_timer
The behavior is documented, as seen in the codebase comments indicating that :infinity is a valid timeout value
The functionality is actively used in tests, particularly in stress tests where timeout: :infinity is needed for long-running operations

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all calls to `fetch_connection` that may pass `:infinity` as `:timeout` option.

# Search for `fetch_connection` calls with `:timeout` option.
rg 'fetch_connection\(.+,\s*.+,\s*\[.*:timeout\s*=>\s*:infinity.*\]\)' -A 5

# Verify that `start_repo_timeout_timer/3` properly handles `:infinity` values.
rg 'defp\s+start_repo_timeout_timer.*' -A 10

Length of output: 762

Script:

#!/bin/bash
# Search for any usage of :infinity in relation to timeouts
rg ':infinity.*timeout|timeout.*:infinity' -B 2 -A 2

# Search for @default_repo_timeout definition
rg '@default_repo_timeout.*' -B 2 -A 2

# Search for any tests related to repo timeouts
rg 'test.*repo.*timeout|test.*timeout.*repo' -B 2 -A 2

# Look for any documentation or specs related to timeout handling
ast-grep --pattern 'defp start_repo_timeout_timer($$$) :: $$$'

Length of output: 3729

coderabbitai · 2024-11-28T01:22:04Z

test/db/repo/manager/registry_test.exs

-      assert manager_state.read_1 == %{pid: nil, busy?: false}
-      assert manager_state.read_2 == %{pid: nil, busy?: false}
+
+      # TODO: Assert against initial struct instead


🛠️ Refactor suggestion

Address the TODO by using struct-based assertions

Instead of manual map comparisons, consider using the actual struct type for more maintainable assertions.

Example approach:

- # TODO: Assert against initial struct instead - assert manager_state.write_1 == %{ - pid: nil, - busy?: false, - caller_pid: nil, - monitor_ref: nil, - timer_ref: nil - } + initial_state = %Feeb.DB.Repo.Manager.ConnectionState{ + pid: nil, + busy?: false, + caller_pid: nil, + monitor_ref: nil, + timer_ref: nil + } + assert manager_state.write_1 == initial_state

Committable suggestion skipped: line range outside the PR's diff.

Yes, as the comment suggests this is something I intend on doing, but now now... Fine, I'll do it now.

@renatomassaro, thank you for making the change!

If you found this review helpful, would you consider giving us a shout-out on X?
Thank you for using CodeRabbit!

coderabbitai · 2024-11-28T01:22:04Z

test/db/repo_test.exs

+      # It blows up if called from elsewhere
+      spawn(fn ->
+        GS.call(repo, {:mgt_connection_released})
+        block_forever()


⚠️ Potential issue

Fix undefined function reference.

The block_forever() function is called but not defined in the test file.

Consider adding the function definition or using an alternative approach like:

-block_forever() +Process.sleep(:infinity)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

block_forever()

Process.sleep(:infinity)

It's defined elsewhere

@renatomassaro, thank you for the clarification. Since block_forever/0 is defined elsewhere in the codebase, no changes are needed.

✏️ Learnings added

Learnt from: renatomassaro PR: renatomassaro/FeebDB#15 File: test/db/repo_test.exs:247-247 Timestamp: 2024-11-28T09:52:22.579Z Learning: The function `block_forever/0` is defined in `Test.Utils` and is used in tests to keep a process active indefinitely.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

coderabbitai · 2024-11-28T01:22:05Z

test/db/repo/manager_test.exs

+      # Give ample time for the `handle_info({:DOWN, ...})` callback to run
+      :timer.sleep(10)


🛠️ Refactor suggestion

Use deterministic synchronization instead of :timer.sleep/1

Using :timer.sleep/10 to wait for asynchronous operations can lead to flaky tests, especially in environments with variable performance like CI servers. Consider using message passing or monitoring to ensure that the asynchronous operations have completed.

For example, you can monitor the process and wait for the :DOWN message:

# Replace this: Process.exit(request_pid, :kill) :timer.sleep(10) # With this: ref = Process.monitor(request_pid) Process.exit(request_pid, :kill) assert_receive {:DOWN, ^ref, :process, ^request_pid, _reason}

This ensures that the test proceeds only after request_pid has been terminated and the handle_info/2 callback has been invoked.

Also applies to: 312-313, 340-341, 315-316

I agree, but I don't see a way to sync deterministically without having a hack that affects the test readability.

For example, I could instead do a GenServer.call to the Repo.Manager: singe a GenServer is sequential, the call will only return after handle_info has executed -- but it would still have the race condition of the call arriving before the :DOWN signal, so we are back to square one.

Your suggestion of trapping exits doesn't apply -- or maybe it does, but not in the way you suggested: I can wait for the :DOWN message in the spawned pid and, once it arrives, I send a message to the test pid so it can proceed. That would work, but IMO it's not worth the readability downsides. I'll do that only if this test proves to flake. So far, it hasn't.

@renatomassaro Understood, thank you for the clarification. If the test starts to flake in the future, revisiting synchronization methods could be helpful. Let me know if you need any assistance later on.

coderabbitai · 2024-11-28T01:22:05Z

lib/feeb/db/repo.ex

+  def handle_call({:mgt_connection_released}, {caller_pid, _}, state) do
+    # Make sure only the Repo.Manager can send mgt signals to the Repo.
+    assert_release_signal_from_manager!(state.manager_pid, caller_pid)
+
+    if not is_nil(state.transaction_id) do
+      Logger.info("Connection released forcibly; rolling back transaction #{state.transaction_id}")
+      :ok = SQLite.exec(state.conn, "ROLLBACK")
+    end
+
+    # Reset the GenServer state so we are ready to serve a new request
+    {:reply, :ok, %{state | transaction_id: nil}}
+  end


⚠️ Potential issue

Handle potential race conditions when rolling back transactions on release

In handle_call/3 for {:mgt_connection_released}, there may be a race condition if a transaction is committed or rolled back just before the check for state.transaction_id. This could lead to an incorrect rollback or state inconsistency.

Consider synchronizing the access to state.transaction_id or using a more robust mechanism to manage transaction state.

there may be a race condition if a transaction is committed or rolled back just before the check for state.transaction_id

That's not true. If we are handling the :mgt_connection_released call, then the GenServer is unable to process any other messages at the same time.

If it processes a :commit message immediately after, it will fail because you can't COMMIT with transaction_id=nil.

@renatomassaro, thank you for the clarification. You're correct that the GenServer processes messages sequentially, so there's no race condition in this scenario. I appreciate the explanation, and I'll keep this in mind for future reviews.

✏️ Learnings added

Learnt from: renatomassaro PR: renatomassaro/FeebDB#15 File: lib/feeb/db/repo.ex:142-153 Timestamp: 2024-11-28T09:59:48.393Z Learning: In the `Feeb.DB.Repo` module, the GenServer processes messages sequentially, so there is no race condition within `handle_call/3` functions, and `state` mutations are safe from concurrent modifications.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

coderabbitai · 2024-11-28T01:22:05Z

lib/feeb/db/repo.ex

+    assert_release_signal_from_manager!(state.manager_pid, caller_pid)
+


🛠️ Refactor suggestion

Clarify parameter names in assert_release_signal_from_manager!/2

The function assert_release_signal_from_manager!/2 uses the same name for both parameters in pattern matching, which can be confusing and potentially error-prone.

Refactor the function to use distinct parameter names:

-defp assert_release_signal_from_manager!(manager_pid, manager_pid), do: :ok +defp assert_release_signal_from_manager!(expected_manager_pid, caller_pid) when expected_manager_pid == caller_pid, do: :ok

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

assert_release_signal_from_manager!(state.manager_pid, caller_pid)

assert_release_signal_from_manager!(state.manager_pid, caller_pid)

It may be confusing to someone who is new to Erlang/Elixir; otherwise this is considered idiomatic pattern matching.

@renatomassaro, understood. Thank you for the clarification. I'll keep in mind that using the same parameter name in pattern matching is considered idiomatic in Erlang/Elixir.

✏️ Learnings added

Learnt from: renatomassaro PR: renatomassaro/FeebDB#15 File: lib/feeb/db/repo.ex:144-145 Timestamp: 2024-11-28T10:01:57.008Z Learning: In `lib/feeb/db/repo.ex`, when working with Erlang/Elixir code, it's idiomatic to use the same parameter name in function pattern matching to assert that parameters are equal. For example, defining `defp assert_release_signal_from_manager!(manager_pid, manager_pid), do: :ok` is acceptable and preferred.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

coderabbitai · 2024-11-28T01:22:05Z

lib/feeb/db/repo/manager.ex

+  def handle_info({:repo_timeout, key, caller_pid}, state) do
+    case state[key] do
+      %{pid: repo_pid, busy?: true, caller_pid: ^caller_pid} ->
+        # Kill the caller
+        Process.exit(caller_pid, :feebdb_repo_timeout)
+
+        # Release the connection. It is supposed to always succeed
+        {:ok, ^key, new_state} = do_release_connection(state, repo_pid)
+        notify_enqueued_requests(new_state, key)
+
+        {:noreply, new_state}
+
+      %{caller_pid: _other_caller_or_nil} ->
+        # Despite the timeout, the connection is no longer used (or used by a different caller)
+        # Theoretically, this is a possible race condition. In this case, we just don't do anything
+        {:noreply, state}
+    end
+  end


⚠️ Potential issue

Ensure safe termination of caller processes on repository timeout

The handle_info/2 callback forcibly terminates the caller process using Process.exit/2 when a repository timeout occurs. While this ensures that long-running processes are halted, it may lead to unintended side effects or affect process supervision trees.

Consider whether abruptly killing the caller process is the best approach. Alternatively, you might send a timeout message to the process or implement a graceful shutdown strategy.

Yes, that's exactly what we want. The timeout is a contract, and if the process exceeded it it should die.

@renatomassaro, understood. I'll note that for future reviews.

✏️ Learnings added

Learnt from: renatomassaro PR: renatomassaro/FeebDB#15 File: lib/feeb/db/repo/manager.ex:137-154 Timestamp: 2024-11-28T10:03:22.504Z Learning: In the `Feeb.DB.Repo.Manager` module (`lib/feeb/db/repo/manager.ex`), it's acceptable to terminate caller processes using `Process.exit(caller_pid, :feebdb_repo_timeout)` in the `handle_info/2` callback when a repository timeout occurs, as the timeout is a contract and the process should die if it exceeds it.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

coderabbitai · 2024-11-28T01:22:05Z

lib/feeb/db/repo/manager.ex

+      # connection is properly released
+      {key, %{pid: repo_pid, busy?: true, caller_pid: ^caller_pid}} ->
+        # Sanity check: by definition, the caller process is dead
+        false = Process.alive?(caller_pid)


⚠️ Potential issue

Avoid assertive pattern matching when verifying process liveness

Using false = Process.alive?(caller_pid) will raise a MatchError if the process is unexpectedly alive. While the process is expected to be dead at this point, race conditions may occur.

Instead, consider using an if-statement to log a warning or take appropriate action if the process is still alive.

Apply this diff to handle this case more gracefully:

- false = Process.alive?(caller_pid) + if Process.alive?(caller_pid) do + Logger.warning("Caller process #{inspect(caller_pid)} is still alive during DOWN handler") + # Additional handling if necessary + end

Committable suggestion skipped: line range outside the PR's diff.

race conditions may occur

I don't think a race condition is possible in this case, but I'm not 100% sure, which is why I want to have a "blow everything up" protection in case it does happen.

If after some time using this code I realize this is impossible, then I'll just remove this assertion. Furthermore, since we do a clean-up at the Repo itself, even if the caller is still alive and somehow still manages to performs requests, the request will fail at the Repo layer (because it had its state reset).

@renatomassaro, understood. Keeping the assertion as a safeguard makes sense in this case. Thank you for the clarification.

coderabbitai · 2024-11-28T01:22:05Z

lib/feeb/db/repo/manager.ex

+        :ok = Repo.notify_release(pid)
+


⚠️ Potential issue

Handle potential errors from Repo.notify_release/1

Currently, the code assumes that Repo.notify_release(pid) will always return :ok. If this function fails or returns an error, it will cause a MatchError at runtime. To prevent unexpected crashes, consider handling potential errors.

Apply this diff to handle errors more gracefully:

- :ok = Repo.notify_release(pid) + case Repo.notify_release(pid) do + :ok -> + :ok + {:error, reason} -> + Logger.error("Failed to notify repo release: #{inspect(reason)}") + # Handle the error appropriately, such as retrying or proceeding without the notification + end

Committable suggestion skipped: line range outside the PR's diff.

This is on purpose. If we are unable to release the Repo connection for any reason, kill everything and start fresh because something is wrong.

@renatomassaro Understood. Thank you for the clarification.

✏️ Learnings added

Learnt from: renatomassaro PR: renatomassaro/FeebDB#15 File: lib/feeb/db/repo/manager.ex:249-250 Timestamp: 2024-11-28T10:08:23.535Z Learning: In `lib/feeb/db/repo/manager.ex`, within the `do_release_connection/2` function, the line `:ok = Repo.notify_release(pid)` is intentionally designed to match `:ok`, ensuring that if `Repo.notify_release/1` fails, a `MatchError` is raised to trigger a crash and restart of the system. This is by design to handle critical errors, and handling potential errors from `Repo.notify_release/1` is not desired.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

coderabbitai

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (4)

test/db/schema_test.exs (1)
Line range hint 113-118: LGTM! Consider adding write mode test cases.

The change from plain map to DB.Repo.RepoConfig struct improves type safety and connection tracking, which aligns well with the PR's goal of better connection management. The struct's fields effectively capture essential connection metadata including context, shard, mode, and path.

Consider adding test cases that verify the repo_config virtual field with write mode connections to ensure complete test coverage of connection modes. Example:
test "virtual fields with write mode connections", %{shard_id: shard_id} do
  DB.begin(@context, shard_id, :write)
  
  friend = DB.one({:friends, :get_by_name}, "Ross")
  
  expected_repo_config = %DB.Repo.RepoConfig{
    context: @context,
    shard_id: shard_id,
    mode: :write,
    path: DB.Repo.get_path(@context, shard_id)
  }
  
  assert friend.repo_config == expected_repo_config
end
test/db/repo_test.exs (1)

204-255: LGTM: Comprehensive test coverage for connection release management

Excellent test coverage for the new connection release management functionality. The tests effectively verify:

Regular lifecycle behavior

Transaction rollback on forced release

Manager authorization checks

Consider adding these additional test scenarios:

Concurrent release attempts from multiple processes

Release during long-running transactions

Release with pending queries

Would you like me to help implement these additional test scenarios?
lib/feeb/db/repo.ex (2)
31-36: Enhance documentation with usage examples.

While the documentation explains the purpose, it would be more helpful to include:

When exactly this function should be called

Example usage scenarios

Expected behavior when called during an active transaction

156-157: Consider adding success logging for connection releases.

Adding a log message when successfully resetting the state would help with monitoring and debugging connection lifecycle events.
 # Reset the GenServer state so we are ready to serve a new request
+Logger.info("Successfully reset connection state")
 {:reply, :ok, %{state | transaction_id: nil}}

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 72d82b8 and 29b1dd5.

📒 Files selected for processing (6)

lib/feeb/db/repo.ex (4 hunks)
lib/feeb/db/repo/manager.ex (10 hunks)
test/db/repo/manager/registry_test.exs (2 hunks)
test/db/repo_test.exs (7 hunks)
test/db/schema_test.exs (1 hunks)
test/support/db/schemas/friend.ex (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

lib/feeb/db/repo/manager.ex
test/db/repo/manager/registry_test.exs

🧰 Additional context used

📓 Learnings (2)

lib/feeb/db/repo.ex (1)

Learnt from: renatomassaro
PR: renatomassaro/FeebDB#15
File: lib/feeb/db/repo/manager.ex:249-250
Timestamp: 2024-11-28T10:08:23.649Z
Learning: In `lib/feeb/db/repo/manager.ex`, within the `do_release_connection/2` function, the line `:ok = Repo.notify_release(pid)` is intentionally designed to match `:ok`, ensuring that if `Repo.notify_release/1` fails, a `MatchError` is raised to trigger a crash and restart of the system. This is by design to handle critical errors, and handling potential errors from `Repo.notify_release/1` is not desired.

test/db/repo_test.exs (1)

Learnt from: renatomassaro
PR: renatomassaro/FeebDB#15
File: test/db/repo_test.exs:247-247
Timestamp: 2024-11-28T09:52:22.690Z
Learning: The function `block_forever/0` is defined in `Test.Utils` and is used in tests to keep a process active indefinitely.

🔇 Additional comments (8)

test/support/db/schemas/friend.ex (2)

3-3: LGTM! Clean alias addition

The new alias improves code readability and maintains consistency with the existing aliasing pattern.

22-22: Verify test coverage with the stricter type checking

The addition of pattern matching %DB.Repo.RepoConfig{} improves type safety but may affect existing tests.

Let's verify that all test cases pass the correct config type:

✅ Verification successful

Pattern matching on RepoConfig is properly tested

The verification shows that:

The get_repo_config/2 function is used in test cases with the correct %DB.Repo.RepoConfig{} struct type in test/db/schema_test.exs
The test properly initializes the struct with all required fields: context, shard_id, mode, and path

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential test failures due to incorrect repo config type usage

# Search for test cases using get_repo_config
rg -A 5 "get_repo_config" test/

# Search for RepoConfig struct usage in tests
rg -A 5 "DB\.Repo\.RepoConfig" test/

Length of output: 1456

test/db/repo_test.exs (2)

12-12: LGTM: Consistent initialization parameter updates

The addition of nil as the manager_pid parameter across all initialization calls is consistent and aligns with the new connection management functionality.

Also applies to: 26-26, 51-51, 77-77

96-99: LGTM: Enhanced transaction state verification

Good addition of explicit transaction state verification using direct SQLite calls. This improves test coverage by ensuring the actual database state matches the expected behavior.

Also applies to: 108-112

lib/feeb/db/repo.ex (4)

7-9: LGTM! Well-structured struct definition.

The struct definition properly enforces required fields while allowing transaction_id to be optional, which aligns with the connection management requirements.

Line range hint 40-57: LGTM! Robust initialization with proper error handling.

The initialization properly integrates the manager_pid into the state and maintains comprehensive error handling for database connection failures.

147-158: LGTM! Robust connection release handling.

The implementation correctly:

Validates the release signal source
Rolls back any active transaction
Resets the state for new requests

Note: The sequential message processing of GenServer ensures thread-safety here.

401-410: LGTM! Idiomatic implementation of release signal validation.

The implementation follows Erlang/Elixir best practices:

Uses pattern matching for validation
Provides clear error messages
Includes test-specific handling

Type safety!

coderabbitai

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (7)

lib/feeb/db/repo/repo_config.ex (1)
1-4: Add type specifications for struct fields.

The struct definition is clean and well-organized. Consider adding type specifications to improve code documentation and enable better tooling support.
+  @type context :: atom()
+  @type shard_id :: String.t()
+  @type mode :: :primary | :replica
+  @type path :: String.t()
+
+  @type t :: %__MODULE__{
+    context: context(),
+    shard_id: shard_id(),
+    mode: mode(),
+    path: path()
+  }

   @struct_keys [:context, :shard_id, :mode, :path]
   @enforce_keys @struct_keys
   defstruct @struct_keys
lib/feeb/db/repo/manager/repo_entry.ex (4)
1-3: LGTM! Consider adding type specs and field documentation.

The struct definition is well-structured with appropriate fields for tracking connection state. The use of @enforce_keys ensures the busy state is always set.

Consider adding:
 defmodule Feeb.DB.Repo.Manager.RepoEntry do
   @enforce_keys [:busy?]
+  @type t :: %__MODULE__{
+    pid: pid() | nil,           # Repository process ID
+    caller_pid: pid() | nil,    # Process ID of the connection lessee
+    timer_ref: reference() | nil,# Reference for lease timeout timer
+    monitor_ref: reference() | nil,# Reference for process monitor
+    busy?: boolean()            # Whether the connection is currently leased
+  }
   defstruct [:pid, :caller_pid, :timer_ref, :monitor_ref, busy?: false]
5-8: Enhance function documentation.

While the function is correctly implemented, the documentation could be more descriptive about the function's purpose and return value.

Consider expanding the documentation:
   @doc """
-  Creates an initial RepoEntry
+  Creates an initial RepoEntry representing an available connection slot.
+
+  Returns a %RepoEntry{} with busy?: false and all other fields set to nil.
   """
10-14: Add type validation and preserve existing fields.

The function correctly ensures state transition but could be improved in two ways:

Add type validation for repo_pid

Preserve any existing optional fields during the transition

Consider this implementation:
   @doc """
   Set a Repo pid to an initial RepoEntry
   """
+  def on_establish(%__MODULE__{pid: nil, busy?: false} = entry, repo_pid) when is_pid(repo_pid),
-  def on_establish(%__MODULE__{pid: nil, busy?: false}, repo_pid),
-    do: %__MODULE__{pid: repo_pid, busy?: false}
+    do: %{entry | pid: repo_pid}
1-43: Consider adding explicit error handling and resource cleanup.

While the module correctly implements the state machine for repository entries, consider these architectural improvements:

Add explicit error handling for invalid state transitions

Ensure proper cleanup of timer and monitor references when transitioning states

Consider adding logging for state transitions to aid in debugging

This will help prevent resource leaks and make the system more maintainable.
lib/feeb/db/repo.ex (1)
156-157: Consider adding success logging for connection releases.

Adding a log message when successfully resetting the state would help with monitoring and debugging connection lifecycle events.
 # Reset the GenServer state so we are ready to serve a new request
+Logger.info("Successfully reset connection state")
 {:reply, :ok, %{state | transaction_id: nil}}
lib/feeb/db/repo/manager.ex (1)
291-302: Consider improving error handling in establish_connection

The error handling could be more specific about what went wrong during connection establishment.

Consider this improvement:
 case Repo.start_link({context, shard_id, db_path, mode, self()}) do
   {:ok, repo_pid} ->
     log(:info, "Established and fetched #{mode} connection", state)
     current_repo_spec = Map.fetch!(state, key)
     {:ok, repo_pid, Map.put(state, key, RepoEntry.on_establish(current_repo_spec, repo_pid))}
-  error ->
-    log(:error, "Error creating #{mode} connection: #{inspect(error)}", state)
+  {:error, reason} ->
+    log(:error, "Failed to create #{mode} connection: #{inspect(reason)}", state,
+      error_type: reason, mode: mode)
     {:error, state}
+  unexpected ->
+    log(:error, "Unexpected error creating #{mode} connection: #{inspect(unexpected)}", state,
+      error_type: :unexpected, mode: mode)
+    {:error, state}
 end

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 29b1dd5 and 0a54299.

📒 Files selected for processing (8)

lib/feeb/db/repo.ex (4 hunks)
lib/feeb/db/repo/manager.ex (10 hunks)
lib/feeb/db/repo/manager/repo_entry.ex (1 hunks)
lib/feeb/db/repo/repo_config.ex (1 hunks)
test/db/repo/manager/registry_test.exs (2 hunks)
test/db/repo_test.exs (7 hunks)
test/db/schema_test.exs (1 hunks)
test/support/db/schemas/friend.ex (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (4)

test/db/repo/manager/registry_test.exs
test/db/repo_test.exs
test/db/schema_test.exs
test/support/db/schemas/friend.ex

🧰 Additional context used

📓 Learnings (2)

lib/feeb/db/repo.ex (2)

Learnt from: renatomassaro
PR: renatomassaro/FeebDB#15
File: lib/feeb/db/repo.ex:144-145
Timestamp: 2024-11-28T10:01:57.103Z
Learning: In `lib/feeb/db/repo.ex`, when working with Erlang/Elixir code, it's idiomatic to use the same parameter name in function pattern matching to assert that parameters are equal. For example, defining `defp assert_release_signal_from_manager!(manager_pid, manager_pid), do: :ok` is acceptable and preferred.

Learnt from: renatomassaro
PR: renatomassaro/FeebDB#15
File: lib/feeb/db/repo/manager.ex:249-250
Timestamp: 2024-11-28T10:08:23.649Z
Learning: In `lib/feeb/db/repo/manager.ex`, within the `do_release_connection/2` function, the line `:ok = Repo.notify_release(pid)` is intentionally designed to match `:ok`, ensuring that if `Repo.notify_release/1` fails, a `MatchError` is raised to trigger a crash and restart of the system. This is by design to handle critical errors, and handling potential errors from `Repo.notify_release/1` is not desired.

lib/feeb/db/repo/manager.ex (2)

Learnt from: renatomassaro
PR: renatomassaro/FeebDB#15
File: lib/feeb/db/repo/manager.ex:249-250
Timestamp: 2024-11-28T10:08:23.649Z
Learning: In `lib/feeb/db/repo/manager.ex`, within the `do_release_connection/2` function, the line `:ok = Repo.notify_release(pid)` is intentionally designed to match `:ok`, ensuring that if `Repo.notify_release/1` fails, a `MatchError` is raised to trigger a crash and restart of the system. This is by design to handle critical errors, and handling potential errors from `Repo.notify_release/1` is not desired.

Learnt from: renatomassaro
PR: renatomassaro/FeebDB#15
File: lib/feeb/db/repo/manager.ex:137-154
Timestamp: 2024-11-28T10:03:22.695Z
Learning: In the `Feeb.DB.Repo.Manager` module (`lib/feeb/db/repo/manager.ex`), it's acceptable to terminate caller processes using `Process.exit(caller_pid, :feebdb_repo_timeout)` in the `handle_info/2` callback when a repository timeout occurs, as the timeout is a contract and the process should die if it exceeds it.

🔇 Additional comments (11)

lib/feeb/db/repo/repo_config.ex (1)

1-17: Consider adding connection monitoring fields.

Given the PR's focus on monitoring leased connections, consider adding fields to track connection state and lease information.

Let's verify if similar fields exist in other modules:

lib/feeb/db/repo.ex (5)

7-9: LGTM! Well-structured struct definition.

The addition of manager_pid to the struct keys and its inclusion in @enforce_keys ensures proper initialization of the repository state.

31-36: LGTM! Well-documented public interface.

The notify_release/1 function provides a clear interface for the Manager to notify the Repo of connection releases, with good documentation explaining its purpose and effects.

Line range hint 40-57: LGTM! Proper state initialization.

The init/1 function correctly initializes the repository state with the manager_pid and properly constructs the repo_config from the complete state.

147-158: LGTM! Robust connection release handling.

The implementation correctly:

Validates the release signal source
Handles active transactions by rolling them back
Resets the state for new requests

The sequential message processing of GenServer ensures thread-safety.

401-410: LGTM! Idiomatic release signal validation.

The implementation uses idiomatic Erlang/Elixir pattern matching for parameter validation, with a clear error message for unauthorized release attempts. The special case for the test environment is properly isolated.

lib/feeb/db/repo/manager.ex (5)

43-61: LGTM! Well-structured module configuration

The module attributes and struct definitions are well-organized, with clear timeout thresholds and proper enforcement of struct keys.

141-204: LGTM! Robust timeout and process monitoring implementation

The implementation properly handles connection timeouts and process monitoring:

Terminates processes that exceed the repo timeout (intentional design)
Releases connections when processes die
Includes appropriate warning logs for normal process termination without connection release

247-257: LGTM! Thorough connection cleanup implementation

The connection release logic includes all necessary cleanup steps:

Stops process monitoring
Cancels timeout timer
Notifies repo of release (with intentional crash on failure)

Line range hint 380-401: LGTM! Well-structured timer management

The timer management functions are well-implemented with:

Clear separation between repo and queue timeouts
Proper handling of infinite timeouts (:no_timer)
Clean timer cancellation logic

294-294: Address the REVIEW comment regarding GenServer linking

The comment "Do I really want to link both genservers?" needs to be addressed. Consider the implications of linking:

Pros: Ensures both processes fail together, maintaining consistency
Cons: May cause cascading failures

✅ Verification successful

Let me gather more information about the supervision and linking strategy.

Let me check one more thing about the process linking and monitoring.

Keep the GenServer linking as it ensures proper cleanup and crash propagation

The current linking strategy is appropriate because:

The Repo process is supervised by a DynamicSupervisor with :one_for_one strategy
The Manager monitors client processes to handle cleanup on client crashes
The link between Manager and Repo ensures database connections are properly closed if either process crashes

The REVIEW comment can be removed as the current implementation follows Elixir's best practices for process supervision and error handling.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if there are any crash handling or supervision strategies in place
rg -A 5 "use Supervisor|use DynamicSupervisor"

Length of output: 426

Script:

#!/bin/bash
# Check the supervisor implementation and child spec
cat lib/feeb/db/repo/manager/supervisor.ex

# Check how the Repo process is started and supervised
ast-grep --pattern 'def start_link($$$)'

# Look for restart strategies
rg -A 5 "restart:|strategy:"

Length of output: 1900

Script:

#!/bin/bash
# Check for process monitoring and linking patterns
rg -A 5 "Process\.monitor|Process\.link"

# Check the Repo module implementation for crash handling
cat lib/feeb/db/repo.ex

Length of output: 15786

coderabbitai · 2024-11-28T11:12:26Z

lib/feeb/db/repo/repo_config.ex

+  @doc """
+  Builds the initial (and only) RepoConfig from the Repo state
+  """
+  def from_state(state) do
+    %__MODULE__{
+      context: state.context,
+      shard_id: state.shard_id,
+      mode: state.mode,
+      path: state.path
+    }
+  end


🛠️ Refactor suggestion

Enhance documentation and add input validation.

The function implementation is correct but could benefit from improved documentation and validation.

@doc """ Builds the initial (and only) RepoConfig from the Repo state + + ## Parameters + * `state` - A map containing repository state with the following required fields: + * `:context` - The repository context + * `:shard_id` - The shard identifier + * `:mode` - The repository mode (:primary or :replica) + * `:path` - The repository path + + ## Returns + * `%RepoConfig{}` - A new repository configuration struct + + ## Examples + iex> state = %{context: :user, shard_id: "1", mode: :primary, path: "/db/1"} + iex> RepoConfig.from_state(state) + %RepoConfig{context: :user, shard_id: "1", mode: :primary, path: "/db/1"} """ + @spec from_state(map()) :: t() def from_state(state) do + unless Enum.all?(@struct_keys, &Map.has_key?(state, &1)) do + raise ArgumentError, "state must contain all required fields: #{inspect(@struct_keys)}" + end + %__MODULE__{ context: state.context, shard_id: state.shard_id,

Committable suggestion skipped: line range outside the PR's diff.

coderabbitai · 2024-11-28T11:12:27Z

lib/feeb/db/repo/manager/repo_entry.ex

+  @doc """
+  Lease a Repo (in an established RepoEntry) to a particular caller
+  """
+  def on_acquire(%__MODULE__{pid: pid, busy?: false}, {caller_pid, monitor_ref, timer_ref}) do
+    %__MODULE__{
+      pid: pid,
+      busy?: true,
+      caller_pid: caller_pid,
+      monitor_ref: monitor_ref,
+      timer_ref: timer_ref
+    }
+  end


🛠️ Refactor suggestion

Add type validation and preserve existing fields.

While the function implements the leasing logic correctly, it should validate inputs and preserve any existing fields not related to the lease state.

Consider this implementation:

@doc """ Lease a Repo (in an established RepoEntry) to a particular caller """ + def on_acquire(%__MODULE__{pid: pid, busy?: false} = entry, {caller_pid, monitor_ref, timer_ref}) + when is_pid(pid) and is_pid(caller_pid) and is_reference(monitor_ref) and is_reference(timer_ref) do - def on_acquire(%__MODULE__{pid: pid, busy?: false}, {caller_pid, monitor_ref, timer_ref}) do - %__MODULE__{ - pid: pid, - busy?: true, - caller_pid: caller_pid, - monitor_ref: monitor_ref, - timer_ref: timer_ref - } + %{entry | + busy?: true, + caller_pid: caller_pid, + monitor_ref: monitor_ref, + timer_ref: timer_ref} end

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@doc """

Lease a Repo (in an established RepoEntry) to a particular caller

"""

def on_acquire(%__MODULE__{pid: pid, busy?: false}, {caller_pid, monitor_ref, timer_ref}) do

%__MODULE__{

pid: pid,

busy?: true,

caller_pid: caller_pid,

monitor_ref: monitor_ref,

timer_ref: timer_ref

}

end

@doc """

Lease a Repo (in an established RepoEntry) to a particular caller

"""

def on_acquire(%__MODULE__{pid: pid, busy?: false} = entry, {caller_pid, monitor_ref, timer_ref})

when is_pid(pid) and is_pid(caller_pid) and is_reference(monitor_ref) and is_reference(timer_ref) do

%{entry |

busy?: true,

caller_pid: caller_pid,

monitor_ref: monitor_ref,

timer_ref: timer_ref}

end

coderabbitai · 2024-11-28T11:12:27Z

lib/feeb/db/repo/manager/repo_entry.ex

+  @doc """
+  Release the Repo in a leased RepoEntry
+  """
+  def on_release(%__MODULE__{pid: repo_pid}) do
+    %__MODULE__{
+      pid: repo_pid,
+      busy?: false
+    }
+  end
+
+  @doc """
+  Remove the Repo from an initial, established or released RepoEntry.
+  """
+  def on_close(%__MODULE__{pid: _, busy?: false}), do: on_start()


💡 Codebase verification

🛠️ Refactor suggestion

Add state validation and cleanup of monitoring references

The review comment's suggestions are valid and important. The code shows that RepoEntry maintains timer_ref, monitor_ref, and caller_pid fields that are set during on_acquire but not properly cleaned up during on_release. The test files confirm these fields need proper cleanup.

Consider this implementation:

def on_release(%__MODULE__{pid: pid, busy?: true} = entry) when is_pid(pid) do %{entry | busy?: false, caller_pid: nil, monitor_ref: nil, timer_ref: nil} end def on_close(%__MODULE__{pid: pid, busy?: false} = _entry) when is_pid(pid), do: on_start()

The changes:

Add guard to ensure valid pid and state in both functions

For on_release:

Verify entry is actually leased (busy?: true)

Clear all lease-related fields (timer_ref, monitor_ref, caller_pid)

Use struct update syntax for efficiency

For on_close:

Keep existing behavior but add pid validation

🔗 Analysis chain

Add state validation and improve implementation efficiency.

Both functions should validate their input states more strictly and update existing structs rather than creating new ones.

Consider these implementations:

@doc """ Release the Repo in a leased RepoEntry """ - def on_release(%__MODULE__{pid: repo_pid}) do - %__MODULE__{ - pid: repo_pid, - busy?: false - } + def on_release(%__MODULE__{pid: pid, busy?: true} = entry) when is_pid(pid) do + %{entry | + busy?: false, + caller_pid: nil, + monitor_ref: nil, + timer_ref: nil} end @doc """ Remove the Repo from an initial, established or released RepoEntry. """ - def on_close(%__MODULE__{pid: _, busy?: false}), do: on_start() + def on_close(%__MODULE__{pid: pid, busy?: false} = _entry) when is_pid(pid), do: on_start()

The changes:

Add guards to ensure valid states and types

For on_release:

Verify the entry is actually leased (busy?: true)

Clear all lease-related fields

For on_close:

Verify we have a valid pid to close

Let's verify the connection cleanup behavior:

The test files and implementation show that timer_ref and monitor_ref are indeed important fields that need proper cleanup during release. Let's verify the complete struct definition and the on_acquire implementation to ensure our suggested changes align with the codebase's design.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash # Search for cleanup of timer and monitor references rg -A 5 "timer_ref|monitor_ref" --type elixir

Length of output: 8147

Script:

#!/bin/bash # Get the complete RepoEntry module implementation ast-grep --pattern 'defmodule $_ do $$$ end' lib/feeb/db/repo/manager/repo_entry.ex

Length of output: 2963

This was a bug uncovered by the warnings added by FeebDB on PR renatomassaro/FeebDB#15

renatomassaro force-pushed the improved-monitoring-of-leased-connections branch from a587ec4 to 528af16 Compare November 24, 2024 15:10

renatomassaro added 3 commits November 26, 2024 06:17

Add block_forever test util

abc5d29

Remove outdated comment

7568d79

Now we automatically enqueue the requests; there is no polling

Add concept of Repo timeout via the :timeout opt

f60d508

Now a request / caller / leasee can no longer hold the connection open indefinitely -- unless explicitly stated.

renatomassaro force-pushed the improved-monitoring-of-leased-connections branch from cdbff15 to f3c6f7c Compare November 27, 2024 10:27

renatomassaro added 2 commits November 27, 2024 07:33

Have the Repo.Manager notify the Repo on connection release

0253482

This allows the Repo to perform internal clean-up in order to be ready to serve new requests.

Fix typo

94733e4

renatomassaro force-pushed the improved-monitoring-of-leased-connections branch from f3c6f7c to 94733e4 Compare November 27, 2024 10:34

Have a unified API for starting a Repo

72d82b8

The implicit "if manager absent then it is nil" logic was causing bugs and it's always better to be explicit

coderabbitai bot reviewed Nov 28, 2024

View reviewed changes

Refactor Repo and RepoManager to use structs for internal state

0a54299

Type safety!

renatomassaro force-pushed the improved-monitoring-of-leased-connections branch from 29b1dd5 to 0a54299 Compare November 28, 2024 11:09

coderabbitai bot reviewed Nov 28, 2024

View reviewed changes

renatomassaro merged commit e734106 into main Nov 30, 2024
5 checks passed

renatomassaro deleted the improved-monitoring-of-leased-connections branch November 30, 2024 00:01

renatomassaro added a commit to HackerExperience/HackerExperience that referenced this pull request Dec 1, 2024

Commit transaction opened when fetching user session

d9b4cf3

This was a bug uncovered by the warnings added by FeebDB on PR renatomassaro/FeebDB#15

		# Give ample time for the `handle_info({:DOWN, ...})` callback to run
		:timer.sleep(10)

		assert_release_signal_from_manager!(state.manager_pid, caller_pid)

Improved monitoring of leased connections #15

Improved monitoring of leased connections #15

Conversation

renatomassaro commented Nov 24, 2024 • edited by coderabbitai bot Loading

Summary by CodeRabbit

coderabbitai bot commented Nov 28, 2024 • edited Loading

Walkthrough

Changes

Poem

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024 • edited Loading

Choose a reason for hiding this comment

renatomassaro Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024 • edited Loading

Choose a reason for hiding this comment

renatomassaro Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024 • edited Loading

Choose a reason for hiding this comment

renatomassaro Nov 28, 2024 • edited Loading

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024 • edited Loading

Choose a reason for hiding this comment

renatomassaro Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024 • edited Loading

Choose a reason for hiding this comment

renatomassaro Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024 • edited Loading

Choose a reason for hiding this comment

renatomassaro Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024 • edited Loading

Choose a reason for hiding this comment

renatomassaro Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024 • edited Loading

Choose a reason for hiding this comment

renatomassaro Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Nov 28, 2024

Choose a reason for hiding this comment

renatomassaro commented Nov 24, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 28, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)

coderabbitai bot Nov 28, 2024 •

edited

Loading

coderabbitai bot Nov 28, 2024 •

edited

Loading

coderabbitai bot Nov 28, 2024 •

edited

Loading

renatomassaro Nov 28, 2024 •

edited

Loading

coderabbitai bot Nov 28, 2024 •

edited

Loading

coderabbitai bot Nov 28, 2024 •

edited

Loading

coderabbitai bot Nov 28, 2024 •

edited

Loading

coderabbitai bot Nov 28, 2024 •

edited

Loading

coderabbitai bot Nov 28, 2024 •

edited

Loading