Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat/93 add api key #95

Merged
merged 4 commits into from
Nov 12, 2024
Merged

Feat/93 add api key #95

merged 4 commits into from
Nov 12, 2024

Conversation

amindadgar
Copy link
Member

@amindadgar amindadgar commented Nov 12, 2024

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced API key validation for the /ask and /status endpoints to enhance security.
    • Added functionality for managing API keys, including validation against a MongoDB backend.
  • Bug Fixes

    • Improved error handling for missing or invalid API keys.
  • Tests

    • Added a comprehensive test suite for the API key validation logic, covering various scenarios for token availability.

Copy link
Contributor

coderabbitai bot commented Nov 12, 2024

Warning

Rate limit exceeded

@amindadgar has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 33 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between a9b9e0d and b0dd8bb.

Walkthrough

The changes in this pull request enhance the FastAPI application by implementing API key validation for specific endpoints. The routers/http.py file has been updated to require a valid API key for the ask and status endpoints. A new file, services/api_key.py, has been introduced to handle API key validation, including a method to check the key's validity against a MongoDB database. Additionally, a new test class in tests/integration/test_validate_token.py has been created to ensure the functionality of the API key validation logic through various test scenarios.

Changes

File Change Summary
routers/http.py Updated ask and status methods to include dependencies=[Depends(api_key_header)] for API key validation.
services/api_key.py Introduced get_api_key method for retrieving and validating the API key, and added ValidateAPIKey class with a validate method for MongoDB key checking.
tests/integration/test_validate_token.py Added TestValidateToken class to test ValidateAPIKey functionality with methods for various token validation scenarios.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API
    participant Validator
    participant Database

    Client->>API: POST /ask with API key
    API->>Validator: Validate API key
    Validator->>Database: Check API key
    Database-->>Validator: Return validation result
    Validator-->>API: Return validation status
    API->>Client: Return task ID and status
Loading

🐇 "In the code, a key we now seek,
With validation, security we tweak.
From ask to status, a safeguard we place,
In our FastAPI, a more secure space.
With tests in tow, our logic is sound,
Hooray for the changes, let joy abound!" 🥳


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (12)
routers/http.py (2)

Line range hint 29-35: Add error handling for payload persistence.

The persistence operation could fail but there's no error handling. Consider wrapping it in a try-except block similar to the one in the /status endpoint.

     # persisting the payload
     persister = PersistPayload()
-    persister.persist_http(payload_http)
+    try:
+        persister.persist_http(payload_http)
+    except Exception as e:
+        logging.error(f"Failed to persist initial payload: {e}")
+        # Continue execution as this is non-critical

Line range hint 44-51: Add type validation for task.result access.

The code assumes task.result contains specific keys ('community_id', 'question', 'response') without validation. This could raise KeyError if the task result structure is different.

     if task.status == "SUCCESS":
+        result = task.result
+        if not all(key in result for key in ['community_id', 'question', 'response']):
+            logging.error(f"Invalid task result structure: {result}")
+            return {"id": task.id, "status": "ERROR", "message": "Invalid result structure"}
+
         http_payload = HTTPPayload(
-            communityId=task.result["community_id"],
-            question=QuestionModel(message=task.result["question"]),
-            response=ResponseModel(message=task.result["response"]),
+            communityId=result["community_id"],
+            question=QuestionModel(message=result["question"]),
+            response=ResponseModel(message=result["response"]),
             taskId=task.id,
         )
tests/integration/test_validate_token.py (2)

5-7: Remove extra empty line

There's an unnecessary extra empty line between imports and the class definition.

 from services.api_key import ValidateAPIKey
 
-
 class TestValidateToken(TestCase):

21-75: Enhance test coverage with additional test cases

The current test suite could be improved with:

  1. Additional test cases for edge cases:
    • Invalid API key formats
    • Empty API key
    • Very long API keys
    • Special characters in API keys
  2. Validation of error messages
  3. Reusable test data fixtures

Here's a suggested implementation for additional test methods:

def test_invalid_api_key_formats(self):
    test_cases = [
        "",  # empty
        " ",  # whitespace
        "a" * 1000,  # very long
        "!@#$%^&*()",  # special chars
    ]
    for api_key in test_cases:
        with self.subTest(api_key=api_key):
            valid = self.validator.validate(api_key)
            self.assertFalse(valid)

@staticmethod
def _create_token_doc(token_id: int, token: str) -> dict:
    return {
        "id": token_id,
        "token": token,
        "options": {},
    }

def test_validate_with_error_message(self):
    api_key = "invalid_key"
    result = self.validator.validate(api_key)
    self.assertFalse(result)
    # Assuming validate method returns tuple(bool, str)
    # self.assertEqual(error_message, "Invalid API key format")

Also, consider refactoring the existing test data creation:

-    self.client[self.validator.db][self.validator.tokens_collection].insert_many(
-        [
-            {
-                "id": 1,
-                "token": "1111",
-                "options": {},
-            },
-            {
-                "id": 2,
-                "token": "2222",
-                "options": {},
-            },
-            {
-                "id": 3,
-                "token": "3333",
-                "options": {},
-            },
-        ]
-    )
+    test_tokens = [
+        self._create_token_doc(1, "1111"),
+        self._create_token_doc(2, "2222"),
+        self._create_token_doc(3, "3333"),
+    ]
+    self.client[self.validator.db][self.validator.tokens_collection].insert_many(test_tokens)
services/api_key.py (8)

3-3: Remove unused import 'List' from 'typing'.

The List import is not used anywhere in the code and can be safely removed to clean up the imports.

Apply this diff to remove the unused import:

-from typing import List
🧰 Tools
🪛 Ruff

3-3: typing.List imported but unused

Remove unused import: typing.List

(F401)


53-53: Simplify the return statement in the validate method.

You can simplify the return statement by directly returning the boolean evaluation of document using the bool() function.

Apply this diff to simplify the return statement:

-    return True if document else False
+    return bool(document)
🧰 Tools
🪛 Ruff

53-53: Use bool(...) instead of True if ... else False

Replace with `bool(...)

(SIM210)


28-33: Externalize database configuration parameters.

The database name and collection ("hivemind" and "tokens") are hardcoded. Consider fetching these values from a configuration file or environment variables to enhance flexibility and maintainability.

Apply this diff as an example of how to externalize configuration:

+import os

 class ValidateAPIKey:
     def __init__(self) -> None:
         self.client = MongoSingleton.get_instance().get_client()
-        self.db = "hivemind"
-        self.tokens_collection = "tokens"
+        self.db = os.getenv("MONGODB_DATABASE", "hivemind")
+        self.tokens_collection = os.getenv("TOKENS_COLLECTION", "tokens")

35-48: Improve the docstring for clarity and consistency.

The docstring in the validate method can be improved for better readability and to follow standard conventions.

Here is a revised docstring:

    def validate(self, api_key: str) -> bool:
        """
        Check if the API key is available in MongoDB.

        Parameters
        ----------
        api_key : str
            The provided API key to check in the database.

        Returns
        -------
        bool
            True if the key is valid (exists in the collection), False otherwise.
        """

6-6: Handle potential exceptions when connecting to MongoDB.

When obtaining the MongoDB client, there may be connection errors that could cause unhandled exceptions. Consider adding error handling to manage potential connection issues gracefully.

Example modification:

from pymongo.errors import ConnectionFailure

class ValidateAPIKey:
    def __init__(self) -> None:
        try:
            self.client = MongoSingleton.get_instance().get_client()
        except ConnectionFailure:
            raise HTTPException(
                status_code=HTTP_401_UNAUTHORIZED, detail="Database connection failed"
            )
        self.db = os.getenv("MONGODB_DATABASE", "hivemind")
        self.tokens_collection = os.getenv("TOKENS_COLLECTION", "tokens")

18-20: Consider generalizing error messages for unauthorized access.

Providing specific details in authentication error messages can aid malicious users. It's a security best practice to use generic error messages for authentication failures.

Apply this diff to generalize the error message:

             raise HTTPException(
-                status_code=HTTP_401_UNAUTHORIZED, detail="No API key provided"
+                status_code=HTTP_401_UNAUTHORIZED, detail="Unauthorized"
             )

23-23: Consider generalizing error messages for invalid API keys.

Similar to the previous point, it's advisable to avoid revealing whether an API key is invalid. Using a generic message can enhance security.

Apply this diff to generalize the error message:

         raise HTTPException(
-            status_code=HTTP_401_UNAUTHORIZED, detail="Invalid API key"
+            status_code=HTTP_401_UNAUTHORIZED, detail="Unauthorized"
         )

14-25: Optimize the instantiation of ValidateAPIKey.

Creating a new instance of ValidateAPIKey on every request may be inefficient. Consider instantiating it once at the module level or using dependency injection with FastAPI.

Example modification:

Instantiate validator at the module level:

 validator = ValidateAPIKey()

 async def get_api_key(api_key_header: str = Security(api_key_header)):
-    validator = ValidateAPIKey()

Alternatively, use FastAPI's dependency injection:

from fastapi import Depends

validator = ValidateAPIKey()

async def get_api_key(
    api_key_header: str = Security(api_key_header),
    validator: ValidateAPIKey = Depends(ValidateAPIKey),
):
    # Rest of the code
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 80afc66 and a9b9e0d.

📒 Files selected for processing (3)
  • routers/http.py (3 hunks)
  • services/api_key.py (1 hunks)
  • tests/integration/test_validate_token.py (1 hunks)
🧰 Additional context used
🪛 Ruff
services/api_key.py

3-3: typing.List imported but unused

Remove unused import: typing.List

(F401)


53-53: Use bool(...) instead of True if ... else False

Replace with `bool(...)

(SIM210)

🔇 Additional comments (5)
routers/http.py (4)

4-4: LGTM! Clean dependency management.

The new imports are well-organized and follow good separation of concerns by keeping API key validation logic in a separate service.

Also applies to: 9-9


20-20: LGTM! Proper implementation of API key validation.

The API key validation is correctly implemented using FastAPI's dependency injection system.


40-40: LGTM! Proper implementation of API key validation.

The API key validation is correctly implemented using FastAPI's dependency injection system.


20-20: Verify API key protection coverage.

Let's ensure all endpoints that need protection have the API key validation dependency.

Also applies to: 40-40

✅ Verification successful

Based on the results, I can generate the final response as we have clear evidence that all endpoints in the codebase (/ask and /status) are properly protected with the API key validation dependency.

API key protection is properly implemented across all endpoints

Both existing endpoints have the required API key validation dependency:

  • /ask (POST) is protected
  • /status (GET) is protected
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find all router endpoints in the codebase and verify API key protection

# Search for all FastAPI router endpoints
echo "Searching for all FastAPI router endpoints..."
rg "@router\.(get|post|put|delete|patch)" -A 2

# Search for endpoints with API key validation
echo -e "\nSearching for endpoints with API key validation..."
rg "@router\.(get|post|put|delete|patch).*dependencies=\[Depends\(api_key_header\)\]"

Length of output: 874

tests/integration/test_validate_token.py (1)

1-75: Verify test infrastructure and similar test cases

Since this is an integration test depending on MongoDB:

  1. Ensure CI/CD pipeline has MongoDB available
  2. Check for similar test cases to prevent duplication
✅ Verification successful

MongoDB integration is properly configured and test setup is appropriate

The verification shows that:

  1. MongoDB is properly configured in both development (docker-compose.dev.yml) and test (docker-compose.test.yml) environments
  2. The test uses a separate test database (hivemind_test) to avoid conflicts with production data
  3. Other integration tests (test_data_source_selector.py, test_fetch_community_id.py, test_persist_payload.py) follow similar patterns for MongoDB testing
  4. The test is not duplicated as it's the only test specifically for token validation
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for similar test files and CI configuration

# Look for similar test files
echo "Searching for similar test files..."
fd -e py test_.*token test_.*api_key tests/

# Check for CI/CD MongoDB configuration
echo "Checking CI/CD configuration..."
fd -e yml -e yaml . .github/workflows | xargs rg -l mongodb

Length of output: 367


Script:

#!/bin/bash
# Let's try a different approach to check CI configuration and similar tests

# Check for any CI/CD configuration files
echo "=== CI/CD Configuration Files ==="
fd -e yml -e yaml . .github/

# Look for MongoDB setup in any configuration
echo -e "\n=== MongoDB Configuration ==="
rg -i "mongo" -A 5 -B 5

# Search for similar test patterns
echo -e "\n=== Similar Test Patterns ==="
rg -l "ValidateAPIKey|MongoSingleton" tests/
rg "class Test.*Token" tests/

# Check the actual ValidateAPIKey implementation
echo -e "\n=== ValidateAPIKey Implementation ==="
rg -l "class ValidateAPIKey" .

Length of output: 25182

tests/integration/test_validate_token.py Show resolved Hide resolved
services/api_key.py Outdated Show resolved Hide resolved
@amindadgar amindadgar requested a review from cyri113 November 12, 2024 10:52
@amindadgar amindadgar merged commit 7260171 into main Nov 12, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant