
[Feature Request] Add ModelManager to schedule calls between different model backend #1031

Open
1 of 2 tasks
lightaime opened this issue Oct 10, 2024 · 0 comments
Labels
call for contribution enhancement New feature or request New Feature P1 Task with middle level priority
lightaime commented Oct 10, 2024

Required prerequisites

Motivation

This will be helpful when we are constrained by rate limits: we could use a ModelManager in ChatAgent and hold a different API key for each model. It could also be powerful if we want to do model-adaptive inference, where the backend is selected based on the task. Pseudocode sketch:

class ModelManager:

    def __init__(self, models, scheduling_strategy=None):
        """
        Initialize the ModelManager with a list of models and a scheduling strategy.
        models: list of model backends (e.g., model instances, APIs)
        scheduling_strategy: function that defines how to select the next model.
                             Defaults to round-robin if not provided.
        """
        self.models = models  # List of model backends

        # Set the scheduling strategy; default is round-robin
        if scheduling_strategy:
            self.scheduling_strategy = scheduling_strategy
        else:
            # Implement round-robin inside init if no strategy is passed
            self.current_index = 0
            self.scheduling_strategy = self.round_robin

    @staticmethod
    def round_robin(models, current_index):
        """
        Static round-robin scheduling strategy.
        It accepts the list of models and the current index.
        """
        model = models[current_index]
        next_index = (current_index + 1) % len(models)
        return model, next_index

    def run(self, messages):
        """
        Process a list of messages by selecting a model based on the scheduling strategy.
        Sends the entire list of messages to the selected model, and returns a single response.
        messages: list of messages to be processed.
        """
        if self.scheduling_strategy == self.round_robin:
            # Call static round-robin method with models and current index
            model, self.current_index = self.scheduling_strategy(
                self.models, self.current_index)
        else:
            # Custom scheduling strategy (without using round-robin)
            model = self.scheduling_strategy()

        # Pass all messages to the selected model and get the response
        response = model.run(messages)

        return response


# Example Model class that processes a list of messages and returns a single response
class Model:

    def __init__(self, name):
        self.name = name

    def run(self, messages):
        """
        Simulate processing a list of messages and return a single response.
        messages: list of messages to process.
        """
        return f"Model {self.name} processed these messages: {messages}"


# Custom scheduling strategy (e.g., always use the first model);
# note: it closes over the global `models` list defined below.
def always_first_model():
    return models[0]


# Initialize the ModelManager with different models
models = [Model("A"), Model("B"), Model("C")]

# Use default round-robin strategy
manager_round_robin = ModelManager(models)

# List of lists of messages to be processed
messages_list = [["Message 1a", "Message 1b"], ["Message 2a", "Message 2b"],
                 ["Message 3a", "Message 3b"]]

for messages in messages_list:
    # Process all messages and get a single response with round-robin
    response_round_robin = manager_round_robin.run(messages)

    # Output response
    print(response_round_robin)

# Use custom scheduling strategy (always select first model)
manager_custom = ModelManager(models, scheduling_strategy=always_first_model)

for messages in messages_list:
    # Process all messages with custom strategy and get a single response
    response_custom = manager_custom.run(messages)

    # Output response
    print(response_custom)
Output:

Model A processed these messages: ['Message 1a', 'Message 1b']
Model B processed these messages: ['Message 2a', 'Message 2b']
Model C processed these messages: ['Message 3a', 'Message 3b']
Model A processed these messages: ['Message 1a', 'Message 1b']
Model A processed these messages: ['Message 2a', 'Message 2b']
Model A processed these messages: ['Message 3a', 'Message 3b']
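Because the strategy is pluggable, other policies drop in easily. As a hypothetical example (not part of the proposal above), a random-choice strategy can spread load across backends whose rate limits are independent; the `Model` stub below mirrors the example class above:

```python
import random


class Model:
    """Minimal stub mirroring the example Model class above."""

    def __init__(self, name):
        self.name = name

    def run(self, messages):
        return f"Model {self.name} processed these messages: {messages}"


def random_choice(models):
    # Pick a backend uniformly at random; order does not matter when
    # each backend has its own independent rate limit.
    return random.choice(models)


models = [Model("A"), Model("B"), Model("C")]
selected = random_choice(models)
print(selected.run(["Message 1a", "Message 1b"]))
```

Such a strategy could be passed as `scheduling_strategy` in the same way as `always_first_model` above.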

Similar ideas can be applied to other objects that require API keys, such as tools and loaders.
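As a sketch of that generalization (illustrative only; `KeyRotator` is a hypothetical name, not an existing CAMEL API), the same round-robin idea applied to a pool of API keys:

```python
from itertools import cycle


class KeyRotator:
    """Rotate through a pool of API keys round-robin, mirroring the
    model scheduling above. Hypothetical sketch, not an existing API."""

    def __init__(self, api_keys):
        if not api_keys:
            raise ValueError("at least one API key is required")
        self._keys = cycle(api_keys)

    def next_key(self):
        # Each call yields the next key, wrapping around at the end.
        return next(self._keys)


rotator = KeyRotator(["key-1", "key-2", "key-3"])
print([rotator.next_key() for _ in range(4)])  # wraps back to key-1
```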

Solution

No response

Alternatives

No response

Additional context

No response

@lightaime lightaime added the enhancement New feature or request label Oct 10, 2024
@lightaime lightaime self-assigned this Oct 10, 2024
@Wendong-Fan Wendong-Fan linked a pull request Oct 16, 2024 that will close this issue
@Wendong-Fan Wendong-Fan removed a link to a pull request Oct 16, 2024
@Wendong-Fan Wendong-Fan added call for contribution P1 Task with middle level priority New Feature labels Oct 18, 2024
@Wendong-Fan Wendong-Fan added this to the Sprint 15 milestone Oct 18, 2024