Feat: add support for distributed serving type #1187

Merged: 3 commits into kubeflow:master on Nov 7, 2024

@linnlh linnlh commented Nov 1, 2024

Purpose of this PR

This PR introduces a new serving type, distributed, to Arena's serving module. The primary motivation is to enable the deployment of large-scale models across multiple nodes within a Kubernetes (K8s) cluster.

Proposed changes:

  • Introduce a new serving type called distributed to Arena's serving module, which can deploy a model across multiple nodes.
  • Update the relevant documentation to provide guidance on using the distributed serving type.

Which issue(s) this PR fixes:
Fixes #1186

Change Category

  • Bugfix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that could affect existing functionality)
  • Documentation update

Rationale

The distributed serving type addresses the increasing demand for multi-host inference driven by the advancement of large language models (LLMs) such as Meta's Llama-3.1-405B. Currently, Arena cannot deploy a single model distributed across multiple nodes, and this PR aims to fill that gap.

林联辉 added 2 commits October 31, 2024 19:12
@linnlh linnlh changed the title from "feat: add support for distributed serving type" to "Feat: add support for distributed serving type" on Nov 1, 2024
Signed-off-by: 林联辉 <[email protected]>

linnlh commented Nov 1, 2024

@ChenYi015 @cheyang @Syulin7

	b.AddArgValue(key, value)
}
if err := b.PreBuild(); err != nil {
	return nil, err

I suggest using fmt.Errorf("failed to build args: %v", err) instead of err.

	return nil, err
}
if err := b.ArgsBuilder.Build(); err != nil {
	return nil, err

Ditto.

cheyang commented Nov 7, 2024

@linnlh Please run the following commands to download the Go modules into the vendor directory.

go mod tidy
go mod vendor

@cheyang cheyang left a comment

/lgtm
/approve

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheyang

@google-oss-prow google-oss-prow bot merged commit 68b71f9 into kubeflow:master Nov 7, 2024
5 checks passed
Successfully merging this pull request may close these issues.

Add support for deploying large-scale model across multiple nodes