feat(ai): add AI orchestrator metrics #3097
Merged
This commit adds the initial AI gateway metrics so that they can be reviewed by others. The code still needs to be cleaned up and the buckets adjusted.
This commit improves the AI metrics so that they are easier to work with.
This commit ensures that an error is logged when the Gateway could not find orchestrators for a given model and capability.
This commit ensures that the `ticket_value_sent` and `tickets_sent` metrics are also created for an AI Gateway.
This commit ensures that the AI gateway metrics contain the orch address label.
rickstaa changed the title from "ai orchestrator metrics" to "feat(ai): add AI orchestrator metrics" on Jul 14, 2024
This commit introduces a suite of AI orchestrator metrics to the census module, mirroring those received by the Gateway. The newly added metrics include `ai_models_requested`, `ai_request_latency_score`, `ai_request_price`, and `ai_request_errors`, facilitating comprehensive tracking and analysis of AI request handling performance on the orchestrator side.
rickstaa force-pushed the ai-orchestrator-metrics branch from a1d53c6 to a3f7d53 on July 14, 2024 09:40
rickstaa commented on Jul 17, 2024
monitor/census.go (outdated)
    Name: "ai_request_latency_score",
    Measure: census.mAIRequestLatencyScore,
    Description: "AI request latency score",
    TagKeys: append([]tag.Key{census.kPipeline, census.kModelName}, baseTagsWithNodeInfo...),
@eliteprox, @ad-astra-video do you think listing this per gateway label makes sense?
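For readers unfamiliar with the `append([]tag.Key{...}, baseTagsWithNodeInfo...)` pattern in the snippet under review, here is a minimal sketch of why the metric-specific keys go into a fresh slice literal first. It uses plain strings instead of OpenCensus `tag.Key` values so that it runs with the standard library only; `withExtraTags` and the tag names are hypothetical and not part of go-livepeer.

```go
package main

import "fmt"

// withExtraTags returns a new slice holding extra followed by base.
// Hypothetical helper: it mirrors the shape of
// append([]tag.Key{...}, baseTagsWithNodeInfo...) using strings.
// Because the result starts from a fresh slice, append never writes
// into base's backing array, so views sharing base cannot affect
// each other.
func withExtraTags(base []string, extra ...string) []string {
	out := append([]string{}, extra...)
	return append(out, base...)
}

func main() {
	// Illustrative tag names only.
	base := []string{"node_type", "node_id"}
	fmt.Println(withExtraTags(base, "pipeline", "model_name"))
	fmt.Println(base) // base is unchanged
}
```

Whether an additional per-gateway key belongs in that list is a separate question: every extra tag key multiplies the number of time series the view can produce.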
This commit ensures that the right tags are attached to the Orchestrator AI metrics.
rickstaa force-pushed the ai-orchestrator-metrics branch from 1513362 to f3bab00 on July 18, 2024 13:16
This commit ensures that no divide-by-zero errors can occur in the latency score calculations.
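The kind of guard this commit describes can be sketched as follows. The actual latency score formula is not shown on this page, so the one below (seconds of latency divided by a count of work units) is an assumption for illustration, and `latencyScore` and its parameters are hypothetical names rather than go-livepeer's.

```go
package main

import "fmt"

// latencyScore returns seconds of latency per unit of work.
// Hypothetical helper: the real formula in go-livepeer may differ.
// If units is zero or negative the function returns 0 instead of
// dividing, so the recorded value is never +Inf or NaN.
func latencyScore(tookSeconds float64, units int64) float64 {
	if units <= 0 {
		return 0
	}
	return tookSeconds / float64(units)
}

func main() {
	fmt.Println(latencyScore(2.0, 4)) // normal case
	fmt.Println(latencyScore(2.0, 0)) // guarded case, no division
}
```

Returning 0 is only one possible policy; skipping the observation entirely would keep zero-unit requests out of the distribution altogether.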
eliteprox added a commit to eliteprox/go-livepeer that referenced this pull request on Jul 26, 2024:
* Add gateway metric for roundtrip ai times by model and pipeline
* Rename metrics and add unique manifest
* Fix name mismatch
* modelsRequested not working correctly
* feat: add initial POC AI gateway metrics
  This commit adds the initial AI gateway metrics so that they can be reviewed by others. The code still needs to be cleaned up and the buckets adjusted.
* feat: improve AI metrics
  This commit improves the AI metrics so that they are easier to work with.
* feat(ai): log no capacity error to metrics
  This commit ensures that an error is logged when the Gateway could not find orchestrators for a given model and capability.
* feat(ai): add TicketValueSent and TicketsSent metrics
  This commit ensures that the `ticket_value_sent` and `tickets_sent` metrics are also created for an AI Gateway.
* fix(ai): ensure that AI metrics have orch address label
  This commit ensures that the AI gateway metrics contain the orch address label.
* feat(ai): add orchestrator AI census metrics
  This commit introduces a suite of AI orchestrator metrics to the census module, mirroring those received by the Gateway. The newly added metrics include `ai_models_requested`, `ai_request_latency_score`, `ai_request_price`, and `ai_request_errors`, facilitating comprehensive tracking and analysis of AI request handling performance on the orchestrator side.
* refactor: improve orchestrator metrics tags
  This commit ensures that the right tags are attached to the Orchestrator AI metrics.
* refactor(ai): improve latency score calculations
  This commit ensures that no divide-by-zero errors can occur in the latency score calculations.
---------
Co-authored-by: Elite Encoder <[email protected]>
This was referenced Aug 12, 2024
What does this pull request do? Explain your changes. (required)
This pull request introduces new Orchestrator AI metrics to the ai-video branch. To decrease code duplication, this pull request uses the same metrics as the Gateway metrics pull request (see #3087).

Specific updates (required)
* census.go to include the new Orchestrator metrics.
* ai_http.go file to log these metrics.

How did you test each of these updates (required)
I set up both an on-chain and off-chain gateway to validate the metrics. I verified their visibility at http://localhost:7935/metrics and ensured they were correctly visualized in Grafana.

Does this pull request close any open issues?
This implements the functionality outlined in https://livepeer-ai.productlane.com/roadmap?id=d56cae33-2dbd-4187-8d3a-d1c5c35f890a

Checklist:
* make runs successfully
* ./test.sh pass

How to test
* http://localhost:7935/metrics to view the new AI orchestrator metrics.
* http://localhost:3000 to inspect these metrics in Grafana.
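
When checking the metrics endpoint by hand, it can help to filter the response down to the AI metrics. Below is a minimal sketch, assuming the endpoint serves the Prometheus text format. The metric names come from the commit message above; the exported names may carry a namespace prefix or a histogram suffix, so the match is by substring. `filterAIMetrics` and the sample payload are hypothetical and not taken from a real node.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Metric names listed in the PR's commit message.
var aiMetricNames = []string{
	"ai_models_requested",
	"ai_request_latency_score",
	"ai_request_price",
	"ai_request_errors",
}

// filterAIMetrics returns the sample lines of a Prometheus text-format
// payload whose metric name contains one of aiMetricNames.
// Comment lines (starting with '#') and blank lines are skipped.
func filterAIMetrics(payload string) []string {
	var out []string
	sc := bufio.NewScanner(strings.NewReader(payload))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// The metric name ends at the first '{' (labels) or space (value).
		name := line
		if i := strings.IndexAny(line, "{ "); i >= 0 {
			name = line[:i]
		}
		for _, want := range aiMetricNames {
			if strings.Contains(name, want) {
				out = append(out, line)
				break
			}
		}
	}
	return out
}

func main() {
	// Hand-written sample payload with made-up labels and values.
	sample := "# HELP ai_request_price AI request price\n" +
		"ai_request_price{pipeline=\"text-to-image\"} 42\n" +
		"some_other_metric 1\n" +
		"ai_request_errors{pipeline=\"text-to-image\"} 3\n"
	for _, line := range filterAIMetrics(sample) {
		fmt.Println(line)
	}
}
```

To run this against a live node, the payload could be read from the body of `http.Get("http://localhost:7935/metrics")` instead of the sample string.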