add speech to text code improvements #3098

Closed


rickstaa
Member

  • server: Add unimplemented AI handler
  • multi: Add /text-to-image for O
  • core+server: Add /image-to-image to O
  • core+server: Add /image-to-video for O
  • multi: Transcode PNG -> mp4 for image-to-video
  • server: Impl B -> O image-to-video
  • server: Impl B -> O text-to-image
  • server: Set Content-Type header on B /image-to-video
  • server: Impl B -> O image-to-image
  • mod: Bump go-tools to v0.3.5
  • server: Upload to OS for all AI endpoints
  • mod: Bump ai-worker + go-tools
  • cmd: Add -aiModels to load models
  • server: Log oapi validation error
  • temp disable CI tests
  • ci+docker: Use go1.21.5
  • docker: Install zlib
  • temp disable CI arm64 builds
  • cmd: Use livepeer/ai-runner image ID
  • cmd: Create models dir if DNE
  • cmd: Add -aiModelsDir flag
  • server: Log HTTP errors for AI endpoints
  • server: Add AI process logs for O
  • mod+cmd: Bump ai-worker
  • server: Check JSON200 from O
  • server: Temp disable oapi validation
  • mod: Bump ai-worker
  • mod+server: Prefer: respond-async for image-to-video
  • server: Fix error check in ImageToVideo
  • server: Fix JSON for ImageToVideoResult.Error
  • mod: Bump ai-worker
  • server: Add error struct for AI endpoints
  • server: Add backoff for AI requests
  • mod: Bump ai-worker
  • server: Read errors from O for AI requests
  • server: Use multipart writer helpers for AI reqs
  • mod+server: Add seed in AI responses
  • multi: Support specifying external containers
  • multi: Pass bearer token to runners
  • mod: Bump livepeer/ai-worker
  • mod: Bump livepeer/ai-worker
  • multi: Model cap constraints + multi-O for text-to-image
  • multi: Multi-O for image-to-image
  • multi: Multi-O for image-to-video
  • mod: Bump livepeer/ai-worker
  • core: Fix resolution for image-to-video
  • server: Fix service unavailable error image-to-video
  • core: Use software transcoder for image-to-video
  • server: Return 503 if no Os have cap
  • server: Add AISessionManager
  • cmd: Use abs path for default model dir
  • server: Remove check for deprecated seg data profiles
  • server: Payments for text-to-image
  • server: Payments + session manager for all AI caps
  • multi: Support price per AI cap
  • core: Default transcode max price
  • server: Use frame count for out pixels
  • server: Re-use session logic for AI
  • docs(ai): add AI subnet orch setup guide
  • docs(ai): add ai subnet broadcaster instructions
  • docs(ai): enhance AI Subnet documentation with binary installation guide
  • docs(ai): improve ai subnet documentation
  • docs(ai): fix broken huggingface documentation link
  • docs(ai): fix broken 'dl_checkpoint' command
  • docs(ai): enhance clarity and accuracy of AI subnet docs
  • docs(ai): improve model configuration documentation
  • docs(ai): add instructions for on-chain configuration of AI Subnet Orchestrator
  • docs(ai): improve AI on-chain instructions
  • docs(ai): improve ai docs syntax
  • docs(ai): add promtail metrics sending docs
  • docs(ai): update volume mount config for promtail
  • docs(ai): name docker containers
  • docs(ai): improve models config descriptiona and add ticketEV param
  • docs(ai): improve cli description and remove redeemer method
  • docs(ai): add command outputs
  • ci(ai): add AI issue templates
  • ci(ai): add AI pull request labeler
  • ci: change issue template order
  • ci(ai): add PR labeler config file
  • ci(ai): fix incorrect labels
  • ci: rename labeler and remove trailing whitespace
  • feat(ai): add pipelines optimization flags (#3013)
  • docs(ai): add optimization flags to docs (#3014)
  • ci(ai): temporary change build action branch to ai-video
  • ci(ai): temporary change docker action branch to ai-video
  • ci(ai): fix pull request config warning (#3018)
  • fix: flush writer when encoding AI results (fix invalid PNG) (#3020)
  • ci(ai): add myself as branch CODE OWNER
  • ci(ai): run labeler also on 'pull_request_target'
  • ci(ai): cleanup labeler actions
  • ci(ai): auto assign AI issues and feature requests
  • feat(ai): enable AI orchestrator discovery (#3004)
  • refactor(ai): add extra devtool input arguments (#3026)
  • chore: improve devtool documentation and add scripts
  • refactor: log advertised capabilities and price on startup (#3031)
  • feat(ai): enforce 'aiModels' flag requirement (#3032)
  • fix(ai): improve AI selection algorithm (#3030)
  • refactor(ai): improve orch select retry ctx logic (#3039)
  • refactor(ai): improve orch retry timeout msg
  • fix(ai): prevent insufficient capacity payments (#3035)
  • ci(ai): add temporary ai-video latest binary url upload
  • chore(ai): remove temporary AI subnet docs
  • fix(ai): fix infinite loop when no Os are found (#3042)
  • feat(ai): Enhance orchestrator selection by incorporating latency (#3043)
  • chore(ai): update 'ai-worker' dependency
  • feat: add '-gateway' and deprecate '-broadcaster' (#3048)
  • feat: remove -pricePerUnit requirement for -aiWorker flag (#3047)
  • perf(ai): update ai-worker to enable DEEPCACHE optimization
  • chore: fix Mockgen dependency error
  • ci(ai): ensure docker builder is build and pushed
  • feat(ai): enable NSFW safety filter (#3054)
  • chore(ai): update ai-worker version
  • ci(ai): ensure livepeer builder builds on AI version tags
  • fix: apply runner nil error fix (#3058)
  • refactor(census): rename Broadcaster metrics to Gateway (#3055)
  • refactor: add -pricePerGateway and deprecate -pricePerBroadcaster (#3061)
  • ci: Protect Docker 'stable' tag
  • ci: fix syntax error in Docker action tags
  • fix(ai): fix runtime error in aiWorker when pricePerUnit is unset (#3059)
  • fix(ai): fix cli prices nil error (#3063)
  • feat: add -aiRunnerImage flag to pin docker image ver (#3064)
  • chore(ai): update ai-worker dependency
  • ci(docker): ensure stable tag is created on master branch
  • feat: ai video add safety check to image to video 2 (#3071)
  • chore(ai): update ai-worker version
  • chore(ai): update ai-worker to v0.0.5
  • chore(ai): update ai-worker to latest version
  • feat(ai): account for num_inference_steps in T2I latency inference score calculation (#3074)
  • chore(ai): update to latest ai-worker
  • feat(ai): add upscaling pipeline (#3077)
  • Add speech-to-text pipeline, refactor processAIRequest and handleAIRequest to allow for various response types
  • Pin gomod to ai-runner for testing
  • Revert "Pin gomod to ai-runner for testing"
  • Update go mod dep for ai-worker
  • Calculate pixel value of audio file
  • fix go-mod deps
  • Adjust price calculation
  • one second per pixel
  • cleanup, fix missing duration
  • Add supported file types, calculate price by milliseconds
  • Add bad request response for unsupported file types
  • Update name of function
  • Update go mod to ai-runner
  • Use ffmpeg to get duration
  • update install_ffmpeg.sh to parse audio better
  • Check for audio codec instead of video codec
  • gomod edits
  • add docker file
  • chore: update Image2Image and Upscale OS storage to use requestID similar to Text2Image and Image2Video (#3092)
  • Update install_ffmpeg.sh to improve audio support, Add duration validation and logging, pin lpms
  • rename speech-to-text to audio-to-text
  • Update go-mod
  • cleanup
  • update go mod
  • remove comment
  • fix(ai): account for number of images in I2I latency score (#3093)
  • update gomod
  • Update lpms mod
  • Update to latest lpms
  • Update lpms
  • feat(ai): apply code improvements to AudioToText pipeline

What does this pull request do? Explain your changes. (required)

Specific updates (required)

How did you test each of these updates (required)

Does this pull request close any open issues?

Checklist:

eliteprox and others added 29 commits June 16, 2024 15:15
…3093)

This commit ensures that the I2I pipeline latency score calculation now considers the number of images.

This commit applies several code improvements to the AudioToText codebase.
@rickstaa rickstaa closed this Jul 14, 2024

5 participants