add speech to text code improvements #3098

rickstaa · 2024-07-14T14:37:11Z

server: Add unimplemented AI handler
multi: Add /text-to-image for O
core+server: Add /image-to-image to O
core+server: Add /image-to-video for O
multi: Transcode PNG -> mp4 for image-to-video
server: Impl B -> O image-to-video
server: Impl B -> O text-to-image
server: Set Content-Type header on B /image-to-video
server: Impl B -> O image-to-image
mod: Bump go-tools to v0.3.5
server: Upload to OS for all AI endpoints
mod: Bump ai-worker + go-tools
cmd: Add -aiModels to load models
server: Log oapi validation error
temp disable CI tests
ci+docker: Use go1.21.5
docker: Install zlib
temp disable CI arm64 builds
cmd: Use livepeer/ai-runner image ID
cmd: Create models dir if DNE
cmd: Add -aiModelsDir flag
server: Log HTTP errors for AI endpoints
server: Add AI process logs for O
mod+cmd: Bump ai-worker
server: Check JSON200 from O
server: Temp disable oapi validation
mod: Bump ai-worker
mod+server: Prefer: respond-async for image-to-video
server: Fix error check in ImageToVideo
server: Fix JSON for ImageToVideoResult.Error
mod: Bump ai-worker
server: Add error struct for AI endpoints
server: Add backoff for AI requests
mod: Bump ai-worker
server: Read errors from O for AI requests
server: Use multipart writer helpers for AI reqs
mod+server: Add seed in AI responses
multi: Support specifying external containers
multi: Pass bearer token to runners
mod: Bump livepeer/ai-worker
mod: Bump livepeer/ai-worker
multi: Model cap constraints + multi-O for text-to-image
multi: Multi-O for image-to-image
multi: Multi-O for image-to-video
mod: Bump livepeer/ai-worker
core: Fix resolution for image-to-video
server: Fix service unavailable error image-to-video
core: Use software transcoder for image-to-video
server: Return 503 if no Os have cap
server: Add AISessionManager
cmd: Use abs path for default model dir
server: Remove check for deprecated seg data profiles
server: Payments for text-to-image
server: Payments + session manager for all AI caps
multi: Support price per AI cap
core: Default transcode max price
server: Use frame count for out pixels
server: Re-use session logic for AI
docs(ai): add AI subnet orch setup guide
docs(ai): add ai subnet broadcaster instructions
docs(ai): enhance AI Subnet documentation with binary installation guide
docs(ai): improve ai subnet documentation
docs(ai): fix broken huggingface documentation link
docs(ai): fix broken 'dl_checkpoint' command
docs(ai): enhance clarity and accuracy of AI subnet docs
docs(ai): improve model configuration documentation
docs(ai): add instructions for on-chain configuration of AI Subnet Orchestrator
docs(ai): improve AI on-chain instructions
docs(ai): improve ai docs syntax
docs(ai): add promtail metrics sending docs
docs(ai): update volume mount config for promtail
docs(ai): name docker containers
docs(ai): improve models config descriptions and add ticketEV param
docs(ai): improve cli description and remove redeemer method
docs(ai): add command outputs
ci(ai): add AI issue templates
ci(ai): add AI pull request labeler
ci: change issue template order
ci(ai): add PR labeler config file
ci(ai): fix incorrect labels
ci: rename labeler and remove trailing whitespace
feat(ai): add pipelines optimization flags (#3013)
docs(ai): add optimization flags to docs (#3014)
ci(ai): temporary change build action branch to ai-video
ci(ai): temporary change docker action branch to ai-video
ci(ai): fix pull request config warning (#3018)
fix: flush writer when encoding AI results (fix invalid PNG) (Fix truncated PNG' #3020)
ci(ai): add myself as branch CODE OWNER
ci(ai): run labeler also on 'pull_request_target'
ci(ai): cleanup labeler actions
ci(ai): auto assign AI issues and feature requests
feat(ai): enable AI orchestrator discovery (#3004)
refactor(ai): add extra devtool input arguments (#3026)
chore: improve devtool documentation and add scripts
refactor: log advertised capabilities and price on startup (Log O AIModels config on startup #3031)
feat(ai): enforce 'aiModels' flag requirement (#3032)
fix(ai): improve AI selection algorithm (#3030)
refactor(ai): improve orch select retry ctx logic (refactor(ai): improve select retry ctx logic #3039)
refactor(ai): improve orch retry timeout msg
fix(ai): prevent insufficient capacity payments (#3035)
ci(ai): add temporary ai-video latest binary url upload
chore(ai): remove temporary AI subnet docs
fix(ai): fix infinite loop when no Os are found (#3042)
feat(ai): Enhance orchestrator selection by incorporating latency (#3043)
chore(ai): update 'ai-worker' dependency
feat: add '-gateway' and deprecate '-broadcaster' (Add -gateway flag, deprecation warning for -broadcaster #3048)
feat: remove -pricePerUnit requirement for -aiWorker flag (Remove -pricePerUnit requirement for -aiWorker flag #3047)
perf(ai): update ai-worker to enable DEEPCACHE optimization
chore: fix Mockgen dependency error
ci(ai): ensure docker builder is build and pushed
feat(ai): enable NSFW safety filter (enable nsfw filter #3054)
chore(ai): update ai-worker version
ci(ai): ensure livepeer builder builds on AI version tags
fix: apply runner nil error fix (handle empty orch response #3058)
refactor(census): rename Broadcaster metrics to Gateway (rename census broadcaster metrics #3055)
refactor: add -pricePerGateway and deprecate -pricePerBroadcaster (ai video deprecate priceperbroadcaster #3061)
ci: Protect Docker 'stable' tag
ci: fix syntax error in Docker action tags
fix(ai): fix runtime error in aiWorker when pricePerUnit is unset (Fix runtime error in aiWorker when pricePerUnit is unset #3059)
fix(ai): fix cli prices nil error (#3063)
feat: add -aiRunnerImage flag to pin docker image ver (Add startup flag to specify ai runner docker image #3064)
chore(ai): update ai-worker dependency
ci(docker): ensure stable tag is created on master branch
feat: ai video add safety check to image to video 2 (Ai video add safety check to image to video 2 #3071)
chore(ai): update ai-worker version
chore(ai): update ai-worker to v0.0.5
chore(ai): update ai-worker to latest version
feat(ai): account for num_inference_steps in T2I latency inference score calculation (Latency inference score t2 i #3074)
chore(ai): update to latest ai-worker
feat(ai): add upscaling pipeline (#3077)
Add speech-to-text pipeline, refactor processAIRequest and handleAIRequest to allow for various response types
Pin gomod to ai-runner for testing
Revert "Pin gomod to ai-runner for testing"
Update go mod dep for ai-worker
Calculate pixel value of audio file
fix go-mod deps
Adjust price calculation
one second per pixel
cleanup, fix missing duration
Add supported file types, calculate price by milliseconds
Add bad request response for unsupported file types
Update name of function
Update go mod to ai-runner
Use ffmpeg to get duration
update install_ffmpeg.sh to parse audio better
Check for audio codec instead of video codec
gomod edits
add docker file
chore: update Image2Image and Upscale OS storage to use requestID similar to Text2Image and Image2Video (chore: update OS storage to be same on all pipelines #3092)
Update install_ffmpeg.sh to improve audio support, Add duration validation and logging, pin lpms
rename speech-to-text to audio-to-text
Update go-mod
cleanup
update go mod
remove comment
fix(ai): account for number of images in I2I latency score (fix I2I latency score #3093)
update gomod
Update lpms mod
Update to latest lpms
Update lpms
feat(ai): apply code improvements to AudioToText pipeline

What does this pull request do? Explain your changes. (required)

Specific updates (required)

How did you test each of these updates (required)

Does this pull request close any open issues?

Checklist:

Read the contribution guide
make runs successfully
All tests in ./test.sh pass
README and other documentation updated
Pending changelog updated

yondonfu added 30 commits March 25, 2024 13:40

server: Add unimplemented AI handler

3ede48f

multi: Add /text-to-image for O

6f29698

core+server: Add /image-to-image to O

e2735cd

core+server: Add /image-to-video for O

8883824

multi: Transcode PNG -> mp4 for image-to-video

92bfa74

server: Impl B -> O image-to-video

be72c37

server: Impl B -> O text-to-image

7cd7913

server: Set Content-Type header on B /image-to-video

abe1b5a

server: Impl B -> O image-to-image

7ec59b3

mod: Bump go-tools to v0.3.5

1822db5

server: Upload to OS for all AI endpoints

52785b2

mod: Bump ai-worker + go-tools

332ecbd

cmd: Add -aiModels to load models

14deb6a

server: Log oapi validation error

289cb49

temp disable CI tests

48f560c

ci+docker: Use go1.21.5

24c1623

docker: Install zlib

d60b801

temp disable CI arm64 builds

b49d503

cmd: Use livepeer/ai-runner image ID

27c8da1

cmd: Create models dir if DNE

0adf0b6

cmd: Add -aiModelsDir flag

d894ee8

server: Log HTTP errors for AI endpoints

d1af6e2

server: Add AI process logs for O

1759f35

mod+cmd: Bump ai-worker

e089c10

server: Check JSON200 from O

857ee5f

server: Temp disable oapi validation

3fcc300

mod: Bump ai-worker

d6d4261

mod+server: Prefer: respond-async for image-to-video

19da59d

server: Fix error check in ImageToVideo

7d765d6

server: Fix JSON for ImageToVideoResult.Error

9985d5f

eliteprox and others added 29 commits June 16, 2024 15:15

fix go-mod deps

4d76749

Adjust price calculation

1104708

one second per pixel

4fcca57

cleanup, fix missing duration

0280296

Add supported file types, calculate price by milliseconds

34f9d2e

Add bad request response for unsupported file types

1579920

Update name of function

494654d

Update go mod to ai-runner

63c20e3

Use ffmpeg to get duration

b78f11f

update install_ffmpeg.sh to parse audio better

3fea27b

Check for audio codec instead of video codec

c00b210

gomod edits

29309eb

add docker file

9920cca

chore: update Image2Image and Upscale OS storage to use requestID similar to Text2Image and Image2Video (livepeer#3092)

f2c9bb6

Update install_ffmpeg.sh to improve audio support, Add duration validation and logging, pin lpms

a185b5d

rename speech-to-text to audio-to-text

7f18820

Update go-mod

1a6c500

cleanup

6d534bc

Merge branch 'add-speech-to-text' into rename-audio-to-text

54f8833

Merge pull request #3 from eliteprox/rename-audio-to-text

b048341

Rename audio to text

update go mod

cad5b60

remove comment

fa62b5a

fix(ai): account for number of images in I2I latency score (livepeer#3093)

29d4603

This commit ensures that the I2I pipeline latency score calculation now considers the number of images.

Merge branch 'ai-video' into add-speech-to-text

ba4f9dc

update gomod

bf03e0d

Update lpms mod

ff202d7

Update to latest lpms

8bad0ff

Update lpms

2c0bfb9

feat(ai): apply code improvements to AudioToText pipeline

3e88726

This commit applies several code improvements to the AudioToText codebase.

rickstaa closed this Jul 14, 2024
