
Backward probabilities (\beta) not necessary #1

Open
rakeshvar opened this issue Sep 5, 2020 · 5 comments

Comments


rakeshvar commented Sep 5, 2020

Namaste @maetshju ,
Great work! I have been waiting for CTC in Julia for a long time. I wrote a CTC implementation in Python a very long time ago, before there were big-name packages from Baidu and others.
After studying your code, I realized you do not need to calculate the backward probabilities and take a mean.
You can see my implementation here
You just need to take the corner-most value of the forward probabilities, and that is all.
You could try that and see whether you get the same or similar results.
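For concreteness, here is a rough sketch of what I mean (Python/NumPy, not the actual code from either repository; the name ctc_forward_loss and the tiny example are mine): run only the forward recursion over the blank-extended label and read the loss off the last one or two entries of the final row.

```python
import numpy as np

def ctc_forward_loss(probs, labels, blank=0):
    """CTC negative log-likelihood from the forward probabilities alone.
    probs: (T, K) per-frame class probabilities; labels: target sequence."""
    T = probs.shape[0]
    ext = [blank]
    for l in labels:
        ext += [l, blank]          # blank-extended label: b, l1, b, l2, b, ...
    S = len(ext)
    alpha = np.zeros((T, S))
    alpha[0, 0] = probs[0, blank]
    if S > 1:
        alpha[0, 1] = probs[0, ext[1]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t-1, s] + (alpha[t-1, s-1] if s > 0 else 0.0)
            # skip transition allowed between distinct non-blank labels
            if s > 1 and ext[s] != blank and ext[s] != ext[s-2]:
                a += alpha[t-1, s-2]
            alpha[t, s] = a * probs[t, ext[s]]
    # the total likelihood sits in the last one or two cells of the final row
    p = alpha[T-1, S-1] + (alpha[T-1, S-2] if S > 1 else 0.0)
    return -np.log(p)
```

With two frames, two classes, uniform probabilities, and the label [1], the three paths "1-", "-1", and "11" each have probability 0.25, so the loss comes out as -log(0.75).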


maetshju commented Sep 5, 2020

I think that's an interesting idea. I had noticed that the loss values ended up being equivalent (or nearly so) at each time step, so only one time step's loss value would be needed. With the current implementation as it stands, though, I'm not sure I can skip the beta probabilities entirely: I need to calculate the gradients manually, because current releases of the autodiff package Zygote don't support array mutation.

I would certainly be interested in looking into skipping the beta calculations, though, since it would mean fewer instances of the index math that I have found tricky to get right. Perhaps after I get this merged into Flux: FluxML/Flux.jl#1287
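For what it's worth, the per-time-step equivalence can be checked numerically. Below is a hedged Python/NumPy sketch (my own names and conventions, not the package's Julia code): with beta defined over the remaining frames, sum_s alpha[t, s] * beta[t, s] gives the full sequence probability at every t, which is why each time step carries the same loss value.

```python
import numpy as np

def ctc_alpha_beta(probs, labels, blank=0):
    """Forward (alpha) and backward (beta) CTC lattices. Here beta[t, s]
    covers frames t+1..T-1, so alpha[t, s] * beta[t, s] summed over s
    equals the total sequence probability at every time step t."""
    T = probs.shape[0]
    ext = [blank]
    for l in labels:
        ext += [l, blank]
    S = len(ext)
    skip = [s > 1 and ext[s] != blank and ext[s] != ext[s-2] for s in range(S)]
    alpha = np.zeros((T, S))
    alpha[0, 0] = probs[0, blank]
    if S > 1:
        alpha[0, 1] = probs[0, ext[1]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t-1, s] + (alpha[t-1, s-1] if s > 0 else 0.0)
            if skip[s]:
                a += alpha[t-1, s-2]
            alpha[t, s] = a * probs[t, ext[s]]
    beta = np.zeros((T, S))
    beta[T-1, S-1] = 1.0
    if S > 1:
        beta[T-1, S-2] = 1.0
    for t in range(T-2, -1, -1):
        for s in range(S):
            b = beta[t+1, s] * probs[t+1, ext[s]]
            if s + 1 < S:
                b += beta[t+1, s+1] * probs[t+1, ext[s+1]]
            if s + 2 < S and skip[s+2]:
                b += beta[t+1, s+2] * probs[t+1, ext[s+2]]
            beta[t, s] = b
    return alpha, beta
```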

@rakeshvar

I was writing my own CTC and ran into ERROR: LoadError: Mutating arrays is not supported
😄

Maybe I will try avoiding array mutation by allocating \alpha as an array of arrays...
if that fails...
I will try to write the gradient myself... based on your gradient!
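As a sketch of the array-of-arrays idea (illustrative Python, not Zygote-facing Julia; ctc_loss_no_mutation is a made-up name): build each new alpha row as a fresh list instead of writing into a preallocated matrix, so nothing is ever mutated in place.

```python
import math

def ctc_loss_no_mutation(probs, labels, blank=0):
    """CTC forward pass where each alpha row is a fresh list built by a
    comprehension -- the mutation-free shape of code that an autodiff
    system without array-mutation support can, in principle, trace."""
    ext = [blank]
    for l in labels:
        ext += [l, blank]
    S, T = len(ext), len(probs)
    # first row: only the first blank and first label are reachable
    row = [probs[0][ext[s]] if s < 2 else 0.0 for s in range(S)]
    for t in range(1, T):
        def cell(s):
            a = row[s] + (row[s-1] if s > 0 else 0.0)
            if s > 1 and ext[s] != blank and ext[s] != ext[s-2]:
                a += row[s-2]
            return a * probs[t][ext[s]]
        row = [cell(s) for s in range(S)]  # new list each step, no writes
    return -math.log(row[-1] + (row[-2] if S > 1 else 0.0))
```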


maetshju commented Sep 8, 2020

Something that may work is to write a separate function, say, calcAlpha, that returns the indices of the probability values that are multiplied or added to reach the bottom-right corner of the alpha values. You would need either to set up a custom adjoint with @adjoint that returns nothing as the gradient for calcAlpha, or perhaps to use the @nograd macro. Then you would call calcAlpha from within your ctc function to get the indices, call sum or prod on your probability values indexed by the result of calcAlpha, and apply the usual -1 * log as necessary. That should allow Zygote to calculate the gradients for the loss value.
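The @adjoint/@nograd machinery is Zygote-specific, but the split being described can be sketched language-agnostically. The version below (Python/NumPy; calc_path and best_path_loss are made-up names) simplifies to the single most probable path, so the differentiable part really is just a gather, a log, and a sum over fixed integer indices; the full CTC sum would additionally need the add structure at branch points.

```python
import numpy as np

def calc_path(probs, labels, blank=0):
    """Non-differentiable helper (the calcAlpha analogue): a Viterbi-style
    DP over the blank-extended label, returning the (frame, class) index
    pairs of the single most probable path to the bottom-right corner."""
    T = probs.shape[0]
    ext = [blank]
    for l in labels:
        ext += [l, blank]
    S = len(ext)
    v = np.full((T, S), -np.inf)      # log-domain path scores
    back = np.zeros((T, S), dtype=int)
    v[0, 0] = np.log(probs[0, blank])
    if S > 1:
        v[0, 1] = np.log(probs[0, ext[1]])
    for t in range(1, T):
        for s in range(S):
            cands = [s]
            if s > 0:
                cands.append(s - 1)
            if s > 1 and ext[s] != blank and ext[s] != ext[s-2]:
                cands.append(s - 2)
            best = max(cands, key=lambda c: v[t-1, c])
            v[t, s] = v[t-1, best] + np.log(probs[t, ext[s]])
            back[t, s] = best
    s = S - 1 if (S == 1 or v[T-1, S-1] >= v[T-1, S-2]) else S - 2
    idxs = []
    for t in range(T - 1, -1, -1):
        idxs.append((t, ext[s]))
        s = back[t, s]
    return idxs[::-1]

def best_path_loss(probs, labels, blank=0):
    # Differentiable part: gather fixed indices, take logs, and sum.
    idxs = calc_path(probs, labels, blank)
    return -sum(np.log(probs[t, k]) for t, k in idxs)
```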

@rakeshvar

Hmmm...
I could not find a way around the 'array mutation' limitation.
I was reading Graves' book, and when writing your own gradient code the betas come in handy, so it makes sense to calculate them. However, you do not need another loop: you can reuse the forward-pass code with the inputs flipped. That won't save much code either, though.
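A quick sketch of the flipped-inputs point (Python/NumPy; the function names are mine): in the convention where beta includes frame t's emission, the backward lattice is exactly the forward code run on time-reversed probabilities and the reversed label, flipped back along both axes. Then sum_s alpha[t, s] * beta[t, s] / y_t(l'_s) recovers the sequence probability at every t.

```python
import numpy as np

def ctc_alpha(probs, labels, blank=0):
    """Plain CTC forward lattice over the blank-extended label."""
    T = probs.shape[0]
    ext = [blank]
    for l in labels:
        ext += [l, blank]
    S = len(ext)
    alpha = np.zeros((T, S))
    alpha[0, 0] = probs[0, blank]
    if S > 1:
        alpha[0, 1] = probs[0, ext[1]]
    for t in range(1, T):
        for s in range(S):
            a = alpha[t-1, s] + (alpha[t-1, s-1] if s > 0 else 0.0)
            if s > 1 and ext[s] != blank and ext[s] != ext[s-2]:
                a += alpha[t-1, s-2]
            alpha[t, s] = a * probs[t, ext[s]]
    return alpha

def ctc_beta_via_flip(probs, labels, blank=0):
    """Beta lattice (convention including frame t's emission), obtained by
    running the *same* forward code on time-reversed probabilities and the
    reversed label, then flipping the result back along both axes."""
    rev = ctc_alpha(probs[::-1], list(labels)[::-1], blank)
    return rev[::-1, ::-1]
```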

I will just do what you are doing. I used to think manual gradients were for 20th-century losers, but I see they can come in really handy. 😄

Let me know if you want me to look into anything specific in your code.

I had a cute toy example of CTC. You can check it out in the pictures in my Python repository.

@rakeshvar

I implemented a few versions of CTC in Julia.
https://github.com/rakeshvar/Explore-CTC-Loss.jl

Just did it for fun...
