Use mul_mat_transpose at axpy op for large batch #51
| Job | Run time |
|---|---|
| | 1m 46s |
| | 21s |
| | 1m 30s |
| | 1m 18s |
| | 4m 4s |
| | 3m 29s |
| | 3m 10s |
| | 7m 53s |
| | 5m 1s |
| | 3m 17s |
| | 1m 7s |
| | 3m 23s |
| | 1m 12s |
| | 27s |
| | 40s |
| | 1m 27s |
| | 1m 6s |
| | 1m 34s |
| | 45s |
| | 1m 49s |
| | 1m 28s |
| | 1m 31s |
| | 13m 23s |
| | 13m 21s |
| | 52s |
| | 0s |
| Total | 1h 15m 54s |