Fix weird outputs from Bloom-family models when use_past_kv_cache is set to True. #1733
| Job | Run time |
|---|---|
| | 4m 40s |
| | 1m 23s |
| | 1m 11s |
| | 4m 56s |
| | 1m 4s |
| | 1m 12s |
| | 5m 32s |
| | 1m 35s |
| | 1m 57s |
| | 58s |
| | 6m 40s |
| | 0s |
| | 0s |
| | 31m 8s |