V4.5.0: performance improvements, bug fixes, add hpa_hgemm

@amcamd released this 12 Sep 14:55

Features

  • add support for vega20
  • add hpa_hgemm assembly and source
  • tuning for sgemm and hgemm
  • bug fixes for sgemm and hgemm small sizes
  • use SGPR for alpha and beta
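
To illustrate why an hpa_hgemm variant can matter, here is a minimal sketch. It assumes "hpa" means high-precision accumulate: half-precision inputs and outputs, with partial sums kept in single precision. It is not the library's assembly kernel. The names `to_half`, `to_single`, and `gemm` are hypothetical helpers written for this example, and rounding is emulated with Python's `struct` module rather than performed on a GPU.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE binary16 (half) value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_single(x):
    """Round a Python float to the nearest IEEE binary32 (single) value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def gemm(alpha, a, b, beta, c, accumulate):
    """Compute C = alpha * A * B + beta * C on row-major lists of lists.

    `accumulate` rounds every partial sum, so passing to_half emulates
    half-precision accumulation and passing to_single emulates the assumed
    high-precision-accumulate behaviour. Products are formed in Python's
    double precision first, which is a simplification of real hardware.
    """
    m, k, n = len(a), len(b), len(b[0])
    out = []
    for i in range(m):
        row = []
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc = accumulate(acc + a[i][p] * b[p][j])
            # The result is stored in half precision in both variants.
            row.append(to_half(alpha * acc + beta * c[i][j]))
        out.append(row)
    return out


# A 1 x 4096 row of ones times a 4096 x 1 column of ones; the exact answer is 4096.
K = 4096
A = [[1.0] * K]
B = [[1.0] for _ in range(K)]
C = [[0.0]]

half_accumulate = gemm(1.0, A, B, 0.0, C, to_half)
single_accumulate = gemm(1.0, A, B, 0.0, C, to_single)
print(half_accumulate, single_accumulate)
```

With half-precision accumulation the running sum stalls at 2048, because 2049 is not representable in binary16 and rounds back to 2048. With single-precision accumulation the sum reaches 4096, which is then stored exactly as a half value.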