V4.6.0: Performance improvements, bug fixes, add source hpa_hgemm

@amcamd released this 11 Oct 23:34

Features

  • Merge gfx906 code into gfx900/gfx803 code
  • Tune hgemm and sgemm for Resnet50 on gfx906
  • Add source hpa_hgemm
  • Use precise bounds check when possible
  • Tested on ROCm 1.9