Captain_Yuri said:

haxxiy said:

Well Nvidia rates their H100 SXM at 3,958 TOPS with Sparse + INT8

https://resources.nvidia.com/en-us-tensor-core/nvidia-tensor-core-gpu-datasheet

But yea I do agree that it will take some time for CDNA to be ready to compete against Nvidia. At least Epyc is slaughtering Intel in the meantime though.

Oops, that's correct - forgot Hopper's tensor cores support 2:4 structured sparsity, which doubles the rated INT8 throughput. CDNA can't do that (yet - it might be available later).
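To make the doubling concrete, here's a small sketch (my own illustration, not from the datasheet) of what 2:4 structured sparsity means: in every group of four weights, only two may be nonzero, so the hardware can skip the zeros and run matrix math at twice the dense rate.

```python
import numpy as np

# Illustrative 2:4 structured sparsity: in each group of 4 weights,
# keep the 2 largest-magnitude values and zero out the other 2.
rng = np.random.default_rng(0)
w = rng.standard_normal(8)

pruned = w.copy().reshape(-1, 4)
for row in pruned:                       # rows are views, edits stick
    drop = np.argsort(np.abs(row))[:2]   # indices of the 2 smallest
    row[drop] = 0.0
pruned = pruned.reshape(-1)

# Every group of 4 now has exactly 2 zeros.
print((pruned.reshape(-1, 4) == 0).sum(axis=1))  # [2 2]
```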

That being said, power consumption at those peak sparse rates is very high, to the point it's not a bad idea to disregard them when normalizing against other GPGPUs.
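A back-of-the-envelope version of that normalization (the TDP and the rated TOPS are my assumptions from Nvidia's public H100 SXM datasheet, not figures from this thread):

```python
# Strip the 2:4 sparsity doubling from the headline number, then
# normalize by board power to get a rough efficiency figure.
SPARSE_INT8_TOPS = 3958   # rated with structured sparsity (datasheet)
H100_SXM_TDP_W = 700      # assumed SXM board power

dense_int8_tops = SPARSE_INT8_TOPS / 2            # sparsity doubles the rate
tops_per_watt = dense_int8_tops / H100_SXM_TDP_W

print(f"Dense INT8: {dense_int8_tops:.0f} TOPS")
print(f"Efficiency: {tops_per_watt:.2f} TOPS/W")
```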

As for AMD, assuming MI300's matrix cores are competitive, it'll likely still take at least a couple of years to build up an ecosystem around ROCm. Until then, Nvidia has a monopoly.

Last edited by haxxiy - on 14 June 2023