Captain_Yuri said:

AMD Unveils Instinct MI300 APUs In MI300A & MI300X Flavors: CDNA 3 GPU, Up To 24 Zen 4 Cores, 192 GB HBM3, 153 Billion Transistors

https://wccftech.com/amd-unveils-instinct-mi300-apus-mi300a-mi300x-flavors-cdna-3-gpu-up-to-24-zen-4-cores-192-gb-hbm3-153-billion-transistors/

If you were watching the press conference, AMD's approach to comparing their products against Intel and Nvidia was quite fascinating. Against Intel, the Epyc pitch was all about performance and numbers: AMD's Epyc CPUs are X times faster, X times more efficient, etc. Against Nvidia, on the other hand, their spin was essentially "thanks to our chiplet design, we can add more cores and VRAM, which gives you better total cost of ownership — Nvidia only gives you 80 GB on their Hopper GPUs while we are giving you 192 GB." There were no performance comparisons at all, not even against Nvidia's previous-gen GPUs. Really goes to show the state of things.

I think that's likely because the thing is still far from ready. They had no problem comparing the MI210/250s against the A100 — only using the instructions that looked favorable to them, like high-precision floats, but still.

AMD claims it has "8 times the AI performance of the MI250". That's of course a vague statement, but it obviously refers to low-precision instructions with sparsity that CDNA2 can't do natively — hence the huge, if somewhat misleading, gains.

RTX 4090: 512 TCs x 2520 MHz x 1024 OPs = 1321 sparse INT8 TOPS

H100 SXM: 528 TCs x 1830 MHz x 2048 OPs = 1979 sparse INT8 TOPS

MI300: 'MI250X x 8' = either 1532 or 3064 sparse INT8 TOPS, depending on how exactly they are counting.
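The arithmetic above is just cores x clock x ops-per-clock. A quick sketch to check the numbers (the per-clock OP counts are the ones quoted above, not official spec-sheet figures):

```python
# Theoretical throughput: tensor cores x clock (MHz) x OPs per core per clock.
# MHz * OPs gives millions of ops/s, so dividing by 1e6 yields TOPS.
def sparse_int8_tops(tensor_cores: int, clock_mhz: int, ops_per_clock: int) -> float:
    return tensor_cores * clock_mhz * ops_per_clock / 1e6

print(round(sparse_int8_tops(512, 2520, 1024)))  # RTX 4090 -> 1321
print(round(sparse_int8_tops(528, 1830, 2048)))  # H100 SXM -> 1979
```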

Mind, real-life models will be using instructions that deliver a quarter or so of these theoretical numbers. Anyway, I think both figures make sense given the huge transistor count and the uncertain but likely very high TDP.