Trumpstyle said:
haxxiy said:

That's a lot of random numbers you've got in your post. First, I believe both Xbox and PS5 will have 44 CUs, and if AMD made a 48 CU discrete GPU I'd expect it to have a die size of 270-280 mm².

We know from the Flute leak that Sony very likely has 8 MB of GameCache instead of the 32 MB found in the 3700X; this reduces the Zen 2 part from 70 mm² to 40 mm².

Your numbers for RT + tensor cores are completely off: on the TU106 (GeForce RTX 2070) they add 35 mm² to a 445 mm² GPU die. I'm assuming the RT hardware will add between 10-20 mm² to our discrete GPU, so 280-300 mm² now.

https://www.reddit.com/r/nvidia/comments/baaqb0/rtx_adds_195mm2_per_tpc_tensors_125_rt_07/
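For anyone who wants to check the 35 mm² figure, here is a rough sketch of the arithmetic. The per-TPC numbers come from the title of the Reddit thread linked above; the TPC count of 18 is my own assumption about the RTX 2070's chip (36 SMs, 2 per TPC) and is not stated in the post.

```python
# Per-TPC area figures, read off the title of the linked Reddit thread.
TENSOR_MM2_PER_TPC = 1.25
RT_MM2_PER_TPC = 0.7

# Assumption (not from the post): the RTX 2070 die has 18 TPCs.
TPC_COUNT = 18
# Die size quoted in the post.
DIE_MM2 = 445

per_tpc_mm2 = TENSOR_MM2_PER_TPC + RT_MM2_PER_TPC
rtx_overhead_mm2 = per_tpc_mm2 * TPC_COUNT
share_of_die = rtx_overhead_mm2 / DIE_MM2

print(f"{per_tpc_mm2:.2f} mm2 per TPC x {TPC_COUNT} TPCs = {rtx_overhead_mm2:.1f} mm2")
print(f"share of a {DIE_MM2} mm2 die: {share_of_die:.1%}")
```

Under those assumptions the overhead lands at roughly 35 mm², which is where the number in the post comes from.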

We can look at the PS4, PS4 Pro and Xbox One X SoC die sizes to make an estimate.

PS4: 212 mm² discrete GPU + 50 mm² CPU, in a 348 mm² SoC (86 mm² wasted die space)

PS4 Pro: 220-230 mm² GPU + 25 mm² CPU, in a 320 mm² SoC (65-75 mm² wasted die space)

Xbox One X: 270-290 mm² GPU + 25 mm² CPU, in a 360 mm² SoC (45-65 mm² wasted die space)
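The "wasted die space" figures above are just the SoC size minus the GPU and CPU estimates. A minimal sketch of that subtraction, using only the numbers quoted in this post (the function and variable names are made up for illustration):

```python
# (name, (gpu_low, gpu_high), cpu, soc) in mm2, as quoted in the post.
consoles = [
    ("PS4", (212, 212), 50, 348),
    ("PS4 Pro", (220, 230), 25, 320),
    ("Xbox One X", (270, 290), 25, 360),
]

def wasted_range(gpu_range, cpu, soc):
    """Return (low, high) SoC area not explained by the GPU + CPU estimates."""
    gpu_low, gpu_high = gpu_range
    # A larger GPU estimate leaves less unexplained area, so the bounds flip.
    return soc - (gpu_high + cpu), soc - (gpu_low + cpu)

wasted = {name: wasted_range(gpu, cpu, soc) for name, gpu, cpu, soc in consoles}

for name, (low, high) in wasted.items():
    label = f"{low}" if low == high else f"{low}-{high}"
    print(f"{name}: {label} mm2 wasted")
```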

We can't do a perfect estimate for the Xbox One X GPU: the Radeon RX 480 has a die size of 232 mm² with 36 CUs, while the Xbox One X has 44 CUs (4 disabled) and a 384-bit bus, and is fabbed on TSMC 16nm+ instead of GloFo/Samsung 14nm. TSMC 16nm+ is less dense and will add 8-9% die area.

Let's do the next-gen SoCs: 40 mm² CPU + 280-290 mm² GPU (hardware RT included) + 50 mm² wasted die space = 370-380 mm². I'm assuming only 50 mm² wasted because of another shrink. I think both Sony and Microsoft will use TSMC 7nm+, which shrinks the SoC by 20%, but I don't expect perfect scaling.

370-380 mm² x 0.85 = 314.5-323 mm²
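The same estimate as a quick sketch, so the inputs are easy to swap. All figures are the ones from this post; the 0.85 factor is the assumed effective scaling (a 15% shrink instead of the claimed 20%), and the 0.80 line is only there for comparison.

```python
CPU_MM2 = 40
GPU_RANGE_MM2 = (280, 290)   # hardware RT included
WASTED_MM2 = 50              # assumed, lower than past consoles
CLAIMED_SCALE = 0.80         # the 20% shrink claimed for 7nm+
EFFECTIVE_SCALE = 0.85       # assumption: imperfect scaling

def soc_range(scale):
    """Scale the summed SoC estimate for the low and high GPU sizes."""
    return tuple((CPU_MM2 + gpu + WASTED_MM2) * scale for gpu in GPU_RANGE_MM2)

before_shrink = soc_range(1.0)
estimate = soc_range(EFFECTIVE_SCALE)
ideal = soc_range(CLAIMED_SCALE)

print(f"before shrink: {before_shrink[0]:.0f}-{before_shrink[1]:.0f} mm2")
print(f"at x{EFFECTIVE_SCALE}: {estimate[0]:.1f}-{estimate[1]:.1f} mm2")
print(f"at x{CLAIMED_SCALE}: {ideal[0]:.1f}-{ideal[1]:.1f} mm2")
```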

Edit: Forgot you mentioned clock speed. My post is long enough already, so I'll just say I don't agree with you. I expect the GPU clock speed to be between 1790 and 1899 MHz.

You can't estimate the size of each component from die pictures, especially low-resolution black-and-white ones, or from Reddit speculation, since the exposed die shows only the final, last layer of silicon (back-end), while both the tensor cores and the ray tracing cores are deeply embedded within the SM design. I certainly haven't seen anyone using industrial-grade solvents and analysing the chip under an electron microscope to discern these features.

We can tell, on the other hand, that GeForce 20 SMs are significantly larger than those of the GeForce 10 and 16 series. They are larger even than those within the Volta GPGPUs. How much larger? Well, Nvidia's developer diaries and their own die renders portray the RT cores as larger than the tensor cores, which by themselves are around the same size as the floating point units within the SM. That alone means something like 20% of the SM and 10% of the die. And you need to take into account that more elements within the SM also mean larger register files, dispatch units, warp schedulers, L0 instruction cache, etc.
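A rough sketch of what the "20% of the SM and 10% of the die" estimate implies. The two percentages are the ones from the paragraph above; applying them to the 445 mm² die from the quoted post is purely illustrative and assumes the same shares hold for that chip.

```python
RT_TENSOR_SHARE_OF_SM = 0.20   # estimate from the paragraph above
RT_TENSOR_SHARE_OF_DIE = 0.10  # estimate from the paragraph above

# If both shares hold, the SMs together must cover this fraction of the die.
implied_sm_share_of_die = RT_TENSOR_SHARE_OF_DIE / RT_TENSOR_SHARE_OF_SM

# Illustration only: the 445 mm2 die size mentioned in the quoted post.
EXAMPLE_DIE_MM2 = 445
rt_tensor_mm2 = RT_TENSOR_SHARE_OF_DIE * EXAMPLE_DIE_MM2

print(f"implied SM share of die: {implied_sm_share_of_die:.0%}")
print(f"RT + tensor area on a {EXAMPLE_DIE_MM2} mm2 die: {rt_tensor_mm2:.1f} mm2")
```

That does not include the knock-on growth in register files, schedulers and caches mentioned above.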

What I'm doing is merely extrapolating these features from the much larger CUDA cores into whatever should be their equivalents within the RDNA architecture assuming a die size shrink and a more sophisticated solution than the ones in lower-end Turing GPUs.

Besides, no one would reasonably expect another 20% die shrink with 7 nm+ just because TSMC claims so, since not all components scale and these chips badly need the improved electron flow that comes from lower densities. The same argument applies to whatever clocks they're able to extract from binning, where you have to balance costs against poor samples that scale poorly. And consoles notoriously don't bin their dies anywhere near as carefully as PC solutions. None has ever come close to matching the clock speeds of its PC equivalents. They aren't going to miraculously match and exceed PC GPU clock speeds just because.

Edit - the L3 die size in Zen 2 also seems very questionable, since not even these L3 cache sizes in the original 14 nm Zen would balloon to 30 mm². And they occupy far less space than some 40% of the die in the Core i9 CPUs, even though Intel manufactures on a less dense process than AMD.

Last edited by haxxiy - on 30 October 2019