What do you think will be the hardware specifications of PS5 if it arrives around 2019-2020?

CrazyGPU said:
Trumpstyle said:

Yes, about 2x more GPU performance than the Xbox One X GPU, but compared with the PS4 it should be upwards of 8x. Both next-gen consoles will be above 10 teraflops, and DF have run tests showing the Navi GPU architecture delivering 50% better performance per TF than GCN 1.0 (the PS4 GPU). Just enough for Sony to do native 4K/60fps for all their PS4 exclusive games.

The CPU performance gain will probably be lower; keep in mind GDDR6 has higher latency than DDR4, and they are very likely cutting down the GameCache and running at around a 3.2 GHz clock speed. So around 4-5x.

As Pemalite would love to say, there are no better teraflops, just more efficient hardware centered on gaming. A teraflop is always the same: a million million floating-point operations per second. But the feeding of the execution units improved, with better architecture, cache and bandwidth. Anyway, everybody understands your point there, but I would say that the real performance is improved 50% given the same number of teraflops compared to Vega 64. Better performance than you predicted a couple of years ago, so you might be happy.

I don't think anyone could have foreseen that the next-gen consoles would be unreal monsters; my first post actually predicted Navi at 9 TF but with Vega 56 performance, so it was quite off. But 10 TF for both is close to a GeForce 2070 Super; the most optimistic predictions were 12-13 TF with Vega 64 performance, so even they were off :)

Pemalite said:

Graphics Core Next is an extremely compute centric design, when AMD designed the Architecture they had forward projections that games would become more compute heavy rather than pixel/texture fill-rate heavy or more geometry demanding.

Obviously history played out differently... So AMD spent the next 8+ years "modifying" Graphics Core Next in order to bolster geometry throughput, improving the ROPs and TMUs, increasing effective bandwidth via delta colour compression and more... But GCN just had a ton of architectural bottlenecks that held back its gaming potential regardless of AMD's attempts.
It didn't help that nVidia was making massive strides in efficiency starting with Kepler, making a big leap with Maxwell and refining that with Pascal, which undercut AMD's efforts even more in comparisons.

In short... The 5700XT's peak single-precision floating-point capability is down 23% from Vega 64, but on the aspects of the GPU that matter more for gaming... pixel fillrate had an increase of 23%, and ROP/Texture Mapping Unit/Geometry performance increased by 28.7% (and that is before we factor in architectural efficiency improvements, so likely higher!)... So it's no wonder it's able to beat Vega 64 in gaming.
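For reference, the first two percentages can be reproduced with a rough sketch like the one below. The shader counts, ROP counts and boost clocks are the publicly listed reference specs for both cards and are assumptions here, since the post does not quote them; the formulas are the usual peak-rate approximations (2 FP32 operations per shader per clock, 1 pixel per ROP per clock), not measured performance.

```python
# Reference specs, assumed here (not quoted in the thread):
# shader count, ROP count and boost clock in MHz.
GPUS = {
    "Vega 64": {"shaders": 4096, "rops": 64, "boost_mhz": 1546},
    "RX 5700 XT": {"shaders": 2560, "rops": 64, "boost_mhz": 1905},
}


def peak_tflops(shaders, boost_mhz):
    """Peak FP32 TFLOPS: 2 operations (one fused multiply-add) per shader per clock."""
    return 2 * shaders * boost_mhz * 1e6 / 1e12


def pixel_fillrate_gpix(rops, boost_mhz):
    """Peak pixel fillrate in gigapixels per second: 1 pixel per ROP per clock."""
    return rops * boost_mhz * 1e6 / 1e9


def percent_change(new, old):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


vega = GPUS["Vega 64"]
navi = GPUS["RX 5700 XT"]

tflops_change = percent_change(
    peak_tflops(navi["shaders"], navi["boost_mhz"]),
    peak_tflops(vega["shaders"], vega["boost_mhz"]),
)
fillrate_change = percent_change(
    pixel_fillrate_gpix(navi["rops"], navi["boost_mhz"]),
    pixel_fillrate_gpix(vega["rops"], vega["boost_mhz"]),
)

print(f"Peak FP32 change: {tflops_change:+.1f}%")
print(f"Pixel fillrate change: {fillrate_change:+.1f}%")
```

With those assumed specs, the peak FP32 figure drops by about 23% while pixel fillrate rises by about 23%, which matches the numbers in the post.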

In scenarios that are compute-heavy, like GPGPU though? Vega will still beat Navi hands down, but that is just because Vega isn't being bottlenecked by components like the geometry engines or ROPs in those instances and thus it can make actual use of its theoretical TFLOP potential.


I don't expect someone like Trumpstyle to ditch the "TFLOP" argument anytime soon, but people are slowly starting to wake up to how meaningless TFLOPS are for gaming which is really good to see, if only they knew that a console generation ago!

Hmm, let's wait for Navi 12; it should have 80 CUs with HBM2e and easily beat your Vega 64/7 in compute!

No problems



"Donald Trump is the greatest president that god has ever created" - Trumpstyle

6x master league achiever in starcraft2

Beaten Sigrun on God of war mode

Beaten DOOM ultra-nightmare with NO endless ammo-rune, 2x super shotgun and no decoys on ps4 pro.

1-0 against Grubby in Wc3 frozen throne ladder!!


It's very unlikely there'll be more than 40 - 44 CUs on next generation consoles even if the die size is around 400 mm2 and they take I/O out of it.

The Zen CPU will take 60 - 80 mm2, since we already know its size. The ray-tracing cores occupy around 180 mm2 in the high-end Turing GPUs. Shrink that to 7 nm and you're talking about 100 - 110 mm2 at the very least. That leaves 210 - 240 mm2 for the GPU - smaller than the 250 mm2 Radeon 5700 chips and their 40 CUs.
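The area budget in the paragraph above is plain subtraction over ranges, and can be sketched as follows. All figures are the post's own estimates in mm2; the function and variable names are made up for illustration.

```python
def gpu_area_budget(total, cpu_range, rt_range):
    """Return the (low, high) die area left for the GPU, in mm2.

    The low end assumes the largest CPU and ray-tracing blocks,
    the high end assumes the smallest ones.
    """
    cpu_low, cpu_high = cpu_range
    rt_low, rt_high = rt_range
    return total - cpu_high - rt_high, total - cpu_low - rt_low


# Estimates quoted in the post: 400 mm2 die, 60-80 mm2 Zen CPU,
# 100-110 mm2 for ray-tracing hardware after a shrink to 7 nm.
gpu_low, gpu_high = gpu_area_budget(400, (60, 80), (100, 110))

RADEON_5700_DIE_MM2 = 250  # figure used in the post for the 40 CU chip
fits_40_cus = gpu_high >= RADEON_5700_DIE_MM2

print(f"GPU budget: {gpu_low}-{gpu_high} mm2")
print(f"Room for a full 40 CU Radeon 5700 die: {fits_40_cus}")
```

Under these assumptions even the optimistic end of the range (240 mm2) falls short of the 250 mm2 the post attributes to the Radeon 5700.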

Of course, you might opt to severely gimp the ray-tracing cores or take L3 cache out of the Zen CPU, but all of that will come with severe performance costs (such as effectively disabling any meaningful ray-tracing or reducing CPU IPC by 20-30%). Is the trade-off worth it? They've probably studied it far better than we can surmise. But even then, the die space saved wouldn't be enough for the 11 - 14 TFLOPS some people thoughtlessly dream of.

These Radeon cards spit blood above 1600 - 1700 MHz, and we've discussed elsewhere how power consumption influences everything in design, including up to consumer-class product rules and licencing, which are a very significant dealbreaker. Not to mention the costs...

(edit - corrected number of CUs)

Last edited by haxxiy - on 29 October 2019

CPU: Ryzen, 8 cores / 16 threads @ 3.2 GHz core clock (more than 4x faster than the vanilla PS4 CPU)
24 GB GDDR6 (2 GB for the OS)
1 TB SSD
Blu-ray ROM
GPU between 10 and 14 teraflops (which will also use the GDDR6), 4K 120 fps for FIFA xD



haxxiy said:

It's very unlikely there'll be more than 40 - 44 CUs on next generation consoles even if the die size is around 400 mm2 and they take I/O out of it.

The Zen CPU will take 60 - 80 mm2, since we already know its size. The ray-tracing cores occupy around 180 mm2 in the high-end Turing GPUs. Shrink that to 7 nm and you're talking about 100 - 110 mm2 at the very least. That leaves 210 - 240 mm2 for the GPU - smaller than the 250 mm2 Radeon 5700 chips and their 40 CUs.

Of course, you might opt to severely gimp the ray-tracing cores or take L3 cache out of the Zen CPU, but all of that will come with severe performance costs (such as effectively disabling any meaningful ray-tracing or reducing CPU IPC by 20-30%). Is the trade-off worth it? They've probably studied it far better than we can surmise. But even then, the die space saved wouldn't be enough for the 11 - 14 TFLOPS some people thoughtlessly dream of.

These Radeon cards spit blood above 1600 - 1700 MHz, and we've discussed elsewhere how power consumption influences everything in design, including up to consumer-class product rules and licencing, which are a very significant dealbreaker. Not to mention the costs...

(edit - corrected number of CUs)

That's a lot of random numbers you've got in your post. First, I believe both Xbox and PS5 will have 44 CUs, and if AMD made a 48 CU discrete GPU I'd expect it to have a die size of 270-280mm2.

We know from the Flute leak that Sony very likely has 8 MB of GameCache instead of the 32 MB found in the 3700X; this reduces the Zen 2 CPU from 70mm2 to 40mm2.

Your numbers for the RT + tensor cores are completely off; on the TU106 (GeForce RTX 2070) they add 35mm2 to a 445mm2 GPU die. I'm assuming the RT hardware will add between 10-20mm2 to our discrete GPU, so 280-300mm2 now.

https://www.reddit.com/r/nvidia/comments/baaqb0/rtx_adds_195mm2_per_tpc_tensors_125_rt_07/

We can look at the PS4, PS4 Pro and Xbox One X SoC die sizes to make an estimate.

PS4: 212mm2 discrete GPU + 50mm2 CPU in a 348mm2 SoC (86mm2 wasted die space)

PS4 Pro: 220-230mm2 GPU + 25mm2 CPU in a 320mm2 SoC (65-75mm2 wasted die space)

Xbox One X: 270-290mm2 GPU + 25mm2 CPU in a 360mm2 SoC (45-65mm2 wasted die space)
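The "wasted die space" figures above are what is left of each SoC after subtracting the estimated GPU and CPU areas. A minimal sketch of that subtraction, using only the numbers from the post (the names are illustrative):

```python
# Die areas in mm2 as quoted in the post; GPU figures are (low, high) estimates.
SOCS = {
    "PS4": {"soc": 348, "gpu": (212, 212), "cpu": 50},
    "PS4 Pro": {"soc": 320, "gpu": (220, 230), "cpu": 25},
    "Xbox One X": {"soc": 360, "gpu": (270, 290), "cpu": 25},
}


def leftover_area(soc, gpu, cpu):
    """Return the (low, high) SoC area not accounted for by the GPU and CPU."""
    gpu_low, gpu_high = gpu
    return soc - gpu_high - cpu, soc - gpu_low - cpu


leftover = {name: leftover_area(**areas) for name, areas in SOCS.items()}

for name, (low, high) in leftover.items():
    print(f"{name}: {low}-{high} mm2 outside the GPU and CPU")
```

This reproduces the 86, 65-75 and 45-65 mm2 figures in the post.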

For the Xbox One X GPU we can't do a perfect estimate: the Radeon 480 has a die size of 232mm2 and 36 CUs, while the Xbox One X has 44 CUs (4 of them disabled) and a 384-bit bus, and is fabbed on TSMC 16nm+ instead of 14nm GloFo/Samsung. TSMC 16nm+ is less dense and will add 8-9% die area.

Let's do the next-gen SoCs: 40mm2 CPU + 280-290mm2 GPU (hardware RT included) + 50mm2 wasted die space = 370-380mm2. I'm assuming only 50mm2 of wasted die space because of another shrink. I think both Sony and Microsoft will use TSMC 7nm+; this shrinks the SoC by 20%, but I don't expect perfect scaling, so I use 15%.

370-380mm2 x 0.85 = 314.5-323mm2
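The same estimate as a small sketch, so the inputs are easy to vary. The 40 mm2 CPU, 280-290 mm2 GPU, 50 mm2 of other die space and the 15% shrink (the 0.85 factor) are the post's assumptions; the function name is made up for illustration.

```python
def next_gen_soc_estimate(cpu, gpu_range, other, shrink_percent):
    """Return ((low, high) before shrink, (low, high) after shrink), in mm2."""
    gpu_low, gpu_high = gpu_range
    before = (cpu + gpu_low + other, cpu + gpu_high + other)
    # Apply the shrink as an integer percentage to avoid rounding noise.
    after = tuple(area * (100 - shrink_percent) / 100 for area in before)
    return before, after


before_shrink, after_shrink = next_gen_soc_estimate(
    cpu=40, gpu_range=(280, 290), other=50, shrink_percent=15
)

print(f"Before shrink: {before_shrink[0]}-{before_shrink[1]} mm2")
print(f"After 15% shrink: {after_shrink[0]}-{after_shrink[1]} mm2")
```

Swapping in TSMC's claimed 20% instead of 15% gives a correspondingly smaller range, which is the point under dispute in the replies.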

Edit: Forgot you mentioned clock speed. My post is long enough already, but I don't agree with you; I expect the GPU clock speed to be 1790-1899 MHz.

Last edited by Trumpstyle - on 29 October 2019

haxxiy said:

It's very unlikely there'll be more than 40 - 44 CUs on next generation consoles even if the die size is around 400 mm2 and they take I/O out of it.

The Zen CPU will take 60 - 80 mm2, since we already know its size. The ray-tracing cores occupy around 180 mm2 in the high-end Turing GPUs. Shrink that to 7 nm and you're talking about 100 - 110 mm2 at the very least. That leaves 210 - 240 mm2 for the GPU - smaller than the 250 mm2 Radeon 5700 chips and their 40 CUs.

Of course, you might opt to severely gimp the ray-tracing cores or take L3 cache out of the Zen CPU, but all of that will come with severe performance costs (such as effectively disabling any meaningful ray-tracing or reducing CPU IPC by 20-30%). Is the trade-off worth it? They've probably studied it far better than we can surmise. But even then, the die space saved wouldn't be enough for the 11 - 14 TFLOPS some people thoughtlessly dream of.

These Radeon cards spit blood above 1600 - 1700 MHz, and we've discussed elsewhere how power consumption influences everything in design, including up to consumer-class product rules and licencing, which are a very significant dealbreaker. Not to mention the costs...

(edit - corrected number of CUs)

Clockspeeds can be tricky; it depends on how aggressively they wish to bin things and how hard they wish to dial the voltages home.

If the base Playstation 4 and Xbox One consoles are a gauge though, they went conservative on both clocks and voltages to keep power consumption in check... Because you do have an efficiency curve to abide by... So you are right on the money there.

In saying that, Microsoft took things in the opposite direction with the Xbox One X, they dialed up the clockrates in favour of a smaller and cheaper to manufacture chip. (Plus they mitigated some of the bottlenecks inherent in GCN's design somewhat). - Some of that cost saving was sunk into a more intricate cooling and power delivery system though.

Either way... They will find balance in the force.



--::{PC Gaming Master Race}::--

Trumpstyle said:

That's a lot of random numbers you've got in your post. First, I believe both Xbox and PS5 will have 44 CUs, and if AMD made a 48 CU discrete GPU I'd expect it to have a die size of 270-280mm2.

We know from the Flute leak that Sony very likely has 8 MB of GameCache instead of the 32 MB found in the 3700X; this reduces the Zen 2 CPU from 70mm2 to 40mm2.

Your numbers for the RT + tensor cores are completely off; on the TU106 (GeForce RTX 2070) they add 35mm2 to a 445mm2 GPU die. I'm assuming the RT hardware will add between 10-20mm2 to our discrete GPU, so 280-300mm2 now.

https://www.reddit.com/r/nvidia/comments/baaqb0/rtx_adds_195mm2_per_tpc_tensors_125_rt_07/

We can look at the PS4, PS4 Pro and Xbox One X SoC die sizes to make an estimate.

PS4: 212mm2 discrete GPU + 50mm2 CPU in a 348mm2 SoC (86mm2 wasted die space)

PS4 Pro: 220-230mm2 GPU + 25mm2 CPU in a 320mm2 SoC (65-75mm2 wasted die space)

Xbox One X: 270-290mm2 GPU + 25mm2 CPU in a 360mm2 SoC (45-65mm2 wasted die space)

For the Xbox One X GPU we can't do a perfect estimate: the Radeon 480 has a die size of 232mm2 and 36 CUs, while the Xbox One X has 44 CUs (4 of them disabled) and a 384-bit bus, and is fabbed on TSMC 16nm+ instead of 14nm GloFo/Samsung. TSMC 16nm+ is less dense and will add 8-9% die area.

Let's do the next-gen SoCs: 40mm2 CPU + 280-290mm2 GPU (hardware RT included) + 50mm2 wasted die space = 370-380mm2. I'm assuming only 50mm2 of wasted die space because of another shrink. I think both Sony and Microsoft will use TSMC 7nm+; this shrinks the SoC by 20%, but I don't expect perfect scaling, so I use 15%.

370-380mm2 x 0.85 = 314.5-323mm2

Edit: Forgot you mentioned clock speed. My post is long enough already, but I don't agree with you; I expect the GPU clock speed to be 1790-1899 MHz.

You can't estimate the size of each component from die pictures - especially low-resolution ones in black and white - and Reddit speculation, since the exposed die will show the final, last layer of silicon (back-end) while both the tensor cores and the ray tracing cores are deeply embedded within the SM design. I surely haven't seen anyone using industrial-grade solvents and analysing the chip under an electron microscope to be able to discern these features.

We can tell, on the other hand, that GeForce 20 SMs are significantly larger than those of the GeForce 10 and 16 series. They are larger even than those within the Volta GPGPUs. How much larger? Well, Nvidia's developer diaries and their own die renders portray the RT cores as larger than the tensor cores, which by themselves are around the same size as the floating point units within the SM. That alone means something like 20% of the SM and 10% of the die. And you need to take into account that more elements within the SM also mean larger register files, dispatch units, warp schedulers, L0 instruction cache etc.

What I'm doing is merely extrapolating these features from the much larger CUDA cores into whatever should be their equivalents within the RDNA architecture, assuming a die size shrink and a more sophisticated solution than the ones in lower-end Turing GPUs.

Besides, no one would reasonably expect another 20% die shrink with 7 nm+ just because TSMC claims so, since not all components are scalable and these chips badly need improved electron flow from lower densities. The same argument also applies to whatever clock they're able to extract from binning, where you have to balance costs against poor samples which scale poorly. And consoles notoriously don't bin their dies anywhere near as carefully as PC solutions. None has ever come close to matching the clockspeeds of its PC equivalents. They aren't going to miraculously match and exceed PC GPU clockspeeds just because.

Edit - the L3 die size in Zen 2 also seems very questionable, since not even these L3 cache sizes in the original 14 nm Zen would balloon to 30 mm². And they occupy far less space than some 40% of the die in the Core i9 CPUs, where Intel manufactures using a less dense solution than AMD.

Last edited by haxxiy - on 30 October 2019

haxxiy said:

You can't estimate the size of each component from die pictures - especially low-resolution ones in black and white - and Reddit speculation, since the exposed die will show the final, last layer of silicon (back-end) while both the tensor cores and the ray tracing cores are deeply embedded within the SM design. I surely haven't seen anyone using industrial-grade solvents and analysing the chip under an electron microscope to be able to discern these features.

We can tell, on the other hand, that GeForce 20 SMs are significantly larger than those of the GeForce 10 and 16 series. They are larger even than those within the Volta GPGPUs. How much larger? Well, Nvidia's developer diaries and their own die renders portray the RT cores as larger than the tensor cores, which by themselves are around the same size as the floating point units within the SM. That alone means something like 20% of the SM and 10% of the die. And you need to take into account that more elements within the SM also mean larger register files, dispatch units, warp schedulers, L0 instruction cache etc.

What I'm doing is merely extrapolating these features from the much larger CUDA cores into whatever should be their equivalents within the RDNA architecture, assuming a die size shrink and a more sophisticated solution than the ones in lower-end Turing GPUs.

Besides, no one would reasonably expect another 20% die shrink with 7 nm+ just because TSMC claims so, since not all components are scalable and these chips badly need improved electron flow from lower densities. The same argument also applies to whatever clock they're able to extract from binning, where you have to balance costs against poor samples which scale poorly. And consoles notoriously don't bin their dies anywhere near as carefully as PC solutions. None has ever come close to matching the clockspeeds of its PC equivalents. They aren't going to miraculously match and exceed PC GPU clockspeeds just because.

Edit - the L3 die size in Zen 2 also seems very questionable, since not even these L3 cache sizes in the original 14 nm Zen would balloon to 30 mm². And they occupy far less space than some 40% of the die in the Core i9 CPUs, where Intel manufactures using a less dense solution than AMD.

There are other very knowledgeable dudes who have looked at this and come to the conclusion that they are even smaller than in the Reddit link I gave. If you don't wanna believe any of them, just compare the GeForce 1660 Ti to the GeForce 2070: they are both full dies, both use the Turing architecture, and the 1660 Ti has no RT + tensor cores. The GeForce 1660 Ti has 1536 GPU cores (284mm2 die) and the GeForce 2070 has 2304 GPU cores (445mm2 die).

2304 / 1536 = 1.5; 1.5 x 284mm2 = 426mm2; 445mm2 - 426mm2 = 19mm2

So here the tensor + RT cores take up 19mm2, not even close to 100mm2.
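The comparison above scales the 1660 Ti die linearly by core count and treats whatever is left of the 2070 die as RT + tensor hardware. A small sketch of that calculation follows; the linear scaling is the post's simplifying assumption, and it ignores any die area that does not grow in proportion to core count (memory controllers, for example). The function name is made up for illustration.

```python
def rt_tensor_area_estimate(small_die_mm2, small_cores, big_die_mm2, big_cores):
    """Estimate extra area on the larger die, assuming area scales with core count."""
    scaled_small_die = small_die_mm2 * big_cores / small_cores
    return big_die_mm2 - scaled_small_die


# Figures quoted in the post: 1660 Ti (1536 cores, 284 mm2) vs 2070 (2304 cores, 445 mm2).
extra_mm2 = rt_tensor_area_estimate(284, 1536, 445, 2304)

print(f"Estimated RT + tensor area: {extra_mm2:.0f} mm2")
```

Because everything that breaks the linear-scaling assumption ends up in the remainder, the 19 mm2 result is best read as a rough figure rather than a measurement.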

About the clock speed: the Xbox One X is clocked at 1172 MHz, the Radeon 480 has a boost clock of 1266 MHz and the Radeon 470 has 1206 MHz. But this doesn't matter, since we have the Gonzalo code; it says 3.2 GHz CPU and 1.8 GHz GPU. We also had another dude over at ResetEra posting several of these codes from the PS4, Xbox One and Xbox One S, and the clock speeds matched perfectly except for the Xbox One, which got a late clock boost. (No, I'm not gonna find the link; it's old, and there are many threads that are 400 very long pages.)

Edit: Forgot the TSMC 7nm+. Yes, TSMC says 20%, but if you check my math I assumed 15%; this I based on the PS4 Slim and Xbox One S. The L3 numbers aren't mine.

Last edited by Trumpstyle - on 30 October 2019