Pemalite said: AMD has spent a ton of engineering resources on Vega.
Specifying some of AMD's improvements is irrelevant as long as you don't also specify what nVidia has achieved. A lot of the engineering work goes into improving performance and power efficiency by switching from third-party cell libraries to custom IC designs for a particular process node. That's something nVidia has obviously spent far more resources on than AMD, and it's something that doesn't show up as a new feature in marketing material.
I spend much of my working time analyzing GPU frame traces, identifying bottlenecks and ways to work around them. Every GPU architecture has bottlenecks; that's nothing new, it's just a matter of what kind of workload you throw at it. I have full access to all the performance counters of the GCN architecture, in both numerical and visual form. For instance, I can see:

- the number of wavefronts executing on each individual SIMD of each CU at any given time during the trace
- the issue rate of VALU, SALU, VMEM, EXP and branch instructions
- wait cycles due to accessing the K$ (scalar constant cache), exporting pixels or fetching instructions
- stalls due to texture rate or texture memory accesses
- the number of read/write accesses to the color or depth caches
- the number of drawn quads
- the number of context rolls
- the number of processed primitives and the percentage of culled primitives
- stalls in the rasterizer caused by the SPI (Shader Processor Input) or the PA (Primitive Assembly)
- the number of indices processed and reused by the VGT (Vertex Grouper/Tessellator)
- the number of commands parsed/processed by the CPG/CPC, and stalls in the CPG/CPC
- the number of L2 reads/writes and the L2 hit/miss rate

That's just a few of the performance counters I have access to. In addition, I have full documentation for the GCN architecture and I've developed several released games targeting it. Based on that I have a pretty good picture of the strengths and weaknesses of the architecture, and I'm interested in hearing whether you have some insight that I lack.
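To illustrate how raw counters like these turn into the utilization figures a frame-trace tool displays, here is a minimal sketch. The counter names, values, and formulas below are hypothetical illustrations of the general idea, not output from any real profiler:

```python
# Sketch: deriving utilization metrics from per-trace GPU counters.
# All counter names and values are hypothetical, chosen only to
# illustrate how raw counts become percentages in a trace viewer.

counters = {
    "gpu_busy_cycles": 1_000_000,  # cycles the GPU was active in the trace
    "valu_insts":        600_000,  # vector ALU instructions issued
    "salu_insts":         90_000,  # scalar ALU instructions issued
    "l2_requests":       200_000,  # total L2 cache requests
    "l2_hits":           150_000,  # L2 requests served from cache
}

def ratio(numer, denom):
    """Return numer/denom as a percentage, guarding against empty traces."""
    return 100.0 * numer / denom if denom else 0.0

valu_util = ratio(counters["valu_insts"], counters["gpu_busy_cycles"])
l2_hit    = ratio(counters["l2_hits"], counters["l2_requests"])

print(f"VALU issue rate: {valu_util:.1f}% of busy cycles")
print(f"L2 hit rate:     {l2_hit:.1f}%")
```

A real trace viewer does the same kind of division per time slice, which is what lets you see, say, VALU issue rate dip exactly where texture stalls spike.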
The geometry rate isn't really a bottleneck for GCN. Even if it were, geometry processing parallelizes quite well and could be addressed by increasing the number of VGTs. It won't be a problem in the future either, for two reasons: 1) the pixel rate will always be the limiting factor, and 2) primitive/mesh shaders give the graphics programmer the option to use the CUs' compute power to process geometry.
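A quick back-of-the-envelope shows why pixel work tends to dwarf geometry work per frame. The triangle count and overdraw factor below are assumptions I picked for illustration, not measurements from any title:

```python
# Back-of-the-envelope: pixel work vs geometry work per frame.
# The triangle count and overdraw factor are illustrative assumptions.

width, height = 3840, 2160      # native 4K render target
overdraw      = 2.5             # assumed average overdraw factor
triangles     = 2_000_000       # assumed triangles submitted per frame

pixels_shaded = width * height * overdraw

print(f"pixels shaded per frame: {pixels_shaded:,.0f}")
print(f"triangles per frame:     {triangles:,}")
print(f"pixel/triangle ratio:    {pixels_shaded / triangles:.1f}x")
```

With these assumptions there are roughly an order of magnitude more pixels to shade than triangles to set up, which is why scaling pixel throughput matters more than scaling fixed-function geometry rate.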
I asked you to specify the inherent flaws and bottlenecks in the GCN architecture that you claim prevent the PS5 from using more than 64 CUs, not AMD's marketing material about their GPUs. So again, can you please specify the "multitude of bottlenecks"?
Pemalite said: Yes it is. The entire reason why Terascale 3 ever existed was because load balancing for VLIW5 was starting to get meddlesome as often there were parts of the array being underutilized...
That's irrelevant to your claim about running out of parallelizable work due to screen-space issues when scaling past 64 CUs.
The PS5 has probably been in development for over five years already. It's Sony's single most important upcoming product by far; they have spent vast amounts of money and engineering resources on it, and AMD has dedicated a large share of its RTG engineers to it. Is it reasonable to believe the PS5 will essentially be a PS4 Pro with 64 CUs and 64 ROPs shrunk down to 7nm? If so, it'll be the most expensive and inefficient die shrink ever. The Pro is designed to run 4K in checkerboard. Obviously a true 4K console needs a rasterizer with at least twice the rate of the Pro's 128 pixels/cycle, so it goes without saying that AMD needs to scale up other parts than the number of CUs anyway, and I don't believe they will make a bare-minimum upscale of those parts, since the lifecycle of a console is about 5-6 years. IMO, they will most likely scale the number of ROPs above 64 as well, but that's less certain. That said, I think there is merit to your claim that there won't be more than 64 CUs in the PS5. I might even agree it's the most plausible configuration. However, I don't agree with your claims about inherent flaws in the GCN architecture preventing the PS5 from having more than 64 CUs. IMO, it's more a question of which price point the PS5 will target and how big an initial financial hit Sony is prepared to take than of technical hurdles.
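The "at least twice the rate" figure follows directly from how checkerboard rendering works: a checkerboard 4K frame shades roughly half the samples of a native 4K frame. A trivial sketch of that arithmetic (the exact sample layout varies per implementation; the factor-of-two is the idealized case):

```python
# Why native 4K needs roughly 2x the pixel rate of checkerboard 4K:
# checkerboard shades about half the samples of a full frame each frame.
# The halving is the idealized case; real implementations vary slightly.

native_4k    = 3840 * 2160        # 8,294,400 pixels per frame
checkerboard = native_4k // 2     # ~half the samples shaded per frame

print(f"native pixels:       {native_4k:,}")
print(f"checkerboard pixels: {checkerboard:,}")
print(f"required speedup:    {native_4k / checkerboard:.1f}x")
```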
Pemalite said: https://www.anandtech.com/show/12677/tsmc-kicks-off-volume-production-of-7nm-chips
I'm not sure what you mean. It clearly says a 70% area reduction and a 60% reduction in power consumption, pretty much in line with what I wrote. An area reduction of 70% would yield a density increase of 3.3x; the difference is probably just rounding.
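The conversion from area reduction to density gain is simple: density scales as the inverse of the remaining area, so a 70% reduction leaves 30% of the area and gives about 3.3x the density.

```python
# A 70% area reduction leaves 30% of the original area per transistor,
# so density scales by the inverse of what remains.

area_reduction = 0.70                     # quoted 7nm area saving
density_gain   = 1.0 / (1.0 - area_reduction)

print(f"{density_gain:.2f}x")             # -> 3.33x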
Here are the links to TSMC's own numbers.
https://www.tsmc.com/english/dedicatedFoundry/technology/10nm.htm
https://www.tsmc.com/english/dedicatedFoundry/technology/7nm.htm