Straffaren666 said:
No, it's not. It only reflects the fact that AMD spent very little engineering resources on the last two generational GCN updates. It doesn't necessarily mean there are major inherent flaws in the GCN architecture. What do you propose? A switch back to a VLIW architecture? Out of curiosity, what are the "multitude of bottlenecks" of GCN that need a revolutionary architecture and can't be overcome by evolutionary/generational updates to the GCN architecture? |
AMD has spent a ton of engineering resources on Vega.
It implemented all of Polaris's improvements, like Instruction Prefetching and a larger instruction buffer, which increased the IPC of each pipe as there are fewer wave stalls.
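To make the buffer point a bit more concrete... Here's a toy Python model of why a deeper per-wave instruction buffer plus prefetch means fewer dead cycles. Every number in it is invented purely to show the trend; it is nothing like how the hardware actually schedules.

```python
def stall_cycles(fetch_latency, buffer_size, instructions):
    """Toy model: a wave issues one instruction per cycle from its buffer,
    a background prefetch trickles one instruction in every `fetch_latency`
    cycles, and an empty buffer forces a full-latency refill.
    All figures are invented for illustration."""
    buffered, stalls = buffer_size, 0
    for cycle in range(instructions):
        if buffered == 0:
            stalls += fetch_latency          # wave sits idle until the refill lands
            buffered = buffer_size
        buffered -= 1                        # issue one instruction this cycle
        if cycle % fetch_latency == 0:
            buffered = min(buffer_size, buffered + 1)   # background prefetch
    return stalls

# Bigger buffer + prefetch keeping it topped up => fewer dead cycles per wave.
print(stall_cycles(fetch_latency=8, buffer_size=4, instructions=256))
print(stall_cycles(fetch_latency=8, buffer_size=16, instructions=256))
```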
But one of Graphics Core Next's largest bottlenecks is... Geometry. Which is ironic, considering AMD was pushing Tessellation as far back as 2001, when the PlayStation 2 was flaunting its stuff.
To that end... AMD introduced the Primitive Discard Accelerator, which throws away triangles that are too small and pointless to render. We also saw the introduction of an index cache, which keeps the index data for instanced geometry on-chip to cut down on data movement.
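To give a rough idea of what that discard stage filters out, here's a minimal Python sketch... It's just a conservative bounding-box test against pixel centres; the real hardware test is more involved and the function names are my own.

```python
import math

def covers_no_samples(v0, v1, v2):
    """Conservative test: does the triangle's screen-space bounding box miss
    every pixel centre? (Centres sit at integer + 0.5 coordinates.)"""
    xs = (v0[0], v1[0], v2[0])
    ys = (v0[1], v1[1], v2[1])
    first_cx = math.ceil(min(xs) - 0.5)    # first pixel column whose centre >= min x
    last_cx = math.floor(max(xs) - 0.5)    # last pixel column whose centre <= max x
    first_cy = math.ceil(min(ys) - 0.5)
    last_cy = math.floor(max(ys) - 0.5)
    return first_cx > last_cx or first_cy > last_cy

def should_discard(v0, v1, v2):
    """Loosely mimics a primitive-discard stage: drop degenerate triangles and
    triangles too small to touch any sample."""
    # Twice the signed area; zero means the triangle collapses to a line/point.
    area2 = (v1[0] - v0[0]) * (v2[1] - v0[1]) - (v2[0] - v0[0]) * (v1[1] - v0[1])
    return area2 == 0 or covers_no_samples(v0, v1, v2)

# A sliver wedged between pixel centres never produces a fragment, so cull it early.
print(should_discard((10.6, 10.6), (10.9, 10.7), (10.7, 10.9)))  # True
print(should_discard((10.0, 10.0), (20.0, 10.0), (10.0, 20.0)))  # False
```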
Graphics Core Next also tends to be ROP limited, which is why AMD reworked them on Polaris, which brought improved Delta Colour Compression, larger L2 caches and so on.
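Delta Colour Compression itself is a simple enough idea that a toy sketch gets the gist across... This is just the general delta-plus-fallback idea in Python, on a single channel, nothing like AMD's actual encoding.

```python
def dcc_block(pixels):
    """Toy single-channel delta colour compression: keep the first value as a
    base and store the rest as small deltas, falling back to raw when the
    deltas won't fit. Same general idea only, not AMD's real format."""
    base = pixels[0]
    deltas = [p - base for p in pixels[1:]]
    if all(-128 <= d <= 127 for d in deltas):
        # A handful of bits per delta instead of a full value per pixel means
        # less bandwidth burned between the ROPs/caches and memory.
        return ("compressed", base, deltas)
    return ("raw", pixels)

# Smooth gradients compress nicely; a noisy block would fall back to raw.
print(dcc_block([200, 201, 201, 202, 200, 199, 203, 202]))
```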
And then with Vega, AMD kicked it up again by introducing the Draw Stream Binning Rasterizer... Which gives Vega the ability to bin polygons on a per-tile basis. That, in conjunction with the Primitive Discard Accelerator, means a significant reduction in the amount of geometry work that needs to be done, boosting geometry throughput substantially.
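The binning idea itself looks roughly like this toy Python sketch... The tile size and data layout are made up, and the real rasterizer batches and defers work far more cleverly than this.

```python
from collections import defaultdict

TILE = 32  # tile edge in pixels; the real binning granularity isn't public

def bin_triangles(triangles, width, height):
    """Assign each triangle to every screen tile its bounding box overlaps, so
    a tile's worth of work can be shaded while its data stays resident on chip."""
    bins = defaultdict(list)
    for tri_id, verts in enumerate(triangles):
        xs = [v[0] for v in verts]
        ys = [v[1] for v in verts]
        tx0, tx1 = max(0, int(min(xs))) // TILE, min(width - 1, int(max(xs))) // TILE
        ty0, ty1 = max(0, int(min(ys))) // TILE, min(height - 1, int(max(ys))) // TILE
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins[(tx, ty)].append(tri_id)
    return bins

tris = [((5, 5), (60, 5), (5, 60)), ((100, 100), (110, 100), (100, 110))]
print(dict(bin_triangles(tris, 256, 256)))
```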
On the ROP side of the equation... AMD made the ROPs clients of the L2 cache rather than the memory controller, which means that as L2 caches grow, the ROPs can better leverage them to bolster overall performance... It's also a particular boon for render-to-texture in deferred engines, as the next pass can pull the render target out of the L2 rather than taking a round trip through the framebuffer in memory.
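A toy bit of accounting shows why that matters for render-to-texture... The sizes and costs here are invented; only the direction of the difference is the point.

```python
def dram_accesses(pixels, rops_write_through_l2, l2_capacity):
    """Toy accounting for a render-to-texture pass that the next pass samples
    straight away, under two ROP write paths. Sizes and costs are made up."""
    l2, traffic = set(), 0
    for p in range(pixels):                     # pass 1: ROPs export the target
        if rops_write_through_l2 and len(l2) < l2_capacity:
            l2.add(p)                           # stays on chip in the L2
        else:
            traffic += 1                        # written straight out to memory
    for p in range(pixels):                     # pass 2: sampled as a texture
        if p not in l2:
            traffic += 1                        # miss, fetched back from memory
    return traffic

print(dram_accesses(4096, rops_write_through_l2=False, l2_capacity=4096))  # 8192
print(dram_accesses(4096, rops_write_through_l2=True, l2_capacity=4096))   # 0
```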
And then we have the primitive shader too.
In short... Just across the Polaris and Vega introductions, a ton of engineering has gone into the geometry side of the equation; it has always been a sore point with AMD's hardware, even going back to TeraScale.
Straffaren666 said: No, it's not. Evidently, neither AMD's nor nVidia's architectures exhibit any significant inefficiencies due to not being able to feed the CUs with rasterized pixels in a 16CU configuration at 1080p, because of parallelization. Why would it suddenly become a problem for 64CUs@4K? Sure, if one goes wide enough, eventually one will run out of work to parallelize. There is no empirical evidence we've reached that point due to screen-space issues. |
Yes, it is. The entire reason TeraScale 3 ever existed was that load balancing for VLIW5 was starting to get meddlesome, with parts of the array often sitting underutilized...
The solution? Reduce it down to VLIW4.
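To illustrate why narrower bundles are easier to keep busy, here's a toy utilization calculation in Python... The per-bundle ILP figures are invented for the sake of the example, not measured shader data.

```python
def packed_lane_utilization(bundle_ilp, lanes):
    """How full, on average, each VLIW bundle ends up when the compiler can
    only co-issue `bundle_ilp[i]` independent ops per bundle."""
    busy = sum(min(ilp, lanes) for ilp in bundle_ilp)
    return busy / (len(bundle_ilp) * lanes)

ilp_per_bundle = [3, 4, 2, 5, 3, 3, 4, 2, 3, 4]   # hypothetical pixel shader
print(f"VLIW5 lanes busy: {packed_lane_utilization(ilp_per_bundle, 5):.0%}")  # 66%
print(f"VLIW4 lanes busy: {packed_lane_utilization(ilp_per_bundle, 4):.0%}")  # 80%
```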
It is also why AMD hasn't pushed out past 64 CUs. They potentially could... But that would require a significant overhaul of various parts of Graphics Core Next in order to balance the load and get more efficient utilization.
It's not always a case of go big or go home... Graphics Core Next already tends to be substantially larger, slower and hotter than the nVidia equivalent anyway.
Straffaren666 said: Well, the only two things I've said are 1) That I wouldn't rule out more than 64CUs in PS5 (I never claimed there will be more than 64CUs), based on the fact that the PS4 Pro, built on 16nm, has 36CUs and that there is about a 3x density improvement, according to TSMC, when going from their 16nm to their 7nm. 2) The die area reductions achieved by Vega 7 are not a good measurement for the 7nm process node, due to it being a low volume niche product. What is it I don't understand? |
https://www.anandtech.com/show/12677/tsmc-kicks-off-volume-production-of-7nm-chips
Apparently there isn't a 3x density improvement? Got a link to substantiate your claims?
Straffaren666 said:
I never claimed it to be a 3x density improvement over 14nm. TSMC claims their 7nm process node yields about a 3x density improvement over 16nm, which is the process node the Pro is built on and was the relevant frame of reference here. To be more precise, TSMC claims the density improvement to be 3.1. |
Hence why I stipulated that it would have to be an "Apples to Apples" comparison.
16nm is just advertising fluff anyway; it's not a true 16nm process.
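For what it's worth, the back-of-the-envelope maths behind that 3x claim looks like this... The die size is a rough hypothetical round number, and real chips never scale anywhere near perfectly because IO, analogue and SRAM shrink far worse than logic.

```python
def scaled_area(area_mm2, claimed_density_gain):
    """Idealized shrink: same transistor count divided by the foundry's
    claimed logic-density gain. Real designs land well short of this."""
    return area_mm2 / claimed_density_gain

# Hypothetical round numbers: a ~325 mm^2 16nm console SoC against TSMC's
# claimed ~3.1x density gain for 7nm.
print(f"{scaled_area(325, 3.1):.0f} mm^2")  # ~105 mm^2 at the same transistor count
```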
--::{PC Gaming Master Race}::--