Aielyn said:
OK, perhaps someone can explain something to me.

The Wii U is speculated to have 320 stream processors, right? That's based on 40 stream processors per block, 8 blocks. And other possible values are either 256 stream processors (32 per block) or 160 stream processors (20 per block). And assuming each one gets one FLOP per cycle, you get the entire set producing up to 352 GFLOPS of processing power, right?

The 360 had just 48 stream processors, with a clock speed just a little slower than that of the Wii U's GPU, but got 240 GFLOPS from them. It did this by having each stream processor capable of up to 10 FLOPs per cycle.

So my question is this: why is it automatically assumed that the Wii U's stream processors are only capable of one FLOP per cycle, given this? It's a serious question, not rhetorical - I'm trying to understand why this isn't under consideration; is it lack of knowledge of GPUs on my part, a detail that I'm not aware of, or is it a possible oversight by the people analysing the system?

The other detail that, to me, goes with this question, is why, given the current speculation about the GPU, do we keep hearing about how the Wii U gets an amazing amount of graphical capability for its power draw? It has been repeatedly suggested or implied that the Wii U's efficiency is remarkably high. How does this mesh with the speculated details? What impact would having stream processors similar to those in the 360 (that is, stream processors that have a net power of multiple FLOPs per cycle) have on the power draw, relative to having more stream processors?

Actually, they are assumed to do 2 floating-point operations (FLOPs) per clock in single precision, because all stream processors based on the VLIW-5 (or VLIW-4 or GCN) architecture do so; you can read up on that at amd.com. That gives 320 SPs x 2 FLOPs/clock x 550 MHz = 352 GFLOPS.

The shader cores in the Xenos each contain 5 ALUs, each capable of 2 FLOPs/clock and hence similar to the SPs in newer designs. That is where the 10 FLOPs per cycle per shader core comes from (5 ALUs x 2 FLOPs). In total you get 48 x 5 = 240 ALUs x 2 FLOPs/clock x 500 MHz = 240 GFLOPS of theoretical processing power.
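To make the arithmetic easier to check, here is a rough sketch of both calculations. The function name `peak_gflops` and the loop are my own illustration; the 550 MHz and 500 MHz clocks and the 2 FLOPs/clock figure are simply the numbers used in this thread, and the Wii U SP counts are the speculated values, not confirmed specs.

```python
def peak_gflops(alus, flops_per_clock, clock_mhz):
    """Theoretical peak throughput: ALUs x FLOPs per clock x clock speed.

    clock_mhz is in MHz, so dividing by 1000 converts MFLOPS to GFLOPS.
    """
    return alus * flops_per_clock * clock_mhz / 1000


# Xenos (360): 48 shader cores x 5 ALUs each, 2 FLOPs/clock, 500 MHz.
xenos = peak_gflops(48 * 5, 2, 500)
print(f"Xenos: {xenos:.1f} GFLOPS")

# Wii U: speculated SP counts from the thread, assuming 2 FLOPs/clock
# and the 550 MHz clock used in the calculation above.
for sps in (160, 256, 320):
    print(f"Wii U with {sps} SPs: {peak_gflops(sps, 2, 550):.1f} GFLOPS")
```

Counted per ALU rather than per shader core, the 360 and the speculated Wii U parts use the same 2 FLOPs/clock figure; the difference in the totals comes from the ALU count and the clock.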

As for the power consumption, I can only imagine that the newer ALU/SP designs are simpler and less prone to power leakage while maintaining the same work rate, so the chip uses less power overall. The newer organization into big blocks might also get rid of bottlenecks in certain situations.