
Wii U vs PS4 vs Xbox One FULL SPECS (January 24, 2014)

So this makes it look like the PSO (Omni?) might still pack just a bit more punch than the NeXtbox. Am I looking at these the wrong way or something or is that actually what these stats are showing?




Nevermind. After looking over things again they seem to be pretty much the same. I think?



HoloDust said:
As for the RSX GFLOPS number, I took it from the wiki, and I find it very strange - actually, to me it makes no sense at all (anyone with greater knowledge mind explaining?), so I was guessing it might be a combined figure with Cell (the Cell number is also wrong: a full Cell does 230.4 GFLOPS, while the PS3's version can only do about 179 GFLOPS (1 PPE + 6 SPEs, 25.6 per element)), but that doesn't add up well either, so I'm puzzled. The discrepancy in Xenos is more understandable because of the nature of VLIW5 (the fifth, fat ALU doesn't seem to be used much), so it can be argued that it's 120G shader ops/s - 240 GFLOPS, or 96/192, but I put the official specs.

*** Sorry, my post below is freaking long, but I think it'll save you from wasting more time worrying too much about theoretical specs ***

Your chart is very pretty but I am gonna drop a bomb on the discussion:

- You cannot under any circumstances compare GFLOPs, texture, pixel or shader performance across different brands of GPUs (especially with different architectures), and even within the same brand (i.e., NVIDIA or AMD) you sometimes cannot compare those metrics, because AMD and NVIDIA keep upgrading their architectures (AMD went from VLIW-5 in the R500 of the Xbox 360 to VLIW-4 in the HD6900 and then to Graphics Core Next in the HD7000), making the comparison largely meaningless. At best, the comparison can only be made within a specific generation if the underlying architecture is the same (so all GeForce 7s to each other, GeForce GTX 400s to each other, all HD7000 series cards, etc.), but even then you have to be extremely careful. There are various reasons for this:

1) Theoretical #s are meaningless for extrapolating real world gaming performance, as they do not account for bottlenecks in a specific SKU. GFLOPs only cover arithmetic calculations performed by the shaders of the GPU and tell us nothing about other bottlenecks in the GPU architecture. To perform anti-aliasing, for instance, we need a ton of memory bandwidth; GFLOPs tell us nothing about memory bandwidth. Then you have architectural inefficiencies. RSX is a fixed vertex/pixel shader architecture with fragment pipelines, vs. GeForce 8, which is a unified shader architecture. What that means is the latter has shaders that can perform both pixel and vertex operations, while the former only has 24 pixel and 8 vertex shaders. If a game needs more pixel shaders, RSX is stuck with 24 at most. A unified architecture can seamlessly shift this to 30/2, 24/8, 16/16, etc. The former has a lot more inefficiencies, but looking at FLOPs won't tell us that. Comparing theoretical GFLOPs, TMUs or pixel shading power will not tell you what those architectures can actually put down in a game in the real world due to their specific efficiencies (I will provide examples to prove this below). Some games need more texture performance, some need more shaders, some need compute shaders. You can have 10x the texture performance of a competing GPU, but if you are compute-shader limited when calculating HDAO/super-sampled AA in Far Cry 3, your GPU is bottlenecked.
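To illustrate the fixed vs. unified point with something you can actually run, here's a toy back-of-the-envelope sketch (plain Python, the model and numbers are made up purely for illustration; real GPU scheduling is far more complicated than this):

```python
# Toy model only: how many shader units can stay busy for a given pixel/vertex
# workload mix, comparing a fixed RSX-style split against a unified pool.

def busy_units_fixed(pixel_demand, vertex_demand, pixel_units=24, vertex_units=8):
    # Fixed split: pixel work can only run on pixel shaders, vertex work on vertex shaders.
    return min(pixel_demand, pixel_units) + min(vertex_demand, vertex_units)

def busy_units_unified(pixel_demand, vertex_demand, total_units=32):
    # Unified (GeForce 8 style): any unit can take either kind of work.
    return min(pixel_demand + vertex_demand, total_units)

# A pixel-heavy frame that wants 30 "units worth" of pixel work and 2 of vertex work:
print(busy_units_fixed(30, 2))    # 26 of 32 units busy -> pixel shaders are the bottleneck
print(busy_units_unified(30, 2))  # 32 of 32 units busy -> the pool just rebalances to ~30/2
```

Both designs have the same raw FLOP count in this toy example, yet the fixed split leaves units idle whenever the workload mix doesn't match its hard-wired ratio.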

Let's tackle the FLOPs more closely.

2) GFLOP performance does NOT directly translate into gaming performance. Comparing FLOPS is marketing unless you know exactly what you are using the GPU for. For example, if you are performing SHA-256 password hashing, the GPU's ALUs (in this case shaders/CUDA cores) have to perform a 32-bit "right rotate" operation. Modern AMD GPUs can do this in a single instruction, but NVIDIA's need 3 separate instructions (2 shifts + 1 add). From the GFLOP number alone you have no idea how each GPU handles different instructions. Each architecture is different. Complicating matters, games do not only use the shaders/ALUs, yet GFLOPs only look at the shaders/ALUs.
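To make the instruction-count point concrete, here's a tiny sketch (plain Python, not GPU code; it just writes out the "one instruction vs. three operations" idea from the example above):

```python
MASK32 = 0xFFFFFFFF  # keep results in 32 bits; Python ints are arbitrary precision

def rotr_single_op(x, n):
    # What a GPU with a native rotate instruction effectively does in one operation.
    return ((x >> n) | (x << (32 - n))) & MASK32

def rotr_emulated(x, n):
    # What a GPU without a rotate instruction has to do: two shifts plus a combine.
    right = (x >> n) & MASK32          # operation 1: shift right
    left = (x << (32 - n)) & MASK32    # operation 2: shift left
    return (right + left) & MASK32     # operation 3: combine (add/or; the bits never overlap)

assert rotr_single_op(0xDEADBEEF, 7) == rotr_emulated(0xDEADBEEF, 7)
```

Same answer either way, but one path costs roughly three times the ALU work, and that is exactly the kind of difference a raw FLOP number hides. Before going further, though, we need to understand what a GFLOP actually is.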

"FLOPS (or flops or flop/s, for floating-point operations per second) is a measure of computer performance, especially in fields of scientific calculations that make heavy use of floating-point calculations, similar to the older, simpler, instructions per second."

The problem is that there are various floating point instructions. You can have single precision or double precision floating point operations (for double precision there is a 24x penalty on the GTX680, a 16x penalty on the HD7870 and only a 4x penalty on the HD7970, but that still tells you nothing about games). But just so you can wrap a real world example around this, I'll show you why FLOPS are meaningless:

Floating point operations - calculated from the shaders/CUDA cores/stream processors in the GPU:

GTX580 = 512 CUDA cores x 772 MHz GPU clock x 2 for the shader clock (shaders run at 2x the GPU clock on this architecture) x 2 instructions per clock (2 floating point ops per clock cycle) / (1000 MFLOPS in 1 GFLOPS) = 1581 GFLOPS, or 1.581 TFLOPS

GTX680 = 1536 CUDA cores x 1058 MHz GPU clock (same as the shader clock, so no need to multiply by 2) x 2 ops/clock / (1000 MFLOPS in 1 GFLOPS) = 3250 GFLOPS, or 3.25 TFLOPS

HD7970 GHz Ed. = 2048 stream processors/shaders x 1050 MHz GPU clock (same as the stream processor clock) x 2 ops/clock / (1000 MFLOPS in 1 GFLOPS) = ~4300 GFLOPS, or 4.3 TFLOPS

Now from the above, you might think that GTX680 is 2.055x faster in games than GTX580 and HD7970 GE is 32% faster than GTX680. We know for a fact this is not true: http://www.techpowerup.com/reviews/HIS/HD_7970_X_Turbo/28.html
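If you want to play with the arithmetic yourself, those theoretical numbers reduce to a one-liner (a sketch using only the unit counts and clocks quoted above; nothing here is measured data):

```python
def theoretical_gflops(shader_units, clock_mhz, flops_per_clock=2, shader_clock_multiplier=1):
    """Peak single-precision GFLOPS = units x effective shader clock x FLOPs issued per clock."""
    return shader_units * clock_mhz * shader_clock_multiplier * flops_per_clock / 1000.0

gtx580 = theoretical_gflops(512, 772, shader_clock_multiplier=2)   # hot-clocked shaders -> ~1581
gtx680 = theoretical_gflops(1536, 1058)                            # -> ~3250
hd7970_ghz = theoretical_gflops(2048, 1050)                        # -> ~4300

print(round(gtx680 / gtx580, 3))      # ~2.056x "on paper"
print(round(hd7970_ghz / gtx680, 2))  # ~1.32x "on paper"
```

The real-world gaming gaps in the TechPowerUp link above are nowhere near those on-paper ratios, which is the whole point.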

3) OK, next, moving on to the comparison of things like texture rate (TMUs), pixel fillrate (ROPs) and so on. All of those are again meaningless without knowing how the architecture works.

I'll use AMD this time to show you how anyone can make a major mistake using this type of theoretical comparison.

Theoretical pixel fillrate (ROP throughput):

HD6970 = 32 ROPs x 880 MHz GPU clock = 28.16 GPixels/sec

HD7970 = 32 ROPs x 925 MHz GPU clock = 29.6 GPixels/sec

It's logical to conclude that the HD7970 has 5% more pixel throughput, correct? Incorrect. The theoretical numbers don't explain how the architecture changed from the HD6000 to the HD7000 series. The HD7970 uses the Graphics Core Next architecture, which, unlike the VLIW-4 of the HD6970, decouples the ROPs from the memory controllers, so the ROPs are fed memory bandwidth directly via a crossbar. To make a long story short, the HD7970's same 32 ROPs are about 50% more efficient; put differently, the HD6970 can never reach its theoretical rate of 28.16 GPixels/sec because its ROPs are much less efficient. When you run a real world test, you see that the actual pixel throughput is 52% greater, not 5%.
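The arithmetic itself is trivial; it's the architecture behind it that isn't. A sketch using the ROP counts and clocks above (the efficiency comment at the end is just restating the measured ~52% gap, not something you could derive from the specs):

```python
def theoretical_fillrate_gpix(rops, clock_mhz):
    """Peak pixel fillrate in GPixels/s = ROPs x core clock."""
    return rops * clock_mhz / 1000.0

hd6970 = theoretical_fillrate_gpix(32, 880)   # 28.16 GPixels/s on paper
hd7970 = theoretical_fillrate_gpix(32, 925)   # 29.60 GPixels/s on paper

print(round(hd7970 / hd6970 - 1, 3))  # ~0.051 -> "5% faster" on paper

# In practice GCN's decoupled ROPs sustain far more of their peak than VLIW-4's ROPs do,
# so the measured gap is around 50%, not 5%.
```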

Now you are slowly starting to see how comparing specs becomes meaningless. More evidence? A key feature of DX11 games is tessellation. Even if you dove deeper into the specs of the HD6970 vs. the HD7970, you'd find that the HD6970 has 1 geometry engine and the HD7970 has 2. These geometry engines are responsible for performing tessellation. Looking at the GPU clocks (880 MHz for the HD6970, 925 MHz for the HD7970), you might say, OK, I think the HD7970 would be slightly more than 2x faster at tessellation. That seems like a reasonable assumption. It would be incorrect again.

If you just looked at specs like GFLOPs, textures, pixels and memory bandwidth, you would miss an integral part of next generation games: geometry engine performance. And even if you compared the 1 geometry engine of the HD6970 to the 2 of the HD7970, you'd have to read deeper to understand that in the Graphics Core Next architecture of the HD7970 (unlike the VLIW-4 of the HD6970), the parameter cache was enlarged, allowing increased vertex re-use and off-chip buffering. DX11 geometry/tessellation performance increased 1.7-4x, not the "slightly more than 2x" I estimated above.

It gets MORE complicated, since modern games now use pixel shaders, geometry shaders, vertex shaders, textures, z-culling, and a newer type of shader called the compute shader. There is also a relationship between memory bandwidth, the ROPs and anti-aliasing performance. All of these differ in efficiency not just between AMD and NVIDIA but within those brands themselves, since you cannot just compare NVIDIA's GeForce 7 with the GTX 600 series, or the HD6000 with the HD7000, as those are entirely different architectures.

To make things easier, IGNORE ALL THESE THEORETICAL SPECS. Focus on the following steps only:

1) Find out exactly what architecture is used in the GPU of a next gen console, or at least the codename for the chip so we can back into the architecture (G70/71 = RSX, R500 = Xenos of Xbox 360, etc.)

2) Look up the specifications on this website (http://www.gpureview.com/show_cards.php) and compare them to the specs of the GPU in the console. Do they match? If so, proceed to point #3, if not proceed to point #4.

3) Look up the Voodoo GPU Power # in this chart for the card that matches the console GPU's specs (http://alienbabeltech.com/abt/viewtopic.php?p=41174). These #s represent actual gaming performance in real world games, compiled into 1 number for simplicity after reviewing hundreds of reviews of PC hardware. If the specs match, comparing is as easy as dividing one number by the other to see the % difference (see the quick sketch after these steps). That's it, you are done. So, for example, the HD7850 2GB is 139 VP and the X1800XT is 16.7 VP (~ the performance of the Xbox 360's GPU). That means the HD7850 2GB is going to be about 8.3x faster in games on average.

4) If the specs do not match, the GPU may be a mobile version of a desktop part. You'd need to start looking here (http://www.notebookcheck.net/Mobile-Graphics-Cards-Benchmark-List.844.0.html). More likely than not, the GPU in the next generation consoles will be a variant of a mobile GPU or even a custom GPU/APU. If the specs do not match (for example, RSX in the PS3 is roughly a 7950 GT with 50% fewer ROPs and 50% less memory bandwidth), then you are going to have to make estimates, scaling down the performance from the chart in #3 and figuring out where the architecture and specs fit relative to the desktop GPUs in that chart. So, in the chart in #3, a full-blown 7950 GT is 19.9 VP, but we know RSX in the PS3 has units cut, so we know for sure it sits below that mark.
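As mentioned in step 3, once you have the Voodoopower numbers the whole comparison is a single division. A quick sketch using the VP values quoted above (the RSX entry is only an upper bound, per step 4):

```python
# Voodoopower (VP) ratings quoted from the alienbabeltech chart linked in step 3.
voodoopower = {
    "HD 7850 2GB": 139.0,
    "X1800 XT (~Xbox 360 GPU)": 16.7,
    "7950 GT (RSX upper bound)": 19.9,
}

def relative_speed(gpu_a, gpu_b):
    """How many times faster gpu_a is than gpu_b in real games, on average."""
    return voodoopower[gpu_a] / voodoopower[gpu_b]

print(round(relative_speed("HD 7850 2GB", "X1800 XT (~Xbox 360 GPU)"), 1))  # ~8.3x
```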

Hope this saves you a lot of headache, since comparing theoretical numbers is just going to mislead you guys!!! I know I already posted something like this here, but I see it's being ignored. You are going to find yourself spinning your wheels trying to compare theoretical #s that have no relation to each other, and it's just going to confuse you or lead to incorrect conclusions.

First find out what the underlying architecture is, then focus on the specs, but ONLY so you can look up real world gaming performance in that Voodoo GPU power chart.

What makes things even more complicated is that the Xbox 720 might use a PowerPC CPU while the PS4 might use an x86 CPU. Even if we nail down the GPU comparison, it'll be very difficult to compare the CPUs. Then there is eDRAM, which complicates matters, as it can be used to reduce the performance hit a GPU incurs when accelerating anti-aliasing, somewhat making up for lost memory bandwidth. And there is always a possibility that the PS4/Xbox 720 could use some truly innovative technologies to reduce latencies, like die stacking (http://semiaccurate.com/2012/09/06/die-stacking-has-promise-and-problems/). But let's not even go there :)



Thanks for the long post, but I think the general idea of that was already clear in this thread. I, for one, listed "more modern architecture" most of the time in my posts when comparing next gen GPUs to current gen GPUs, to account for the facts you listed in detail.

And what do we have, apart from some numbers like GFLOPs etc., to compare GPUs in a simple way? Of course the only real way to compare gaming performance of different GPUs would be to list different gaming benchmarks. But we don't have that (and it would be far too long for the opening post to list dozens of gaming benchmarks). GFLOPs etc. at least give you a general hint of how powerful a GPU could be (yes, it might be off by a factor of 2 or 3 or something when comparing different architectures, but the Xbox 720 and the PS4 seem to be going with AMD GPUs, and most likely both will use the VLIW4 architecture).

From the PC gaming benchmarks I know of, VLIW4 most of the time really is around 20% faster than VLIW5 (when games are optimized for it, like e.g. Dirt 3). So once we know exactly what architecture each GPU uses, we could include a remark mentioning this in the opening post. But do we even know what the Wii U GPU uses? VLIW5 seems a lot more likely, but we don't really know yet.

edit: but thanks for the link to alienbabeltech.com. I'm helping with purchasing advice for PC gaming hardware at another forum, and 'til now I always spent a couple of hours each week searching for the newest GPU benchmarks/tests (most websites only test around 5-6 games, and that's not nearly enough to get a good estimate of the average power of the GPUs). Is that your list?



BlueFalcon said:

*** Sorry, my post below is freaking long, but I think it'll save you from wasting more time worrying too much about theoretical specs ***

Wow - all I can say is THANK YOU.

I made those charts to try to make comparing things a bit easier, though I knew from the start that in a lot of aspects they would be very inaccurate. I was thinking of adding something like 3DMark at least (if it's possible to find any meaningful equivalents for the PS360), and I will most certainly use those Voodoopower ratings you provided.

I knew there would be a lot of problems comparing GPUs, especially with RSX because of its non-unified architecture. So, considering your knowledge on the matter, what do you think: how much do the Redwood/Turks architectures (both viable candidates for the Wii U's GPU) differ from Xenos, all being VLIW5? To be honest, I was always confused when people compared the X1800 XT and Xenos; the first is a 16:8:16:16 non-unified design, and the second is a 48(x5):16:8 unified design, more like the R600.



BlueFalcon said:

What makes things even more complicated is that the Xbox 720 might use a PowerPC CPU while the PS4 might use an x86 CPU. Even if we nail down the GPU comparison, it'll be very difficult to compare the CPUs. Then there is eDRAM, which complicates matters, as it can be used to reduce the performance hit a GPU incurs when accelerating anti-aliasing, somewhat making up for lost memory bandwidth. And there is always a possibility that the PS4/Xbox 720 could use some truly innovative technologies to reduce latencies, like die stacking (http://semiaccurate.com/2012/09/06/die-stacking-has-promise-and-problems/). But let's not even go there :)

Thanks for all the info, but I think a lot of users here had come to similar conclusions. Any comparison based on only a limited number of specs is going to be biased and only show a fragment of the overall story.

Btw, I'm bookmarking your post, lots of useful info there



Superchunk, I'm not sure if your OP has it somewhere, but there is a big possibility the PS4 will not have a discrete GPU at all.

Given the latest rumour of Jaguar cores being used in the APU, there's a big chance that they won't need a discrete GPU, as they will effectively be able to fit a better GPU into the APU, if that makes sense :)

All in all, I think that given the latest rumours there will be a significant bump in visuals next gen... and this should be credited to AMD for creating low-power solutions with high performance. Really amazing stuff happening there, IMO.



Intel Core i7 3770K [3.5GHz]|MSI Big Bang Z77 Mpower|Corsair Vengeance DDR3-1866 2 x 4GB|MSI GeForce GTX 560 ti Twin Frozr 2|OCZ Vertex 4 128GB|Corsair HX750|Cooler Master CM 690II Advanced|

M.U.G.E.N said:
umm actually isn't this the latest 'rumor'

Playstation 4 is using Thebe-J which hasn't finished yet nor is it related to the Jaguar or Trinity or Kaveri architectures. The only one that is showing any signs of finalization is Xbox's Kryptos which is a >450 mm² chip. To get back on Thebe-J it was delayed from Trinity side-by-side development to Kaveri side-by-side development.

I assume if they are going to use Jaguar it is going to be in a big.LITTLE formation. Which will have them in a configuration where the Jaguar portion will control all of the system, os, etc stuff that generally isn't compute intensive. While the Thebe portion will control all of the gaming, hpc, etc. stuff that is generally compute intensive. Since, each year the performance part of the Playstation Orbis was upgraded it is safe to assume that they are going for an APU with the specs.


First:
A8-3850 + HD 7670
400 VLIW5 + 480 VLIW5 => 880 VLIW5 -> VLIW4 => 704

Second:
A10-5700 + HD-7670
384 VLIW4 + 480 VLIW5 => 864 VLIW4/5 -> VLIW4 => 768

I have heard that the third generation of the test Orbis uses an APU with GCN2.
Unknown APU + HD8770
384 GCN2 + 768 GCN2 -> 1152 GCN2

It is assumed that the APU only has four cores because AMD doesn't plan to increase the core count other than the GPU cores from now on.

via neogaf/jeff rigby/anandtechforums
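(As an aside, if you want to follow the shader-count arithmetic in that quote: the conversions appear to assume that VLIW5 units are worth about 4/5 of a VLIW4 unit. That's only my reading of the quoted numbers, written out below.)

```python
# Assumed reading of the quoted devkit math: VLIW5 shader counts are scaled by 4/5
# to express them as VLIW4 equivalents. Purely illustrative.

def vliw4_equivalent(vliw4_shaders=0, vliw5_shaders=0):
    return vliw4_shaders + vliw5_shaders * 4 // 5

print(vliw4_equivalent(vliw5_shaders=400 + 480))               # first kit:  880 VLIW5 -> 704
print(vliw4_equivalent(vliw4_shaders=384, vliw5_shaders=480))  # second kit: -> 768
print(384 + 768)                                               # third kit (all GCN2): 1152
```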

Those latest rumours are really interesting, don't you think, MUGEN?

We could be looking at a major upgrade if it's true. Also, since the devkits seem to be getting more powerful as time goes by, I'm really hopeful. It also seems that AMD has done some nice work with regards to keeping the power draw down to a minimum.



Intel Core i7 3770K [3.5GHz]|MSI Big Bang Z77 Mpower|Corsair Vengeance DDR3-1866 2 x 4GB|MSI GeForce GTX 560 ti Twin Frozr 2|OCZ Vertex 4 128GB|Corsair HX750|Cooler Master CM 690II Advanced|

Shinobi-san said:
M.U.G.E.N said:
umm actually isn't this the latest 'rumor'

Playstation 4 is using Thebe-J which hasn't finished yet nor is it related to the Jaguar or Trinity or Kaveri architectures. The only one that is showing any signs of finalization is Xbox's Kryptos which is a >450 mm² chip. To get back on Thebe-J it was delayed from Trinity side-by-side development to Kaveri side-by-side development.

I assume if they are going to use Jaguar it is going to be in a big.LITTLE formation. Which will have them in a configuration where the Jaguar portion will control all of the system, os, etc stuff that generally isn't compute intensive. While the Thebe portion will control all of the gaming, hpc, etc. stuff that is generally compute intensive. Since, each year the performance part of the Playstation Orbis was upgraded it is safe to assume that they are going for an APU with the specs.


First:
A8-3850 + HD 7670
400 VLIW5 + 480 VLIW5 => 880 VLIW5 -> VLIW4 => 704

Second:
A10-5700 + HD-7670
384 VLIW4 + 480 VLIW5 => 864 VLIW4/5 -> VLIW4 => 768

I have heard that the third generation of the test Orbis uses an APU with GCN2.
Unknown APU + HD8770
384 GCN2 + 768 GCN2 -> 1152 GCN2

It is assumed that the APU only has four cores because AMD doesn't plan to increase the core count other than the GPU cores from now on.

via neogaf/jeff rigby/anandtechforums

Those latest rumours are really interesting, don't you think, MUGEN?

We could be looking at a major upgrade if it's true. Also, since the devkits seem to be getting more powerful as time goes by, I'm really hopeful. It also seems that AMD has done some nice work with regards to keeping the power draw down to a minimum.

I actually like that setup. Though I can only see problems with it depending on what Microsoft does. Like, I can see porting between consoles being a problem again.



darkknightkryta said:

I actually like that setup. Though I can only see problems with it depending on what Microsoft does. Like, I can see porting between consoles being a problem again.

Rumour has it that both Durango and Orbis are going to be based on the same tech.

Also, about the APU + 8770 rumour: while I agree those are decent specs for next gen (a tad low, though), I'm more excited about the fact that they have moved to Jaguar instead of Bobcat. This opens up a range of possibilities for the GPU size of the APU alone, which makes it possible to leave out the 8770 entirely.

I would imagine this would help a lot with keeping the power usage very low.

Also, the shift from older AMD tech to GCN is a very good sign :)

Obviously this is all rumour though... but given that these rumours make a lot of sense, I would say this is pretty much spot on. The main thing to keep in mind when discussing early dev kits is that they are usually only used to mimic the final specs, which makes the lone APU theory entirely plausible.



Intel Core i7 3770K [3.5GHz]|MSI Big Bang Z77 Mpower|Corsair Vengeance DDR3-1866 2 x 4GB|MSI GeForce GTX 560 ti Twin Frozr 2|OCZ Vertex 4 128GB|Corsair HX750|Cooler Master CM 690II Advanced|