D-Joe said:
Cost-saving? eSRAM is expensive
|
It's only expensive relative to its capacity, but not in the context of the entire "GPU kit". The Xbox 360 GPU "kit" cost $141 in 2005, including the eDRAM:
http://www.xbitlabs.com/news/multimedia/display/20051123214405.html
In 2010, 2GB of GDDR5 alone cost nearly $50 at cost, going by AMD/Mercury Research figures (https://www.google.com/search?q=mercury+research+graphics+card+cost&hl=en&tbo=d&source=lnms&tbm=isch&sa=X&ei=0XQHUf6GLLSvygHX5YH4Bg&ved=0CAcQ_AUoAA&biw=1920&bih=989#imgrc=g3aafU_5vjHpwM%3A%3BIFzhtEtu0xhYnM%3Bhttp%253A%252F%252Fi.imgur.com%252F4u06C.jpg%3Bhttp%253A%252F%252Fforums.anandtech.com%252Fshowthread.php%253Fp%253D34458518%3B881%3B339)
^ That's nearly as much for 2GB of GDDR5 as the entire PS3 GPU plus all of the PS3's memory combined. By that point, the entire Xbox 360 GPU was probably also worth no more than $45-50 (http://www.xbitlabs.com/news/multimedia/display/20091215232901_Sony_Still_Sells_PlayStation_3_at_a_Loss_Analysis.html)
If eDRAM+DDR3 were the superior approach, how come no high-end GPU has ever used such a setup? Also, we already have real-world evidence that eDRAM does not make up for a memory bandwidth bottleneck: the Xbox 360's GPU failed to deliver the "free" 4x anti-aliasing that its eDRAM promised.
ethomaz said:
@BlueFalcon
It have 32 ROPs and 72 TMUs... a true GCN architecture... there is no fake specs.
|
The specs say 8 ROPs and 18 TMUs. That makes no sense unless they forgot to multiply those by 4 (which would give the 32 ROPs and 72 TMUs you mention). Also, the idea of splitting the GPU's compute units into a 14+4 setup makes no sense whatsoever, because all compute units inside GCN are identical.
You are confusing MB and Mb. In 2008 they had 512Mb (megabit) chips, which works out to 512/8 = 64MB per chip, not 512MB chips. The highest-density GDDR5 chip right now is 2Gbit, i.e. 256MB. You need 16 of those to get to 4GB of GDDR5.
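To make the bit/byte math explicit, here is a quick back-of-envelope sketch in Python (nothing beyond the conversion above):

# GDDR5 chip densities are quoted in megabits (Mb) / gigabits (Gb), not megabytes (MB).
def chip_capacity_mb(density_megabits):
    return density_megabits / 8              # 8 bits per byte

print(chip_capacity_mb(512))                 # 2008-era 512Mb chip -> 64.0 MB
print(chip_capacity_mb(2048))                # current 2Gbit chip  -> 256.0 MB
print((4 * 1024) / chip_capacity_mb(2048))   # chips needed for 4GB -> 16.0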
ethomaz said:
The PS3 uses 8x512MB GDDR5 (or 4Gbit like you said) to get ~170GB/s.... GTX 680 uses 16 chips because they need a 512bits bus width for near 380GB/s brandwidth... it's easy to make maths.
8 x 32bits = 256bits bus width = 190GB/s at high speeds
16 x 32bits = 512bits bus width = 380GB/s at high speeds
nVidia choose 16 x 256MB chips because they need to reach 380GB/s bandwidth... not because there is no 512MB chip.
|
The GTX680 has a 256-bit bus with 192GB/sec of bandwidth, not a 512-bit bus with 380GB/sec.
Also, you are way oversimplifying things. Adding more chips doesn't automatically increase bandwidth; that's not how GPUs work:
GTX680: 256-bit bus, 2GB of GDDR5
GTX660Ti: 192-bit bus, 2GB of GDDR5
http://www.anandtech.com/show/6159/the-geforce-gtx-660-ti-review
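For reference, GDDR5 bandwidth is just bus width times effective data rate, and the chip count never enters into it. A rough sanity-check sketch, using the GTX680's stock ~6GHz effective memory speed:

# Bandwidth (GB/s) = bus width in bytes * effective data rate per pin (GT/s).
def gddr5_bandwidth_gb_s(bus_width_bits, effective_gtps):
    return bus_width_bits / 8 * effective_gtps

print(gddr5_bandwidth_gb_s(256, 6.0))   # GTX680: ~192 GB/s
print(gddr5_bandwidth_gb_s(512, 6.0))   # hypothetical 512-bit card at the same speed: ~384 GB/s
print(gddr5_bandwidth_gb_s(192, 6.0))   # GTX660Ti-style 192-bit bus: ~144 GB/s

Chip count only determines capacity, which is why a 2GB card and a 4GB card can have exactly the same bus width and bandwidth.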
HD7870 has 4 memory controllers, 64-bit each, that support dual-channel.
4x 64-bit channels = 256-bit bus
4x channels in dual-channel mode = 4x (2x 256MB) = 8x 256MB chips = 2048MB in total. Because the GTX680 also has 4 memory controllers of 64-bit each, it also comes with 2048MB of GDDR5 by default. To get to 4GB, 8 more 256MB chips are added to the back of the board, but the bus width doesn't grow from 256-bit to 512-bit.

Likewise, if the HD7870 (HD7970M) ships with 4GB of GDDR5, it doesn't mean the bus width grows from 256-bit to 512-bit. The maximum bus width is dictated by the GPU's internal memory controllers, not by how many memory chips it ships with. The design of the Pitcairn GPU (HD7870) dictates a 4x 64-bit = 256-bit bus.

The only other way to increase memory bandwidth is to replace the 4800MHz GDDR5 chips with, say, 6000MHz ones, bumping memory bandwidth from ~154GB/sec to 192GB/sec. That doesn't matter much though, because the HD7870 is not memory bandwidth limited by design. How do we know?
Because when the HD7870 gets 17% more shading power (1536 SPs vs. 1280) and its memory speed goes from 4.8GHz to 6GHz (that's the HD 7870 XT Black in the review below), performance only increases about 9% (100% vs. 92%):
http://www.techpowerup.com/reviews/VTX3D/Radeon_HD_7870_XT_Black/28.html
Also, the HD7950 has 1792 shaders and a 384-bit bus, roughly 50% more memory bandwidth than the HD7870, yet performance is only about 11% higher (same graph: 102% for the 7950 vs. 92% for the 7870).
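Putting rough numbers on that scaling (the performance figures are the ones quoted from the TechPowerUp graph above; this is just a sanity check, not new data):

# Resource increase vs. measured performance increase, both relative to the stock HD7870.
cases = {
    "HD7870 XT: +17% shading power, +25% bandwidth": 100 / 92 - 1,
    "HD7950:    ~+50% bandwidth":                    102 / 92 - 1,
}
for name, perf_gain in cases.items():
    print(f"{name} -> only +{perf_gain:.1%} performance")
# If the HD7870 were bandwidth-limited, a 25-50% bandwidth bump would buy a lot more than ~9-11%.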
Getting above the HD7870's standard memory bandwidth of 154GB/sec to, say, 176GB/sec (bumping the GDDR5 from 4.8GHz to 5.5GHz) is honestly a waste of power, since the GPU is bottlenecked elsewhere first. What it needs are more CUs (and thus more shaders) and TMUs. If developers use the compute functionality of GCN on the PS4/Xbox 720, then the number of Compute Units and the GPU clock will matter more than having >154GB/sec of memory bandwidth.
You can see this in a compute heavy game like Dirt Showdown:
HD7950 Black: 900MHz, 28 CUs, 264GB/sec bandwidth
HD7970: 925MHz, 32 CUs, 264GB/sec bandwidth
HD7970 GHz: 1050MHz, 32 CUs, 288GB/sec bandwidth
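Those three cards differ far more in raw compute than in bandwidth, which is the point, assuming the game's advanced lighting really is compute-bound as argued above. A quick sketch using only the listed specs plus the fact that each GCN CU has 64 shaders doing up to 2 FLOPs per clock:

# Peak compute for a GCN GPU: CUs * 64 shaders * 2 FLOPs per shader per clock.
def gcn_gflops(cus, clock_mhz):
    return cus * 64 * 2 * clock_mhz / 1000.0

cards = {
    "HD7950 Black": (28, 900, 264),    # (CUs, core MHz, GB/s bandwidth) as listed above
    "HD7970":       (32, 925, 264),
    "HD7970 GHz":   (32, 1050, 288),
}
for name, (cus, mhz, bw) in cards.items():
    print(f"{name}: ~{gcn_gflops(cus, mhz):.0f} GFLOPS of compute, {bw} GB/s of bandwidth")
# Compute spans roughly 3226 -> 4301 GFLOPS (~33%), while bandwidth spans only ~9%.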
That's why I said earlier that the difference between a 12 CU AMD GPU in the Xbox 720 and an 18 CU one in the PS4 could be substantial: that's 50% more CUs for the PS4, even before we get to the PS4's GDDR5 vs. the Xbox 720's DDR3 setup, which would only widen the gap for the Xbox 720's GPU (of course, these are all still rumors). If the PS4 runs on Linux, they might use OpenGL or OpenCL for games. If developers use OpenCL compute for PS4 games, the number of compute units its GPU has will matter a lot more than if the PS4 were running a Windows OS with a traditional DX11 API. That kinda makes sense of why Sony would want a more powerful GPU, given that their OS choice is unlikely to be a Windows one.
http://en.wikipedia.org/wiki/OpenCL
Also, AMD's GPUs are faster in both OpenGL and OpenCL, which is yet another reason an NVIDIA GPU wasn't chosen for the PS4 if Sony aims to build around those APIs.
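Going back to the 12 CU vs. 18 CU point, here is the same GCN math applied to the rumored CU counts. Note that the 800MHz clock below is purely an assumed, illustrative number, not part of any rumor quoted here:

# GCN: 64 shaders per CU, 2 FLOPs per shader per clock.
# The 800MHz clock is an assumption for illustration only.
def gcn_tflops(cus, clock_mhz=800):
    return cus * 64 * 2 * clock_mhz / 1e6

xbox720_12cu = gcn_tflops(12)         # ~1.23 TFLOPS at the assumed clock
ps4_18cu     = gcn_tflops(18)         # ~1.84 TFLOPS at the assumed clock
print(ps4_18cu / xbox720_12cu)        # 1.5, i.e. 50% more raw compute for the 18 CU GPU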