
I think some people should put more trust in what the PS4's architect says than in what the MS spin machine tells them about the PS4. Ethomaz already posted Cerny's words; I really don't understand where this "PS4 is designed with 14 CUs in mind for graphics, just like XOne" idea comes from.

Reminder:

PS4's GPU is a 1152:72:32 config (shaders:TMUs:ROPs) with 176 GB/s bandwidth (18 CUs); the 7850 is 1024:64:32 with 153.6 GB/s (16 CUs); the 7870 is 1280:80:32 with 153.6 GB/s (20 CUs).

XOne is 768:48:16 (12 CUs); the 7770 is 640:40:16 (10 CUs) and the 7790 is 896:56:16 (14 CUs).
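
For anyone who wants to check the CU counts against the configs listed above, here is a quick sketch. It assumes the usual GCN layout of 64 shader lanes and 4 TMUs per CU; the function and dictionary names are made up for this post.

```python
# Assumption: GCN compute units have 64 shader lanes (4 SIMDs x 16) and 4 TMUs each.
LANES_PER_CU = 64
TMUS_PER_CU = 4


def cu_count(config):
    """Derive the CU count from a 'shaders:TMUs:ROPs' string."""
    shaders, tmus, _rops = (int(x) for x in config.split(":"))
    cus, remainder = divmod(shaders, LANES_PER_CU)
    # Reject configs that do not fit the assumed per-CU layout.
    if remainder or tmus != cus * TMUS_PER_CU:
        raise ValueError(f"{config} does not match the assumed GCN layout")
    return cus


# Configs exactly as quoted in this post.
GPUS = {
    "PS4": "1152:72:32",
    "7850": "1024:64:32",
    "7870": "1280:80:32",
    "XOne": "768:48:16",
    "7770": "640:40:16",
    "7790": "896:56:16",
}

CU_COUNTS = {name: cu_count(cfg) for name, cfg in GPUS.items()}

for name, cus in CU_COUNTS.items():
    print(f"{name}: {cus} CUs")
```

Every config divides cleanly, so the shader counts and the CU counts in the list agree with each other.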



There are no 4 CUs "customized" for general-purpose tasks: if you need all 18 for gfx, you use all 18 for gfx. What the PS4 does have is additional customization that lets compute run simultaneously with gfx tasks.

"Next, to support the case where you want to use the GPU L2 cache simultaneously for both graphics processing and asynchronous compute, we have added a bit in the tags of the cache lines, we call it the 'volatile' bit. You can then selectively mark all accesses by compute as 'volatile,' and when it's time for compute to read from system memory, it can invalidate, selectively, the lines it uses in the L2. When it comes time to write back the results, it can write back selectively the lines that it uses. This innovation allows compute to use the GPU L2 cache and perform the required operations without significantly impacting the graphics operations going on at the same time -- in other words, it radically reduces the overhead of running compute and graphics together on the GPU."

"Our overall approach was to put in a very large number of controls about how to mix compute and graphics, and let the development community figure out which ones they want to use when they get around to the point where they're doing a lot of asynchronous compute."

Cerny expects developers to run middleware -- such as physics, for example -- on the GPU. Using the system he describes above, you can run at peak efficiency, he said.
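
To make the 'volatile' bit idea from the quote a bit more concrete, here is a toy software model of it. This is only an illustration of the selective invalidate / selective write-back behaviour Cerny describes, not how the actual hardware is built; the class and method names are invented, and the cache is just a dictionary keyed by address.

```python
from dataclasses import dataclass


@dataclass
class Line:
    data: int
    dirty: bool = False
    volatile: bool = False  # the extra tag bit: set for compute accesses


class ToyL2:
    """Toy write-back cache shared by graphics and compute (illustration only)."""

    def __init__(self, memory):
        self.memory = memory  # stands in for system memory: address -> value
        self.lines = {}       # address -> Line

    def read(self, addr, volatile=False):
        # On a miss, fetch from memory and tag the line with the caller's bit.
        if addr not in self.lines:
            self.lines[addr] = Line(self.memory[addr], volatile=volatile)
        return self.lines[addr].data

    def write(self, addr, value, volatile=False):
        self.lines[addr] = Line(value, dirty=True, volatile=volatile)

    def invalidate_volatile(self):
        # Drop only clean compute-tagged lines; graphics lines stay cached.
        self.lines = {
            a: l for a, l in self.lines.items() if not l.volatile or l.dirty
        }

    def writeback_volatile(self):
        # Flush only dirty compute-tagged lines back to memory.
        for addr, line in self.lines.items():
            if line.volatile and line.dirty:
                self.memory[addr] = line.data
                line.dirty = False


memory = {0: 10, 1: 20, 2: 0}
l2 = ToyL2(memory)
l2.read(0)                      # graphics access, not tagged
l2.read(1, volatile=True)       # compute access, tagged volatile
memory[1] = 99                  # something else updates system memory
l2.invalidate_volatile()        # compute drops only its own lines
fresh = l2.read(1, volatile=True)
l2.write(2, 7, volatile=True)   # compute result
l2.writeback_volatile()         # only compute's dirty line is flushed
print(fresh, memory[2], 0 in l2.lines)
```

The point of the sketch is that neither operation touches the graphics line at address 0, which is the "without significantly impacting the graphics operations" part of the quote.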

"If you look at the portion of the GPU available to compute throughout the frame, it varies dramatically from instant to instant. For example, something like opaque shadow map rendering doesn't even use a pixel shader, it’s entirely done by vertex shaders and the rasterization hardware -- so graphics aren't using most of the 1.8 teraflops of ALU available in the CUs. Times like that during the game frame are an opportunity to say, 'Okay, all that compute you wanted to do, turn it up to 11 now.'"

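The 1.8 teraflops figure in the quote can be reproduced with simple arithmetic. The sketch below assumes an 800 MHz GPU clock (widely reported for the PS4, but not stated in this post), 64 shader lanes per CU, and one fused multiply-add (2 FLOPs) per lane per cycle; the function name is made up.

```python
LANES_PER_CU = 64        # assumption: GCN layout, 64 shader lanes per CU
FLOPS_PER_LANE_CYCLE = 2  # one fused multiply-add counts as 2 FLOPs


def peak_tflops(cus, clock_mhz=800.0):
    """Theoretical single-precision peak for a given CU count.

    clock_mhz defaults to the assumed 800 MHz PS4 GPU clock.
    """
    flops = cus * LANES_PER_CU * FLOPS_PER_LANE_CYCLE * clock_mhz * 1e6
    return flops / 1e12


all_cus = peak_tflops(18)       # all 18 CUs
fourteen_cus = peak_tflops(14)  # what a hypothetical 14-CU split would leave

print(f"18 CUs: {all_cus:.2f} TFLOPS")
print(f"14 CUs: {fourteen_cus:.2f} TFLOPS")
```

Under these assumptions 18 CUs come out at roughly 1.84 TFLOPS, so Cerny's 1.8 figure only adds up if all 18 CUs are counted.
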

http://www.gamasutra.com/view/feature/191007/