I posted a reply in another thread trying to explain this and figured I should make a thread so we could discuss this stuff. I'm considering making a series of threads all starting with "tech talk" where we can discuss certain hardware, software, rendering, and middleware related topics, in the hope that those who find stuff like this interesting will learn from them. So this is the first of the tech talk series. And please, anyone is free to make a tech talk thread and talk about anything tech related concerning the PS4, XB1, WiiU or PC.
This talk is primarily focused on PS4 GPU compute, but I touch on the Xbox ESRAM lightly (it deserves its own thread) and on the Assassin's Creed Unity 900p thing. Enjoy.
First off, yes, using compute means you will have less room for typical GPU tasks, because it runs on the same GPU pipeline. But that's just half the story. There are two ways that compute can be used.
- Render pipeline assisted compute. In this case you would be using just a few of the 64 available compute queues, and thus only a little of the GPU, to handle tasks that in turn boost the overall yield and performance of the GPU. In an example Cerny gave:
Say you are running a game that is extremely geometry heavy on the GPU. You can add a compute pass to the render pipeline that identifies all the front-facing polygons (the ones the gamer can actually see) of every geometry object in the scene, and the GPU will then render only those polygons and omit the rest, even though it's getting the complete geometry render instruction set from the CPU. This way it would be possible to render 3-5 times more polygons (not sure of the exact factor, but basically more) than your engine would typically have handled, or the GPU will spend less time on that specific task, giving it more frame time for other work.
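That front-facing test boils down to a dot product between a triangle's normal and the view direction. Here's a minimal sketch of the idea in plain Python; the real thing would run as a compute shader over every triangle in parallel, and the triangle data here is entirely made up:

```python
# Sketch of backface culling: a triangle is front-facing when its
# normal points toward the camera, i.e. dot(normal, view) < 0.
# A GPU compute pass would run this test per-triangle in parallel;
# this is just the same math over a made-up triangle list.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def front_facing(tri, camera):
    v0, v1, v2 = tri
    normal = cross(sub(v1, v0), sub(v2, v0))  # from winding order
    view = sub(v0, camera)                    # camera -> triangle
    return dot(normal, view) < 0              # facing the camera

camera = (0.0, 0.0, 5.0)
triangles = [
    ((0, 0, 0), (1, 0, 0), (0, 1, 0)),  # winding faces the camera
    ((0, 0, 0), (0, 1, 0), (1, 0, 0)),  # reversed winding -> back-facing
]
visible = [t for t in triangles if front_facing(t, camera)]
print(len(visible))  # -> 1: only the front-facing triangle survives
```

Only the surviving list would then be handed to the render pipeline, which is where the time saving comes from.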
This could be used in more elaborate ways, like marking out certain objects in a scene for AA passes. The small slice of the GPU used to pick out the parts of a frame that wouldn't need an AA pass, shadow detail, etc. ends up freeing GPU power to do more than it otherwise could, or to do something else entirely.

- Idle time compute. Contrary to what some may think, the GPU is not 100% active all of the time. There are stretches where only somewhere between 60-80% of the GPU is being used. To put this in perspective, you need to understand exactly what is being referred to here.
Take a game running at 30fps. The CPU/GPU has about 33ms to produce each frame. That doesn't mean the GPU spent 33ms rendering the frame; don't forget the GPU is just one part of the equation. It had to wait for the updated scene render instructions from the CPU, so out of those 33ms the GPU could very well sit idle for anywhere between 10ms and 20ms. That idle time could be used for GPU compute, taking some of the workload off the CPU. The CPU then finishes its work faster, cutting the GPU's idle time from 10-20ms down to, say, 5-10ms. This is how compute on the PS4 can assist the CPU, which in turn can lead to higher or more stable framerates in games, or just more physics/AI/lighting calculations all around.
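The frame-budget arithmetic above can be sketched with numbers. Only the 33.3ms budget is a real figure (it follows from 30fps); the CPU/GPU millisecond splits below are made-up examples under a deliberately naive serial model:

```python
# Illustrative frame-time budget at 30 fps. Everything except the
# 33.3 ms budget is a made-up example figure.
FRAME_BUDGET_MS = 1000 / 30   # ~33.3 ms per frame at 30 fps

def gpu_idle(cpu_ms, gpu_ms):
    # Naive serial model: the GPU waits for the CPU to hand over
    # the frame's render instructions, then renders. Its idle time
    # is the CPU wait, capped by what's left of the budget after
    # the GPU's own rendering work.
    return min(cpu_ms, FRAME_BUDGET_MS - gpu_ms)

# Before offload: CPU takes 20 ms, GPU renders in 13 ms.
print(gpu_idle(20.0, 13.0))   # -> 20.0 ms of GPU idle per frame

# After moving (say) 5 ms of CPU work onto the GPU as compute
# during that idle window, the CPU hands over the frame sooner:
print(gpu_idle(15.0, 13.0))   # -> 15.0 ms of GPU idle per frame
```

The point of the toy model: shrinking the CPU's share of the frame directly shrinks the window where the GPU sits doing nothing.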
AC: UNITY SECTION (if you understand the above point about CPU/GPU render times, then you will also understand why what Ubisoft said about the CPU workload being responsible for AC: Unity running at 900p on both consoles is just bullshit. A heavy CPU load means the GPU has less time to do its thing; that is far more likely to affect framerate than resolution, because of how GPUs work.

This is where it gets interesting. The only way Ubisoft's claim is true is if the XB1 CPU is better than the PS4's, or if the game is better optimized for the XB1 [which brings us back to the whole parity nonsense]. Say the CPU-heavy load takes 25ms to finish its work for the next frame before passing the render instructions to the GPU. Lowering the resolution would then mean they wanted to give the GPU less work so it could spit out the frame in time and still hit that 33ms limit.

But this is where the story falls apart. If each GPU has only 8ms to complete its task and the XB1 manages a 900p frame in that window, what happened to the roughly 40% extra GPU power the PS4 has? The PS4 should be able to complete the exact same task about 40% faster than the XB1. So if the time is constant, they could simply have given the PS4 more work and it would still have met the render time limit. Unless, of course, they want you to believe that the XB1 CPU finished its work much, much faster than the PS4's, leaving the PS4's more powerful GPU less time to render the frame, so that all its extra power went into merely matching the XB1. Which simply isn't the case.)
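The pixel-count arithmetic behind that argument is worth spelling out: 1080p has about 44% more pixels than 900p, which lines up closely with the PS4's ~40% GPU advantage. A quick check (the 8ms GPU window is the made-up figure from the paragraph above, and this assumes render cost scales roughly with pixel count, which is a simplification):

```python
# Pixel counts per frame at the standard 16:9 resolutions.
px_900p = 1600 * 900      # 1,440,000 pixels (AC: Unity's output)
px_1080p = 1920 * 1080    # 2,073,600 pixels

extra = px_1080p / px_900p - 1
print(f"1080p has {extra:.0%} more pixels than 900p")  # -> 44%

# If the XB1 GPU fills a 900p frame in its slice of the budget,
# a GPU ~40% faster finishes the same work in far less time
# (assumption: fill cost scales roughly with pixel count).
xb1_gpu_ms = 8.0                 # made-up render window from the post
ps4_gpu_ms = xb1_gpu_ms / 1.4    # ~40% more throughput
print(f"PS4 time for the same 900p frame: {ps4_gpu_ms:.1f} ms")  # -> 5.7 ms
```

So under the post's own numbers, the PS4 would have roughly 2ms of its 8ms window left over, which is in the ballpark of what pushing 900p up to 1080p would cost.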