Personal Comment - This dev is basically explaining what was presented by Mark Cerny on the benefits of the lower CU count at higher frequency, lower level access and more specialized code. I'll paste the whole interview below so you don't need to click the link, but if moderation decides that is breaking copyright I'll remove.
Edit 2: There was a twitter translation feed but unfortunately it was taken down by its OP
Edit 3: The original interview got taken down.
Edit 4: Since the original is taken down a user here managed to capture the original in an image, so I'm posting the link to his post.
OK Here we go! It is a long one but full of info.
The hardware specifications of the PlayStation 5 and Xbox Series X were officially announced a few weeks ago by Sony and Microsoft, and Digital Foundry had the opportunity to take a deep technical look at what we can expect. Although there aren't many games for the consoles yet, and we don't know much about their overall performance and user experience, the two companies keep competing in complex technical debates that no one but engineers and programmers can understand. This time around they have not shied away from releasing the deepest technical information.
As we tracked down the information, read the specifications and searched for more on the matter, it seemed better to talk with an engineer and programmer at Crytek, one of the world's most tech-savvy companies, with a powerful gaming engine. That's why I called Ali Salehi, a rendering engineer from Crytek, and asked him, as an expert, to answer our questions about the Xbox's Teraflops advantage over the PS5 and the power of the consoles, and to comment on which one is more powerful. He gave convincing answers, with simple and understandable explanations, that ran contrary to expectations and the numbers on paper.
In the following, you will read the conversation between Mohsen Vafnejad and Shayan Ziaei with Ali Salehi about the hardware specifications of the PlayStation 5 and Xbox Series X.
[Questions bolded, answers not]
Vijayato: In short, what is the job of a rendering engineer in a gaming company?
Ali Salehi: We handle the technical visual side of each game. That means supporting new consoles, optimizing and troubleshooting current algorithms, and implementing new technology and features like ray tracing.
What is the significance of Teraflops, and does higher Teraflops mean a console is stronger?
The Teraflops figure shows how capable a processor can be in its best, most ideal state. It describes ideal, theoretical conditions. In practice, however, a graphics card or console is a complex entity that rarely reaches its full potential. Several elements must work together in harmony, each feeding its output to the next. If any one of these elements fails to work properly, the efficiency of the others decreases. A good example of this is the PlayStation 3. Because of its SPUs, the PlayStation 3 had a lot more power on paper than the Xbox 360. But in practice, because of its complex architecture, memory bottlenecks and other problems, you never reached peak efficiency.
There is an image here with the following text:
[Woes of PlayStation 3
The PlayStation 3 had a hard time running multi-platform games compared to the Xbox 360. Red Dead Redemption and GTA IV, for example, ran at 720p on the Microsoft console, but the PlayStation 3 rendered at a lower resolution and upscaled the output to 720p. But Sony's own studios have been able to offer more detailed games such as The Last of Us and Uncharted 2 and 3 due to their greater familiarity with the console and the development of special software accessibility.]
That is why it is not a good idea to base our opinions only on numbers. But if all the parts in the Xbox Series X can work optimally and the GPU works at its peak, which is not possible in practice, we can achieve 12 TFlops. In addition to all this, there is also the software side. An example is the advent of Vulkan and DirectX 12. The hardware did not change, but because the software architecture changed, the hardware could be put to better use.
The same can be said for consoles. Sony runs the PlayStation 5 on its own operating system, but Microsoft has put a customized version of Windows on the Xbox Series X. The two are very different. Because Sony has developed exclusive software for the PlayStation 5, it will definitely give developers many more capabilities than Microsoft, which uses almost the same DirectX on PC and on its consoles.
How have you experienced working with both consoles and how do you evaluate them?
I can't say anything right now about my own work, so I'm quoting others who have made public statements. Developers say that the PlayStation 5 is the easiest console they’ve ever coded for, so they can reach the console's peak performance. In terms of software, coding on the PlayStation 5 is extremely simple and has many features which leave a lot of options for developers. All in all, the PlayStation 5 is a better console.
If I understood correctly, is Teraflops the final defining factor of GPU power? Or what do these floating points mean? How would you describe it for a user who doesn't understand all of this?
I think it was a bad PR move to put all this information out. This technical information does not matter to the average user and is not a final judgement of GPU power.
A graphics card, for example, has some 20 different sections, one of which is the Compute Units, which perform the processing. If all the other components are put to the best possible use, there are no other restrictions, there is no memory bottleneck, and the processor always has the data it needs, 12 Tflops can be achieved. So in an ideal world where we remove all the limiting parameters, that's possible, but this is not that world. (He means we cannot remove all bottlenecks, so 12 Tflops only remains on paper.)
A good example of this is the Xbox Series X hardware. Microsoft uses two separate pools of RAM, the same mistake they made with the Xbox One. One pool has high bandwidth and the other has lower bandwidth. As a result, coding for the console is sometimes problematic, because there is so much that has to go into the faster pool that it becomes a nuisance again. To add insult to injury, 4K output needs even more bandwidth. So there will be some factors that bottleneck the XSX’s GPU.
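To put rough numbers on the two-pool point, here is a minimal sketch. The bandwidth figures are not from the interview: they are the publicly announced Series X memory specs (a 10 GB pool at 560 GB/s and a 6 GB pool at 336 GB/s). The model itself is an assumption made for illustration: transfers from the two pools do not overlap, so the average rate is a time-weighted blend. Real behaviour depends on access patterns and will differ.

```python
# Toy model of average memory bandwidth with a fast and a slow pool.
# Assumed figures (public Series X specs, not from the interview):
FAST_POOL_GBPS = 560.0  # 10 GB pool
SLOW_POOL_GBPS = 336.0  # 6 GB pool


def effective_bandwidth(slow_fraction: float) -> float:
    """Average GB/s when `slow_fraction` of all bytes come from the slow pool.

    Assumes transfers from the two pools are serialized (no overlap).
    """
    if not 0.0 <= slow_fraction <= 1.0:
        raise ValueError("slow_fraction must be between 0 and 1")
    # Time needed to move one gigabyte of mixed traffic.
    seconds_per_gb = (slow_fraction / SLOW_POOL_GBPS
                      + (1.0 - slow_fraction) / FAST_POOL_GBPS)
    return 1.0 / seconds_per_gb


if __name__ == "__main__":
    for fraction in (0.0, 0.25, 0.5, 1.0):
        print(f"{fraction:.0%} slow-pool traffic: "
              f"{effective_bandwidth(fraction):.0f} GB/s")
```

Under this simplified model, even a modest share of traffic landing in the slow pool pulls the average well below the headline figure, which is one way to read the complaint about having to fit everything important into the faster pool.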
You talked about the CUs. The PlayStation 5 has 36 CUs, and the Xbox Series X has 52 CUs available to the developer. What is the difference?
The main difference is that the PlayStation 5's CUs work at a much higher frequency. That's why, despite the difference in CU count, the two consoles’ performance is almost the same. An interesting analogy from an IGN reporter was that the Xbox Series X GPU is like an 8-cylinder engine, and the PlayStation 5 is like a turbocharged 6-cylinder engine. Raising the clock speed on the PlayStation 5 seems to me to have a number of benefits, for example for memory management, rasterization, and other elements of the GPU whose performance depends on frequency, not CU count. So in some scenarios the PlayStation 5's GPU works faster than the Series X's. That also lets the console's GPU run more often at its announced peak of 10.28 Teraflops. But for the Series X, because the rest of the elements are slower, it will probably not reach its 12 Teraflops most of the time, and will only reach 12 Teraflops in highly ideal conditions.
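For readers wondering where the 10.28 and 12 Teraflops figures come from, here is a small sketch of the usual paper calculation. It relies on assumptions that are not stated in the interview: 64 shader lanes per CU, 2 floating-point operations per lane per clock (one fused multiply-add), and the publicly announced GPU clocks of 2.23 GHz (PS5, peak) and 1.825 GHz (Series X). It only reproduces the theoretical number and says nothing about real-world performance.

```python
# Theoretical peak FP32 throughput, the "on paper" Teraflops number.
# Assumptions (not from the interview): 64 lanes per CU, 2 FLOPs per
# lane per clock (fused multiply-add), publicly announced clock speeds.
LANES_PER_CU = 64
FLOPS_PER_LANE_PER_CLOCK = 2


def shader_count(compute_units: int) -> int:
    """Number of shader lanes for a given CU count."""
    return compute_units * LANES_PER_CU


def peak_tflops(compute_units: int, clock_ghz: float) -> float:
    """Peak TFLOPS = lanes * FLOPs per lane per clock * clock in GHz / 1000."""
    gflops = shader_count(compute_units) * FLOPS_PER_LANE_PER_CLOCK * clock_ghz
    return gflops / 1000.0


if __name__ == "__main__":
    print(f"PS5:      {shader_count(36)} shaders, {peak_tflops(36, 2.23):.2f} TFLOPS")
    print(f"Series X: {shader_count(52)} shaders, {peak_tflops(52, 1.825):.2f} TFLOPS")
```

The formula shows why fewer CUs at a higher clock can land close to more CUs at a lower clock: the two terms simply multiply.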
Doesn't this difference decline at the end of the generation, when developers become more familiar with the Series X hardware?
No, because the PlayStation API generally gives devs more freedom, and usually at the end of each generation, Sony consoles produce more detailed games. For example, in the early seventh generation, even multi-platform games released on both consoles performed poorly on the PlayStation 3. But late in the generation Uncharted 3 and The Last of Us came out on the console. I think the next generation will be the same. But generally speaking the XSX should have less trouble pushing more pixels. (He emphasizes “only” pixels.)
Sony says the smaller the number of CUs, the more you can integrate the tasks. What does Sony's claim mean?
It costs resources to use all the CUs at the same time, because CUs need resources that are allocated to the GPU when they want to run code. If the GPU fails to distribute the resources across all the CUs to execute code, it will be forced to leave a number of CUs unused. For example, instead of 52, it uses 20 of them, because the GPU doesn't have enough resources for all CUs at all times.
Aware of this, Sony has used a faster GPU instead of a larger GPU to reduce allocation costs. A more striking example of this was in CPUs. AMD has had high-core-count CPUs for a long time. Intel, on the other hand, has used fewer but faster cores. Intel CPUs with fewer but faster cores perform better in gaming. Clearly, a 16- or 32-core CPU has a higher number of Teraflops, but a CPU with faster cores will definitely do a better job. Because it's hard for games and programmers to use all the cores all the time, they prefer to have fewer but faster cores.
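The fewer-but-faster argument can be illustrated with Amdahl's law, which the interview does not mention by name; it is brought in here only as a familiar way to reason about the trade-off. In the sketch below the core counts, clocks and the parallel fraction are all hypothetical values chosen for illustration, and performance is assumed to scale linearly with clock speed.

```python
# Amdahl's law sketch: many slow cores versus fewer fast cores.
# All numbers below are hypothetical and chosen only for illustration.


def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_throughput(parallel_fraction: float, cores: int, clock_ghz: float) -> float:
    """Speedup scaled by clock, assuming performance is linear in clock speed."""
    return amdahl_speedup(parallel_fraction, cores) * clock_ghz


if __name__ == "__main__":
    parallel_fraction = 0.7  # hypothetical: 70% of the frame's work parallelizes
    many_slow = relative_throughput(parallel_fraction, cores=16, clock_ghz=3.0)
    few_fast = relative_throughput(parallel_fraction, cores=8, clock_ghz=4.5)
    print(f"16 cores @ 3.0 GHz: {many_slow:.2f}")
    print(f" 8 cores @ 4.5 GHz: {few_fast:.2f}")
```

With these made-up inputs the 8 faster cores come out ahead, because the serial part of the work only benefits from clock speed. A workload with a much higher parallel fraction would tip the result the other way.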
Could the Hyperthreading feature included in the Series X be Microsoft's winning ace at the end of the generation?
Technically, hyperthreading has been on desktop computers since the Pentium 4. It presents each physical core to the system as two virtual cores, and in most cases it helps performance. The Series X lets the developer decide whether to use these virtual cores or to turn them off in exchange for a higher CPU clock. So that's exactly what you're saying. But it's not easy to make that decision from the start, so hyperthreading is likely to be used later in the generation, not at first.
Can you elaborate?
That is, making that decision requires very accurate analysis of how the code executes, and that's not something everyone knows how to do right now. There are now much more important concerns in getting to know the console hardware, and developers are likely to work with a smaller number of cores at a higher clock at the beginning of the next generation, and then move on to using SMT (Hyperthreading).
The Xbox Series X's compute units contain 3328 shaders. What is a shader, what does it do, and what does 3328 shaders mean?
When developers want to execute code, they do so through units called Wavefronts. Multiply the number of CUs by the number of Wavefronts and you have the number of shaders. But it doesn't really matter, and everything I said about the CUs applies here. Again, there are limitations that make it impossible to use all of these shaders, and having many of them at once isn't necessarily good.
There is another important issue to consider, as Mark Cerny put it. CUs or even Teraflops are not necessarily the same between all architectures. That is, Teraflops cannot be compared between devices to decide which one is actually superior numerically. So you can't trust these numbers and call it a day.
Comparisons between Android devices and Apple iPhones have also recently come up as an analogy to the consoles, with Internet discussions suggesting that Android phones have more RAM but poorer performance than iPhones. Is the comparison between the two and the consoles correct?
Software stacks that are placed on top of the hardware determine everything. As performance updates increase exponentially, so do they. Sony has always had better software because Microsoft has to use Windows. So that's right.
Microsoft has insisted that the Xbox Series X frequency is constant under any circumstances, but Sony does not take that approach: it gives the console a fixed power budget and lets the frequency vary depending on the situation. What are the differences between the two and which will be better for the developer?
What Sony has done is much more logical, because the console decides whether the GPU or the CPU gets the higher frequency at a given time, depending on the processing load. For example, on a loading screen, only the CPU is needed and the GPU is not used. Or in a close-up scene of a character's face, the GPU gets involved and the CPU plays a very small role. On the other hand, it's good that the Series X has good cooling, guarantees a constant frequency and doesn't throttle, but the practical freedom that Sony has given is really a big deal.
Doesn't this freedom of action make things harder for the developer?
Not really, because we're already doing that in the engine. For example, the Dynamic Resolution Scaling technique used by some games already measures different elements, including how much pressure the GPU is under and how low the resolution must go to hold the frame rate steady. So it's very easy to connect these together.
What is the use of the Geometry Engine that Sony is talking about?
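As a rough idea of what Dynamic Resolution Scaling looks like in code, here is a minimal sketch. It is not taken from any real engine: the function name, the 16.6 ms target, the scale limits and the damping factor are all hypothetical, and it assumes GPU cost grows roughly with the number of pixels, which is the square of the per-axis resolution scale.

```python
# Minimal Dynamic Resolution Scaling controller (illustrative only).
# Assumption: GPU frame time is roughly proportional to pixel count,
# i.e. to the square of the per-axis resolution scale.


def next_resolution_scale(scale: float,
                          gpu_frame_ms: float,
                          target_ms: float = 16.6,
                          min_scale: float = 0.5,
                          max_scale: float = 1.0,
                          damping: float = 0.5) -> float:
    """Return the per-axis resolution scale to use for the next frame."""
    if gpu_frame_ms <= 0.0:
        raise ValueError("gpu_frame_ms must be positive")
    # Scale that would hit the target exactly under the pixel-count assumption.
    ideal = scale * (target_ms / gpu_frame_ms) ** 0.5
    # Move only part of the way there to avoid oscillating every frame.
    new_scale = scale + damping * (ideal - scale)
    # Keep the result inside the allowed range.
    return max(min_scale, min(max_scale, new_scale))


if __name__ == "__main__":
    scale = 1.0
    for measured_ms in (16.6, 22.0, 25.0, 18.0, 14.0):
        scale = next_resolution_scale(scale, measured_ms)
        print(f"GPU time {measured_ms:5.1f} ms -> next scale {scale:.2f}")
```

An engine that already runs a loop like this reacts to measured GPU load each frame, which is why a variable GPU clock can be treated as just one more input that changes the measured frame time.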
I don't think it will be very useful in the first year or two. We'll probably see more of an impact for the second wave of games released on this console, but it doesn't have much use at the start.
The Series X chipset is 7 nanometers, and we know that the smaller the number, the better the chipset. Can you explain the nanometer figure and transistors?
Lowering the nanometer figure means more transistors, and having to control their heat in larger numbers and smaller spaces. A better production technology helps, but the nanometer number itself is not very important; what matters is the number of transistors.
PlayStation 5 SSD speeds reach 8-9 GB/s in peak mode. Now that we've reached this speed, what else will happen apart from loading games and more details?
The first thing is removing loading screens from games. Microsoft also showed the ability to suspend and resume games, which can keep multiple games suspended at once and switch between them in less than 5-6 seconds. This time will be under a second on PlayStation. Another thing that can be expected is a change in game menus. When there is no loading, of course, there is no waiting, and you no longer need to watch a video while the game loads in the background.
How will the games on PC be in the meantime? Because having an SSD is a choice for a PC user.
Consoles have always determined what the standard is. Game developers also build games based on consoles, and if someone has a PC and doesn't have an SSD on it, they have to deal with long loads or think about buying an SSD.
As a programmer and developer, which do you consider the best console for working and coding? PlayStation 5 or Xbox Series X?
Definitely PlayStation 5.
As a programmer, I would say that the PlayStation 5 is much better, and I don't think you can find a programmer who chooses the XSX over the PS5. For the Xbox, they have to put DirectX and Windows on the console, which are many years old, but for each new console that Sony builds, it also rebuilds the software and APIs in any way it wants. It is in their interest and in our interest. Because there is only one way to do everything, and theirs is the best way possible.