

 

http://www.eurogamer.net/articles/digitalfoundry-2014-metro-redux-what-its-really-like-to-make-a-multi-platform-game

Metro Redux: what it's really like to develop for PS4 and Xbox One

Frank discussion with 4A Games about the new wave of consoles.

As tech interviews go, this one's a corker. Readers of our previous Metro 2033 and Metro Last Light tech Q&As will know that 4A Games' chief technical officer Oles Shishkovstov isn't backward about coming forward on the matters that are important to him, and in the transition across to the new wave of console hardware, clearly there are plenty of important topics to discuss.

And it's this frankness and direct, to the point honesty that always makes Oles' interviews so refreshing. In this case, 4A is the first developer willing to talk in-depth and on the record about the process of developing for the new consoles, discussing the problems and opportunities represented by the hardware and software that powers PlayStation 4 and Xbox One. Oles illuminates points that were previously the subject of rumour and hearsay, painting a picture of the challenges that face Xbox One game-makers in particular, offering us a glimpse of how Microsoft is working behind the scenes to improve the development XDK.

There's a wealth of information to sink your teeth into - the performance differential between Xbox One and PlayStation 4 of course, a frank and honest assessment of the Microsoft console's ESRAM, the implications of both CPU and GPU sharing the same memory space (and bandwidth), and observations on PC hardware and DirectX 12. There are some revelations too. Did you know that Microsoft now allows developers to bypass DX11 and talk to the hardware directly in a similar manner to Sony's GNM API? And just how much of a big deal is the return of the Kinect GPU time-slice to developers?

By the way, we were hoping to bring you our Metro Redux Face-Off today. However, some last-minute patching to the PC version means that'll have to wait. In the meantime however, we have included some of the complete console assets we've been working on. For more in-depth coverage of the console versions, our last-gen vs Redux and performance analysis pieces are worth checking out if you missed them. As things stand, we have no issue whatsoever in recommending the game - it's rather special.

Digital Foundry: In our last interview you were excited by the possibilities of the next-gen consoles. Now you've shipped your first game(s) on both Xbox One and PlayStation 4. Are you still excited by the potential of these consoles?

Oles Shishkovstov: I think what we achieved with the new consoles was a really good job given the time we had with development kits in the studio - just four months hands-on experience with Xbox One and six months with the PlayStation 4 (I guess the problems we had getting kits to the Kiev office are well-known now).

But the fact is we haven't begun to fully utilise all the computing power we have. For example we have not utilised parallel compute contexts due to the lack of time and the 'alpha' state of support on those consoles at that time. That means that there is a lot of untapped performance that should translate into better visuals and gameplay as we get more familiar with the hardware.

Of the two games, Metro 2033 sees the most extensive remastering work. In this video, we compare the original release on Xbox 360 and stack it up with the revised, reduxed version on Xbox One.

Digital Foundry: Xbox 360 and PS3 were highly ambitious designs for the 2006/7 era. Xbox One and PS4 are much more budget conscious - have they got what it takes to last as long as their predecessors?

Oles Shishkovstov: Well obviously they aren't packing the bleeding edge hardware you can buy for PC (albeit for insane amounts of money) today. But they are relatively well-balanced pieces of hardware that are well above what most people have right now, performance-wise. And let's not forget that programming close to the metal will usually mean that we can get 2x performance gain over the equivalent PC spec. Practically achieving that performance takes some time, though!

But to answer the question - they could last as long. Just remember - back when PS3 first hit the stores - Nvidia G80 was released as well, and it was almost 2x faster than the RSX at the time...

Digital Foundry: We are essentially looking at smartly integrated versions of existing PC components. For the first time we have parity in the architectures of all major platforms - how important is that for you?

Oles Shishkovstov: Well, similar GPU architecture is a good thing, really. The reason is that modern GPUs are really complex devices with not-so-obvious performance cliffs. You can't say anymore: 'Here we are ALU limited or ROP limited or texture addressing limited or texture filtering limited or occupancy limited.' There is no correct and simple answer at all. We could be somewhat limited by ALU and somewhat limited by texture addressing and somewhat limited by bandwidth - all at the same time... Mastering that takes some time.

As for the CPU - it doesn't really matter at all, as long as performance is enough. As for RAM hierarchy and its performance - it is different between platforms anyway.

Digital Foundry: How did you assess what these consoles were capable of when you first got hold of the development kits?

Oles Shishkovstov: Well we just ported the games over and ran a lot of tests!

One little example I can give: Metro Last Light on both previous consoles has some heavily vectorised and hand-optimised texture-generation tasks. One of them takes 0.8ms on single PS3 SPU and around 1.2ms on a single Xbox 360 hyper-thread. Once we profiled it first time - already vectorised via AVX+VEX - on PS4, it took more than 2ms! This looks bad for a 16ms frame. But the thing is, that task's sole purpose was to offload a few cycles from (older) GPUs, which is counter-productive on current-next-gen consoles. That code path was just switched off.

The Redux edition of Metro 2033 compared on PlayStation 4 and Xbox One. Use the full-screen button and select 1080p resolution for the best experience.

Digital Foundry: Xbox One's lower compute unit count, memory bandwidth and ESRAM issues are well documented. Resolution differences in multi-platform games are commonplace and in some titles we're even looking at 720p vs 1080p. What's your take on the differences between Xbox One and PlayStation 4?

Oles Shishkovstov: Well, you kind of answered your own question - PS4 is just a bit more powerful. You forgot to mention the ROP count, it's important too - and let's not forget that both CPU and GPU share bandwidth to DRAM [on both consoles]. I've seen a lot of cases while profiling Xbox One when the GPU could perform fast enough but only when the CPU is basically idle. Unfortunately I've even seen the other way round, when the CPU does perform as expected but only under idle GPU, even if it (the CPU) is supposed to get prioritised memory access. That is why Microsoft's decision to boost the clocks just before the launch was a sensible thing to do with the design set in stone.

Counting pixel output probably isn't the best way to measure the difference between them though. There are plenty of other (and more important) factors that affect image quality besides resolution. We may push 40 per cent more pixels per frame on PS4, but it's not 40 per cent better as a result... your own eyes can tell you that.
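As a quick sanity check on the "40 per cent more pixels" figure, here is a minimal sketch of the arithmetic. It assumes the PS4 version renders at 1920x1080 and the Xbox One version at 912p (the resolution mentioned later in the interview). The 1620-pixel width is an assumption based on a 16:9 aspect ratio; the interview does not state it.

```python
# Pixel counts per frame for the two rendering resolutions.
# ASSUMPTION: 1620x912 is taken as the 16:9 frame size for "912p";
# the interview only gives the vertical resolution.

def pixels(width: int, height: int) -> int:
    """Number of pixels in one frame."""
    return width * height

ps4 = pixels(1920, 1080)
xbox_one = pixels(1620, 912)

# Fraction of extra pixels the PS4 version pushes per frame.
extra = ps4 / xbox_one - 1.0

print(f"PS4: {ps4} px, Xbox One: {xbox_one} px, difference: {extra:.1%}")
```

Under those assumptions the difference comes out at roughly 40 per cent, which lines up with the number Oles quotes.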

Digital Foundry: Is ESRAM really that much of a pain to work with?

Oles Shishkovstov: Actually, the real pain comes not from ESRAM but from the small amount of it. As for ESRAM performance - it is sufficient for the GPU we have in Xbox One. Yes it is true, that the maximum theoretical bandwidth - which is somewhat comparable to PS4 - can be rarely achieved (usually with simultaneous read and write, like FP16-blending) but in practice I've seen only a few cases where it becomes a limiting factor.

Digital Foundry: DirectX 11 vs GNMX vs GNM - what's your take on the strengths and weakness of the APIs available to developers with Xbox One and PlayStation 4? Closer to launch there were some complaints about XO driver performance and CPU overhead on GNMX.

Oles Shishkovstov: Let's put it that way - we have seen scenarios where a single CPU core was fully loaded just by issuing draw-calls on Xbox One (and that's surely on the 'mono' driver with several fast-path calls utilised). Then, the same scenario on PS4, it was actually difficult to find those draw-calls in the profile graphs, because they are using almost no time and are barely visible as a result.

In general - I don't really get why they chose DX11 as a starting point for the console. It's a console! Why care about some legacy stuff at all? On PS4, most GPU commands are just a few DWORDs written into the command buffer, let's say just a few CPU clock cycles. On Xbox One it easily could be one million times slower because of all the bookkeeping the API does.

But Microsoft is not sleeping, really. Each XDK that has been released both before and after the Xbox One launch has brought faster and faster draw-calls to the table. They added tons of features just to work around limitations of the DX11 API model. They even made a DX12/GNM style do-it-yourself API available - although we didn't ship with it on Redux due to time constraints.
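To make the "few DWORDs written into the command buffer" remark concrete, here is a toy sketch of what a thin command-buffer API can look like. The opcodes and the packet layout are hypothetical and invented purely for illustration; they are not the GNM or Xbox One formats, which the interview does not describe.

```python
from array import array

# HYPOTHETICAL opcodes and packet layout, made up for this sketch.
OP_SET_SHADER = 0x01
OP_DRAW = 0x02


def make_header(opcode: int, payload_dwords: int) -> int:
    """Pack an opcode and the payload length into one header DWORD."""
    return ((opcode & 0xFF) << 24) | (payload_dwords & 0xFFFF)


class CommandBuffer:
    """Append-only list of DWORDs; no state tracking or validation at all."""

    def __init__(self) -> None:
        # "I" is an unsigned int, 32 bits wide on common platforms.
        self.dwords = array("I")

    def set_shader(self, shader_id: int) -> None:
        # Header plus one payload DWORD.
        self.dwords.extend((make_header(OP_SET_SHADER, 1), shader_id))

    def draw(self, vertex_count: int, instance_count: int = 1) -> None:
        # Header plus two payload DWORDs.
        self.dwords.extend((make_header(OP_DRAW, 2), vertex_count, instance_count))


cb = CommandBuffer()
cb.set_shader(7)
cb.draw(36)
print([hex(d) for d in cb.dwords])
```

The point of the sketch is only that a draw call here is three integer writes and nothing else. An API in the DX11 style also has to track resources and dependencies on every call, which is the bookkeeping cost Oles is complaining about.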

Metro Last Light Redux compared on PlayStation 4 and Xbox One. Again, we recommend full-screen playback and 1080p resolution.

Digital Foundry: The Metro games have a reputation for pushing visual boundaries, but even high-end PCs can struggle to sustain 60fps. I've played Metro Last Light for hours on PS4 and Xbox One and that 60fps is basically locked. Obviously some of the PC high-end features are reduced or altered, but in an age where not even Naughty Dog can run its last-gen title consistently at 1080p60 on PS4, what is the secret of your success?

Oles Shishkovstov: There is no secret. We just adapted to the target hardware.

GCN doesn't love interpolators? OK, ditch the per-vertex tangent space, switch to per-pixel one. That CPU task becomes too fast on an out-of-order CPU? Merge those tasks. Too slow task? Parallelise it. Maybe the GPU doesn't like high sqrt count in the loop? But it is good in integer math - so we'll use old integer tricks. And so on, and so on.

That's just the art of optimisations and that's it. By the way, the PC version directly benefits from those optimisations as well, especially CPU-wise, as all of the platforms have out-of-order CPUs.
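The interview doesn't say which "old integer tricks" were used in place of sqrt. One well-known example of that family is the fast inverse square root, which reinterprets the float's bits as an integer to get a first guess and then refines it. The sketch below is that classic trick written in Python for readability, not 4A's actual code.

```python
import struct


def fast_rsqrt(x: float) -> float:
    """Approximate 1/sqrt(x) for x > 0 using the classic integer bit trick."""
    # Reinterpret the float32 bit pattern as an unsigned 32-bit integer.
    i = struct.unpack("<I", struct.pack("<f", x))[0]
    # Integer shift and subtract from a magic constant: a cheap initial guess.
    i = 0x5F3759DF - (i >> 1)
    # Reinterpret the integer bits as a float32 again.
    y = struct.unpack("<f", struct.pack("<I", i))[0]
    # One Newton-Raphson step to tighten the approximation.
    return y * (1.5 - 0.5 * x * y * y)


for value in (0.25, 1.0, 4.0, 100.0):
    print(value, fast_rsqrt(value), value ** -0.5)
```

After the single refinement step the result is typically within a fraction of a per cent of the exact value, which is often good enough for things like normalising vectors.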

Digital Foundry: Surely the easier path would have been to lock at 1080p30 and concentrate on integrating as many high-end rendering features as possible. Why target 60fps over 30fps?

Oles Shishkovstov: Because we can! Actually for the next unannounced project, the designers want more and more of everything (as usual) and quite possibly we will target 30fps.

Look, we shipped a rock-solid 60fps game with the quality right in the middle between the high and very high preset of the PC version. Let's discard around 30 per cent of frame-time for post-processing (as this is basically a constant cost) - so we are at around 11ms for the stuff on screen. Now just imagine if we do target 30fps, that would enable around 2.5 times better, richer visuals.
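The "around 11ms" and "around 2.5 times" figures can be reproduced with some simple frame-budget arithmetic. This is a rough sketch that follows the assumptions Oles states: post-processing takes about 30 per cent of a 60fps frame and stays a constant cost at 30fps.

```python
def frame_ms(fps: float) -> float:
    """Frame time budget in milliseconds for a given frame rate."""
    return 1000.0 / fps


POST_SHARE_AT_60 = 0.30  # share of the 60fps frame spent on post-processing

budget_60 = frame_ms(60)                 # about 16.7 ms
post_ms = budget_60 * POST_SHARE_AT_60   # about 5 ms, treated as a fixed cost
scene_60 = budget_60 - post_ms           # time left for "the stuff on screen"
scene_30 = frame_ms(30) - post_ms        # same fixed post cost at 30fps
ratio = scene_30 / scene_60

print(f"scene budget at 60fps: {scene_60:.2f} ms")
print(f"scene budget at 30fps: {scene_30:.2f} ms")
print(f"ratio: {ratio:.2f}x")
```

With these inputs the scene budget is a bit under 12ms at 60fps and a bit over 28ms at 30fps, so the ratio lands just below 2.5.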

Digital Foundry: Metro Redux isn't just a port, it's been improved. How did you choose which improvements to make and did the design of the consoles have anything to do with those choices?

Oles Shishkovstov: Since Metro Last Light hit the shelves we've collected numerous improvement suggestions from our players in order to include them in Metro Redux. The power of the new consoles also allowed us to improve the games in the field most critical to gameplay, especially gunplay and general feel - for example combat and cut-scenes became smoother, and controls became much more responsive. Besides that, the new incarnation of Metro 2033 enjoys a lot of the upgrades introduced in Metro: Last Light: new weapons and their upgrades, improved stealth mode viability and takedowns, improved AI with its more realistic behaviour, and the improved visuals etc.

Digital Foundry: Which of the Redux elements of the new games are you most happy with?

Oles Shishkovstov: We're quite happy with the games becoming more balanced: they just play better, run faster and look fresher. And the fact that we managed to gather the whole of the Metro world into one package with all of the DLCs, game modes and difficulty settings to provide the definitive package.

Performance analysis of Metro 2033 running on Xbox One and PlayStation 4. Both games operate at 60fps, with just the odd, essentially unnoticeable dip beneath the target frame-rate.

Digital Foundry: There's been a lot of discussion amongst gamers about whether developers should be making new games rather than converting across your existing titles to Xbox One and PS4 - what's your response to that?

Oles Shishkovstov: We are doing both. We have been working on a new game as well as Redux. We had the production resource free to handle Redux while the next project was in early pre-production, although now the Redux team are needed on the next project as we ramp up! But you have already seen, Metro Redux is not just a port or conversion - it presents a whole new experience, especially for the 2033 part of it!

Digital Foundry: Will your next game built from the ground up for the new hardware benefit because of the time you've spent creating the Redux?

Oles Shishkovstov: Definitely.

Digital Foundry: You've improved both Metro titles, but the Redux is the same two games at their core. Last time we spoke, you dropped some hints about the future - specifically physics-based character animation. Now you've been hands-on with the next-gen consoles, can you tease anything else you're working on?

Oles Shishkovstov: For the game we are working on now, our designers have shifted to a more sand-box-style experience - less linear but still hugely story-driven. I will not go into details, but it requires some work from programmers as well. Also, we are improving graphics in very different aspects, like recently we did a physically-based global ambient occlusion (instead of local, like SSAO). I will not talk about PBR (physically-based rendering) here, because here we are at the stage when artists are still adapting their mentality to it.

Digital Foundry: It seems that virtually every major next-gen title has shipped with some form of physically-based lighting - so that's your choice for your first title designed with the new wave of hardware in mind?

Oles Shishkovstov: Actually, what is PBR and why to use it? First it means less tuning of content to make it look good. As a result - lighting artists love it, texture artists hate it. But from the technical side PBR is all about specular as a first class citizen in every pixel. Actually the Redux comes with energy-preserving specular - which is a little but important part of PBR, although we intentionally preserved all the tuning knobs for the artists. But yes, we are fully PBR now for the next project. No more tuning knobs - at least for now.

Digital Foundry: What's your take on DirectX 12 and Mantle? Is it all about making PC games development tie in more closely with Xbox One and PlayStation 4?

Oles Shishkovstov: Aside from them being much more close to the (modern) metal, those APIs are a paradigm-shift in API design. DX11 was 'I will keep track of everything for you'. DX12 says 'now it's your responsibility' - so it could be a much thinner layer. As for Mantle, it is a temporary API, in my honest opinion.

Digital Foundry: To what extent will DX12 prove useful on Xbox One? Isn't there already a low CPU overhead there in addressing the GPU?

Oles Shishkovstov: No, it's important. All the dependency tracking takes a huge slice of CPU power. And if we are talking about the multi-threaded command buffer chunks generation - the DX11 model was essentially a 'flop', while DX12 should be the right one.
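For readers wondering what "multi-threaded command buffer chunks generation" means in practice, here is a toy sketch of the general idea: each worker records its own chunk of commands with no shared state, and the chunks are then joined in a fixed order for submission. The function names and the tuple-based "commands" are hypothetical; this is not the DX12 API, and Python threads are used only to show the structure, not to demonstrate a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def record_chunk(draws):
    """Record one chunk of draws into its own command list (no shared state)."""
    return [("draw", mesh, count) for mesh, count in draws]


def build_frame(draws, workers=4):
    """Split the draws into chunks, record them in parallel, join in order."""
    size = max(1, -(-len(draws) // workers))  # ceiling division
    chunks = [draws[i:i + size] for i in range(0, len(draws), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in chunk order, so submission order is stable.
        command_lists = list(pool.map(record_chunk, chunks))
    # "Submit": concatenate the per-thread lists into one frame.
    return [cmd for command_list in command_lists for cmd in command_list]


draws = [(f"mesh{i}", 3 * i) for i in range(10)]
frame = build_frame(draws)
print(len(frame), frame[0], frame[-1])
```

The design point is that nothing in `record_chunk` touches global state, so the workers never have to wait on each other. That only works if the application, not the API, takes responsibility for ordering and dependencies, which is the shift Oles attributes to DX12.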

Frame-rate tests on the more technologically demanding Metro Last Light see both PlayStation 4 and Xbox One stick doggedly to their 60fps target - the sheer consistency is a major plus point.

Digital Foundry: Microsoft returning the Kinect GPU reservation in the June XDK made a lot of headlines - I understand you moved from a 900p to a 912p rendering resolution, which sounds fairly modest. Just how important was that update? Has its significance been over-played?

Oles Shishkovstov: Well, the issue is slightly more complicated - it is not like 'here, take that ten per cent of performance we've stolen before', actually it is variable, like sometimes you can use 1.5 per cent more, and sometimes seven per cent and so on. We could possibly have aimed for a higher res, but we went for a 100 per cent stable, vsync-locked frame-rate this time. That is not to say we could not have done more with more time, and per my earlier answer, the XDK and system software continues to improve every month.

Digital Foundry: Would you still say that PC game-makers target Nehalem performance levels when creating games? Does DX12/Mantle suggest that it's all about getting more out of that level of performance rather than relying on gamers buying into faster processors?

Oles Shishkovstov: Well, CPU performance has essentially stalled due to various factors - economics being one of them. I'd say that PC game-makers should target console CPUs.

Digital Foundry: Console development trends do have an impact on PC gaming. If you were building a mainstream gaming PC now with the future in mind, what choices would you make?

Oles Shishkovstov: This is tricky to answer without going into 'fan wars'. Get the most powerful components your budget allows for, with the emphasis on GPU.

Digital Foundry: A while back Nvidia announced a unified memory set-up for PC - how important is that for the future of the platform?

Oles Shishkovstov: The problem with unified memory is memory coherence. Even on consoles, where we see highly integrated SoCs (system on chips), we have the option to map the memory addresses ranges basically 'for CPU', 'for GPU' and 'fully coherent'. And being fully coherent is really not that useful as it wastes performance. As for the traditional PC? Going through some kind of external bus just to snoop the caches - it will be really slow.

Digital Foundry: Any thoughts on SteamOS? You have Last Light on Linux now running on an older OpenGL, but I understand you have a more advanced version in the works...

Oles Shishkovstov: Yes, the original Metro Last Light Linux port was based on OpenGL 3.2 - it was stable but did not support high-end features. For Redux we are essentially replicating the DX11 version, with almost one-to-one correspondence in features. The downside of that approach - the GPU should be at least OpenGL 4 'core profile'.

Digital Foundry: So with Nvidia Tegra K1 on mobile we have a situation now where we have reasonable CPU power combined with GPU capabilities on par with last-gen console, plus access to the full OpenGL API - do you see mobile as a potential platform for your existing library of games?

Oles Shishkovstov: Definitely. K1 is simply a bright star in the mobile world. I wish the sky would be full of stars to make it economically viable for us!




Interesting bits:

DF: Are you still excited by the potential of these consoles?

Oles Shishkovstov: I think what we achieved with the new consoles was a really good job given the time we had with development kits in the studio - just four months hands-on experience with Xbox One and six months with the PlayStation 4 (I guess the problems we had getting kits to the Kiev office are well-known now).

 

Out of context: And let's not forget that programming close to the metal will usually mean that we can get 2x performance gain over the equivalent PC spec. Practically achieving that performance takes some time, though!

 

Digital Foundry: Xbox One's lower compute unit count, memory bandwidth and ESRAM issues are well documented. Resolution differences in multi-platform games are commonplace and in some titles we're even looking at 720p vs 1080p. What's your take on the differences between Xbox One and PlayStation 4?

Oles Shishkovstov: Well, you kind of answered your own question - PS4 is just a bit more powerful. You forgot to mention the ROP count, it's important too - and let's not forget that both CPU and GPU share bandwidth to DRAM [on both consoles]. I've seen a lot of cases while profiling Xbox One when the GPU could perform fast enough but only when the CPU is basically idle. Unfortunately I've even seen the other way round, when the CPU does perform as expected but only under idle GPU, even if it (the CPU) is supposed to get prioritised memory access. That is why Microsoft's decision to boost the clocks just before the launch was a sensible thing to do with the design set in stone.

 

Digital Foundry: Is ESRAM really that much of a pain to work with?

Oles Shishkovstov: Actually, the real pain comes not from ESRAM but from the small amount of it. As for ESRAM performance - it is sufficient for the GPU we have in Xbox One. Yes it is true, that the maximum theoretical bandwidth - which is somewhat comparable to PS4 - can be rarely achieved (usually with simultaneous read and write, like FP16-blending) but in practice I've seen only a few cases where it becomes a limiting factor.

 

 

Digital Foundry: DirectX 11 vs GNMX vs GNM - what's your take on the strengths and weakness of the APIs available to developers with Xbox One and PlayStation 4? Closer to launch there were some complaints about XO driver performance and CPU overhead on GNMX.

Oles Shishkovstov: Let's put it that way - we have seen scenarios where a single CPU core was fully loaded just by issuing draw-calls on Xbox One (and that's surely on the 'mono' driver with several fast-path calls utilised). Then, the same scenario on PS4, it was actually difficult to find those draw-calls in the profile graphs, because they are using almost no time and are barely visible as a result.

In general - I don't really get why they chose DX11 as a starting point for the console. It's a console! Why care about some legacy stuff at all? On PS4, most GPU commands are just a few DWORDs written into the command buffer, let's say just a few CPU clock cycles. On Xbox One it easily could be one million times slower because of all the bookkeeping the API does.

But Microsoft is not sleeping, really. Each XDK that has been released both before and after the Xbox One launch has brought faster and faster draw-calls to the table. They added tons of features just to work around limitations of the DX11 API model. They even made a DX12/GNM style do-it-yourself API available - although we didn't ship with it on Redux due to time constraints.



This one's the most important one.

Oles Shishkovstov: Let's put it that way - we have seen scenarios where a single CPU core was fully loaded just by issuing draw-calls on Xbox One (and that's surely on the 'mono' driver with several fast-path calls utilised). Then, the same scenario on PS4, it was actually difficult to find those draw-calls in the profile graphs, because they are using almost no time and are barely visible as a result.



Tamron said:
This one's the most important one.

Oles Shishkovstov: Let's put it that way - we have seen scenarios where a single CPU core was fully loaded just by issuing draw-calls on Xbox One (and that's surely on the 'mono' driver with several fast-path calls utilised). Then, the same scenario on PS4, it was actually difficult to find those draw-calls in the profile graphs, because they are using almost no time and are barely visible as a result.

 

Yes, that's interesting. Not that it should matter in all games or in general, but it's still interesting. But as they mentioned, MS is constantly making draw calls faster and has even made a GNM-like API available.



Refreshing honesty and straight talk. It's good to know that LL on Linux was actually running on OpenGL 3.2 (DX 10 equivalent) because it explains why it didn't look as good as the Windows version. Also, all this done in 4 to 6 months is impressive, no lazy dev talk here.




Tamron said:
This one's the most important one.

Oles Shishkovstov: Let's put it that way - we have seen scenarios where a single CPU core was fully loaded just by issuing draw-calls on Xbox One (and that's surely on the 'mono' driver with several fast-path calls utilised). Then, the same scenario on PS4, it was actually difficult to find those draw-calls in the profile graphs, because they are using almost no time and are barely visible as a result.

So PS4 faster than the speed of light confirmed.



BeElite said:
Tamron said:
This one's the most important one.

Oles Shishkovstov: Let's put it that way - we have seen scenarios where a single CPU core was fully loaded just by issuing draw-calls on Xbox One (and that's surely on the 'mono' driver with several fast-path calls utilised). Then, the same scenario on PS4, it was actually difficult to find those draw-calls in the profile graphs, because they are using almost no time and are barely visible as a result.

So PS4 faster than the speed of light confirmed.

Don't be flippant. It's the most important one because it's a genuine issue that will affect virtually ALL games and ALL engines, and it is present on their mono driver even with fast-path calls used, so there's clearly a huge performance gap there. It doesn't mean it can't be fixed by better software, however, and that's exactly what they're aiming to do.

As fast as PS4? Unlikely (and more or less impossible). But as slow as it is currently? No.



Great read, but still not as in-depth as I would like.

I want to know what each thing they say the PS4 does better actually means in a concrete usage scenario. Kind of like how they explained the draw calls in relation to the APIs, but now do one for ROPs, compute units, what exactly they are doing with all that extra constantly accessible memory bandwidth on the PS4, etc.



Intrinsic said:
Great read, but still not as in-depth as I would like.

I want to know what each thing they say the PS4 does better actually means in a concrete usage scenario. Kind of like how they explained the draw calls in relation to the APIs, but now do one for ROPs, compute units, what exactly they are doing with all that extra constantly accessible memory bandwidth on the PS4, etc.

 

For bandwidth: he said that the theoretical maximum can't always be achieved, and of course this is true for every kind of memory, just as it is for measuring TFlops. Those numbers always assume perfect circumstances and are always max values. Don't expect the PS4 to be constantly calculating at 1.8 TFlops. It is able to do so but won't in real-life applications.

Overall, the only statement of the kind you wanted to hear was "PS4 is just a bit more powerful". Make of this what you want. He won't go deeper, but I don't see why he would lie here given his openness in general (there were interviews with this guy before, and he is a real tech guy who one would guess doesn't give a sh** about who "wins").
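For reference, the 1.8 TFlops number is a peak figure that falls straight out of the published GPU specs. The sketch below uses the commonly cited PS4 figures (18 compute units at 800 MHz, 64 lanes per GCN compute unit, 2 FLOPs per lane per clock for a fused multiply-add); none of these inputs come from the interview itself.

```python
def peak_tflops(compute_units: int, clock_ghz: float,
                lanes_per_cu: int = 64, flops_per_lane: int = 2) -> float:
    """Theoretical peak single-precision TFLOPS for a GCN-style GPU."""
    gflops = compute_units * lanes_per_cu * flops_per_lane * clock_ghz
    return gflops / 1000.0


# Commonly cited PS4 GPU figures, not taken from the interview.
ps4_peak = peak_tflops(compute_units=18, clock_ghz=0.8)
print(f"PS4 theoretical peak: {ps4_peak:.2f} TFLOPS")
```

That peak only holds if every lane issues a multiply-add on every clock, which is why real workloads sit well below it.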



walsufnir said:

For bandwidth: he said that the theoretical maximum can't always be achieved, and of course this is true for every kind of memory, just as it is for measuring TFlops. Those numbers always assume perfect circumstances and are always max values. Don't expect the PS4 to be constantly calculating at 1.8 TFlops. It is able to do so but won't in real-life applications.

Overall, the only statement of the kind you wanted to hear was "PS4 is just a bit more powerful". Make of this what you want. He won't go deeper, but I don't see why he would lie here given his openness in general (there were interviews with this guy before, and he is a real tech guy who one would guess doesn't give a sh** about who "wins").

That's the thing though. First off, we know that on a hardware level there really is no way that the PS4 is "just a bit more powerful". Don't wanna come off sounding anal here, but by every single metric by which performance can be measured, the PS4 has more of it. And not just slightly more but substantially more. A simple look at GPU compute units or ROP count will prove this. So I can't help but feel that, though candid, he was still talking somewhat tongue in cheek.

It doesn't help that right after saying the PS4 was "slightly more powerful", all he went on to talk about was how this or that is better on the PS4 and how MS have to do, or have been doing, this or that to make things a little better on the XB1. Honestly, I don't really care how much better the PS4 is than the XB1, as I already have a PS4 and it would make no real difference to me. I just wanna know what does what and why, and how that compares to the XB1.