
Xbox One Games Need To Use Two Pools Of Memory To Achieve Optimal Performance.

 

The PlayStation 4 uses a single pool of GDDR5 memory to achieve optimal performance. The Xbox One, on the other hand, has to use two pools of memory to achieve optimal performance: DDR3 and ESRAM. This takes time, as the developer has to work out which parts of the game will run in DDR3 and which will run in ESRAM. No wonder many developers are having trouble achieving optimal performance on Xbox One.

Here is a Digital Foundry interview with the architects of the Xbox One. I have posted some of the main points below, but you can read the full interview here: http://www.eurogamer.net/articles/digitalfoundry-the-complete-xbox-one-interview

 

Digital Foundry: So you didn't want to go for a daughter die as you did with Xbox 360?

Nick Baker: No, we wanted a single processor, like I said. If there'd been a different time frame or technology options we could maybe have had a different technology there but for the product in the timeframe, ESRAM was the best choice.

Digital Foundry: If we look at the ESRAM, the Hot Chips presentation revealed for the first time that you've got four blocks of 8MB areas. How does that work?

Nick Baker: First of all, there's been some question about whether we can use ESRAM and main RAM at the same time for GPU, and to point out that, really, you can think of the ESRAM and the DDR3 as making up eight total memory controllers: there are four external memory controllers (which are 64-bit) which go to the DDR3, and then there are four internal memory controllers that are 256-bit that go to the ESRAM. These are all connected via a crossbar, and so in fact it will be true that you can go directly, simultaneously, to DRAM and ESRAM.

Digital Foundry: Simultaneously? Because there's been a lot of controversy that you're adding your bandwidth together and that you can't do this in a real-life scenario.

Nick Baker: Over that interface, each lane to ESRAM is 256-bit, making up a total of 1024 bits, and that's in each direction. 1024 bits for write will give you a max of 109GB/s, and then there's a separate read path which again, running at peak, would give you 109GB/s. What is the equivalent bandwidth of the ESRAM if you were doing the same kind of accounting that you do for external memory? With DDR3 you pretty much take the number of bits on the interface, multiply by the speed and that's how you get 68GB/s. The equivalent on ESRAM would be 218GB/s. However, just like main memory, it's rare to be able to achieve that over long periods of time, so typically on an external memory interface you run at 70-80 per cent efficiency.
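
For anyone who wants to check the arithmetic, the peak figures follow from the bus widths Baker describes. A rough back-of-the-envelope sketch in Python (the 853MHz ESRAM/GPU clock and DDR3-2133 transfer rate are the commonly cited figures for the console, not stated in this excerpt):

    # Peak bandwidth = bus width x transfer rate, per the controller layout above.
    ddr3_bus_bits  = 4 * 64       # four external 64-bit DDR3 controllers
    ddr3_mt_per_s  = 2133e6       # DDR3-2133 transfer rate (assumed figure)
    esram_bus_bits = 4 * 256      # four internal 256-bit ESRAM controllers, each direction
    esram_clock_hz = 853e6        # ESRAM/GPU clock (assumed figure)

    ddr3_peak   = ddr3_bus_bits / 8 * ddr3_mt_per_s      # bytes per second
    esram_read  = esram_bus_bits / 8 * esram_clock_hz    # read path
    esram_write = esram_read                             # separate write path, same width

    print(f"DDR3 peak        : {ddr3_peak / 1e9:.0f} GB/s")                   # ~68
    print(f"ESRAM read peak  : {esram_read / 1e9:.0f} GB/s")                  # ~109
    print(f"ESRAM read+write : {(esram_read + esram_write) / 1e9:.0f} GB/s")  # ~218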

The same discussion applies to ESRAM as well - the 204GB/s number that was presented at Hot Chips takes known limitations of the logic around the ESRAM into account. You can't sustain writes for absolutely every single cycle. The writes are known to insert a bubble [a dead cycle] occasionally... one out of every eight cycles is a bubble, so that's how you get the combined 204GB/s as the raw peak that we can really achieve over the ESRAM. And then if you ask what you can achieve out of an application - we've measured about 140-150GB/s for ESRAM. That's real code running. That's not some diagnostic or some simulation case or something like that. That is real code that is running at that bandwidth. You can add that to the external memory, which in similar conditions probably achieves 50-55GB/s, and adding those two together you're getting in the order of 200GB/s across the main memory and internally.
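
The 204GB/s and "in the order of 200GB/s" figures work out the same way once the write bubble and the measured rates are plugged in. Another small sketch, using only the numbers quoted above:

    # Raw peak with the write bubble: writes lose one cycle in eight, reads do not.
    esram_read_peak  = 109e9
    esram_write_peak = 109e9
    esram_raw_peak   = esram_read_peak + esram_write_peak * 7 / 8
    print(f"ESRAM raw peak     : {esram_raw_peak / 1e9:.0f} GB/s")     # ~204

    # Measured under real code: 140-150GB/s from ESRAM plus 50-55GB/s from DDR3.
    combined_low  = 140 + 50
    combined_high = 150 + 55
    print(f"Measured, combined : {combined_low}-{combined_high} GB/s") # 190-205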

One thing I should point out is that there are four 8MB lanes. But it's not a contiguous 8MB chunk of memory within each of those lanes. Each lane, that 8MB is broken down into eight modules. This should address whether you can really have read and write bandwidth in memory simultaneously. Yes, you can - there are actually a lot more individual blocks that comprise the whole ESRAM, so you can talk to those in parallel. Of course, if you're hitting the same area over and over and over again, you don't get to spread out your bandwidth, and that's one of the reasons why in real testing you get 140-150GB/s rather than the peak 204GB/s: it's not just four chunks of 8MB memory. It's a lot more complicated than that, and depending on the pattern you get to use those simultaneously. That's what lets you do reads and writes simultaneously. You do get to add the read and write bandwidth together, as well as adding it on to the main memory bandwidth. That's just one of the misconceptions we wanted to clean up.
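
The "lots of small modules" point is essentially a banking argument: accesses spread across different blocks can be serviced in parallel, while accesses that keep landing on the same block have to queue. A toy model of that effect (the access patterns are made up; the 4 lanes x 8 modules count comes from the answer above):

    from collections import Counter

    NUM_MODULES = 4 * 8   # four 8MB lanes, each broken into eight modules

    def cycles_needed(addresses):
        """One access per module per cycle; the busiest module sets the pace."""
        hits = Counter(addr % NUM_MODULES for addr in addresses)
        return max(hits.values())

    spread  = list(range(64))        # accesses striped across every module
    clumped = [0] * 64               # every access hits the same module
    print(cycles_needed(spread))     # 2  -> bandwidth stays close to peak
    print(cycles_needed(clumped))    # 64 -> bandwidth collapses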

Andrew Goossen: If you're only doing a read you're capped at 109GB/s, and if you're only doing a write you're capped at 109GB/s. To get over that you need to have a mix of reads and writes, but when you look at the things that typically live in the ESRAM, such as your render targets and your depth buffers, intrinsically they have a lot of read-modify-writes going on in the blends and the depth buffer updates. Those are the natural things to stick in the ESRAM and the natural things to take advantage of the concurrent read/writes.
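
The blending point is worth spelling out: a blend is literally a read of the destination, a modify, and a write back into the same buffer, so render-target traffic exercises both directions at once. A minimal, purely illustrative sketch of one channel of a source-over blend:

    def blend_over(dst, src, alpha):
        """Source-over blend for one 8-bit channel: read dst, modify, write back."""
        return int(src * alpha + dst * (1.0 - alpha))

    framebuffer = [100, 150, 200]            # existing destination pixels (read traffic)
    incoming    = [255, 255, 255]            # new fragment colour
    framebuffer = [blend_over(d, s, 0.5)     # blended result (write traffic)
                   for d, s in zip(framebuffer, incoming)]
    print(framebuffer)                       # [177, 202, 227]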

Digital Foundry: So 140-150GB/s is a realistic target and you can integrate DDR3 bandwidth simultaneously?

Nick Baker: Yes. That's been measured.

Andrew Goossen: Of course with Xbox One we're going with a design where ESRAM has the same natural extension that we had with eDRAM on Xbox 360, to have both going concurrently. It's a nice evolution of the Xbox 360 in that we could clean up a lot of the limitations that we had with the eDRAM. The Xbox 360 was the easiest console platform to develop for, and it wasn't that hard for our developers to adapt to eDRAM, but there were a number of places where we said, "Gosh, it would sure be nice if an entire render target didn't have to live in eDRAM," and so we fixed that on Xbox One, where we have the ability to overflow from ESRAM into DDR3: the ESRAM is fully integrated into our page tables, so you can kind of mix and match the ESRAM and the DDR memory as you go.
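
"Fully integrated into our page tables" means a single resource can span both pools, with whatever doesn't fit in the 32MB spilling over into DDR3. A toy allocator sketch to illustrate the overflow idea (hypothetical, not any real XDK API):

    # Place 64KB pages in ESRAM until the 32MB budget runs out, then spill to DDR3.
    PAGE = 64 * 1024
    ESRAM_BUDGET = 32 * 1024 * 1024

    def place_resource(size_bytes, esram_used=0):
        pages = -(-size_bytes // PAGE)               # ceiling division
        placement = []
        for _ in range(pages):
            if esram_used + PAGE <= ESRAM_BUDGET:
                placement.append("ESRAM")
                esram_used += PAGE
            else:
                placement.append("DDR3")             # overflow pages live in DDR3
        return placement, esram_used

    # A 40MB render target: the first 32MB of pages land in ESRAM, the rest in DDR3.
    placement, used = place_resource(40 * 1024 * 1024)
    print(placement.count("ESRAM"), "pages in ESRAM,", placement.count("DDR3"), "pages in DDR3")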

Sometimes you want to get the GPU to texture out of memory, and on Xbox 360 that required what's called a "resolve pass", where you had to do a copy into DDR to get the texture out - that was another limitation we removed with ESRAM, as you can now texture out of ESRAM if you want to. From my perspective it's very much an evolution and improvement - a big improvement - over the design we had with the Xbox 360. I'm kind of surprised by all this, quite frankly.

Digital Foundry: Obviously though, you are limited to just 32MB of ESRAM. Potentially you could be looking at say, four 1080p render targets, 32 bits per pixel, 32 bits of depth - that's 48MB straight away. So are you saying that you can effectively separate render targets so that some live in DDR3 and the crucial high-bandwidth ones reside in ESRAM?

Andrew Goossen: Oh, absolutely. And you can even make it so that portions of your render target that have very little overdraw... For example, if you're doing a racing game and your sky has very little overdraw, you could stick those subsets of your resources into DDR to improve ESRAM utilisation. On the GPU we added some compressed render target formats, like our 6e4 [six bit mantissa and four bits exponent per component] and 7e3 HDR float formats, that were very, very popular on Xbox 360, where instead of doing a 16-bit float per component 64bpp render target, you can do the equivalent with us using 32 bits - so we did a lot of focus on really maximizing efficiency and utilisation of that ESRAM.
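
The footprint savings from those packed formats are easy to sketch: at 1080p, a four-channel FP16 target is about twice the size of a 32bpp packed 6e4/7e3 target, which halves the pressure on the 32MB budget. Illustrative arithmetic only:

    # Render target sizes at 1080p for a four-channel colour buffer.
    width, height = 1920, 1080
    fp16_target   = width * height * 8    # 16-bit float x 4 channels = 64bpp
    packed_target = width * height * 4    # 6e4 / 7e3 packed formats  = 32bpp

    print(f"FP16 (64bpp)  : {fp16_target / 2**20:.1f} MiB")    # ~15.8 MiB
    print(f"Packed (32bpp): {packed_target / 2**20:.1f} MiB")  # ~7.9 MiB
    # Even at 32bpp, four colour targets plus a 32-bit depth buffer run to roughly
    # 40MiB, which is why spilling low-bandwidth regions out to DDR3 matters.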

 




So the reason X1 games run at sub-PS4 quality is that it's more complex yet weaker hardware that is harder to utilize?



That article is almost a year old, Jega :p It's pretty common knowledge to anyone who cares about these sorts of things that the X1's RAM setup is more difficult to work with. That's not the X1's biggest disadvantage against the PS4, though. That would be its GPU.



There is actually a quote about the XO's memory setup from Frank Savage, Xbox One Team Partner Development Lead, that you should have included:

"So the last thing you have to do to get it all composited up is to get it copied over to main memory. That copy over to main memory is really fatst, and it doesn’t use any CPU or GPU time either, because we have DNA engines that actually do that for you in the console. This is how you get to 1080p, this is how you run at 60 frames per second… period, if you’re bottlenecked by graphics."

Pow!  Your move, Playstation 4.  Your move.



"No, we wanted a single processor, like I said. If there'd been a different time frame or technology options we could maybe have had a different technology there but for the product in the timeframe, ESRAM was the best choice."

So the reason Microsoft went with ESRAM... was that they could make ESRAM quicker? Everyone except Nintendo uses eDRAM because it's three times cheaper, and even for Nintendo it's special-use only. Sony must have badly beaten Microsoft on the draw, or else someone would have used eDRAM just to save dough. Even if the difference is only a few pennies per console, that adds up over ten million units.




"Yes you can there are actually a lot more individual blocks that comprise the whole ESRAM so you can talk to those in parallel and of course if you're hitting the same area over and over and over again, you don't get to spread out your bandwidth and so that's why one of the reasons why in real testing you get 140-150GB/s rather than the peak 204GB/s is that it's not just four chunks of 8MB memory. It's a lot more complicated than that and depending on how the pattern you get to use those simultaneously"

 

And that right there is probably why people are having issues with it.  A simple pool of GDDR5 memory sounds much easier.



If developers could figure out the mystery that is the PS3, they should be able to better utilize the Xbox One down the road.

For the record, that took more than a year.




Why have we started a thread about an article that is nearly a year old and is common knowledge?



pokoko said:

Pow!  Your move, Playstation 4.  Your move.

The PS4 doesn't need to move data between memory pools... it has only a single pool of memory, which can be accessed by both the CPU and GPU.

Devs don't need to code for that on PS4.



pokoko said:

There is actually a quote about the XO's memory setup from Frank Savage, Xbox One Team Partner Development Lead, that you should have included:

"So the last thing you have to do to get it all composited up is to get it copied over to main memory. That copy over to main memory is really fatst, and it doesn’t use any CPU or GPU time either, because we have DNA engines that actually do that for you in the console. This is how you get to 1080p, this is how you run at 60 frames per second… period, if you’re bottlenecked by graphics."

Pow!  Your move, Playstation 4.  Your move.


No need for the PS4 to make any move. The games are already speaking for themselves. ^_-

It's quite sad that there are still people in denial who can't accept the cold, hard truth.