(Update) Rumor: PlayStation 5 will be using Navi 9 (more powerful than Navi 10); new update: Jason Schreier says Sony is aiming for more than 10.7 teraflops


How accurate is this rumor compared to reality?

Naah: 26 votes (35.62%)
It's 90% close: 14 votes (19.18%)
It's 80% close: 8 votes (10.96%)
It's 70% close: 5 votes (6.85%)
It's 50% close: 13 votes (17.81%)
It's 30% close: 7 votes (9.59%)

Total: 73 votes
Intrinsic said:

AMD getting $100 for each chip they give to Sony isn't them selling at bargain prices at all. That's them selling at "bulk/OEM" pricing, which is totally normal when any company puts in orders in the region of millions.

Take the 3600G for instance. Say AMD sells that at retail for $220; that pans out like this... the actual cost of making each of those chips (what AMD pays to the foundry) is like $30-$40. Then AMD will add their markup to account for things like yields, profits, packaging and shipping, etc. At this point the chip comes up to around $170. Then they put their MSRP sticker price of $220 so the retailers make their own cut too.

Pretty sure AMD's profit margins are on average 61% for PC chips.
So a $220 CPU is likely costing AMD $85.8 in manufacturing and other logistics.
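As a rough sanity check on that figure, here's a quick Python sketch; the 61% margin is the estimate from the post above, not an official AMD number, and the $220 price is the hypothetical retail price being discussed:

# Rough cost estimate from a retail price and an assumed gross margin.
retail_price = 220.00   # hypothetical $220 MSRP (e.g. the rumored 3600G)
gross_margin = 0.61     # assumed average margin on PC chips (poster's estimate)

estimated_cost = retail_price * (1 - gross_margin)
print(f"Estimated cost to AMD: ${estimated_cost:.2f}")   # -> $85.80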

Consoles are actually a significant revenue driver for AMD though, which is good... Not nearly as lucrative as PC chip sales, but it helped keep AMD afloat when it needed it most.
https://www.anandtech.com/show/8913/amd-reports-q4-fy-2014-and-full-year-results

Bofferbrauer2 said:

@bolded: We don't even know if that's a true chip (and at 20CU, I really doubt it, especially considering it will be totally bandwidth-starved even with DDR4 4000). But I digress.

DDR4 4000 can offer more bandwidth than HBM2. It is entirely about how wide you wish to take things... But before then, you reach a point where it's more economical to choose another technology anyway.

However... Considering that current Ryzen APUs with ~38GB/s of bandwidth are certainly bandwidth starved with 11 CUs... I doubt that is going to change with 20CU APUs that have ~68GB/s of bandwidth.

But if you were to run that DDR4 4000 DRAM on a 512-bit bus, suddenly we are talking 256GB/s of bandwidth, which is more than sufficient for even a 40 CU count.
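For anyone wanting to sanity-check those numbers, peak DRAM bandwidth is just the transfer rate times the bus width. A quick Python sketch; the configurations are the ones discussed above, not any confirmed product:

# Peak theoretical bandwidth = transfers/sec * bus width in bytes.
def peak_bandwidth_gbs(mt_per_s, bus_width_bits):
    return mt_per_s * 1e6 * (bus_width_bits / 8) / 1e9

# Dual-channel DDR4-2400 (128-bit), as in current Ryzen notebooks:
print(peak_bandwidth_gbs(2400, 128))   # ~38.4 GB/s
# DDR4-4000 on a hypothetical 512-bit bus:
print(peak_bandwidth_gbs(4000, 512))   # ~256.0 GB/s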

Straffaren666 said:

Specifying some of AMD's improvements is irrelevant as long as you don't also specify what nVidia has achieved. A lot of the engineering work goes into improving performance and power efficiency by switching from third-party cell libraries to custom IC designs for a particular process node. Something nVidia has obviously spent a lot more resources on than AMD, and that's something which doesn't show up as a new feature in marketing material.

I am aware. Not my first Rodeo.

Straffaren666 said:

I spend much of my working time analyzing GPU frame traces, identifying bottlenecks and how to work around them. Every GPU architecture has bottlenecks, that's nothing new; it's just a matter of what kind of workload you throw at them. I have full access to all the performance counters of the GCN architecture, both in numerical and visual form. For instance, I can see the number of wavefronts executing on each individual SIMD of each CU at any given time during the trace, the issue rate of VALU, SALU, VMEM, EXP and branch instructions, wait cycles due to accessing the K$ cache, exporting pixels or fetching instructions, stalls due to texture rate or texture memory accesses, the number of read/write accesses to the color or depth caches, the number of drawn quads, the number of context rolls, the number of processed primitives and the percentage of culled primitives, stalls in the rasterizer due to the SPI (Shader Processor Input) or the PA (Primitive Assembly), the number of indices processed and reused by the VGT (Vertex Geometry Tessellator), the number of commands parsed/processed by the CPG/CPC, stalls in the CPG/CPC, the number of L2 reads/writes, and the L2 hit/miss rate. Those are just a few of the available performance counters I have access to. In addition to that, I have full documentation of the GCN architecture and I've developed several released games targeting it. Based on that, I have a pretty good picture of the strengths/weaknesses of the architecture, and I'm interested in hearing if you perhaps have some insight that I lack.

I am unable to verify any of that, nor does it take precedence over my own knowledge or qualifications. In short, it's irrelevant.

Straffaren666 said:

The geometry rate isn't really a bottleneck for GCN. Even if it were, geometry processing parallelizes quite well and could be addressed by increasing the number of VGTs. It won't be a problem in the future either, for two reasons: 1) The pixel rate will always be the limiting factor. 2) Primitive/mesh shaders give the graphics programmer the option to use the CUs' compute power to process geometry.

It's always been a bottleneck in AMD's hardware even going back to Terascale.

Straffaren666 said:

I asked you to specify the inherent flaws and bottlenecks in the GCN architecture that you claim prevent the PS5 from using more than 64 CUs, not AMD's marketing material about their GPUs. So again, can you please specify the "multitude of bottlenecks".

Bottlenecks (like geometry) have always been an Achilles heel of AMD GPU architectures, even back in the Terascale days.

nVidia was always on the ball once they introduced their Polymorph engines.

But feel free to enlighten me on why AMD's GPUs fall short despite their overwhelming advantage in single-precision floating-point throughput relative to their nVidia counterparts.

Straffaren666 said:

I'm not sure what you mean. It clearly says the area reduction is 70% and that there's a 60% reduction in power consumption. Pretty much in line with what I wrote. An area reduction of 70% would yield a density increase of 3.3x. Probably just a rounding issue.

Here are the links to TSMC's own numbers.

https://www.tsmc.com/english/dedicatedFoundry/technology/10nm.htm

https://www.tsmc.com/english/dedicatedFoundry/technology/7nm.htm

I was neither agreeing nor disagreeing with your claims; I just wanted evidence, out of my own curiosity, to take your claim seriously.

Density at any given node is always changing. Intel is on what... its 3rd or 4th iteration of 14nm? And each time, density has changed. Hence why it's important to do apples-to-apples comparisons.

As for TSMC's 10nm and 7nm comparisons... I would not be surprised if TSMC's 10nm process actually leveraged a 14nm BEOL... TSMC, Samsung, GlobalFoundries, etc. don't tend to do full-node (FEOL+BEOL) shrinks at the same time like Intel does.
The 7nm process likely leverages 10nm design rules...

But even then, TSMC's information on their 7nm process is likely SRAM-optimized at the moment, whereas their 10nm figures in the links you provided are not, which ultimately skews things in 7nm's favour, as SRAM is likely to need less patterning... And you can optimize for SRAM cells' relatively simple structures compared to more complex logic.
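For what it's worth, the area-reduction-to-density conversion from the quoted TSMC figures works out like this (a quick Python sketch; the 70% number is the one from the quote above):

# Converting an area reduction into a density multiplier: 1 / (1 - reduction)
area_reduction = 0.70              # 70% smaller area, per the quoted figure
density_gain = 1 / (1 - area_reduction)
print(f"Density increase: {density_gain:.2f}x")   # ~3.33x, matching the "3.3x" above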



Pemalite said:

 

Bofferbrauer2 said:

@bolded: We don't even know if that's a true chip (and at 20CU, I really doubt it, especially considering it will be totally bandwidth-starved even with DDR4 4000). But I digress.

DDR4 4000 can offer more bandwidth than HBM2. It is entirely about how wide you wish to take things... But before then, you reach a point where it's more economical to choose another technology anyway.

However... Considering that current Ryzen APUs with ~38GB/s of bandwidth are certainly bandwidth starved with 11 CUs... I doubt that is going to change with 20CU APUs that have ~68GB/s of bandwidth.

But if you were to run that DDR4 4000 DRAM on a 512-bit bus, suddenly we are talking 256GB/s of bandwidth, which is more than sufficient for even a 40 CU count.

Knowing you, I'm sure you are aware that connecting the RAM sticks with a 512-bit bus would need many more layers on the motherboard, thus making them hugely expensive, so not exactly an economical solution. It's, after all, also the reason why we're still stuck with just dual channel; quad channel would help iGPUs/APUs a lot but would make the boards much more expensive.

However, what I could see as possible would be reintroducing the sideport memory, in this case as a 2-4GB HBM stack functioning as an LLC. But for that I would have expected special boards for APUs, like a 540G and 560GX, similar to the 760G/790GX during the HD 4000 series. Just with a very fast sideport this time, please.



Bofferbrauer2 said:

Knowing you, I'm sure you are aware that connecting the RAM sticks with a 512-bit bus would need many more layers on the motherboard, thus making them hugely expensive, so not exactly an economical solution. It's, after all, also the reason why we're still stuck with just dual channel; quad channel would help iGPUs/APUs a lot but would make the boards much more expensive.

Hence why I said it would be more economical to choose another technology anyway. :P

Bofferbrauer2 said:

However, what I could see as possible would be reintroducing the sideport memory, in this case as a 2-4GB HBM stack functioning as an LLC. But for that I would have expected special boards for APUs, like a 540G and 560GX, similar to the 760G/790GX during the HD 4000 series. Just with a very fast sideport this time, please.

Ah, sideport. I had the Asrock M3A790GHX at one point with 128MB of DDR3 sideport memory. - But because it only clocked at 1200MHz, it only offered 4.8GB/s of bandwidth versus the system memory's 25.6GB/s of bandwidth... So the increase in performance was marginal at best. (I.e. a couple of percentage points.)

On older boards that ran with DDR2 memory topping out at 800MHz (12.8GB/s of bandwidth), the difference was certainly more pronounced.

On that Asrock board I got more of a performance kick from simply overclocking the IGP to 950MHz than from turning on sideport memory, but that is entirely down to the implementation.

In saying that, GPU performance has certainly outstripped the rate of system memory bandwidth increases... I mean, heck... The latest Ryzen notebooks are often still running dual-channel DDR4 @ 2400MHz, which is 38.4GB/s of bandwidth; not really a big step up over the Asrock's 25.6GB/s, is it? Yet the GPU is likely 50x more capable overall.

Sideport is great if implemented well and not on a narrow 16-bit bus. - And you wouldn't even need to use expensive HBM memory to get some big gains.
GDDR5 is cheap and plentiful, and on a 32-bit bus could offer 50GB/s, which combined with system memory (I assume by a striping method) would offer some tangible gains.
GDDR6 would be a step up again, where 75GB/s should be easy enough to hit... Ideally you would want around 100-150GB/s for decent 1080p gaming.

If they threw it onto a 64-bit bus, then that would double all of those rates, but I would imagine tracing would become an issue, especially on ITX/mATX boards, forcing the requirement of more PCB layers.
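To put rough numbers on the sideport example above, here's a quick Python sketch; the 4.8GB/s figure implies a 32-bit path at 1200MT/s, and the 25.6GB/s system figure is assumed to be dual-channel DDR3-1600:

# Sideport vs. system memory bandwidth, using the figures from the post above.
def bandwidth_gbs(mt_per_s, bus_width_bits):
    return mt_per_s * 1e6 * (bus_width_bits / 8) / 1e9

sideport = bandwidth_gbs(1200, 32)     # DDR3 sideport: ~4.8 GB/s (32-bit path implied)
system   = bandwidth_gbs(1600, 128)    # assumed dual-channel DDR3-1600: ~25.6 GB/s
print(sideport / system)               # ~0.19, i.e. under 20% extra bandwidth
print(bandwidth_gbs(1200, 64))         # a 64-bit bus would double the sideport to ~9.6 GB/s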



Pemalite said: 
Bofferbrauer2 said:

However, what I could see as possible would be reintroducing the sideport memory, in this case as a 2-4GB HBM stack functioning as an LLC. But for that I would have expected special boards for APUs, like a 540G and 560GX, similar to the 760G/790GX during the HD 4000 series. Just with a very fast sideport this time, please.

Ah, sideport. I had the Asrock M3A790GHX at one point with 128MB of DDR3 sideport memory. - But because it only clocked at 1200MHz, it only offered 4.8GB/s of bandwidth versus the system memory's 25.6GB/s of bandwidth... So the increase in performance was marginal at best. (I.e. a couple of percentage points.)

On older boards that ran with DDR2 memory topping out at 800MHz (12.8GB/s of bandwidth), the difference was certainly more pronounced.

On that Asrock board I got more of a performance kick from simply overclocking the IGP to 950MHz than from turning on sideport memory, but that is entirely down to the implementation.

In saying that, GPU performance has certainly outstripped the rate of system memory bandwidth increases... I mean, heck... The latest Ryzen notebooks are often still running dual-channel DDR4 @ 2400MHz, which is 38.4GB/s of bandwidth; not really a big step up over the Asrock's 25.6GB/s, is it? Yet the GPU is likely 50x more capable overall.

Sideport is great if implemented well and not on a narrow 16-bit bus. - And you wouldn't even need to use expensive HBM memory to get some big gains.
GDDR5 is cheap and plentiful, and on a 32-bit bus could offer 50GB/s, which combined with system memory (I assume by a striping method) would offer some tangible gains.
GDDR6 would be a step up again, where 75GB/s should be easy enough to hit... Ideally you would want around 100-150GB/s for decent 1080p gaming.

If they threw it onto a 64-bit bus, then that would double all of those rates, but I would imagine tracing would become an issue, especially on ITX/mATX boards, forcing the requirement of more PCB layers.

Yeah, I had sideport memory too on my Asus 880G (don't remember the exact model). And yeah, that 16-bit bus sucked, but it was good enough when my GPU failed.

I was talking about HBM simply for one reason: it has a very small footprint. Sure, GDDR5/6 would also do, but would certainly need more space on the board, which would make it hard to create such boards in ITX or STX sizes. Too bad HMC never got popular, as I think it would have been the best choice for this. And at bandwidths of up to 480GB/s, that would have sufficed for a long time in that regard.

Well, 38.4GB/s is almost 50% more than 25.6GB/s, but I get your point. Hence why DDR5 will start at DDR5-4400, way above the highest specified DDR4 memory (3200), and is supposed to go all the way up to DDR5-6400. iGPs/APUs will certainly benefit a lot from the bandwidth boost, as will some server applications. DDR5-6400 would also bring 51.2GB/s on a single channel, so over 100GB/s on a dual-channel interface, which should be enough for most integrated graphics, at least for now.

Just for comparison, Baffin (RX 460/560) has 112GB/s, so 100GB/s should be enough for 16 CUs @ 1200MHz (the RX 560 reaches 1300MHz) without choking. Still not enough for that rumored R5 3600G with 20 CUs, though, unless it were to clock at only 800MHz or so. And no chance even with DDR4-3200, which only delivers half of that bandwidth.
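Putting those figures side by side (a quick Python sketch; the DDR5-6400 spec and the Baffin number are the ones stated above):

# Dual-channel DDR4-3200 vs. DDR5-6400, compared against Baffin's 112 GB/s.
def dual_channel_gbs(mt_per_s):
    return mt_per_s * 1e6 * 16 / 1e9   # 2 x 64-bit channels = 16 bytes per transfer

ddr4_3200 = dual_channel_gbs(3200)     # ~51.2 GB/s
ddr5_6400 = dual_channel_gbs(6400)     # ~102.4 GB/s
baffin    = 112.0                      # RX 460/560 memory bandwidth in GB/s
print(ddr4_3200, ddr5_6400, baffin)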



Bofferbrauer2 said:

Yeah, I had sideport memory too on my Asus 880G (don't remember the exact model). And yeah, that 16-bit bus sucked, but it was good enough when my GPU failed.

Some boards had it on a 32-bit bus; it was a gamble depending on what model motherboard you got.

Bofferbrauer2 said:

I was talking about HBM simply for one reason: it has a very small footprint. Sure, GDDR5/6 would also do, but would certainly need more space on the board, which would make it hard to create such boards in ITX or STX sizes. Too bad HMC never got popular, as I think it would have been the best choice for this. And at bandwidths of up to 480GB/s, that would have sufficed for a long time in that regard.

Put it underneath the motherboard, like many boards do with NVMe drives.

Bofferbrauer2 said:

Well, 38.4GB/s is almost 50% more than 25.6GB/s, but I get your point. Hence why DDR5 will start at DDR5-4400, way above the highest specified DDR4 memory (3200), and is supposed to go all the way up to DDR5-6400. iGPs/APUs will certainly benefit a lot from the bandwidth boost, as will some server applications. DDR5-6400 would also bring 51.2GB/s on a single channel, so over 100GB/s on a dual-channel interface, which should be enough for most integrated graphics, at least for now.

Just for comparison, Baffin (RX 460/560) has 112GB/s, so 100GB/s should be enough for 16 CUs @ 1200MHz (the RX 560 reaches 1300MHz) without choking. Still not enough for that rumored R5 3600G with 20 CUs, though, unless it were to clock at only 800MHz or so. And no chance even with DDR4-3200, which only delivers half of that bandwidth.

That's the JEDEC spec anyway. G.Skill had DDR4 running at 5GHz last year.
https://www.tomshardware.co.uk/g-skill-first-ddr4-5066mhz-memory,news-58644.html

DDR5 can't happen soon enough; notebooks will lap up the bandwidth, that's for sure.



BraLoD said:
Robert_Downey_Jr. said:

But what about in a year and a half, at Sony's costs?

People tend to forget Sony/MS will be buying components in the millions of units from the get go.

If I buy 3 of something at the market, I can already get a discount. Imagine if I buy 5 million units at once and the vendor knows I may need to buy up to 100M over the next five years. Sony and MS probably get massive discounts on anything they put in their consoles, especially Sony, as PlayStation consoles are basically guaranteed to reach around 100M units made.

Massive discounts still don't erase the base cost to even make a product. High-end processors don't have that much wiggle room in cost, especially with demand and supply shortages due to foundries being full, etc. You'll get a discount, but it stops scaling at a point. Especially when it comes to the GPU tech proposed.



errorpwns said:
BraLoD said:

People tend to forget Sony/MS will be buying components in the millions of units from the get go.

If I buy 3 of something at the market, I can already get a discount. Imagine if I buy 5 million units at once and the vendor knows I may need to buy up to 100M over the next five years. Sony and MS probably get massive discounts on anything they put in their consoles, especially Sony, as PlayStation consoles are basically guaranteed to reach around 100M units made.

Massive discounts still don't erase the base cost to even make a product. High-end processors don't have that much wiggle room in cost, especially with demand and supply shortages due to foundries being full, etc. You'll get a discount, but it stops scaling at a point. Especially when it comes to the GPU tech proposed.

That's why in my following post I said you can't expect to buy gold for the price of silver.

Doesn't mean those massive companies doing massive deals can't get such components into their products for a whole other level of cheaper than what you as an end user would have to pay.

Especially when console-making companies are OK with taking losses on hardware sales to make a lot more money on software sales.

All in all, the next gen consoles will be much cheaper than what they would cost an end user to build with the same components.