
Forums - PC - NVIDIA reveals Volta next next-gen GPU platform - After Maxwell 1TB/s Memory bandwidth from stacked DRAM

Netyaroze said:

I think Nvidia has to focus more on raw number crunching if they want to build exascale supercomputers someday. They need to step up their game, IMO.


I guess it's a possibility. We'll have to wait and see whether the possible gains outweigh the cost of finding a way to theoretically double their FP performance without compromising other requirements such as die size or power consumption.



 

 

 

 

 

Netyaroze said:

Apparently it must be possible to stack RAM and spread the heat, or else stacking couldn't be done.

Which is why I wrote this in my first post:

"and if stacked RAM becomes reality we can look forward to a smaller, quieter, cheaper PS4."

I don't think we will see stacked RAM in a future PS4. It's not worth the effort: what you win in surface area you lose in height, essentially. By the time this becomes an option, we are looking at a potential PS5. The potential savings are in the SoC, which might be a multi-die assembly for the first year(s). This 28nm stuff is brand new (at least for GF, a potential manufacturer) and untested, so yields are probably very low on the CPU and GPU dies. If you have a single large die and the CPU part is bad, you have to throw away the GPU part as well, and vice versa. If you have two separate dies, you only have to throw away the defective die, not both.
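To put rough numbers on the yield argument above, here is a minimal sketch using the simple Poisson yield model (yield = exp(-area x defect density)). The die areas and the defect density are made-up illustrative values, not figures for any real PS4 part or foundry process.

```python
import math

def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of dies with zero defects under a simple Poisson model."""
    return math.exp(-area_mm2 * defects_per_mm2)

def wasted_fraction_monolithic(cpu_mm2: float, gpu_mm2: float, d: float) -> float:
    """Share of silicon scrapped when CPU and GPU sit on one die.

    A defect anywhere kills the whole die, so both parts are lost together.
    """
    return 1.0 - poisson_yield(cpu_mm2 + gpu_mm2, d)

def wasted_fraction_split(cpu_mm2: float, gpu_mm2: float, d: float) -> float:
    """Share of silicon scrapped when CPU and GPU are separate dies.

    Only the die that actually has a defect is thrown away.
    """
    bad_cpu = cpu_mm2 * (1.0 - poisson_yield(cpu_mm2, d))
    bad_gpu = gpu_mm2 * (1.0 - poisson_yield(gpu_mm2, d))
    return (bad_cpu + bad_gpu) / (cpu_mm2 + gpu_mm2)

# Hypothetical values: 100 mm^2 CPU, 250 mm^2 GPU, immature process.
CPU_MM2, GPU_MM2, DEFECTS_PER_MM2 = 100.0, 250.0, 0.005

mono = wasted_fraction_monolithic(CPU_MM2, GPU_MM2, DEFECTS_PER_MM2)
split = wasted_fraction_split(CPU_MM2, GPU_MM2, DEFECTS_PER_MM2)
print(f"monolithic die: {mono:.0%} of silicon scrapped")
print(f"two dies:       {split:.0%} of silicon scrapped")
```

The model ignores packaging cost and the extra interconnect a multi-die assembly needs, so it only illustrates the direction of the effect, not its real size.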

If you look at any PC DIMM that has a heat spreader, you'll notice that the heat spreader is not in contact with the RAM chips but with the small PCB the chips are soldered onto. RAM chips are basically cooled through their pins: the heat flows through the pins into the PCB, and barely any leaves through convection. So if you have stacked RAM, you just ensure the pins are "large enough" (whatever that means in the end) and soldered onto a "large" heat spreader. Better airflow around the heat spreader will do the job (if I look at my board with four DDR3-1600 DIMMs, the airflow is pretty shitty but the heat spreaders apparently still do the job).



dahuman said:


It'd be an actual chip redesign, it's not just slapping RAM on top of the GPU and calling it good ^_^; I think they'll come up with better cooling solutions down the line and maybe do the stacking on the side to reduce overall size but I don't think they'll tie the RAM to the chip directly.

If you look at the schematics, you clearly see the stacked RAM is NOT on the Volta GPU. I think the first chip to actually "stack" will be a future (lower-power) Intel CPU. Technologically, Intel is maybe 16-18 months ahead of TSMC and probably 24-26 months ahead of GF.



drkohler said:

I don't think we will see stacked RAM in a future PS4. It's not worth the effort: what you win in surface area you lose in height, essentially. By the time this becomes an option, we are looking at a potential PS5. The potential savings are in the SoC, which might be a multi-die assembly for the first year(s). This 28nm stuff is brand new (at least for GF, a potential manufacturer) and untested, so yields are probably very low on the CPU and GPU dies. If you have a single large die and the CPU part is bad, you have to throw away the GPU part as well, and vice versa. If you have two separate dies, you only have to throw away the defective die, not both.

If you look at any PC DIMM that has a heat spreader, you'll notice that the heat spreader is not in contact with the RAM chips but with the small PCB the chips are soldered onto. RAM chips are basically cooled through their pins: the heat flows through the pins into the PCB, and barely any leaves through convection. So if you have stacked RAM, you just ensure the pins are "large enough" (whatever that means in the end) and soldered onto a "large" heat spreader. Better airflow around the heat spreader will do the job (if I look at my board with four DDR3-1600 DIMMs, the airflow is pretty shitty but the heat spreaders apparently still do the job).

 

When you say it's not worth the effort, do you basically just mean the space the RAM takes up?

 

But GDDR5 produces more heat and eats up a lot more power than DDR3 or DDR4. If this could be reduced by, let's say, 20 Watts through stacked RAM, the power supply could be made smaller and cheaper, as could the cooling and the board. The number of chips might be lower too, since DDR3/4 densities are easier to increase than GDDR5's.

Of course, if it's not worth the effort Sony won't do it. But we don't know whether it might be worth the effort someday in the future.

Sure, the guaranteed savings are in the SoC, but stacking could come out cheaper than using GDDR5 one day, if the process is proven and widespread. The PS2 got redesigned after the PS3 was on the market, so who knows.

 

 



My biggest problem is that there is absolutely nothing that needs this sort of power, and creating a game that uses it would be cost prohibitive. This is why we are still stuck on Crysis-level graphics despite having the power to make Crysis cry.




CityOfNoobs said:
My biggest problem is that there is absolutely nothing that needs this sort of power, and creating a game that uses it would be cost prohibitive. This is why we are still stuck on Crysis-level graphics despite having the power to make Crysis cry.


It will be pretty simple to use up all the bandwidth if game developers can target Volta specs as a baseline, through high-res textures and high-quality assets. The assets are already high quality but are downscaled for use in games. Even without changing anything, all that power will be used up easily.

4K gaming at 60fps should become reality with Volta. I am pretty sure 3D 4K next-gen console titles with texture mods will make a Volta SLI system necessary. Game budgets won't even be influenced by that.

 

 

 

 



It's totally exciting - we are in a new era of tech advancing super fast again. IMO, things have been rather stalled the last 6-7 years, as tech was moving toward efficiency rather than raw power and speed. It's kinda sad in a way that new consoles are coming out right at the beginning of this, as in just a few years things will be 10-100 times more powerful.

Still, I will do my best to enjoy it now, and also when the new stuff comes along.



 

Really not sure I see any point in consoles over PCs since Kinect, Wii and other alternative ways to play have been abandoned.

Top 50 'most fun' game list coming soon!

 

Tell me a funny joke!

Netyaroze said:
Slimebeast said:
Netyaroze said:
Volta will have like 6-8 Tflops?

Nvidia said some time ago that in 2019 they will have 40 Tflops in a 100 Watt GPU. That's like 3 shrinks away. Volta: 20nm, xx: 16nm, xxx: 10nm

That 1 TB/s bandwidth might soon be necessary, and if stacked RAM becomes reality we can look forward to a smaller, quieter, cheaper PS4.

2020-2025 Avatar realtime will hopefully become reality.

How does that projected number, 40 Tflops, correspond to the manufacturing process of 10nm? I have forgotten the maths behind die shrinks (the correlation between process shrink and the number of transistors and power consumption). Could a shrink from 28nm (today) to 10nm really allow an increase of around 20x the processing power that we get today from a 100W GPU?

 

It's roughly 8-9 times more transistors from 28nm to 10nm. So you could theoretically get an 8-9 times increase in raw GPU performance just by fitting the equivalent of 8-9 28nm GPUs into the same die area. But there is no simple math for that anymore; development isn't as straightforward as it once was.

It also depends on architecture. Many transistors have nothing to do with flop performance, which comes from the cores. And I doubt Nvidia needs 9 times more of everything, so they will probably aim to put a higher percentage of transistors to use for flop performance than in current GPUs. A 20-fold increase in Tflops from 28nm to 10nm is possible; it all depends on the architecture, the wafer quality, etc. It could be anywhere, really; without inside knowledge it's impossible to say. Nvidia has always roughly reached its goals, so a 30-40 Tflop, 100-120 Watt GPU can be expected in 2019-2022.

 

 

 

So 8-9 times by going from 28nm to 10nm, but what about from 28nm to 20nm which is around the corner? What sort of increase does that give in raw performance (ignoring any architectural advances)?



Zappykins said:

It's totally exciting - we are in a new era of tech advancing super fast again. IMO, things have been rather stalled the last 6-7 years, as tech was moving toward efficiency rather than raw power and speed. It's kinda sad in a way that new consoles are coming out right at the beginning of this, as in just a few years things will be 10-100 times more powerful.

Still, I will do my best to enjoy it now, and also when the new stuff comes along.

The first card to come out utilizing this will be launched in 2016 at the earliest, and it will probably be reserved for their $500 and above cards only at first. By the time this becomes more mainstream we will be talking about PS5 anyway.

Also just to compare:

The X360's "Xenos" GPU has a bandwidth of 22.8 GB/s. The PS4 will, if it's indeed based on HD78xx technology, have about 153 GB/s of bandwidth. So that's about 6.7 times as much. Going from 153 GB/s to 1000 GB/s would be an increase of about 6.5 times. So no, technology advancement is not speeding up.
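A quick check of the two generation-to-generation ratios, using only the bandwidth figures quoted in this post (the PS4 number is the poster's HD78xx-based estimate, not a confirmed spec):

```python
# Memory bandwidth figures in GB/s, as quoted in the post.
XENOS_GBPS = 22.8    # Xbox 360 "Xenos" GPU
PS4_GBPS = 153.0     # PS4 estimate, assuming HD78xx-class memory
VOLTA_GBPS = 1000.0  # Nvidia's stated 1 TB/s target for Volta

def ratio(new_gbps: float, old_gbps: float) -> float:
    """How many times larger the new bandwidth is than the old one."""
    return new_gbps / old_gbps

print(f"Xenos -> PS4:  {ratio(PS4_GBPS, XENOS_GBPS):.1f}x")
print(f"PS4 -> Volta:  {ratio(VOLTA_GBPS, PS4_GBPS):.1f}x")
```

Both jumps come out at roughly the same size, which is the point of the comparison.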



Slimebeast said:
Netyaroze said:
Slimebeast said:
Netyaroze said:
Volta will have like 6-8 Tflops?

Nvidia said some time ago that in 2019 they will have 40 Tflops in a 100 Watt GPU. That's like 3 shrinks away. Volta: 20nm, xx: 16nm, xxx: 10nm

That 1 TB/s bandwidth might soon be necessary, and if stacked RAM becomes reality we can look forward to a smaller, quieter, cheaper PS4.

2020-2025 Avatar realtime will hopefully become reality.

How does that projected number, 40 Tflops, correspond to the manufacturing process of 10nm? I have forgotten the maths behind die shrinks (the correlation between process shrink and the number of transistors and power consumption). Could a shrink from 28nm (today) to 10nm really allow an increase of around 20x the processing power that we get today from a 100W GPU?

 

It's roughly 8-9 times more transistors from 28nm to 10nm. So you could theoretically get an 8-9 times increase in raw GPU performance just by fitting the equivalent of 8-9 28nm GPUs into the same die area. But there is no simple math for that anymore; development isn't as straightforward as it once was.

It also depends on architecture. Many transistors have nothing to do with flop performance, which comes from the cores. And I doubt Nvidia needs 9 times more of everything, so they will probably aim to put a higher percentage of transistors to use for flop performance than in current GPUs. A 20-fold increase in Tflops from 28nm to 10nm is possible; it all depends on the architecture, the wafer quality, etc. It could be anywhere, really; without inside knowledge it's impossible to say. Nvidia has always roughly reached its goals, so a 30-40 Tflop, 100-120 Watt GPU can be expected in 2019-2022.

 

 

 

So 8-9 times by going from 28nm to 10nm, but what about from 28nm to 20nm which is around the corner? What sort of increase does that give in raw performance (ignoring any architectural advances)?

28nm to 20nm is double. It depends on how good the process turns out; it's really hard to tell.
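For reference, the "double" figure matches the textbook idealisation that transistor density scales with the inverse square of the feature size. A minimal sketch under that assumption (real processes deviate from it, and node names are partly marketing labels):

```python
def ideal_density_gain(old_nm: float, new_nm: float) -> float:
    """Ideal transistor-density gain from a shrink.

    Assumes both linear dimensions scale with the node name, so area per
    transistor scales with the square of the feature size.
    """
    return (old_nm / new_nm) ** 2

for old_nm, new_nm in [(40, 28), (28, 20), (28, 10)]:
    gain = ideal_density_gain(old_nm, new_nm)
    print(f"{old_nm}nm -> {new_nm}nm: about {gain:.1f}x transistors per mm^2")
```

Under this idealisation 28nm to 20nm gives about 2x and 28nm to 10nm a little under 8x, which is in the same ballpark as the 8-9x quoted earlier in the thread.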

 

Let's compare 40nm vs 28nm, with die size and wattage included.

Top-end 40nm: GTX 580, 244 Watts, die size 520mm2, 1.6 Tflops

Top-end 28nm: GTX 680, 195 Watts, die size 294mm2, 3 Tflops

So they doubled the Tflops within one shrink. But if you look at wattage and die size, the GTX 680 is not the equivalent of a GTX 580. That would actually be this:

Real top-end 28nm: GeForce Titan, 250 Watts, die size 560mm2, 4.5 Tflops

 

 

They almost tripled the flops within one die shrink. 28nm to 20nm should see about the same increase.

My guess is that the GTX 780 will be 6 Tflops and a Super Titan 9-10 Tflops. Titan would have more flops if the whole chip were enabled.
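The 40nm vs 28nm comparison above can be normalised per Watt and per mm2 to see why the Titan, not the GTX 680, is the fair counterpart to the GTX 580. This sketch uses only the rounded figures quoted in this post; the `Gpu` class is just a hypothetical container for them.

```python
from dataclasses import dataclass

@dataclass
class Gpu:
    name: str
    node_nm: int
    watts: float
    die_mm2: float
    tflops: float

    @property
    def gflops_per_watt(self) -> float:
        return self.tflops * 1000 / self.watts

    @property
    def gflops_per_mm2(self) -> float:
        return self.tflops * 1000 / self.die_mm2

# Rounded numbers as quoted in the post above.
GPUS = [
    Gpu("GTX 580", 40, 244, 520, 1.6),
    Gpu("GTX 680", 28, 195, 294, 3.0),
    Gpu("GeForce Titan", 28, 250, 560, 4.5),
]

baseline = GPUS[0]
for gpu in GPUS:
    print(
        f"{gpu.name:14s} {gpu.node_nm}nm  "
        f"{gpu.gflops_per_watt:5.1f} GFLOPS/W  "
        f"{gpu.gflops_per_mm2:5.1f} GFLOPS/mm^2  "
        f"{gpu.tflops / baseline.tflops:.2f}x the GTX 580"
    )
```

At similar power and die size, the Titan lands at a bit under 3x the GTX 580's flops, which is where the "almost tripled" figure comes from.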