
Forums - Nintendo Discussion - Clarifying the 1.5TFLOPS of the SWITCH for those who just see the numbers.

vivster said:
walsufnir said:

It isn't really faster just because you lose precision. It *could* be faster if

a) the algorithm or the computation doesn't need to be fp32 in all cases and

b) the hardware can store and use two 16 bit floating point numbers in the same time it would store and use one fp32.

Everyone is free to use fp16 right now but there is no real benefit for that.
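To make condition (a) concrete, here is a minimal Python sketch of what dropping from fp32 to fp16 costs in precision and range. It uses the standard library's `struct` module, whose `"e"` format is IEEE 754 half precision. The helper names `to_fp16` and `fits_fp16` are made up for illustration, and none of this says how any particular GPU behaves.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value.

    The result is returned as an ordinary Python float so it can be
    compared with the original value.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def fits_fp16(x: float) -> bool:
    """Hypothetical helper: True if x can be packed as a finite fp16 value."""
    try:
        struct.pack("<e", x)
        return True
    except (OverflowError, struct.error):
        return False


if __name__ == "__main__":
    # fp16 keeps about 3 decimal digits: 0.1 becomes 0.0999755859375.
    print(to_fp16(0.1))
    # Integers above 2048 are no longer all representable: 2049 rounds to 2048.
    print(to_fp16(2049.0))
    # The largest finite fp16 value is 65504, so 70000 does not fit.
    print(fits_fp16(65504.0), fits_fp16(70000.0))
```

Values such as colours or normalised directions usually tolerate this rounding, while large world coordinates or long running sums may not, which is why the choice has to be made per computation.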

Shouldn't it be easy to implement though? Developers mark the stuff that is FP16 and the GPU driver does the rest.

From an implementation point of view, yes. You just allocate a 16-bit register instead of a 32-bit one.

What I am wondering is if devs actually do that because that is extra work for a better PS4. Cerny said it should be as simple as possible to develop a pro mode off a ps4 game. Using fp16 just for a better version of the game and to test if accuracy is sufficient is a different beast. Perhaps you can use fp16 safely also on vanilla ps4 without any negatives but of course I don't know about that. But it doesn't seem that easy to really go into all the power the Neo provides over the vanilla ps4, especially when there are no exclusives allowed.



walsufnir said:
vivster said:

Shouldn't it be easy to implement though? Developers mark the stuff that is FP16 and the GPU driver does the rest.

From an implementation point of view, yes. You just allocate a 16-bit register instead of a 32-bit one.

What I am wondering is if devs actually do that because that is extra work for a better PS4. Cerny said it should be as simple as possible to develop a pro mode off a ps4 game. Using fp16 just for a better version of the game and to test if accuracy is sufficient is a different beast. Perhaps you can use fp16 safely also on vanilla ps4 without any negatives but of course I don't know about that. But it doesn't seem that easy to really go into all the power the Neo provides over the vanilla ps4, especially when there are no exclusives allowed.

Wait, why are we talking about the Pro now?^^

For the Switch it could certainly make a difference with that limited amount of power. Even if 3rd parties don't use it, Nintendo probably will.



If you demand respect or gratitude for your volunteer work, you're doing volunteering wrong.

vivster said:
walsufnir said:

From an implementation point of view, yes. You just allocate a 16-bit register instead of a 32-bit one.

What I am wondering is if devs actually do that because that is extra work for a better PS4. Cerny said it should be as simple as possible to develop a pro mode off a ps4 game. Using fp16 just for a better version of the game and to test if accuracy is sufficient is a different beast. Perhaps you can use fp16 safely also on vanilla ps4 without any negatives but of course I don't know about that. But it doesn't seem that easy to really go into all the power the Neo provides over the vanilla ps4, especially when there are no exclusives allowed.

Wait, why are we talking about the Pro now?^^

For the Switch it could certainly make a difference with that limited amount of power. Even if 3rd parties don't use it, Nintendo probably will.

Oh yes, thread confusion :)

I think it can be a major problem with performance for 3rd party devs because porting existing games will result in quite low performance if they don't adjust their code for fp16 where suitable.



Well, for one, we don't even know if the Switch will hit 750 Gflops even in FP32. I doubt Nintendo is going to aim for the 15 - 20 watts this device would need to hit that number. I'm guessing they will lower the clock rate to improve battery life. Also, even in docked form I feel the form factor is too small to handle a 15 - 20 watt TDP.

Lastly, that SoC will be heavily bottlenecked by its 3GB of LPDDR4 RAM and 25 - 50GB/s of memory bandwidth, so even if it did hit 750 Gflops it wouldn't be performing at its full potential.
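The bandwidth point can be put into rough numbers with the roofline model, which the post does not name but which captures the same idea: attainable throughput is the smaller of the compute peak and bandwidth times the work done per byte fetched. The sketch below uses the 750 Gflops and 25 - 50GB/s figures from the post; the function names and the 4 FLOP/byte example workload are assumptions for illustration only.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gb_s: float,
                      flops_per_byte: float) -> float:
    """Roofline estimate: compute-bound or bandwidth-bound, whichever is lower."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


def break_even_intensity(peak_gflops: float, bandwidth_gb_s: float) -> float:
    """FLOPs a kernel must perform per byte of memory traffic to reach the peak."""
    return peak_gflops / bandwidth_gb_s


if __name__ == "__main__":
    peak = 750.0  # Gflops, FP32 figure discussed in the thread
    for bandwidth in (25.0, 50.0):  # GB/s, range quoted in the post
        need = break_even_intensity(peak, bandwidth)
        # Hypothetical workload doing 4 FLOPs per byte read from memory.
        got = attainable_gflops(peak, bandwidth, flops_per_byte=4.0)
        print(f"{bandwidth:.0f} GB/s: need {need:.0f} FLOP/byte, "
              f"a 4 FLOP/byte kernel reaches {got:.0f} Gflops")
```

Under these assumptions a kernel would need 15 to 30 FLOPs per byte of memory traffic before the 750 Gflops figure matters at all. Anything less intensive is limited by bandwidth, not by the ALUs.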



That's a great explanation! Thanks! Although it really dampens my hopes for the Switch :P 4GB RAM and 750 Gflops? Meh.



I'm on Twitter @DanneSandin!

Furthermore, I think VGChartz should add a "Like"-button.

walsufnir said:
Zkuq said:
I'm no expert on graphics or game programming, but I'm under the impression that there's quite a limited range of situations where half-precision FP numbers are usable. Any experts capable of shedding some light on this? How much of a boost does FP16 give over FP32 without too much noticeable degradation in image quality (or other areas) in a typical modern game?

It isn't really faster just because you lose precision. It *could* be faster if

a) the algorithm or the computation doesn't need to be fp32 in all cases and

b) the hardware can store and use two 16 bit floating point numbers in the same time it would store and use one fp32.

Everyone is free to use fp16 right now but there is no real benefit for that.

I'm familiar with that much, and my question was more about the end result (i.e. the game). Thanks anyway.



Anything strong enough to ensure support for the NX. They only need something in the level of the Xbox One or a little more powerful.



Nexus7 said:

Well, for one, we don't even know if the Switch will hit 750 Gflops even in FP32. I doubt Nintendo is going to aim for the 15 - 20 watts this device would need to hit that number. I'm guessing they will lower the clock rate to improve battery life. Also, even in docked form I feel the form factor is too small to handle a 15 - 20 watt TDP.

Lastly, that SoC will be heavily bottlenecked by its 3GB of LPDDR4 RAM and 25 - 50GB/s of memory bandwidth, so even if it did hit 750 Gflops it wouldn't be performing at its full potential.

Actually, it looks like the Parker Tegra might 'easily' hit 750 GFLOP/s with a 10-12W TDP, or at least with an actual power consumption of 10W or less.

 

We might see half power and half resolution in mobile mode; I wouldn't bother, though.



The people who think it's 1.5 Tflops won't care or believe it's only 0.75 Tflops.

But thank you for proving that I was right again.

Anyway, it's going to be a powerful handheld. I hope it's powerful enough to get good third-party support :)
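For anyone who wants to see how the 1.5 and 0.75 Tflops figures relate, here is a rough sketch. It assumes fp16 runs at exactly twice the fp32 rate, as discussed earlier in the thread, and that a game executes some fraction of its operations in fp16. The function name, the harmonic-mean model and the example fractions are assumptions, not measurements of any real hardware or game.

```python
def effective_gflops(fp32_peak_gflops: float, fp16_fraction: float,
                     fp16_speedup: float = 2.0) -> float:
    """Effective peak when only part of the work can run in fp16.

    Time per operation is averaged over the fp32 and fp16 shares
    (an Amdahl-style estimate), then inverted to give a rate.
    """
    if not 0.0 <= fp16_fraction <= 1.0:
        raise ValueError("fp16_fraction must be between 0 and 1")
    fp16_peak = fp32_peak_gflops * fp16_speedup
    time_per_op = ((1.0 - fp16_fraction) / fp32_peak_gflops
                   + fp16_fraction / fp16_peak)
    return 1.0 / time_per_op


if __name__ == "__main__":
    fp32_peak = 750.0  # Gflops, the FP32 figure from the thread
    for fraction in (0.0, 0.5, 1.0):  # hypothetical shares of fp16 work
        tflops = effective_gflops(fp32_peak, fraction) / 1000.0
        print(f"{fraction:.0%} of ops in fp16 -> {tflops:.2f} Tflops")
```

In this model the headline 1.5 Tflops only appears when every single operation runs in fp16. With half the work in fp16 the estimate is about 1.0 Tflops, and with none it is the plain 0.75 Tflops.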