
Clarifying the 1.5TFLOPS of the SWITCH for those who just see the numbers.

There are a lot of rumors floating around about the 1.5TFLOPS number, but they all tell only half of the story. You could say they are only "half precise" ;)

This often-quoted number refers to Parker's FP16 performance. In layman's terms, FP16 stands for "half precision", which means that the GPU will process certain information less reliably. In games, in contrast to graphics programs, precision is not important in every task. That means it's possible to use only half the precision and still get a good enough result.

There is also FP32, which is the normal "single precision", and FP64, which is "double precision". The latter is usually used in professional graphics programs, while the former is used for most other stuff and is also usually what people mean when they talk about FLOPS.
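To make the three formats a bit more concrete, here is a minimal sketch that stores the same number in 16, 32 and 64 bits and reads it back. It uses only Python's standard `struct` module; the helper name `round_trip` and the test value 0.1 are picked purely for illustration and say nothing about how a GPU handles this internally.

```python
import struct

def round_trip(value, fmt):
    """Store value in the given IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

value = 0.1
half = round_trip(value, "<e")    # FP16, half precision (2 bytes)
single = round_trip(value, "<f")  # FP32, single precision (4 bytes)
double = round_trip(value, "<d")  # FP64, double precision (8 bytes)

# The fewer bits a format has, the further the stored number
# drifts from the value we asked for.
for name, stored in (("FP16", half), ("FP32", single), ("FP64", double)):
    print(f"{name}: {stored!r} (off by {abs(stored - value):.2e})")
```

The FP16 result is only good to about three decimal digits, which is plenty for some tasks and not nearly enough for others.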

GPUs use FP16 to their advantage in games because they can calculate certain tasks at double the speed, since the results only need to be half as precise.

However, you cannot use this way of processing for every single task in a game. That's where we fall back to FP32, which slows down the process.
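As a rough back-of-the-envelope model of that fallback (not a benchmark), you can estimate the throughput of a mixed workload from the share of operations that can run in FP16. The function name `effective_throughput`, the 2x FP16 speedup and the example fractions are all assumptions made for illustration.

```python
def effective_throughput(fp32_rate, fp16_fraction, fp16_speedup=2.0):
    """Estimate throughput when only part of the work can use FP16.

    fp32_rate:     throughput when everything runs in FP32
    fp16_fraction: share of operations (0.0 to 1.0) that can use FP16
    fp16_speedup:  how much faster FP16 runs than FP32 (assumed 2x)
    """
    if not 0.0 <= fp16_fraction <= 1.0:
        raise ValueError("fp16_fraction must be between 0 and 1")
    # Average time per operation, relative to a pure FP32 workload.
    time_per_op = (1.0 - fp16_fraction) + fp16_fraction / fp16_speedup
    return fp32_rate / time_per_op

# Normalised so that a pure FP32 workload has a rate of 1.0.
for fraction in (0.0, 0.5, 1.0):
    print(fraction, effective_throughput(1.0, fraction))
```

Only a workload that runs entirely in FP16 reaches the doubled figure; anything mixed lands somewhere in between.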

Now the Switch's mystical number of 1.5TFLOPS refers to FP16. The FP32 performance is logically half of that, 750GFLOPS. For comparison, the PS4 sports 1.84TFLOPS in FP32.
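The arithmetic behind those numbers, as a small sketch. The 1.5 and 1.84 figures are the ones quoted in this post (the Switch one being a rumor), and the 2:1 ratio between FP16 and FP32 is the assumption the whole comparison rests on.

```python
SWITCH_FP16_TFLOPS = 1.5  # rumored figure quoted in this thread
PS4_FP32_TFLOPS = 1.84    # figure quoted in this thread
FP16_PER_FP32 = 2         # assumption: FP16 runs at twice the FP32 rate

switch_fp32_tflops = SWITCH_FP16_TFLOPS / FP16_PER_FP32
share_of_ps4 = switch_fp32_tflops / PS4_FP32_TFLOPS

print(f"Switch FP32: {switch_fp32_tflops * 1000:.0f} GFLOPS")
print(f"Share of PS4 FP32: {share_of_ps4:.0%}")
```

Comparing like with like (FP32 against FP32), the rumored figure works out to roughly 41% of the PS4 number.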

 

I hope this clears up a few things. I'm not an expert in these things so if I got something completely wrong, someone please correct me.

https://en.wikipedia.org/wiki/Tegra#Tegra_P1



If you demand respect or gratitude for your volunteer work, you're doing volunteering wrong.


Hopefully not too many people needed this clarification. A mixture of hope and limited technical knowledge is a dangerous thing.



Barkley said:

Hopefully not too many people needed this clarification. A mixture of hope and limited technical knowledge is a dangerous thing.

unfortunately, you'd be surprised...



I think it's good enough



Roronaa_chan said:
I think it's good enough

The FLOPS or my clarification?^^

I think both are.




Nice explanation. 😂😂😂



vivster said:

 

This often-quoted number refers to Parker's FP16 performance. In layman's terms, FP16 stands for "half precision", which means that the GPU will process certain information less reliably. In games, in contrast to graphics programs, precision is not important in every task. That means it's possible to use only half the precision and still get a good enough result.

 

Just a correction: you are not entirely correct about this. Precision is not how accurately the chip does the calculation but how big the values in the instructions are, at least when you are dealing with programming and mathematics. Precision is how "big" the number is. Putting it in decimal terms, if you are figuring a percentage, in some cases .16 (16%) will work well enough, but depending on different factors you may need more precision, .1641 (16.41%). How reliable the result is comes down to being on "target", which depends on the information you are putting into the equation. You can be very precise, but if your information isn't on target you will not get the right results. There was an entire chapter devoted to this when I took advanced chemistry in high school, including how to determine your precision. At some point extra precision is just useless if the information and calculations aren't that precise to begin with, which may affect the target as well.
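A tiny sketch of that precise-versus-on-target distinction, using the 16.41% figure from above. The value 0.2041 is a made-up faulty input, used only for illustration.

```python
true_share = 0.1641  # the 16.41% figure from the post

# Less precise, but on target: keep only two decimal places.
coarse = round(true_share, 2)

# More precise, but off target: four decimal places of a bad input.
# 0.2041 is a made-up faulty measurement, used only for illustration.
precise_but_wrong = 0.2041

coarse_error = abs(coarse - true_share)
wrong_error = abs(precise_but_wrong - true_share)

print(f"two digits, good input:  {coarse} (error {coarse_error:.4f})")
print(f"four digits, bad input:  {precise_but_wrong} (error {wrong_error:.4f})")
```

The two-digit value ends up closer to the truth than the four-digit one, because extra digits don't help when the input itself is off.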

Going back on topic: FP16, floating point 16, refers to floating-point instructions whose operands are at most 16 bits wide, and FP32 to ones with operands of up to 32 bits. For some procedures you only need FP16, and if you optimize correctly you can run two instructions concurrently, depending on the chip. From how nVidia is advertising this chip, that seems to be the case. So in some cases, if you don't need 32-bit instructions, you can process instructions faster this way.
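Here is a small sketch of why two FP16 values can ride along where one FP32 value would go: they take up exactly the same number of bits. This only shows the storage side, using Python's standard `struct` module; whether a given chip actually processes both halves at once is a hardware question this code can't answer.

```python
import struct

# Two FP16 values packed side by side ("e" is the half-precision format code).
two_halves = struct.pack("<2e", 1.5, -0.25)
# One FP32 value ("f" is the single-precision format code).
one_single = struct.pack("<f", 1.5)

print(len(two_halves), len(one_single))  # both are 4 bytes

# Unpacking gives the two half-precision values back unchanged,
# because 1.5 and -0.25 are exactly representable in FP16.
print(struct.unpack("<2e", two_halves))
```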

This is one of the reasons why you don't want to look at just FLOPS, or any other single number, in isolation. I've seen someone say that the Xbox One runs at twice the speed if you do it in FP16, which may not be the case, since the chip might not be able to do two FP16 instructions at the same time, even though it may accept them.



I don't really know anything about hardware specs tbh. I know about good games, and that's what I really want... Switch will succeed with a good catalogue. Imo it also has to be cheap to succeed, so I really don't mind if the console is underpowered in the end.



What's up with these clarification threads lately? Do we need a sticky post for technical things we can refer to?
I would gladly help in providing explanations.



This was actually a pretty clear explanation, especially for someone like me who never bothered to look up the difference.