Lafiel said:
Bofferbrauer2 said:

That 0.79 TFlops figure is almost certainly FP16, aka half precision, while the 0.393 is FP32, aka full or single precision. The XBO and PS4 GPUs are from a time when half precision was seldom used, so they can't run two half-precision calculations in place of one full-precision calculation.

In other words, the Switch has about one quarter of the XBO's performance in FP32, but rises to about half of it when half precision is in use. An upgrade to an X2 on 10 or 7nm should be able to match the XBO in half precision without taxing the battery too much, but not yet in FP32.

Unfortunately, from what I've read on the topic (it also came up with the PS4 Pro), there doesn't seem to be much use for half-precision calculations in video games. Certainly not enough to speed up the code by a factor of 2; maybe a 10-20% gain is realistic if it's used to its full extent.

Yeah, that's mostly correct. Mobile games make extensive use of FP16 because their GPUs are just too weak otherwise, but PC/console games use it much less, though the amount seems to be slowly rising.
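
For anyone who wants to check the arithmetic in the quoted post, here is a rough sketch in Python. The two Switch figures are the ones from the quote. The 1.31 TFlops FP32 figure for the XBO is my assumption (it isn't stated in this thread), and the function names are made up for illustration. The speedup estimate is a simple Amdahl-style model that assumes FP16 runs at exactly twice the FP32 rate and nothing else is the bottleneck, which real games won't match exactly.

```python
# Figures in TFlops. The Switch numbers come from the quoted post;
# the XBO number is an assumption, not something stated in the thread.
SWITCH_FP32 = 0.393
SWITCH_FP16 = 0.79
XBO_FP32 = 1.31  # assumed


def throughput_ratio(a_tflops, b_tflops):
    """Return how large a is relative to b (1.0 means equal)."""
    return a_tflops / b_tflops


def fp16_speedup(fp16_fraction, fp16_rate=2.0):
    """Amdahl-style estimate of the overall speedup.

    fp16_fraction: share of the GPU work (0.0 to 1.0) that can run in FP16.
    fp16_rate: how much faster FP16 runs than FP32 (2.0 = double rate).
    """
    if not 0.0 <= fp16_fraction <= 1.0:
        raise ValueError("fp16_fraction must be between 0 and 1")
    remaining_time = (1.0 - fp16_fraction) + fp16_fraction / fp16_rate
    return 1.0 / remaining_time


if __name__ == "__main__":
    print(f"Switch FP32 vs XBO: {throughput_ratio(SWITCH_FP32, XBO_FP32):.2f}")
    print(f"Switch FP16 vs XBO: {throughput_ratio(SWITCH_FP16, XBO_FP32):.2f}")
    for share in (0.2, 1 / 3, 1.0):
        print(f"{share:.0%} of work in FP16 -> {fp16_speedup(share):.2f}x")
```

With those inputs the ratios come out at roughly 0.30 and 0.60, so in the same ballpark as the "one quarter" and "half" above. Under this model a 10-20% gain corresponds to somewhere between a fifth and a third of the GPU work actually running in FP16, and you only get the full 2x if everything does.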