Bofferbrauer2 said: That 0.79 TFlops is most certainly in FP16, aka half precision, while the 0.393 is in FP32, aka full or single precision. The XBO and PS4 GPUs are from a time when half precision was seldom used, and thus are unable to do two half-precision calculations in place of one full-precision calculation.
Unfortunately, from what I've read about the topic, there doesn't seem to be all that much use for half-precision calculations in video games (it also came up with the PS4 Pro). Certainly not enough to speed up the code by a factor of 2; maybe a 10-20% gain is realistic if it's used to its full extent.
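
To put rough numbers on that, here's a quick back-of-the-envelope sketch in Python. It assumes FP16 runs at exactly twice the FP32 rate (which is where 0.393 vs. 0.79 comes from) and uses a simple Amdahl's-law style model where only a fraction of the GPU work can be moved to half precision. The function names and the fractions are my own illustration, not measured data from any game.

```python
def peak_tflops(fp32_tflops, fp16_rate=2.0):
    """Peak FP16 throughput, assuming FP16 runs at fp16_rate times FP32."""
    return fp32_tflops * fp16_rate


def overall_speedup(fp16_fraction, fp16_rate=2.0):
    """Amdahl-style estimate: only fp16_fraction of the work gets faster.

    fp16_fraction is the share of GPU time that can use half precision
    (a hypothetical input here, not something taken from a real profile).
    """
    return 1.0 / ((1.0 - fp16_fraction) + fp16_fraction / fp16_rate)


# 0.393 TFlops FP32 at double rate gives roughly the quoted 0.79 figure.
print(round(peak_tflops(0.393), 2))

# Overall gain for a few illustrative shares of FP16-friendly work.
for fraction in (0.2, 1 / 3, 1.0):
    print(f"{fraction:.2f} -> {overall_speedup(fraction):.2f}x")
```

Under these assumptions, a 10-20% overall gain corresponds to somewhere around a fifth to a third of the GPU work being suitable for half precision, and you'd only get the full 2x if literally everything could run in FP16.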