Pemalite said:
Yes they can. Although I would hope not. There isn't a need for an OS to gobble 4GB of DRAM on a handheld... That is a waste of power and resources.
No it doesn't. A GPU with fewer Teraflops can outperform a GPU with more Teraflops in gaming.
It is a hypothetical denominator, not a real world one.
What about INT4, INT8, INT16, FP8, FP16, FP64? Your "teraflops" doesn't tell us squat about those... A game may not use single precision floating point (FP32) at all, it may use FP16, sidestepping your Teraflop counts entirely... Which is a likely proposition in the handheld space due to performance/battery life reasons.
What about Pixel/Geometry/Texture fillrates? Again. Teraflops tells us nothing about those.
At the end of the day... If we take a Radeon 5870 at 2.72 Teraflops and compare it against the Radeon 7850 at 1.76 Teraflops... By your measure and assumption, the Radeon 5870 would win due to having almost an extra Teraflop of FP32? You would be wrong. They even have the same DRAM bandwidth of 153.6GB/s.
But the 7850 is indeed faster. Don't take my word for it: https://www.anandtech.com/bench/product/511?vs=549
And that is comparing GPUs from the same company... Things get even crazier if we start to compare AMD and nVidia.
Doesn't need to match it in raw power. Again... TFLOPS doesn't tell the entire story. Efficiency is far more important than brute force. Ironically, nVidia tends to engineer its GPUs to do as little work as possible, hence their efficiency jump with Maxwell.
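For reference, the two TFLOPS figures in the 5870 vs 7850 comparison can be reproduced with the usual peak-FP32 formula (shader count x 2 operations per clock x clock speed). This is only a rough sketch: the shader counts and clocks below are the commonly listed reference specs, not numbers from this thread, and `peak_fp32_tflops` is a made-up helper name. It shows why the figure is a paper number: nothing in it reflects how well the shaders are actually kept busy.

```python
def peak_fp32_tflops(shaders: int, clock_mhz: float) -> float:
    """Theoretical peak FP32 throughput in TFLOPS.

    Assumes every shader retires one fused multiply-add (2 FLOPs)
    per clock, which real workloads rarely sustain.
    """
    return shaders * 2 * clock_mhz * 1e6 / 1e12


# Assumed reference specs (not stated in the thread):
# Radeon HD 5870: 1600 shaders at 850 MHz
# Radeon HD 7850: 1024 shaders at 860 MHz
hd5870 = peak_fp32_tflops(1600, 850)
hd7850 = peak_fp32_tflops(1024, 860)

print(f"HD 5870: {hd5870:.2f} TFLOPS")
print(f"HD 7850: {hd7850:.2f} TFLOPS")
```

The 5870 comes out ahead on paper, yet the linked benchmarks favour the 7850, which is the whole point of the comparison.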
|
Saying TFLOPS don't mean anything is like saying clock frequency doesn't mean anything, or core count, or really any one variable of a computer. It's just a ridiculous assertion. It isn't everything, it isn't even most things, but it gives you an idea of the hardware's capabilities. This is why FLOPS increase substantially generation after generation: they are correlated with gaming capability, just not 1:1.
Also, we do know FP16 performance given the above information. We don't know TOPS, but we can assume it'll be less than 200 (probably 100) based on what we know about Orin. We can attempt to calculate the TMU count from what we know about Orin, though. Orin has a maximum of 16 SMs. However, the GPU that is closest to the calculated TFLOPS I provided has 14 SMs. Given that the RTX 20 line has 4 TMUs per SM, we are looking at a maximum of 64 Gtexels/s, with a likelihood of 56 Gtexels/s if we stick with the 1 GHz frequency. The thing is, it won't run at that frequency. We don't know about ROP units, but I don't think I've seen any GPU with a higher ROP count than TMU count, so 56 Gpixels/s is really the maximum here. The Series S has a pixel rate of 50 Gpixels/s and a texture rate of 125 Gtexels/s. Unless Nintendo leaves the chip uncustomized (e.g. doesn't reduce ROPs) and doesn't lower the frequency (a reduction will absolutely happen), we know it isn't going to match the Series S GPU. It can get close on the pixel rate side, but it'll be behind on the texture rate.
But, again, a lot of this is guesswork, and the only substantive evidence we have is TFLOPS (which freebs2 brought to my attention may not actually be substantive). We have information on Orin that isn't about TFLOPS, which is what I'm basing a lot of my guesswork on. So the above is why I think it won't be equivalent to the Series S in raw power. And, sure, raw power isn't everything, but there is nothing that we can factually discuss that isn't related to raw power at this time.
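The fill-rate guesswork above is just units x clock, so here is a quick sketch of it. Everything in it is an assumption carried over from the post (4 TMUs per SM, 14 or 16 SMs, a 1 GHz clock, ROPs capped at the TMU count), and `fill_rate_g_per_s` is a made-up helper name, not anything from a real tool.

```python
def fill_rate_g_per_s(units: int, clock_ghz: float) -> float:
    """Peak fill rate in G-ops/s, assuming one texel (or pixel) per unit per clock."""
    return units * clock_ghz


TMUS_PER_SM = 4   # assumption from the post: same ratio as the RTX 20 line
CLOCK_GHZ = 1.0   # assumption from the post; the real clock will likely be lower

# Series S figures as quoted in the post
SERIES_S_GTEXELS = 125.0
SERIES_S_GPIXELS = 50.0

for sms in (16, 14):
    tmus = sms * TMUS_PER_SM
    texel_rate = fill_rate_g_per_s(tmus, CLOCK_GHZ)
    # Upper bound on pixel rate if ROP count never exceeds TMU count
    pixel_rate_ceiling = texel_rate
    print(
        f"{sms} SMs: {texel_rate:.0f} Gtexels/s "
        f"({texel_rate / SERIES_S_GTEXELS:.0%} of Series S), "
        f"pixel ceiling {pixel_rate_ceiling:.0f} Gpixels/s"
    )
```

Dropping `CLOCK_GHZ` below 1.0 scales both rates down linearly, which is why a lower handheld clock would also pull the pixel-rate ceiling under the Series S figure.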