| bonzobanana said: We all know different GPU architectures have different design elements that are both better and worse than the competition. Lots of things about AMD and Intel graphics architectures have been both praised and criticised compared to Nvidia, and their efficiency varies for different tasks. Do you have an example of two matching-generation graphics chipsets from different vendors that compete against each other, where they have similar GFLOPS but wildly different frame rates averaged across all games? I should add, without using AI upscaling etc., unless your point is that AI upscaling allows a weak GPU to punch above its weight, which is what I've been saying for a long time. The industry uses GFLOPS only as a general indication of performance, just to give an approximate indicator. We shouldn't pretend an old Nvidia design from 2019/2020, with limited memory bandwidth and limited power available, is somehow a super-efficient design that can compete with more advanced GPUs even on a very dated fabrication process.
|
| I've got a far more powerful, later-architecture Nvidia GPU in my laptop, capable of a lot more performance, plus a CPU with about 7x the processing power and more memory bandwidth, and it's no miracle GPU: it competes in its class quite well, weak in some areas, better in others. Obviously it's much better than Switch 2, but there are no fixed-platform optimisations, so it's not 2-3x the docked or 4-5x the portable performance of the Switch 2; some of that performance is lost by not being a fixed platform for its GPU. It's certainly far superior, just not what you would expect when you compare the Switch 2 spec directly with it. I'm very happy with it, but I only get around 1-2 hours of battery life (nearer 1 hour to be honest, maybe 1hr 20 minutes), and it's unusable as a portable system with the Nvidia GPU. When I use the laptop for portable gaming I use the laptop's own AMD iGPU, which is on a more advanced fabrication process (7 nm vs the 8/10 nm of the RTX 2050) and of course less powerful at about 1.5 teraflops, to get 2-3 hours plus a lot less fan noise. The laptop can last about 10-12 hours maximum just browsing, office work etc. on economy settings with the AMD iGPU. Obviously the laptop has a much larger 15.6" 144Hz screen, which consumes a lot more power.
|
| We shouldn't confuse what is mainly fixed-platform optimisation with some pretend miracle architecture from an old Nvidia design from 2019 on a very old fabrication process. I have an old Athlon 5350 PC with a Radeon graphics card. The CPU is overclocked to about 2.6-2.7GHz, so it's a match for the PS4's CPU performance (all AMD Jaguar cores); admittedly it doesn't quite have the same memory bandwidth, but it's a fairly close match to the original PS4 in spec, and the GPU is a little more powerful at just over 2 teraflops. If I said it performed half as well as a PS4 I would probably be lying. The PS4 is like a generation above, thanks to fixed-platform optimisations. From memory it's an R9 270, which I think replaced an R7 250 when I got the RX 580 for a different PC. Anyway, you get my point: fixed-platform optimisation is a huge performance upgrade for consoles. So any comparison involving Switch 2 must assume huge fixed-platform performance boosts, even if a lot of stuff has been scaled back or removed to achieve that performance. If you are wondering why I kept an old Athlon 5350 PC for so long, it's in a compact PC case so it fits under the TV without problems, whereas my other PC is in a midi-size case.
|
We do have examples of different architectures delivering different real-world performance per TFLOP.
For example, RDNA2 GPUs systematically perform (in rasterization) about 1.3 times better than Ampere TFLOP-for-TFLOP. That is because Nvidia designed Ampere's cores so they can either run mixed INT32/FP32 work or dedicate a core's full throughput to FP32 when there is no INT32 code to execute. So even though Nvidia's TFLOP figures are accurate for a pure FP32 workload, they don't translate proportionally into game performance as you might expect, because when gaming some cores have to run in the mixed mode. And per Nvidia's own figures, roughly a third of the shader work in rasterized loads is integer arithmetic, not floating point.
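To illustrate the mechanism, here is a toy model in Python (my own simplification, not a vendor formula): assume each Ampere SM has two equal-width datapaths, one FP32-only and one that runs either FP32 or INT32 on a given cycle, with the marketed TFLOPs assuming both paths do FP32 every cycle. The INT share values are just the two figures discussed here (Nvidia's "~36 INT per 100 FP" and the rougher "a third of the work is INT").

```python
def effective_fp32_fraction(int_share):
    """
    Toy model of an Ampere SM: two equal-width datapaths, one FP32-only
    and one shared FP32/INT32. Marketed TFLOPs assume both paths issue
    FP32 every cycle. int_share is the fraction of issued ALU
    instructions that are INT32; returns achieved FP32 rate vs. peak.
    """
    fp_share = 1.0 - int_share
    # All INT work must go to the shared path. Cycles are set by whichever
    # constraint dominates: total work split over two paths, or the INT
    # work alone on the shared path.
    cycles = max((fp_share + int_share) / 2.0, int_share)
    return (fp_share / cycles) / 2.0

# "~36 INT per 100 FP" (36/136 = ~0.26) gives ~74% of peak FP32 throughput;
# "a third of the work is INT" (0.33) gives ~67%.
for share in (0.26, 0.33):
    print(f"INT share {share:.0%}: ~{effective_fp32_fraction(share):.0%} of peak")
```

The inverse of those fractions (roughly 1.35-1.5x) is loosely in line with the ~1.3x figure above, though the real comparison is messier since RDNA2's ALUs also handle integer work; treat this as an illustration of the mechanism rather than a precise accounting.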
As a concrete example, an RX 6800 performs about as well as an RTX 3070 Ti. The RX 6800 is rated at 16.17 FP32 TFLOPs and the RTX 3070 Ti at 21.75 FP32 TFLOPs.
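For what it's worth, 21.75 / 16.17 ≈ 1.35, so that pairing is consistent with the roughly 1.3x-per-TFLOP figure above.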
TFLOPs only make sense as a comparison when you 1. compare GPUs of the same architecture (and even then other aspects of the GPU can affect performance) or 2. are talking about a pure FP workload.
The rest of your post isn't relevant to the specific discussion of TFLOPs and whether we should use them as a general performance indicator, because real-world gaming performance is not measured in TFLOPs. TFLOPs are just how much pure FP (32-bit in this case) arithmetic can be done at maximum GPU clock rates. It is a function of clock rate and core count, with constant factors that vary from architecture to architecture.
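For reference, here is a quick sketch of how those headline numbers fall out of that formula. The core counts and boost clocks below are the published specs for the two cards mentioned above (as far as I recall them), and the factor of 2 assumes one fused multiply-add (two FLOPs) per core per clock, which is how both vendors count.

```python
def fp32_tflops(cores: int, boost_clock_ghz: float) -> float:
    # Peak FP32 TFLOPs = cores * clock * 2, where the 2 assumes one
    # fused multiply-add (two FLOPs) per core per clock.
    return cores * boost_clock_ghz * 2 / 1000.0

print(fp32_tflops(3840, 2.105))  # RX 6800      -> ~16.17
print(fp32_tflops(6144, 1.770))  # RTX 3070 Ti  -> ~21.75
```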
If we were talking about pure FP workloads then we could use it as a general performance indicator, but gaming is not that. It is a mixed workload.
As a side note, your RTX 2050 would probably outperform your iGPU (maybe only slightly) and still give good battery life if you capped its clock rate and disabled the iGPU (which you probably can't do without a MUX switch).