| bonzobanana said: Even if those figures are actually correct they are only theoretical peak figures especially for portable mode. There is no way the Switch 2 can maintain 1.72 Teraflops with only 4-6W on a fabrication process that is mainly 10nm. The Switch 1 in theory had something like 170-200 maximum Gflops depending on the source you read but in reality for games that pushed the Switch it was around 90-140 Gflops according to modders/hackers and for games that didn't need such resources it was considerably less. There are thermal restraints which are a factor too. The reality of the Switch 2 will surely be well below 1 Teraflop for portable performance. This is something I'm sure that will become clear later. 20Wh battery only gives 10W per hour and the screen is brighter with a high refresh rate, it's going to be using up to half that wattage. The Switch 2 only has 4-6W to use per hour absolute maximum but for most games will be considerably less in portable mode. They are claiming up to 6.5 hours runtime on Switch 2 which I guess is with the screen brightness very low and no wifi and that gives the system 3W per hour of which probably a minimum of 2W going to the screen maybe. What sort of gflops figure will that be? Let's not forget the Switch 2 is a lot more powerful in CPU resources so that has to be factored in they need more power. We don't even want 1.72 Teraflops surely as battery runtime would be terrible and it wouldn't be a great portable system. We want Switch 2 to have lower gflops performance surely for portable mode. |
1. The figures are actually correct. There is no "if" here. An SDK with documentation meant for developers to have an idea of how powerful the target system will be isn't going to have the wrong max clocks in it.
2. TFLOPs are always "theoretical peak figures." In reality a gaming workload is a mixed workload that isn't just floating-point (or even just FP32) calculations, and there can be bottlenecks outside of the GPU cores. But if we were to run some sort of non-bottlenecked, floating-point-only workload (like float-by-float matrix multiplication at FP32 precision) on the Switch 2's GPU, it would achieve these figures (or something close to them).
3. Yes, if the game doesn't require the highest clock rates, and the developer chooses to down-clock to save battery life, it won't be running at the maximum performance. That is trivially true though. And some games will utilize the full clock rate, running at the maximum performance.
4. Your assertion "below 1 Teraflop for portable performance" is totally meaningless and has no basis in anything. Again, Teraflops aren't a measurement of gaming performance, they're a measurement of floating-point performance (and specifically FP32 in this case), which only amounts to about 2/3rds of the compute that a gaming workload involves. But again, if we were to put a non-bottlenecked, floating-point-only workload on the handheld and max out its clock rate at 561 MHz, it would achieve ~1.72 TFLOPs.
And we already explained pages ago why your power argument, which is based on the fact that it is using an 8N/10nm node, doesn't capture the whole picture.
I am guessing you're going to be in denial on this until we actually have software measuring 561 MHz clock rates on jailbroken Switch 2s. And when that happens, it will be a measurable fact that the Switch 2's GPU is capable (with the right workload) of 1.72 TFLOPs in handheld mode, because peak floating-point performance is a function of clock rate and core count, with a constant factor for each given micro-architecture and target precision.
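For Ampere:
Max FP32 throughput (GFLOPs) = 2 * (core frequency in MHz) * (core count) / 1000, where the factor of 2 counts one fused multiply-add per core per clock as two floating-point operations.
2 * (561 MHz per core) * 1536 cores / 1000 = 1723.39 GFLOPs, or 1.72339 TFLOPs.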
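If anyone wants to check the arithmetic themselves, here is a minimal Python sketch of the same formula. The function name and the `flops_per_core_per_clock` parameter are my own labels, not anything from the SDK, and the 400 MHz value is an arbitrary illustrative clock, not a documented Switch 2 mode. It only computes the theoretical peak, not what a real game would achieve.

```python
def peak_fp32_gflops(clock_mhz: float, cuda_cores: int,
                     flops_per_core_per_clock: int = 2) -> float:
    """Theoretical peak FP32 throughput in GFLOPs.

    Assumes each core retires one fused multiply-add per clock,
    counted as 2 floating-point operations (the Ampere FP32 factor
    used in the post above).
    """
    # MHz * cores * FLOPs/clock gives MFLOPs; divide by 1000 for GFLOPs.
    return flops_per_core_per_clock * clock_mhz * cuda_cores / 1000.0


# Handheld peak from the post: 561 MHz, 1536 cores.
handheld = peak_fp32_gflops(561, 1536)
print(f"{handheld:.2f} GFLOPs = {handheld / 1000:.2f} TFLOPs")

# Hypothetical lower clock (illustrative only, NOT a documented mode):
# peak throughput scales linearly with clock rate.
lower = peak_fp32_gflops(400, 1536)
print(f"{lower:.2f} GFLOPs at a hypothetical 400 MHz")
```

The point is that the peak moves linearly with clock rate, so a lower-clocked battery-saving mode lowers the ceiling without changing what the 561 MHz mode can reach.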
And yes, there probably will be lower-clock handheld modes for battery life in less demanding games, but that doesn't mean no game will be hitting the 561 MHz peak.
Last edited by sc94597 - on 24 May 2025






