
Forums - Nintendo Discussion - Switch 2 motherboard maybe leaked

Otter said:
Kyuu said:

Seems like it'll be a lot weaker than I hoped (PS4 Pro) even though it's launching a year+ later than I expected. Perhaps modern features and optimization will push it close enough. Hopefully the base console won't cost more than $400.

I always find PS4 Pro a weird reference point for power since it's more or less tied to PS4 builds but with checkerboarded 4K. No games are meaningfully built around its specs.

I think comparing it to the base PS4 makes a lot more sense, especially as Switch 2 internal resolutions will always be 1080p or less, even when docked. It sounds like it's effectively doubling the PS4's graphical capabilities, which bodes incredibly well for first-party games. Thinking of games like Ghost of Tsushima, The Last of Us Part II and God of War Ragnarok... Nintendo's gonna have hardware capable of better graphics than those. That's crazy exciting to think about.

As far as third parties go, I'm not so worried. Most titles are GPU-bound; they'll be a bit soft and lacking some graphical features & texture detail on Switch 2, but will be comparable at a glance at 30fps. Developers shouldn't have much headache, but day-one releases are unlikely to be common. I think we'll often see Switch 2 versions 4-8 months later, aiming for 1080p output docked, upscaled from sub-720p resolutions.

I'll be playing very few third-party games on it, but I'm excited that some of my friends who only game on Nintendo will finally be able to tuck into modern JRPGs and AAA games.

I hoped it could match PS4 Pro on rasterization and approach Series S on CPU-light apps via more advanced features and better efficiency.

I do expect modern games to target lower average resolutions on Switch 2 than old games on PS4 Pro regardless of specs, precisely because the typical AAA graphics fidelity will be significantly higher. Resolution and framerate are decided by the workload, and the workload of a typical AAA Switch 2 game is probably going to be notably higher than a typical PS4 game.



Otter said:

I always find PS4 Pro a weird reference point for power since it's more or less tied to PS4 builds but with checkerboarded 4K. No games are meaningfully built around its specs.

I think comparing it to the base PS4 makes a lot more sense, especially as Switch 2 internal resolutions will always be 1080p or less, even when docked. It sounds like it's effectively doubling the PS4's graphical capabilities, which bodes incredibly well for first-party games. Thinking of games like Ghost of Tsushima, The Last of Us Part II and God of War Ragnarok... Nintendo's gonna have hardware capable of better graphics than those. That's crazy exciting to think about.

As far as third parties go, I'm not so worried. Most titles are GPU-bound; they'll be a bit soft and lacking some graphical features & texture detail on Switch 2, but will be comparable at a glance at 30fps. Developers shouldn't have much headache, but day-one releases are unlikely to be common. I think we'll often see Switch 2 versions 4-8 months later, aiming for 1080p output docked, upscaled from sub-720p resolutions.

I'll be playing very few third-party games on it, but I'm excited that some of my friends who only game on Nintendo will finally be able to tuck into modern JRPGs and AAA games.

Actually, the PS4 Pro is a great comparison if you think about how games will be made for the Switch 2's performance in handheld mode, much like PS4 games are made based on the performance of the base PS4 first. 



Doctor_MG said:

Actually, the PS4 Pro is a great comparison if you think about how games will be made for the Switch 2's performance in handheld mode, much like PS4 games are made based on the performance of the base PS4 first. 

But again, the PS4 Pro often rendered internally at 1440p, which is significantly more expensive than what Switch 2 will be handling. I just don't think that comparison gives a realistic reflection of how games will look and perform.



Norion said:
sc94597 said:

I am not as pessimistic as some are about it. Switch 2's CPU, even at only 1 GHz, should be comparable to mobile i5s from a few generations ago that plenty of people are able to play games on at console-level framerates. For example, I have a Thinkpad with an i5-10310U that should be roughly comparable performance-wise to the Switch 2. With an eGPU (which has its own performance penalties), it's able to play any modern game at >=30fps.

As a rough test, I am currently downloading Microsoft Flight Simulator 2024 on an old Dell Inspiron with an i5-7300HQ (4-core, 4-thread, low-performance CPU) and GTX 1060 Max-Q to see how it performs. Guessing 1080p (upscaled) 30fps will be doable. The Switch 2's CPU should be slightly better than this old i5 in multi-core and similar in single-core. The GPU should be similar (maybe slightly weaker) in pure rasterization.

Can that handle Dragon's Dogma 2 alright? Cause I know that pushes the CPU really hard in certain areas with something like a Ryzen 3600 not performing that well.

Just tested it out. A 30fps lock at 1080p, low settings, FSR Balanced, with V-sync works fine at both 3.1 GHz and 2.5 GHz. At 2.5 GHz there are more drops to about 28fps and the GPU is at about 85% utilization overall. At 3.1 GHz it is far more stable (constant 30fps) and the GPU hovers between 95-100% utilization.

The game seems to utilize multiple cores well (CPU is at 100% utilization on all cores) so I think a Switch 2 version locked at 30FPS should be doable. 

Where I see Switch 2 struggling is in any game that needs a few powerful cores and doesn't scale beyond those few. 



I am skeptical of claims that put Switch 2 performance above the Steam Deck in handheld mode. The Switch 2 SoC is slightly larger at ca. 210mm², compared to the Steam Deck's 163mm² on TSMC's original 7nm (7N) process. On a Samsung 8N node that would mean fewer transistors for the Switch 2; Samsung 5N would give the Switch 2 more transistors. Clock speed on the Switch 2 is much lower to accommodate a 10W power budget and longer battery life.

Switch 2 die size: 210mm²
Steam Deck die size: 163mm²

Switch 2 GPU clock (handheld): 561 MHz
Steam Deck GPU clock: 1600 MHz

Switch 2 TFLOPs: ?
Steam Deck TFLOPs: 1.6

The [original Steam Deck] Van Gogh graphics processor is an average sized chip with a die area of 163 mm² and 2,400 million transistors.

https://www.techpowerup.com/gpu-specs/steam-deck-gpu.c3897

Last edited by numberwang - on 15 January 2025

sc94597 said:
Norion said:

Can that handle Dragon's Dogma 2 alright? Cause I know that pushes the CPU really hard in certain areas with something like a Ryzen 3600 not performing that well.

Just tested it out. A 30fps lock at 1080p, low settings, FSR Balanced, with V-sync works fine at both 3.1 GHz and 2.5 GHz. At 2.5 GHz there are more drops to about 28fps and the GPU is at about 85% utilization overall. At 3.1 GHz it is far more stable (constant 30fps) and the GPU hovers between 95-100% utilization.

The game seems to utilize multiple cores well (CPU is at 100% utilization on all cores) so I think a Switch 2 version locked at 30FPS should be doable. 

Where I see Switch 2 struggling is in any game that needs a few powerful cores and doesn't scale beyond those few. 

If you got that in the main city area, then it seems it could be a big problem with certain games, but overall it should be OK. So it should still get significantly more big games ported to it than the Switch did, if these leaks are accurate. Thanks for the help.



numberwang said:

I am skeptical of claims that put Switch 2 performance above the Steam Deck in handheld mode. The Switch 2 SoC is slightly larger at ca. 210mm², compared to the Steam Deck's 163mm² on TSMC's original 7nm (7N) process. On a Samsung 8N node that would mean fewer transistors for the Switch 2; Samsung 5N would give the Switch 2 more transistors. Clock speed on the Switch 2 is much lower to accommodate a 10W power budget and longer battery life.

Switch 2 die size: 210mm²
Steam Deck die size: 163mm²

Switch 2 GPU clock (handheld): 561 MHz
Steam Deck GPU clock: 1600 MHz

Switch 2 TFLOPs: ?
Steam Deck TFLOPs: 1.6

The [original Steam Deck] Van Gogh graphics processor is an average sized chip with a die area of 163 mm² and 2,400 million transistors.

https://www.techpowerup.com/gpu-specs/steam-deck-gpu.c3897

TFLOPs are a function of clock frequency and core count. Both of those are known now. The node isn't important anymore. So yes, handheld Switch 2 is 1.72 TFLOPs. For raw rasterization, that's not enough to say it is better than the Steam Deck though, because 1 Ampere TFLOP ~ 0.7-0.75 RDNA2 TFLOPs when it comes to inferring rasterization performance. On paper, in a pure rasterized workload, a max-TDP Steam Deck would outperform a Switch 2 in handheld mode, all else kept equal. But with DLSS and in mixed ray-tracing/rasterized workloads (which are increasingly common), the Switch 2 should make up the gap.

Again, why can the Switch 2 pull this off at a lower wattage and frequencies than the Steam Deck? Because the Switch 2 has three times the shading units/cores (1536 for Switch 2 vs. 512 for Steam Deck.)

The Steam Deck starts to collapse in terms of power efficiency at about 1200 MHz or more; the voltage has to rise rapidly to push frequency beyond this, and power consumption increases quadratically with voltage. So much so that to get that last 400 MHz, power consumption has to double on the Steam Deck. You can get 75% of the Steam Deck's performance when running at half its max TDP.

Edit: Also, that is without considering that most Steam Deck games run through a compatibility layer with some performance loss, and that x86 (even AMD's x86) is less efficient than ARM at sub-15W TDPs unless you use actual x86 efficiency cores (which cut out some of the instruction set), which nobody does because of compatibility issues.

Last edited by sc94597 - on 15 January 2025
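The arithmetic behind the 1.72 TFLOPs figure is easy to check, assuming the standard convention that one FMA (fused multiply-add) counts as two floating-point operations; the same formula also reproduces the Steam Deck's advertised 1.6 TFLOPs from its 512 shaders:

```python
def tflops(cores: int, clock_mhz: float) -> float:
    """Peak FP32 throughput: cores x clock x 2 (one FMA = 2 FLOPs)."""
    return cores * clock_mhz * 1e6 * 2 / 1e12

print(tflops(1536, 561.00))    # Switch 2 handheld -> ~1.72 TFLOPs
print(tflops(1536, 1007.25))   # Switch 2 docked   -> ~3.09 TFLOPs
print(tflops(512, 1600.00))    # Steam Deck (max)  -> ~1.64 TFLOPs
```

These are paper ceilings only; as discussed later in the thread, real-world rasterization performance per TFLOP differs between Ampere and RDNA2.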

sc94597 said:
numberwang said:

I am skeptical of claims that put Switch 2 performance above the Steam Deck in handheld mode. The Switch 2 SoC is slightly larger at ca. 210mm², compared to the Steam Deck's 163mm² on TSMC's original 7nm (7N) process. On a Samsung 8N node that would mean fewer transistors for the Switch 2; Samsung 5N would give the Switch 2 more transistors. Clock speed on the Switch 2 is much lower to accommodate a 10W power budget and longer battery life.

Switch 2 die size: 210mm²
Steam Deck die size: 163mm²

Switch 2 GPU clock (handheld): 561 MHz
Steam Deck GPU clock: 1600 MHz

Switch 2 TFLOPs: ?
Steam Deck TFLOPs: 1.6

The [original Steam Deck] Van Gogh graphics processor is an average sized chip with a die area of 163 mm² and 2,400 million transistors.

https://www.techpowerup.com/gpu-specs/steam-deck-gpu.c3897

TFLOPs are a function of clock frequency and core count. Both of those are known now. The node isn't important anymore. So yes, handheld Switch 2 is 1.72 TFLOPs. For raw rasterization, that's not enough to say it is better than the Steam Deck though, because 1 Ampere TFLOP ~ 0.7-0.75 RDNA2 TFLOPs when it comes to inferring rasterization performance. On paper, in a pure rasterized workload, a max-TDP Steam Deck would outperform a Switch 2 in handheld mode, all else kept equal. But with DLSS and in mixed ray-tracing/rasterized workloads (which are increasingly common), the Switch 2 should make up the gap.

Again, why can the Switch 2 pull this off at a lower wattage and frequencies than the Steam Deck? Because the Switch 2 has three times the shading units/cores (1536 for Switch 2 vs. 512 for Steam Deck.)

The Steam Deck starts to collapse in terms of power efficiency at about 1200 MHz or more; the voltage has to rise rapidly to push frequency beyond this, and power consumption increases quadratically with voltage. So much so that to get that last 400 MHz, power consumption has to double on the Steam Deck. You can get 75% of the Steam Deck's performance when running at half its max TDP.

Edit: Also, that is without considering that most Steam Deck games run through a compatibility layer with some performance loss, and that x86 (even AMD's x86) is less efficient than ARM at sub-15W TDPs unless you use actual x86 efficiency cores (which cut out some of the instruction set), which nobody does because of compatibility issues.

We don't know the shader count of the Switch 2, or really anything about its internal SoC structure. The leaked image does not confirm anything that was speculated before about that T239. Switch 2 having 3x the shading units/cores of a Steam Deck is implausible on a comparable die size / transistor count without sacrificing something else on the die.

Look at it this way: Switch 1 had a 118 mm² die (TSMC 20nm) running at 307 MHz in handheld mode for 0.157 TFLOPs. A 5x increase to 0.8 TFLOPs for Switch 2 handheld through a bigger die, a better fabrication node and higher clocks is reasonable.

https://www.techpowerup.com/gpu-specs/switch-gpu-20nm.c3104



numberwang said:

We don't know the shader count of the Switch 2, or really anything about its internal SoC structure. The leaked image does not confirm anything that was speculated before about that T239. Switch 2 having 3x the shading units/cores of a Steam Deck is implausible on a comparable die size / transistor count without sacrificing something else on the die.

Look at it this way: Switch 1 had a 118 mm² die (TSMC 20nm) running at 307 MHz in handheld mode for 0.157 TFLOPs. A 5x increase to 0.8 TFLOPs for Switch 2 handheld through a bigger die, a better fabrication node and higher clocks is reasonable.

https://www.techpowerup.com/gpu-specs/switch-gpu-20nm.c3104

Yes, we do know that the T239 is a 12-SM Ampere chip with 1536 CUDA cores. This is one of the things we know better than almost anything else, and have known for years now. It's not speculation; it is hard data from the Nvidia hack, and it has been corroborated dozens of times over the last three years.

For example, Eurogamer/Digital Foundry wrote an article about it in 2023. 

https://www.eurogamer.net/digitalfoundry-2023-inside-nvidias-latest-hardware-for-nintendo-what-is-the-t239-processor

The hack also suggests that T239 has 1536 CUDA cores, 75 percent of the cores of the much larger T234.

It is much more of a known than the manufacturing process. 

Are you going to be in denial about this until the chip releases in March or April, when it becomes a side-comment in some breakdown of the die because everyone else accepted this fact years ago and it isn't really interesting anymore?

Also, the die-size range we are observing currently was predicted years ago too. 

https://famiboards.com/threads/future-nintendo-hardware-technology-speculation-discussion-st-new-staff-post-please-read.55/post-900926

Best guess for N4: 91mm².
Best guess for 7N: 136.2mm²
Best guess for Samsung 8nm: 201.4mm²


sc94597 said:

And again, the Switch 2 would be able to do this at lower power-levels because it takes a wide (many core) and slow (low clock-rate) architecture, which is more power-efficient than the Steam Deck's few-core, high clock-rate repurposed APU.

That is blatantly false.
See: Jump from Maxwell to Pascal.

Every chip has an "efficiency curve" which is a function of clockrate x voltage x transistor count... Things like electrical leakage and electromigration also impact the efficiency curve.

When nVidia went from the GeForce GTX 980 to the GeForce GTX 1080, they increased the number of functional units by about 25%, but performance improved by upwards of 70%.

How did they do it? Clockspeeds.
But how did they achieve the clockspeeds? FinFET reduced the amount of leakage, but nVidia also added dark silicon around "noisy" and energy-hungry parts of the chip, which reduced the crosstalk... which meant they were able to drive up clockspeeds.

The result is that, despite using roughly the same amount of energy as the GeForce GTX 980, they were able to increase performance by a substantial amount.


Every fabrication node, every single chip architecture... All have different clockrate/voltage efficiency curves and those chips get "binned" for that.

For some parts it would cost the same amount of energy to run a chip at 500 MHz or 1000 MHz, because the quality of the silicon is really good.

Sometimes though... companies like AMD, nVidia and Intel will take a part, throw efficiency out the window and drive clockspeeds as hard as they can go... We saw this with AMD's Fiji GPU architecture (Fury X), which was actually extremely efficient as a massive chip at low clockspeeds, but AMD decided to try to compete for the high end, so drove clockspeeds and voltages as high as possible.
If you backed the clockspeeds off and reduced the voltages, you could claw back a significant amount of power savings with minimal reduction in performance on the Fury X.
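A toy model makes the shape of that curve concrete. Dynamic power scales roughly as P = C·V²·f, and past the efficiency knee voltage has to rise roughly in step with frequency, so power grows roughly with the cube of clock speed. (This is an illustrative sketch, not measured Steam Deck data; real chips also have static leakage and a voltage floor at low clocks.)

```python
def relative_power(freq_mhz: float, max_mhz: float = 1600.0) -> float:
    """Toy cubic model: P ~ f^3 once voltage must scale with frequency."""
    return (freq_mhz / max_mhz) ** 3

# Going from 1200 MHz to 1600 MHz costs ~2.4x the power in this model --
# the same ballpark as the "power has to double for the last 400 MHz" claim.
print(relative_power(1600) / relative_power(1200))

# At half the max power budget, the achievable clock is 1600 * 0.5^(1/3)
# ~= 1270 MHz, i.e. roughly 79% of peak performance for 50% of the power,
# close to the "75% of performance at half TDP" figure above.
half_power_clock = 1600.0 * 0.5 ** (1 / 3)
print(half_power_clock, half_power_clock / 1600.0)
```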

sc94597 said:

Handheld: CPU 1100.8 MHz, GPU 561 MHz, EMC 2133 MHz

Docked: CPU 998.4 MHz, GPU 1007.25 MHz, EMC 3200 MHz

More specific clock rates. If the memory clocks are per module, then we're looking at 100 GB/s in docked mode (rather than the 120 GB/s max for LPDDR5X). Not horrible given the GPU's performance. You should expect about 25 GB/s per TFLOP on Ampere, so after considering the CPU's share of the bandwidth, 100 GB/s should be sufficient.

It's irrelevant if it's per module or all modules.

Your use of teraflops in this way is also bizarre.

100 GB/s at 3200 MHz (6400 MT/s effective)
68 GB/s at 2133 MHz (4266 MT/s effective)

It's a 720p/1080p device. 1080p may be a little compromised, as ideally you want around 150-200 GB/s of bandwidth for the fillrates needed to drive that much screen real estate.

However this doesn't account for real-world bandwidth improvements through better culling, better compression and more.
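Those bandwidth figures follow from the standard LPDDR5/5X formula, peak bandwidth = effective transfer rate × bus width ÷ 8, assuming the 128-bit total bus discussed in this thread:

```python
def bandwidth_gbps(effective_mt_s: float, bus_bits: int = 128) -> float:
    """Peak memory bandwidth in GB/s: transfers/s x bytes per transfer."""
    return effective_mt_s * 1e6 * (bus_bits / 8) / 1e9

print(bandwidth_gbps(6400))  # docked:   3200 MHz DDR -> ~102.4 GB/s
print(bandwidth_gbps(4266))  # handheld: 2133 MHz DDR -> ~68.3 GB/s
```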


sc94597 said:

It's hard to say, given how old GCN 2.0 is and there rarely are direct comparisons between these architectures (that control for driver optimizations) to give us a good idea how they compare, but Ampere TFLOPs seem to correspond to 1.1-1.3 GCN 2.0 TFLOPs when estimating rasterization from them (not including ray-tracing, neural-rendering, etc, of course.) PS4 Pro is capable of about 4.2 TFLOPs. That's about 3.23 - 3.8 TFLOPs (adjusted) when comparing with the Switch 2, adjusting for the TFLOP per unit of rasterization performance. That puts Switch 2 around 80-96% of the raw theoretical performance of the PS4 Pro before any bottlenecks, depending on which ratio you use. 

With DLSS, it shouldn't be too hard for Switch 2 to match or even exceed PS4 Pro level graphics when docked, especially given that it has a better CPU (even with the heavy under-clock) and more available memory. 

This is a rough comparison, but the thing to take away is that in terms of raw performance they're roughly in the same class.

And of course when it comes to modern features (like ray-tracing and neural rendering) the Switch 2 will be able to do things the PS4 Pro couldn't. 

Teraflops are identical irrespective of architecture; they are the same single-precision numbers.
It's a theoretical ceiling, not a real-world one.

More goes into rendering a game than teraflops alone... it's simply not about the teraflops. It actually never has been.

Also keep in mind that the PlayStation 4 had games that used ray tracing via software-based global illumination with light bounce. It's definitely more limited and primitive, but it did happen on the system.
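For reference, the 80-96% range in the quoted post is just the following arithmetic. The 1.1-1.3 "GCN 2.0 TFLOPs per Ampere TFLOP" conversion factor is that poster's own estimate rather than an established benchmark, and per the point above, all of these numbers are theoretical ceilings:

```python
PS4_PRO_TFLOPS = 4.2          # GCN 2.0, as quoted above
SWITCH2_DOCKED_TFLOPS = 3.09  # Ampere: 1536 cores x 2 FLOPs x 1007.25 MHz

# Express the PS4 Pro in "Ampere-equivalent" TFLOPs using the quoted
# 1.1-1.3 ratio, then compare the Switch 2 docked figure against it.
for ratio in (1.1, 1.3):
    ps4_pro_ampere_equiv = PS4_PRO_TFLOPS / ratio
    print(ratio, SWITCH2_DOCKED_TFLOPS / ps4_pro_ampere_equiv)
# ratio 1.1 -> Switch 2 at ~81% of PS4 Pro; ratio 1.3 -> ~96%
```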

sc94597 said:

To put things in perspective, a Cortex-A78 cluster (4 cores) gets a Geekbench 6 score of about 1121 single-core, 3016 multi-core at 2 GHz.

An FX-8120 (a desktop CPU with an architecture similar to the 8th-gen consoles') gets about 413 single-core, 1800 multi-core at 3.1 GHz.

The 8th-generation consoles range from 1.6 GHz to 2.3 GHz, so they're running at much lower clocks than the FX-8120. Jaguar had higher IPC than Bulldozer, but only by about 20-30%. Not enough to make up the difference.

We're looking at an IPC for the Switch 2's CPU nearly double that of Jaguar CPUs. 

The A78C also has an advantage over the base A78, in that all cores are on a single cluster and are homogenous. 

Even at only 1 GHz, the Switch 2's CPU should outclass the PS4/Pro/XBO/XBO:S/XBO:X pretty easily. The IPC difference is just too large between modern-ish ARM and Jaguar.

There is also the matter that game-engines are just much more efficient with multi-threading loads now than they were during the 8th generation. 

It's not always as simple as that.
Microsoft, for example, implemented extra silicon on the Xbox One X which offloaded CPU tasks like API draw calls onto fixed-function hardware, which greatly improved Jaguar CPU efficiency as the CPU could focus on other things.

sc94597 said:

I am not as pessimistic as some are about it. Switch 2's CPU, even at only 1 GHz, should be comparable to mobile i5s from a few generations ago that plenty of people are able to play games on at console-level framerates. For example, I have a Thinkpad with an i5-10310U that should be roughly comparable performance-wise to the Switch 2. With an eGPU (which has its own performance penalties), it's able to play any modern game at >=30fps.

As a rough test, I am currently downloading Microsoft Flight Simulator 2024 on an old Dell Inspiron with an i5-7300HQ (4-core, 4-thread, low-performance CPU) and GTX 1060 Max-Q to see how it performs. Guessing 1080p (upscaled) 30fps will be doable. The Switch 2's CPU should be slightly better than this old i5 in multi-core and similar in single-core. The GPU should be similar (maybe slightly weaker) in pure rasterization.

Developers will work with whatever they get in the end.

numberwang said:

I am skeptical of claims that put Switch 2 performance above the Steam Deck in handheld mode. The Switch 2 SoC is slightly larger at ca. 210mm², compared to the Steam Deck's 163mm² on TSMC's original 7nm (7N) process. On a Samsung 8N node that would mean fewer transistors for the Switch 2; Samsung 5N would give the Switch 2 more transistors. Clock speed on the Switch 2 is much lower to accommodate a 10W power budget and longer battery life.

Switch 2 die size: 210mm²
Steam Deck die size: 163mm²

Switch 2 GPU clock (handheld): 561 MHz
Steam Deck GPU clock: 1600 MHz

Switch 2 TFLOPs: ?
Steam Deck TFLOPs: 1.6

The [original Steam Deck] Van Gogh graphics processor is an average sized chip with a die area of 163 mm² and 2,400 million transistors.

https://www.techpowerup.com/gpu-specs/steam-deck-gpu.c3897

There are going to be aspects where the Switch 2's GPU will showcase significant advantages over the Steam Deck's GPU and vice versa as it's ultimately a battle between AMD and nVidia and both companies have different strengths and weaknesses with their GPU architectures.

sc94597 said:

TFLOPs are a function of clock frequency and core count. Both of those are known now.

There is more to it than that; you also need to include the number of instructions per clock and the precision.

For example...
A GPU with 256 cores operating at 1,000 MHz with 1 instruction per clock at 32-bit precision is 512 GFLOPS.
A GPU with 256 cores operating at 1,000 MHz with 2 instructions per clock at 32-bit precision is 1024 GFLOPS.
A GPU with 256 cores operating at 1,000 MHz with 2 instructions per clock at 16-bit precision, double-packed, is 2048 GFLOPS.

Same number of cores, same clockspeed... But there is a 4x difference.

Developers optimizing for mobile hardware tend to use 16bit precision whenever possible due to the inherent power saving and speed advantages.
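The three figures above follow from a single formula, using the usual convention that each issued FMA counts as two floating-point operations (which is why the "1 instruction per clock" case comes out at 512 rather than 256 GFLOPS):

```python
def gflops(cores: int, clock_mhz: float, issues_per_clock: int = 1,
           pack_factor: int = 1) -> float:
    """Peak GFLOPS: cores x clock x issues/clk x 2 (FMA) x precision packing."""
    return cores * clock_mhz * 1e6 * issues_per_clock * 2 * pack_factor / 1e9

print(gflops(256, 1000))                                     # FP32 ->  512.0
print(gflops(256, 1000, issues_per_clock=2))                 # FP32 -> 1024.0
print(gflops(256, 1000, issues_per_clock=2, pack_factor=2))  # FP16 -> 2048.0
```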

sc94597 said:

The node isn't important anymore.

The node is important, it dictates the size, complexity and energy characteristics of the SoC.

sc94597 said:

For raw rasterization, that's not enough to say it is better than the Steam Deck though, because 1 Ampere TFLOP ~ 0.7-0.75 RDNA2 TFLOPs when it comes to inferring rasterization performance. On paper, in a pure rasterized workload, a max-TDP Steam Deck would outperform a Switch 2 in handheld mode, all else kept equal. But with DLSS and in mixed ray-tracing/rasterized workloads (which are increasingly common), the Switch 2 should make up the gap.

The teraflops are the same regardless of whether it's Graphics Core Next, RDNA or Ampere.

The differences in the real world tend to come from things like caches, precision (integer 8/16, floating point 16/32/64), geometry throughput, pixel fillrate, texture/texel fillrate, compression, culling and more, rather than the teraflops alone, which is generally just your single-precision throughput through your shader cores... which ignores 99% of the rest of the chip's capabilities.

DLSS is often being used as a "crutch" to get games performing adequately rather than as a tool to "enhance" games... just like FSR is used as a crutch to get games looking and performing decent on Xbox Series X and PlayStation 5.

sc94597 said:

Edit: Also, that is without considering that most Steam Deck games run through a compatibility layer with some performance loss, and that x86 (even AMD's x86) is less efficient than ARM at sub-15W TDPs unless you use actual x86 efficiency cores (which cut out some of the instruction set), which nobody does because of compatibility issues.

Modern x86 processors all reduce CISC operations into smaller micro-operations which is very RISC-like anyway.


