Pemalite said:


And yet... It still beats it.
Either way, the whole 3050 4GB vs 6GB argument is irrelevant.

Because like I alluded to before, video game developers build games within the confines of the hardware walls, not out of it.

The 4GB 2050 is also not a 3050 4GB or 6GB.
It's actually worse, with a fraction of the bandwidth, which means it is even more useless at managing large datasets.

112GB/s vs 176GB/s/192GB/s is a big difference... Bandwidth is what is holding back the 2050 the most, not memory capacity.

It seems you've lost the context of the discussion here if you think it is irrelevant.

  1. I argued that 6GB being available for graphics could alleviate VRAM capacity bottlenecks.
  2. You argued that VRAM capacity bottlenecks are far less of an issue compared to VRAM bandwidth bottlenecks as the Switch 2 isn't targeting 1440p resolutions.
  3. I showed a counterexample where two nearly identical low-end Ampere GPUs have the exact same specs, except that one has more VRAM capacity and the other more VRAM bandwidth. Their target resolution is native 1080p. The chip with more capacity was not only more stable, even in those instances where it "loses" by having a slightly lower average framerate, but also didn't experience hard bottlenecks that nearly halve its framerate in the large and growing number of games that struggle at 1080p on 4GB cards.
  4. The analogous situation where memory bandwidth is a huge issue doesn't show up. In other words, the bottlenecks for the RTX 3050 6GB are less severe than the bottlenecks for the RTX 3050 Ti 4GB when targeting 1080p, which is why most people recommend the former over the latter.

Not sure where you're getting 176GB/s for the 3050 6GB (or are you referring to the 3050 4GB?)

The 3050 6GB's effective double-data-rate memory speed is 14Gbps (in Afterburner we see 7001 MHz * 2 for DDR).

14 Gbps/pin * 96 pins / 8 bits/byte = 168 GB/s.
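For reference, here's a quick sketch of how those figures fall out of data rate and bus width (the per-card data rates and bus widths below are the commonly listed spec-sheet values, so treat them as assumptions on my part rather than something measured here):

```python
# Peak memory bandwidth = effective data rate per pin * bus width, converted to bytes.
def mem_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak theoretical bandwidth in GB/s."""
    return data_rate_gbps * bus_width_bits / 8

# Assumed spec-sheet values: (effective data rate in Gbps, bus width in bits)
cards = {
    "RTX 2050":     (14, 64),   # -> 112 GB/s
    "RTX 3050 4GB": (12, 128),  # -> 192 GB/s
    "RTX 3050 6GB": (14, 96),   # -> 168 GB/s
}

for name, (rate, bus) in cards.items():
    print(f"{name}: {mem_bandwidth_gb_s(rate, bus):.0f} GB/s")
```

None of those combinations land on 176 GB/s, hence the question above.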

Anyway,

Let's test your hypothesis (against the empirical evidence) on the 2050 vs. 3050. 

Jarrod's Tech ran benchmarks comparing the RTX 2050 and RTX 3050 4GB with the same CPU. 

Here is the average difference @1080p.

Yeah... it's not bandwidth, at least not at the lower TGPs. 

Pemalite said:

sc94597 said:

Yes, it is 95W vs. 75W, BUT the GPU clocks are comparable and the 3050 is running at roughly 72W: 1965 MHz for the 3050 Ti and 1942 MHz for the 3050 6GB. And the difference is +74%. That's not just because of a 20-watt difference, especially when that difference isn't affecting max clock rates.

TDP has a massive influence over mobile hardware... To the point where a lower-end part with a higher TDP will outperform a higher-end part.

I.e. the RTX 3070 outperforming the RTX 3080, despite the 3080 having twice the VRAM.

You know this, but it's the ability to reach higher boost clocks that allows for the performance gains as you increase TGP. A certain voltage is needed to hold a given boost clock, and power usage is proportional to voltage squared (and linearly proportional to frequency), so you have to raise the power budget to sustain higher stable frequencies. But in the comparison I brought up, the ostensibly "95W" RTX 3050 6GB was pulling about 70W most of the time, probably because dynamic boost was apportioning power to the CPU, and the GPU core clocks were the same between the two GPUs, with the 3050 Ti actually having a very slightly higher core clock.

If one pegs a GPU to 1950 MHz and then undervolts it to reduce power consumption by 20W, it's not going to lose performance as long as it's stable at that voltage.
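To make that concrete, here's a minimal sketch of the relationship (the voltages are made-up illustrative numbers, not measured values for these cards):

```python
# Dynamic power scales roughly as P ~ C * V^2 * f
# (C = effective switched capacitance; its absolute value doesn't matter here,
# only the ratio between the two operating points).
def dynamic_power(voltage_v: float, freq_mhz: float, c: float = 1.0) -> float:
    return c * voltage_v**2 * freq_mhz

stock       = dynamic_power(voltage_v=0.90, freq_mhz=1950)  # assumed stock voltage
undervolted = dynamic_power(voltage_v=0.80, freq_mhz=1950)  # assumed undervolt, same clock

print(f"Power saved by the undervolt: {(1 - undervolted / stock) * 100:.0f}%")
# ~21% less power at the exact same 1950 MHz, i.e. no performance lost
# as long as the chip is stable at the lower voltage.
```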

Most voltage-frequency curves flatten out toward a horizontal asymptote, and it seems that's what happened here.

Anyway, the ~70% performance difference isn't because of the TDP label here; it's because Forza bottlenecks as the VRAM fills up. Everything else that matters (clock rate, core count, etc.) is equal; the VRAM is the only difference.

Keep in mind that Breath of the Wild also runs on a:

* Triple-core CPU @ about 1.25GHz
* 1GB RAM.
* Radeon 5550 class GPU.

It doesn't have hardware demands that are regarded as "intensive". 4k or not.

Yes, and what hardware did Mirror's Edge run on originally?

Nintendo's OS RAM footprint has generally increased every console generation.

Not only that, but one of the Switch's biggest issues is the extremely slow and laggy eShop performance; more memory dedicated to that task would clean it up a ton.
...And if they implement features like you alluded to, such as voice chat natively on the console itself, that would also require more RAM.

The Wii U had 1GB dedicated to the OS, like the Switch. The OSes of the Wii and Wii U were different enough in feature set that it makes sense there would be an increase. Likewise with the GameCube and Wii.

I don't see Nintendo adopting more features. In fact, the Switch was a downgrade in terms of non-gaming apps compared to the Wii U (i.e. no consumer-facing browser, no media apps, etc.).

Very much unlikely to be on a 5/4nm TSMC node as it's expensive.

Thraktor explains here why 5/4nm is very much likely, from a cost-minimization perspective. Basically, going with 12 SMs doesn't make sense on 8N unless the power profile is much, much higher than we would expect for a handheld. Going with 6 SMs at higher minimum clock speeds would be cheaper and give better performance, but we know from the leak that there are 12 SMs on the T239.

"The short answer is that a 12 SM GPU is far too large for Samsung 8nm, and likely too large for any intermediate process like TSMC's 6nm or Samsung's 5nm/4nm processes. There's a popular conception that Nintendo will go with a "cheap" process like 8nm and clock down to oblivion in portable mode, but that ignores both the economic and physical realities of microprocessor design.

I recommend reading the whole post, but this snippet addresses the expense question specifically. 

"But what about cost, isn't 4nm really expensive?

Actually, no. TSMC's 4N wafers are expensive, but they're also much higher density, which means you fit many more chips on a wafer. This SemiAnalysis article from September claimed that Nvidia pays 2.2x as much for a TSMC 4N wafer as they do for a Samsung 8nm wafer. However, Nvidia is achieving 2.7x higher transistor density on 4N, which means that a chip with the same transistor count would actually be cheaper if manufactured on 4N than 8nm (even more so when you factor yields into account)."
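The arithmetic in that snippet is easy to sanity-check (the 2.2x and 2.7x ratios are SemiAnalysis's estimates quoted above, not disclosed pricing):

```python
# Per-chip cost scales as (wafer cost) / (chips per wafer), and for a fixed
# transistor budget, chips per wafer scale with transistor density.
wafer_cost_ratio = 2.2   # TSMC 4N wafer price vs. Samsung 8nm (claimed)
density_ratio    = 2.7   # transistor density on 4N vs. 8nm (claimed)

relative_chip_cost = wafer_cost_ratio / density_ratio
print(f"Same-transistor-count chip on 4N costs ~{relative_chip_cost:.2f}x as much")
# ~0.81x, i.e. roughly 19% cheaper per chip, before even considering that
# smaller dies also tend to yield better.
```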

Radeon RX 570 is running the Matrix Awakens demo.

It is a 4GB card.

I mean, if you go into the config file and set every memory-intensive variable to 0, then yes, it will work!

This is what they did to get it working. 

Meanwhile, the 3050 Ti doesn't do much better than Digital Foundry's attempt with the 2050 when running the actual demo, despite having nearly double the memory bandwidth.

"Video memory has been exhausted (2220.551 MB over budget) Expect extremely poor performance"

The Switch 2 version is ostensibly running with ray tracing implemented.

Last edited by sc94597 - on 17 November 2023