VGC: Switch 2 Was Shown At Gamescom Running Matrix Awakens UE5 Demo

zeldaring on 15 September 2023

Oneeee-Chan!!! said:

zeldaring said:

It's 8nm. The leaker got his info from a credible source.

You mean Kopite?

He's one of the most credible and best-known leakers for NVDA, and he's saying 8nm without hesitation. He clearly knows what he's talking about, since he has leaked many Nvidia GPUs.

Last edited by zeldaring - on 15 September 2023

sc94597 on 15 September 2023

This post explains why 8nm doesn't make sense from a "Nintendo goes cheap and low power" perspective.

Either our assumptions about power consumption are wrong or the process node (8nm) is wrong. Which one seems more likely?

https://famiboards.com/threads/future-nintendo-hardware-technology-speculation-discussion-st-read-the-staff-posts-before-commenting.55/post-683773

"However, while it's reasonable to design a chip with intent to clock it at the peak efficiency clock, or to clock it above the peak efficiency clock, what you're not going to see is a chip that's intentionally designed to run at a fixed clock speed that's below the peak efficiency clock. The reason for this is pretty straight-forward; if you have a design with a large number of SMs that's intended to run at a clock below the peak efficiency clock, you could just remove some SMs and increase the clock speed and you would get both better performance within your power budget and it would cost less."

"The above section wasn't theoretical. Nvidia and Nintendo did sit in a room (or have a series of calls) to design a chip for a new Nintendo console, and what they came out with is T239. We know that the result of those discussions was to use a 12 SM Ampere GPU. We also know the power curve, and peak efficiency clock for a very similar Ampere GPU on 8nm.

The GPU in the TX1 used in the original Switch units consumed around 3W in portable mode, as far as I can tell. In later models with the die-shrunk Mariko chip, it would have been lower still. Therefore, I would expect 3W to be a reasonable upper limit to the power budget Nintendo would allocate for the GPU in portable mode when designing the T239.

With a 3W power budget and a peak efficiency clock of 470MHz, then the (again, not theoretical) numbers above tell us the best possible performance would be achieved by a 6 SM GPU operating at 470MHz, and that you'd be able to get 90% of that performance with a 4 SM GPU operating at 640MHz. Note that neither of these say 12 SMs. A 12 SM GPU on Samsung 8nm would be an awful design for a 3W power budget. It would be twice the size and cost of a 6 SM GPU while offering much less performance, if it's even possible to run within 3W at any clock.

There's no world where Nintendo and Nvidia went into that room with an 8nm SoC in mind and a 3W power budget for the GPU in handheld mode, and came out with a 12 SM GPU. That means either the manufacturing process, or the power consumption must be wrong (or both). I'm basing my power consumption estimates on the assumption that this is a device around the same size as the Switch and with battery life that falls somewhere between TX1 and Mariko units. This seems to be the same assumption almost everyone here is making, and while it could be wrong, I think them sticking with the Switch form-factor and battery life is a pretty safe bet, which leaves the manufacturing process."

"So, what manufacturing process can give a 2.5x improvement in efficiency over Samsung 8nm? The only reasonable answer I can think of is TSMC's 5nm/4nm processes, including 4N, which just happens to be the process Nvidia is using for every other product (outside of acquired Mellanox products) from this point onwards. In Nvidia's Ada white paper (an architecture very similar to Ampere), they claim a 2x improvement in performance per Watt, which appears to come almost exclusively from the move to TSMC's 4N process, plus some memory changes."

With 4N just about stretching to the 2.5x improvement in efficiency required for a 12 SM GPU to make sense, I don't think the chances for any other process are good. We don't have direct examples for other processes like we have for Ada, but from everything we know, TSMC's 5nm class processes are significantly more efficient than either their 6nm or Samsung's 5nm/4nm processes. If it's a squeeze for 12 SMs to work on 4N, then I can't see how it would make sense on anything less efficient than 4N.
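As a rough check on the quoted figures, throughput can be approximated as SM count times clock speed. This is only a sketch: the proxy ignores memory bandwidth and everything else that affects real performance, and the function names are made up for illustration. It checks the "90%" comparison between the 4 SM and 6 SM configurations and shows how thin a 3W budget gets when spread across 12 SMs.

```python
def relative_throughput(sm_count, clock_mhz):
    """Crude throughput proxy: SM count times clock (an assumption, not a benchmark)."""
    return sm_count * clock_mhz


def budget_per_sm(budget_watts, sm_count):
    """Power available to each SM if the budget is split evenly."""
    return budget_watts / sm_count


six_sm = relative_throughput(6, 470)   # configuration at the quoted peak efficiency clock
four_sm = relative_throughput(4, 640)  # smaller GPU, clocked higher
ratio = four_sm / six_sm

print(f"4 SM @ 640MHz vs 6 SM @ 470MHz: {ratio:.0%}")
print(f"Watts per SM with 6 SMs in 3W:  {budget_per_sm(3.0, 6):.2f}")
print(f"Watts per SM with 12 SMs in 3W: {budget_per_sm(3.0, 12):.2f}")
```

With 12 SMs, each SM gets half the power it would get in the 6 SM design, which is why the quoted post argues that a much more efficient process is needed for 12 SMs to make sense.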

Last edited by sc94597 - on 15 September 2023

Chrkeller on 15 September 2023

sc94597 said:

This post explains why 8nm doesn't make sense from a "Nintendo goes cheap and low power" perspective.

Either our assumptions about power consumption are wrong or the process node (8nm) is wrong. Which one seems more likely?

https://famiboards.com/threads/future-nintendo-hardware-technology-speculation-discussion-st-read-the-staff-posts-before-commenting.55/post-683773

"However, while it's reasonable to design a chip with intent to clock it at the peak efficiency clock, or to clock it above the peak efficiency clock, what you're not going to see is a chip that's intentionally designed to run at a fixed clock speed that's below the peak efficiency clock. The reason for this is pretty straight-forward; if you have a design with a large number of SMs that's intended to run at a clock below the peak efficiency clock, you could just remove some SMs and increase the clock speed and you would get both better performance within your power budget and it would cost less."

"The above section wasn't theoretical. Nvidia and Nintendo did sit in a room (or have a series of calls) to design a chip for a new Nintendo console, and what they came out with is T239. We know that the result of those discussions was to use a 12 SM Ampere GPU. We also know the power curve, and peak efficiency clock for a very similar Ampere GPU on 8nm.

The GPU in the TX1 used in the original Switch units consumed around 3W in portable mode, as far as I can tell. In later models with the die-shrunk Mariko chip, it would have been lower still. Therefore, I would expect 3W to be a reasonable upper limit to the power budget Nintendo would allocate for the GPU in portable mode when designing the T239.

With a 3W power budget and a peak efficiency clock of 470MHz, then the (again, not theoretical) numbers above tell us the best possible performance would be achieved by a 6 SM GPU operating at 470MHz, and that you'd be able to get 90% of that performance with a 4 SM GPU operating at 640MHz. Note that neither of these say 12 SMs. A 12 SM GPU on Samsung 8nm would be an awful design for a 3W power budget. It would be twice the size and cost of a 6 SM GPU while offering much less performance, if it's even possible to run within 3W at any clock.

There's no world where Nintendo and Nvidia went into that room with an 8nm SoC in mind and a 3W power budget for the GPU in handheld mode, and came out with a 12 SM GPU. That means either the manufacturing process, or the power consumption must be wrong (or both). I'm basing my power consumption estimates on the assumption that this is a device around the same size as the Switch and with battery life that falls somewhere between TX1 and Mariko units. This seems to be the same assumption almost everyone here is making, and while it could be wrong, I think them sticking with the Switch form-factor and battery life is a pretty safe bet, which leaves the manufacturing process."

"So, what manufacturing process can give a 2.5x improvement in efficiency over Samsung 8nm? The only reasonable answer I can think of is TSMC's 5nm/4nm processes, including 4N, which just happens to be the process Nvidia is using for every other product (outside of acquired Mellanox products) from this point onwards. In Nvidia's Ada white paper (an architecture very similar to Ampere), they claim a 2x improvement in performance per Watt, which appears to come almost exclusively from the move to TSMC's 4N process, plus some memory changes."

The issue is that we are speculating on rumors.

sc94597 on 15 September 2023

Chrkeller said:

The issue is that we are speculating on rumors.

There is some hard evidence though. The Nvidia leak (and the 12 SM GPU, T239, that it proved existed) isn't a rumor.

Nintendo and Nvidia could have changed their minds since then, but at one point they intended to go with a 12 SM GPU.

Basically if Switch 2 = T239, then either process node =/= 8nm or our power consumption assumptions are wrong.

Chrkeller on 15 September 2023

sc94597 said:

Chrkeller said:

The issue is that we are speculating on rumors.

There is some hard evidence though. The Nvidia leak (and the 12 SM GPU, T239, that it proved existed) isn't a rumor.

Nintendo and Nvidia could have changed their minds since then, but at one point they intended to go with a 12 SM GPU.

Basically if Switch 2 = T239, then either process node =/= 8nm or our power consumption assumptions are wrong.

Leaks are assumptions. They aren't hard evidence.

Wind Waker on the Switch was leaked by a 'reliable' source...

What Nintendo will or will not use is speculation, especially since underclocking is a real possibility, as with the current model.

sc94597 on 15 September 2023

Chrkeller said:

sc94597 said:

There is some hard evidence though. The Nvidia leak (and the 12 SM GPU, T239, that it proved existed) isn't a rumor.

Nintendo and Nvidia could have changed their minds since then, but at one point they intended to go with a 12 SM GPU.

Basically if Switch 2 = T239, then either process node =/= 8nm or our power consumption assumptions are wrong.

Leaks are assumptions. They aren't hard evidence.

Wind Waker on the Switch was leaked by a 'reliable' source...

What Nintendo will or will not use is speculation, especially since underclocking is a real possibility, as with the current model.

Sure, if we throw out the assumption that the Switch 2 is using T239, then anything is possible. I was addressing the point that Switch 2 could be T239 AND 8nm.

The post I shared explains why underclocking might not even be an option. There is a point where you can't reduce power consumption anymore by reducing clocks.

Because power consumption is mostly related to voltage, not clock speed, when you reduce clocks but keep the voltage the same, you don't really save much power. A large part of the power consumption called "static power" stays exactly the same, while the other part, "dynamic power", does fall off a bit. What you end up with is much less performance, but only slightly less power consumption. That is, power efficiency gets worse.

So that kink in the efficiency graph, between 420MHz and 522MHz, is the point at which you can't reduce the voltage any more. Any clocks below that point will all operate at the same voltage, and without being able to reduce the voltage, power efficiency gets worse instead of better below that point. The clock speed at that point can be called the "peak efficiency clock", as it offers higher power efficiency than any other clock speed.
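A toy model makes the "kink" easier to see. Almost everything below is assumed for illustration: the linear voltage/clock relation above the floor, the leakage and capacitance constants, and the 0.6V floor are made up. Only the 470MHz peak efficiency clock comes from the quoted post. The point is just the shape: below the voltage floor, efficiency falls as clocks drop; above it, efficiency falls as clocks rise.

```python
V_MIN = 0.6        # hypothetical minimum stable voltage (volts)
F_KINK = 470       # clock (MHz) at which the voltage floor is reached, per the quoted post
LEAK = 0.5         # hypothetical static-power coefficient (watts per volt)
CAP = 0.004        # hypothetical dynamic-power coefficient (watts per V^2 per MHz)


def voltage(clock_mhz):
    """Voltage tracks clock above the kink and is pinned at V_MIN below it."""
    return max(V_MIN, V_MIN * clock_mhz / F_KINK)


def power(clock_mhz):
    """Static power depends only on voltage; dynamic power scales with V^2 * f."""
    v = voltage(clock_mhz)
    static = LEAK * v
    dynamic = CAP * v * v * clock_mhz
    return static + dynamic


def efficiency(clock_mhz):
    """Work per watt, using clock speed as a stand-in for work done."""
    return clock_mhz / power(clock_mhz)


clocks = range(200, 1001, 10)
best_clock = max(clocks, key=efficiency)

for f in (300, 470, 600):
    print(f"{f}MHz: {power(f):.2f}W, efficiency {efficiency(f):.0f} MHz/W")
print(f"Most efficient clock in this toy model: {best_clock}MHz")
```

In this model, dropping from 470MHz to 300MHz cuts performance by over a third, while the static part of the power draw does not shrink at all, which is the effect the quoted passage describes.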

Last edited by sc94597 - on 15 September 2023

Chrkeller on 15 September 2023

sc94597 said:

Chrkeller said:

Leaks are assumptions. They aren't hard evidence.

Wind Waker on the Switch was leaked by a 'reliable' source...

What Nintendo will or will not use is speculation, especially since underclocking is a real possibility, as with the current model.

Sure, if we throw out the assumption that the Switch 2 is using T239, then anything is possible. I was addressing the point that Switch 2 could be T239 AND 8nm.

The post I shared explains why underclocking might not even be an option. There is a point where you can't reduce power consumption anymore by reducing clocks.

Because power consumption is mostly related to voltage, not clock speed, when you reduce clocks but keep the voltage the same, you don't really save much power. A large part of the power consumption called "static power" stays exactly the same, while the other part, "dynamic power", does fall off a bit. What you end up with is much less performance, but only slightly less power consumption. That is, power efficiency gets worse.

So that kink in the efficiency graph, between 420MHz and 522MHz, is the point at which you can't reduce the voltage any more. Any clocks below that point will all operate at the same voltage, and without being able to reduce the voltage, power efficiency gets worse instead of better below that point. The clock speed at that point can be called the "peak efficiency clock", as it offers higher power efficiency than any other clock speed.

I'll admit most of that is over my head. I'll take your word for it.

sc94597 on 15 September 2023

Basically there are a few principles to consider when doing this analysis:

The larger the chip => the more expensive it is, all else equal. This is because you cut fewer chips per wafer AND because there is a greater likelihood of defects, resulting in fewer usable chips.

The "smaller" the process node => the more costly the wafer, but not necessarily the chips produced from it.

This is because:

The "smaller" the process node => the higher the transistor density => the smaller the die for a given design => the more chips that can be cut from each wafer.

Voltage (and therefore power) scales loosely with clock speed until you bring the clock so low that you hit a sort of "minimum voltage"; below that point, the only way to save more power is to shut down cores.

GPUs handle parallel workloads well, so in most cases having more cores can easily make up for a low voltage, but having more cores increases die size and therefore cost (for the reasons mentioned earlier).

There is an optimal voltage/core count for a given power profile on a given process node.

Because core clock can vary but core count is fixed, it is important to get the core count right early.

12 SMs is nowhere near the optimal core count for a Samsung 8N chip at 3W. It might be doable on a TSMC 4N chip.
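The first few principles above (die size, yield, wafer cost) can be sketched with a simple cost model. Treat this as an illustration under stated assumptions, not real foundry data: the wafer prices, die areas, and defect density are all hypothetical, the dies-per-wafer formula is a common approximation with an edge-loss term, and the yield function is the simple Poisson model.

```python
import math

WAFER_DIAMETER_MM = 300.0


def gross_dies(die_area_mm2, wafer_diameter_mm=WAFER_DIAMETER_MM):
    """Approximate whole dies per wafer: wafer area / die area, minus edge loss."""
    radius = wafer_diameter_mm / 2
    return (math.pi * radius ** 2 / die_area_mm2
            - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))


def poisson_yield(die_area_mm2, defects_per_mm2):
    """Poisson yield model: probability that a die has zero defects."""
    return math.exp(-defects_per_mm2 * die_area_mm2)


def cost_per_good_die(wafer_cost, die_area_mm2, defects_per_mm2):
    """Wafer cost divided by the number of defect-free dies it produces."""
    good = gross_dies(die_area_mm2) * poisson_yield(die_area_mm2, defects_per_mm2)
    return wafer_cost / good


DEFECTS = 0.001  # hypothetical defects per mm^2, assumed equal on both nodes

small = cost_per_good_die(5000, 100, DEFECTS)   # 100 mm^2 die, older node
large = cost_per_good_die(5000, 200, DEFECTS)   # same node, twice the die area
shrunk = cost_per_good_die(8000, 100, DEFECTS)  # 200 mm^2 design shrunk to 100 mm^2
                                                # on a denser node with a pricier wafer

print(f"100 mm^2 die, $5000 wafer: ${small:.2f}")
print(f"200 mm^2 die, $5000 wafer: ${large:.2f}")
print(f"100 mm^2 die, $8000 wafer: ${shrunk:.2f}")
```

With these made-up inputs, doubling the die area more than doubles the cost per good die, and the same design on a denser node ends up cheaper per chip even though the wafer itself costs more.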

Last edited by sc94597 - on 15 September 2023

Oneeee-Chan!!! on 15 September 2023

I replied to the wrong person.

Last edited by Oneeee-Chan!!! - on 15 September 2023

zeldaring on 15 September 2023

sc94597 said:

Basically there are a few principles to consider when doing this analysis:

The larger the chip => the more expensive it is, all else equal. This is because you cut fewer chips per wafer AND because there is a greater likelihood of defects, resulting in fewer usable chips.

The "smaller" the process node => the more costly the wafer, but not necessarily the chips produced from it.

This is because:

The "smaller" the process node => the higher the transistor density => the smaller the die for a given design => the more chips that can be cut from each wafer.

Voltage (and therefore power) scales loosely with clock speed until you bring the clock so low that you hit a sort of "minimum voltage"; below that point, the only way to save more power is to shut down cores.

GPUs handle parallel workloads well, so in most cases having more cores can easily make up for a low voltage, but having more cores increases die size and therefore cost (for the reasons mentioned earlier).

There is an optimal voltage/core count for a given power profile on a given process node.

Because voltage can vary but core count is fixed, it is important to get the core count right early.

12 SMs is nowhere near the optimal core count for a Samsung 8N chip at 3W. It might be doable on a TSMC 4N chip.

The guy posting the info is not some idiot; he understands all this. He even caught someone posting a fake NVDA card video and called them out before the card was released. His info says 8nm, and people have replied to him with your same theory, yet he didn't change his info, meaning he probably has a very reliable source. Of course it's all a rumor and we will have to wait, but I would wager on him being right.

Last edited by zeldaring - on 15 September 2023
