Clarifying the 1.5TFLOPS of the SWITCH for those who just see the numbers.

Lafiel on 02 November 2016

vivster said:

Now the Switch's mystical number of 1.5 TFLOPS refers to FP16. The FP32 performance is therefore 750 GFLOPS. For comparison, the PS4 sports 1.84 TFLOPS in FP32. Its FP16 performance is naturally twice as high.

the PS4 still uses GCN 2.0, which afaik doesn't have native FP16/INT16 support, meaning half-precision tasks won't run two at a time on that architecture

GCN 4.0 (Polaris) and Pascal, on the other hand, added native support for half-precision tasks, so that two can be done at a time
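
To put numbers on that, here is a minimal Python sketch of the arithmetic, using only the figures quoted in this thread. The helper name `fp16_peak_gflops` and its `fp16_rate` parameter are made up for illustration. The rate of 2 for the Switch follows from the 1.5 TFLOPS FP16 / 750 GFLOPS FP32 figures, and the rate of 1 for the PS4 is an assumption based on the point above that its architecture has no two-at-a-time FP16 path.

```python
def fp16_peak_gflops(fp32_gflops: float, fp16_rate: float) -> float:
    """Theoretical FP16 peak = FP32 peak times the chip's FP16:FP32 rate."""
    return fp32_gflops * fp16_rate


# FP32 figures as quoted in the thread; the rates are assumptions (see above).
switch_fp16 = fp16_peak_gflops(750.0, 2.0)   # two FP16 operations at a time
ps4_fp16 = fp16_peak_gflops(1840.0, 1.0)     # no double-rate FP16 assumed

print(f"Switch FP16 peak: {switch_fp16:.0f} GFLOPS")
print(f"PS4 FP16 peak:    {ps4_fp16:.0f} GFLOPS")
```

These are theoretical peaks only; they say nothing about how much of a real game's workload can actually run in FP16.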

walsufnir on 02 November 2016

Lafiel said:

vivster said:

Now the Switch's mystical number of 1.5 TFLOPS refers to FP16. The FP32 performance is therefore 750 GFLOPS. For comparison, the PS4 sports 1.84 TFLOPS in FP32. Its FP16 performance is naturally twice as high.

the PS4 still uses GCN 2.0, which afaik doesn't have native FP16/INT16 support, meaning half-precision tasks won't run two at a time on that architecture

GCN 4.0 (Polaris) and Pascal, on the other hand, added native support for half-precision tasks, so that two can be done at a time

I thought he was joking in that case but who knows...

Volterra_90 on 02 November 2016

vivster said:

Volterra_90 said:
I don't really know anything about hardware specs tbh. I know about good games, and that's what I really want... Switch will succeed with a good catalogue. Imo it also has to be cheap to succeed, so I really don't mind if the console is underpowered in the end.

I think so too. 750GFLOPS for the Switch is plenty for the kinds of games it will run. It doesn't seem to strive for parity with the bigger twins.

I just can't stand when people claim it's close to their performance.

As I said, I don't know too much about hardware specs, but I believe that a portable with close to PS4/One power would be extremely expensive and really power-hungry. Correct me if I'm wrong xD. And it would be pretty much absurd to compete with PS4/One, it's a lost battle imo. They're better off doing their own thing. And that is a cheap, Nintendo-based console.

vivster on 02 November 2016

Lafiel said:

vivster said:

Now the Switch's mystical number of 1.5 TFLOPS refers to FP16. The FP32 performance is therefore 750 GFLOPS. For comparison, the PS4 sports 1.84 TFLOPS in FP32. Its FP16 performance is naturally twice as high.

the PS4 still uses GCN 2.0, which afaik doesn't have native FP16/INT16 support, meaning half-precision tasks won't run two at a time on that architecture

GCN 4.0 (Polaris) and Pascal, on the other hand, added native support for half-precision tasks, so that two can be done at a time

Didn't actually know that. Stupid me for overestimating AMD again. Removed it from the OP.

If you demand respect or gratitude for your volunteer work, you're doing volunteering wrong.

Jizz_Beard_thePirate on 02 November 2016

The thing that is really starting to irk me is that people are trying to spin it in a way to show that their console is more powerful than it really is. But it is soooo stupid 'cause FP16 can apply to all current-gen hardware. So Switch is 1.5TF, Pro is 8.4TF and Scorpio is 12TF, and so on when you bring PC into it. So it ends up being the same in relative performance, just with bigger numbers.

Sighh

Granted, it's not exactly like that, but come on

PC Specs: CPU: 7800X3D || GPU: Strix 4090 || RAM: 32GB DDR5 6000 || Main SSD: WD 2TB SN850

BlkPaladin on 02 November 2016

vivster said:

BlkPaladin said:

Just a correction: you are not entirely correct about this. Precision is not how accurately the chip does the calculation but how big the instruction is, at least when you are dealing with programming. FP16, floating point 16, means floating-point instructions that are 16 bits wide at most, and FP32 means instructions of 32 bits at most. For some procedures you only need FP16, and if you optimize correctly you can run two instructions concurrently, depending on the chip; from how nVidia is advertising this chip, that seems to be the case. So in some cases, if you don't need 32-bit instructions, you can process instructions faster this way.

This is one of the reasons why you don't want to look at just FLOPS, or any other number, on its own. I've seen a person saying that the Xbox One runs at twice the speed if you do it in FP16, which may not be the case, since the chip might not be able to do two FP16 instructions at the same time, though it may allow them.

Well, the length of a floating point number IS its precision. Like 3.14159265359 is more precise than 3.14. It's used like that in physics where precision is important and as such you will use the most precise number possible. You can use smaller numbers but the end product while correct will not be as precise.

Precision is just a fancy word for longer numbers.

I covered that in my answer, which I was editing at the time to add more information. But the way you originally worded it made it sound like the calculations may not come out correctly, and you don't always want to be precise, because in a lot of calculations needless precision can throw off your results.

In programming, which is what chips deal with, you may not need to run an instruction in FP32 and can do it in FP16 instead, which speeds up calculations, especially when the chip allows two FP16 instructions to run concurrently. If I remember correctly, it comes down to how many registers there are to run an instruction. Some chips use two 16-bit registers to run a 32-bit instruction and can switch to doing two 16-bit instructions on the fly when there is optimization for it at the machine level. This allows some sections of code to run faster. On the other hand, some registers are 32-bit only, so even if you are putting 16-bit instructions through them they can only do one instruction at a time.
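
For anyone who wants to see the precision difference being argued about here, this is a small Python sketch that stores pi in 16-bit and 32-bit IEEE 754 floats and reads it back. It uses the standard `struct` module (format `e` is half precision, `f` is single precision); the helper name `round_trip` is made up for this example. It only shows how many digits survive in each format and says nothing about how any particular GPU schedules the instructions.

```python
import math
import struct


def round_trip(value: float, fmt: str) -> float:
    """Store value in the given IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


pi_fp16 = round_trip(math.pi, "<e")  # half precision, 16 bits
pi_fp32 = round_trip(math.pi, "<f")  # single precision, 32 bits

print("FP16:", pi_fp16)
print("FP32:", pi_fp32)
print("FP64:", math.pi)  # Python's own float, for reference
```

The FP16 value is only good to about three decimal digits, which is plenty for some graphics work and useless for other calculations.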

COKTOE on 02 November 2016

I knew exactly none of this. Nice post. Good explanation. Thread title seems familiar......

- "If you have the heart of a true winner, you can always get more pissed off than some other asshole."

Lafiel on 02 November 2016

vivster said:

Lafiel said:

the PS4 still uses GCN 2.0, which afaik doesn't have native FP16/INT16 support, meaning half-precision tasks won't run two at a time on that architecture

GCN 4.0 (Polaris) and Pascal, on the other hand, added native support for half-precision tasks, so that two can be done at a time

Didn't actually know that. Stupid me for overestimating AMD again. Removed it from the OP.

the websites I found mentioning that aspect of Polaris seemed to say Nvidia only just added that (to their consumer cards) with Pascal as well - I imagine professional cards have offered native support for longer

vivster on 02 November 2016

BlkPaladin said:

vivster said:

Well, the length of a floating point number IS its precision. Like 3.14159265359 is more precise than 3.14. It's used like that in physics where precision is important and as such you will use the most precise number possible. You can use smaller numbers but the end product while correct will not be as precise.

Precision is just a fancy word for longer numbers.

I covered that in my answer, which I was editing at the time to add more information. But the way you originally worded it made it sound like the calculations may not come out correctly, and you don't always want to be precise, because in a lot of calculations needless precision can throw off your results.

In programming, which is what chips deal with, you may not need to run an instruction in FP32 and can do it in FP16 instead, which speeds up calculations, especially when the chip allows two FP16 instructions to run concurrently. If I remember correctly, it comes down to how many registers there are to run an instruction. Some chips use two 16-bit registers to run a 32-bit instruction and can switch to doing two 16-bit instructions on the fly when there is optimization for it at the machine level. This allows some sections of code to run faster. On the other hand, some registers are 32-bit only, so even if you are putting 16-bit instructions through them they can only do one instruction at a time.

Absolutely correct. Though that's already too technical I think. I went at it from a calculation and math perspective rather than from programming. And the smaller numbers might as well be imprecise, which wouldn't matter since higher precision isn't needed.

If you demand respect or gratitude for your volunteer work, you're doing volunteering wrong.

BlkPaladin on 02 November 2016

Captain_Yuri said:
The thing that is really starting to irk me is that people are trying to spin it in a way to show that their console is more powerful than it really is. But it is soooo stupid 'cause FP16 can apply to all current-gen hardware. So Switch is 1.5TF, x1 is 2.6TF, ps4 is 3.6TF, Pro is 8.4TF and Scorpio is 12TF, and so on when you bring PC into it. So it ends up being the same in relative performance, just with bigger numbers.
Sighh

Actually not. If you look at the post above yours and at my last post, it depends on the chip. You cannot just magically make a chip half-precise to run faster. Depending on how the chip is made, it may double the performance of FP16 instructions or they may run at the same speed. I use registers in my answer because that is how deep my knowledge of these things goes; I'm sure there are other ways to speed up FP16 and FP32 instructions. But a register, for all intents and purposes of this explanation, can only run one instruction at a time. And how the chip is built to run FP32 instructions influences whether it experiences a "speed boost" when running things at half precision. For example, some 32-bit instructions are run on two 16-bit registers. So if it is optimized to do so, when you put a 16-bit instruction into one register you can put another instruction into the other register at the same time, and thus get "twice" the speed in this case. But there are 32-bit registers that will only do one instruction at a time no matter how small the instruction is. So just looking at FLOPS and full precision/half precision doesn't tell the entire story.

FLOPS, like Hertz before it, is just an advertising go-to word that really has limited real-world impact.
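
As a rough illustration of the "two 16-bit operations in the space of one 32-bit one" idea, here is a toy Python model. It is only a sketch: it uses plain integers instead of real FP16 values, the names `pack`, `unpack` and `packed_add` are made up, and it does not describe how any actual GPU is wired. It just shows one 32-bit word carrying two independent 16-bit lanes that are added in a single call.

```python
MASK16 = 0xFFFF


def pack(hi, lo):
    """Put two 16-bit values into one 32-bit word."""
    return ((hi & MASK16) << 16) | (lo & MASK16)


def unpack(word):
    """Split a 32-bit word back into its (hi, lo) 16-bit lanes."""
    return (word >> 16) & MASK16, word & MASK16


def packed_add(a, b):
    """Add both 16-bit lanes of a and b; each lane wraps on its own."""
    lo = (a + b) & MASK16                  # low lane, carry is discarded
    hi = ((a >> 16) + (b >> 16)) & MASK16  # high lane, no carry from low
    return (hi << 16) | lo


result = unpack(packed_add(pack(1, 2), pack(10, 20)))
print(result)  # two additions done on one packed word
```

A chip without this kind of lane splitting would still accept the 16-bit values, but it would process them one per register, which is the "same speed" case described above.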
