
Could the Xbox Series S be MS entry into a Handheld Device

EpicRandy said:

I get what you say, and it's kind of good advice in general, but it's a little more complicated than this. Tflops measures the limit of compute power a chip can have at its corresponding clocks. It is actually very precise for determining the capacity of a chip, in a nutshell. However, this is only one part of the story; the rest can all be summed up by what % you can actually use in any given scenario, or in other words, how starved the chip actually is. It can be starved by an insufficient memory pool, insufficient memory bandwidth, or insufficient power delivery.

No. No. No.

Tflops is *not* actually a measurement of anything.

It's a theoretical number based on a number of hardware attributes and not a measurement of capability. - It is a number that is impossible to achieve in the real world.

It is extremely imprecise, not precise.

For example...
A Radeon 5870 is a 2.72 Teraflop GPU with 2GB of RAM @ 153GB/s of bandwidth.
A Radeon 7850 is a 1.76 Teraflop GPU with 2GB of RAM @ 153GB/s of bandwidth.

So the only real difference is almost 1 Teraflop of compute, right? It's accurate according to you, right? So the Radeon 5870 should win, right?

Then if it's such an accurate measure of compute, why is the 7850 faster in everything, including compute, where in some single precision floating point tasks the 7850 is sometimes more than twice as fast?
(But don't take my word for it)
https://www.anandtech.com/bench/product/1062?vs=1076


Again. Teraflops is absolute bullshit. It literally represents nothing.

EpicRandy said:

Another aspect to consider is the amount and variety of hardware acceleration you have, which may let software bypass the CUs in some scenarios where it would otherwise need to use them on a chip without such acceleration.

In the example you gave, the 2700U is actually very starved by its limited memory pool and bandwidth. The 4500U features 50% more L2 cache and 2x the L3 cache, and supports higher memory frequencies. The CPU side is also more power-hungry on the 2700U than on the 4500U, leaving more leeway for the GPU on the 4500U within the same 25W TDP.

You are just confirming my point, that the number of CU's is not the be-all, end-all.

EpicRandy said:

For the 5500U vs the 4700U, the sole difference is that the CPU side is less power-hungry on the 5500U, allowing for higher clocks on the GPU. But make no mistake: if you were to isolate the GPU power consumption and compare the two, the 4700U would actually be more efficient per watt. Even the more recent RDNA2 is most efficient at around 1300 to 1400 MHz according to this source. The Vega architecture, however, had a much lower most-efficient clock speed. I could not find a source for this, but I remember that at the time of the Vega announcement AMD was using clocks of ~850MHz in their presentation to portray the efficiency increase compared with the older architecture. This was prior to the reveal of the Vega 56 and 64, however, so it is possible that it was tested on engineering samples. This may have shifted with node shrinkage too, but I could not find anything on that; I still really doubt 1800MHz would be more efficient per watt than 1500MHz on the Vega architecture.

Nah. Isolating GPU power consumption doesn't result in higher GPU power consumption.

Remember, binning is actually a thing, and as a process matures you can obtain higher clock speeds without a corresponding increase in power consumption. Sometimes you can achieve higher clocks -and- lower power consumption as processes mature.

And you are right, Vega was extremely efficient at lower clocks. - AMD used a lot less dark silicon to insulate parts of the chip against power leakage in order to reach higher clock rates.



--::{PC Gaming Master Race}::--


When I play a demanding game on the Steam Deck, the battery doesn't even last 2 hours. That's fine for me, and I can tweak a whole bunch of things to squeeze more time out of it. But I don't think that's an option for an Xbox handheld. The Series S is still quite a lot more powerful than the Steam Deck, so battery life would be in a region not compatible with the mass market. By the time such a device would make sense, the next generation should be right around the corner, making the whole thing quite useless.

In other words: no.



Official member of VGC's Nintendo family, approved by the one and only RolStoppable. I feel honored.

Pemalite said:
EpicRandy said:

I get what you say, and it's kind of good advice in general, but it's a little more complicated than this. Tflops measures the limit of compute power a chip can have at its corresponding clocks. It is actually very precise for determining the capacity of a chip, in a nutshell. However, this is only one part of the story; the rest can all be summed up by what % you can actually use in any given scenario, or in other words, how starved the chip actually is. It can be starved by an insufficient memory pool, insufficient memory bandwidth, or insufficient power delivery.


Again. Teraflops is absolute bullshit. It literally represents nothing.

It represents clock multiplied by core count, as you well know. If it truly were completely useless, as you would have people believe, then it wouldn't exist in the first place. I also wouldn't be able to say with 100% confidence that lowering the clock of my GPU until it is 4TF will make it perform worse than running it at its stock 5.3TF.

No, TF alone is not accurate in representing the difference in GPU performance when there are other differences, such as memory bandwidth or a completely different architecture, as in your example of the 5870 vs the 7850. But I've seen you complain about others using TF a few times, and it's not going to change anyone's mind, especially when you use examples from a completely different architecture, or DDR3 vs GDDR5, to prove your point when the person using TFs is comparing RDNA2 to RDNA2.

People will keep using TF's and that will never change. There's no point you getting annoyed about it.



Zippy6 said:
Pemalite said:


Again. Teraflops is absolute bullshit. It literally represents nothing.

It represents clock multiplied by core count, as you well know.

It is actually clock multiplied by core count multiplied by the number of floating point operations each core can do per clock.

It's theoretical. - If you are going to try and "educate me" then at least try and be factual.
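As a rough sketch of how that theoretical peak is put together (the Series X / Series S shader counts and clocks below are the commonly published specs, used here as assumptions, not measurements):

# Minimal sketch: how a "peak TFLOPS" figure is derived.
def peak_tflops(shader_cores: int, clock_ghz: float, flops_per_clock: int = 2) -> float:
    """Theoretical peak TFLOPS = cores x clock (GHz) x FLOPs per core per clock (2 for FP32 FMA)."""
    return shader_cores * clock_ghz * flops_per_clock / 1000.0

# Series X: 52 CUs x 64 shaders = 3328 cores at ~1.825 GHz
print(peak_tflops(3328, 1.825))   # ~12.1 TFLOPS
# Series S: 20 CUs x 64 shaders = 1280 cores at ~1.565 GHz
print(peak_tflops(1280, 1.565))   # ~4.0 TFLOPS

It's a paper ceiling: nothing in the formula says anything about how often those cores can actually be kept fed.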

Zippy6 said:

If it truly were completely useless, as you would have people believe, then it wouldn't exist in the first place. I also wouldn't be able to say with 100% confidence that lowering the clock of my GPU until it is 4TF will make it perform worse than running it at its stock 5.3TF.

It exists because it does have a purpose. Just not what you think it is.

Again, I already provided substantial evidence of a GPU with the exact same amount of DRAM and the exact same amount of bandwidth, but with almost 1 Teraflop less... ending up being faster. Even in compute.

How can you justify Teraflops as legitimate after I've shown you that? You can't use it to accurately compare anything, even hardware from the same manufacturer.

Zippy6 said:

No, TF alone is not accurate in representing the difference in GPU performance when there are other differences, such as memory bandwidth or a completely different architecture, as in your example of the 5870 vs the 7850. But I've seen you complain about others using TF a few times, and it's not going to change anyone's mind, especially when you use examples from a completely different architecture, or DDR3 vs GDDR5, to prove your point when the person using TFs is comparing RDNA2 to RDNA2.

People will keep using TF's and that will never change. There's no point you getting annoyed about it.

So basically you are shifting the goalposts to include other aspects of a GPU as determiners of performance? Then why use Teraflops?

If Teraflops/Gigaflops were useful, it wouldn't matter what the architecture was... 1 Teraflop of compute should be equivalent to 1 Teraflop of compute regardless of the environment; the architecture, memory bandwidth, etc. should be irrelevant. - A Teraflop is a Teraflop; it's no different whether it's on a CPU or a GPU, it is still the *exact* same single precision floating point operation. It doesn't change.

If it's a case of Teraflops relative to the amount of DRAM and Bandwidth you have, then nVidia and AMD release GPU's with all sorts of different ratios... So we can't even use it in the same product lineup/GPU architecture, because how do you normalize/equalize that crap?

For example... Take the Geforce 1030 GDDR5 vs DDR4... Same 1 Teraflop GPU. - Yet the DDR4 variant will perform at around half the speed.
Again... Signifying how useless Teraflops is even with GPU's of the exact same type.
https://www.techspot.com/review/1658-geforce-gt-1030-abomination/
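For reference, a rough sketch of the bandwidth gap behind that result, assuming the commonly quoted memory specs for the two GT 1030 variants (64-bit bus on both, ~6 GT/s GDDR5 vs ~2.1 GT/s DDR4; these figures are assumptions, not from the article above):

# Approximate memory bandwidth of the two GT 1030 variants.
def bandwidth_gb_s(bus_width_bits: int, data_rate_gtps: float) -> float:
    """Bandwidth (GB/s) = bus width in bytes x effective data rate in GT/s."""
    return (bus_width_bits / 8) * data_rate_gtps

print(bandwidth_gb_s(64, 6.0))   # GDDR5 variant: ~48 GB/s
print(bandwidth_gb_s(64, 2.1))   # DDR4 variant:  ~16.8 GB/s

Same nominal 1 Teraflop, roughly a third of the bandwidth to feed it.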

The fact is, we cannot take any piece of hardware with the *slightest* difference and compare it based on Teraflops.

Ergo. It is a useless and bullshit metric.

Zippy6 said:

People will keep using TF's and that will never change. There's no point you getting annoyed about it.

People kept using "bits" as a way to gauge a console's capabilities. How did that turn out?

Over time, taking any single number in these gaming devices and running with it as the be-all-end-all will just result in it becoming a joke... and the individual being flat-out wrong.



--::{PC Gaming Master Race}::--

Machiavellian said:
VAMatt said:

It seems pretty clear to me that Microsoft's long-term strategy is to get completely out of the gaming hardware business. So, I can't see them trying to get into a new segment of that business.

I disagree. MS isn't looking to get out of creating their own hardware; instead they are looking to get out of depending on hardware sales as the sole means of revenue in games. In other words, MS doesn't want to depend on the sales of Xbox hardware for revenue when there are billions of other devices out there that can play games. Why limit your market to just a console when you can at least attempt to allow users to play on whatever they have on hand? It also leverages the strengths of MS as a company, including their cloud stack.

There is zero chance that MS wants to remain in a money losing business like video game hardware.  The only reason they are in it now is to sell software.  But, they are now actively and openly trying to make it so that hardware is irrelevant.  

There are no companies on planet earth that want to make consoles. It's just that a few companies decided that was the best path to generating lots of software revenue. All of those companies would prefer to not have to deal with the hardware anymore. MS seems to be the one with the most ambitious plan to get out of it.



Pemalite said:
Zippy6 said:

No, TF alone is not accurate in representing the difference in GPU performance when there are other differences, such as memory bandwidth or a completely different architecture, as in your example of the 5870 vs the 7850. But I've seen you complain about others using TF a few times, and it's not going to change anyone's mind, especially when you use examples from a completely different architecture, or DDR3 vs GDDR5, to prove your point when the person using TFs is comparing RDNA2 to RDNA2.

People will keep using TF's and that will never change. There's no point you getting annoyed about it.

So basically you are shifting the goalposts to include other aspects of a GPU as determiners of performance? Then why use Teraflops?

If Teraflops/Gigaflops were useful, it wouldn't matter what the architecture was... 1 Teraflop of compute should be equivalent to 1 Teraflop of compute regardless of the environment; the architecture, memory bandwidth, etc. should be irrelevant. - A Teraflop is a Teraflop; it's no different whether it's on a CPU or a GPU, it is still the *exact* same single precision floating point operation. It doesn't change.

If it's a case of Teraflops relative to the amount of DRAM and Bandwidth you have, then nVidia and AMD release GPU's with all sorts of different ratios... So we can't even use it in the same product lineup/GPU architecture, because how do you normalize/equalize that crap?

For example... Take the Geforce 1030 GDDR5 vs DDR4... Same 1 Teraflop GPU. - Yet the DDR4 variant will perform at around half the speed.
Again... Signifying how useless Teraflops is even with GPU's of the exact same type.
https://www.techspot.com/review/1658-geforce-gt-1030-abomination/

The fact is, we cannot take any piece of hardware with the *slightest* difference and compare it based on Teraflops.

Ergo. It is a useless and bullshit metric.

It seems like a bad metric to use, but isn't there still some use to it that keeps it from being outright useless? Comparing the teraflops of the Series X and Series S, for example, easily conveys the really big power difference between the two, which is useful.



Norion said:

It seems like a bad metric to use, but isn't there still some use to it that keeps it from being outright useless? Comparing the teraflops of the Series X and Series S, for example, easily conveys the really big power difference between the two, which is useful.

Except we have established that, whenever there is any difference in hardware, Teraflops isn't comparable... Because the ratio of compute to say... Memory capacity/bandwidth/caches and more is different.

There is literally no scenario where you can compare a 1 Teraflop part to a 1 Teraflop part, unless the entire system is *exactly* the same, right down to the transistor... And that never happens.

For example, the Series X has 12.1 Teraflops @560GB/s of bandwidth.

So to scale that downwards linearly... The Series S is 4 Teraflops, so its bandwidth would need to be about 186GB/s, i.e. exactly 1/3rd, but it's 225GB/s. - So its "FPS per flop" will be different to the Series X.
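To put numbers on that ratio, a quick back-of-envelope sketch using the figures quoted above:

# Bandwidth per theoretical TFLOP, using the figures quoted in this post.
series_x_tf, series_x_bw = 12.1, 560.0   # TFLOPS, GB/s
series_s_tf, series_s_bw = 4.0, 225.0

print(series_x_bw / series_x_tf)                 # ~46 GB/s per theoretical TFLOP
print(series_s_bw / series_s_tf)                 # ~56 GB/s per theoretical TFLOP
print(series_x_bw * series_s_tf / series_x_tf)   # linear scaling would give ~185 GB/s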

And this is why Teraflops is bullshit and a useless metric.



--::{PC Gaming Master Race}::--

Pemalite said:
Norion said:

It seems like a bad metric to use, but isn't there still some use to it that keeps it from being outright useless? Comparing the teraflops of the Series X and Series S, for example, easily conveys the really big power difference between the two, which is useful.

Except we have established that, whenever there is any difference in hardware, Teraflops isn't comparable... Because the ratio of compute to say... Memory capacity/bandwidth/caches and more is different.

There is literally no scenario where you can compare a 1 Teraflop part to a 1 Teraflop part, unless the entire system is *exactly* the same, right down to the transistor... And that never happens.

For example, the Series X has 12.1 Teraflops @560GB/s of bandwidth.

So to scale that downwards linearly... The Series S is 4 Teraflops, so its bandwidth would need to be about 186GB/s, i.e. exactly 1/3rd, but it's 225GB/s. - So its "FPS per flop" will be different to the Series X.

And this is why Teraflops is bullshit and a useless metric.

Easily conveying a power difference to people is useful, and teraflops can do that in the right circumstances, since despite the inaccuracy it still gives a general idea of the power gap between the two. The inaccuracy does still make it a bad metric though, so what metric would you suggest be used instead?



VAMatt said:
Machiavellian said:

I disagree. MS isn't looking to get out of creating their own hardware; instead they are looking to get out of depending on hardware sales as the sole means of revenue in games. In other words, MS doesn't want to depend on the sales of Xbox hardware for revenue when there are billions of other devices out there that can play games. Why limit your market to just a console when you can at least attempt to allow users to play on whatever they have on hand? It also leverages the strengths of MS as a company, including their cloud stack.

There is zero chance that MS wants to remain in a money losing business like video game hardware.  The only reason they are in it now is to sell software.  But, they are now actively and openly trying to make it so that hardware is irrelevant.  

There are no companies on planet earth that want to make consoles. It's just that a few companies decided that was the best path to generating lots of software revenue. All of those companies would prefer to not have to deal with the hardware anymore. MS seems to be the one with the most ambitious plan to get out of it.

What makes you believe that consoles are a money-losing position? That is not how the business works. The reason you can sell hardware at a loss is because people can only purchase and play games on your system, so MS gets their cut no matter who is selling the game on their platform. In order to maximize your own revenue stream, you need to do both. You need your own mobile hardware that is not tied to Android or iOS, that you can sell your subs on, and that maximizes profits from your own storefront. MS will continue to make their own console, just like they continue to make their own hardware, because it keeps more customers in their ecosystem, dependent on their hardware and software. They also want anyone who wants to play their games but is still locked to other environments like iOS, Android, Nintendo and Sony systems. It's about maximizing all your revenue streams, not limiting yourself to just one.

This is why MS has added their cloud initiative to GP: it gives them flexibility to maximize GP no matter what hardware someone is using. But like Nintendo, they need that sweet spot on the mobile front that gives a bigger screen than your largest phone and lets you take your games anywhere easily. A Switch-like Series S would do exactly that and cover all the bases.



Pemalite said:
EpicRandy said:

I get what you say, and it's kind of good advice in general, but it's a little more complicated than this. Tflops measures the limit of compute power a chip can have at its corresponding clocks. It is actually very precise for determining the capacity of a chip, in a nutshell. However, this is only one part of the story; the rest can all be summed up by what % you can actually use in any given scenario, or in other words, how starved the chip actually is. It can be starved by an insufficient memory pool, insufficient memory bandwidth, or insufficient power delivery.

No. No. No.

Tflops is *not* actually a measurement of anything.

It's a theoretical number based on a number of hardware attributes and not a measurement of capability. - It is a number that is impossible to achieve in the real world.

It is extremely imprecise, not precise.

For example...
A Radeon 5870 is a 2.72 Teraflop GPU with 2GB of RAM @ 153GB/s of bandwidth.
A Radeon 7850 is a 1.76 Teraflop GPU with 2GB of RAM @ 153GB/s of bandwidth.

So the only real difference is almost 1 Teraflop of compute, right? It's accurate according to you, right? So the Radeon 5870 should win, right?

Then if it's such an accurate measure of compute, why is the 7850 faster in everything, including compute, where in some single precision floating point tasks the 7850 is sometimes more than twice as fast?
(But don't take my word for it)
https://www.anandtech.com/bench/product/1062?vs=1076


Again. Teraflops is absolute bullshit. It literally represents nothing.

EpicRandy said:

Another aspect to consider is the amount and variety of hardware acceleration you have, which may let software bypass the CUs in some scenarios where it would otherwise need to use them on a chip without such acceleration.

In the example you gave, the 2700U is actually very starved by its limited memory pool and bandwidth. The 4500U features 50% more L2 cache and 2x the L3 cache, and supports higher memory frequencies. The CPU side is also more power-hungry on the 2700U than on the 4500U, leaving more leeway for the GPU on the 4500U within the same 25W TDP.

You are just confirming my point, that the number of CU's is not the be-all, end-all.

EpicRandy said:

For the 5500U vs the 4700U, the sole difference is that the CPU side is less power-hungry on the 5500U, allowing for higher clocks on the GPU. But make no mistake: if you were to isolate the GPU power consumption and compare the two, the 4700U would actually be more efficient per watt. Even the more recent RDNA2 is most efficient at around 1300 to 1400 MHz according to this source. The Vega architecture, however, had a much lower most-efficient clock speed. I could not find a source for this, but I remember that at the time of the Vega announcement AMD was using clocks of ~850MHz in their presentation to portray the efficiency increase compared with the older architecture. This was prior to the reveal of the Vega 56 and 64, however, so it is possible that it was tested on engineering samples. This may have shifted with node shrinkage too, but I could not find anything on that; I still really doubt 1800MHz would be more efficient per watt than 1500MHz on the Vega architecture.

Nah. Isolating GPU power consumption doesn't result in higher GPU power consumption.

Remember, binning is actually a thing, and as a process matures you can obtain higher clock speeds without a corresponding increase in power consumption. Sometimes you can achieve higher clocks -and- lower power consumption as processes mature.

And you are right, Vega was extremely efficient at lower clocks. - AMD used a lot less dark silicon to insulate parts of the chip against power leakage in order to reach higher clock rates.

OK, to clarify things a bit, your position is:

Teraflops is absolute bullshit. It literally represents nothing.

My position:

TFLOPS should be contextualized before use, as it's only as good as you can realistically task it.

Tflops is *not* actually a measurement of anything.

Yes it is; it's not some kind of metric obtained with a dice roll when a new GPU enters the market. Each shader core (AMD) and CUDA core (Nvidia), which are responsible for general floating-point operations on their respective GPUs, can do up to 2 FLOPs per clock for 32-bit floating-point operations. That's the physical limit of those cores. Saying TFLOPS is not worth anything is akin to saying HP means nothing for cars, and wattage means nothing for electric motors. Every single metric, for just about anything simple or complex, is only as good as your ability to contextualize it.

It's a theoretical number based on a number of hardware attributes and not a measurement of capability. - It is a number that is impossible to achieve in the real world.

Yes, like I explained earlier, it is theoretical because you could never task those cores 100%. So it can be viewed as a theoretical limit, or as a physical barrier you could never exceed or even attain at the reference clock, but it still is very much a measurement of capability.


A Radeon 5870 is a 2.72 Teraflop GPU with 2GB of RAM @ 153GB/s of bandwidth.
A Radeon 7850 is a 1.76 Teraflop GPU with 2GB of RAM @ 153GB/s of bandwidth.

So the only real difference is almost 1 Teraflop of compute, right? It's accurate according to you, right? So the Radeon 5870 should win, right?

Then if it's such an accurate measure of compute, why is the 7850 faster in everything, including compute, where in some single precision floating point tasks the 7850 is sometimes more than twice as fast?
(But don't take my word for it)
https://www.anandtech.com/bench/product/1062?vs=1076

There is another very significant difference between the two. When I made my list of what could starve GPU cores ("insufficient memory pool, insufficient memory bandwidth, and insufficient power delivery") I did not mean it to be exhaustive. Here it's the flaws of the TeraScale architecture that starve the cores of the high-end 5870. No matter the generation, architecture, or revision, the high-end/enthusiast segments are meant to push limits by sacrificing efficiency to get the last drops of performance, so they should always be viewed with high diminishing returns in mind. The 7850 is a mid-range GPU using the better GCN architecture, which results in significantly less starvation of its cores.

You are just confirming my point, that the number of CU's is not the be-all, end-all.

It confirms my point too, because I never claimed the opposite, and using TFLOPS in certain scenarios does not mean I view it as a be-all, end-all either. In fact, my entire statement points to carefulness when using Tflops, and using CUs would only be worse, so I don't know why you try to claim the opposite as my position.

Nah. Isolating GPU power consumption doesn't result in higher GPU power consumption.

Remember, binning is actually a thing, and as a process matures you can obtain higher clock speeds without a corresponding increase in power consumption. Sometimes you can achieve higher clocks -and- lower power consumption as processes mature.

I think you misunderstood what I was trying to say. The default TDP is for the whole APU, so the 5500U having a less power-hungry CPU means the GPU has more available power, hence the 200MHz higher clock frequency. The RDNA2 architecture (used just for comparison, as I did not find an equivalent chart for Vega) shows a 25% increase in watts going from 1600MHz to 1800MHz, which is only a 12.5% increase in performance. No doubt the Vega architecture is not even that good, so the ratio may be worse. Binning is a thing, but the best bins would be reserved for the higher tiers with better profit margins, like the 4800U and 4980U, and it has only a marginal impact, nothing of the sort needed to bridge a 12.5% performance/watt gap.
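A quick back-of-envelope check of what that trade-off implies for efficiency (taking the 25% power figure quoted above as-is, as an assumption):

# 1600 MHz -> 1800 MHz on RDNA2, per the figures quoted in this post.
power_ratio = 1.25        # ~25% more watts
perf_ratio = 1800 / 1600  # ~12.5% more theoretical throughput at best

print(perf_ratio / power_ratio)  # ~0.90, i.e. roughly 10% worse performance per watt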

And binning does not always end up being used for power savings. Look at the RX 400 series vs the RX 500 series: the difference was only better bins, but they used them to get higher clocks (about 6%), and since they pushed the architecture to the max it actually resulted in worse performance/watt, with the 580 using 23% more watts than the 480.

Anyway, this whole conversation is weird because none of your points disprove the initial context in which I used the TFLOPS figure. I said that AMD/MS needs to attain the 4 TFLOPS envelope of the RDNA2 architecture with a mobile-like TDP, and I pointed to another RDNA2 chip (which I mistakenly wrote as RDNA3 in my previous post, sorry if that's the source of the debate) with closely matched TFLOPS. I did not claim their performance was equivalent or comparable based on this; it was meant to show that AMD has already successfully reduced the TDP envelope of the RDNA2 architecture with the change from 7nm to 6nm. To contextualize things more, the 680M has a max 50W TDP but boasts 80%+ of its performance at 25W, see the benchmark here. This is promising when you consider the semi-custom design of the Series S is even more efficient, using more cores at lower clock speeds. So it only adds to the plausibility of the video in the OP. RDNA2 at 4nm or even 3nm should be more than enough to push a 4 TFLOPS RDNA2 package under 25W and even have a shot at a 15W APU target.
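To put rough numbers on the "closely matched TFLOPS" point (the 12 CUs @ ~2.4 GHz for the 680M and 20 CUs @ ~1.565 GHz for the Series S below are the commonly published specs, used here as assumptions rather than measurements):

# Theoretical FP32 peaks for the two RDNA2 parts being compared.
def rdna2_peak_tflops(cus: int, clock_ghz: float) -> float:
    return cus * 64 * 2 * clock_ghz / 1000.0  # 64 shaders per CU, 2 FLOPs per clock

print(rdna2_peak_tflops(12, 2.4))    # Radeon 680M: ~3.7 TFLOPS (narrow and fast)
print(rdna2_peak_tflops(20, 1.565))  # Series S:    ~4.0 TFLOPS (wider, lower-clocked)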

TFLOPS is also very useful in this context because consoles must keep this metric when doing a die shrink. Look at the PS5: it now uses the 6nm Oberon Plus revision of the chip and shaved off 20W to 30W, but kept the same TFLOPS target (the same clock speed and the same number of shader cores), same memory, same bandwidth, same everything, just shrunk down. They have to do it this way to keep changes invisible to developers, and that's basically what I anticipate Xbox will do with the Series consoles, whether or not they want a revised S in a Switch-like format. If MS were to use a different architecture like RDNA 3 or 4, it is unlikely they would use a different TFLOPS target either if they want to keep things invisible to devs (it may not even be possible here, but the more you keep the same, the easier it should be for a dev to create a new build from the S version).