
Could the Xbox Series S be MS entry into a Handheld Device

DonFerrari said:
EpicRandy said:

They are already in competition with Nintendo. Adding a switch like experience to their offering would not change anything in that regards. The true question is would that add anything significant to their offering? For me the answer is yes as it would also synergize well with their service offering.

Actually nope, neither MS (which has said Sony and Nintendo aren't their real competitors since this gen) nor Nintendo (who have said they aren't competing with Sony and MS since the Wii) recognizes the other as a competitor, and MS has never competed in portable consoles. It isn't really as simple as "owww, just take the same system sitting on your table and make it something you hold in your hand".

Machiavellian said:

It's not about competing with the Switch, it's about offering a mobile device to fill a gap in their product lineup. If you are going to get a Series S for whatever reason, having it also work as a mobile device could probably seal the deal and improve sales of the device.

Considering their games are already available portably on the Steam Deck and others, and that they aren't even caring that much about improving sales on console hardware, I really don't see much reason for them to try portable as a new revenue stream.

Windows runs like ass on the Steam Deck. Windows in its current iteration is a super pain in the butt to navigate with controller controls. Also, only GP runs on the Steam Deck, not the MS storefront, which means Valve gets the lion's share of Steam Deck sales, not MS. MS would want a device where you are locked into their ecosystem, using their hardware, getting games from their storefront. As stated, a low-tier console option, which MS has already created, in a more flexible mobile build fills both gaps in their hardware strategy. Since MS is already selling the Series S, making it into a Switch-like device would make it a better option for anyone looking to purchase an S.

Also, who says they are not caring about improving sales on console hardware? I would just take that as an opinion based on speculation.



DonFerrari said:
EpicRandy said:

They are already in competition with Nintendo. Adding a switch like experience to their offering would not change anything in that regards. The true question is would that add anything significant to their offering? For me the answer is yes as it would also synergize well with their service offering.

Actually nope, neither MS (which has said Sony and Nintendo aren't their real competitors since this gen) nor Nintendo (who have said they aren't competing with Sony and MS since the Wii) recognizes the other as a competitor, and MS has never competed in portable consoles. It isn't really as simple as "owww, just take the same system sitting on your table and make it something you hold in your hand".

To be fair, MS did not say they don't consider Nintendo or PlayStation competitors; they did say in 2020 they didn't view them as their main/biggest ones, a role they assigned to Google and Amazon instead.

Yet Google folded their initiative, Amazon has yet to gain any relevance, and MS's own xCloud has yet to come out of the beta status reserved for GP Ultimate subs.

If they were asked the same question now, I'm really not sure they would give the same answer but that depends on how their views of cloud gaming have evolved.

For Nintendo, I did a quick search and found this article, but the title is quite misleading. While Reggie did mention Nintendo doesn't necessarily see Microsoft and Sony as "competition," he then clarified that he was referring to console power and third-party support, where their views diverge from the other two. But he also recognized them both as direct competitors ("My competitive set is much bigger than my direct competitors in Sony and Microsoft") when battling for consumers' attention and their "entertainment time".

Anyway, those are all PR statements; companies do not get to pick their own competitors. If you ask yourself, "is it possible that someone looking for a console can hesitate between Nintendo, Sony, and Microsoft offerings?" and answer yes, then they are competitors, at least to varying degrees.



Machiavellian said:
DonFerrari said:

Actually nope, neither MS (which has said Sony and Nintendo aren't their real competitors since this gen) nor Nintendo (who have said they aren't competing with Sony and MS since the Wii) recognizes the other as a competitor, and MS has never competed in portable consoles. It isn't really as simple as "owww, just take the same system sitting on your table and make it something you hold in your hand".

Machiavellian said:

It's not about competing with the Switch, it's about offering a mobile device to fill a gap in their product lineup. If you are going to get a Series S for whatever reason, having it also work as a mobile device could probably seal the deal and improve sales of the device.

Considering their games are already available portably on the Steam Deck and others, and that they aren't even caring that much about improving sales on console hardware, I really don't see much reason for them to try portable as a new revenue stream.

Windows runs like ass on the Steam Deck. Windows in its current iteration is a super pain in the butt to navigate with controller controls. Also, only GP runs on the Steam Deck, not the MS storefront, which means Valve gets the lion's share of Steam Deck sales, not MS. MS would want a device where you are locked into their ecosystem, using their hardware, getting games from their storefront. As stated, a low-tier console option, which MS has already created, in a more flexible mobile build fills both gaps in their hardware strategy. Since MS is already selling the Series S, making it into a Switch-like device would make it a better option for anyone looking to purchase an S.

Also, who says they are not caring about improving sales on console hardware? I would just take that as an opinion based on speculation.

And here I was believing in "play anywhere, however you want"... it doesn't look that expansive now.




EpicRandy said:
DonFerrari said:

Actually nope, neither MS (which has said Sony and Nintendo aren't their real competitors since this gen) nor Nintendo (who have said they aren't competing with Sony and MS since the Wii) recognizes the other as a competitor, and MS has never competed in portable consoles. It isn't really as simple as "owww, just take the same system sitting on your table and make it something you hold in your hand".

To be fair, MS did not say they don't consider Nintendo or PlayStation competitors; they did say in 2020 they didn't view them as their main/biggest ones, a role they assigned to Google and Amazon instead.

Yet Google folded their initiative, Amazon has yet to gain any relevance, and MS's own xCloud has yet to come out of the beta status reserved for GP Ultimate subs.

If they were asked the same question now, I'm really not sure they would give the same answer but that depends on how their views of cloud gaming have evolved.

For Nintendo, I did a quick search and found this article, but the title is quite misleading. While Reggie did mention Nintendo doesn't necessarily see Microsoft and Sony as "competition," he then clarified that he was referring to console power and third-party support, where their views diverge from the other two. But he also recognized them both as direct competitors ("My competitive set is much bigger than my direct competitors in Sony and Microsoft") when battling for consumers' attention and their "entertainment time".

Anyway, those are all PR statements; companies do not get to pick their own competitors. If you ask yourself, "is it possible that someone looking for a console can hesitate between Nintendo, Sony, and Microsoft offerings?" and answer yes, then they are competitors, at least to varying degrees.

I do agree with you that it is all PR, and I also agree that you don't choose your competitors. But when sales of PS and Xbox hardly affect Nintendo sales and vice-versa, and this has been true ever since the Wii, then you can infer that they don't directly compete but act more as indirect competitors, or in Kotler's terms more like substitute products (as in, if the product you really want drops the ball you may consider the other, but in a normal situation you don't really want to get it; someone who really wants a Switch for the Nintendo titles would hardly consider buying a PS or Xbox instead. Sure, a lot of us buy both, but if the products were direct competitors very few people would have reason to buy both).




EpicRandy said:
Pemalite said:

Framerates tend to be a good one.

Well yeah, I agree, but there are still many flaws with this.

  1. FPS can only measure already available cards and serve no purpose in evaluating future ones.
  2. It still needs contextualization, what game was running, what was the target resolutions, what was all the post-processing effects, what API was used, and what engine was used.
  3. 2 GPUs may perform almost exactly the same on 1 title yet vastly differently on another.
  4. It is very susceptible to manipulation
    1. Nvidia, through their partner programs with devs, makes certain they use sometimes unnecessary features, or levels of utilization of a feature, that impact performance on AMD GPUs, e.g. The Witcher 3's use of 64x tessellation with HairWorks, designed to cripple performance on AMD GPUs with no image fidelity gain (past 16x).
    2. anandtech: "Let's start with the obvious. NVIDIA is more aggressive than AMD with trying to get review sites to use certain games and even make certain GPU comparisons."
    3. Not too long ago, Nvidia driver revisions had a tendency to decrease GPU performance while AMD revisions increased performance over the lifetime of their GPUs. When new gens were announced, Nvidia compared the latest drivers' performance on their old gen to the new gen, showing skewed comparisons.
    4. AMD was caught using blatantly wrong numbers when displaying FPS improvements gen over gen with the RX 7900 XTX.
  5. When gaming, many kinds of workloads are computed all at once, some that have better utilization of the GPU than others, but the shown performance of the GPU in fps will only rise to that of the least performing one.
  6. For certain workloads, average FPS would literally be a trash figure while Tflops will depict things more accurately.

It's literally why benchmarks actually exist.

Tflops can never depict anything accurately.
I have already showcased how different GPUs can perform worse even with more Tflops.
I have already showcased how identical GPUs with the same Tflops can perform at half the speed.

So to assert that it can "depict things accurately" when you can double your performance with a part that has identical Tflops is disingenuous.

Not only that, Teraflops only represents single-precision floating point...

So for anything involving Quarter Precision, Half Precision, Double Precision, Integers, Geometry throughput, Texel/Pixel/Texture fillrate... Teraflops has no bearing. That's like a massive chunk of the GPU and stuff you know.
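For reference, the other peak figures mentioned there (fillrates and so on) come from different functional units entirely, so none of them can be derived from the FP32 number. A minimal sketch of how each separate peak rating is typically computed, using made-up unit counts and clocks rather than any real card's spec:

```python
# Illustrative only: separate peak-rating formulas for different GPU sub-units.
# Unit counts and clock are made-up example values, not a real card's spec.

shader_cores = 2048   # FP32 stream processors
tmus = 128            # texture mapping units
rops = 64             # render output units
clock_ghz = 1.8       # core clock in GHz

fp32_tflops = shader_cores * 2 * clock_ghz / 1000   # 2 FLOPs/cycle via FMA
texel_fillrate = tmus * clock_ghz                   # GTexels/s
pixel_fillrate = rops * clock_ghz                   # GPixels/s

print(f"Peak FP32:      {fp32_tflops:.2f} TFLOPS")
print(f"Texel fillrate: {texel_fillrate:.1f} GTexel/s")
print(f"Pixel fillrate: {pixel_fillrate:.1f} GPixel/s")
```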

EpicRandy said:
Pemalite said:

No. It's actually not.

It's a bunch of numbers multiplied together. - It's theoretical, not real world.

Again, no GPU or CPU will ever achieve their "hypothetical teraflops" in the real world.

Those multiplications literally represent how the stream processors work; they can perform 2 FLOPs per cycle by design.
That much is not theoretical: processors require a high and low signal, aka a clock cycle, to operate, and stream processors are mostly designed to run 2 FP32 instructions per clock. That's not a theory, that's how they work. Some workloads such as scientific simulations, machine learning, and data analytics have better utilization and are sometimes close to 100%.

Again. It's theoretical, not real world.

Otherwise if we took the Geforce 1030 DDR4 and GDDR5 variants, we wouldn't have one that is almost twice as fast at the same Teraflops.

No, they are not 2 flops per cycle.
They can do 2 operations per cycle, very different. - Not all operations are the same as some can be packed together to form one operation... Or one operation may be split into many.
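The GT 1030 example above comes down to memory bandwidth rather than compute. A rough back-of-the-envelope sketch, assuming approximate figures for the two variants (treat the numbers as illustrative):

```python
# Rough illustration: two cards with near-identical FP32 ratings but very
# different memory bandwidth. Figures are approximate and for illustration only.

def bandwidth_gb_s(bus_width_bits: int, data_rate_gtps: float) -> float:
    """Peak bandwidth in GB/s = (bus width in bytes) * effective transfer rate."""
    return bus_width_bits / 8 * data_rate_gtps

gt1030_gddr5 = bandwidth_gb_s(64, 6.0)   # ~48 GB/s
gt1030_ddr4 = bandwidth_gb_s(64, 2.1)    # ~17 GB/s

print(f"GT 1030 GDDR5: {gt1030_gddr5:.0f} GB/s")
print(f"GT 1030 DDR4:  {gt1030_ddr4:.0f} GB/s")
# Both variants sit at roughly ~1 TFLOPS FP32, yet one has ~3x the bandwidth,
# which is why their real-world performance diverges so sharply.
```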

EpicRandy said:
Pemalite said:

But by that same vein, I could grab a 1 teraflop "rated" GPU and assert it's theoretically capable of "1.2 Teraflops" based on any number of factors.

It's meaningless, because it's unachievable in any real world scenario.

No, I literally wrote, "physical barrier you could never exceed or even attain". So unless you are able to overclock it by 20%, that's not possible.

Some real-world scenarios get really close to 100%, just not gaming in general but it's already better with consoles due to static hardware and specific optimization.

Except you can exceed the theoretical Teraflop number by combining operations if you make your ALU's fat enough.

EpicRandy said:
Pemalite said:

What is starving the 5870 to have less real-world teraflops than the 7850?

Explain it. I'll wait.

Sure here's a very good read on the utilization of the TeraScale architecture: 

Utilization remains a big concern though, for both the SPUs and the SPs within them: not only must the compiler do its best to identify 5 independent datapoints for each VLIW thread, but so must 64 VLIW threads be packed together within each wavefront. Further, the 64 items in a wavefront should all execute against the same instruction; imagine a scenario wherein one thread executes against an entirely different instruction from the other 63! Opportunities for additional clock cycles & poor utilization thus abound and the compiler must do it’s best to schedule around them.

With 5 SPs in each SPU, attaining 100% utilization necessitates five datapoints per VLIW thread. That’s the best case; in the worst case an entire thread is comprised of just a single datapoint resulting in an abysmal 20% utilization as 4 SPs simply engage in idle chit-chat. Extremities aside, AMD noted an average utilization of 68% or 3.4 SPs per clock cycle. A diagram from AnandTech’s GCN preview article depicts this scenario, and it’s a good time to borrow it here:

The HD 6900 series would serve as the last of the flagship TeraScale GPUs, even as TeraScale based cards continued to release until October of 2013. As compute applications began to take center-stage for GPU acceleration, games too evolved. The next generation of graphics API’s such as DirectX 10 brought along complex shaders that made the VLIW-centric design of TeraScale ever more inefficient and impractically difficult to schedule for. The Radeon HD 7000 series would accordingly usher in the GCN architecture, TeraScale’s inevitable successor that would abandon VLIW and ILP entirely and in doing so cement AMD’s focus on GPU compute going forward.

Did you really just link to a blog? Either way, I have a very low-level understanding of TeraScale and its derivatives.

And ironically, AMD has introduced VLIW-like ideas into RDNA3 by introducing dual-issue ALU's... It's a cheap way to increase throughput.

Which is partially why going from the Radeon RX 6950 @ 19.3 Teraflops to the RX 7900 XTX @46 Teraflops hasn't resulted in more than double the performance, because Teraflops is bullshit.

EpicRandy said:

    Pemalite said:

    My argument is that TFLOPS is bullshit in using it to determine the capability of a GPU.

    Nor are higher clocks always a detriment to power consumption... It's a balancing act as all CPU's and GPU's have an efficiency curve.

    For example... You can buy a CPU, undervolt it... Then overclock it... And end up with a CPU that uses less power, but offers higher performance due to its higher clockrate.

    Yes, I know all that, but the efficiency curve gets exponentially worse with clocks past a certain speed, and GPUs' default clocks are already past that point; binning only has marginal impacts here. You're able to undervolt some GPUs only because their configurations are designed to cater to the worst binning offender of a particular SKU, with some headroom to spare.

    Binning has massive impacts, depending on the process.

    Some chips cannot handle higher currents... Otherwise they suffer from an issue known as "electromigration" which will destroy the silicon.

    Hence why Polaris through binning went from the RX 480 to RX 580, yes power draw increased with that jump, but only because they could get away with it.

    Consequently AMD and Intel use binning on all their CPUs... For example the Ryzen 5500 and 5600X are fundamentally the same chip, but through binning hit different performance/power targets. - And parts of the damaged L3 got lasered off.

    Heck, we could even go back to Phenom, where AMD would use the same chip design for its entire lineup from dual-cores right up to quad-cores. Some of the chips with a damaged core would have that core disabled; thankfully you could often re-enable it by setting Advanced Clock Calibration to Auto, though sometimes you needed to pump more volts or lower the clocks.
    That's binning.

    DonFerrari said:

    I would say your analogy with cars already gets the idea across.

    Of course one car having 1000hp and another car having 900hp doesn't mean much when there are several other design elements that will impact the performance of that car, from a simple 0-100km/h (0-60mph) to the time to do a lap (which can even be affected by the driver itself: on the exact same car and conditions, either putting 2 drivers to do a lap, or even the same driver doing multiple laps, the times will differ).

    So would that mean measuring HP is totally useless? Absolutely not =p

    Nah. It's not like horsepower at all.

    It's more like CCs on an engine... Aka. The air displacement.

    You can get lower-CC engines to outperform higher-CC engines based on a number of design factors.




    DonFerrari said:
    EpicRandy said:

    To be fair, MS did not say they don't consider Nintendo or PlayStation competitors; they did say in 2020 they didn't view them as their main/biggest ones, a role they assigned to Google and Amazon instead.

    Yet Google folded their initiative, Amazon has yet to gain any relevance, and MS's own xCloud has yet to come out of the beta status reserved for GP Ultimate subs.

    If they were asked the same question now, I'm really not sure they would give the same answer but that depends on how their views of cloud gaming have evolved.

    For Nintendo, I did a quick search and found this article, but the title is quite misleading. While Reggie did mention Nintendo doesn't necessarily see Microsoft and Sony as "competition," he then clarified that he was referring to console power and third-party support, where their views diverge from the other two. But he also recognized them both as direct competitors ("My competitive set is much bigger than my direct competitors in Sony and Microsoft") when battling for consumers' attention and their "entertainment time".

    Anyway, those are all PR statements; companies do not get to pick their own competitors. If you ask yourself, "is it possible that someone looking for a console can hesitate between Nintendo, Sony, and Microsoft offerings?" and answer yes, then they are competitors, at least to varying degrees.

    I do agree with you that it is all PR, and I also agree that you don't choose your competitors. But when sales of PS and Xbox hardly affect Nintendo sales and vice-versa, and this has been true ever since the Wii, then you can infer that they don't directly compete but act more as indirect competitors, or in Kotler's terms more like substitute products (as in, if the product you really want drops the ball you may consider the other, but in a normal situation you don't really want to get it; someone who really wants a Switch for the Nintendo titles would hardly consider buying a PS or Xbox instead. Sure, a lot of us buy both, but if the products were direct competitors very few people would have reason to buy both).

    Yeah, that's why I said to varying degrees. It's undeniable that MS and Sony overlap more with each other than with Nintendo.

    The fact that many of us buy both Nintendo products and either Xbox or PS can also be viewed as competition, though: like Reggie said, they compete for the entertainment time of their users. It also makes them directly compete for the budget of said users. Not every gamer/family has the budget for 2 systems even if they have the will to do so.

    A few years ago I capped my own gaming budget so as not to spend like crazy like I used to. As a result, I buy way fewer Nintendo products, since they hardly ever move from release price and take a toll on my budget. Instead, I mostly game off of Steam sales and GP titles. Last year I only bought the latest Metroid; this year it will be Tears of the Kingdom and probably nothing else.



    The current Xbox Series may not directly compete with Switch/Switch 2, but a hybrid model like this video suggests absolutely would, and that's a competition MS are going to struggle with since the much stronger Playstation brand couldn't manage to match Nintendo in the portable arena.



    Regardless of whether it failed or succeeded, I would love to see an attempt at an official handheld, fully portable device. I don't want some lame ass shit where I have to be connected to my home console.




    Pemalite said:
    EpicRandy said:

    Well yeah, I agree, but there are still many flaws with this.

    1. FPS can only measure already available cards and serve no purpose in evaluating future ones.
    2. It still needs contextualization, what game was running, what was the target resolutions, what was all the post-processing effects, what API was used, and what engine was used.
    3. 2 GPUs may perform almost exactly the same on 1 title yet vastly differently on another.
    4. It is very susceptible to manipulation
      1. Nvidia, through their partner programs with devs, makes certain they use sometimes unnecessary features, or levels of utilization of a feature, that impact performance on AMD GPUs, e.g. The Witcher 3's use of 64x tessellation with HairWorks, designed to cripple performance on AMD GPUs with no image fidelity gain (past 16x).
      2. anandtech: "Let's start with the obvious. NVIDIA is more aggressive than AMD with trying to get review sites to use certain games and even make certain GPU comparisons."
      3. Not too long ago, Nvidia driver revisions had a tendency to decrease GPU performance while AMD revisions increased performance over the lifetime of their GPUs. When new gens were announced, Nvidia compared the latest drivers' performance on their old gen to the new gen, showing skewed comparisons.
      4. AMD was caught using blatantly wrong numbers when displaying FPS improvements gen over gen with the RX 7900 XTX.
    5. When gaming, many kinds of workloads are computed all at once, some that have better utilization of the GPU than others, but the shown performance of the GPU in fps will only rise to that of the least performing one.
    6. For certain workloads, average FPS would literally be a trash figure while Tflops will depict things more accurately.

    It's literally why benchmarks actually exist.

    Tflops can never depict anything accurately.
    I have already showcased how different GPUs can perform worse even with more Tflops.
    I have already showcased how identical GPUs with the same Tflops can perform at half the speed.

    So to assert that it can "depict things accurately" when you can double your performance with a part that has identical Tflops is disingenuous.

    Not only that, Teraflops only represents single-precision floating point...

    So for anything involving Quarter Precision, Half Precision, Double Precision, Integers, Geometry throughput, Texel/Pixel/Texture fillrate... Teraflops has no bearing. That's like a massive chunk of the GPU and stuff you know.

    Benchmarks have a high tendency to portray the performance of a GPU only in the context of their own test, and lose accuracy when used to portray anything else.

    Tflops are not meant to be used to evaluate the FPS performance of a GPU, so it's disingenuous to use the figure solely in this context and say it's a bullshit figure. It always accurately depicts the performance capacity of the stream processors themselves, but not the whole GPU. For that, you have to take everything into account.

    Benchmarks cannot be used when designing new GPUs/architectures; designers have to rely on metrics and set targets for each of them, and Tflops is one such metric while a Time Spy score isn't.

    Tflops will depict things accurately as long as you run workloads that have no bottlenecks, or are designed to avoid them where possible. That's why supercomputers use this figure predominantly.

    Pemalite said:
    EpicRandy said:

    Those multiplications literally represent how the stream processors work; they can perform 2 FLOPs per cycle by design.
    That much is not theoretical: processors require a high and low signal, aka a clock cycle, to operate, and stream processors are mostly designed to run 2 FP32 instructions per clock. That's not a theory, that's how they work. Some workloads such as scientific simulations, machine learning, and data analytics have better utilization and are sometimes close to 100%.

    Again. It's theoretical, not real world.

    Otherwise if we took the Geforce 1030 DDR4 and GDDR5 variants, we wouldn't have one that is almost twice as fast at the same Teraflops.

    No, they are not 2 flops per cycle.
    They can do 2 operations per cycle, very different. - Not all operations are the same as some can be packed together to form one operation... Or one operation may be split into many.

    Yes, you would, if the DDR4 starved the 1030 while the GDDR5 allowed for more consistent utilization of the stream processors.

    They are generally 2 flops per cycle. They can be used for other operations, and the performance of those will also be listed alongside the Tflops figure. Combined operations are also listed with different Tflops figures, like FP16 or FP64. Other optimizations can be pre-done through the compiler when the software is built, so the GPU is agnostic of these.

    A stream processor rated at 2 flops per cycle is not doing just any 2 operations, it is doing 2 operations using floats; when it processes something else, like doubles, it will use many cycles for the task. Some stream processors are designed so that they can use both 32-bit slots to process a single double (FP64); those will be listed with half performance on doubles. Others are not, and can take up to 16 cycles to do the same operation; that's dependent on the architecture. Some stream processors are limited to multiplication for one of their 32-bit operations and addition/subtraction for the other.
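    As a rough sketch of what "listed with different Tflops figures" means in practice: per-precision peak throughput is usually quoted as a ratio of the FP32 rate, and those ratios vary by architecture. The ratios below are common examples, not any specific GPU's spec sheet:

```python
# Illustrative per-precision peak ratings derived from an FP32 baseline.
# The ratios differ per architecture; these are example values only.

fp32_tflops = 10.0  # hypothetical FP32 peak rating

rate_vs_fp32 = {
    "FP16 (packed / rapid packed math)": 2.0,  # two FP16 ops per FP32 lane
    "FP32 (the advertised 'Teraflops')": 1.0,
    "FP64 (many consumer parts)": 1 / 16,      # some designs are 1/2 instead
}

for precision, ratio in rate_vs_fp32.items():
    print(f"{precision:36s} {fp32_tflops * ratio:6.2f} TFLOPS")
```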

    Pemalite said:
    EpicRandy said:

    No, I literally wrote, "physical barrier you could never exceed or even attain". So unless you are able to overclock it by 20%, that's not possible.

    Some real-world scenarios get really close to 100%, just not gaming in general but it's already better with consoles due to static hardware and specific optimization.

    Except you can exceed the theoretical Teraflop number by combining operations if you make your ALU's fat enough.

    No, you cannot. If you design your stream processors with ALUs that can do 4 flops/cycle like RDNA 3, or even 8 like some have done in the past, this is already taken into consideration in the Tflops figure. Like the 7900 XTX: its Tflops is shader cores * clock * 4 instead of * 2, so you won't be able to exceed that value. You could offload some computing to other hardware-accelerated parts, but the Tflops figure is not meant to measure those, only the stream processors.
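    To make the arithmetic in that formula concrete, here is a minimal sketch of the peak-FP32 calculation being described. The shader counts and clocks are round example numbers, not exact product specs:

```python
# Minimal sketch of the theoretical peak-FP32 formula discussed above.
# Shader counts and clocks are round example values, not exact product specs.

def peak_fp32_tflops(shader_cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Peak FP32 TFLOPS = cores * clock * FLOPs issued per core per cycle."""
    return shader_cores * clock_ghz * flops_per_cycle / 1000

# Classic design: one FMA (multiply + add) per lane per cycle = 2 FLOPs/cycle.
print(peak_fp32_tflops(2304, 1.9, 2))   # ~8.8 TFLOPS

# Dual-issue design (the RDNA 3 case described above): 4 FLOPs/cycle.
print(peak_fp32_tflops(4096, 2.0, 4))   # ~32.8 TFLOPS
```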

    Pemalite said:
    EpicRandy said:

    Sure here's a very good read on the utilization of the TeraScale architecture: 

    Utilization remains a big concern though, for both the SPUs and the SPs within them: not only must the compiler do its best to identify 5 independent datapoints for each VLIW thread, but so must 64 VLIW threads be packed together within each wavefront. Further, the 64 items in a wavefront should all execute against the same instruction; imagine a scenario wherein one thread executes against an entirely different instruction from the other 63! Opportunities for additional clock cycles & poor utilization thus abound and the compiler must do it’s best to schedule around them.

    With 5 SPs in each SPU, attaining 100% utilization necessitates five datapoints per VLIW thread. That’s the best case; in the worst case an entire thread is comprised of just a single datapoint resulting in an abysmal 20% utilization as 4 SPs simply engage in idle chit-chat. Extremities aside, AMD noted an average utilization of 68% or 3.4 SPs per clock cycle. A diagram from AnandTech’s GCN preview article depicts this scenario, and it’s a good time to borrow it here:

    The HD 6900 series would serve as the last of the flagship TeraScale GPUs, even as TeraScale based cards continued to release until October of 2013. As compute applications began to take center-stage for GPU acceleration, games too evolved. The next generation of graphics API’s such as DirectX 10 brought along complex shaders that made the VLIW-centric design of TeraScale ever more inefficient and impractically difficult to schedule for. The Radeon HD 7000 series would accordingly usher in the GCN architecture, TeraScale’s inevitable successor that would abandon VLIW and ILP entirely and in doing so cement AMD’s focus on GPU compute going forward.

    Did you really just link to a blog? Either way, I have a very low-level understanding of TeraScale and its derivatives.

    And ironically, AMD has introduced VLIW-like ideas into RDNA3 by introducing dual-issue ALU's... It's a cheap way to increase throughput.

    Which is partially why going from the Radeon RX 6950 @ 19.3 Teraflops to the RX 7900 XTX @46 Teraflops hasn't resulted in more than double the performance, because Teraflops is bullshit.

    That's not because Tflops is bullshit, that's because the 7900 XTX's utilization of its stream processors is bullshit with video games' typical workloads. Again, Tflops is not meant to measure the performance of the whole GPU, only the stream processors' max throughput.
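    A small worked example of that utilization point, using the 68% average figure from the TeraScale write-up quoted earlier (the peak rating below is an arbitrary example number):

```python
# Worked example: peak rating vs. effective throughput under imperfect utilization.
# The 68% average comes from the TeraScale discussion quoted earlier in the thread;
# the peak TFLOPS value is an arbitrary example.

peak_tflops = 2.7        # example peak rating
avg_utilization = 0.68   # ~3.4 of 5 VLIW slots filled on average

effective_tflops = peak_tflops * avg_utilization
print(f"Effective: {effective_tflops:.2f} TFLOPS "
      f"({avg_utilization:.0%} of a {peak_tflops} TFLOPS peak)")
```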

    Pemalite said:
    EpicRandy said:

    Yes, I know all that, but the efficiency curve gets exponentially worse with clocks past a certain speed, and GPUs' default clocks are already past that point; binning only has marginal impacts here. You're able to undervolt some GPUs only because their configurations are designed to cater to the worst binning offender of a particular SKU, with some headroom to spare.

    Binning has massive impacts, depending on the process.

    Some chips cannot handle higher currents... Otherwise they suffer from an issue known as "electromigration" which will destroy the silicon.

    Hence why Polaris through binning went from the RX 480 to RX 580, yes power draw increased with that jump, but only because they could get away with it.

    Consequently AMD and Intel use binning on all their CPUs... For example the Ryzen 5500 and 5600X are fundamentally the same chip, but through binning hit different performance/power targets. - And parts of the damaged L3 got lasered off.

    Heck, we could even go back to Phenom, where AMD would use the same chip design for its entire lineup from dual-cores right up to quad-cores. Some of the chips with a damaged core would have that core disabled; thankfully you could often re-enable it by setting Advanced Clock Calibration to Auto, though sometimes you needed to pump more volts or lower the clocks.
    That's binning.

    I know all that, but it does not really address the point. Since binning is already done by the manufacturer, and dies have already been sorted into the different SKUs that fit their respective tolerances, the leeway you end up having as a customer to undervolt and overclock within the same TDP is marginal. And since the efficiency curve increases the TDP exponentially once clocks pass a certain speed, it does not matter much if you can get 100-200 MHz past the most efficient target because you got lucky in the binning lottery. SKUs that target consoles can't rely on the top few % of dies, or else yields would be terrible.

    Pemalite said:
    DonFerrari said:

    I would say your analogy with cars already gets the idea across.

    Of course one car having 1000hp and another car having 900hp doesn't mean much when there are several other design elements that will impact the performance of that car, from a simple 0-100km/h (0-60mph) to the time to do a lap (which can even be affected by the driver itself: on the exact same car and conditions, either putting 2 drivers to do a lap, or even the same driver doing multiple laps, the times will differ).

    So would that mean measuring HP is totally useless? Absolutely not =p

    Nah. It's not like horsepower at all.

    It's more like CCs on an engine... Aka. The air displacement.

    You can get lower-CC engines to outperform higher-CC engines based on a number of design factors.

    A 750 hp car can beat a 1500 hp one.



    EpicRandy said:

    Benchmarks have a high tendency to portray the performance of a GPU only in the context of their own test, and lose accuracy when used to portray anything else.

    Correct.

    EpicRandy said:

    Tflops are not meant to be used to evaluate the FPS performance of a GPU, so it's disingenuous to use the figure solely in this context and say it's a bullshit figure. It always accurately depicts the performance capacity of the stream processors themselves, but not the whole GPU. For that, you have to take everything into account.

    No. It doesn't accurately depict the performance capacity of the stream processors.
    Teraflops is single precision floating point.

    I have already provided the evidence on this, that the advertised Teraflops doesn't correspond with real-world floating point performance in tasks.
    See here with the Geforce 2060 @5.2 Teraflops doubling the floating point performance of the Radeon RX 580 @5.8 Teraflops.
    https://www.anandtech.com/bench/GPU19/2703

    It doesn't tell us the stream processors' INT16 performance.
    It doesn't tell us the stream processors' INT8 performance.
    It doesn't tell us the stream processors' INT4 performance.
    It doesn't tell us the stream processors' FP8 performance.
    It doesn't tell us the stream processors' FP16 performance.
    It doesn't tell us the stream processors' FP64 performance.

    You do realise the Stream processors do more than just single precision FP32, right? right?
    Things like rapid packed math is a thing as well.
    https://www.anandtech.com/show/11717/the-amd-radeon-rx-vega-64-and-56-review/4

    You need to stop arguing against the evidence.

    EpicRandy said:

    Benchmarks cannot be used when designing new GPUs/architectures; designers have to rely on metrics and set targets for each of them, and Tflops is one such metric while a Time Spy score isn't.

    That isn't how CPU's and GPU's are designed.

    They design them in such a way to have "projected" performance for different benchmarks.

    AMD, nVidia and Intel will also take past historical performance uplift trends in current benchmarks to project their future performance of new hardware to see how they will compete.

    EpicRandy said:

    Benchmarks cannot be used when designing new GPUs/architectures; designers have to rely on metrics and set targets for each of them, and Tflops is one such metric while a Time Spy score isn't.

    Tflops will depict things accurately as long as you run workloads that have no bottlenecks, or are designed to avoid them where possible. That's why supercomputers use this figure predominantly.


    No, supercomputers use the figure as an advertisement tool.

    Teraflops. Aka. Single Precision Floating Point. Aka. FP32 would not be used... At all in a super computer that is only doing INT4 or INT8 A.I inference calculations... And this is actually a growing and common thing, where a super computer doesn't need any FP32 capability, making Teraflops a useless metric.

    https://developer.nvidia.com/blog/int4-for-ai-inference/

    EpicRandy said:

    Tflops will depict things accurately as long as you run workloads that have no bottlenecks, or are designed to avoid them where possible. That's why supercomputers use this figure predominantly.

    Yes, you would, if the DDR4 starved the 1030 while the GDDR5 allowed for more consistent utilization of the stream processors.

    I think you just admitted that Teraflops alone is bullshit, because you are starting to recognize other aspects.

    Took awhile, but we are getting there.

    EpicRandy said:

    They are generally 2 flops per cycle. They can be used for other operations, and the performance of those will also be listed alongside the Tflops figure. Combined operations are also listed with different Tflops figures, like FP16 or FP64. Other optimizations can be pre-done through the compiler when the software is built, so the GPU is agnostic of these.

    Except the advertised Teraflops doesn't account for FP16 and FP64. - When Teraflops is used by itself it's FP32/Single Precision.

    EpicRandy said:

    A stream processor rated at 2 flops per cycle is not doing just any 2 operations, it is doing 2 operations using floats; when it processes something else, like doubles, it will use many cycles for the task. Some stream processors are designed so that they can use both 32-bit slots to process a single double (FP64); those will be listed with half performance on doubles. Others are not, and can take up to 16 cycles to do the same operation; that's dependent on the architecture. Some stream processors are limited to multiplication for one of their 32-bit operations and addition/subtraction for the other.

    Floats are an operation.

    Here is the thing: packing math together -only- works if the operation is identical, thus Half-Precision and Double-Precision are -never- going to be a linear increase/decrease in the real world due to those inherent inefficiencies.

    Again. Teraflops doesn't account for any of that, hence why it's bullshit.

    EpicRandy said:

    No, you cannot. If you design your stream processors with ALUs that can do 4 flops/cycle like RDNA 3, or even 8 like some have done in the past, this is already taken into consideration in the Tflops figure. Like the 7900 XTX: its Tflops is shader cores * clock * 4 instead of * 2, so you won't be able to exceed that value. You could offload some computing to other hardware-accelerated parts, but the Tflops figure is not meant to measure those, only the stream processors.

    A little bit more complex than that I am afraid.

    The 7900 XTX having dual-issue ALUs, each of which can do 2 operations, means it is shader cores * 2 (just like with VLIW) * clock * 2 operations per cycle.

    Those ALUs can do integer operations as well. Teraflops doesn't represent any of that; Teraflops tells us *nothing* except for a single type of operation a GPU does... And only theoretically.

    EpicRandy said:

    That's not because Tflops is bullshit, that's because the 7900 XTX's utilization of its stream processors is bullshit with video games' typical workloads. Again, Tflops is not meant to measure the performance of the whole GPU, only the stream processors' max throughput.

    No. It pretty much tells us Teraflops is bullshit.

    EpicRandy said:

    I know all that, but it does not really address the point. Since binning is already done by the manufacturer, and dies have already been sorted into the different SKUs that fit their respective tolerances, the leeway you end up having as a customer to undervolt and overclock within the same TDP is marginal. And since the efficiency curve increases the TDP exponentially once clocks pass a certain speed, it does not matter much if you can get 100-200 MHz past the most efficient target because you got lucky in the binning lottery. SKUs that target consoles can't rely on the top few % of dies, or else yields would be terrible.

    This is core clockrate and power scaling on the Radeon 6700XT.

    Pretty much explains that increasing core clocks has an efficiency curve.



    Now if you increase clock, but decrease voltage by 500mV you will have a net-gain in terms of power consumption, or stay the same.
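    That efficiency-curve argument can be sanity-checked with the usual first-order approximation for dynamic power, P ~ C * V^2 * f. A hedged sketch with made-up baseline values (it ignores leakage and other real-silicon effects):

```python
# First-order sanity check of the undervolt + overclock point using P ~ C * V^2 * f.
# Baseline values are made up for illustration; real silicon also has leakage and
# other effects this ignores.

def relative_dynamic_power(v_ratio: float, f_ratio: float) -> float:
    """Dynamic power relative to baseline: scales with V^2 and linearly with f."""
    return (v_ratio ** 2) * f_ratio

baseline_v = 1.20   # volts (example)
undervolt = 0.05    # drop 50 mV (example)

# +5% clock with a 50 mV undervolt: power still ends up slightly lower.
p = relative_dynamic_power((baseline_v - undervolt) / baseline_v, 1.05)
print(f"Relative dynamic power: {p:.2f}x")   # ~0.96x
```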


