Pemalite said:
fatslob-:O said:
Pemalite said:
fatslob-:O said:

Just so you know, IPC isn't the whole story of performance. Oh, and you appear to be right about Jaguar (got to check things out more often on my part). I'm pretty sure floating point performance is the standard measure; there are others, like integer performance, but they aren't as important.


IPC does matter; that's Instructions Per Clock. It's certainly more important than flops, and it's why AMD hasn't *really* been able to compete with Intel at the high end, despite Intel having a core-count disadvantage.
The problem with multi-threading a game is that you have a primary thread, and all your secondary threads have dependencies on it. That's why you will generally always have one thread that is more demanding than the others on a multi-core system when running a game, and why a high IPC at any given clock is important. (The way to fix it isn't exactly a higher IPC; higher clocks will do the trick, because they resolve those dependencies a lot better than a higher IPC does.)

Floating point as a gauge of performance really is pointless for games; note I said games.
If you were building an application that only used floating point, then it would be an important measurement.

Game engines use a lot of different types of math to achieve a certain result; the problem is that neither you nor I will ever know what type of math is being used at any one instant in a game's scene. (You do realize why we went away from integer math in the first place, right? Here, I'll give you the answer for you: because using integers to calculate transformations and lighting was too slow initially, and floating point became especially prevalent in the 3D graphics era. A lot of processors in those days didn't feature an FPU, so to render in 3D precisely they had to use software floating point math.)
Take the Cell for example. It's a single-precision, iterative-refinement floating point monster; my CPU isn't even as fast in real-world tests when it comes to that. However, the Cell's integer and double-precision floating point performance is orders of magnitude slower than my 3930K's or even Jaguar's. Yet, overall, in a real-world gaming situation, both CPUs would pummel the Cell.

@Bold

There are other factors in performance, such as clock speed, too. If both processors can output the same number of instructions in the same amount of time, I would prefer the processor with the higher clock, because not only does it match the higher-IPC processor, it also has the higher clock to overcome sequential workloads. Remember how a single core clocked at 3 GHz will beat a dual core at 1.5 GHz? The same situation applies here: there are certain programs that won't be able to leverage a higher IPC, only a higher clock, when each line of code depends on the result of the last one. (I did not say that IPC doesn't matter, but you have to account for other things about the processor too.)
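To illustrate the kind of sequential workload I mean, here's a minimal sketch (not from any real game, just a toy loop): every iteration needs the previous result, so a wider core can't issue these operations side by side, and only a faster clock shortens the chain.

#include <chrono>
#include <cstdio>

int main() {
    // A loop-carried dependency chain: each iteration needs the result of
    // the previous one, so a wide (high-IPC) core cannot overlap these
    // operations. Time scales with dependency latency / clock frequency,
    // not with how many instructions the core could issue per cycle.
    volatile double seed = 1.000000001;   // volatile so the chain isn't folded away
    double x = seed;

    auto t0 = std::chrono::steady_clock::now();
    for (long i = 0; i < 200'000'000; ++i)
        x = x * 1.000000001 + 0.000000001;   // depends on the previous x
    auto t1 = std::chrono::steady_clock::now();

    std::printf("result %.6f, %.3f s\n", x,
                std::chrono::duration<double>(t1 - t0).count());
    return 0;
}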

Yet these games use a lot of floating point math from what I am seeing. (Even if games were to use integer math, it'd be minuscule a lot of the time, and you know this.)
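For an idea of the kind of floating point work I mean (a minimal sketch, not pulled from any actual engine), this is a single-precision vertex transform, the sort of thing done millions of times per frame:

#include <cstdio>

// A 4x4 single-precision matrix times a 4-component vector: the basic
// building block of transform work. Note that it is all floats.
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };

Vec4 transform(const Mat4& M, const Vec4& v) {
    return {
        M.m[0][0]*v.x + M.m[0][1]*v.y + M.m[0][2]*v.z + M.m[0][3]*v.w,
        M.m[1][0]*v.x + M.m[1][1]*v.y + M.m[1][2]*v.z + M.m[1][3]*v.w,
        M.m[2][0]*v.x + M.m[2][1]*v.y + M.m[2][2]*v.z + M.m[2][3]*v.w,
        M.m[3][0]*v.x + M.m[3][1]*v.y + M.m[3][2]*v.z + M.m[3][3]*v.w
    };
}

int main() {
    Mat4 scale = {{{2,0,0,0},{0,2,0,0},{0,0,2,0},{0,0,0,1}}};   // uniform scale by 2
    Vec4 p = transform(scale, {1.0f, 2.0f, 3.0f, 1.0f});
    std::printf("%.1f %.1f %.1f %.1f\n", p.x, p.y, p.z, p.w);   // 2.0 4.0 6.0 1.0
    return 0;
}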

The Cell had a lot more in common with what you'd call a vector processor. The actual processor was the PPE; as for the SPEs, they were a bunch of SIMD units, nothing more, and their instruction sets were pretty narrow.

Nope, integer math is still used heavily. (Give me an example that has a significant effect on rendering.)
The industry moved transform and lighting from the CPU to the GPU because, well, it's faster (I was a PC gamer when they first made that transition with the GeForce 256. I AM GETTING OLD!) and it's a highly parallel task. Initially it was done via a fixed-function hardware block, until later it was performed on the programmable shader pipelines. (It's one of the reasons the GeForce FX faltered against the Radeon 9700 series: wasted transistors on the TnL hardware, whereas AMD had it all done in the shader pipelines.)
In fact, if you look at the evolution of PC hardware, over time the CPU has been tasked with less and less processing as work is offloaded to the graphics hardware, which can take advantage of highly parallel tasks. (Before graphics cards existed there were FPUs, just so you know.)

Back then, CPUs did have a floating point unit; however, processors like the AMD K6 and Cyrix M3 had a very poor floating point unit compared to the Pentium 1, 2 and 3 and their Celeron counterparts. It wasn't until AMD introduced the K7, which was a massive departure from the K6 series, that they actually caught up.
Heck, even AMD's FX has its floating point unit shared between two processing cores.
AMD's eventual goal for its Fusion initiative is to remove the floating point unit entirely from its CPUs and instead move it onto the graphics processor, which is more suited to that type of processing. #1

As for clock speed, I agree it is important, to an extent, and only if all things are equal architecturally. I would, for instance, take a single-core Haswell processor over a single-core Netburst processor even if the Netburst processor had twice the frequency. Why? Because Haswell has a much higher IPC, so it can do more work than a Netburst processor even at a lower clock speed. But this is all old news; the Gigahertz race between AMD and Intel eventually showed that clock speed is not a primary measurement of performance across two different architectures.

The problem with the consoles is that they all have low-clockspeed, low-IPC CPUs, so all-round, they're crap. :P
However, where this generation is different from prior ones is that some of the processing can be offloaded to the GPU anyway. #2

@Bold #1 That would be extremely naive of AMD, seeing as how GPUs aren't very flexible in terms of instruction sets, and there are floating point workloads that benefit from a faster FPU.

But then wouldn't Haswell suffer on sequential tasks, such as executing lines of code where each one depends on the result of the last? Oh, and your comparison of Netburst to Haswell isn't fair, seeing as how Haswell has a much higher IPC to compensate for its lower clock. But then you have to factor in the diminishing returns of a higher IPC: once the parallelism the code exposes is only a fraction of what the core can issue, not a lot of programs will be able to leverage the higher IPC, like I said. So instead of raising the IPC it would be a smarter idea to raise clocks, because you will always see a boost in performance on any type of workload. So, like I said in my last scenario, I would prefer a core that can do 1 IPC at 2 GHz over a core that does 2 IPC at 1 GHz. (Higher IPC != better cores)
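Rough numbers to show what I mean (a toy model; every figure here is made up for illustration, nothing is measured): when the code only exposes one instruction's worth of parallelism per cycle, the 2-IPC core gets no benefit from its width and the higher clock wins.

#include <algorithm>
#include <cstdio>

// Toy model: time to retire a stream of instructions, assuming a core can
// only sustain its peak IPC when the code actually exposes that much
// instruction-level parallelism (ILP).
double seconds(double instructions, double core_ipc, double clock_ghz,
               double available_ilp) {
    double effective_ipc = std::min(core_ipc, available_ilp);
    return instructions / (effective_ipc * clock_ghz * 1e9);
}

int main() {
    const double n = 2e9;  // 2 billion instructions
    // Parallel-friendly code (ILP of 4): both cores reach their peak IPC, a tie.
    std::printf("parallel: 1 IPC @ 2 GHz = %.2f s, 2 IPC @ 1 GHz = %.2f s\n",
                seconds(n, 1, 2.0, 4), seconds(n, 2, 1.0, 4));
    // Fully serial chain (ILP of 1): the wide core is stuck at 1 IPC,
    // so only the higher clock helps.
    std::printf("serial:   1 IPC @ 2 GHz = %.2f s, 2 IPC @ 1 GHz = %.2f s\n",
                seconds(n, 1, 2.0, 1), seconds(n, 2, 1.0, 1));
    return 0;
}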

@Bold #2 What did I just say about being able to offload processing to GPUs!? There are only a handful of tasks that can be offloaded to GPUs, and even then they most certainly have to be insanely parallel. Otherwise the performance increase won't be as big, because the latency involved in sending the workload to the GPU diminishes the gains: the GPU mostly stalls waiting for the data to come over, and then the results have to be transferred back to main memory. But like you said, these latencies don't matter, when clearly this programmer is struggling to get better performance on his GPU compared to his CPU, and it's only fixed because he had to feed it larger matrices: http://stackoverflow.com/questions/11835956/c-amp-with-fast-gpus-slower-than-cpu (Note: you do not get to say that copy times never matter; every millisecond counts!)
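To put rough numbers on why small matrices lose on the GPU (a back-of-envelope sketch with assumed throughput figures, not the code from that thread): the fixed launch and copy costs have to be amortised over enough math before the GPU's raw speed shows up.

#include <cstdio>

// Back-of-envelope model with made-up throughput numbers: compare an n x n
// matrix multiply done locally on the CPU against shipping it to a GPU that
// computes faster but pays a fixed launch latency plus copy time both ways.
int main() {
    const double cpu_gflops = 50.0;     // assumed CPU throughput
    const double gpu_gflops = 1000.0;   // assumed GPU throughput
    const double pcie_gbps  = 8.0;      // assumed copy bandwidth, GB/s
    const double launch_ms  = 0.1;      // assumed fixed launch/driver overhead

    for (int n = 64; n <= 4096; n *= 4) {
        double flops  = 2.0 * n * n * n;              // multiply-adds in an n^3 matmul
        double bytes  = 3.0 * n * n * sizeof(float);  // two inputs over, one result back
        double cpu_ms = flops / (cpu_gflops * 1e9) * 1e3;
        double gpu_ms = launch_ms
                      + bytes / (pcie_gbps * 1e9) * 1e3    // copy over and back
                      + flops / (gpu_gflops * 1e9) * 1e3;  // kernel time
        std::printf("n=%4d  cpu %8.3f ms   gpu %8.3f ms  -> %s wins\n",
                    n, cpu_ms, gpu_ms, cpu_ms < gpu_ms ? "cpu" : "gpu");
    }
    return 0;
}

With these assumed numbers the CPU wins at n=64 and the GPU only pulls ahead once the matrices are big enough to swamp the transfer cost, which is exactly what that Stack Overflow poster ran into.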