Marty8370 said:
goddog said: @nnn2004 for your Nvidia/ATI comparison, that depends heavily on software implementation. On the Mac side of things the 8800 GT is slower than the ATI 2600, and the ATI 3870 destroys the 8800, being almost 50% faster at many tasks. barefeats.com provides the testing.
@topic Basically, for the Cell to be more powerful than a tri-core PowerPC processor, you need someone to program very well for each of the SPEs. If that is not done, the tri-core will win out, because all three cores can do the same quality of work on the same data. The SPEs cannot do that; they need direction from the main core. The SPEs shine when someone takes the time to write code that is optimized for them. The tri-core should not need the same level of optimization, so really, if you develop for PS3 first you would have to de-optimize it. The best example I have of this is on the Mac, when the shift from PowerPC to Intel took place: Photoshop CS1 was optimized for AltiVec, a speed-boost PowerPC tech. When Photoshop moved to Intel that was deactivated, and a special patch was issued to try to regain some speed, though it was still much slower on the new Intel chips than on three-year-older PowerPC chips.
AltiVec is a very similar idea to the SPEs, though at most you saw two AltiVec threads on any PowerPC chip. If it is taken advantage of, it can be very useful and give all kinds of really cool gains, but it is hard to program for and has only specific uses (or per SPE in the case of Cell). If it is not taken advantage of, though, the chip has no special qualities and depends on the power of the main core. |
@Fishie - I back my posts up with facts, not made-up shit.
Synergistic Processor Elements (SPEs)
Each Cell contains 8 SPEs (7 SPEs for PS3)
An SPE is a self contained vector processor which acts as an independent processor. They each contain 128 x 128 bit registers, there are also 4 (single precision) floating point units capable of 32 GigaFLOPS* and 4 Integer units capable of 32 GOPS (Billions of integer Operations per Second) at 4GHz. The SPEs also include a small 256 Kilobyte local store instead of a cache. According to IBM a single SPE (which is just 15 square millimetres and consumes less than 5 Watts at 4GHz) can perform as well as a top end (single core) desktop CPU given the right task.
*This is counting Multiply-Adds, which count as 2 instructions, hence 4GHz x 4 x 2 = 32 GFLOPS (PS3 Cell runs at 3.2GHz)
32 x 8 SPEs = 256 GFLOPS
32 x 7 SPEs = 224 GFLOPS (PS3 Cell)
Like the PPE the SPEs are in-order processors and have no Out-Of-Order capabilities. This means that as with the PPE the compiler is very important. The SPEs do however have 128 registers and this gives plenty of room for the compiler to unroll loops and use other techniques which largely negate the need for OOO hardware.
http://www.blachford.info/computer/Cell/Cell1_v2.html
|
Which is of course total shit, because it was already explained that only 6 of the SPEs can be accessed by programmers.
The 32 figure is for a Cell running at 4GHz. The one in the PS3 runs at 3.2GHz, so the same calculation gives the following:
3.2GHz x 4 x 2 = 25.6 GFLOPS per SPE
25.6 x 6 = 153.6 GFLOPS
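For anyone who wants to check the numbers in this thread, here is a rough sketch of the same arithmetic in Python. The function name and its parameters are made up for illustration, and it only computes the theoretical single-precision peak (clock x 4 floating point units x 2 for a multiply-add, times the number of SPEs), not anything you would measure in a real game.

```python
def peak_gflops(clock_ghz, spe_count, fp_units=4, ops_per_multiply_add=2):
    """Theoretical single-precision peak in GFLOPS (hypothetical helper).

    Assumes every floating point unit on every SPE issues one
    multiply-add per cycle, which is counted as 2 operations.
    """
    per_spe = clock_ghz * fp_units * ops_per_multiply_add
    return per_spe * spe_count


# The three figures quoted in the thread.
print(f"{peak_gflops(4.0, 8):.1f}")  # 8 SPEs at 4GHz   -> 256.0
print(f"{peak_gflops(4.0, 7):.1f}")  # 7 SPEs at 4GHz   -> 224.0
print(f"{peak_gflops(3.2, 6):.1f}")  # 6 SPEs at 3.2GHz -> 153.6
```

The only things that change between the quoted figures and mine are the clock speed and how many SPEs you count.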