Scoobes said:
selnor said: Stop. You're so far off the mark. OK, those figures are from IBM, yes, but they are not actual usage data. Efficiency is where the real figures are, and the Cell in the PS3 gets nowhere near its claims. The following is from IBM's actual tests on their own website, something Sony fans never use and always push aside, hoping no one will ever see it.
Table 9. Comparison of Linpack performance between Cell BE and other processors
Linpack 1kx1k (DP) | Peak GFLOPS | Actual GFLOPS | Efficiency
SPU, 3.2GHz | 1.83 | 1.45 | 79.23%
8 SPUs, 3.2GHz | 14.63 | 9.46 | 64.66%
Pentium4, 3.2GHz | 6.4 | 3.1 | 48.44%
Pentium4 + SSE3, 3.6GHz | 14.4 | 7.2 | 50.00%
Itanium, 1.6GHz | 6.4 | 5.95 | 92.97%
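The efficiency column in Table 9 appears to be nothing more than actual GFLOPS divided by peak GFLOPS. A minimal sketch to check that, assuming that definition (the names `table9` and `efficiency` are just illustrative, the numbers are copied from the table):

```python
# Peak and measured double-precision Linpack GFLOPS, copied from Table 9.
table9 = {
    "SPU, 3.2GHz": (1.83, 1.45),
    "8 SPUs, 3.2GHz": (14.63, 9.46),
    "Pentium4, 3.2GHz": (6.4, 3.1),
    "Pentium4 + SSE3, 3.6GHz": (14.4, 7.2),
    "Itanium, 1.6GHz": (6.4, 5.95),
}


def efficiency(peak, actual):
    """Return measured throughput as a percentage of theoretical peak."""
    return 100.0 * actual / peak


# Assumption: efficiency = actual / peak, which is how the column seems to be built.
efficiencies = {name: efficiency(peak, actual) for name, (peak, actual) in table9.items()}

for name, eff in efficiencies.items():
    print(f"{name}: {eff:.2f}%")
```

Under that assumption the computed percentages line up with the published column to two decimal places.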
Now at this point I will point out that the SPUs, according to IBM, CANNOT run an operating system. They are only designed for floating-point work. IBM's own words. Now the Pentium4 + SSE3 at 3.6GHz is a single-core processor. If that were a quad core it would destroy 8 SPUs together, even though the SPUs are designed for the kind of work this test provides and the Pentium isn't. The Pentium is a PPE processor, doing 7.2 actual GFLOPS in a test which Cell was designed to be good at, and the actual figure for 8 SPUs is only about 2 GFLOPS above a single-core PPE back in 2006.
Also bear in mind you have to take some off for the PS3: it has a dormant SPU and 1 dedicated to the OS, so the PS3 has only 6 SPUs available for making games.
This is where the figures come from. When it is all added up, under actual performance the Cell does 114.2 GFLOPS and Xenon 115.4. But remember, games that require more general-purpose work have considerably more power on the Xenon, as the SPUs CANNOT do it at all. So the PS3 has just 1 PPE to do those jobs.
Table 7. Performance of parallelized double-precision Linpack on eight SPUs
Matrix size | # of Cycles | # of Insts. | CPI | Dual Issue | Channel Stalls | Other Stalls | Used Regs | SPEsim GFLOPS | Measured GFLOPS | Model Accuracy | Efficiency
1Kx1K | 236.7M | 69.1M | 3.42 | 2.9% | 6.7% | 68.5% | 128 | 9.704 | 9.46 | 97.49% | 64.66%
2Kx2K | 1.64G | 44.9M | 3.65 | 2.2% | 3.3% | 72.5% | 128 | 11.184 | 11.05 | 98.80% | 75.53%
Table 4. Performance of parallelized Linpack on eight SPUs
Matrix size | Cycles | # of Insts. | CPI | Single Issue | Dual Issue | Channel Stalls | Other Stalls | # of Used Regs | SPEsim | Measured | Model accuracy | Efficiency
1024x1024 | 27.6M | 2.92M | 0.95 | 27.9% | 32.6% | 26.9% | 12.6% | 126 | 83.12 | 73.04 | 87.87% | 35.7%
4096x4096 | 918.0M | 1.51G | 0.61 | 29.0% | 56.7% | 10.8% | 3.4% | 126 | 160 | 155.5 | 97.2% | 75.9%
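Table 4 does not list a peak figure, but one can be backed out from the measured GFLOPS and the efficiency column. A rough sketch, assuming efficiency means measured divided by peak (the names `table4` and `implied_peak` are made up for this example):

```python
# Measured GFLOPS and efficiency for eight SPUs, copied from Table 4.
table4 = {
    "1024x1024": (73.04, 0.357),
    "4096x4096": (155.5, 0.759),
}


def implied_peak(measured, efficiency):
    """Back out the peak GFLOPS that a measured/efficiency pair implies."""
    return measured / efficiency


# Assumption: efficiency = measured / peak, so peak = measured / efficiency.
peaks = {size: implied_peak(m, e) for size, (m, e) in table4.items()}

for size, peak in peaks.items():
    print(f"{size}: implied peak {peak:.1f} GFLOPS, {peak / 8:.1f} per SPU")
```

Both rows come out near 205 GFLOPS for the eight SPUs, so the two efficiency figures at least look consistent with a single shared peak.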
http://www.ibm.com/developerworks/power/library/pa-cellperf/
Now IBM also state the following:
The PPE was designed specifically for the Cell processor but during development, Microsoft approached IBM wanting a high performance processor core for its Xbox 360. IBM complied and made the tri-core Xenon processor, based on a slightly modified version of the PPE.[33][34]
Anything the PS3 needs the PPE for, the 360 is more than 3 times as powerful at doing, because the SPUs cannot do it at all.
Do we know what the developers use the PPEs for and how efficiently the 360's 3 cores are utilised? As far as I'm aware, most of the power-hungry stuff is shifted to the GPUs and, in the PS3's case, the SPEs. AI and physics are mainly covered on the CPUs, but physics can also be done on the GPU, and more efficiently, and I didn't think AI was particularly power hungry.
On PC I'm running on an AMD 3800 X2 (V.old) and games run fine whilst looking better than most console games as I'm running on an 8800GTS GPU.
Compared to a modern personal computer, the relatively high overall floating point performance of a Cell processor seemingly dwarfs the abilities of the SIMD unit in desktop CPUs like the Pentium 4 and the Athlon 64. However, comparing only floating point abilities of a system is a one-dimensional and application-specific metric. Unlike a Cell processor, such desktop CPUs are more suited to the general purpose software usually run on personal computers. In addition to executing multiple instructions per clock, processors from Intel and AMD feature branch predictors. The Cell is designed to compensate for this with compiler assistance, in which prepare-to-branch instructions are created. For double-precision floating point operations, as sometimes used in personal computers and often used in scientific computing, Cell performance drops by an order of magnitude, but still reaches 20.8 GFLOPS (theoretical 1.8 GFLOPS per SPE, 6.4 GFLOPS per PPE). The PowerXCell 8i variant, which was specifically designed for double-precision, reaches 102.4 GFLOPS in double-precision calculations.
But from the tests we know an individual SPU on its own manages an actual 1.45 GFLOPS, not 1.8. And further, when using more of them together the efficiency decreases, as 8 SPUs only manage 9.46 GFLOPS, whereas 1.45 x 8 should be giving 11.6 GFLOPS, not 9.46.
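To put a number on that drop-off, here is a quick sketch using the Table 9 figures quoted earlier in the thread (1.45 GFLOPS for one SPU, 9.46 for eight). The 6-SPU line is only a rough guess that assumes throughput scales linearly with the per-SPU rate seen in the 8-SPU run; IBM did not publish a 6-SPU result in these tables.

```python
# Measured double-precision Linpack figures from Table 9.
single_spu_gflops = 1.45
eight_spu_gflops = 9.46
spu_count = 8

# What eight SPUs would deliver if each ran as fast as one SPU on its own.
ideal_gflops = single_spu_gflops * spu_count

# Fraction of that ideal actually reached when all eight run together.
scaling_efficiency = eight_spu_gflops / ideal_gflops

# Hypothetical estimate for the 6 SPUs open to PS3 games.
# Assumption: linear scaling at the per-SPU rate of the 8-SPU run.
per_spu_in_parallel = eight_spu_gflops / spu_count
six_spu_estimate = per_spu_in_parallel * 6

print(f"ideal 8-SPU throughput: {ideal_gflops:.2f} GFLOPS")
print(f"scaling efficiency: {scaling_efficiency:.1%}")
print(f"rough 6-SPU estimate: {six_spu_estimate:.2f} GFLOPS")
```

So the eight SPUs together reach a bit over four fifths of what eight independent SPUs would add up to, and a 6-SPU figure on the same basis would sit around 7 GFLOPS in double precision.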
So as you can see, Sony threw around the theoretical numbers for the PS3's Cell. They didn't mention to their customers that those numbers were based on an 8-SPU Cell, when the PS3 would only have 7 available, with a further one locked for Linux help. And then they further forgot to mention that in real-world terms it gets nowhere near 218 GFLOPS.