NJ5 said:
Groucho said:

The SPUs can, and frankly they can run general-purpose code that doesn't involve accessing large tracts of memory (which is a big deal) just as fast as, or in some cases faster than, the 2 hardware PPU threads can.

That's not true. There are some specific ways in which the SPUs are inherently slower than the PPU, for example double-precision arithmetic.

Each SPU is capable of executing two DP instructions every seven cycles. With Fused-Multiply-Add, an SPU can achieve a peak 1.83GFLOPS at 3.2GHz. With eight SPUs and fully pipelined DP floating-point support in the PPE's VMX, the Cell BE is capable of a peak 21.03GFLOPS DP floating-point

Do the math and you can see the PPU is about 3.5 times faster than a single SPU at double-precision arithmetic.

Source:

http://www-128.ibm.com/developerworks/power/library/pa-cellperf/
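
For anyone who wants to check the "3.5 times" figure, here is a rough sketch of the arithmetic in Python. It uses only the two numbers from the IBM quote above and assumes the quoted 21.03 GFLOPS total is simply the eight SPUs plus the PPU's VMX unit; the variable names are made up for illustration.

```python
# Figures taken from the IBM quote above.
SPU_PEAK_DP_GFLOPS = 1.83    # peak double precision per SPU at 3.2 GHz
CELL_PEAK_DP_GFLOPS = 21.03  # peak double precision for the whole chip
NUM_SPUS = 8

# Assumption: whatever the eight SPUs do not account for comes from the PPU.
all_spus_gflops = NUM_SPUS * SPU_PEAK_DP_GFLOPS
ppu_gflops = CELL_PEAK_DP_GFLOPS - all_spus_gflops

# How many times faster the PPU is than one SPU at peak double precision.
ratio = ppu_gflops / SPU_PEAK_DP_GFLOPS

print(f"All SPUs: {all_spus_gflops:.2f} GFLOPS")
print(f"PPU:      {ppu_gflops:.2f} GFLOPS")
print(f"PPU vs one SPU: {ratio:.2f}x")
```

These are peak numbers only, so the ratio says nothing about sustained throughput on real code.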

 

There may be a few operations where a 3.2 GHz SPU is slower than a 1.6 GHz PPU thread, but in terms of general-purpose programming, it's negligible. Even branches tend to be faster on the SPUs, and branching is practically the defining cost of "general-purpose" code.