kars said:
Not quite. The SPUs do not have a cache; they depend on their local memory, which has to hold both the program code and the data, and the programmer alone is responsible for managing this memory. The important thing about these units is that they can simultaneously send their old results and receive new data (via their own Memory Flow Controller) while calculating on the current data. In theory all units could work continuously, but in such a situation 3 SPEs (SPU+MFC) could block the bus (if they do not form a chain). Additionally, the PPE can execute two instructions at the same time, whereas the SPUs can only execute one; every SPU has an AltiVec 128 engine, but only one of the PPE's execution pipelines has such a unit. This is one of the biggest differences from the Xenon, which has two AltiVec 128 units for each core (one per pipe).
1) The local store is a cache. Call it whatever you want, but that's what it is.
2) The SPUs are dual instruction issue: the odd pipeline can do LS loads and stores, and the even pipeline can do floating-point operations. A 128-bit load takes 6 cycles. A single-precision floating-point SIMD operation takes 6 cycles. You can do both at the same time. This is why the chip is fast.
3) What's this "blocking the bus" stuff? Do you mean blocking on the bus?
4) The SPUs do not use AltiVec, they use a RISC ISA they imaginatively call SPU ISA. It's pretty close to the actual microarchitecture.
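
To put rough numbers on point 2, here is a toy model of what dual issue buys you. It is only a sketch: it assumes every instruction is independent and fully pipelined, with no dependency stalls or branch penalties, and the function names are made up for illustration. The 6-cycle latency is the figure quoted above.

```python
# Toy issue model for an SPU-like core. Not a simulator: it assumes
# independent, fully pipelined instructions and ignores stalls.

LATENCY = 6  # cycles for a 128-bit load or a single-precision SIMD op (figure from the post)


def issue_cycles(n_loads, n_fp_ops, dual_issue):
    """Cycles spent issuing the instruction stream."""
    if dual_issue:
        # One odd-pipe instruction (load) and one even-pipe instruction (FP)
        # can issue in the same cycle, so the longer stream dominates.
        return max(n_loads, n_fp_ops)
    # Single issue: one instruction per cycle, whatever its type.
    return n_loads + n_fp_ops


def total_cycles(n_loads, n_fp_ops, dual_issue):
    """Cycles until the last result is available."""
    issued = issue_cycles(n_loads, n_fp_ops, dual_issue)
    if issued == 0:
        return 0
    # The last instruction issues in cycle `issued` and finishes LATENCY - 1 cycles later.
    return issued + LATENCY - 1


if __name__ == "__main__":
    print("dual issue:  ", total_cycles(100, 100, dual_issue=True))
    print("single issue:", total_cycles(100, 100, dual_issue=False))
```

With a balanced mix of loads and floating-point work, the dual-issue case needs about half the cycles in this model. Real code will not hit that unless loads and math are scheduled so they can pair up.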







