NJ5 said:
How is that possible if the PPU has dynamic branch prediction and the SPU doesn't? The SPU doesn’t have dynamic branch prediction hardware. The rationale behind this is that such hardware would increase both energy and chip area requirements.
Correct me if I'm wrong, but that's two false claims in a row from you...
Feel free to dig for some info on the details of the "branch predictor" on the PPU, and then re-evaluate your comment in terms of icache misses, especially since the PPU threads share a cache.
If you argued that SPU branches tend to be faster because they never end up in a cache miss, because SPU code tends to be much smaller, better written, etc. than PPU code, I would have to agree with you. That still says nothing about the SPUs' ability to run general purpose code blazingly fast, especially when doing it in parallel.