
Forums - Sony Discussion - Remember when Sony said the PS3 would render at 120 fps? Forget it, 240fps

If the SPEs are so, so awesome at doing general purpose equations, then why are they not being used in that fashion? It's pretty simple: after a while you stop blaming the developers and start blaming the tools and architecture.



Tease.


@Groucho: Why so aggressive? And why are you throwing in a third false claim for my amusement?

a single PPU thread (which is only 1.6 GHz, not 3.2 GHz)


Bzzt, wrong again. Simultaneous multi-threading doesn't mean each thread works as if it had half the clock rate. The whole point of SMT is to get both threads as close as possible to using the whole CPU for themselves (with the obvious constraints).

I really suggest you read some more before you embarrass yourself again.

PS: I'd also suggest that you be a bit clearer next time; you didn't make it clear at all that you were comparing 6 SPUs vs 1 PPU (an unfair comparison, I may add).

 



My Mario Kart Wii friend code: 2707-1866-0957

NJ5 said:
MikeB said:
@ NJ5

PPU has special hardware to improve branch prediction that the SPU doesn't have


Branch hints and branch elimination are the roads to take with regard to the SPUs, the end result is code which if done well runs better on any kind of CPU.

True, but as you say that's something you can also do on the PPU, so Groucho still has to explain why his statement goes against what the Cell experts say.

 

To quote myself:

"Feel free to dig for some info on the details of the "branch predictor" on the PPU, and then re-evaluate your comment, in terms of icache misses, especially since the PPU threads are sharing a cache.

 

If you argued that SPU branches tend to be faster because they never end up in a cache miss, because the SPU code is much smaller, better written, etc. than PPU code tends to be, I would have to agree with you.  That still says nothing about their ability to run general purpose code blazingly fast, especially when doing it in parallel."

 

When it comes down to the fine details, you are absolutely correct, NJ5.  I was angered by your nitpicking, and was reminded that, in a recent optimization, I discovered that some code I had rewritten* (this is the important part) to run on the SPUs was significantly faster than it had been on the PPU.  The branch hinting can be ported back to the PPU, and likely, excepting that my PPU compiler isn't as good as my SPU compiler, the code will be faster there as well, assuming I don't get a load of icache misses with my mispredicted branches.  You could say that the icache expense is really similar to the expense of uploading an entire job to a SPU in the first place, which is, of course, correct.  I've just offloaded the icache hits at the beginning of the job, rather than effectively loading code on-the-fly.  So really, in the end, NJ5 is right, and I am wrong.

Looks like my whole argument is debunked, and this thread can continue along its course of assuming that the SPUs are not independent processors, and totally incapable of running general purpose code decently fast, if at all -- especially since they can't do it in parallel.. you know.. independently.
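
For anyone following the "branch hints and branch elimination" point quoted from MikeB above, here is a minimal sketch of branch elimination in plain C. It is not SPU intrinsic code and not taken from anyone's engine; the function names are made up for illustration. The idea is to turn a data-dependent branch into a mask select so there is nothing to mispredict. Whether a particular compiler already does this for the branchy version is compiler-dependent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Branchy version: one data-dependent branch per element. */
int32_t sum_above_branchy(const int32_t *v, size_t n, int32_t limit)
{
    assert(v != NULL || n == 0);
    int32_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        if (v[i] > limit)
            sum += v[i];
    }
    return sum;
}

/* Branch-free version: the comparison yields 0 or 1, negating it gives
   a mask of all zeros or all ones, and the mask selects 0 or v[i]. */
int32_t sum_above_branchless(const int32_t *v, size_t n, int32_t limit)
{
    assert(v != NULL || n == 0);
    int32_t sum = 0;
    for (size_t i = 0; i < n; ++i) {
        int32_t mask = -(int32_t)(v[i] > limit); /* 0 or -1 */
        sum += v[i] & mask;
    }
    return sum;
}
```

Both functions return the same result for the same input; only the control flow inside the loop differs.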

 



NJ5 said:

@Groucho: Why so aggressive? And why are you throwing in a third false claim for my amusement?

a single PPU thread (which is only 1.6 GHz, not 3.2 GHz)


Bzzt, wrong again. Simultaneous multi-threading doesn't mean each thread works as if it had half the clock rate. The whole point of SMT is to get both threads as close as possible to using the whole CPU for themselves (with the obvious constraints).

I really suggest you read some more before you embarrass yourself again.

PS: I'd also suggest that you be a bit clearer next time; you didn't make it clear at all that you were comparing 6 SPUs vs 1 PPU (an unfair comparison, I may add).

 

Interesting, you're telling me the hardware threading on the PPU doesn't alternate threads each cycle?  That's something I certainly wasn't aware of.

 



Groucho said:

You should really avoid commenting on this, since you have, quite obviously from this comment, never worked on a PS3, or at least never with the SPUs.  I can say with authority that the italicized parts of the comment above are pure BS, and the sources you got it from don't know what the hell they are talking about.  The SPUs *are* general purpose cores, with a lot of extra logic devoted to vector mathematics.  They lack the supporting memory architecture to do things in the same manner as the PPU, but that doesn't stop them from being able to do it.  Being an independent core doesn't require that you be able to address main memory directly -- the only requirement is that the processor be able to run concurrently with other cores.  The SPUs can, and frankly they can run general purpose code, that doesn't involve accessing large tracts of memory (which is a big deal), just as fast, or in some cases faster, than the 2 hardware PPU threads can.

The bolded part of your comment is absolutely correct, however.  At least you got that right -- or your source did.  The only reason any newbie engineer wannabe could possibly claim that the SPUs are "not independent cores" is due to the fact that they can only address their 256K local memory (not a "cache" as some have called it), and must stream data from main memory in/out.  The truth is, however, they can do this independently of the PPU.  Thus *drum roll* they are independent cores.  They are NOT coprocessors, as you seem to be implying, as that's the word used to describe a "dependent processor" which isn't actually capable of independent operation.  The SPUs are -- thus, they are independent cores.

I know you aren't going to be able to say to someone who doubts this fact "Groucho said that's not true", but I can tell you that, if you understand that the SPUs are independent cores, you will be correct, and anyone who says otherwise is... well, ignorant, and full of ****.  Please have faith that some authorities do surf here, and please stop peddling this kind of hogwash.

 

 

1. I have never programmed the PS3... that is absolutely correct.  I am simply repeating information I have gathered from others who have programmed the PS3, so I'm not going to try to argue with someone of your truly dizzying intellect.

2. I never claimed that the SPUs were incapable of running generalized code... just that their primary function and design were for other tasks, and that this could make them less effective in ways than a general purpose core.  You admitted as much in your response.

3. Considering that I said the SPUs were like DSPs or bit-blitters on steroids, it's difficult to imagine you're arguing that they're instead cores with lots of vector processing.  You DO realize that DSPs and bit-blitters specialize in vector processing, right?

4. I never said the SPEs were co-processors.  And for your information, the Amiga "co-processors" (Agnus, Denise, Buster, etc.) were capable of operating independently of the 68k CPU (in fact, Jay Miner architected the Amiga hardware so the co-processors and CPU operated on alternate bus cycles so they would have less contention when accessing memory).  And yes, I did programming on the Amiga.  And yes, I've programmed for 28 years--everything from satellite communications to kernel work to embedded systems--so I'm not a "newbie engineer wannabe".

5. Your post actually does point out an area where my post was weak... I could have described more clearly what I meant when I said programming the SPUs can be heavy lifting.  It's not because they can't run general purpose code, it's because they're designed primarily to handle vector processing, and that's what many of the devs are trying to use them for, i.e. to do what the PPU isn't as good at.  They're not designed to be general purpose cores... their primary function is to ASSIST the core with what it doesn't do well.

I've had disagreements with MikeB on technical elements of the PS3, but I don't remember him making them into personal attacks as you have... "above is pure BS", "You should really avoid commenting on this", "At least you got that right -- or your source did", "any newbie engineer wannabe", "ignorant, and full of ****", "please stop peddling this kind of hogwash".  I don't know why you're venting your spleen on me, because I'm not the only person on VGChartz who believes IBM's technical literature when they refer to the Synergistic Processing Units as vector processors.

Had you taken a different tone in your response, I would have gratefully accepted what you said and shown an interest in learning more about how the PS3 and Cell really work (I've learned a few things from MikeB and others), but now I'm not really interested in listening to what you have to say on the subject.
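
On the "256K local memory" and "must stream data from main memory in/out" point quoted above: the usual pattern for that kind of streaming is double buffering, where the next chunk is fetched while the current one is processed. Below is a minimal sketch in plain C, not Cell SDK code. `fetch_chunk` is a hypothetical stand-in that uses `memcpy` in place of an asynchronous DMA transfer, so nothing actually overlaps here, and `CHUNK` is an arbitrary tiny size chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK 4 /* elements per transfer; hypothetical, real chunks are far larger */

/* Stand-in for a DMA "get" from main memory into local store. */
void fetch_chunk(int32_t *local, const int32_t *main_mem, size_t count)
{
    memcpy(local, main_mem, count * sizeof *local);
}

/* Sum a large array while only ever touching two small local buffers. */
int64_t stream_sum(const int32_t *main_mem, size_t n)
{
    int32_t local[2][CHUNK]; /* the two halves of the double buffer */
    int64_t total = 0;
    size_t done = 0;         /* elements already processed */
    int cur = 0;             /* index of the buffer being processed */

    size_t cur_count = n < CHUNK ? n : CHUNK;
    if (cur_count > 0)
        fetch_chunk(local[cur], main_mem, cur_count); /* prime first buffer */

    while (cur_count > 0) {
        /* Kick off the fetch of the next chunk into the other buffer. */
        size_t next_off = done + cur_count;
        size_t remaining = n - next_off;
        size_t next_count = remaining < CHUNK ? remaining : CHUNK;
        if (next_count > 0)
            fetch_chunk(local[cur ^ 1], main_mem + next_off, next_count);

        /* Process the current buffer while the next one "arrives". */
        for (size_t i = 0; i < cur_count; ++i)
            total += local[cur][i];

        done = next_off;
        cur ^= 1;            /* swap buffers */
        cur_count = next_count;
    }
    assert(done == n);
    return total;
}
```

The working set never exceeds `2 * CHUNK` elements regardless of `n`, which is the property that lets a core with a small local store chew through a large array in main memory.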

 




This is the part that got to me:

crumas2: "Had you taken a different tone in your response, I would have gratefully accepted what you said and shown an interest in learning more about how the PS3 and Cell really work (I've learned a few things from MikeB and others), but now I'm not really interested in listening to what you have to say on the subject."

 

I've clearly let my emotions over some of the posts in this thread interfere with my goal of educating. I apologize to all who I have offended, and I'm going to just back out of this thread.

If you have learned anything from my posts, I'm glad. I'm sorry if I pressed any buttons.

In particular, I apologize to NJ5, and crumas2... sorry guys.



Groucho said:
NJ5 said:

@Groucho: Why so aggressive? And why are you throwing in a third false claim for my amusement?

a single PPU thread (which is only 1.6 GHz, not 3.2 GHz)


Bzzt, wrong again. Simultaneous multi-threading doesn't mean each thread works as if it had half the clock rate. The whole point of SMT is to get both threads as close as possible to using the whole CPU for themselves (with the obvious constraints).

I really suggest you read some more before you embarrass yourself again.

PS: I'd also suggest that you be a bit clearer next time; you didn't make it clear at all that you were comparing 6 SPUs vs 1 PPU (an unfair comparison, I may add).

 

Interesting, you're telling me the hardware threading on the PPU doesn't alternate threads each cycle?  That's something I certainly wasn't aware of.

 

Yes, that's exactly what I'm telling you. SMT does fine-grained division of the CPU's resources; for example, thread 1 uses the memory access hardware while thread 2 does a calculation.

http://en.wikipedia.org/wiki/Simultaneous_multithreading

However, I'm actually finding some conflicting sources on this. I'm not sure whether the PPE does interleaved or simultaneous multi-threading anymore, so you may be right on this one (in that case apologies in advance).

 



My Mario Kart Wii friend code: 2707-1866-0957

NJ5 said:
Groucho said:
NJ5 said:

@Groucho: Why so aggressive? And why are you throwing in a third false claim for my amusement?

a single PPU thread (which is only 1.6 GHz, not 3.2 GHz)


Bzzt, wrong again. Simultaneous multi-threading doesn't mean each thread works as if it had half the clock rate. The whole point of SMT is to get both threads as close as possible to using the whole CPU for themselves (with the obvious constraints).

I really suggest you read some more before you embarrass yourself again.

PS: I'd also suggest that you be a bit clearer next time; you didn't make it clear at all that you were comparing 6 SPUs vs 1 PPU (an unfair comparison, I may add).

 

Interesting, you're telling me the hardware threading on the PPU doesn't alternate threads each cycle?  That's something I certainly wasn't aware of.

 

Yes, that's exactly what I'm telling you. SMT does fine-grained division of the CPU's resources; for example, thread 1 uses the memory access hardware while thread 2 does a calculation.

http://en.wikipedia.org/wiki/Simultaneous_multithreading

However, I'm actually finding some conflicting sources on this. I'm not sure whether the PPE does interleaved or simultaneous multi-threading anymore, so you may be right on this one (in that case apologies in advance).

 

 

It's interleaved.  Man, I *wish* it was simultaneous.
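
To make the "half the clock rate" disagreement concrete, here is a toy issue model in C. It is an illustration only and is not a model of the actual PPE; the struct, the function names, and the single-issue assumption are all made up for the sketch. It compares a core where each cycle belongs strictly to one thread against one where a thread may take a slot its neighbour cannot use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One hardware thread in the toy model. */
typedef struct {
    const int *stall; /* stall[i] = extra wait cycles after instruction i */
    size_t count;     /* number of instructions in the stream */
    size_t next;      /* index of the next instruction to issue */
    long ready_at;    /* first cycle at which the thread may issue again */
} toy_thread;

static bool can_issue(const toy_thread *t, long cycle)
{
    return t->next < t->count && cycle >= t->ready_at;
}

static void issue(toy_thread *t, long cycle)
{
    t->ready_at = cycle + 1 + t->stall[t->next];
    t->next++;
}

/* Returns the number of cycles until both threads have issued everything.
   steal_idle_slots == false: cycle c belongs strictly to thread c % 2.
   steal_idle_slots == true : the other thread may use a slot the owner
                              cannot use (stalled or finished). */
long run_toy_core(toy_thread a, toy_thread b, bool steal_idle_slots)
{
    toy_thread t[2] = { a, b };
    long cycle = 0;
    while (t[0].next < t[0].count || t[1].next < t[1].count) {
        int owner = (int)(cycle % 2);
        if (can_issue(&t[owner], cycle))
            issue(&t[owner], cycle);
        else if (steal_idle_slots && can_issue(&t[owner ^ 1], cycle))
            issue(&t[owner ^ 1], cycle);
        cycle++;
    }
    return cycle;
}
```

In this toy model, a lone thread with four stall-free instructions needs 7 cycles under strict alternation and 4 cycles when it can take the idle slots; with two equally busy threads both policies take 8 cycles. So "each thread runs at half speed" describes the strict policy, or the case where both threads are always ready.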



Squilliam said:
If the SPEs are so, so awesome at doing general purpose equations, then why are they not being used in that fashion? It's pretty simple: after a while you stop blaming the developers and start blaming the tools and architecture.

But you have. Killzone 2 is a giant heap of awesome.



"We'll toss the dice however they fall,
And snuggle the girls be they short or tall,
Then follow young Mat whenever he calls,
To dance with Jak o' the Shadows."

Check out MyAnimeList and my Game Collection. Owner of the 5 millionth post.

From what I understand, the human eye doesn't notice a change from 25 frames per second and up...



4 ≈ One