
OMG!!!CELL can do realtime RAYTRACING with just software!!!!

BenKenobi88 said: Everyone knows he had it coming. On his CELL thread, he didn't apologize for swearing and bashing other users, even though I asked. He just kept going and going with his Cell rantings. I actually am somewhat interested in what the Cell can do...but nobody needs to spam so much and shut down what other people say.
The Cell as a standalone processor isn't all that ungodly fast, but that's what's so nice about it: it is meant to cluster. Compared to a single C2D it's fast, but a single Cell will be passed by it in a year or two. That is where the most important part of the Cell comes into play.

Power relative to a single Cell:
2 Cells = 200%
4 Cells = 400%
16 Cells = 1600%

Power relative to a single C2D:
C2D = 100%
C2 with 4 cores = about 180%
C2 with 8 cores = about 300%
C2 with 16 cores = about 500%
C2 with 32 cores = about 800%

There is a MAJOR problem with that system though: memory speed. The 360 has a major memory bandwidth issue, and the modern PC has been plagued by memory issues. The C2D changed things for the better, but once again, it is not the way the Cell works. The Cell has 64 MB of dedicated memory on each SPE, meaning that one SPE can access that memory at 100% speed, or all 8 SPEs can access their memory at 100%. For encoding audio/video, a modern C2D is limited by the memory. The Cell has more speed coming off each SPE's memory than a C2D has in total. The Cell has about 8x the memory bandwidth of a C2D.

Comparing the Cell to a RISC PowerPC core is basically the same as comparing it to a C2D, as a C2D is actually faster and, if I'm not mistaken, slightly better than a PPC (the PowerPC clusters better though). My point is this: 1 C2D or 8 C2Ds, they have the same problem, which is memory speed. 1 Cell can outpower 8 C2Ds in memory-intensive applications (because of the memory). 1 Cell can outpower the 3 PPC processors in the 360 with its eyes closed in memory-intensive applications.

The thing about the Cell's power is its memory calls:
1 Cell = 8 memory calls every clock tick
2 Cells = 16 memory calls every clock tick
4 Cells = 32 memory calls every clock tick
1 C2D = 2 memory calls every clock tick
2 C2Ds = 4 memory calls every clock tick
4 C2Ds = 8 memory calls every clock tick

Here is where the Cell wins: the memory on the C2D isn't fast enough to do more than 4 memory calls each clock tick, meaning a memory-intensive application will see 0 performance increase going from 2 C2Ds to 4 C2Ds. The Cell's memory bandwidth increases along with the power.

EDIT: So you can link 100 Cell processors together and still run at 98% performance off each chip.

I think the PS3 is using the Cell as a prototype to bring on the PS4. The PS4 is gonna rock this world (if Sony doesn't pull a stupid one and leave the Cell processor).
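For anyone who wants to play with the memory-call argument, here is a minimal Python sketch of it. Every figure in it is the claim made in this post (8 calls per tick per Cell, 2 per C2D, and a shared ceiling of 4 for the C2D side), not a measurement, and the function names are made up for illustration.

```python
# Figures are the poster's claims from this thread, not benchmarks.
CELL_CALLS_PER_CHIP = 8   # claimed: one memory call per SPE per clock tick
C2D_CALLS_PER_CHIP = 2    # claimed: one memory call per core per clock tick
C2D_MEMORY_CEILING = 4    # claimed: shared memory cannot serve more than this


def cell_memory_calls(chips: int) -> int:
    """Claimed memory calls per tick for a cluster of Cells (scales linearly)."""
    return CELL_CALLS_PER_CHIP * chips


def c2d_memory_calls(chips: int, ceiling: int = C2D_MEMORY_CEILING) -> int:
    """Claimed memory calls per tick for C2Ds, capped by the shared memory."""
    return min(C2D_CALLS_PER_CHIP * chips, ceiling)


def claimed_table(chip_counts=(1, 2, 4)):
    """Return (chips, cell_calls, c2d_calls) rows for the given chip counts."""
    return [(n, cell_memory_calls(n), c2d_memory_calls(n)) for n in chip_counts]


if __name__ == "__main__":
    for chips, cell, c2d in claimed_table():
        print(f"{chips} chip(s): Cell {cell} calls/tick, C2D {c2d} calls/tick")
```

Under these assumptions the C2D column stops growing at 2 chips, which is the whole argument in one line. Change the ceiling or the per-chip figures and the conclusion changes with them.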



PSN ID: Kwaad


I fly this flag in victory!


Kwaad said: The C2D changed things for the better, but once again, it is not the way the Cell works. The Cell has 64 MB of dedicated memory on each SPE, meaning that one SPE can access that memory at 100% speed, or all 8 SPEs can access their memory at 100%.
And I still raise the issue: check your numbers. The amount of local memory that you grant a single SPE is more than the amount of memory available on the whole Cell. A single SPE has 512 KB of memory, and its basic limitation lies in the interaction of its units and in task switches; if I am not totally mistaken, this will become a major hazard in the Cell of the PS3. Either you don't really use the SPEs efficiently, or they are in the process of flooding the internal bus and their common memory interface. The Cell has certain advantages and certain disadvantages. Another thing I find funny is how much some people value special benchmarks. In reality, in the professional world, benchmarks are more or less seen as smoke screens. You really have to think about what you are measuring and what kind of hazards you miss. The only reliable benchmarks are benchmarks with your real applications and real data.



Not internal. Each SPE has 64 MB dedicated to it, at least on the PS3. Oh, and sorry, I did my math wrong: it was 32 MB per SPE. 8 x 32 = 256. What is good about this system is that it is 8x the speed of 1 x 256. The Cell is great for 3D video, and it is great for compressing/decompressing video/audio.



PSN ID: Kwaad


I fly this flag in victory!

Kwaad said: Not internal. Each SPE has 64 MB dedicated to it, at least on the PS3. Oh, and sorry, I did my math wrong: it was 32 MB per SPE. 8 x 32 = 256.
#1. The Cell in the PS3 has one SPE disabled for redundancy, so it really has just 7 SPEs. Furthermore, one SPE is taken up by the OS, so games only have 6 available. #2. The 256 MB of RAM is accessed by the CPU. Did you get that? It's used by the CPU and is in no way "dedicated" via hardware to the SPEs.
What is good about this system is... that is 8x the speed of 1x256.
What? Do you understand how RAM works?
The Cell is great for 3D video
Don't you mean rendering 3D, or has the PlayStation 3 invented 3D video?



Leo-j said: If a dvd for a pc game holds what? Crysis at 3000p or something, why in the world cant a blu-ray disc do the same?

ssj12 said: Player specific decoders are nothing more than specialized GPUs. Gran Turismo is the trust driving simulator of them all. 

"Why do they call it the xbox 360? Because when you see it, you'll turn 360 degrees and walk away" 

I mean 3D Studio Max, and that style of rendering. Ya know, stuff like you see in the movies.



PSN ID: Kwaad


I fly this flag in victory!


Kwaad said: I mean 3D Studio Max, and that style of rendering. Ya know, stuff like you see in the movies.
Well, the Cell was more or less designed to do such work as raycasting, but it will not be used in the PS3 in this way. Furthermore, real test cases already showed a tendency for the individual programs of one node not to fit into its 512 KB of internal RAM. Only this memory is dedicated to the SPE. It has nothing to do with the RAM that is dedicated to the Cell itself. If the Cell has to work with more than these 512 KB of memory, it has to wait until the missing code is loaded from the 256 MB of memory, which takes several cycles, as on every other processor. In fact, this was probably one reason why they chose a raycasting demo. Many people don't realize the difference between raycasting and raytracing, but raycasting is much easier and faster than a raytracing algorithm, which would probably not fit into the internal 512 KB of RAM and would not work in real time.
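To make the raycasting/raytracing difference concrete, here is a rough Python sketch that only counts rays. It assumes the common usage in which raycasting stops at the first hit, while raytracing keeps following secondary rays; here that is a single mirror reflection per hit, where a real ray tracer would also spawn shadow and refraction rays. The scene, the function names, and the bounce limit are all made up for illustration and have nothing to do with the actual Cell demo.

```python
import math


def unit(v):
    """Scale a 3-vector to length 1."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def hit_sphere(origin, direction, center, radius):
    """Distance along a unit-length ray to the near side of a sphere, or None."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0:
        return None
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 1e-6 else None  # ignore hits behind or at the ray start


def nearest_hit(origin, direction, spheres):
    """Closest (distance, center) among all spheres, or None if the ray misses."""
    best = None
    for center, radius in spheres:
        t = hit_sphere(origin, direction, center, radius)
        if t is not None and (best is None or t < best[0]):
            best = (t, center)
    return best


def cast(origin, direction, spheres):
    """Raycasting: one primary ray, stop at the first hit. Returns rays used."""
    nearest_hit(origin, direction, spheres)
    return 1


def trace(origin, direction, spheres, depth):
    """Raytracing: follow mirror reflections up to `depth` bounces."""
    hit = nearest_hit(origin, direction, spheres)
    if hit is None or depth == 0:
        return 1
    t, center = hit
    point = [o + t * d for o, d in zip(origin, direction)]
    normal = unit([p - c for p, c in zip(point, center)])
    dot = sum(d * n for d, n in zip(direction, normal))
    reflected = [d - 2.0 * dot * n for d, n in zip(direction, normal)]
    return 1 + trace(point, reflected, spheres, depth - 1)


if __name__ == "__main__":
    # Two unit spheres facing each other along the z axis (hypothetical scene).
    scene = [([0.0, 0.0, 5.0], 1.0), ([0.0, 0.0, -5.0], 1.0)]
    eye, forward = [0.0, 0.0, 0.0], [0.0, 0.0, 1.0]
    print("raycasting rays per pixel:", cast(eye, forward, scene))
    print("raytracing rays per pixel:", trace(eye, forward, scene, depth=3))
```

The per-pixel cost of raycasting is fixed, while the raytracing cost grows with every bounce you allow, and the code and data needed at each bounce are harder to predict.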



Oh god! This post is by far the funniest thing I have read in a long time! Washimul seems like an idiot, but oh well, he was banned. But yeah, I started reading and was like, eh? Haven't raytracing and raycasting been around for a while? I don't know much about hardware; I just know the games that use it. You've got to know at least that much to be able to get a good PC or Mac to run the games nicely. Erm, hey, what does Doom 3 use for rendering? I only get like 15-25 FPS o.O (depending on the action going on), though that's probably due to my GeForce 4 MX. Oh well, meh = new comp before May ^^ But yeah, what would be a good computer for BF2-related games and HL2? The graphics card must be Nvidia! I'm a fanboy of Nvidia o.o though ATI is nice >.> I might join the other side; I never stay a fanboy of one thing for long. Currently I am a fanboy of the Wii and 360, and Sony can go to hell atm; the only reason I would want a PS3 is for MGS4 and FF and possibly a Tales game. But yeah, Washimul is funny even though he is stupid. For just the little amount I know, he seems stupider than I am o.O and I thought that was hard to do!



kars said: Kwaad said: I mean 3D Studio Max, and that style of rendering. Ya know, stuff like you see in the movies.
Well, the Cell was more or less designed to do such work as raycasting, but it will not be used in the PS3 in this way. Furthermore, real test cases already showed a tendency for the individual programs of one node not to fit into its 512 KB of internal RAM. Only this memory is dedicated to the SPE. It has nothing to do with the RAM that is dedicated to the Cell itself. If the Cell has to work with more than these 512 KB of memory, it has to wait until the missing code is loaded from the 256 MB of memory, which takes several cycles, as on every other processor. In fact, this was probably one reason why they chose a raycasting demo. Many people don't realize the difference between raycasting and raytracing, but raycasting is much easier and faster than a raytracing algorithm, which would probably not fit into the internal 512 KB of RAM and would not work in real time.
Very nice... except the PPE alone can grab 4 instructions from the XDR and write 2 in a single cycle. The PPE has 512 KB of cache, while EACH individual SPE has 256 KB of RAM. It is referred to as RAM because it allows the programmers to control how information is stored and handled by the SPEs.



Games make me happy! PSN ID: Staticneuron Gamertag: Staticneuron Wii Code: Static Wii - 3055 0871 5802 1723

kars said: And I still raise the issue: check your numbers. The amount of local memory that you grant a single SPE is more than the amount of memory available on the whole Cell. A single SPE has 512 KB of memory, and its basic limitation lies in the interaction of its units and in task switches; if I am not totally mistaken, this will become a major hazard in the Cell of the PS3. Either you don't really use the SPEs efficiently, or they are in the process of flooding the internal bus and their common memory interface. The Cell has certain advantages and certain disadvantages. Another thing I find funny is how much some people value special benchmarks. In reality, in the professional world, benchmarks are more or less seen as smoke screens. You really have to think about what you are measuring and what kind of hazards you miss. The only reliable benchmarks are benchmarks with your real applications and real data.
There are no bottlenecks when the PPE is communicating with the SPEs. There is a slight overhead when communicating over the EIB with the XDR and the RSX, but not enough to "flood" the EIB (from my studies, that wouldn't be easy to do by accident). You can program badly for the PS3, but you can do the same for any processor. It works differently, so whatever seems like a disadvantage just means you should look at a problem in a different way.



Games make me happy! PSN ID: Staticneuron Gamertag: Staticneuron Wii Code: Static Wii - 3055 0871 5802 1723

staticneuron said: Very nice... except the PPE alone can grab 4 instructions from the XDR and write 2 in a single cycle. The PPE has 512 KB of cache, while EACH individual SPE has 256 KB of RAM.
Oops, yes, you are right: only the PPE has 512 KB of memory; the SPEs only have a local memory of 256 KB (and the programmer has to take control of the memory management). I won't write anything about the PPE's access to the XDR, because I doubt that the SPEs will leave their memory access to the PPE, which will be kept more than busy by the branch-heavy parts of the software (AI, physics and so on).
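As a rough illustration of what "the programmer has to take control of the memory management" means with a 256 KB local memory, here is a small Python sketch that splits a working set into pieces that fit. The 64 KB reserved for code and stack is a made-up assumption, and `plan_chunks` is a hypothetical helper, not anything from the Cell SDK.

```python
LOCAL_STORE_BYTES = 256 * 1024  # per-SPE local memory, as corrected in this thread


def plan_chunks(total_bytes, reserved_bytes=64 * 1024):
    """Split a working set into (offset, size) transfers that fit local memory.

    reserved_bytes is an assumed allowance for the SPE program's own code
    and stack; the real figure depends entirely on the program.
    """
    usable = LOCAL_STORE_BYTES - reserved_bytes
    if usable <= 0:
        raise ValueError("nothing left in local memory for data")
    chunks = []
    offset = 0
    while offset < total_bytes:
        size = min(usable, total_bytes - offset)
        chunks.append((offset, size))
        offset += size
    return chunks


if __name__ == "__main__":
    # A 1 MB buffer does not fit, so it has to be streamed in several pieces.
    for offset, size in plan_chunks(1024 * 1024):
        print(f"transfer {size} bytes starting at offset {offset}")
```

Anything bigger than the usable space turns into several transfers from main memory, and the SPE either waits for each one or has to overlap the next transfer with work on the current piece.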