Ascended_Saiyan3 said:
slowmo said:
Final-Fan said: Ascended_Saiyan3: 'It IS real world, just look at these theoretical numbers!'
This deeply amused me. |
Honestly, I was trying to help him, but I can't stop him from shooting himself in the foot.
It would be wise to look into the deficiencies of an asynchronous design in relation to games before you cite it as an advantage, AS3. Also, bandwidth between cores is completely irrelevant if the amount of data each core can hold is tiny and the delay in getting more data makes the speed boost pointless. The Cell is an excellent processor for certain things; it's just not as flexible as most developers would have liked, hence the struggles in programming for it.
Just to settle this discussion once and for all: do you think a PS3 could run Crysis at even 720p on high and still look as good as or better than the PC version?
|
Inter-core, memory, and cache bandwidth are VERY relevant. And Intel's Core i7 doesn't have an edge over the Cell there. You are thinking in terms of OLD programming knowledge. Moving small pieces of data around and processing them in parallel is the NEW way of coding for next-gen architectures.
Maybe this will help you "get it", if you understand it in the first place (pages 94-194). That's the Siggraph 2008 PDF for developers...by developers. Note how those next-gen techniques look like best practices for Cell programming (small chunks of data).
The Cell is not as general-purpose for developers TODAY. When their code is changed to follow NEXT-GEN best practices, the Cell will be one of the MOST flexible general-purpose processors on the market. The Cell is just ahead of its time. And whenever you have something ahead of its time, you get people like yourself who have a hard time seeing it for what it is.
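To make the "small chunks" idea concrete, here is a minimal plain-C sketch (the chunk size and names are my own invention, not from the Siggraph material, and memcpy just stands in for a DMA transfer): instead of chasing pointers all over main memory, you stream big arrays through a small local buffer.
[code]
#include <stddef.h>
#include <string.h>

#define CHUNK 4096  /* elements per chunk; sized to fit a small local store or cache */

/* Stream a large array through a small working buffer, one chunk at a time.
 * On the Cell, each chunk would be DMA'd into an SPE's 256KB local store;
 * here a plain memcpy stands in for that transfer. */
static void scale_array(float *dst, const float *src, size_t n, float k)
{
    float buf[CHUNK];                                /* small local working set */
    for (size_t i = 0; i < n; i += CHUNK) {
        size_t len = (n - i < CHUNK) ? (n - i) : CHUNK;
        memcpy(buf, src + i, len * sizeof(float));   /* "DMA in" */
        for (size_t j = 0; j < len; j++)             /* compute on local data only */
            buf[j] *= k;
        memcpy(dst + i, buf, len * sizeof(float));   /* "DMA out" */
    }
}
[/code]
The point is that the working set never exceeds one chunk, so the same code runs well anywhere local memory is small and bulk transfers are fast.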
|
A quote for you to read and process:
The PS3 is so much more difficult to program for than the 360. In a sense it is designed like the multiprocessor systems used by specialized customers such as NASA Ames Research Center. The concept is based on the principle that there is a very large amount of repetitive mathematical work that can be performed in a parallel or segmented sequential fashion (e.g., one core multiplies two arrays of 10,000 numbers and then passes the output array to another core, which performs divides on the individual elements and passes the array on to yet another core, which performs some other operation on the data, and so on. After the first core finishes its operation, it acquires more data and performs the same operation again).
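For illustration, that multiply-then-divide pipeline might look something like this in plain C (the array size, stage functions, and divisor are invented; on the real hardware each stage would run on its own core and the hand-off would be a DMA transfer rather than a shared array):
[code]
#include <stdio.h>

#define N 10000

/* Stage 1: elementwise multiply of two input arrays (would run on core A). */
static void stage_multiply(const double *a, const double *b, double *out)
{
    for (int i = 0; i < N; i++)
        out[i] = a[i] * b[i];
}

/* Stage 2: divide each element by a constant (would run on core B,
 * receiving the array that stage 1 just produced). */
static void stage_divide(double *data, double divisor)
{
    for (int i = 0; i < N; i++)
        data[i] /= divisor;
}

int main(void)
{
    static double a[N], b[N], tmp[N];
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0; }

    /* Run serially here; on the Cell, stage 1 would already be working
     * on its next batch while stage 2 processes this one. */
    stage_multiply(a, b, tmp);
    stage_divide(tmp, 4.0);

    printf("tmp[10] = %f\n", tmp[10]); /* 10 * 2 / 4 = 5 */
    return 0;
}
[/code]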
Like the 360, the application would initially be developed using the PPE core. Next you would think that the PS3 (just like the 360) would be able to segment the game-control-plus-AI code onto one core and the graphics rendering code onto another. However, that is not possible. Since the total application code may be about 100 MB and an SPE has only 256KB of memory, only about 1/400 of the total code can fit in one SPE's memory. Also, since an SPE has no branch prediction capability, branching should be done as little as possible (although I believe the compiler can insert code to trigger pre-fetches, so branching may not be a big issue).
Therefore the developer has to find code that is smaller than 256KB (including the data space it needs) and that will execute in parallel.
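On the branching point: code written for a core with no branch predictor typically replaces hard-to-predict branches with branch-free selects. A hedged plain-C illustration of the idea (ordinary C operations, not actual SPU intrinsics):
[code]
#include <stdint.h>

/* Branchy version: a mispredicted branch is expensive on a core with
 * no branch predictor. */
static int32_t clamp_branchy(int32_t x, int32_t lo, int32_t hi)
{
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
}

/* Branch-free version: build an all-ones/all-zeros mask from the comparison
 * and blend the two candidates; this is the plain-C analogue of the SPU's
 * compare + select instructions. */
static int32_t select_i32(int32_t cond, int32_t a, int32_t b)
{
    int32_t mask = -(cond != 0);        /* 0xFFFFFFFF if cond, else 0 */
    return (a & mask) | (b & ~mask);
}

static int32_t clamp_branchless(int32_t x, int32_t lo, int32_t hi)
{
    x = select_i32(x < lo, lo, x);      /* same result as clamp_branchy, */
    x = select_i32(x > hi, hi, x);      /* but with no branches taken */
    return x;
}
[/code]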
Even if code can be found that can be segmented this way, data between the PPE and the SPE has to be passed back and forth via DMA (Direct Memory Access), which is quite slow compared to simply passing a pointer to the data, as on the 360.
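The standard mitigation for that DMA latency is double buffering: fetch chunk N+1 while computing on chunk N. A rough plain-C sketch of the pattern (memcpy stands in for the asynchronous DMA here, so the overlap is only notional; all names and sizes are invented):
[code]
#include <stddef.h>
#include <string.h>

#define CHUNK 1024

/* Double-buffered streaming: while buffer 'cur' is being processed, the next
 * chunk is (conceptually) in flight into the other buffer. On the Cell the
 * memcpy would be an async DMA, and you'd wait on its tag before swapping. */
static void process_stream(float *data, size_t n)
{
    float bufs[2][CHUNK];
    int cur = 0;
    size_t len0 = (n < CHUNK) ? n : CHUNK;
    memcpy(bufs[cur], data, len0 * sizeof(float));        /* prime first chunk */

    for (size_t i = 0; i < n; i += CHUNK) {
        size_t len = (n - i < CHUNK) ? (n - i) : CHUNK;
        size_t next = i + CHUNK;
        if (next < n) {                                   /* start "DMA" of next chunk */
            size_t nlen = (n - next < CHUNK) ? (n - next) : CHUNK;
            memcpy(bufs[cur ^ 1], data + next, nlen * sizeof(float));
        }
        for (size_t j = 0; j < len; j++)                  /* compute on current chunk */
            bufs[cur][j] = bufs[cur][j] * 2.0f + 1.0f;
        memcpy(data + i, bufs[cur], len * sizeof(float)); /* write results back */
        cur ^= 1;                                         /* swap buffers */
    }
}
[/code]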
If we assume that enough segmentable code was found to use all 6 SPE cores assigned to the game application, the developer would now try to balance the load among the cores. Like the 360, some or all of the cores may end up with very low utilization. Adding more hardware threads is not possible, since each core has only one hardware thread. Adding software threads probably will not work due to the memory constraint. So the only option is an overlay scheme, where the PPE transfers new code to the SPE via DMA when the last overlay finishes processing. This is very time-consuming, and code has to be found that does not overlap in the same time frame.
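Very loosely, the dispatch logic of such an overlay scheme could be modelled like this in plain C, with a table of kernel functions standing in for code images (real overlays DMA actual code into the 256KB local store; everything below is invented purely for illustration):
[code]
#include <stdio.h>

#define N 8

typedef void (*kernel_fn)(float *data, int n);

static void kernel_scale(float *d, int n) { for (int i = 0; i < n; i++) d[i] *= 2.0f; }
static void kernel_bias (float *d, int n) { for (int i = 0; i < n; i++) d[i] += 1.0f; }

int main(void)
{
    /* The "overlay table": on the Cell, each entry would be a code image the
     * PPE DMAs into the SPE's local store; here it is just a function pointer. */
    kernel_fn overlays[] = { kernel_scale, kernel_bias };
    float data[N] = { 1, 2, 3, 4, 5, 6, 7, 8 };

    /* The serialization point the quote complains about: the next "overlay"
     * can only be loaded after the previous one has finished with the data. */
    for (size_t k = 0; k < sizeof overlays / sizeof overlays[0]; k++)
        overlays[k](data, N);

    printf("data[0] = %f\n", data[0]); /* (1 * 2) + 1 = 3 */
    return 0;
}
[/code]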
The full post can be read here:
http://on-mirrors-edge.com/forums/viewtopic.php?pid=38256
You're quoting theoretical maximum throughput figures as if they can be constantly achieved. If you honestly believe that, then you're a fool. The small chunks of data that you call next-gen programming techniques are a limitation of the SPEs, not a benefit to the system at all. The Cell isn't ahead of its time at all; it's just not designed as a general-purpose processor. I can't believe you're too blinded to see it.
Also, if the best defence you can come up with in a debate is to call people sad and resort to name-calling, then I pity you. And you still didn't answer my question of whether the PS3 could run Crysis on high at 720p.