sc94597 said:
All the guy you quoted essentially said is that the PS2 wouldn't be able to handle GTA V's traffic system. That is a pretty low bar though. The traffic system of GTA V is good enough for what it is intended to do (add a layer of complexity to a sandbox). It's not spectacular when compared to real city simulation games, but nobody expects it to be. Notice how his quote doesn't really talk about the behavior of individual cars, and speaks more about the emergent behavior of having many of them. The actual decisions made by the individual cars with respect to the greater traffic haven't changed much though, which is quite a different thing from games which focus very much on traffic simulations. I am a bit baffled at what he means by "but 3x ARM cores are likely a significant downgrade compared to 7x x64 cores for this kind of highly parallel city simulation workload" when as far as I recall the traffic system wasn't upgraded in the 8th generation remasters. If the 360's in-order Xenon can handle it, the Switch's CPU certainly can, in handheld or docked mode. As for "lots of other simulated systems running concurrently" I don't disagree, GTA V does have many systems, but SO DOES BOTW.
It's impossible to know how many things a game is doing under the hood. It's hard to compare games until they are both on the same systems, but just using common sense, making a living, breathing, realistic city with traffic is gonna be way more demanding than an open-world cel-shaded game in the forest. As for the developer, he says there is way more traffic in the next-gen versions, which would be too taxing for the Switch CPU. He also compares the 360/PS3 CPUs vs. the Switch.
It would be highly dependent on the code they are running.
"Cell and Xenon are good in highly optimized SIMD code. Xenon = 3 cores at 3.2 GHz, four multiply-adds per cycle (76.8 GFLOP/s). That's significantly higher theoretical peak than the 4x ARM cores on Switch can achieve. But obviously it can never reach this peak. You can't assume that multiply-add is the most common instruction (see Broadwell vs Ryzen SIMD benchmarks for further proof). Also Xenon vector pipelines were very long, so you had to unroll huge loops to reach good perf with it. Branching and indexing based on vector math results was horrible (~40 cycle stall to move data between register files). ARM NEON is a much better instruction set and OoO and data prefetch helps even in SIMD code.
If you compare them in standard C/C++ game code, ARM and Jaguar both stomp over the old PPC cores. I remember that it was common consensus that the IPC in generic code was around 0.2. So both Jaguar and ARM should be 5x+ faster per clock than those PPC cores (IIRC Jaguar average IPC was around 1.0 in some real life code benchmark, this ARM core should be close). However you can also write low level optimized game code for PPC, so it all depends on how much resources you had to optimize and rewrite the code. Luckily those days are a thing of the past. I don't want to remember all those ugly hacks we had around the code base to make the code run "well enough". The most painful thing was that CPU didn't have a data prefetcher. So you had to know around 2000 cycles in advance which memory regions your future code is going to access, and prefetch that data to cache. If you didn't do this, you would get 600 cycle stalls on memory loads. Those PPC cores couldn't even prefetch linear arrays."
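For anyone who wants to sanity-check the numbers in that quote, here is a rough back-of-the-envelope sketch in Python. The function names are made up for illustration, and counting a multiply-add as 2 FLOPs is an assumption on my part (it is the usual convention, and it is what makes the quoted 76.8 GFLOP/s figure work out). The IPC values are just the rough figures from the quote, not measurements.

```python
def peak_gflops(cores, clock_ghz, fma_per_cycle, flops_per_fma=2):
    """Theoretical peak in GFLOP/s.

    Assumption: one multiply-add counts as 2 floating-point operations.
    """
    return cores * clock_ghz * fma_per_cycle * flops_per_fma


def per_clock_speedup(new_ipc, old_ipc):
    """How many times more instructions per clock the newer core retires."""
    return new_ipc / old_ipc


def cycles_to_ns(cycles, clock_ghz):
    """Convert a cycle count to nanoseconds at the given clock (GHz)."""
    return cycles / clock_ghz


# Xenon figures as given in the quote: 3 cores, 3.2 GHz, 4 multiply-adds per cycle.
xenon_peak = peak_gflops(cores=3, clock_ghz=3.2, fma_per_cycle=4)
print(f"Xenon theoretical peak: {xenon_peak:.1f} GFLOP/s")

# Quoted rough IPC in generic code: about 0.2 for the old PPC cores, about 1.0 for Jaguar.
print(f"Per-clock speedup in generic code: {per_clock_speedup(1.0, 0.2):.1f}x")

# Quoted memory numbers on the 3.2 GHz part: 600-cycle stall, 2000-cycle prefetch lead.
print(f"600-cycle stall: {cycles_to_ns(600, 3.2):.1f} ns")
print(f"2000-cycle prefetch lead: {cycles_to_ns(2000, 3.2):.1f} ns")
```

This only reproduces the arithmetic behind the quoted figures. It says nothing about what real game code achieves, which is the whole point of the quote: the theoretical peak is never reached in practice.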







