Bofferbrauer2 said:
Pemalite said:
Gotta' disagree. Because... Bandwidth. 25.6GB/s isn't good for 1080P gaming... There would be a linear increase in performance with a doubling of bandwidth.
I did some testing on this back in the day and posted it on this very forum about how bandwidth scales with resolution. - Around 150-200GB/s is the ideal ballpark for 1080P... Otherwise you are fillrate limited.
So you are going from 25.6GB/s to 59.7GB/s... Which is a 133% increase in bandwidth... Maxwell to Pascal V4 Delta Colour compression gives Pascal a 20% best-case scenario bandwidth improvement on top of that. https://www.anandtech.com/show/10325/the-nvidia-geforce-gtx-1080-and-1070-founders-edition-review/8
So that potentially makes the difference 25.6GB/s versus 71.64GB/s, or roughly a 180% advantage for Tegra Pascal.
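For anyone who wants to check the arithmetic above, here is a minimal sketch in Python. The bandwidth figures and the 20% best-case compression gain are the ones quoted in the post; the variable and function names are just illustrative, and treating the compression gain as a simple multiplier on raw bandwidth is an assumption made to mirror the post's own calculation.

```python
# Figures quoted in the post; names here are illustrative only.
old_bw = 25.6    # GB/s, the lower-bandwidth part
new_bw = 59.7    # GB/s, the higher-bandwidth part
dcc_gain = 0.20  # best-case delta colour compression gain (assumed multiplicative)


def pct_increase(old, new):
    """Percentage increase going from old to new."""
    return (new / old - 1.0) * 100.0


raw_increase = pct_increase(old_bw, new_bw)

# Effective bandwidth if the full best-case compression gain applied.
effective_bw = new_bw * (1.0 + dcc_gain)
effective_increase = pct_increase(old_bw, effective_bw)

print(f"Raw bandwidth increase:       {raw_increase:.1f}%")
print(f"Effective bandwidth:          {effective_bw:.2f} GB/s")
print(f"Effective bandwidth increase: {effective_increase:.1f}%")
```

The raw increase works out to about 133% and the best-case effective increase to just under 180%, which matches the numbers in the post.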
If you can point me to a point in history where a GPU of the same relative architecture but with a 150%-200% bandwidth advantage got less than a 50% performance improvement at decent resolutions... I'll eat my hat.
Because I can point to many cases where nVidia/AMD have taken a GPU but hamstrung its bandwidth by using DDR4/DDR3 instead of GDDR5, and performance took a massive dive. Like the Radeon 7750, GeForce 1030, etc.
|
Here are the results I had in GoogLeNet and AlexNet, two neural networks that make full use of the GPU. Both chips have the same settings:
| Benchmark | Jetson TX1 | TX2 Max-Q | TX2 Max-P |
| --- | --- | --- | --- |
| GoogLeNet (batch=2) | 141 FPS | 137 FPS | 176 FPS |
| GoogLeNet (batch=128) | 202 FPS | 195 FPS | 252 FPS |
| AlexNet (batch=2) | 163 FPS | 176 FPS | 222 FPS |
| AlexNet (batch=128) | 507 FPS | 465 FPS | 603 FPS |
As you can see, at no point do we get a 50% uplift at similar power profiles.
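To put numbers on that, here is a quick sketch that computes the uplift of each TX2 mode over the TX1 from the FPS figures in the table. The dictionary layout and function name are just illustrative choices, not anything from the benchmark tooling.

```python
# FPS figures copied from the table: (Jetson TX1, TX2 Max-Q, TX2 Max-P).
results = {
    "GoogLeNet (batch=2)": (141, 137, 176),
    "GoogLeNet (batch=128)": (202, 195, 252),
    "AlexNet (batch=2)": (163, 176, 222),
    "AlexNet (batch=128)": (507, 465, 603),
}


def uplift_pct(base, new):
    """Percentage change in FPS relative to the baseline."""
    return (new / base - 1.0) * 100.0


# Map each benchmark to (Max-Q uplift, Max-P uplift) over the TX1.
uplifts = {
    name: (uplift_pct(tx1, max_q), uplift_pct(tx1, max_p))
    for name, (tx1, max_q, max_p) in results.items()
}

for name, (q, p) in uplifts.items():
    print(f"{name:<24} Max-Q: {q:+6.1f}%   Max-P: {p:+6.1f}%")

best_max_p = max(p for _, p in uplifts.values())
print(f"Largest Max-P uplift: {best_max_p:.1f}%")
```

The largest Max-P gain is about 36% (AlexNet, batch=2), and Max-Q is actually slightly behind the TX1 in three of the four runs.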
That being said, the TX2 can be fully unleashed, at which point the chip has an additional 10-20% uplift over Max-P - but then it also consumes roughly 25% more power than the TX1. It's possible that you were referring to those results, as then we're getting around 50% uplift.
Of course, it's also dependent on the benchmark. JPEG compression, for instance, is almost 3 times as fast on the TX2. Here, the higher bandwidth comes fully into play. But that simply ain't true for all applications.
|
That's compute. Not gaming. It's certainly not making "full use" of the GPU, there are a ton of fixed function units going unused.
It's like doing mining on Vega 7... Which often has a higher hashrate than RDNA2 GPUs, despite RDNA2 GPUs being more performant in games.
https://overclock3d.net/news/gpu_displays/amd_s_rx_6700_xt_is_slower_at_mining_than_its_predecessor_-_here_s_why/1