haxxiy said:
From the performance figures they've given, Ampere has 98% more flops per watt than Turing but only 21% more performance, on average. That means one needs 1.61 Ampere flops to equal the performance of 1 Turing flop, and 1.5 Ampere flops to equal 1 RDNA 1.0 flop. It seems clear to me each shader was effectively cut in half before some architectural improvements, or perhaps it was the increased number of FP32 engines themselves that increased the performance relative to Turing. With RDNA 2.0 apparently focusing on IPC, it would seem like Nvidia and AMD have more or less switched places concerning what their GPU design philosophy historically used to be. Ampere is very Terascale-like (lots of shaders, lower clocks and performance) while RDNA 2.0 is kind of Fermi-like (higher clocks and IPC but fewer shaders). An Ampere CUDA core also has some similarities with Bulldozer modules, in that a second unit (integer in the case of Bulldozer, floating-point in the case of Ampere) was added to each processing core to increase performance and also to make it into those PR slides with twice the number of cores. So, I don't think it's feasible to expect there's more performance left in future drivers (the same way that magical expectation wasn't feasible with Terascale or GCN).
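As a rough check of the arithmetic in the quote, here is a minimal sketch in Python. It assumes both percentages are measured at the same power budget, so the flops needed per unit of performance is just the ratio of the two gains. The function name `flops_per_unit_performance` is made up for illustration.

```python
def flops_per_unit_performance(flops_gain: float, perf_gain: float) -> float:
    """Ratio of new-architecture flops needed to match one old-architecture flop.

    Both arguments are fractional gains (0.98 means 98% more).
    Assumption: both gains are measured under the same power budget.
    """
    return (1 + flops_gain) / (1 + perf_gain)


# Figures quoted above: 98% more flops per watt, 21% more performance.
ratio = flops_per_unit_performance(0.98, 0.21)
print(f"Ampere flops per Turing flop: {ratio:.2f}")
```

With these inputs the plain ratio comes out at about 1.64, close to the 1.61 in the quote. The small gap presumably comes from slightly different input figures.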
Thanks for the detailed analysis.
Unless you know something the rest of us don't, isn't it a bit early to compare Ampere to RDNA 2.0, given how little we know about the latter? Sure, we know a few things, mostly from the XSX deep dive, but is that enough to compare both architectures before AMD reveals the specs of Big Navi and the rest of the lineup?
Please excuse my bad English.
Currently gaming on a PC with an i5-4670K @ stock (for now), 16 GB RAM at 1600 MHz and a GTX 1070
Steam / Live / NNID : jonxiquet Add me if you want, but I'm a single player gamer.