
Carzy Zarx’s PC Gaming Emporium - Catch Up on All the Latest PC Gaming Related News

Overall, I'm a bit underwhelmed.

31.34 SP and 15.67 DP GFLOPS/W for Pascal Tesla with HBM RAM don't sound that great when the 28nm FirePros are at ~20 SP and ~10 DP GFLOPS/W. It seems like all the machinery for double precision is costing them a lot of transistors and power. Of course, CUDA has been the preferred architecture for scientific workloads for a while, but a bit of market share could shift if AMD manages more than a 50% performance-per-watt increase with Polaris server GPUs.
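As a quick sanity check on that 50% figure, here is a small Python sketch that uses only the numbers quoted above. The helper name `required_gain` is made up for illustration, and the FirePro values are the approximate ones from this post, so treat the result as a rough estimate.

```python
# GFLOPS per watt, as quoted in the post above.
pascal_sp, pascal_dp = 31.34, 15.67    # Pascal Tesla
firepro_sp, firepro_dp = 20.0, 10.0    # 28nm FirePro (approximate)


def required_gain(target, current):
    """Fractional perf/W increase needed to move from `current` to `target`."""
    return target / current - 1.0


sp_gain = required_gain(pascal_sp, firepro_sp)
dp_gain = required_gain(pascal_dp, firepro_dp)
print(f"SP gain needed: {sp_gain:.1%}")
print(f"DP gain needed: {dp_gain:.1%}")
```

With these inputs both cases come out at roughly 57%, so a gain of exactly 50% would still leave the FirePros slightly behind on paper.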






 

 

 

 

 


Self-Driving Cars will kill us.



The GP100 is looking sexy ...



CGI-Quality said:
Yeah, consumer Pascal will be @ Computex (May 31-June 1).

Indeed.

Nvidia did show Pascal, but only in their Tesla configurations. At least now we know what the new Titan could be like.

 

Btw, have you seen that Realities VR game video? I posted it in the news and thought you might like it, given that you made that thread about UE4 and photogrammetry pics.



Please excuse my bad English.

Currently gaming on a PC with an i5-4670k@stock (for now), 16Gb RAM 1600 MHz and a GTX 1070

Steam / Live / NNID : jonxiquet    Add me if you want, but I'm a single player gamer.

Nvidia has given some details and a comparison chart of Pascal through their blog:

https://devblogs.nvidia.com/parallelforall/inside-pascal/

Like previous Tesla GPUs, GP100 is composed of an array of Graphics Processing Clusters (GPCs), Streaming Multiprocessors (SMs), and memory controllers. GP100 achieves its colossal throughput by providing six GPCs, up to 60 SMs, and eight 512-bit memory controllers (4096 bits total). The Pascal architecture’s computational prowess is more than just brute force: it increases performance not only by adding more SMs than previous GPUs, but by making each SM more efficient. Each SM has 64 CUDA cores and four texture units, for a total of 3840 CUDA cores and 240 texture units.
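The totals in that quote are simple multiplication, which the following Python sketch spells out. The per-SM and per-controller figures come from the quote; the 56-SM case is the Tesla P100 configuration from the comparison table in this post. The function and variable names are my own.

```python
# Full GP100 layout, as described in the NVIDIA quote above.
GPCS = 6
SMS_FULL = 60
CORES_PER_SM = 64
TEX_UNITS_PER_SM = 4
MEM_CONTROLLERS = 8
BITS_PER_CONTROLLER = 512


def chip_totals(sms):
    """CUDA core and texture unit totals for a chip with `sms` enabled SMs."""
    return {
        "cuda_cores": sms * CORES_PER_SM,
        "texture_units": sms * TEX_UNITS_PER_SM,
    }


full = chip_totals(SMS_FULL)   # full GP100
p100 = chip_totals(56)         # Tesla P100, 56 SMs enabled
bus_width = MEM_CONTROLLERS * BITS_PER_CONTROLLER
sms_per_gpc = SMS_FULL // GPCS

print("Full GP100:", full)
print("Tesla P100:", p100)
print("Memory bus width:", bus_width, "bits")
print("SMs per GPC:", sms_per_gpc)
```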

 

| Tesla Products | Tesla K40 | Tesla M40 | Tesla P100 |
|---|---|---|---|
| GPU | GK110 (Kepler) | GM200 (Maxwell) | GP100 (Pascal) |
| SMs | 15 | 24 | 56 |
| TPCs | 15 | 24 | 28 |
| FP32 CUDA Cores / SM | 192 | 128 | 64 |
| FP32 CUDA Cores / GPU | 2880 | 3072 | 3584 |
| FP64 CUDA Cores / SM | 64 | 4 | 32 |
| FP64 CUDA Cores / GPU | 960 | 96 | 1792 |
| Base Clock | 745 MHz | 948 MHz | 1328 MHz |
| GPU Boost Clock | 810/875 MHz | 1114 MHz | 1480 MHz |
| FP64 GFLOPs | 1680 | 213 | 5304 |
| Texture Units | 240 | 192 | 224 |
| Memory Interface | 384-bit GDDR5 | 384-bit GDDR5 | 4096-bit HBM2 |
| Memory Size | Up to 12 GB | Up to 24 GB | 16 GB |
| L2 Cache Size | 1536 KB | 3072 KB | 4096 KB |
| Register File Size / SM | 256 KB | 256 KB | 256 KB |
| Register File Size / GPU | 3840 KB | 6144 KB | 14336 KB |
| TDP | 235 Watts | 250 Watts | 300 Watts |
| Transistors | 7.1 billion | 8 billion | 15.3 billion |
| GPU Die Size | 551 mm² | 601 mm² | 610 mm² |
| Manufacturing Process | 28-nm | 28-nm | 16-nm |

*I've put in bold the aspects that I think are more important for gaming.

Also, VideoCardz says that this Tesla P100 is not using the full chip.
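For anyone wondering where the FP64 GFLOPs row comes from, it can be reproduced from the core counts and boost clocks in the table. This is only a sketch: it assumes the usual convention that each core retires one fused multiply-add per clock, counted as 2 floating point operations, and that the table uses the boost clock (the higher 875 MHz one for the K40). The function name is my own.

```python
def peak_gflops(cores, clock_mhz, flops_per_cycle=2):
    """Theoretical peak in GFLOPS; an FMA is counted as 2 operations (assumption)."""
    return cores * flops_per_cycle * clock_mhz / 1000.0


# (FP64 cores per GPU, boost clock in MHz), taken from the table above.
cards = {
    "Tesla K40": (960, 875),
    "Tesla M40": (96, 1114),
    "Tesla P100": (1792, 1480),
}

fp64 = {name: peak_gflops(cores, mhz) for name, (cores, mhz) in cards.items()}
for name, gflops in fp64.items():
    print(f"{name}: {gflops:.1f} FP64 GFLOPS")
```

Truncated to whole numbers, the results line up with the 1680, 213 and 5304 GFLOPs listed in the table.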





My excitement died down after reading the specs ...



fatslob-:O said:
My excitement died down after reading the specs ...

Don't be so negative.

It's true that Pascal focuses more on Nvidia's computing business than Maxwell, but I'm confident it will still bring some serious improvements over the current cards.

 

*Edit: I'm working on a better table to be able to compare Pascal/P100 with previous Nvidia cards, unless I find something on the web.




JEMC said:

2-Fan projects

This reminds me of that 3D NES emulator that came up a few weeks before. Doesn't look that bad.

I wonder if Nintendo could implement these things in their VC system, and enhance the roms with different modes such as these voxels or implement vector graphics?



@Twitter | Switch | Steam

You say tomato, I say tomato 

"¡Viva la Ñ!"

TomaTito said:
JEMC said:

2-Fan projects

This reminds me of that 3D NES emulator that came up a few weeks before. Doesn't look that bad.

I wonder if Nintendo could implement these things in their VC system, and enhance the roms with different modes such as these voxels or implement vector graphics?

It would be cool if they did. But since it's Nintendo, they won't.




Ok, so here's a kind of recap about Pascal and the Tesla P100.

This is Tesla P100:

And this is Pascal's P100 core diagram:

First things first: the chip has eight 512-bit memory controllers, for a total memory bus width of 4096 bits. This is because the P100 uses four HBM2 memory stacks instead of GDDR5.

The GPU itself is composed of 6 GPCs (Graphics Processing Clusters), each with 10 SMs (Streaming Multiprocessors). This is a departure from Maxwell, which also had 6 GPCs but only 4 SMs in each.

Now let's take a closer look at an SM unit

As you can see, each Pascal SM has 64 CUDA cores (or shaders) and 4 Texture Units, whereas a Maxwell SM also had 4 Texture Units but twice as many CUDA cores, 128.

That's because Nvidia has focused a lot on the computing side of things, improving single precision (FP32) performance but especially double precision (FP64) performance, achieving a whopping 1:2 ratio between double and single precision, compared to the paltry 1:32 ratio of Maxwell (which was beaten even by Kepler).
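Those ratios fall straight out of the per-SM core counts in the Nvidia table quoted earlier in the thread. A minimal Python sketch, with the dictionary layout being my own choice:

```python
from fractions import Fraction

# (FP32 cores per SM, FP64 cores per SM), from the Nvidia table quoted earlier.
sm_cores = {
    "Kepler (GK110)": (192, 64),
    "Maxwell (GM200)": (128, 4),
    "Pascal (GP100)": (64, 32),
}

# Fraction reduces each FP64:FP32 core ratio to lowest terms.
ratios = {name: Fraction(fp64, fp32) for name, (fp32, fp64) in sm_cores.items()}

for name, ratio in ratios.items():
    print(f"{name}: FP64 runs at {ratio.numerator}:{ratio.denominator} of FP32")
```

This gives 1:3 for Kepler, 1:32 for Maxwell and 1:2 for Pascal, the same ratios that appear in the FP64 row of the comparison table in this post.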

Finally, the chip features 14 MB of register files in total (256 KB per SM across 56 SMs) and 4 MB of L2 cache.

And here is a table comparing the Tesla P100 with the previous Tesla products and top-end cards:

| | Tesla P100 | Tesla M40 | GTX Titan X | Tesla K40 | GTX Titan Black |
|---|---|---|---|---|---|
| GPU | GP100 | GM200 | GM200 | GK110B | GK110B |
| Architecture | Pascal | Maxwell 2 | Maxwell 2 | Kepler | Kepler |
| GPC | 6 | 6 | 6 | 5 | 5 |
| SMs | 56 | 24 | 24 | 15 | 15 |
| CUDA Cores/SM | 64 | 128 | 128 | 192 | 192 |
| CUDA Cores | 3584 | 3072 | 3072 | 2880 | 2880 |
| Texture Units/SM | 4 | 4 | 4 | 16 | 16 |
| Texture Units | 224 | 192 | 192 | 240 | 240 |
| ROPs | - | 96 | 96 | 48 | 48 |
| Core Clock | 1328 MHz | 948 MHz | 1000 MHz | 745 MHz | 889 MHz |
| Boost Clock | 1480 MHz | 1114 MHz | 1075 MHz | 810/875 MHz | 980 MHz |
| Memory Type | HBM2 | GDDR5 | GDDR5 | GDDR5 | GDDR5 |
| Memory Clock | 1.4 GHz | 6 GHz | 7 GHz | 6 GHz | 7 GHz |
| Memory Bus Width | 4096-bit | 384-bit | 384-bit | 384-bit | 384-bit |
| Memory Bandwidth | 720 GB/sec | 288 GB/sec | 336 GB/sec | 288 GB/sec | 336 GB/sec |
| VRAM | 16 GB | 12 GB | 12 GB | 6 GB | 6 GB |
| TDP | 300 Watts | 250 Watts | 250 Watts | 235 Watts | 250 Watts |
| Transistor Count | 15.3 billion | 8 billion | 8 billion | 7.1 billion | 7.1 billion |
| Single Precision FP32 | 10.6 TFLOPS | 6.8 TFLOPS | 6.14 TFLOPS | 4.29 TFLOPS | 5.1 TFLOPS |
| FP64 | 1/2 FP32 | 1/32 FP32 | 1/32 FP32 | 1/3 FP32 | 1/3 FP32 |
| Double Precision FP64 | 5.3 TFLOPS | 213 GFLOPS | - | 1.43 TFLOPS | - |
| Manufacturing Process | TSMC 16nm | TSMC 28nm | TSMC 28nm | TSMC 28nm | TSMC 28nm |
| GPU Die Size | 610 mm² | 601 mm² | 601 mm² | 551 mm² | 551 mm² |
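The Memory Bandwidth row can be cross-checked against the bus width and memory clock rows. The Python sketch below assumes the Memory Clock values are effective per-pin data rates in Gbit/s, which the table itself doesn't spell out, and the function name is mine.

```python
def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: bus width (bits) x per-pin rate (Gbit/s) / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, effective data rate in Gbit/s per pin), from the table above.
cards = {
    "Tesla P100": (4096, 1.4),
    "Tesla M40": (384, 6),
    "GTX Titan X": (384, 7),
    "Tesla K40": (384, 6),
    "GTX Titan Black": (384, 7),
}

results = {name: bandwidth_gb_s(*spec) for name, spec in cards.items()}
for name, gb_s in results.items():
    print(f"{name}: {gb_s:.1f} GB/s")
```

The GDDR5 cards land exactly on the 288 and 336 GB/sec in the table. For the P100 the same arithmetic gives about 717 GB/s, so the 720 GB/sec listed there looks like a rounded figure.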

Of course, anyone expecting the upcoming GTX 1080/1070 (or whatever they are called) to be as big as P100 will be disappointed. Expect the 1080/1070 to be based on a GP104 chip with the number of Graphics Processing Clusters reduced to 4, and the 1060/1050 to have 2 GPCs.


