
Looking At Tegra Xavier -- The Next-Gen Switch Chip

Soundwave said:
Pemalite said:

Not to mention it's a theoretical number anyway.

The Tegra X2/Pascal should be roughly 50% faster than X1/Maxwell. Maybe a little more.
It could be doubled if Tegra X2/Pascal was built at 16/14nm FinFET and had the clock rates driven up more.

Xavier we don't have benchmarks for yet. Xavier has doubled the functional units over Tegra X2/Pascal.
But the real kicker is... What are the clocks?
If the clocks are lower than the Pascal based Tegra we might only see another 50% gain.

Or we could see a 150% gain. We don't know yet.

Nvidia is saying one Tegra Xavier chip matches the Drive PX2, which is 2 Tegra X2s + 2 Pascal-based GPUs (so 4 processors total) at 8 TFLOPS.

Well, those Volta cores must be radically different from what GPUs have been packing for a long time. 512 cores with 2 operations per cycle for FP32 would need to run at insane clocks to achieve 8 TFLOPS... as in 7500+ MHz insane.

For reference, made on a similar, if not the same, 16nm process as advertised for Xavier, current nVidia tech needs 1920 cores running at 2000+ MHz for similar performance (overclocked GTX 1070).
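
The clock figure above can be checked with a quick back-of-the-envelope sketch in Python. It assumes peak FLOPS = cores * FLOPs per core per cycle * clock, with one fused multiply-add counted as 2 FLOPs. The function name `required_clock_mhz` is made up for illustration and does not come from any Nvidia tool.

```python
def required_clock_mhz(target_tflops, cores, flops_per_core_per_cycle=2):
    """Clock in MHz at which `cores` shader cores reach `target_tflops` peak.

    Assumes peak FLOPS = cores * flops_per_core_per_cycle * clock.
    """
    flops_per_cycle = cores * flops_per_core_per_cycle
    clock_hz = target_tflops * 1e12 / flops_per_cycle
    return clock_hz / 1e6


# 512 cores, 2 FLOPs per core per cycle, 8 TFLOPS target
print(required_clock_mhz(8, 512))    # 7812.5
# Same target with 1920 cores (the GTX 1070 core count)
print(required_clock_mhz(8, 1920))   # roughly 2083 MHz
```

These are theoretical peaks only, so they say nothing about sustained clocks or real game performance.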



HoloDust said:
Soundwave said:

Nvidia is saying one Tegra Xavier chip matches the Drive PX2, which is 2 Tegra X2s + 2 Pascal-based GPUs (so 4 processors total) at 8 TFLOPS.

Well, those Volta cores must be radically different from what GPUs have been packing for a long time. 512 cores with 2 operations per cycle for FP32 would need to run at insane clocks to achieve 8 TFLOPS... as in 7500+ MHz insane.

For reference, made on a similar, if not the same, 16nm process as advertised for Xavier, current nVidia tech needs 1920 cores running at 2000+ MHz for similar performance (overclocked GTX 1070).

I think it's 4 TFLOPS at FP32 (8 TFLOPS at FP16); I agree 8 TFLOPS would be insane.

Still, 4 TFLOPS from 20 watts would be very, very impressive. Nintendo could cut that in half again and still get 2 TFLOPS from a 10 watt chip. The current Switch runs at 15 watts docked.



HoloDust said:
Soundwave said:

Nvidia is saying one Tegra Xavier chip matches the Drive PX2, which is 2 Tegra X2s + 2 Pascal-based GPUs (so 4 processors total) at 8 TFLOPS.

Well, those Volta cores must be radically different from what GPUs have been packing for a long time. 512 cores with 2 operations per cycle for FP32 would need to run at insane clocks to achieve 8 TFLOPS... as in 7500+ MHz insane.

For reference, made on a similar, if not the same, 16nm process as advertised for Xavier, current nVidia tech needs 1920 cores running at 2000+ MHz for similar performance (overclocked GTX 1070).

There are 4 chips, remember: 2x Tegra SoCs with 256 CUDA cores each and two Pascal-powered GPUs in an MXM form factor.

http://www.anandtech.com/show/9903/nvidia-announces-drive-px-2-pascal-power-for-selfdriving-cars

The image nVidia used had two GeForce 980 MXM cards.

The GeForce 980M has 1536 CUDA cores. So that would mean 3072 CUDA cores for the discrete GPUs, then another 512 total with the two Tegra chips, for a total of 3584 CUDA cores.

Now, CUDA cores * operations per clock * clock rate = FLOPS.
3584 * 2 * 1125 MHz = 8.064 TFLOPS.

The overall package has a 250W TDP.

No way is a single Xavier chip matching that.
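
A minimal sketch of the arithmetic above, in Python. The core counts and the 1125 MHz clock are the figures from this post (based on the 980M cards in the mock-up image, so treat them as an estimate rather than official PX2 specs), and `peak_tflops` is a hypothetical helper name.

```python
def peak_tflops(cuda_cores, clock_mhz, ops_per_core_per_clock=2):
    """Theoretical peak in TFLOPS: cores * ops per clock * clock rate."""
    return cuda_cores * ops_per_core_per_clock * clock_mhz * 1e6 / 1e12


# Core counts as estimated in the post
discrete_gpu_cores = 2 * 1536   # two 980M-class MXM GPUs
tegra_cores = 2 * 256           # two Tegra SoCs
total_cores = discrete_gpu_cores + tegra_cores

print(total_cores)                      # 3584
print(peak_tflops(total_cores, 1125))   # 8.064
print(peak_tflops(512, 1125))           # 512 cores at the same clock: 1.152
```

At the same clock, a 512-core chip lands at roughly one seventh of the four-chip total, which is the gap the post is pointing at.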




www.youtube.com/@Pemalite

Soundwave, you are the most tiring and constantly unsatisfied Nintendo fan on VGC. Could you please just buy a PS4 and leave Nintendo behind?



X2 is perfect for Switch and should be included asap. But X3 architecture may be too different to ensure full backwards compatibility, so the X2 is the only way to go.



okr said:
Soundwave, you are the most tiring and constantly unsatisfied Nintendo fan on VGC. Could you please just buy a PS4 and leave Nintendo behind?

I have a PS4 (Pro) actually. And Xbox One S. And Switch. And Wii U. And 3DS (four different versions of it actually, launch black, Fire Emblem Awakening blue, Zelda LBW Gold and Zelda MM n3DS). And Vita. 

I enjoy them all. 

I have fun speculating about the industry and have been right in the past. I was talking about Tegra X1 two years ago here too.

If you don't want to talk about future tech, no one is forcing you to come into this thread. 



Soundwave said:
HoloDust said:

Well, those Volta cores must be radically different from what GPUs have been packing for a long time. 512 cores with 2 operations per cycle for FP32 would need to run at insane clocks to achieve 8 TFLOPS... as in 7500+ MHz insane.

For reference, made on a similar, if not the same, 16nm process as advertised for Xavier, current nVidia tech needs 1920 cores running at 2000+ MHz for similar performance (overclocked GTX 1070).

I think it's 4 TFLOPS at FP32 (8 TFLOPS at FP16); I agree 8 TFLOPS would be insane.

Still, 4 TFLOPS from 20 watts would be very, very impressive. Nintendo could cut that in half again and still get 2 TFLOPS from a 10 watt chip. The current Switch runs at 15 watts docked.

Nah, PX2 is rated at 8 TFLOPS FP32... so for 512 cores to pull that off they would need to run at 7800 MHz. Even if Volta's cores can do 2 fused multiply-adds per cycle instead of 1, that's still 3900 MHz... still insane.

But what's more, let's say Volta's cores are indeed quite different from previous GPUs. Currently you need 150+ W to achieve that sort of performance on 16nm. Even with the 12nm process TSMC is offering them for Volta, there's no chance you can get 8 TFLOPS out of a 20W SoC, let alone on the 16nm process Xavier is supposed to be built on.

Pemalite said:
HoloDust said:

Well, those Volta cores must be radically different from what GPUs have been packing for a long time. 512 cores with 2 operations per cycle for FP32 would need to run at insane clocks to achieve 8 TFLOPS... as in 7500+ MHz insane.

For reference, made on a similar, if not the same, 16nm process as advertised for Xavier, current nVidia tech needs 1920 cores running at 2000+ MHz for similar performance (overclocked GTX 1070).

There are 4 chips, remember: 2x Tegra SoCs with 256 CUDA cores each and two Pascal-powered GPUs in an MXM form factor.

http://www.anandtech.com/show/9903/nvidia-announces-drive-px-2-pascal-power-for-selfdriving-cars

The image nVidia used had two GeForce 980 MXM cards.

The GeForce 980M has 1536 CUDA cores. So that would mean 3072 CUDA cores for the discrete GPUs, then another 512 total with the two Tegra chips, for a total of 3584 CUDA cores.

Now, CUDA cores * operations per clock * clock rate = FLOPS.
3584 * 2 * 1125 MHz = 8.064 TFLOPS.

The overall package has a 250W TDP.

No way is a single Xavier chip matching that.

Yeah, that's what I've been trying to say all along. 20W for 8TFLOPS, all from 512 cores on 16nm (even on 12nm)...yeah, sure.

Now, I think the confusion comes from the 20 DLTOPS figure, which is measured for 8-bit integer operations, because that's the metric on which they said Xavier will match the PX2. For example, the Tesla P4 is rated at 22 DLTOPS, with 2560 cores that run at 1063 MHz boost, which is quite slow for a GP104 part, and it achieves 5.5 TFLOPS in 50-75W.

But I honestly don't see how even that can be reduced to 20W, even if it's 12nm TSMC (which I really doubt is true 12nm in the first place).

So while nVidia might pull off an SoC that can indeed deliver 20 DLTOPS at 20W, I really doubt its FLOPS rating would be anywhere near PX2.
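
To put rough numbers on that, here is a small Python sketch using only the figures quoted above (22 DLTOPS, 2560 cores at 1063 MHz and 50-75W for the Tesla P4; 20 DLTOPS at 20W claimed for Xavier). The variable names are just for illustration, and treating Xavier's DLTOPS as directly comparable to the P4's is an assumption.

```python
# Tesla P4 figures as quoted in the post
p4_cores = 2560
p4_boost_mhz = 1063
p4_int8_tops = 22.0
p4_power_watts = (50.0, 75.0)

# Peak FP32: cores * 2 FLOPs per cycle (one FMA) * clock
p4_fp32_tflops = p4_cores * 2 * p4_boost_mhz * 1e6 / 1e12

# INT8 operations delivered per FP32 FLOP on this part
int8_per_fp32 = p4_int8_tops / p4_fp32_tflops

# Efficiency in INT8 TOPS per watt at the low and high power configs
p4_tops_per_watt = [p4_int8_tops / w for w in p4_power_watts]

# Xavier claim: 20 DLTOPS at 20 W (assumed comparable to the P4 rating)
xavier_tops_per_watt = 20.0 / 20.0

# How much more efficient Xavier would have to be than the P4
improvement_needed = [xavier_tops_per_watt / e for e in p4_tops_per_watt]

print(round(p4_fp32_tflops, 2))                    # 5.44
print(round(int8_per_fp32, 2))                     # 4.04
print([round(x, 2) for x in improvement_needed])   # [2.27, 3.41]
```

So on these numbers the P4's DLTOPS rating is about 4x its FP32 rating, and a 20W part would need roughly 2.3x to 3.4x the P4's INT8 efficiency to hit 20 DLTOPS.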



Soundwave said:
HoloDust said:

I think it's 4 TFLOPS at FP32 (8 TFLOPS at FP16); I agree 8 TFLOPS would be insane.

Still, 4 TFLOPS from 20 watts would be very, very impressive. Nintendo could cut that in half again and still get 2 TFLOPS from a 10 watt chip. The current Switch runs at 15 watts docked.

Too impressive.

There's no way that Nvidia can do that.

Switch uses around 15-20 watts when it's docked and it's about 400 GFLOPS.

4 TFLOPS (at 20 watts) would need a 10x increase in performance per watt.

Nvidia doesn't have such a chip.

Nvidia won't have such a chip... for at least another 10 years or something like that.
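
Rough perf-per-watt math for the above, as a Python sketch. The 400 GFLOPS and 15-20 watt Switch figures are the approximate numbers from this post, not measurements, and `gflops_per_watt` is just an illustrative helper.

```python
def gflops_per_watt(gflops, watts):
    """Theoretical GFLOPS delivered per watt of power draw."""
    return gflops / watts


# Docked Switch, using the post's approximate figures
switch_at_20w = gflops_per_watt(400, 20)
switch_at_15w = gflops_per_watt(400, 15)

# Hypothetical 4 TFLOPS chip at 20 watts
target = gflops_per_watt(4000, 20)

print(switch_at_20w)                      # 20.0
print(target)                             # 200.0
print(target / switch_at_20w)             # 10.0
print(round(target / switch_at_15w, 1))   # 7.5
```

The 10x figure corresponds to the 20 watt end of the range. Taking 15 watts for the docked Switch, the gap is closer to 7.5x, which is still a very large jump.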

 

Every time Nvidia does some crazy-sounding PR, someone on a forum always runs away with it and blows it out of proportion.

It happens every time. Not sure why, when they so often fail to live up to their insane PR talk, especially with their mobile chips.




HoloDust said:

So while nVidia might pull off an SoC that can indeed deliver 20 DLTOPS at 20W, I really doubt its FLOPS rating would be anywhere near PX2.

Deep learning tera-operations per second (DLTOPS) is a theoretical number anyway. It's probably even more irrelevant to gaming performance than TFLOPS.

JRPGfan said:

Every time Nvidia does some crazy-sounding PR, someone on a forum always runs away with it and blows it out of proportion.

It happens every time. Not sure why, when they so often fail to live up to their insane PR talk, especially with their mobile chips.

Happens with AMD as well. I mean... everyone thought Ryzen was going to smash Intel out of the park.
The same happened with Bulldozer, everyone thought AMD was "returning" with their new "FX" chips.

If anyone had paid attention to the engineering samples and listened to experts, they would have known better.




www.youtube.com/@Pemalite

The next-gen Switch will most likely be on a process node smaller than 16/14nm. I expect a newer chip than Xavier in '21 and beyond.

Now the X2 chip (a power-efficient evolution of X1) seems perfect for HW refreshes. The question is, will it come sooner rather than later (like the original DS phat, phased out after 1 year) and become the standard Switch tech for the rest of the gen?

And wave, stop bragging about predictions lol... it was easy to see the direction Nintendo was going when they showed the Wii U GamePad's off-TV function.

"se7en7thre3 on 01 August 2013
....I think Ninty will be first out of the gate in terms of ditching the traditional tethered home console. They started with Wii U not necessarily dependent on the TV, I believe next will be a truly mobile/flexible solution.

Maybe call it the "N7", something like this http://www.youtube.com/watch?v=k3KCTOkHVhA"

http://gamrconnect.vgchartz.com/post.php?id=5557723