
A PS4 equivalent rig for $500.00?

No it is not possible.

 

Just buy a PS4, you know you want to.



errorpwns said:
Stefl1504 said:
If you can get your hands on some used parts (the case and stuff like that) you should be able to make a PC that comes close to the PS4 in power, but you will never get as much graphical prowess out of the hardware as a PS4 does.


False. You know absolutely jack about hardware then if you're wanting to make a rash claim like that. 

LET ME LIVE IN MY DELUSION THAT MY 570€ PC FROM LAST YEAR IS ON PAR WITH CURRENT GEN, SO THAT I DON'T NEED TO BE SAD ABOUT FUKKEN PS4 BEING SOLD OUT EVERYWHERE!



Zekkyou said:

The modders for skyrim have done some pretty crazy shit.

 

 

Edit: I should note those screens are rather overkill lol. To make an open world game look that good you need an awful lot of different mods.


They really have made some insane things.
There have been some insane graphics mods for Crysis and Grand Theft Auto IV too.

Here is what GTA IV could look like:

Crysis 1 realism:


Oblivion can get pretty impressive too, considering it was plagued with frame drops, low resolution blurry textures and pop-in on the consoles:



Can't wait for Star Citizen with its open mod support.

existenz2 said:

No it is not possible.

If you had spent any time reading the prior posts, you would already know you have been proven wrong.



--::{PC Gaming Master Race}::--

Pemalite said:

*snip*


I wonder if next-gen games will ever look that good.



Pemalite said:

The entire first part of your argument is that the CPU is powerful because the GPU can assist in processing.
Well, here is a news flash: the PC can do it too, in fact it's been doing it longer than the consoles, but that doesn't actually make the CPU powerful, it just means the CPU is weak and is getting assistance from another processor.
In the PC space they do it to conserve energy.

And another fact: RAM is a fast, temporary form of storage, it DOES NOT do any form of processing; you could have 1024GB/s of memory bandwidth, but if you don't have the compute resources to make any use of it... then it would be utterly and completely pointless.
Besides, the PS4's memory bandwidth has to be split between the CPU, GPU and whatever other processors the PS4 uses; that 176GB/s number? It's going to be much lower than that in actual games.
You basically took the number Sony advertised and ran with it, claiming it as the holy grail; the reality is far different than that, I'm afraid.
The same thing occurred with the Playstation 3 and the pretty horribly performing Cell.

As for PCI-E's bandwidth: PCI-E 3.0 16x especially isn't a problem, not even really for compute. There is a reason there is GDDR5 memory next to the GPU, so that the PCI-E interface isn't hammered hard and constantly, but that's simple logic.
Besides, the GDDR5 RAM in my PC is faster than the PS4's, and the PC will also move to GDDR6 and/or maybe XDR2 RAM soon, possibly within the next GPU release cycle.
You also have 512-bit memory configurations on the PC rather than the mid-range 256-bit bus the PS4 employs.

As for the memory wall, that has to be the funniest thing I have read all day.
So what you're saying is that you would see no difference in performance moving from a Core i3 (which is essentially how fast an 8-core Jaguar would be) to a 6-core, 12-thread Core i7 if you used the same RAM.
I have the ability to disable my cores and hyper-threads; want me to disable more of them to represent a dual-core? I can assure you the performance gains from having all functional units enabled are real and stupidly massive.

I also have a Phenom II X6 1090T 6-core processor in another PC; the motherboard supports DDR2 and DDR3 RAM, and the CPU has both a DDR2 and a DDR3 memory controller. Want to know what I discovered a few years back? There was ZERO, and I mean ZERO, difference in gaming performance on the PC between DDR2 800MHz RAM and DDR3 1866MHz RAM.

CPUs have caches to keep the data required for processing near the compute engines, which is both stupidly fast and low latency.
The CPU also utilises various types of predictors so that it can predict the data it is going to require ahead of time; this prevents a resource stall where, if the CPU doesn't have the data it needs in cache, it has to travel all the way down to system memory, and regardless of how fast or low latency the system memory is, it will NEVER make up for the bandwidth and latency differences between the L4/L3/L2/L1 caches and system memory.
Simple fact of the matter is, Jaguar's predictors are relatively simple; it's going to be making a lot of slow trips down to the GDDR5 RAM, potentially wasting millions/billions of cycles.

This may also hurt your pride in the Playstation 4 a bit, but... without the PC and the PC technology that PC gamers have essentially "funded" the research and development for... you wouldn't have the Playstation 4 at all, not as it is today.
You would more than likely have ended up with an ARM-based solution like the ones mobile phones use instead.


First, I did not say anywhere that RAM does any kind of calculation. Second, I don't seem to understand the PCI-E bandwidth problem and the massive performance hit that copying data from CPU to GPU creates. When you work with, let's say, CUDA, sometimes you skip doing some work that would be 200x faster on the GPU just because the time to pass the data over and get it back would take longer than doing it slowly on the CPU. The point here is balanced performance, splitting the workload between the two. A unified memory architecture will allow the PS4 to do physics calculations that won't be in PC versions, simply because it can just grab the data and calculate on the GPU without data-passing costs. You have data, it's in the CPU's memory space (DDR3), then you must pass it to the GPU memory space (GDDR5). Now try it over PCI-E and see if it looks fast for anything. I can already guarantee you, if the data is small, it doesn't matter if the GPU does the operation 300x faster, you will have to use the CPU. The biggest proof of this problem is looking at what NVidia is doing with their GPU tech for clusters. Their biggest problem was that a node, when splitting work in a CUDA program, had to load data on the CPU, send it to the GPU, get it back on the CPU and then send it via ethernet/infiniband to all the other nodes, which would do the same. Now they have the ability to access the network interface directly from the GPU, so a GPU can actually send data to another GPU without passing it through the CPU.
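
A minimal CUDA sketch of the trade-off being described here (the kernel, buffer size and setup are invented for illustration, not taken from any real game): time the two PCI-E copies separately from the kernel and see which dominates for a small buffer.

```
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void scale(float *d, int n, float k) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= k;                        // trivially cheap per-element work
}

int main() {
    const int n = 1 << 16;                       // deliberately small dataset
    const size_t bytes = n * sizeof(float);
    float *h = (float *)malloc(bytes), *d = nullptr;
    for (int i = 0; i < n; ++i) h[i] = 1.0f;
    cudaMalloc(&d, bytes);

    cudaEvent_t t0, t1, t2, t3;
    cudaEventCreate(&t0); cudaEventCreate(&t1);
    cudaEventCreate(&t2); cudaEventCreate(&t3);

    cudaEventRecord(t0);
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);    // CPU -> GPU over PCI-E
    cudaEventRecord(t1);
    scale<<<(n + 255) / 256, 256>>>(d, n, 2.0f);        // the actual GPU work
    cudaEventRecord(t2);
    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);    // GPU -> CPU over PCI-E
    cudaEventRecord(t3);
    cudaEventSynchronize(t3);

    float up = 0, run = 0, down = 0;
    cudaEventElapsedTime(&up, t0, t1);
    cudaEventElapsedTime(&run, t1, t2);
    cudaEventElapsedTime(&down, t2, t3);
    printf("H2D %.3f ms | kernel %.3f ms | D2H %.3f ms\n", up, run, down);
    // For a buffer this small the two copies usually dwarf the kernel itself,
    // which is why some jobs stay on the CPU even when the GPU math is faster.

    cudaFree(d);
    free(h);
    return 0;
}
```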

Second, the "memory wall". If you don't believe it, research it on Google or ask an HPC specialist. It is the single reason that supercomputing migrated from the traditional supercomputer PRAM model to the distributed model we see in clusters, and it's one of the primary reasons for cloud computing: more machines, more memory bandwidth. On old supercomputers (current ones are actually clusters) you increased the core count and tried your best to find a better memory technology, but that race was lost long ago. Cache is there to help, not to solve; it all depends on whether the amount of data you need fits in it or not. If it won't, you will have to refill it anyway, and having more cores only makes it worse. And cores only help if the code you are running is optimized to use them. A lot of games are still 32-bit executables, just to show how little they worry about optimization right now (a 3GB RAM limit for the executable).
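
A rough host-side sketch of the memory-wall effect (plain C++, arbitrary sizes and thread counts; it assumes an ordinary desktop memory subsystem, not any specific machine): a streaming sum is bandwidth-bound, so the speed-up from adding threads flattens out long before the thread count stops doubling.

```
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

int main() {
    const size_t n = 1u << 27;                   // ~512 MB of floats, far beyond any cache
    std::vector<float> data(n, 1.0f);

    for (int threads = 1; threads <= 8; threads *= 2) {
        std::vector<double> partial(threads, 0.0);
        const auto start = std::chrono::steady_clock::now();

        std::vector<std::thread> pool;
        for (int t = 0; t < threads; ++t)
            pool.emplace_back([&, t] {
                const size_t chunk = n / threads;
                const size_t begin = t * chunk;
                const size_t end = (t == threads - 1) ? n : begin + chunk;
                double s = 0.0;
                for (size_t i = begin; i < end; ++i) s += data[i];   // one memory load per add
                partial[t] = s;
            });
        for (auto &th : pool) th.join();

        double total = 0.0;
        for (double s : partial) total += s;
        const double ms = std::chrono::duration<double, std::milli>(
                              std::chrono::steady_clock::now() - start).count();
        printf("%d thread(s): %7.1f ms (sum %.0f)\n", threads, ms, total);
        // The arithmetic scales with the core count; the shared memory bus does not,
        // so the time stops halving well before the thread count stops doubling.
    }
    return 0;
}
```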

Now, on PS4, you know what you have under the hood. The developer knows exactly what is there. You can even predict and interfere with the way things will sit in cache to gain performance. You will have games using heavy parallel processing across 8 cores plus offloading some physics to the GPU (SPH is a good example). And believe me, heavy parallel processing is something they aren't doing in PC games yet. The GPU part is even worse: it's much harder to write code that runs calculations well on any GPU than on a specific one, simply because the compilers suck (I mean the CUDA tools and OpenCL tools). For the CPU part, you have amazing compilers (partially thanks to Intel's work in the 70s and 80s with mathematicians). And besides that, it is currently a mess: NVidia and AMD can't create a common GPU API/framework for doing calculations, so we have to watch OpenCL on AMD GPUs and CUDA on NVidia GPUs (they work with OpenCL too, but NVidia only creates tools for CUDA, and those are far better; current dev tools for AMD GPUs simply suck terribly compared to CUDA). And that will make one hell of a difference.



prayformojo said:

I've been a console gamer since the early 80s, when I was just a little kid, so I don't know ANYTHING about PC gaming. I have a Steam account with a few indie titles and free games, but that's all I can play because it's running on a store-bought laptop that's a few years old.

I am really interested in going PC this generation, but I only have about $500.00 to spend on a rig. I wanna know if it's possible to build something that can game at a stable 1080p/60fps... and that can run AAA titles at least CLOSE to PS4 quality.

Does anybody think it can be done?


For US$500 you won't get a PS4 equivalent. Currently, no PC is equivalent to it because of its memory architecture and all the optimizations games will have specifically for it (PC games are mostly badly optimized). What I do now is keep one console (a PS3, and next a PS4) plus a gaming PC, since you can get a lot of multiplats there for good prices. Get some good parts and focus on a good motherboard that can last a few years (so you can upgrade other parts without worrying about the MB).

I like MSI and Gigabyte MBs; the Gigabyte 990FXA-UD5 is a good one (MSI has a model with the same chipset that I prefer). The CPU could be one of AMD's newer ones, chosen according to your budget. Put in 16GB of DDR3 RAM (Corsair). For the HDD, buy only from Western Digital (the Caviar Black, it's better than the Green and Blue models). The power supply could be from Corsair too; the model will depend on the parts you choose. I built mine in a HAF 912 Plus from CoolerMaster, which is decently big and easy to build in. Make sure to choose the case and the MB well, because changing either of them is a PITA since you have to remove everything and put it back in again.

The rest of the money you put entirely into the GPU, I mean all of it, since that will be the part limiting your gaming performance. I can't recommend AMD GPUs because I always buy NVidia (I don't like AMD's video drivers; NVidia in general has better dev tools and a better-engineered video driver, plus I need CUDA). I have a GTX 650 Ti and it runs everything pretty well, 1080p at 60fps, not at ultra but always on high settings (Metro: Last Light, AC4, Lost Planet 3, RE6, Hitman: Absolution, Sleeping Dogs, just to name a few). Once again, get an MSI graphics card, I highly recommend them.

And buy an X360 gamepad; it sucks to play some game genres on a keyboard (fighting, platformers, etc.), and it's cheap and works well (I use 2 wired X360 gamepads because I didn't want the hassle of batteries or recharging battery packs, and I sit less than 2m from my PC when gaming. Besides that, the X360 wired controllers have a long, looooong cord).



I'm going to break all this down as it will be easier.

torok said:

Second, I don't seem to understand the PCI-E bandwidth problem and the massive performance hit that copying data from CPU to GPU creates. When you work with, let's say, CUDA, sometimes you skip doing some work that would be 200x faster on the GPU just because the time to pass the data over and get it back would take longer than doing it slowly on the CPU.

Glad you cleared up the point of you not understanding the PCI-E bandwidth problem.

Some datasets aren't terribly bandwidth or latency sensitive, some are; the ones that are reside next to the GPU's GDDR5 RAM, and these range from textures to geometry data for the tessellators to sheer number-crunching compute.

torok said:

A unified memory architecture will allow the PS4 to do physics calculations that won't be in PC versions, simply because it can just grab the data and calculate on the GPU without data-passing costs. You have data, it's in the CPU's memory space (DDR3), then you must pass it to the GPU memory space (GDDR5). Now try it over PCI-E and see if it looks fast for anything. I can already guarantee you, if the data is small, it doesn't matter if the GPU does the operation 300x faster, you will have to use the CPU.

Well, the proof is in the pudding in regards to Physics calculations.
If you take the Unreal Engine 4 tech demo, the PC was able to handle far more particle physics calculations than the Playstation 4. Physics isn't terribly bandwidth sensitive; it can be done completely on a PC's GPU and reside next to the GPU's GDDR5 memory.
Heck, Ageia's first physics card didn't even use PCI Express and only had 128MB of GDDR3 RAM on a 128-bit interface, which is a testament to how lean physics really is on bandwidth requirements.
Asus eventually released a PCI-E variant, but that was still only a PCI-E x1 link.

The limiting factor with physics is and has been compute resources; it can be stupidly parallel, hence why GPUs are well suited to that sort of processing.
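
A minimal CUDA sketch of that point, with a made-up Euler integrator: the particle buffer is uploaded once, and every simulation step afterwards runs entirely out of the GPU's own memory, so the PCI-E bus is barely touched.

```
#include <cstdio>
#include <cuda_runtime.h>

struct Particle { float3 pos, vel; };

__global__ void step(Particle *p, int n, float dt) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    p[i].vel.y -= 9.81f * dt;            // gravity
    p[i].pos.x += p[i].vel.x * dt;       // simple Euler integration, one thread per particle
    p[i].pos.y += p[i].vel.y * dt;
    p[i].pos.z += p[i].vel.z * dt;
}

int main() {
    const int n = 1 << 20;                                            // ~1M particles
    Particle *h = new Particle[n]();                                  // zero-initialised on the host
    Particle *d = nullptr;
    cudaMalloc(&d, n * sizeof(Particle));
    cudaMemcpy(d, h, n * sizeof(Particle), cudaMemcpyHostToDevice);   // upload once

    for (int frame = 0; frame < 600; ++frame)                         // 10 seconds at 60 steps/s
        step<<<(n + 255) / 256, 256>>>(d, n, 1.0f / 60.0f);           // no PCI-E traffic per step

    cudaMemcpy(h, d, n * sizeof(Particle), cudaMemcpyDeviceToHost);   // read back once
    printf("particle 0 fell to y = %.1f\n", h[0].pos.y);
    cudaFree(d);
    delete[] h;
    return 0;
}
```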

torok said:

Second, the "memory wall". If you don't believe it, research it on Google or ask an HPC specialist. It is the single reason that supercomputing migrated from the traditional supercomputer PRAM model to the distributed model we see in clusters, and it's one of the primary reasons for cloud computing: more machines, more memory bandwidth. On old supercomputers (current ones are actually clusters) you increased the core count and tried your best to find a better memory technology, but that race was lost long ago.

You're talking about servers and supercomputers; that's your problem, consoles and the desktop PC are far removed from that.
I still stand by the claim that we are far from a memory wall in the PC space. I have yet to encounter such a thing, and I have one of the fastest consumer CPUs money can buy; I saw gains taking my 6 cores/12 threads from 3.2GHz to 4.8GHz, and I saw gains moving from AMD's 6-core Phenom II X6 to an AMD FX 8-core 8120 @ 4.8GHz, all with the same DRAM speeds.
Granted, my Core i7 has quad-channel DDR3 which will help somewhat in some scenarios, but even with half the bandwidth the differences were negligible; Intel's predictors are fantastic.

Plus, next year with Haswell-E we will have DDR4.


torok said:

Cache is there to help, not to solve; it all depends on whether the amount of data you need fits in it or not. If it won't, you will have to refill it anyway, and having more cores only makes it worse. And cores only help if the code you are running is optimized to use them. A lot of games are still 32-bit executables, just to show how little they worry about optimization right now (a 3GB RAM limit for the executable).


Cache helps to solve the bandwidth and latency deficit of the CPU grabbing data from system memory.
An L4 cache like the one in Intel's Iris Pro is a lot more flexible in that regard, as it has 128MB of the stuff.
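
A rough host-side sketch (plain C++, arbitrary sizes) of why the cache and predictors matter so much: chase pointers through a randomly shuffled ring, where the prefetcher can't guess the next address, versus a sequential one, where it can.

```
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <numeric>
#include <random>
#include <vector>

// Walk a ring of indices; every load depends on the previous one, so the CPU
// cannot hide the memory latency behind other work.
static double ns_per_hop(const std::vector<size_t> &next, size_t hops) {
    const auto start = std::chrono::steady_clock::now();
    size_t i = 0;
    for (size_t h = 0; h < hops; ++h) i = next[i];
    volatile size_t sink = i;                    // keep the loop from being optimised away
    (void)sink;
    const double ns = std::chrono::duration<double, std::nano>(
                          std::chrono::steady_clock::now() - start).count();
    return ns / hops;
}

int main() {
    const size_t n = 1u << 24;                   // 128 MB of indices, far past the L3 cache

    std::vector<size_t> seq(n), rnd(n);
    for (size_t i = 0; i < n; ++i) seq[i] = (i + 1) % n;     // predictable ring: i -> i+1

    std::vector<size_t> perm(n);
    std::iota(perm.begin(), perm.end(), 0);
    std::shuffle(perm.begin(), perm.end(), std::mt19937_64{42});
    for (size_t i = 0; i < n; ++i)
        rnd[perm[i]] = perm[(i + 1) % n];        // the same ring, visited in random order

    const size_t hops = 1u << 23;
    printf("sequential walk: %5.1f ns/hop\n", ns_per_hop(seq, hops));   // prefetcher-friendly
    printf("random walk:     %5.1f ns/hop\n", ns_per_hop(rnd, hops));   // a DRAM trip per hop
    return 0;
}
```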

As for the 3GB memory limit, that's partly true, but most games are starting to support 64-bit now anyway; heck, Far Cry did back when the Playstation 2 was in its prime.

torok said:

Now, on PS4, you know what you have under the hood. The developer knows exactly. You can even predict and interfere in the way things will be on cache to gain performance. You will have games using heavy parallel processing in 8-cores plus offloading some physics to GPU (SPH it's a good example).



I agree with the first part.
But you are completely wrong on the second.

Here it is in bold so that it sinks in...
The Playstation 4 and Xbox One do NOT have 8 cores dedicated to gaming, they reserve a core or two for the OS and other tasks.

A PC can and does offload game rendering and physics calculations to GPUs, no questions asked, at 4K or greater resolutions with all the settings on max, something the Playstation 4 could never hope to achieve.
Here is why: the Playstation 4 doesn't have enough compute resources.

However, with that in mind, I am willing to eat my hat if you can get a Playstation 4 to play Battlefield 4 in Eyefinity at 7680x1440 with everything on Ultra at 60fps. If you really think it's the be-all and end-all of platforms, it certainly could do that, right? (Currently it's 900p with High settings. GOOD LUCK!)

torok said:

 

The GPU part is even worse: it's much harder to write code that runs calculations well on any GPU than on a specific one, simply because the compilers suck (I mean the CUDA tools and OpenCL tools). For the CPU part, you have amazing compilers (partially thanks to Intel's work in the 70s and 80s with mathematicians). And besides that, it is currently a mess: NVidia and AMD can't create a common GPU API/framework for doing calculations, so we have to watch OpenCL on AMD GPUs and CUDA on NVidia GPUs (they work with OpenCL too, but NVidia only creates tools for CUDA, and those are far better; current dev tools for AMD GPUs simply suck terribly compared to CUDA). And that will make one hell of a difference.

There are alternatives other than CUDA (which is locked to nVidia) and OpenCL. This is the PC; you can make your own if you feel so inclined.
Mantle is coming and it will be a game changer.
Whatever optimisations multi-platform developers make for the consoles are going to translate into real gains for AMD's Graphics Core Next GPUs on the PC.

torok said:
I like MSI and Gigabyte MBs; the Gigabyte 990FXA-UD5 is a good one

Don't get the Gigabyte 990FXA-UD5, it's a horrible motherboard with bad voltage regulation and vdroop. - Personal experience with over a dozen boards.
The UD7 wasn't much better, and the UD3 is downright crap because of the lack of MOSFET and VRM cooling.

And AM3+ is a bad investment anyway; Socket FM2+ is the platform to go for, as that's AMD's focus, with Steamroller coming to that socket, whilst AM3+ isn't getting any updates until at least 2015.
In fact, I can't recommend Intel enough, because AMD has woeful single-threaded performance and its chips are extremely power hungry.



--::{PC Gaming Master Race}::--

A $150 CPU easily outperforms the PS4 CPU (i5 3xxx)
A $220 GPU is on par with the PS4 GPU (660 Ti OC 3GB)
An $80 8GB DDR3 kit plus the GPU's 3GB of GDDR5 easily outperforms the PS4's 8GB of GDDR5
A $120 motherboard (I got an Asus PZ-whatever gen3)
The other stuff you can get dirt cheap or for free

That's less than $600, and I assure you that you can play every game out there at 1080p/50fps from now until a couple of years from now.



Pemalite said:

*snip*



Man, that is a lot of text. Let's go point by point:

You are not understanding the problems with GPU-intensive processing and where the bottleneck is. Currently the single massive bottleneck is passing data from CPU to GPU, and all current uses of GPU calculation are the ones that aren't affected by this problem. As you said, "Some datasets aren't terribly bandwidth or latency sensitive, some are". That sums up the whole point of the discussion.

Don't assume that all massively parallel operations are easy to run on a GPU. Conditional statements or recursive algorithms destroy GPU calculation performance, and it's not easy to remove these problems. So we usually deal with more complex algorithms on the GPU, and having to worry about distributing your data set is still far from a nice experience. That even holds for physics, SPH being a good example of the problems with CPU-GPU data transfer (http://chihara.naist.jp/people/2003/takasi-a/research/short_paper.pdf, though newer results from NVidia are actually looking good now).
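
A small CUDA sketch of the divergence problem (the kernels and the "heavy" path are invented for illustration): both kernels do the same total amount of work, but in the first one neighbouring threads take different branches, so every warp has to execute both paths.

```
#include <cstdio>
#include <cuda_runtime.h>

__device__ float heavy(float x) {                 // stand-in for an expensive branch
    for (int k = 0; k < 200; ++k) x = x * 1.000001f + 0.5f;
    return x;
}

// Neighbouring threads take different branches, so every warp executes BOTH paths.
__global__ void divergent(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (i % 2 == 0) out[i] = heavy(in[i]);
    else            out[i] = in[i] * 2.0f;
}

// Same total work, but whole warps take the same branch, so there is no divergence.
__global__ void uniform(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (i < n / 2) out[i] = heavy(in[i]);
    else           out[i] = in[i] * 2.0f;
}

int main() {
    const int n = 1 << 22;
    float *in = nullptr, *out = nullptr;
    cudaMalloc(&in, n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));
    cudaMemset(in, 0, n * sizeof(float));

    cudaEvent_t a, b, c;
    cudaEventCreate(&a); cudaEventCreate(&b); cudaEventCreate(&c);
    cudaEventRecord(a);
    divergent<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaEventRecord(b);
    uniform<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaEventRecord(c);
    cudaEventSynchronize(c);

    float td = 0, tu = 0;
    cudaEventElapsedTime(&td, a, b);
    cudaEventElapsedTime(&tu, b, c);
    printf("divergent: %.3f ms, uniform: %.3f ms\n", td, tu);
    cudaFree(in); cudaFree(out);
    return 0;
}
```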

Don't believe in the memory wall problem if you prefer, even though it is basically accepted as fact in the whole parallel/massively parallel computing community, and that's exactly what we are dealing with at 8 or 16 cores. This link, http://storagemojo.com/2008/12/08/many-cores-hit-the-memory-wall/, is pretty good and makes some nice points, including cases of 16-core processors losing to 8-core ones on some operations. There is a paper from John von Neumann pointing out the problem, and that was in 1945. Is von Neumann wrong about it? Not very likely. You point to traditional desktop use, where normally the CPU isn't being heavily taxed, and when it is, the answer is normally Turbo Boost, which disables cores to raise the clock of others and thus avoids the memory wall problem. I'm talking here about games using all of the cores for intensive operations, and that will hit the bottleneck faster than anything.

HPC is a good source of information about what will happen next, simply because desktops today use an approach similar to 80s supercomputers (many cores, PRAM). After that, HPC migrated to distributed systems, and that's what's next for regular computing. It's so similar that even GPUs aren't anything new: the vector computers NASA used for space simulations (and others used for nuclear experiment simulations) are basically what a GPU is, and they existed in the 70s. GPUs are just that tech reused for rasterization in the 80s which, by pure luck, turned out to be pretty good for graphics calculations because of their nature.

And of course a PS4 can't do "Battlefield 4 in Eyefinity at 7680x1440 with everything on Ultra and achieve 60fps", since it doesn't have the required raw power to rasterize all those pixels. More GPUs? Good luck splitting work between them without passing data around. But the PS4 will far excel at physics calculations, using both GPU and CPU to share the workload. And even at 1080p, it will look way better. And forget BF4 for now, since it's an unoptimized launch game and probably just a port of the PC version to grab money from people.

And, last: "There are alternatives other than CUDA (which is locked to nVidia) and OpenCL. This is the PC; you can make your own if you feel so inclined. Mantle is coming and it will be a game changer. Whatever optimisations multi-platform developers make for the consoles are going to translate into real gains for AMD's Graphics Core Next GPUs on the PC."

No, there isn't any good, real alternative to CUDA. Even CUDA currently sucks. We don't need more alternatives, we need a unified one that runs well on all GPUs and has good developer tools; all the decent ones are NVidia-only, and AMD needs to up their game here. As for Mantle, it's largely PR talk, and coming from AMD, which has a terrible track record in software tools, that's even worse. The GPUs out there are totally different beasts, and it's not easy to optimize code for all of them. Of course it will bring some improvement, but it will be far from a game changer; if it were that easy, NVidia would already have it. In the research world, AMD basically never, and I mean never, brings anything new to the table. NVidia has brought a lot of massive tech over the years: CUDA is currently the king of GPU computing, and OptiX is close to bringing us real-time raytracing. That last one is THE game changer for the next decade of graphics processing.



torok said:

You are not understanding the problems with GPU-intensive processing and where the bottleneck is. Currently the single massive bottleneck is passing data from CPU to GPU, and all current uses of GPU calculation are the ones that aren't affected by this problem. As you said, "Some datasets aren't terribly bandwidth or latency sensitive, some are". That sums up the whole point of the discussion.


You're missing the point completely. There is no "bottleneck"; it's a fallacy dreamed up by console gamers who believe the advertising and hype from their respective companies.
There is a reason why PCs use DDR3 RAM as system RAM while GPUs use GDDR5.

System RAM is typically around 20%, give or take, lower latency (in overall clock cycles) than GDDR5 memory; this helps massively when there is a stall on the CPU.
GPUs however want bandwidth above all else, latency be damned; GDDR5 is perfect for this.

Grab some DDR3 1600MHz memory: that's an 800MHz IO clock with a typical CAS latency of 8, which means it has a latency of 10ns.
Grab some DDR2 800MHz memory: that's a 400MHz IO clock with a typical CAS latency of 4, which is also 10ns.

Now with GDDR5 the data rate is 4x the IO clock instead of 2x, i.e. 5GHz GDDR5 is 1.25GHz x4, and it would have a CAS latency of 15.
15 / 1.25GHz = 12ns

So the latency of GDDR5 is 20% higher than DDR3's. That's a big difference when the CPU doesn't have the data it requires in its caches and the predictors weren't able to fetch the data required for processing ahead of time; we are talking millions/billions of compute cycles here essentially going to waste.
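
The same arithmetic, written out as a tiny sketch (the CAS and clock figures are the ones quoted above, not measurements):

```
#include <cstdio>

int main() {
    // Absolute latency = CAS cycles / IO clock; figures as quoted in the post above.
    struct { const char *part; double io_mhz, cas; } parts[] = {
        {"DDR3-1600, CL8    ",  800.0,  8.0},     //  8 /  800 MHz = 10 ns
        {"DDR2-800,  CL4    ",  400.0,  4.0},     //  4 /  400 MHz = 10 ns
        {"GDDR5 5Gbps, CL15 ", 1250.0, 15.0},     // 15 / 1250 MHz = 12 ns
    };
    for (const auto &p : parts)
        printf("%s -> %.0f ns\n", p.part, p.cas / p.io_mhz * 1000.0);
    return 0;
}
```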

torok said:

Don't assume that all massively parallel operations are easy to run on a GPU. Conditional statements or recursive algorithms destroy GPU calculation performance, and it's not easy to remove these problems. So we usually deal with more complex algorithms on the GPU, and having to worry about distributing your data set is still far from a nice experience. That even holds for physics, SPH being a good example of the problems with CPU-GPU data transfer (http://chihara.naist.jp/people/2003/takasi-a/research/short_paper.pdf, though newer results from NVidia are actually looking good now).

That was done on a Geforce FX.
For one, those cards weren't PCI-E to begin with (there is a big difference between those interconnects, and not just in relation to the total GB/s either!).
Secondly... they only used a single-core processor, and a crap one at that.
Even the Geforce 6800 and 7900 used a bridge chip to enable PCI-E compatibility, with the downside of only maxing out at AGP 8x speeds, which is many times slower than what we have today.
GPUs are also far more flexible today; the Geforce FX, 6 and 7 series weren't even designed with dedicated compute tasks in mind. I should know, I helped write some shaders for Oblivion and Fallout to achieve better performance on the FX, Geforce 3 and 4 cards.


torok said:

Don't believe in the memory wall problem if you prefer, even though it is basically accepted as fact in the whole parallel/massively parallel computing community, and that's exactly what we are dealing with at 8 or 16 cores. This link, http://storagemojo.com/2008/12/08/many-cores-hit-the-memory-wall/, is pretty good and makes some nice points, including cases of 16-core processors losing to 8-core ones on some operations. There is a paper from John von Neumann pointing out the problem, and that was in 1945. Is von Neumann wrong about it? Not very likely. You point to traditional desktop use, where normally the CPU isn't being heavily taxed, and when it is, the answer is normally Turbo Boost, which disables cores to raise the clock of others and thus avoids the memory wall problem. I'm talking here about games using all of the cores for intensive operations, and that will hit the bottleneck faster than anything.


You keep parroting that, but that's not what I am seeing on my end on a consumer processor (and I reiterate, it's essentially the fastest consumer-grade CPU money can buy).
I can tax all my cores with something like Folding@home, Bitcoin mining or SETI, and I see significant gains when enabling more cores; there is no wall.
I also don't have turbo enabled; all 6 cores and 12 threads run at 4.8GHz as the nominal clock, with allowances to clock lower when not being utilised, to conserve power.


torok said:


And of course a PS4 can't do "Battlefield 4 in Eyefinity at 7680x1440 with everything on Ultra and achieve 60fps", since it doesn't have the required raw power to rasterize all those pixels. More GPUs? Good luck splitting work between them without passing data around. But the PS4 will far excel at physics calculations, using both GPU and CPU to share the workload. And even at 1080p, it will look way better. And forget BF4 for now, since it's an unoptimized launch game and probably just a port of the PC version to grab money from people.

Confirmed: PS4 is inferior to my PC.
One of the GPUs in my PC is faster than two PS4 GPUs plus a single 8-core Jaguar CPU, and I have three of those GPUs.
I could dedicate two of those GPUs to physics if a developer allowed me to do so; that's almost 9 Teraflops right there. The PS4 would literally scream "I'm a teapot!" in an attempt to process that much data.
And next month my rebuild will be done and I will have four Radeon R9 290Xs under water, which would put me at about 12x the power of a PS4 from my GPUs alone, plus twice the amount of GDDR5 (which will also be faster) and 8x the system memory.

I also don't have to split work up.
AMD's drivers are actually incredibly good at handling that task all by themselves, even when I give them a generic compute job.


torok said:

 


No, there isn't any good, real alternative to CUDA. Even CUDA currently sucks. We don't need more alternatives, we need a unified one that runs well on all GPUs and has good developer tools; all the decent ones are NVidia-only, and AMD needs to up their game here. As for Mantle, it's largely PR talk, and coming from AMD, which has a terrible track record in software tools, that's even worse. The GPUs out there are totally different beasts, and it's not easy to optimize code for all of them. Of course it will bring some improvement, but it will be far from a game changer; if it were that easy, NVidia would already have it. In the research world, AMD basically never, and I mean never, brings anything new to the table. NVidia has brought a lot of massive tech over the years: CUDA is currently the king of GPU computing, and OptiX is close to bringing us real-time raytracing. That last one is THE game changer for the next decade of graphics processing.

Of course a developer would not bother wasting their time optimising their game for every single GPU on the market; it would be asinine to even suggest such an endeavour. However...
The reality is that nVidia and AMD have very similar feature sets when abstracted; they go about implementing specific features very differently, but the end result is the same. AMD and nVidia then build an entire range of GPUs from that architectural feature set, which is identical across the board minus varying amounts of memory, compute engines and other such things.

Then you have the API. There are several types of APIs, such as high-level and low-level APIs, and it's the same for consoles too.
Low-level APIs are closer to the metal and incredibly efficient, however they are also harder to build a game for.
A high-level API is very easy to make a game for, but you (obviously) sacrifice speed.
Both interface with a driver, and the driver interfaces with the hardware.

Historically the PC has only had high-level APIs since 3dfx's GLIDE API; consoles have a choice of both.
Battlefield 4, for instance, uses a high-level API on the Playstation 4, hence why it does not run at a full 1080p with Ultra settings.

On the PC however, AMD has reintroduced the low-level API in the form of Mantle, which initially is only going to be for its Graphics Core Next architecture; of course it's open source, so nVidia can adapt its drivers to it too.

To put it in perspective, nVidia's and AMD's drivers have more lines of code than even the Windows kernel; they're incredibly complex pieces of software in all respects, in order to squeeze out maximum performance and image quality whilst retaining complete backwards compatibility with decades' worth of software and games.

Mantle will also reduce draw call overhead; AMD stated that even with an AMD FX underclocked to 2GHz, the Radeon R9 290X is still GPU bound.
Draw calls account for a stupidly massive amount of a game's CPU usage today.

Also, real-time ray tracing isn't going to be here for a long, long time, maybe in 3-4 console generations; heck, movies like Lord of the Rings and Finding Nemo use a scanline renderer with photon mapping, not ray tracing.

A mix of technologies is the best way to go about it.



--::{PC Gaming Master Race}::--