
Forums - Gaming Discussion - Shinen is using triple buffering for the gbuffer on Fast Racing Neo, bandwidth is not a problem

Pemalite said:


I personally prefer a trapezoidal implementation of anisotropic filtering.
Consoles typically use a variation of bilinear or trilinear filtering because it's cheap.

As for texture resolution? More is always better; 16k is what the PC is heading towards, with some texture mods looking bloody fantastic with it.
However, even if you only have moderate 4k resolution textures, with decent filtering they can look better than higher resolution textures.
Conversely, some games will "sprout" 16k textures but only use them sparingly, like on the terrain, leaving other objects/surfaces at a lower resolution (Rage, anyone?).

However, keep in mind that I run my games higher than 1080P; I can see the flaws in games more readily than at the two-decade-old, mobile-phone-grade 1080P resolution, so better textures and better filtering are a must.

Hopefully, thanks to the PlayStation 4 and mass production, the cost of high-density GDDR5 drops substantially, so GPUs can blow out their VRAM counts and the industry can start to look towards 32k textures.

Anisotropic filtering is actually becoming more common in games ... 

A 16K texture is practically over 2GB! It's useless on everything but large surfaces. 

If you're hoping for a large cost reduction, then prepare to be disappointed, since we have yet to figure out all of the large concerns with 13.5nm extreme ultraviolet lithography ... 
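
As a sanity check on those size figures, here is a quick back-of-envelope sketch of my own (not from the thread; the exact numbers depend on the texel format and whether a mip chain is counted):

```python
# Back-of-envelope texture storage math (my own sketch, not from the thread).
# Size scales with resolution squared and bytes per texel; a full mipmap
# chain adds roughly one third on top of the base level.

def texture_bytes(side, bytes_per_texel, mipmaps=True):
    """Approximate storage for a square texture of side x side texels."""
    base = side * side * bytes_per_texel
    return base * 4 // 3 if mipmaps else base

GiB = 1024 ** 3
# A 16K RGBA8 texture is 1 GiB raw, ~1.33 GiB with mips:
print(texture_bytes(16384, 4, mipmaps=False) / GiB)  # 1.0
# At 8 bytes per texel (e.g. RGBA16), the base level alone hits 2 GiB,
# which is roughly where an "over 2GB" figure for a 16K texture lands:
print(texture_bytes(16384, 8, mipmaps=False) / GiB)  # 2.0
```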



maxima64 said:
fatslob-:O said:
megafenix said:


Haha, don't worry, I've had the same feelings, and even today I'm still learning. But as I mentioned, I'm still speculating here, so let's not dig too much into the topic. I will try to ask Shin'en themselves, just to see if I'm lucky and get an answer; as long as I don't have one, maybe I should avoid the topic for a while.

If I were Curl, I'd be very offended by now ... 


I feel the same about you; the way you treat people is denigrating, and it's even worse when they put the proof right under your nose and you still don't admit your mistakes.

It's OK. Reading my answer again, I think it can be misunderstood that way, and for that I give my apologies to curl-6. Not to mention that she and fatslob were right about the triple framebuffers (I misunderstood Shin'en as meaning triple g-buffers), so yes, I can learn from my mistakes like many others can (well, except for some here who won't admit things like deferred rendering having better performance than forward, even though I demonstrated it with good sources, or that AMD indeed had tile-based GPUs).

 

to Pemalite

Yeah, as I told you, I already knew about single-pass multitexturing since the GameCube era; the GameCube was capable of 8 textures in a single pass, get it?

Single pass and 8 textures (if we combine single pass and multi-texturing, we obviously get single-pass multitexturing, much like the 6-textures-per-pass example from ATI SmartShader). Sorry for not bothering to put the name of the technique; I thought it was implicit, and the ATI SmartShader was just an example of why single pass is better than multipass. What matters is the concept of getting the work done while using the pipeline as little as possible, not the technology from years ago. Also, as I mentioned, even though deferred rendering doesn't do things in a single pass, it still needs far fewer passes than forward rendering would: forward requires a pass per light, while deferred needs only two passes.

here

https://hacks.mozilla.org/2014/01/webgl-deferred-shading/

"

Today, most WebGL engines use forward shading, where lighting is computed in the same pass that geometry is transformed. This makes it difficult to support a large number of dynamic lights and different light types.

Forward shading can use a pass per light. Rendering a scene looks like:

This requires a different shader for each material/light-type combination, which adds up. From a performance perspective, each mesh needs to be rendered (vertex transform, rasterization, material part of the fragment shader, etc.) once per light instead of just once. In addition, fragments that ultimately fail the depth test are still shaded, but with early-z and z-cull hardware optimizations and front-to-back sorting or a z-prepass, this is not as bad as the cost of adding lights.

To optimize performance, light sources that have a limited effect are often used. Unlike real-world lights, we allow the light from a point source to travel only a limited distance. However, even if a light’s volume of effect intersects a mesh, it may only affect a small part of the mesh, but the entire mesh is still rendered.

In practice, forward shaders usually try to do as much work as they can in a single pass leading to the need for a complex system of chaining lights together in a single shader.

Deferred Shading

Deferred shading takes a different approach than forward shading by dividing rendering into two passes: the g-buffer pass, which transforms geometry and writes positions, normals, and material properties to textures called the g-buffer, and the light accumulation pass, which performs lighting as a series of screen-space post-processing effects.

This decouples lighting from scene complexity (number of triangles) and only requires one shader per material and per light type. Since lighting takes place in screen-space, fragments failing the z-test are not shaded, essentially bringing the depth complexity down to one. There are also downsides such as its high memory bandwidth usage and making translucency and anti-aliasing difficult.

Until recently, WebGL had a roadblock for implementing deferred shading. In WebGL, a fragment shader could only write to a single texture/renderbuffer. With deferred shading, the g-buffer is usually composed of several textures, which meant that the scene needed to be rendered multiple times during the g-buffer pass.

"

 

Here is another, easier example of forward rendering passes:

http://docs.unity3d.com/Manual/RenderTech-ForwardRendering.html

"

Forward Rendering Path Details

This page describes details of Forward rendering path.

Forward Rendering path renders each object in one or more passes, depending on lights that affect the object. Lights themselves are also treated differently by Forward Rendering, depending on their settings and intensity.

Rendering of each object happens as follows:

  • Base Pass applies one per-pixel directional light and all per-vertex/SH lights.
  • Other per-pixel lights are rendered in additional passes, one pass for each light.

Base Pass

Base pass renders the object with one per-pixel directional light and all SH lights. This pass also adds any lightmaps, ambient and emissive lighting from the shader. The directional light rendered in this pass can have Shadows. Note that lightmapped objects do not get illumination from SH lights.

Additional Passes

Additional passes are rendered for each additional per-pixel light that affects this object. Lights in these passes can’t have shadows (so as a result, Forward Rendering supports one directional light with shadows).

"
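
Unity's scheme above boils down to a simple per-object pass count (my own sketch; real Unity also demotes lights to per-vertex/SH based on quality settings):

```python
# Per-object pass count under the forward path described above: one base
# pass (which carries the first per-pixel light plus all vertex/SH
# lights), then one additive pass per extra per-pixel light.

def forward_passes_per_object(per_pixel_lights):
    return 1 + max(0, per_pixel_lights - 1)

print(forward_passes_per_object(1))  # 1 (base pass only)
print(forward_passes_per_object(4))  # 4 (base + 3 additive passes)
```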

 

That's all I was trying to say: with deferred rendering you use less shader power and put less stress on the pipeline, since it's not being overused. The obvious disadvantage is more bandwidth, plus other issues like alpha blending.



fatslob-:O said:
Pemalite said:


I personally prefer trapezoidal implemented anisotropic filtering.
Consoles typically use a variation of bilinear or trilinear filtering because it's cheap.

As for Texture resolution? More is always better, 16k is what the PC is heading towards with some texture mods looking bloody fantastic with it.
However, even if you only have moderate 4k resolution textures, if you have decent filtering they can look better than higher resolution textures.
Converesly, some games will "sprout" 16k textures but only use them sparingly like on the terrain, leaving other objects/surfaces at a lower resolution (Rage, anyone?).

However, keep in mind, I run my games higher than 1080P, I can see the flaws in games more readily than the mobile-phone, 2 decade old 1080P resolution, so better textures and better filtering is a must.

Hopefully, thanks to the Playstation 4 and mass production, the cost of high-density GDDR5 drops substantually so GPU's can blow out the VRAM counts and the industry can start to look towards 32k textures.

Anisotropic filtering is actually becoming more common in games ... 

A 16K texture is practically over 2GB! It's useless on everything but large surfaces. 

If you're hoping for a large cost reduction, then prepare to be disappointed, since we have yet to figure out all of the large concerns with 13.5nm extreme ultraviolet lithography ... 

There is Anisotropic filtering... And then there is Anisotropic filtering.

2GB? Raw, maybe, but that's not going to be its actual size when it comes to rendering time.

As for GDDR5, economies of scale are what will bring the cost down. It happens in the volatile DRAM market constantly: when there is an abundance of DRAM chips (costs of production be damned!), prices drop. This is why DDR3 got to such crazy low prices.

megafenix said:

to palmatite

*snip*


My name is "Pemalite" not "Palmatite".

And honestly, I can't be bothered replying in any great technical depth, due to the circular mindset that you have employed in this argument.

However, I will say this... Be mindful of generalised blanket statements about graphics techniques; relying on multiple passes in a game may be better than a single pass.



--::{PC Gaming Master Race}::--

Pemalite said:
*snip*

However, I will say this... Be mindful of generalised blanket statements about graphics techniques; relying on multiple passes in a game may be better than a single pass.

 

With multiple passes you can achieve more things, sure, but I'm not suggesting always using a single pass; I'm suggesting using as few passes as possible, to strain the hardware as little as possible. That's the kind of approach deferred rendering takes versus forward rendering: deferred puts less strain on the shaders and the pipeline, and can render more lights with fewer resources (except for memory bandwidth) than forward rendering would need to render those same lights. Sure, it's not perfect, and it trades off bandwidth so that you don't use too much shader power, but since the Wii U has plenty of memory bandwidth and is low on shader power, this approach is ideal.

That's why I brought up the topic of single pass vs multipass. Of course, some things are almost impossible to achieve in a single pass rather than multipass, but that's not the point; the point is that it's better to look for a solution that uses as few passes as possible to achieve work that would have taken more passes with another approach.



Pemalite said:

There is Anisotropic filtering... And then there is Anisotropic filtering.

2GB? Raw, maybe, but that's not going to be its actual size when it comes to rendering time.

As for GDDR5, economies of scale are what will bring the cost down. It happens in the volatile DRAM market constantly: when there is an abundance of DRAM chips (costs of production be damned!), prices drop. This is why DDR3 got to such crazy low prices.

I'm sure devs will be able to pull off 4x anisotropic filtering in their games EASILY on the PS4, so long as they account for it. With 72 TMUs, that should come practically for free. 

Even with DXT5, the texture is still over 500MB!

Economies of scale only emphasise the cost advantages of higher production volumes. Like it or not, businesses will try to avoid selling at a loss; otherwise they won't be around for much longer. *cough* AMD *cough* 

The market will correct itself one way or another by closing down manufacturers, because what you're describing simply isn't sustainable ... 
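
For reference, the DXT5 figure can be checked the same way (my own sketch): DXT5/BC3 stores each 4x4 texel block in 16 bytes, i.e. 1 byte per texel.

```python
# DXT5/BC3 size arithmetic (my own sketch): 16 bytes per 4x4 block,
# so 1 byte per texel, a 4:1 ratio versus uncompressed RGBA8.

def dxt5_bytes(width, height):
    blocks_x = (width + 3) // 4   # round partial blocks up
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * 16

MiB = 1024 ** 2
print(dxt5_bytes(16384, 16384) / MiB)  # 256.0
```

That gives 256 MiB for a single 16K base level (about 341 MiB with mips), so a figure like "over 500MB" presumably also counts extra material layers such as normal or specular maps.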



megafenix said:

With multiple passes you can achieve more things, sure, but I'm not suggesting always using a single pass; I'm suggesting using as few passes as possible, to strain the hardware as little as possible. That's the kind of approach deferred rendering takes versus forward rendering: deferred puts less strain on the shaders and the pipeline, and can render more lights with fewer resources (except for memory bandwidth) than forward rendering would need to render those same lights. Sure, it's not perfect, and it trades off bandwidth so that you don't use too much shader power, but since the Wii U has plenty of memory bandwidth and is low on shader power, this approach is ideal.

That's why I brought up the topic of single pass vs multipass. Of course, some things are almost impossible to achieve in a single pass rather than multipass, but that's not the point; the point is that it's better to look for a solution that uses as few passes as possible to achieve work that would have taken more passes with another approach.


Actually, that is not always true; in fact, using multiple passes will often be faster, and easier to implement. Complex shaders have a significant performance impact even on the latest GPUs. In many cases, using multiple simple shader passes will end up being cheaper than trying to do everything in a single pass, especially in modern game engines where you can mix a lot of different textures and effects on a single model, and lots of different models in the same scene. Often it will be cheaper to do each in a separate pass, so that you can use simple shaders with less setup time and you don't waste performance invoking assets and effects for areas that don't need them.

For example, if you are rendering a human character with SSS on their skin and completely different shading for their clothes, it will probably be faster to do a separate pass for each, rather than do it all in a single pass calling one super-complex shader that does both types of shading. The more things you try to do in a single pass, the more redundant processing you will need to do, especially if you have multiple types of materials and effects in different parts of the scene, as that would require branching, where the shader has to work out which effects it has to run for each pixel; branches are very expensive on GPUs.

Another example would be transparent objects in a deferred renderer. Normally, deferred rendering does not support transparent or translucent objects. The methods for rendering transparencies in a deferred renderer are super expensive and usually involve multiplying the size of the G-buffer. So the way most devs get around this is to just render transparent objects in a completely separate forward shading pass.



@TheVoxelman on twitter

Check out my hype threads: Cyberpunk, and The Witcher 3!

zarx said:
*snip*


It depends on the application, but in most cases it's better to try to achieve things with fewer passes, so that you can save shader power and use it for other things. That's the whole point of deferred vs forward: forward stresses the GPU too much, while deferred uses fewer shader and pipeline resources by trading off memory bandwidth.

here

http://http.developer.nvidia.com/GPUGems/gpugems_ch28.html

"

 Optimizing Vertex Processing

  • Reduce the number of vertices processed. This is rarely the fundamental issue, but using a simple level-of-detail scheme, such as a set of static LODs, certainly helps reduce vertex-processing load.
  • Use vertex-processing LOD. Along with using LODs for the number of vertices processed, try LODing the vertex computations themselves. For example, it is likely unnecessary to do full four-bone skinning on distant characters, and you can probably get away with cheaper approximations for the lighting. If your material is multipassed, reducing the number of passes for lower LODs in the distance will also reduce vertex-processing cost.

Speeding Up Fragment Shading

If you're using long and complex fragment shaders, it is often likely that you're fragment-shading bound. If so, try these suggestions:

  • Render depth first. Rendering a depth-only (no-color) pass before rendering your primary shading passes can dramatically boost performance, especially in scenes with high depth complexity, by reducing the amount of fragment shading and frame-buffer memory access that needs to be performed. To get the full benefits of a depth-only pass, it's not sufficient to just disable color writes to the frame buffer; you should also disable all shading on fragments, even shading that affects depth as well as color (such as alpha test).
  • Consider using fragment shader level of detail. Although it offers less bang for the buck than vertex LOD (simply because objects in the distance naturally LOD themselves with respect to pixel processing, due to perspective), reducing the complexity of the shaders in the distance, and decreasing the number of passes over a surface, can lessen the fragment-processing workload.

"
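
The depth-only pre-pass advice above can be illustrated with a toy overdraw model (my own sketch; real savings depend on draw order and early-z efficiency):

```python
# Toy model of fragment-shading work with and without a z-prepass: with
# average depth complexity d and no prepass, up to d fragments per pixel
# run the expensive shader (worst case, back-to-front draw order); with
# a depth-only prepass, only the visible surface is shaded.

def shaded_fragments(pixels, depth_complexity, z_prepass):
    if z_prepass:
        return pixels                     # one shaded fragment per pixel
    return pixels * depth_complexity      # worst case without a prepass

pixels = 1280 * 720
print(shaded_fragments(pixels, 4, z_prepass=False))  # 3686400
print(shaded_fragments(pixels, 4, z_prepass=True))   # 921600
```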

 

Of course, deferred rendering is not perfect, and besides the huge bandwidth requirements you also have issues like alpha blending and MSAA. But there are good solutions for that: instead of MSAA, for example, you can use temporal anti-aliasing, which is also cheaper than MSAA and better suited to deferred rendering; FXAA (Fast Approximate Anti-Aliasing) is also a good option.

 

As for transparency, well, you can combine deferred and forward rendering, with forward rendering used just for the parts where you need transparency, or use some other solutions like the ones found here:

http://www.csc.kth.se/utbildning/kth/kurser/DD143X/dkand12/Group1Marten/final/final_TransparencyWithDeferredShading.pdf

"

How can we render transparent objects using deferred shading?

Within the frame of this project, several techniques for rendering transparent objects were examined on their advantages and disadvantages. Below we suggest a review of previous studies on deferred shading as well as some of the practical solutions for application of this technique in transparency. We propose a prototype of our own developed technique for rendering of transparent objects. The build of our deferred shader is described, and the actual integration of the front renderer and deferred shader techniques is explained in particular. We discuss the test results of our prototype in terms of performance and image quality and describe the model we used for this testing.

Rendering transparent objects with deferred shading imposes some problems, as the depth buffer used for rendering during deferred shading only supports one fragment at a time. In our work, we have chosen to use alpha-blending in a post pass using front rendering. Despite the flaws of alpha-blending, it is still very straightforward and easy to implement into a deferred shader. In the next section we discuss the basic functions of our algorithm.

Deferred shading with Front Rendering

The front renderer is used to process transparent objects and fits well into the deferred shading pipeline. It renders all opaque objects first with the deferred shader and then renders the transparent objects on top using the front renderer. This is important, as the depth buffer has to be filled with opaque objects first, to prevent rendering of non-visible transparent objects. When the front renderer has run, the final picture can be rendered to the frame buffer for display.

The implementation has the following rendering stages:

1. Render all opaque geometry to the G-buffer.

2. Bind the G-buffer as a texture. For each light in the scene, draw a full-screen rectangle and calculate lighting at each pixel using the data from the G-buffer. Save the result in the P-buffer.

3. Sort all transparent entities in back-to-front order.

4. Render all transparent geometry using the front renderer. Blend the result into the P-buffer, using the depth buffer to filter out any non-visible transparent geometry.

5. Copy the P-buffer to the frame buffer.

"
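
The five stages quoted above can be sketched as a runnable toy (my own sketch; the "buffers" here are plain Python lists standing in for GPU render targets, not the paper's actual API):

```python
# Toy walkthrough of the hybrid deferred + forward-transparency pipeline
# described above. Transparent entities are (name, distance) pairs.

def render_frame(opaque, transparent, lights, cam_pos):
    log = []
    # 1. Render all opaque geometry to the G-buffer.
    log += [f"gbuffer:{m}" for m in opaque]
    # 2. One full-screen lighting pass per light, accumulated in the P-buffer.
    log += [f"light:{l}" for l in lights]
    # 3. Sort transparent entities back to front (farthest from camera first).
    ordered = sorted(transparent, key=lambda t: -abs(t[1] - cam_pos))
    # 4. Forward-render them in that order, alpha-blending over the P-buffer
    #    (depth-tested against the opaque depth buffer in a real renderer).
    log += [f"blend:{name}" for name, _ in ordered]
    # 5. Copy the P-buffer to the frame buffer.
    log.append("present")
    return log

print(render_frame(opaque=["floor", "statue"],
                   transparent=[("glass", 2.0), ("smoke", 5.0)],
                   lights=["sun", "lamp"],
                   cam_pos=0.0))
# ['gbuffer:floor', 'gbuffer:statue', 'light:sun', 'light:lamp',
#  'blend:smoke', 'blend:glass', 'present']
```

Note how the farther "smoke" is blended before the nearer "glass", exactly the back-to-front ordering step 3 requires.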



This freaking thread...

And this is coming from a Shin'en fan who usually loves talking tech.



curl-6 said:

This freaking thread...

*snip*

And this is coming from a Shin'en fan who usually loves talking tech.

I blame OP ... 


