fatslob-:O said:

So it has begun ... 

Microsoft has started to roll out a sneak peek of Shader Model 6, so let's review some of the functionality. WaveBallot returns a 64-bit mask, which coincidentally matches the width of the SGPRs found in console GPUs. WavePrefixSum is just a running sum over the active lanes below the current one; for a per-lane flag it can be derived from WaveBallot, and console GPUs have two 32-bit instructions that count the set bits in the lower and upper halves of the 64-bit mask for those lanes. WaveReadFirstLane is just a special case of WaveReadLaneAt. Both are useful for reducing VGPR pressure in console GPUs by moving the data into the SGPRs, since the value is assumed to be uniform throughout execution ...
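
For anyone who wants to poke at the idea, here is a rough CPU-side sketch in Python of what a ballot, a two-half prefix bit count, and a first-lane read compute. It assumes a 64-lane wave; the function names are made up for illustration and are not the HLSL intrinsics or the actual GPU instructions.

```python
WAVE_SIZE = 64  # assumed wave width, matching the 64-bit mask


def wave_ballot(predicates):
    """Pack one boolean per lane into a 64-bit mask (bit i = lane i)."""
    assert len(predicates) == WAVE_SIZE
    mask = 0
    for lane, pred in enumerate(predicates):
        if pred:
            mask |= 1 << lane
    return mask


def prefix_count_bits(mask, lane):
    """Count set bits in lanes strictly below `lane`.

    The count is split into two 32-bit halves to mirror the
    lower-half / upper-half pair of instructions described above.
    """
    below = mask & ((1 << lane) - 1)   # keep only lanes 0..lane-1
    lo = below & 0xFFFFFFFF            # lanes 0..31
    hi = below >> 32                   # lanes 32..63
    return bin(lo).count("1") + bin(hi).count("1")


def read_first_lane(values, active_mask):
    """Return the value held by the lowest-numbered active lane."""
    if active_mask == 0:
        raise ValueError("no active lanes")
    first = (active_mask & -active_mask).bit_length() - 1
    return values[first]
```

With a flag set on every third lane, `prefix_count_bits(mask, lane)` gives each lane its compacted output slot, which is the usual reason to pair a ballot with a prefix count.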

But the standout feature so far in Shader Model 6 is ordered append, which is pretty orthogonal to the other wave intrinsics. Console GPUs have a special hardware implementation of this function that resides in that tiny and extremely fast 64KB cache that's effectively visible to every vector unit. It is not only useful for global synchronization, it can also be used in some kernels that require a counter that is accessed in the order the waves were created ...
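
To make the "counter accessed in wave creation order" part concrete, here is a small Python model of the observable result only. It is a sketch under my own assumptions, not how the hardware does it: `ordered_append`, `wave_counts`, and `completion_order` are hypothetical names, and the waiting set stands in for whatever the GPU uses to hold back waves that arrive early.

```python
def ordered_append(wave_counts, completion_order):
    """Return (offsets, total) for an append that honors wave creation order.

    wave_counts[w] is how many items wave w appends; completion_order is
    the arbitrary order in which waves actually reach the append.
    """
    counter = 0                       # shared running total of appended items
    next_wave = 0                     # ID of the wave allowed to append next
    offsets = [None] * len(wave_counts)
    waiting = set()                   # waves that arrived before their turn
    for w in completion_order:
        waiting.add(w)
        # Drain every wave whose turn has come, in creation order.
        while next_wave in waiting:
            waiting.remove(next_wave)
            offsets[next_wave] = counter
            counter += wave_counts[next_wave]
            next_wave += 1
    return offsets, counter
```

However the waves are shuffled in `completion_order`, each wave ends up with the same offset, so the output stays in creation order.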

Sweet!

Also, I doubt anyone would care or know what the hell you are talking about on this forum besides a small select few. :P




www.youtube.com/@Pemalite