Pemalite said:
You should ask AMD that same very question. :P
So I did a bit of snooping through the GCN whitepapers, and according to Microsoft the only thing you need to store is the patch data, which is relatively small. Another thing the GCN architecture does to alleviate this storage bottleneck is let patch data spill to the L2 cache, so I'm willing to bet that storing patch data isn't as much of a problem as AMD's own implementation of tessellation. What AMD DOESN'T currently have is the hardware to keep up with the explosive increase in triangle counts. Instead of just having 2 tessellators that sit outside the compute units or streaming multiprocessors, maybe it would be a better idea to have smaller tessellation engines inside those units. Even if each tessellator doesn't hit 1 prim/clk and only does 0.25 prim/clk, the explosion of triangles would be a lot easier to handle across 8 or so smaller tessellators than across just 2.
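To put rough numbers on the triangle explosion and the 2-vs-8 tessellator idea, here's a back-of-envelope sketch (all figures are illustrative assumptions, not vendor specs):

```python
# Back-of-envelope tessellator math (toy numbers, not from any vendor spec).

def tri_count_quad(f):
    # Rough triangle count for a uniformly tessellated quad patch with
    # edge tessellation factor f: f x f cells, 2 triangles each
    # (assumption: integer partitioning, ignoring edge corrections).
    return 2 * f * f

# A single patch at the D3D11 maximum factor of 64 already expands to
# thousands of triangles:
print(tri_count_quad(64), "triangles from one patch at factor 64")  # 8192

# Aggregate peak rate is identical either way:
few_fast   = 2 * 1.0    # 2 tessellators at 1.00 prim/clk
many_small = 8 * 0.25   # 8 tessellators at 0.25 prim/clk
print(few_fast, many_small)  # both 2.0 prim/clk
```

Note that 8 × 0.25 and 2 × 1.0 give the same peak prim/clk, so the win from distributed tessellators would have to come from keeping the expanded triangles local to each compute unit instead of funneling every patch through two fixed-function blocks.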
AMD seriously needs to figure out a truly parallel solution. From the looks of it, their pipeline is serialized, and that's probably what's causing the bottleneck.
/Edit
Off-topic: Now I know why we don't use tile-based renderers... The system scales like a bitch once you have more than 512 primitives per 16 x 32 tile. Eventually the benefits of z-buffering, such as being able to do stencil shadowing, start to outweigh the drawbacks of that steep cost curve. Tile-based rendering wouldn't be feasible anymore on the current-gen games that are coming out, and it would likely have an even rougher relationship with tessellation than AMD does now, which is already bad.
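To see why per-tile primitive counts blow up, here's a toy binning model (the screen size, tile grid, and triangle sizes are all made-up assumptions, and triangles are binned by bounding box only, so this is a sketch of the scaling behavior, not any real binner):

```python
import random

# Toy tile-binning cost model (assumptions: 512x512 screen,
# 16x32 pixel tiles, triangles approximated by their bounding boxes).
random.seed(1)
SCREEN_W, SCREEN_H = 512, 512
TILE_W, TILE_H = 16, 32
TILES_X, TILES_Y = SCREEN_W // TILE_W, SCREEN_H // TILE_H

def bin_triangles(n_tris, max_size=40):
    """Bin n_tris random triangles; return per-tile counts and total insertions."""
    counts = [[0] * TILES_X for _ in range(TILES_Y)]
    insertions = 0
    for _ in range(n_tris):
        x = random.uniform(0, SCREEN_W - max_size)
        y = random.uniform(0, SCREEN_H - max_size)
        w = random.uniform(1, max_size)
        h = random.uniform(1, max_size)
        # Every tile the bounding box touches gets a copy of the primitive.
        tx0, tx1 = int(x // TILE_W), min(int((x + w) // TILE_W), TILES_X - 1)
        ty0, ty1 = int(y // TILE_H), min(int((y + h) // TILE_H), TILES_Y - 1)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                counts[ty][tx] += 1
                insertions += 1
    return counts, insertions

for n in (10_000, 100_000):
    counts, ins = bin_triangles(n)
    worst = max(max(row) for row in counts)
    print(f"{n} tris: {ins} bin insertions, busiest tile holds {worst}")
```

Binning work grows linearly with primitive count (each primitive is duplicated into every tile it touches), and the busiest tiles blow past any fixed per-tile budget long before the average tile does. Tessellation makes this worse by multiplying the primitive count before binning even starts.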