sc94597 said:
There are only two ways they could do this.

1. They pre-trained the model on generalized image-to-image. This is unlikely for a few reasons. Good general image-to-image models are relatively huge; the open-source ones start at around 13 billion parameters. That is not feasible to run in real time even on a single data-center GPU, let alone gaming ones. For context, an RTX 5090 runs these models at about 2 images per second. The datasets used to train them are also huge, and Nvidia doesn't have access to any buffer data for those samples the way it does with its regular DLSS training sets. Now, Nvidia could train an image-to-image model (or, more likely, source an already-trained one) and use it as a teacher for a specialized gaming-specific model, but there are two issues with that. The first is that it would skew the codomain so much that you risk the efficacy of your gaming-specific model. The second is that it's a very inefficient method given how specific the target objective is.

2. They have invested heavily in model-interpretability research and pulled off something like Anthropic's Golden Gate Claude experiment, but for image models rather than LLMs. If that were the case, they'd be able to allow much more control than you are talking about. You really don't need text or image inputs in this case; you can just directly steer the model's internal feature activations. See: https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html

This model is more likely something like what is described in this paper, https://arxiv.org/pdf/2105.04619, but without using the G-buffer at all (if we take Nvidia's press release at face value that they only use color and velocity buffers), and probably using a vision transformer instead of a CNN.
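The steering trick from the monosemanticity paper reduces to something very simple at inference time: you take a feature direction learned by a sparse autoencoder and add a scaled copy of it onto a layer's activations. A minimal numpy sketch of just that operation, with the hidden size, the feature direction, and the `steer` helper all made up for illustration (a real SAE direction comes from a trained decoder matrix):

```python
import numpy as np

# Toy sketch of activation steering in the style of the
# "Golden Gate Claude" experiment: a sparse-autoencoder feature
# direction is added onto one layer's activations at inference time.
# All values here are random stand-ins; no real model is involved.

rng = np.random.default_rng(0)

d_model = 16                      # hypothetical hidden size
h = rng.standard_normal(d_model)  # activation vector at some layer

# Hypothetical unit-norm decoder direction for one learned feature
# (in a real SAE this would be a column of the decoder matrix).
feature_dir = rng.standard_normal(d_model)
feature_dir /= np.linalg.norm(feature_dir)

def steer(h, direction, strength):
    """Add a scaled feature direction to the activations."""
    return h + strength * direction

h_steered = steer(h, feature_dir, strength=5.0)

# Because feature_dir is unit-norm, the projection of the activations
# onto the feature increases by exactly `strength`.
print(np.dot(h_steered, feature_dir) - np.dot(h, feature_dir))  # ≈ 5.0
```

The point of the quoted post is that if Nvidia had this level of interpretability for an image model, sliders over feature strengths would give far finer control than text or image conditioning.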
I wonder if what they're doing isn't all that dissimilar from these kinds of videos that are all over TikTok/Instagram etc.:
https://www.instagram.com/reel/DVxSh0ODVec/
Seems like Google/YouTube doesn't allow or want too many videos like this, because they're hard to find on YouTube but all over the place on Insta/TikTok.
Just instead of a person, it's a lighter model trained on a bunch of data to enhance things like eyes, lips, wrinkles, brightness, etc. on game characters.
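If it really is a lightweight enhancer along the lines of the paper quoted above, the data flow would just be: concatenate the color buffer with the velocity buffer, run a small network, and add the predicted correction back onto the color. A toy numpy sketch of that flow, with the frame size, the single 1x1-convolution "network", and the residual scale all invented for illustration (a real model would be a CNN or vision transformer):

```python
import numpy as np

# Toy sketch of a per-frame enhancement pass using only the inputs
# Nvidia's press release mentions: a color buffer and a velocity
# (motion-vector) buffer. The 1x1 "network" here is a stand-in for
# a real trained model; weights and shapes are made up.

rng = np.random.default_rng(0)

H, W = 4, 4                                          # tiny frame for illustration
color = rng.random((H, W, 3)).astype(np.float32)     # RGB color buffer
velocity = rng.random((H, W, 2)).astype(np.float32)  # screen-space motion vectors

x = np.concatenate([color, velocity], axis=-1)       # (H, W, 5) model input

# Hypothetical learned weights for one 1x1 convolution: 5 channels in, 3 out.
W1 = (rng.standard_normal((5, 3)) * 0.1).astype(np.float32)
b1 = np.zeros(3, dtype=np.float32)

residual = np.tanh(x @ W1 + b1)                      # bounded per-pixel correction
enhanced = np.clip(color + 0.1 * residual, 0.0, 1.0) # add correction, keep valid range

print(enhanced.shape)  # (4, 4, 3)
```

The appeal of this shape of model is exactly what the quoted post argues: the objective is narrow, the inputs are cheap buffers the engine already produces, and the network can stay small enough to run per frame.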
Last edited by Soundwave - 2 days ago