Upscaling/SR by Neural Networks in gaming?

The idea is to use convolutional neural networks to turn the low-resolution image output of gaming machines into a high-resolution one. Why is this still not happening?

First, let me lay out some concepts:

- Super resolution (SR) is obtaining a high-resolution image/video from a low-resolution one. Yes, it seems like magic that generates missing information. The simpler methods use interpolation (upscaling) and are found in every TV, console, HDMI box, etc.

- Neural networks are a machine learning technique that uses layers of "neurons" to learn something. Given tons of labeled training data, the training process embeds in the network the weights that reproduce the training data. It is important to note that training takes a very long time, while running data through an already trained net is very fast. A net can be very specialized depending on its training data, so its performance improves when the input data is close to what it was trained on.

- In the SR application, the input is a low-res image and the output is a high-res one. (Convolutional neural networks (CNNs) allow working with big chunks of data instead of just classes.) The training process uses millions of very diverse low-res images and the corresponding high-res images at a given scale factor. After the net is trained, you feed it your low-res image (even if it is not part of the training data) and it generates the corresponding high-resolution image. This process is very fast, and the quality of the generated image is superior to other SR methods.
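The interpolation-based upscaling mentioned above can be illustrated with nearest-neighbour, the simplest possible case (a toy NumPy sketch; real TVs and consoles use smarter filters such as bilinear or bicubic, but the principle of generating no new information is the same):

```python
import numpy as np

def upscale_nearest(img, factor):
    """Nearest-neighbour upscaling: repeat every pixel `factor` times
    in each direction. No missing detail is recovered."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)

img = np.array([[0, 255],
                [255, 0]], dtype=np.uint8)  # tiny 2x2 "frame"
big = upscale_nearest(img, 2)
print(big.shape)  # (4, 4)
```

Every output pixel is just a copy of an input pixel, which is why plain upscaling looks soft or blocky compared to a native render.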
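As a rough sketch of what such a net computes at inference time, here is a toy two-layer convolutional forward pass in NumPy. It follows the SRCNN idea (interpolate the image to target size first, then let convolutions sharpen it), but with a single channel and random weights standing in for trained ones, so the output here is not a meaningful image:

```python
import numpy as np

def conv2d(x, kernel):
    """Naive 'valid' 2D convolution (single channel, stride 1, no padding)."""
    kh, kw = kernel.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def sr_forward(upscaled_lowres, k1, k2):
    """Two conv layers with a ReLU in between: feature extraction, then
    reconstruction. Real SR CNNs use many filters per layer, not one."""
    feat = np.maximum(conv2d(upscaled_lowres, k1), 0.0)  # ReLU
    return conv2d(feat, k2)

rng = np.random.default_rng(0)
x = rng.random((16, 16))                 # low-res frame after interpolation
k1 = rng.standard_normal((3, 3)) * 0.1   # in practice: learned weights
k2 = rng.standard_normal((3, 3)) * 0.1
y = sr_forward(x, k1, k2)
print(y.shape)  # (12, 12) -- each 'valid' 3x3 conv trims a 1-pixel border
```

The whole inference step is a fixed sequence of multiply-adds, which is why it runs fast once training is done.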

Well, the idea is to use this powerful weapon in gaming, adapting it to the specific application:

- On powerful hardware, they generate two versions of the rendered image: one at full resolution with AA, and another at low resolution with no AA, which is what the power-limited gaming machine would output.

- Train a CNN on the low- and high-res image pairs (millions of them) to get a good model that generates high-res images from low-res ones. Since they know exactly what kind of images the net will receive as input (it will only ever be used on images from that game), the net will be very accurate and the results tend to be even better than a general-purpose SR CNN.
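The offline data-generation step above can be sketched as follows. As a simplifying assumption, this sketch simulates the console's low-res output by block-averaging the full-resolution render; the proposal in the post would instead render both versions natively:

```python
import numpy as np

def downsample(hi, factor):
    """Block-average: a stand-in for the machine's real low-res render."""
    h, w = hi.shape
    return hi.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def make_training_pairs(frames, factor=2):
    """Each training example is a (low-res input, full-res target) pair."""
    return [(downsample(hi, factor), hi) for hi in frames]

frames = [np.random.rand(8, 8) for _ in range(3)]  # rendered full-res frames
pairs = make_training_pairs(frames)
print(pairs[0][0].shape, pairs[0][1].shape)  # (4, 4) (8, 8)
```

Training then minimizes the difference between the net's output on the low-res input and the full-res target, over millions of such pairs.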

Justification/advantages:

- Better image quality compared to regular upscaling and other methods. In a world where resolutions keep increasing and, as a consequence, the perceptual gap between native full resolution and an upscaled image keeps shrinking, the SR CNN's output can be indistinguishable from native full resolution.

- Developers could focus more on performance, effects, and detail rather than on resolution/AA. Since they know the upscaler can bring the image quality up to full-resolution levels with clean AA, other aspects can be optimized.

- Less powerful machines would be able to run high-resolution games. Rendering natively at high resolution is very costly in terms of computing power and power consumption, while an SR CNN is very efficient and has low hardware demands.

- It can be embedded in consoles, and could even provide a high-resolution mode. (Example 1: it would be a practical way to build a power-up dock for the Switch; the dock would still receive only low-res images and would output high-resolution ones processed by the SR CNN. Example 2: the PS4 Slim/One S could match the resolution of the PS4 Pro/One X just by adding simple, cheap extra hardware for the SR CNN.) It could also make consoles cheaper, while giving the cheaper ones a high-end look.

- Developers who have not trained a net for their game could use already-trained network profiles. For example, Sony could train nets for realistic games, cartoonish games, and various other visual styles. Developers would choose one for their game, which would add no extra work for them (even though the image quality would not be maximized specifically for their game).

 

In conclusion, I can't see why this isn't being applied. Anyway, maybe I'm too optimistic. I am just starting to read about CNNs and got fascinated by the results (I have not tried them in practice yet). If I made any mistakes, or if you know why this wouldn't be viable, I would like to hear it in the comments. General thoughts are welcome too. Thank you.




From my limited knowledge on the subject, I see bigger potential gaming applications for neural nets and the like than just image quality. With proper investment beforehand it could really streamline development pipelines for things like animation and AI.



https://github.com/lltcggie/waifu2x-caffe/releases
Go upscale an image from 1080p to 4k.

Consoles won't have the bandwidth, compute, or dedicated hardware to do an operation like this within 16–30 ms this generation.
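The 16–30 ms figure is just the per-frame time budget at common console frame rates; the render plus any upscaling pass must fit inside it:

```python
# Per-frame budget: everything (render + upscale) must finish within one frame.
for fps in (60, 30):
    print(f"{fps} fps -> {1000 / fps:.1f} ms per frame")
# 60 fps -> 16.7 ms per frame
# 30 fps -> 33.3 ms per frame
```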

AI will probably be injected into render pipelines (we should be getting AI denoising on PC this year!), but it likely won't be this generation (consoles, mobile).
It probably won't be a straight upscale filter on a fully rendered frame.
In my opinion the biggest gains will be seen in the content creation side of gaming.

Last edited by caffeinade - on 14 May 2018

Do you know what a real-time application is? If this is done via cloud processing, it's going to be too slow for real-time rendering. If it's done in hardware, you will still need resources for the NN to produce the image, which I assume is less efficient than just rendering it directly.

This NN would basically do nothing other than compute a higher resolution from a lower resolution. Coincidentally, the exact same thing a modern GPU already does. Why would you think a convoluted NN is in any way more efficient than a rendering engine that was specifically built for this task?



If you demand respect or gratitude for your volunteer work, you're doing volunteering wrong.

"Why is this still not happening?"

Probably because it's more demanding than just rendering at a higher resolution in the first place, and it produces a "lesser" quality output than straight native high resolution would.

But if there were a co-processor on graphics cards to handle this burden... why not? It's a valid option, I guess.



JRPGfan said:

"Why is this still not happening?"

Probably because it's more demanding than just rendering at a higher resolution in the first place, and it produces a "lesser" quality output than straight native high resolution would.

But if there were a co-processor on graphics cards to handle this burden... why not? It's a valid option, I guess.

Or instead of a co-processor for useless stuff, you could put in a bigger processor for rendering tasks. But I guess that would be too convoluted.




This CNN would have to be so complex to be usable at all that it would be much, MUCH more efficient to simply render a native 4K image.

I once did something similar but for audio (i.e. "upscaling" from 8-bit 8 kHz to 16-bit 44.1 kHz using a NN), and generating 1 second of audio took ~5 minutes. And it only really worked in a very specific audio domain (speech); it sounded horrible for anything else. Training this NN took about 3 weeks on a GTX 1080. Just to give an idea of how brute-force NNs are.



vivster said:
JRPGfan said:

Or instead of a co-processor for useless stuff, you could put in a bigger processor for rendering tasks. But I guess that would be too convoluted.

Yeah, I'm not "feeling" this NN-for-gaming thing.
Seems like a roundabout solution that honestly probably isn't needed.



TallSilhouette said:
From my limited knowledge on the subject, I see bigger potential gaming applications for neural nets and the like than just image quality. With proper investment beforehand it could really streamline development pipelines for things like animation and AI.

Don't forget streamlined revenue from scientific neural optimization of the presentation and implementation of microtransactions through "player engagement". Guess which task a NN will be used for first in gaming.




Well, NVIDIA has now presented a new video card that does something similar:
Deep Learning Super Sampling (using the term "super sampling" instead of "super resolution", and "deep learning" instead of "convolutional NN").
https://hothardware.com/news/nvidia-geforce-rtx-2080-performance-and-dlss
Using the technique, they managed to free up computing resources and increase the frame rate (even compared to the same video card without DLSS).