Deep Learning applied to augmented and virtual scenarios

Mameli, Marco

How can Deep Learning (DL) make the creation of augmented and virtual scenarios more effective? Computer graphics has become essential for creating advertisements, games, and films. This study investigates the use of deep learning to solve computer graphics tasks, applying it to raster images, to 3D objects represented by meshes, and to interactions, which are particularly useful in rendering immersive virtual environments. Studying classical neural and convolutional network approaches made it possible to analyze their limitations. The study also shows that DenseNet is the best network for extracting features from images used to construct a 3D object, and that a modified structure based on the encoder-decoder approach allows the network to be trained in less time. A further demonstration of the usefulness of DL comes from the image generation results, in which G-Buffers are used to obtain renderings of a quality comparable to path tracing but in less time than standard approaches. One of the most interesting results of the thesis is the demonstration of the usefulness of Reinforcement Learning (RL) for automatic interactions: RL can automate the activation of animations according to context and purpose. With it, high scores were obtained in a game where the scene was not predetermined but created dynamically. These results confirm that a character's behaviour can adapt according to context and purpose. These studies are a starting point for pursuing different research directions and enabling future developments, for example those based on transformers and self-supervised learning, which aim to improve the approaches described above. The aim of this thesis is not just to show how DL can have a profound impact on computer graphics but also to illustrate how it can be used for practical purposes, such as creating a rendering or 3D objects in less time.

In computer graphics, traditional problems are now being addressed with deep learning. State-of-the-art deep neural networks are more efficient than classical methods and achieve high performance. This thesis explores the possibility of improving computer graphics applications by employing deep learning. It investigates three areas: interaction in virtual scenes, automatic generation of 3D content from RGB images, and rendering. Generating 3D content is laborious and time-consuming, and requires specialized operators. Deep learning and Pixel2Mesh make it possible to simplify this task. The convolutional layer is trained in an encoder-decoder configuration to reconstruct the input image, showing that this approach reduces the training time required to obtain the 3D mesh. In addition, a dataset for rendering a 3D scene was produced synthetically, and a convolutional network trained with the GAN framework was used to produce rendered images of a quality comparable to those generated by a ray-tracing-based rendering engine. Using reinforcement learning, specifically Q-Learning, the thesis shows that the interaction of characters within virtual scenarios can be automated, thereby overcoming the current limitation of requiring prior knowledge of the scene and its composition, which called for a closed and bounded scene. Finally, the thesis evaluates the limit on the computational capacity of devices for augmented and virtual reality applications when a neural network is integrated, in order to limit the need for dedicated hardware for its execution. This limit is defined by the size of the input image and is tested on several devices selected by price range and computational capacity.
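
As a rough illustration of the Q-Learning technique named in the abstract, the following minimal sketch applies the standard tabular Q-Learning update to a toy problem. It is not the thesis's implementation: the corridor environment, the function names `q_learning` and `greedy_policy`, and all hyperparameter values are hypothetical and chosen only to keep the example small.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-Learning on a hypothetical 1-D corridor.

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Reaching the last state ends the episode with reward 1.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # Q-Learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a').
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Return the greedy action for every non-terminal state."""
    return [0 if left > right else 1 for left, right in q[:-1]]
```

Calling `greedy_policy(q_learning())` returns the learned action per state. In a virtual scene the states and actions would instead encode the character's context and the available animations, which is an assumption about how the idea transfers and not a description of the thesis's setup.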

Deep Learning applied to augmented and virtual scenarios / Mameli, Marco. - (2023 Mar 16).