Investigation on the Encoder-Decoder Application for Mesh Generation / Mameli, M.; Balloni, E.; Mancini, A.; Frontoni, E.; Zingaretti, P. - 14496:(2024), pp. 387-400. (Paper presented at the 40th Computer Graphics International Conference, CGI 2023, held in Shanghai, 28 August-1 September 2023) [10.1007/978-3-031-50072-5_31].
Investigation on the Encoder-Decoder Application for Mesh Generation
Mameli M.; Balloni E.; Mancini A.; Frontoni E.; Zingaretti P.
2024-01-01
Abstract
In computer graphics, 3D modeling is a fundamental concept. It is the process of creating three-dimensional objects or scenes with specialized software that lets users create, manipulate, and modify geometric shapes to build complex models. The process is time-consuming and requires specialized knowledge: obtaining a basic mesh from a blueprint typically takes three to five hours of modeling. Several approaches have tried to automate this operation to reduce modeling time. The most promising of these are based on deep learning, and Pixel2Mesh is one of the most notable. However, training this network requires at least 150 epochs to obtain usable results. Starting from these premises, this work investigates whether a modified version of Pixel2Mesh can be trained in fewer epochs while obtaining comparable or better results. To achieve this, the convolutional block was modified, replacing the classification-based approach with an image-reconstruction-based approach. The modification builds an encoder-decoder architecture on state-of-the-art networks such as VGG, DenseNet, ResNet, and Inception. With this approach, the convolutional block learns to reconstruct the source image correctly by learning the position of the object of interest within the image. This made it possible to train the complete network in 50 epochs, achieving results that outperform the state of the art. The tests performed on the networks show an increase of 0.5 percentage points over the state-of-the-art average.

File | Size | Format | |
---|---|---|---|
Mameli_Investigation-Encoder-Decoder-application_2024.pdf; Archive managers only; Type: Publisher's version (published version with the publisher's layout); License: All rights reserved | 1.45 MB | Adobe PDF | View/Open; Request a copy |
paper229.pdf; Open Access from 30/12/2024; Type: Post-print (version after peer review and accepted for publication); License: All rights reserved | 15.06 MB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.