In computer graphics, 3D modeling is a fundamental concept. It is the process of creating three-dimensional objects or scenes using specialized software that allows users to create, manipulate and modify geometric shapes to build complex models. This operation requires a huge amount of time to perform and specialised knowledge. Typically, it takes three to five hours of modelling to obtain a basic mesh from the blueprint. Several approaches have tried to automate this operation to reduce modelling time. The most interesting of these approaches are based on Deep Learning, and one of the most interesting is Pixel2Mesh. However, training this network requires at least 150 epochs to obtain usable results. Starting from these premises, this work investigates the possibility of training a modified version of the Pixel2Mesh in fewer epochs to obtain comparable or better results. A modification was applied to the convolutional block to achieve this, replacing the classification-based approach with an image reconstruction-based approach. This modification uses a configuration based on constructing an encoder-decoder architecture using state-of-the-art networks such as VGG, DenseNet, ResNet, and Inception. Using this approach, the convolutional block learns how to reconstruct the image correctly from the source image by learning the position of the object of interest within the image. With this approach, it was possible to train the complete network in 50 epochs, achieving results that outperform the state-of-the-art. The tests performed on the networks show an increase of 0.5% points over the state-of-the-art average.

Investigation on the Encoder-Decoder Application for Mesh Generation / Mameli, M.; Balloni, E.; Mancini, A.; Frontoni, E.; Zingaretti, P.. - 14496:(2024), pp. 387-400. (Intervento presentato al convegno 40th Computer Graphics International Conference, CGI 2023 tenutosi a Shanghai nel 28 August-1 September 2023) [10.1007/978-3-031-50072-5_31].

Investigation on the Encoder-Decoder Application for Mesh Generation

Mameli M.
;
Balloni E.;Mancini A.;Frontoni E.;Zingaretti P.
2024-01-01

Abstract

In computer graphics, 3D modeling is a fundamental concept. It is the process of creating three-dimensional objects or scenes using specialized software that allows users to create, manipulate and modify geometric shapes to build complex models. This operation requires a huge amount of time to perform and specialised knowledge. Typically, it takes three to five hours of modelling to obtain a basic mesh from the blueprint. Several approaches have tried to automate this operation to reduce modelling time. The most interesting of these approaches are based on Deep Learning, and one of the most interesting is Pixel2Mesh. However, training this network requires at least 150 epochs to obtain usable results. Starting from these premises, this work investigates the possibility of training a modified version of the Pixel2Mesh in fewer epochs to obtain comparable or better results. A modification was applied to the convolutional block to achieve this, replacing the classification-based approach with an image reconstruction-based approach. This modification uses a configuration based on constructing an encoder-decoder architecture using state-of-the-art networks such as VGG, DenseNet, ResNet, and Inception. Using this approach, the convolutional block learns how to reconstruct the image correctly from the source image by learning the position of the object of interest within the image. With this approach, it was possible to train the complete network in 50 epochs, achieving results that outperform the state-of-the-art. The tests performed on the networks show an increase of 0.5% points over the state-of-the-art average.
2024
9783031500718
9783031500725
File in questo prodotto:
File Dimensione Formato  
Mameli_Investigation-Encoder-Decoder-application_2024.pdf

Solo gestori archivio

Tipologia: Versione editoriale (versione pubblicata con il layout dell'editore)
Licenza d'uso: Tutti i diritti riservati
Dimensione 1.45 MB
Formato Adobe PDF
1.45 MB Adobe PDF   Visualizza/Apri   Richiedi una copia
paper229.pdf

Open Access dal 30/12/2024

Tipologia: Documento in post-print (versione successiva alla peer review e accettata per la pubblicazione)
Licenza d'uso: Tutti i diritti riservati
Dimensione 15.06 MB
Formato Adobe PDF
15.06 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11566/336740
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? 0
social impact