Real-time semantic segmentation on embedded devices has recently enjoyed significant gain in popularity, due to the increasing interest in smart vehicles and smart robots. In particular, with the emergence of autonomous driving, low latency and computation-intensive operations lead to new challenges for vehicles and robots, such as excessive computing power and energy consumption. The aim of this paper is to address semantic segmentation, one of the most critical tasks for the perception of the environment, and its implementation in a low power core, by preserving the required performance of accuracy and low complexity. To reach this goal a low-rank convolutional neural network (CNN) architecture for real-time semantic segmentation is proposed. The main contributions of this paper are: i) a tensor decomposition technique has been applied to the kernel of a generic convolutional layer, ii) three versions of an optimized architecture, that combines UNet and ResNet models, have been derived to explore the trade-off between model complexity and accuracy, iii) the low-rank CNN architectures have been implemented in a Raspberry Pi 4 and NVIDIA Jetson Nano 2 GB embedded platforms, as severe benchmarks to meet the low-power, low-cost requirements, and in the high-cost GPU NVIDIA Tesla P100 PCIe 16 GB to meet the best performance.

A Low-Rank CNN Architecture for Real-Time Semantic Segmentation in Visual SLAM Applications

Laura Falaschetti
Primo
;
Lorenzo Manoni
Secondo
;
Claudio Turchetti
Ultimo
2022

Abstract

Real-time semantic segmentation on embedded devices has recently enjoyed significant gain in popularity, due to the increasing interest in smart vehicles and smart robots. In particular, with the emergence of autonomous driving, low latency and computation-intensive operations lead to new challenges for vehicles and robots, such as excessive computing power and energy consumption. The aim of this paper is to address semantic segmentation, one of the most critical tasks for the perception of the environment, and its implementation in a low power core, by preserving the required performance of accuracy and low complexity. To reach this goal a low-rank convolutional neural network (CNN) architecture for real-time semantic segmentation is proposed. The main contributions of this paper are: i) a tensor decomposition technique has been applied to the kernel of a generic convolutional layer, ii) three versions of an optimized architecture, that combines UNet and ResNet models, have been derived to explore the trade-off between model complexity and accuracy, iii) the low-rank CNN architectures have been implemented in a Raspberry Pi 4 and NVIDIA Jetson Nano 2 GB embedded platforms, as severe benchmarks to meet the low-power, low-cost requirements, and in the high-cost GPU NVIDIA Tesla P100 PCIe 16 GB to meet the best performance.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11566/302048
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact