Vision Transformer Approaches for COVID-19 Pneumonia Assessment in Lung Ultrasound Images / Fiorentino, Maria Chiara; Rosati, Riccardo; Melnic, Andrian; Conti, Edoardo; Zingaretti, Primo. - (2024), pp. 83-88. (Paper presented at the 3rd IEEE International Conference on Metrology for eXtended Reality, Artificial Intelligence and Neural Engineering, MetroXRAINE 2024, held in St Albans, United Kingdom, in 2024) [10.1109/metroxraine62247.2024.10796379].

Vision Transformer Approaches for COVID-19 Pneumonia Assessment in Lung Ultrasound Images

Fiorentino, Maria Chiara; Rosati, Riccardo; Conti, Edoardo; Zingaretti, Primo
2024-01-01

Abstract

In recent years, Convolutional Neural Networks (CNNs) have been at the forefront of advances in medical imaging, particularly in classifying disease severity from ultrasound images. However, CNNs often struggle to capture long-range dependencies within images, which are crucial for accurately assessing conditions such as COVID-19 pneumonia, where evaluating anatomical features and the different artifacts in lung ultrasound (LUS) images is vital. This research investigates the potential of Transformer-based models, known for their ability to capture long-range dependencies, to classify the severity of COVID-19 pneumonia from LUS images. We explore and compare the performance of several architectures, including Swin Transformer Tiny (Swin-T), which is based solely on Multi-Head Self-Attention (MHSA), and Bottleneck Transformer (BoTNet-50), a hybrid architecture that integrates the MHSA mechanism into a convolutional backbone, combining the strengths of CNNs and Transformers. Our analysis, performed on the publicly available ICLUS dataset, shows that BoTNet-50 outperforms ResNet-50, achieving an F1-Score of 0.6025. Swin-T underperformed when trained from scratch but improved considerably with transfer learning, ultimately reaching an F1-Score of 0.6513 and surpassing ResNet-50 on all metrics. Moreover, Grad-CAM analysis highlights the greater sensitivity of Swin-T compared to ResNet-50 in detecting complex structures, which reflects its ability to capture broader context and complex spatial relationships in LUS images. These results indicate that Transformer-based models hold significant promise for improving the precision of COVID-19 pneumonia severity scoring from LUS images.
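
The abstract reports F1-Scores (0.6025 for BoTNet-50, 0.6513 for Swin-T) without stating how they are averaged across severity classes. As a rough illustration only, the sketch below computes a macro-averaged F1-Score in plain Python. The macro averaging, the function names, and the four integer labels are assumptions made for this example; the labels are invented and are not taken from the ICLUS dataset or from the paper's results.

```python
def per_class_f1(y_true, y_pred, label):
    """F1 for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # By convention here, a class with no true or predicted samples scores 0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(per_class_f1(y_true, y_pred, c) for c in labels) / len(labels)


# Hypothetical severity labels, invented for illustration only.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 1, 1, 1, 2, 3, 3, 3]

score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.4f}")
```

Macro averaging weights every severity class equally, which matters when some scores are rare; a weighted or micro average would give different numbers on the same predictions.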
Year: 2024
ISBN: 9798350378009
Files in this record:

File: Vision_Transformer_Approaches_for_COVID-19_Pneumonia_Assessment_in_Lung_Ultrasound_Images.pdf
Access: Archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 984.38 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11566/342618