In recent years, Convolutional Neural Networks (CNNs) have been at the forefront of advances in medical imaging, particularly in classifying disease severity from ultrasound images. However, CNNs often struggle to capture long-range dependencies within images, a crucial factor in accurately assessing conditions like COVID-19 pneumonia, where evaluating anatomical features and different artifacts in lung ultrasound (LUS) images is vital. This research investigates the potential of Transformer-based models, known for their ability to capture long-range dependencies, to classify the severity of COVID-19 pneumonia from LUS images. We explore and compare the performance of several architectures, including Swin Transformer Tiny (Swin-T), based solely on Multi-Head Self-Attention (MHSA), and Bottleneck Transformer (BoTNet-50), a hybrid architecture that integrates the MHSA mechanism into a convolutional backbone, combining the strengths of both CNNs and Transformers. Our analysis, performed on the publicly available ICLUS dataset, shows that BoTNet-50 outperforms ResNet-50, achieving an F1-Score of 0.6025. Swin-T, while underperforming when trained from scratch, improved considerably with transfer learning, ultimately reaching an F1-Score of 0.6513 and surpassing ResNet-50 on all metrics. Moreover, Grad-CAM analysis highlights the greater sensitivity of Swin-T compared to ResNet-50 in detecting complex structures, which reflects its ability to capture broader context and complex spatial relationships in LUS images. These results indicate that Transformer-based models hold significant promise for improving the precision of COVID-19 pneumonia severity scoring through LUS analysis.
Vision Transformer Approaches for COVID-19 Pneumonia Assessment in Lung Ultrasound Images / Fiorentino, Maria Chiara; Rosati, Riccardo; Melnic, Andrian; Conti, Edoardo; Zingaretti, Primo. - (2024), pp. 83-88. (Paper presented at the 3rd IEEE International Conference on Metrology for eXtended Reality, Artificial Intelligence and Neural Engineering, MetroXRAINE 2024, held in St Albans, United Kingdom, in 2024) [10.1109/metroxraine62247.2024.10796379].
Vision Transformer Approaches for COVID-19 Pneumonia Assessment in Lung Ultrasound Images
Fiorentino, Maria Chiara;Rosati, Riccardo;Conti, Edoardo;Zingaretti, Primo
2024-01-01
File | Size | Format | |
---|---|---|---|
Vision_Transformer_Approaches_for_COVID-19_Pneumonia_Assessment_in_Lung_Ultrasound_Images.pdf (archive managers only; type: publisher's version, i.e. the published version with the publisher's layout; license: all rights reserved) | 984.38 kB | Adobe PDF | |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.