AIRA-D FusionNet: A Multi-Modal Deep Learning Framework for Violence Recognition through Audio-Visual Cues / Halilaj, M.; Bekteshi, E.; Myrto, E.; Dragoni, A. F. - Electronic. - (2026), pp. 1-6. (3rd International Conference on Artificial Intelligence, Computer, Data Sciences, and Applications, ACDSA 2026, phl, 2026) [10.1109/ACDSA67686.2026.11468260].

AIRA-D FusionNet: A Multi-Modal Deep Learning Framework for Violence Recognition through Audio-Visual Cues

Halilaj M.; Dragoni A. F.
2026-01-01

Abstract

This paper presents AIRA-D FusionNet, a multi-modal violence recognition system that combines visual analysis by a MoViNet-A0 backbone with audio analysis of MFCC features by a BiLSTM network. Trained on a combined dataset, our fused model achieves a recall of 0.91 and an AUC of 0.856, a 12% improvement in recall over unimodal baselines that results from leveraging complementary audio-visual cues. For practical deployment, the model was converted to TensorFlow Lite and optimized for mobile inference. This supports the system's viability for real-time violence detection on resource-constrained devices, offering a sensitive and efficient solution for automated security monitoring.
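The abstract reports recall and AUC for a fused audio-visual model but does not state how the two branches are combined. The sketch below is only an illustration of how such figures can be computed: it assumes a simple weighted-average late fusion of per-clip probabilities, which may differ from the paper's method. The function names (`fuse_scores`, `recall`, `roc_auc`), the 0.5 threshold, and all scores are hypothetical toy values, not results from the paper.

```python
from typing import List, Sequence


def fuse_scores(video: Sequence[float], audio: Sequence[float],
                w_video: float = 0.5) -> List[float]:
    """Weighted average of per-clip violence probabilities.

    ASSUMPTION: this late-fusion rule is illustrative only; the abstract
    does not describe the fusion used by AIRA-D FusionNet.
    """
    if len(video) != len(audio):
        raise ValueError("score lists must have the same length")
    return [w_video * v + (1.0 - w_video) * a for v, a in zip(video, audio)]


def recall(labels: Sequence[int], scores: Sequence[float],
           threshold: float = 0.5) -> float:
    """Fraction of violent clips (label 1) scored at or above the threshold."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = violent clip, 0 = non-violent clip.
labels = [1, 1, 1, 0, 0, 0]
video_scores = [0.9, 0.3, 0.8, 0.2, 0.6, 0.1]   # visual branch misses clip 2
audio_scores = [0.6, 0.8, 0.3, 0.3, 0.2, 0.4]   # audio branch misses clip 3
fused_scores = fuse_scores(video_scores, audio_scores)

for name, scores in [("video", video_scores),
                     ("audio", audio_scores),
                     ("fused", fused_scores)]:
    print(name, round(recall(labels, scores), 3), round(roc_auc(labels, scores), 3))
```

In this toy case each branch misses a different violent clip, so the averaged score recovers both. That is the kind of complementarity the abstract refers to, although the size of the effect here is an artifact of the made-up numbers.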
2026
IEEE Xplore

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11566/356854
Warning: the data displayed have not been validated by the university.
