AIRA-D FusionNet: A Multi-Modal Deep Learning Framework for Violence Recognition through Audio-Visual Cues

This paper presents AIRA-D FusionNet, a multi-modal system for violence recognition that integrates visual analysis using a MoViNet-A0 backbone with auditory processing of MFCC features through a BiLSTM network. Trained on a combined dataset, our fused model achieves a recall of 0.91 and an AUC of 0.856, demonstrating a 12% improvement in recall over unimodal baselines by effectively leveraging complementary audio-visual cues. To enable practical deployment, the model was successfully optimized for mobile inference by conversion to TensorFlow Lite. This confirms the system's viability for real-time violence detection applications on resource-constrained devices, offering a sensitive and efficient solution for automated security monitoring.
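The two headline figures above, recall and AUC, are standard binary-classification metrics. As a reminder of what they measure, the following is a minimal sketch in plain Python. The labels, scores, and the 0.5 decision threshold are invented for illustration and are not taken from the paper; the helper names `recall` and `roc_auc` are likewise hypothetical and do not reflect the authors' evaluation code.

```python
def recall(y_true, y_pred):
    """Fraction of actual positives predicted positive: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else 0.0


def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (1 = violent clip, 0 = non-violent).
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.7, 0.1, 0.3]

# Hypothetical decision threshold of 0.5 on the fused score.
preds = [1 if s >= 0.5 else 0 for s in scores]

toy_recall = recall(labels, preds)   # 3 of 4 positives found
toy_auc = roc_auc(labels, scores)    # 15 of 16 positive/negative pairs ordered correctly
```

Recall depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why a violence detector can report a high recall alongside a more moderate AUC.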

AIRA-D FusionNet: A Multi-Modal Deep Learning Framework for Violence Recognition through Audio-Visual Cues / Halilaj, M., Bekteshi, E., Myrto, E., Dragoni, A.F. - ELECTRONIC. - (2026), pp. 1-6. (3rd International Conference on Artificial Intelligence, Computer, Data Sciences, and Applications, ACDSA 2026 phl 2026) [10.1109/ACDSA67686.2026.11468260].

Halilaj M.; Bekteshi E.; Myrto E.; Dragoni A. F.

2026-01-01

Year of publication: 2026
Series title: IEEE Xplore
DOI: https://dx.doi.org/10.1109/ACDSA67686.2026.11468260
FAIR research data:
	URL: https://ieeexplore.ieee.org/document/11468260
	DOI: https://dx.doi.org/10.1109/ACDSA67686.2026.11468260

Use this identifier to cite or link to this document: https://hdl.handle.net/11566/356854
