Real-Time Violence Detection in Video Footage Using a Mobile-Friendly CNN-Based Model / Halilaj, M.; Cannone, V.; Sernani, P.; Franco Dragoni, A. (2025), pp. 323-328. (4th IEEE International Conference on Metrology for eXtended Reality, Artificial Intelligence and Neural Engineering, MetroXRAINE 2025, Ancona, Italy, 22-24 October 2025) [10.1109/MetroXRAINE66377.2025.11340181].
Real-Time Violence Detection in Video Footage Using a Mobile-Friendly CNN-Based Model
Halilaj M.; Cannone V.; Sernani P.; Franco Dragoni A.
2025
Abstract
Detecting violence in video content, particularly within domestic environments, presents an ongoing challenge in both social and technological contexts. This paper proposes a lightweight deep learning framework for real-time violence detection, optimized for mobile and edge deployment. The approach is based on MoViNet-A0, evaluated in both Base and Stream configurations, and is complemented by a custom Conv2D-based baseline designed for ultra-low-latency inference. All models were trained and validated on the AIRTLab dataset, which includes 350 annotated videos representing violent and non-violent scenes. The MoViNet-A0 Base model achieved a validation accuracy of 92.8%, while the Conv2D-based model reached 89.6% validation accuracy, along with precision and F1-score close to 90%. Performance benchmarks conducted on Android devices and desktop platforms show that real-time inference is feasible, with latencies as low as 0.9 seconds per 10-frame sequence on mid-range smartphones. The entire pipeline has been designed for mobile deployment, and integration into a functional prototype application is in progress, aiming to enable real-time violence detection directly on mobile devices.
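The abstract reports latency per 10-frame sequence, implying the models classify fixed-length clips cut from the video stream. The paper's actual preprocessing code is not published in this record; the following is a minimal, hypothetical sketch of how such non-overlapping 10-frame windows could be extracted for clip-level inference (the names `SEQ_LEN` and `make_sequences` are illustrative, not from the paper).

```python
# Hypothetical sketch: grouping a decoded frame stream into fixed-length
# windows for clip-level inference, matching the 10-frame sequences
# benchmarked in the paper. Illustrative only; not the authors' code.

SEQ_LEN = 10  # frames per inference window, per the reported benchmark


def make_sequences(frames, seq_len=SEQ_LEN, stride=SEQ_LEN):
    """Split a list of frames into fixed-length windows.

    Uses non-overlapping windows (stride == seq_len) and drops any
    trailing partial window, since the model expects full clips.
    """
    return [frames[i:i + seq_len]
            for i in range(0, len(frames) - seq_len + 1, stride)]


# Stand-in for 25 decoded frames (one second of 25 fps video):
frames = list(range(25))
windows = make_sequences(frames)  # two full 10-frame windows; 5 frames dropped
```

At the reported 0.9 s latency per 10-frame window on a mid-range smartphone, a 25 fps stream produces windows faster than they can be classified, so a deployment would likely skip windows or queue them; a smaller stride would instead give overlapping windows and smoother temporal coverage at extra compute cost.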
| File | Type | License | Size | Format | Access |
|---|---|---|---|---|---|
| Halilaj_Real-Time-Violence-Detection_2025.pdf | Publisher's version (published with the publisher's layout) | All rights reserved | 586.23 kB | Adobe PDF | Archive managers only |
| Real-Time Violence Detection in Video Footage Using a Mobile-Friendly CNN-Based Model.pdf | Post-print (version after peer review, accepted for publication) | Publisher-specific license | 502.92 kB | Adobe PDF | Open access |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


