Spotting the Aggressor: Pose-Based Violence Detection Through Spatial-Temporal Deep Learning Techniques

The automatic detection of violent behaviour in video sequences has become a critical research area in public safety and surveillance, because detecting aggressive actions early enables rapid intervention and can significantly reduce harm. This paper proposes a novel methodology for automatically detecting violent behaviour in video sequences and, in particular, for identifying the specific individual responsible for it. A key element of the approach is the extraction of human pose features with a You Only Look Once (YOLO)-based model, which efficiently captures the key points that serve as cues for detecting violent interactions. These pose-derived spatial features are then passed to temporal analysis models designed to capture the dynamic nature of aggressive behaviour. To assess the method, two temporal architectures, a Bidirectional Long Short-Term Memory (BiLSTM) network and a transformer-based model, were evaluated on the AIRTLab dataset. Experimental results indicate that the approach is robust and reliable, combining high accuracy with real-time applicability. Furthermore, because it relies on pose-based representations that can be processed in distributed edge-cloud architectures, the methodology offers better privacy preservation than approaches that process raw video.
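The hand-off from pose extraction to a temporal model can be illustrated with a minimal sketch. It is not the authors' implementation: the 17-key-point COCO-style layout, the bounding-box normalisation, the window length and stride, and the names `normalize_pose`, `sliding_windows` and `build_person_windows` are all assumptions made for illustration. The sketch turns one tracked person's per-frame key points into fixed-length feature windows of the kind a BiLSTM or transformer could consume; keeping one sequence per person is one plausible way to score individuals separately.

```python
from typing import List, Sequence, Tuple

Keypoint = Tuple[float, float]   # (x, y) in pixels
Pose = Sequence[Keypoint]        # key points of one person in one frame

NUM_KEYPOINTS = 17  # assumption: COCO-style layout, not stated in the abstract


def normalize_pose(pose: Pose) -> List[float]:
    """Centre a pose on its bounding box and scale it by the longer box side."""
    if len(pose) != NUM_KEYPOINTS:
        raise ValueError(f"expected {NUM_KEYPOINTS} key points, got {len(pose)}")
    xs = [x for x, _ in pose]
    ys = [y for _, y in pose]
    center_x = (min(xs) + max(xs)) / 2
    center_y = (min(ys) + max(ys)) / 2
    # Fall back to 1.0 for a degenerate (single-point) box to avoid dividing by zero.
    scale = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    features: List[float] = []
    for x, y in pose:
        features.extend(((x - center_x) / scale, (y - center_y) / scale))
    return features


def sliding_windows(frames: Sequence[List[float]], length: int, stride: int) -> List[List[List[float]]]:
    """Cut a feature sequence into overlapping windows of `length` frames."""
    last_start = len(frames) - length
    return [list(frames[s:s + length]) for s in range(0, last_start + 1, stride)]


def build_person_windows(track: Sequence[Pose], length: int = 16, stride: int = 8) -> List[List[List[float]]]:
    """Normalise every frame of one person's track, then window the result."""
    return sliding_windows([normalize_pose(p) for p in track], length, stride)


if __name__ == "__main__":
    # Synthetic track: 32 frames of a pose that drifts to the right.
    demo_track = [[(float(k + t), float(2 * k)) for k in range(NUM_KEYPOINTS)] for t in range(32)]
    windows = build_person_windows(demo_track)
    print(len(windows), len(windows[0]), len(windows[0][0]))
```

Each window is a list of frames, and each frame is a flat list of 34 normalised coordinates, so a batch of windows maps directly onto the (sequence length, feature size) input that recurrent and attention-based models expect. Normalising by the bounding box removes the person's position and apparent size in the image, which is a design choice of this sketch rather than something the abstract specifies.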

Spotting the Aggressor: Pose-Based Violence Detection Through Spatial-Temporal Deep Learning Techniques / Rongoni, A., Longarini, L., Prist, M., Pompei, G., Dragoni, A.F. - (2025), pp. 329-334. (4th IEEE International Conference on Metrology for eXtended Reality, Artificial Intelligence and Neural Engineering, MetroXRAINE 2025, Ancona, IT, 22-24 October 2025) [10.1109/MetroXRAINE66377.2025.11340312].

Rongoni A.; Longarini L.; Prist M.; Pompei G.; Dragoni A. F.

2025-01-01

Year of publication	2025
ISBN	9798331502799
DOI	https://dx.doi.org/10.1109/MetroXRAINE66377.2025.11340312
Appears in types	4.1 Contribution in conference proceedings

Files in this item:

File	Access	Type	License	Size	Format
Rongoni_Spotting-Aggressor-Pose-Based_2025.pdf	Archive managers only	Publisher's version (published version with the publisher's layout)	All rights reserved	6.94 MB	Adobe PDF
Spotting the Aggressor Pose-Based Violence Detection Through .pdf	Open access	Post-print (version after peer review and accepted for publication)	Publisher-specific license	1.01 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11566/355416
