AI-Driven Real-Time Distress Detection Through Speech Recognition for Emergency Response Systems / Halilaj, M.; Myrto, E.; Dragoni, A. F. - 4044:(2025), pp. 18-25. (6th International Conference on Recent Trends and Applications in Computer Science and Information Technology, RTA-CSIT 2025, Tirana, 22-24 May 2025).

AI-Driven Real-Time Distress Detection Through Speech Recognition for Emergency Response Systems

Halilaj M.; Dragoni A. F.
2025-01-01

Abstract

Violence against women and children remains a critical global issue that requires immediate and innovative interventions. Traditional emergency response systems rely heavily on manual reporting, which may not be feasible in life-threatening situations. This paper introduces an AI-driven voice recognition model designed to detect distress signals in real time. The proposed system uses deep learning models trained on emotionally labeled speech datasets to classify distress calls and trigger emergency alerts when necessary. It consists of three components: a real-time audio capture module, a feature extraction component that processes speech signals, and a deep learning model trained to recognize distress speech patterns. The paper compares multiple feature extraction methods, including MFCCs and spectrogram-based approaches, and evaluates convolutional neural networks (CNNs) against state-of-the-art architectures such as Wav2Vec2 and Whisper. Results indicate that the transformer-based models significantly outperform traditional CNNs, particularly in noisy environments and on multilingual speech. The model has been trained and evaluated, and an API has been developed to support real-time classification of audio input. Full mobile integration is still under development, but these results demonstrate the feasibility of future deployment in mobile applications and IoT security devices for real-time emergency response.
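
The three-stage pipeline described in the abstract (audio capture, feature extraction, classification with an alert trigger) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain log-power spectrogram computed with a naive DFT in place of the MFCC, Wav2Vec2 or Whisper front ends named above, and every function name, the frame length, the hop size and the 0.5 threshold are hypothetical choices made only for illustration. The classifier is passed in as a stand-in callable that returns a distress score between 0 and 1.

```python
import cmath
import math
from typing import Callable, List, Optional, Sequence


def frame_signal(samples: Sequence[float], frame_len: int, hop: int) -> List[Sequence[float]]:
    """Split a signal into overlapping frames, dropping any trailing partial frame."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def hann(n: int) -> List[float]:
    """Symmetric Hann window of length n, used to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def log_spectrogram(samples: Sequence[float], frame_len: int = 64, hop: int = 32) -> List[List[float]]:
    """Log-power spectrogram: one row per frame, one column per frequency bin.

    Uses a naive O(N^2) DFT to stay dependency-free; a real system would
    use an FFT library. frame_len and hop are illustrative assumptions.
    """
    window = hann(frame_len)
    spectrogram = []
    for frame in frame_signal(samples, frame_len, hop):
        windowed = [s * w for s, w in zip(frame, window)]
        row = []
        for k in range(frame_len // 2 + 1):  # non-negative frequencies only
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n, x in enumerate(windowed))
            row.append(math.log(abs(coeff) ** 2 + 1e-10))  # epsilon avoids log(0)
        spectrogram.append(row)
    return spectrogram


def classify_and_alert(features: List[List[float]],
                       model: Callable[[List[List[float]]], float],
                       threshold: float = 0.5,
                       alert: Optional[Callable[[float], None]] = None) -> bool:
    """Score the features with a stand-in model and fire the alert callback
    when the score reaches the (hypothetical) threshold."""
    score = model(features)
    is_distress = score >= threshold
    if is_distress and alert is not None:
        alert(score)
    return is_distress
```

In this sketch the model is any callable that maps features to a score, so a trained CNN or transformer classifier could be substituted without changing the surrounding capture and alert logic.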
Files in this item:
File: Halilaj_AI-Driven-Real-Time-Distress-Detection_2025.pdf
Access: open access
Type: Publisher's version (published version with the publisher's layout)
License: Creative Commons
Size: 1.12 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11566/350094