SEEN: A Convolutional Spiking Neural Network for Efficient Pupil Coordinate Prediction from Event Data / Troconis, Luigi Gabriel; Vella, Francesco; Freddi, Alessandro; Monteriu', Andrea. - (2025), pp. 1-6. (3rd Cognitive Models and Artificial Intelligence Conference, AICCONF 2025, Prague, Czech Republic, 2025) [10.1109/AICCONF64766.2025.11064292].
SEEN: A Convolutional Spiking Neural Network for Efficient Pupil Coordinate Prediction from Event Data
Troconis Luigi Gabriel; Vella Francesco; Freddi Alessandro; Monteriu' Andrea
2025-01-01
Abstract
Near-eye pupil tracking is essential for Virtual and Augmented Reality applications but poses challenges in resource-constrained environments due to the high computational demands of traditional frame-based systems. This work introduces SEEN, a lightweight Spiking Neural Network (SNN) with 144,687 parameters, designed for efficient eye tracking using Event-based Camera (EbC) data. Leveraging the sparse, asynchronous nature of EbCs and SNNs, SEEN processes Time-Surface representations of eye movements (random, saccades, reading, smooth pursuit, blinks) from the 3ET+ dataset, achieving an average Euclidean distance of 8.58 pixels on a subset and 7.98 pixels on the Kaggle 2025 challenge. Ablation studies reveal that, for recurrent leaky layers, applying a learnable β in upper layers closer to the input outperforms deeper placements, enhancing prediction accuracy and guiding efficient SNN design. The main contributions of this work include an SNN architecture and insights into design optimization, paving the way for future exploration of input features and feature extraction strategies in spiking neural networks for real-time eye tracking.

| File | Size | Format | Access |
|---|---|---|---|
| SEEN_A_Convolutional_Spiking_Neural_Network_for_Efficient_Pupil_Coordinate_Prediction_from_Event_Data.pdf | 292.49 kB | Adobe PDF | Restricted (archive managers only); copy available on request |

Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
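As background on the Time-Surface representation mentioned in the abstract: a Time-Surface assigns each pixel a value that decays exponentially with the time elapsed since that pixel's most recent event, so recently active regions stand out. The following is a minimal NumPy sketch of that idea; the sensor resolution, event tuple layout, and decay constant `tau` are illustrative assumptions, not the paper's actual settings.

```python
import numpy as np

def time_surface(events, t_now, shape=(180, 240), tau=50e3):
    """Build a decayed Time-Surface from an event stream.

    events: iterable of (x, y, t) tuples, with t in microseconds
    (assumed layout; the paper's exact encoding is not specified here).
    """
    # Timestamp of the most recent event at each pixel; -inf means "never fired"
    last_t = np.full(shape, -np.inf)
    for x, y, t in events:
        last_t[y, x] = t
    # Exponential decay: recent events map near 1.0, stale pixels toward 0.0
    return np.exp(-(t_now - last_t) / tau)

events = [(10, 5, 1_000.0), (20, 8, 40_000.0)]
ts = time_surface(events, t_now=50_000.0)
```

Pixels that never fired decay to exactly 0.0 (since `exp(-inf) == 0`), which keeps the representation sparse, matching the sparse, asynchronous nature of event data that the abstract highlights.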
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


