Reddit is one of the few social networks that handles Not Safe For Work (NSFW) content in an explicit and well-structured way. Despite this, very few studies in the literature on Reddit address this topic. In particular, no study has examined the text of NSFW comments and posts published on this platform. In this paper, we aim to help fill this gap by proposing an approach for extracting and analyzing text patterns from NSFW adult content on Reddit. Our approach has the following distinguishing features: (i) text patterns are extracted based not only on frequency but also, and primarily, on several utility measures; (ii) the extracted patterns contribute to the definition of social networks whose analysis yields useful information about the users who publish and/or access NSFW content and about the language they adopt; (iii) the approach is not only descriptive but also predictive because, in addition to identifying existing user communities, it can propose new ones, made up of users who do not yet know each other but share the same interests and the same language.

Extraction and analysis of text patterns from NSFW adult content in Reddit / Cauteruccio, F.; Corradini, E.; Terracina, G.; Ursino, D.; Virgili, L.. - In: DATA & KNOWLEDGE ENGINEERING. - ISSN 0169-023X. - 138:(2022). [10.1016/j.datak.2022.101979]

Extraction and analysis of text patterns from NSFW adult content in Reddit

F. Cauteruccio; E. Corradini; G. Terracina; D. Ursino; L. Virgili
2022-01-01

Keywords: NSFW posts and comments; Pattern utility measures; Reddit; Social Network Analysis; Text patterns; Triads and cliques
Files in this record:

Cauteruccio_Extraction-analysis-text-patterns_2022.pdf
Access: archive managers only
Type: Publisher's version (published version with the publisher's layout)
License: All rights reserved
Size: 1.66 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11566/294442
Citations
  • PMC: N/A
  • Scopus: 15
  • Web of Science: 13