Annotator-Target Hate Speech Biases

Visualizing Hate Speech Biases

by Matteo Abrate, Clara Bacciu and Lorenzo Cima
DOI: 10.6084/m9.figshare.27220917

Companion visualization for:
Giorgi, T., Cima, L., Fagni, T., Avvenuti, M., & Cresci, S. (2025, June). Human and LLM biases in hate speech annotations: A socio-demographic analysis of annotators and targets. In Proceedings of the International AAAI Conference on Web and Social Media. DOI: https://doi.org/10.1609/icwsm.v19i1.35837

Metrics

Intensity: Represents both the direction and strength of bias in annotations. It ranges from -1 (a strong tendency to underestimate hate speech towards individuals with different socio-demographic characteristics) to +1 (a strong tendency to overestimate hate speech). A value close to 0 suggests little to no bias.
Prevalence: Indicates how frequently bias appears in the annotation process. It ranges from 0 (no bias present) to 1 (bias is present in all cases).
Agreement: Measures how consistently annotators agree on their assessments, calculated using Cohen’s Kappa statistic.
Significance: The p-value of the statistical test for bias applied to the annotations. A lower p-value indicates stronger evidence of bias.
Unique Comments: The total number of distinct comments that have been annotated.
Annotations: The total number of annotations made across all comments.
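
As an illustration of the Agreement metric, the following is a minimal sketch of the standard two-rater Cohen's kappa, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The function name cohens_kappa, the binary labels, and the convention of returning 1.0 when both raters use a single identical label are assumptions made for this example; the exact procedure used in the paper (for instance, how annotator groups are paired) is not described here.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Standard two-rater Cohen's kappa for categorical labels.

    labels_a and labels_b are equal-length sequences holding the labels
    that two annotators (hypothetical here) gave to the same comments.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of comments with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both raters pick the same label
    # if each labels independently according to their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    # Assumed convention: if chance agreement is already 1 (both raters
    # used one identical label throughout), report perfect agreement.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Toy data: 1 = hate speech, 0 = not hate speech (made-up labels).
annotator_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa = cohens_kappa(annotator_a, annotator_b)
print(f"kappa = {kappa:.3f}")
```

In this toy example the two annotators agree on 8 of 10 comments (p_o = 0.8) while chance agreement is p_e = 0.52, so kappa is about 0.583. Values near 1 indicate strong agreement, and values near 0 indicate agreement no better than chance.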