Challenger at MultiPRIDE: Is It Hate Speech or Reclaimed?
Title: MultiPRIDE Contender: Navigating the Line Between Hate Speech and Reclaimed Language
Abstract: Hate speech is a growing threat in online spaces, especially on social media. Automated hate speech detection has improved markedly in recent years, but one problem remains difficult: separating genuine hate speech from language that targeted communities have reclaimed. Reclaimed expressions are nuanced and depend heavily on context, which makes accurate labeling hard.
This study presents a simple, transparent method for distinguishing hate speech from reclaimed language, designed for the MultiPRIDE Shared Task. The method computes dense semantic embeddings of each text, filters likely label noise using Cleanlab with a logistic regression classifier, and then performs classification with a multi-layer perceptron (MLP). The pipeline is designed to perform well with limited computational resources.
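To make the label-noise filtering step more concrete, the following is a minimal sketch of the confident-learning idea that Cleanlab builds on. It is not the authors' code and does not call Cleanlab itself. It assumes out-of-fold class probabilities are already available (for example from a cross-validated logistic regression on the embeddings). The function names, the toy probabilities, and the class order `[other, reclaimed]` are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def class_thresholds(probs: Sequence[Sequence[float]],
                     labels: Sequence[int],
                     n_classes: int) -> List[float]:
    """Per-class threshold: mean predicted probability of class k
    over the examples whose given label is k."""
    thresholds = []
    for k in range(n_classes):
        own = [p[k] for p, y in zip(probs, labels) if y == k]
        # A class with no examples gets threshold 1.0 (almost never confident).
        thresholds.append(sum(own) / len(own) if own else 1.0)
    return thresholds


def flag_label_issues(probs: Sequence[Sequence[float]],
                      labels: Sequence[int],
                      n_classes: int = 2) -> List[int]:
    """Return indices whose given label disagrees with a confident prediction."""
    thresholds = class_thresholds(probs, labels, n_classes)
    flagged = []
    for i, (p, y) in enumerate(zip(probs, labels)):
        # Classes the model is "confident" about for this example.
        confident = [k for k in range(n_classes) if p[k] >= thresholds[k]]
        if not confident:
            continue  # no confident class, keep the example
        guess = max(confident, key=lambda k: p[k])
        if guess != y:
            flagged.append(i)
    return flagged


# Hypothetical out-of-fold probabilities, columns = [other, reclaimed].
probs = [
    [0.90, 0.10],
    [0.80, 0.20],
    [0.20, 0.80],  # labeled 0, but confidently predicted as 1
    [0.10, 0.90],
    [0.30, 0.70],
    [0.85, 0.15],  # labeled 1, but confidently predicted as 0
]
labels = [0, 0, 0, 1, 1, 1]

issues = flag_label_issues(probs, labels)
clean_indices = [i for i in range(len(labels)) if i not in issues]
print("flagged:", issues)
print("kept for MLP training:", clean_indices)
```

In a pipeline of the kind described, the examples at `clean_indices` would then be used to train the MLP. Cleanlab's own filtering adds further steps beyond this thresholding rule, so the sketch should be read as an approximation of the idea rather than a replacement for the library.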
The method is evaluated with precision, recall, and F1-score, with emphasis on the macro-averaged values. The experiments show robust performance despite severe class imbalance in the dataset. The results also suggest considerable room for improvement through larger embedding models and more sophisticated preprocessing, without sacrificing the system's interpretability.
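The emphasis on macro-averaged metrics matters under class imbalance, because macro averaging weights every class equally regardless of its size. The sketch below illustrates this with made-up labels (not data from the study): a classifier that always predicts the majority class reaches 80% accuracy but a much lower macro F1. Macro F1 is computed here as the unweighted mean of per-class F1 scores, which is one common convention.

```python
from typing import Sequence, Tuple


def per_class_prf(y_true: Sequence[int], y_pred: Sequence[int],
                  cls: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for one class (0.0 when undefined)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def macro_prf(y_true: Sequence[int],
              y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Unweighted mean of per-class precision, recall, and F1."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = [per_class_prf(y_true, y_pred, c) for c in classes]
    n = len(classes)
    return (sum(s[0] for s in scores) / n,
            sum(s[1] for s in scores) / n,
            sum(s[2] for s in scores) / n)


# Toy imbalanced data: 8 majority-class (0) and 2 minority-class (1) examples.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10  # a degenerate classifier that always predicts the majority

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
macro_p, macro_r, macro_f1 = macro_prf(y_true, y_pred)
print(f"accuracy={accuracy:.3f}")
print(f"macro P={macro_p:.3f} R={macro_r:.3f} F1={macro_f1:.3f}")
```

Here the majority class scores well on its own, but the minority class contributes zeros to each average, so the macro scores expose the failure that accuracy hides.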
Source: arXiv Generated at: 2026-06-02 00:00:00 UTC





