Biases in AI: An Experiment — Algorithmic Fairness in the World of Hateful Language Detection

University essay from KTH/Skolan för elektroteknik och datavetenskap (EECS)

Abstract: Hateful language is a growing problem in digital spaces, and human moderators alone cannot eliminate it. Automated hateful language detection systems are therefore used to aid human moderators. One issue with these systems is that their performance can differ depending on who is the target of a hateful text. This project evaluated the performance of two such systems (Perspective and Hatescan) with respect to the target of hateful texts. The analysis showed that the systems performed worst on texts directed at women and immigrants. The evaluation used a synthetic dataset based on the HateCheck test suite, as well as wild datasets created from forum data. Improvements to the HateCheck test suite have also been proposed.
