December 14, 2024

Language necessarily contains human biases, and so will machines trained on language corpora

I have a new draft paper with Aylin Caliskan-Islam and Joanna Bryson titled Semantics derived automatically from language corpora necessarily contain human biases. We show empirically that natural language necessarily contains human biases, and the paradigm of training machine learning on language corpora means that AI will inevitably imbibe these biases as well. Specifically, we look at […]