Language necessarily contains human biases, and so will machines trained on language corpora
I have a new draft paper with Aylin Caliskan-Islam and Joanna Bryson titled Semantics derived automatically from language corpora necessarily contain human biases. We show empirically that natural language necessarily contains human biases, and the paradigm of training machine learning on language corpora means that AI will inevitably imbibe these biases as well. Specifically, we look at […]
Security against Election Hacking – Part 2: Cyberoffense is not the best cyberdefense!
State and county election officials across the country employ thousands of computers in election administration; most of them are connected (from time to time) to the internet (or exchange data cartridges with machines that are connected). In my previous post I explained how we must audit elections independently of the computers, so we can trust the […]
Security against Election Hacking – Part 1: Software Independence
There’s been a lot of discussion of whether the November 2016 U.S. election can be hacked. Should the U.S. Government designate all the states’ and counties’ election computers as “critical cyber infrastructure” and prioritize the “cyberdefense” of these systems? Will it make any difference to activate those buzzwords with less than 3 months until the […]
The workshop on Data and Algorithmic Transparency
From online advertising to Uber to predictive policing, algorithmic systems powered by personal data affect more and more of our lives. As our society begins to grapple with the consequences of this shift, empirical investigation of these systems has proved vital to understanding the potential for discrimination, privacy breaches, and vulnerability to manipulation. This emerging […]