Quantifying and reducing stereotypes in word embeddings
Date
2016
Authors
Bolukbasi, Tolga
Chang, Kai-Wei
Zou, James
Saligrama, Venkatesh
Kalai, Adam
Version
Published version
Citation
Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, and Adam Kalai. 2016. "Quantifying and reducing stereotypes in word embeddings." arXiv preprint arXiv:1606.06121. https://arxiv.org/abs/1606.06121
Abstract
Machine learning algorithms are optimized to model statistical properties of their training data. If the input data reflects the stereotypes and biases of the broader society, then the output of the learning algorithm captures these stereotypes as well. In this paper, we initiate the study of gender stereotypes in word embeddings, a popular framework for representing text data. As their use becomes increasingly common, applications built on embeddings can inadvertently amplify unwanted stereotypes. We show across multiple datasets that the embeddings contain significant gender stereotypes, especially with regard to professions. We created a novel gender analogy task and combined it with crowdsourcing to systematically quantify the gender bias in a given embedding. We also developed an efficient algorithm that reduces gender stereotypes using just a handful of training examples while preserving the useful geometric properties of the embedding, and we evaluated the algorithm on several metrics. While we focus on male/female stereotypes, our framework may be applicable to other types of embedding biases.
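Although this record carries only the abstract, the kind of measurement it describes is easy to illustrate. Below is a minimal sketch, assuming a pre-trained word2vec-format embedding loaded with gensim; the file path, definitional pair (she/he), and occupation list are illustrative assumptions, not details taken from the paper. It scores occupation words by their signed projection onto a she-he direction and then runs a simple he : x :: she : ? analogy query in the spirit of the paper's analogy task, which in the paper is combined with crowdsourced judgments.

```python
# Minimal sketch: quantify gender association in a word embedding by
# projecting words onto a she-he direction, then query analogies.
# The model path and word lists are illustrative assumptions.
import numpy as np
from gensim.models import KeyedVectors

model = KeyedVectors.load_word2vec_format("embeddings.bin", binary=True)

def unit(v):
    return v / np.linalg.norm(v)

# A crude gender direction from a single definitional pair; the paper
# quantifies bias more carefully, with crowdsourced validation.
gender_direction = unit(model["she"] - model["he"])

def gender_projection(word):
    """Signed projection onto the she-he axis: positive leans 'she'."""
    return float(np.dot(unit(model[word]), gender_direction))

for occupation in ["nurse", "librarian", "engineer", "programmer"]:
    print(f"{occupation:12s} {gender_projection(occupation):+.3f}")

# A he : x :: she : ? analogy query via vector arithmetic.
print(model.most_similar(positive=["she", "doctor"], negative=["he"], topn=3))
```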
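The debiasing step can be sketched the same way. The snippet below removes a gender-neutral word's component along the gender direction and renormalizes, a simple projection that illustrates the geometric idea only; the paper's actual algorithm learns a transformation from a handful of training examples while preserving the embedding's geometry, which this sketch does not implement. It reuses `model`, `unit`, `gender_direction`, and `gender_projection` from the sketch above.

```python
# Sketch of projection-based debiasing: zero out the component along the
# gender direction for words treated as gender-neutral. Illustrative only;
# this is not the paper's learned algorithm.
def debias(word):
    v = model[word]
    v = v - np.dot(v, gender_direction) * gender_direction
    return unit(v)

print(gender_projection("nurse"))                        # before
print(float(np.dot(debias("nurse"), gender_direction)))  # after: ~0.0
```

A quick sanity check of "preserving the useful geometric properties" would be to compare similarities between word pairs before and after this projection.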