Many large internet platforms, including Facebook, Twitter, and YouTube, are beginning to rely on artificial intelligence to help stop the spread of hate speech online. The hope is that AI programs using complex natural language processing technology will eventually be able to identify hate speech much faster and more accurately than human beings. It’s a goal that feels more urgent than ever in an age when acts of violence are being linked to hate speech online. But how does AI make decisions about what kind of language qualifies as hate speech, and is it as impartial as we like to think it is?
AI is a Mirror, Not a Cure
Let’s start by taking a look at the way most modern AI is programmed. Unlike traditional software programming, in which humans meticulously prescribe every behavioral rule, deep learning algorithms develop their own behavior by studying billions of real-world data points. A program that seeks to identify hate speech will thus be presented with vast archives of data that contrast instances of hate speech with non-hateful speech. It then flags new content based on how closely that content resembles the examples it studied that were labeled as hateful.
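To make the flag-by-similarity idea concrete, here is a deliberately tiny sketch in Python. The handful of labeled posts, the word-overlap scoring, and the `classify` function are all hypothetical stand-ins: a production system would learn statistical patterns from billions of examples rather than count shared words across four sentences, but the principle is the same, since new content is judged by its resemblance to what humans previously labeled.

```python
from collections import Counter

# Hypothetical labeled "training data" (stand-ins for the billions of
# human-labeled examples a real system would study).
labeled_posts = [
    ("group X are vermin and should leave", "hateful"),
    ("people like them are subhuman trash", "hateful"),
    ("had a great lunch with friends today", "benign"),
    ("the new park downtown is lovely", "benign"),
]

def tokenize(text):
    return text.lower().split()

# Build one word-frequency profile per label.
profiles = {}
for text, label in labeled_posts:
    profiles.setdefault(label, Counter()).update(tokenize(text))

def classify(text):
    """Flag text by similarity: total word overlap with each label's profile."""
    words = tokenize(text)
    scores = {label: sum(counts[w] for w in words)
              for label, counts in profiles.items()}
    return max(scores, key=scores.get)

print(classify("they are vermin"))          # resembles the hateful examples
print(classify("lunch in the park today"))  # resembles the benign examples
```

Notice that the classifier has no concept of hate; it only measures resemblance to past labels. If the labels carried human bias, that bias becomes the model’s definition of "hateful."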
The problem with this approach is that language does not exist in a vacuum. Whether or not something can be considered hate speech depends enormously on contextual factors, such as who the speaker is, the country, the culture, and so on. Words that may be considered hateful in some contexts may not be in others.
In one recent study, widely-used AI models for processing hate speech were shown to be 1.5 times more likely to flag content created by African Americans and 2.2 times more likely to flag content written in African American Vernacular English. In another study, a group of PhD students at the University of Washington looked at more than 100,000 tweets that had been hand-flagged by humans as ‘hate speech,’ ‘offensive,’ or ‘abusive.’ Again, they found that African American users’ tweets were 1.5 times more likely to be flagged than those of other users. Taking their research a step further, they asked a group of workers labeling similar data to take contextual factors, such as a user’s race and dialect, into account. The results showed that when moderators knew more about the users, they were less likely to flag their tweets. Overall, racial bias against black speech decreased by eleven percent.
This points to the dark reality that lies at the root of the deep learning model: the data that computers study shows the world as it is, not as it ought to be. If AI studies data points that have been flagged by humans due to racial bias, then it will learn to perpetuate that racial bias.
It’s All Kinds of Bias
Let’s take Google Translate as another example. The way this program learns language is by scanning through billions of pages of writing on the internet and studying the frequency with which certain words appear together. Take the word ‘chair,’ for example. The program begins to understand the meaning of ‘chair’ by noticing that it frequently occurs next to other furniture-related words, such as ‘table,’ as well as words describing what chairs are for, such as ‘sit.’
This is quite similar to the way children learn language. Our first method of deciphering meaning is through context.
An obvious example of where this becomes problematic can be found by using Google Translate to move between a non-gendered language and a gendered language. Let’s try it with English to Spanish. If you type in “I like my doctor” in English, the translator spits out “Me gusta mi doctor.” Doctor is automatically rendered in the masculine in Spanish. But if you type in “I like my nurse,” you get the feminine form ‘enfermera,’ rather than the masculine ‘enfermero.’ Why does this happen? Because all over the internet, the word ‘doctor’ occurs most frequently alongside the pronouns ‘he’ and ‘him,’ while both ‘nurse’ and ‘flight attendant’ occur most frequently beside female pronouns.
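The co-occurrence mechanism described above can be sketched in a few lines of Python. The five-sentence corpus and the `pick_gender` helper are invented for illustration (a real translation system scans billions of pages and uses far richer statistics than raw pronoun counts), but they show how skewed frequencies alone are enough to push a translator toward the masculine ‘doctor’ and the feminine ‘enfermera.’

```python
from collections import Counter
from itertools import combinations

# A tiny stand-in corpus (hypothetical sentences reflecting the skew
# the article describes in real web text).
corpus = [
    "he is a doctor and he saved a patient",
    "the doctor said he would call",
    "she is a nurse on the night shift",
    "the nurse said she was tired",
    "he thanked the nurse",
]

# Count how often each pair of words appears in the same sentence.
cooccur = Counter()
for sentence in corpus:
    words = sentence.split()
    for a, b in combinations(words, 2):
        cooccur[(a, b)] += 1
        cooccur[(b, a)] += 1

def pick_gender(noun):
    """Choose a gendered form purely by pronoun co-occurrence counts."""
    masc = cooccur[(noun, "he")] + cooccur[(noun, "him")]
    fem = cooccur[(noun, "she")] + cooccur[(noun, "her")]
    return "masculine" if masc >= fem else "feminine"

print(pick_gender("doctor"))  # 'doctor' co-occurs more often with 'he'
print(pick_gender("nurse"))   # 'nurse' co-occurs more often with 'she'
```

Note that even though one sentence pairs ‘nurse’ with ‘he,’ the majority pattern wins, which is exactly how frequency-based learning turns a statistical skew into a default assumption.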
How Might This Be Affecting You?
This problem starts to get scarier when we look at its real-world applications.
Increasingly, job recruiters are relying on AI technology to take a first pass at résumés. This is a process that, even with human eyes, is prone to bias – as evidenced by studies showing that minority candidates who “whitened” their names received more call-backs. If left unchecked, it stands to reason that AI may act on learned gender and racial bias in its decision-making processes.
“Let’s say a man is applying for a nurse position,” says computer scientist Aylin Caliskan. “He might be found less fit for that position if the machine is just making its own decisions. And this might be the same for a woman applying for a software developer or programmer position.”
An example of this can be seen in a scrapped AI recruiting tool originally developed by Amazon. The goal of the tool was to automatically sort through applicants for developer jobs and other technical posts. The actual results, however, were deeply flawed: because the tech industry is dominated by men, the system taught itself to favor male candidates. Résumés that contained words like “women’s” (as in a women’s college, club, etc.) were automatically penalized by the system.
Where Do We Go from Here?
Ultimately, many computer scientists are suggesting that there need to be more safeguards in the way that AI systems are taught to operate. People need to constantly be looking at outcomes and asking, “Why am I getting these results?”
Nobody should be assuming that computers are less biased than humans simply because they are machines. After all, machines learn about language as it exists and as it has existed. They have no notion of the way the world ought to be. That part is up to humans to decide.
Janet Barrow writes about the places where language meets history, culture, and politics. She studied Written Arts at Bard College, and her fiction has appeared in Easy Street and Adelaide Magazine. After two years in Lima, Peru, she recently moved to Chicago.