Hate speech-detecting AIs are fools for 'love'

September 14, 2018

State-of-the-art detectors that screen out online hate speech can be easily duped by humans, a new study shows.

Hateful text and comments are an ever-increasing problem in online environments, yet addressing the rampant issue relies on being able to identify toxic content. A new study by the Aalto University Secure Systems research group (https://ssg.aalto.fi) has discovered weaknesses in many machine learning detectors currently used to recognize and keep hate speech at bay.

A team of researchers led by Professor N. Asokan has now shown that the hate speech detectors used by many popular social media and online platforms are brittle and easy to deceive. Bad grammar and awkward spelling--intentional or not--might make toxic social media comments harder for AI detectors to spot.

The team put seven state-of-the-art hate speech detectors to the test. All of them failed.

Modern natural language processing (NLP) techniques can classify text based on individual characters, words or sentences. When faced with textual data that differs from the data used in their training, they begin to fumble.

'We inserted typos, changed word boundaries or added neutral words to the original hate speech. Removing spaces between words was the most powerful attack, and a combination of these methods was effective even against Google's comment-ranking system Perspective,' says Tommi Gröndahl, doctoral student at Aalto University.

Google Perspective ranks the 'toxicity' of comments using text analysis methods. In 2017, researchers from the University of Washington showed that Google Perspective can be fooled by introducing simple typos. Gröndahl and his colleagues have now found that Perspective has since become resilient to simple typos yet can still be fooled by other modifications such as removing spaces or adding innocuous words like 'love'.

A sentence like 'I hate you' slipped through the sieve and was classified as non-hateful when modified into 'Ihateyou love'.
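
The two modifications in that example can be sketched in a few lines of Python. This is only an illustration of the text transformations described above, not the researchers' code; the function names are hypothetical, and no detector is queried.

```python
def remove_spaces(text: str) -> str:
    """Delete all whitespace so the words run together into one token."""
    return "".join(text.split())


def append_word(text: str, word: str = "love") -> str:
    """Append an innocuous word; 'love' is the example given in the article."""
    return f"{text} {word}"


def evade(text: str) -> str:
    """Combine both modifications: remove spaces, then add the neutral word."""
    return append_word(remove_spaces(text))


print(evade("I hate you"))  # prints: Ihateyou love
```

A reader still parses the run-together words without effort, while a detector that splits text on spaces now sees an unfamiliar token followed by a positive word.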

The researchers note that in different contexts the same utterance can be regarded either as hateful or merely offensive. Hate speech is subjective and context-specific, which renders text analysis techniques insufficient as stand-alone solutions.

The researchers recommend that more attention be paid to the quality of data sets used to train machine learning models--rather than refining the model design. The results indicate that character-based detection could be a viable way to improve current applications.
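
One way to see why character-level features might hold up better is to compare what survives the space-removal attack. The sketch below is an assumption-laden toy, not any of the detectors evaluated in the study: it compares word tokens and character trigrams of the original and the modified sentence. The helper names and the choice of trigrams are illustrative.

```python
def word_tokens(text: str) -> set[str]:
    """Lower-cased, whitespace-separated tokens, as a word-based model might see them."""
    return set(text.lower().split())


def char_ngrams(text: str, n: int = 3) -> set[str]:
    """All overlapping character n-grams of the lower-cased text (spaces included)."""
    s = text.lower()
    return {s[i:i + n] for i in range(len(s) - n + 1)}


original = "I hate you"
modified = "Ihateyou love"

# Word level: the modified text shares no token with the original.
print(sorted(word_tokens(original) & word_tokens(modified)))  # prints: []

# Character level: fragments of 'hate' and 'you' are still present.
print(sorted(char_ngrams(original) & char_ngrams(modified)))  # prints: ['ate', 'hat', 'you']
```

The word-level view loses every feature of the original sentence, whereas the character-level view keeps several, which gives a character-based classifier something to work with.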

The study was carried out in collaboration with researchers from the University of Padua in Italy. The results will be presented at the ACM AISec workshop in October.

The study is part of an ongoing project called Deception Detection via Text Analysis (https://ssg.aalto.fi/research/projects/deception-detection-via-text-analysis) in the Secure Systems group (https://ssg.aalto.fi) at Aalto University.

Research article:

Tommi Gröndahl, Luca Pajola, Mika Juuti, Mauro Conti, N. Asokan:
All You Need is "Love": Evading Hate-speech Detection.
https://arxiv.org/abs/1808.09115

More information:

Tommi Gröndahl, Doctoral Candidate
Aalto University
Secure Systems group
tommi.grondahl@aalto.fi
tel. +358 400 426 523

N. Asokan, Professor
Aalto University
Secure Systems group
n.asokan@aalto.fi
tel. +358 50 483 6465

Aalto University
