Mitigating offensive search suggestions with deep learning

05/29/2017

Winner of the PAKDD 2017 Best Paper Award

The humble search bar is the window through which most Internet users experience the web. Deep learning is set to enhance the capabilities of this simple tool such that search engines can now anticipate what the user is looking for whilst moderating offensive suggestions before the query is complete.

A lack of contextual and deeper understanding of the intent of search queries often leads to inappropriate or offensive suggestions. Striving for a safer and saner web, the Microsoft team began its research with deep learning techniques to help detect and automatically prune search suggestions.

Our paper titled “Convolutional Bi-Directional LSTM for Detecting Inappropriate Query Suggestions in Web Search” received the “Best Paper Award” at the recent Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2017. It was picked amongst a record-breaking 458 submissions at the leading international conference on knowledge discovery and data mining. Winning the award has been a humbling experience for the team. It upholds our efforts to establish deep machine learning as a powerful weapon against online vitriol.

(L-R) Manoj Chinnakotla and Harish Yenala receiving the award at the PAKDD 2017 Conference on 25th May in Jeju Island, South Korea

Here’s a brief overview of what we found:

The Challenge

Human beings intuitively detect offensive language and derogatory speech online. Algorithms, however, struggle to clearly distinguish toxic language from benign comments. Systems often misjudge the level of toxicity or hate in certain phrases that require a deeper understanding of cultural norms, slang and context.

Search queries make the problem harder still: they offer little context, natural language is ambiguous, and queries often contain spelling mistakes and variations. For example, Marvin Gaye’s classic hit ‘If I Should Die Tonight’ could be deemed offensive simply because it includes the phrase ‘should die tonight’. Conversely, algorithms frequently misclassify offensive words and phrases as ‘clean’ simply because they are misspelled or expressed as euphemisms. For example, the phrase ‘shake and bake’ is both a registered trademark for a popular food brand and street code for preparing illegal drugs on the move.

Safeguarding the Online Experience

The impact of inappropriate language and offensive speech online cannot be overstated. Internet access is ubiquitous and users cover diverse age groups and cultural backgrounds. Making search queries, online communication, and instant messaging safe for children, minorities, and sensitive communities is essential to preserve the integrity of the digital world.

Inappropriate suggestions on search queries or offensive comments on news articles could cause significant harm to vulnerable groups such as children and marginalized communities. Unsuitable suggestions could tarnish the reputation of corporations, inadvertently help someone cause harm to themselves with risky information, or lead to legal complications with authorities and regulators. Problems such as intimidation, threats, cyber bullying, trolling, explicit and suggestive content, and racist overtones need to be curtailed to help keep the Internet open and safely accessible to everyone.

The Solution

Conventional solutions to this problem have typically involved one of the following:

A manually curated list of patterns involving offensive words, phrases and slang
Classical Machine Learning (ML) techniques that use various hand-crafted features (typically words) to learn the intent classifier
Standard off-the-shelf deep learning model architectures such as CNNs, LSTMs or Bi-directional LSTMs (BLSTMs)
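To make the first of these conventional approaches concrete, here is a minimal sketch of a pattern-list filter and the two ways it tends to fail on search queries. The pattern list, the placeholder term `badword` and the function name `is_flagged` are hypothetical and chosen only for illustration; they are not taken from the paper or from any production system.

```python
import re

# Hypothetical, hand-curated pattern list (placeholders, not a real blocklist).
BLOCKED_PATTERNS = [r"\bshould die\b", r"\bbadword\b"]

def is_flagged(query):
    """Return True if any curated pattern matches the lowercased query."""
    text = query.lower()
    return any(re.search(pattern, text) for pattern in BLOCKED_PATTERNS)

# False positive: a benign song title contains a listed phrase.
song_flagged = is_flagged("If I Should Die Tonight lyrics")

# False negative: a simple misspelling slips past the exact pattern.
misspelling_flagged = is_flagged("b4dword pictures")

# The exact spelling is caught as intended.
exact_flagged = is_flagged("badword pictures")

print(song_flagged, misspelling_flagged, exact_flagged)  # prints True False True
```

A list like this has no notion of context, so every new spelling variant or benign use has to be handled by hand.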

In our current work, we propose a novel deep learning architecture called “Convolutional Bi-Directional LSTM (C-BiLSTM)”, which combines the strengths of Convolutional Neural Networks (CNNs) with Bi-directional LSTMs (BLSTMs). Given a query, C-BiLSTM uses a convolutional layer to extract a feature representation for each query word. These representations are fed as input to the BLSTM layer, which captures the sequential patterns in the entire query and outputs a richer representation encoding them. The query representation thus learnt passes through a deep, fully connected network, which predicts the target class: offensive or clean. C-BiLSTM doesn’t rely on hand-crafted features, is trained end-to-end as a single model, and effectively captures both local features and their global semantics.
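The data flow described above (per-word convolutional features, a forward and a backward LSTM pass, then a fully connected classifier) can be sketched as a forward pass in plain Python. This is only an illustration under several assumptions: the layer sizes, the hash-seeded `embed` function, the omission of bias terms, the use of each direction's final hidden state as the query representation, and the untrained random weights are all choices made here for brevity, not details confirmed by the paper.

```python
import math
import random

random.seed(0)  # untrained random weights: this only illustrates the data flow
EMB, FILTERS, HIDDEN, WINDOW = 8, 6, 5, 3  # hypothetical layer sizes

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def embed(word):
    """Stand-in for a learned embedding: a fixed pseudo-random vector per word."""
    rng = random.Random(word)
    return [rng.uniform(-1, 1) for _ in range(EMB)]

def conv_layer(embeddings, filters):
    """Apply a WINDOW-word filter bank around each word: one feature vector per word."""
    pad = [[0.0] * EMB] * (WINDOW // 2)
    padded = pad + embeddings + pad
    features = []
    for i in range(len(embeddings)):
        window = [x for vec in padded[i:i + WINDOW] for x in vec]
        features.append([max(0.0, a) for a in matvec(filters, window)])  # ReLU
    return features

def lstm_pass(inputs, weights):
    """Run one LSTM direction (no biases) and return its final hidden state."""
    h, c = [0.0] * HIDDEN, [0.0] * HIDDEN
    for x in inputs:
        z = matvec(weights, x + h)  # stacked gate pre-activations: i, f, o, g
        i, f, o, g = (z[k * HIDDEN:(k + 1) * HIDDEN] for k in range(4))
        c = [sigmoid(fk) * ck + sigmoid(ik) * math.tanh(gk)
             for ik, fk, gk, ck in zip(i, f, g, c)]
        h = [sigmoid(ok) * math.tanh(ck) for ok, ck in zip(o, c)]
    return h

conv_filters = rand_matrix(FILTERS, WINDOW * EMB)
fwd_weights = rand_matrix(4 * HIDDEN, FILTERS + HIDDEN)
bwd_weights = rand_matrix(4 * HIDDEN, FILTERS + HIDDEN)
dense_weights = rand_matrix(4, 2 * HIDDEN)
output_weights = rand_matrix(1, 4)

def c_bilstm_score(query):
    """Return a score in (0, 1); with trained weights it would mean P(offensive)."""
    embeddings = [embed(word) for word in query.lower().split()]
    features = conv_layer(embeddings, conv_filters)
    forward = lstm_pass(features, fwd_weights)
    backward = lstm_pass(features[::-1], bwd_weights)
    hidden = [max(0.0, a) for a in matvec(dense_weights, forward + backward)]
    return sigmoid(matvec(output_weights, hidden)[0])

score = c_bilstm_score("if i should die tonight lyrics")
print(0.0 < score < 1.0)  # prints True
```

In practice a model like this would be built and trained end-to-end in a deep learning framework; the sketch only shows how each layer's output becomes the next layer's input.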

Applying the technique to 79,041 unique real-world search queries, along with their class labels (inappropriate/clean), showed that this novel approach was significantly more effective than conventional models based on patterns or on classical ML techniques using hand-crafted features. C-BiLSTM also outperformed standard off-the-shelf deep learning models such as CNN, LSTM and BLSTM when applied to the same dataset. Our final C-BiLSTM model achieves a precision of 0.9246, a recall of 0.8251 and an overall F1 score of 0.8720.
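As a quick consistency check, the F1 score is the harmonic mean of precision and recall, so the reported figure can be reproduced from the other two. The variable names below are illustrative only.

```python
# Reported metrics for the final C-BiLSTM model.
precision = 0.9246
recall = 0.8251

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"{f1:.4f}")  # prints 0.8720
```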

Although the paper focuses on detecting offensive terms in Query Auto Completion (QAC) in search engines, the technique can be applied to other online platforms as well. Comments on news articles can be cleaned up, and inappropriate conversations can be flagged as abusive. Trolling can be detected, and a safe search experience can be enabled for children online. This technique could also help make chatbots and autonomous virtual assistants more contextually aware, culturally sensitive, and dignified in their responses. More details about the technique and its implementation can be found in the paper.

Final Thoughts

The deep learning technique detailed in this study could be a precursor to better tools for fighting online vitriol. When APIs based on this system are applied to social media platforms, email services, chat rooms, discussion forums, and search engines, results can be passed through multiple filters to help ensure users are not exposed to offensive content.

Curtailing offensive language can transform the Internet, making it safely accessible to a wider audience. Making the web an open, safe and secure place for ideas and innovations is a core part of our mission to empower everyone, everywhere through the power of technology.

