Hate speech: Detection, Mitigation and Beyond @ICWSM

hate-alert



Abstract

Social media sites such as Twitter and Facebook have connected billions of people and given users the opportunity to share their ideas and opinions instantly. That said, there are also several harmful consequences, such as online harassment, trolling, cyber-bullying, fake news, and hate speech. Of these, hate speech presents a unique challenge because it is deeply ingrained in our society and is often linked with offline violence. Social media platforms rely on local moderators to identify hate speech and take the necessary action, but with the rapid increase in such content, many are turning toward automated hate speech detection and mitigation systems. This shift brings several challenges of its own and is therefore an important avenue for the computational social science community to explore.

Date

Jun 7, 2021 6:30 PM — 8:00 PM

Event

ICWSM Tutorials

Important updates

Slides can be found here
Video of the tutorial can be found here.

Contributions and achievements

Our papers have been accepted at top conferences such as AAAI, WWW, CSCW, ICWSM, and WebSci. Link to the papers here
We have open-sourced our code and datasets under a single GitHub organisation, hate-alert, to support future research in this domain.
We have published several transformer models on huggingface.co. Link to hatealert organisation
The dataset from our recently accepted AAAI paper, “HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection”, is also hosted on Hugging Face datasets.
We also participate in several hate speech shared tasks and have won many of them: hatealert@DLTEACL, hateminers@AMI, hatemonitors@HASOC, and finishing in the top 1% at hatealert@Hatememe detection by Facebook AI.
Notion page containing hate speech papers.

Tutorial Outline

In this translation-style tutorial, we present an exposition of hate speech detection and mitigation in three steps. The following section presents a detailed plan for the tutorial:

Introduction (15 min)- This section will cover the scientific interest in hate speech and various definitions of hate speech. It will also help you understand the outline and what to take away from this tutorial.
Analysis (20 min)- In this section, we analyze the spread of hate speech on online social media platforms such as Twitter, Facebook, and Gab. We observe that hate speech is spreading through online communities at an alarming rate. Hateful users are well connected among themselves and are reaching a wider audience. The situation is more severe on moderation-free platforms such as Gab and BitChute. The targets of such hate vary and include Muslims, Jews, and Africans, among others. This section is further divided into the following parts:
1. Spread of hate speech
2. Effects of hate speech
3. Targets of hate speech
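As a rough illustration of what "well connected among themselves" can mean in the Analysis part, the sketch below compares the edge density inside a flagged group of users with the density of the whole graph. This is only one possible measure and not necessarily the one used in the tutorial; the graph, the user names, and the `flagged` set are hypothetical toy data.

```python
def edge_density(nodes, edges):
    """Fraction of possible undirected edges among `nodes` that appear in `edges`."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Keep only edges with both endpoints inside `nodes`, counting each pair once.
    inside = {
        frozenset((u, v))
        for u, v in edges
        if u in nodes and v in nodes and u != v
    }
    return len(inside) / (n * (n - 1) / 2)


# Toy undirected interaction graph (hypothetical data, for illustration only).
edges = [
    ("a", "b"), ("a", "c"), ("b", "c"),  # a, b, c form a tightly knit group
    ("c", "d"), ("d", "e"), ("e", "f"),  # the rest of the graph is a sparse chain
]
all_users = {user for edge in edges for user in edge}
flagged = {"a", "b", "c"}  # hypothetical set of users flagged as hateful

group_density = edge_density(flagged, edges)
overall_density = edge_density(all_users, edges)
print(f"flagged group: {group_density:.2f}, whole graph: {overall_density:.2f}")
```

A flagged-group density well above the whole-graph density would be consistent with the observation that such users cluster together, although real analyses would need directed edges, much larger graphs, and a proper baseline.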
Detection (20 min)- Hate speech detection is a challenging task. Several datasets are now available, varying by criteria such as language, domain, and modality. Models ranging from simple bag-of-words to complex ones like BERT have been used for the task. Task performance seems to be improving over time; however, issues remain around the generalizability, bias, and explainability of the models. This section is further divided into the following parts:
1. Different datasets
2. Earlier detection models
3. Current detection models (based on transformers)
4. Multimodal and Multilingual hate speech
5. Hate user detection
6. Challenges: Evaluation, Explainability and Bias
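To make the "simple bag-of-words" end of the Detection part concrete, here is a minimal sketch of a bag-of-words Naive Bayes classifier in plain Python. It is an illustrative baseline under stated assumptions, not the organizers' model: the class name `BagOfWordsNB`, the whitespace tokenizer, and the tiny training set (with `GROUP_X` and `SLUR` as placeholder tokens) are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class BagOfWordsNB:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary is every token seen in any class during training.
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_class, best_score = None, -math.inf
        for c in self.classes:
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[c] / total_docs)
            total_words = sum(self.word_counts[c].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore tokens never seen in training
                count = self.word_counts[c][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_class, best_score = c, score
        return best_class


# Hypothetical toy data; GROUP_X and SLUR stand in for real tokens.
train_texts = [
    "GROUP_X are SLUR and should leave",
    "all GROUP_X are SLUR",
    "lovely weather in the park today",
    "we should all visit the park",
]
train_labels = ["hate", "hate", "normal", "normal"]

model = BagOfWordsNB().fit(train_texts, train_labels)
print(model.predict("GROUP_X are SLUR"))
print(model.predict("a walk in the park"))
```

A model like this relies purely on surface tokens, which is one reason the generalizability, bias, and explainability issues listed above arise and why transformer-based models are covered separately.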
Mitigation (20 min)- To deter the spread of hate speech, organizations have adopted several policies. These range from general measures such as deleting posts and/or accounts and shadow banning to softer approaches such as counterspeech. Policies like banning and deletion seem to be effective in some cases, but they raise concerns about violating freedom of speech. Recent research has also started looking into the automated generation of counterspeech.
1. Banning and suspending users
2. Counterspeech detection
3. Counterspeech generation
4. Challenges: Generation pitfalls, Moderation effects
Road to the future (15 min)- We end this tutorial with a summary of the challenges and the road ahead for hate speech research.
1. Summary of challenges
2. Branches and extensions of hate speech
3. Connections to offline violence
4. Guidelines for building better datasets
5. Adapting to newer events and platforms

About the Organizers

Punyajoy Saha is a PhD scholar at the Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur (India). His research interests lie at the nexus of social computing and natural language processing. More about him can be found here.

Binny Mathew is a PhD scholar at the Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur (India). His research interest lies in computational social science and natural language processing. More about him can be found here.

Mithun Das is a PhD scholar at the Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur (India). His research interests lie in computational social science and natural language processing. More about him can be found here.

Pawan Goyal is an Associate Professor at the Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur (India). His research interest lies in natural language processing and text mining. More about him can be found here.

Kiran Garimella is the first IDSS postdoctoral fellow to receive a Hammer Fellowship. He pioneers research into the spread of rumors and misinformation on closed platforms such as WhatsApp, a popular encrypted messaging service with millions of users worldwide. More about him can be found here.

Animesh Mukherjee is an Associate Professor at the Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur (India). His research interest lies in natural language processing, information retrieval and AI and ethics. More about him can be found here.