Online Hate Detection (& detection evasion techniques) – a challenging area for human rights action


March 7, 2019

Monitoring of hate speech on social media is a challenging area of human rights action. There are various approaches relying on human and technical monitoring mechanisms. Current approaches appear to have some serious limitations, given the scope and complexity of the problem.

A 2018 European study by Tommi Gröndahl and his research colleagues examined hate speech detection mechanisms across several social media platforms. They found that some current technical approaches to monitoring online hate can be easily evaded with simple techniques such as inserting typos or adding non-hateful words. The authors suggest that, to improve accuracy, successful hate speech detection mechanisms should focus on the type of data used as well as the labeling criteria.
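To make the evasion problem concrete, here is a minimal Python sketch of a toy detector. It is an illustration only and is not the method or any of the models examined in the study: the placeholder term `slurword`, the function names and the 0.2 threshold are all assumptions invented for this example. The toy detector flags a message when the share of flagged words reaches the threshold, which is enough to show how both a typo and padding with harmless words can slip past it.

```python
import re

# Hypothetical placeholder vocabulary (an assumption for illustration).
# Real systems typically learn from labeled data rather than a fixed list.
FLAGGED_TERMS = {"slurword"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def flagged_ratio(text):
    """Return the share of tokens that exactly match a flagged term."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in FLAGGED_TERMS)
    return hits / len(tokens)


def is_flagged(text, threshold=0.2):
    """Flag a message when the flagged share reaches the (assumed) threshold."""
    return flagged_ratio(text) >= threshold


original = "you are a slurword"
with_typo = "you are a slurw0rd"                  # one character swapped
padded = original + " love love love love"        # harmless words appended

for message in (original, with_typo, padded):
    print(f"{message!r}: flagged={is_flagged(message)}")
```

In this sketch only the first message is flagged. The typo defeats the exact word match, and the padding pushes the flagged share below the threshold, even though a human reader would understand all three messages the same way.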

Taking care of workers in this field is also an emerging human rights and labour issue. A recent story in The Verge by Casey Newton, “The Trauma Floor”, describes the trauma, mental health issues and work conditions of employees hired to carry out content review for Facebook. (Warning: The story contains some disturbing content but can be found here.)

Dialogue about what constitutes hate speech, and how to identify it online, needs to continue, as do efforts to document the harms of hate speech. A 2018 article by researchers at the University of Warsaw (Wiktor Soral, Michał Bilewicz and Mikołaj Winiewski) shows the importance of efforts to study and mitigate those harms. The journal article, titled “Exposure to hate speech increases prejudice through desensitization”, describes the results of three studies. A summary appears in a story in the Pacific Standard. The authors concluded that “repetitive exposure to hate speech leads to desensitization to this form of verbal violence and subsequently to lower evaluations of the victims and greater distancing, thus increasing outgroup prejudice”. Studies such as this will help support informed policy decisions on monitoring and mitigation measures as well as legal sanctions.

A recent Canadian survey found that 60 per cent of Canadians report having seen hate speech on social media. Late last fall, Statistics Canada reported that police-reported hate crimes are up significantly in Canada (see November 29, 2018 Globe and Mail story here).

In Canada and around the world, institutions, academics and experts are collaborating on, and sharing, strategies for combatting hate speech and hate crime. An example is the International Network for Hate Studies. Its website provides an online library and other resources and information about research aimed at understanding hate and hate crime. In Canada, a group of experts and activists have formed the Canadian Anti-Hate Network.

In 2018, the UN Committee on the Elimination of Racial Discrimination (CERD) called on Canada to develop a national action plan to combat racism. In its Concluding Observations, CERD recommended that Canada “develop and launch a new National Action Plan Against Racism, in compliance with its obligations under the World Conference Against Racism, through meaningful consultations process with civil society organizations, including ethnic minorities and Indigenous Peoples, which includes implementing legislation, dedicated resources, targets, and adequate monitoring and reporting mechanisms, using good practices mentioned in Ontario’s anti-racism strategy of 2017.”

The federal government is still considering the shape of a national anti-racism strategy. We should expect the challenge of monitoring and regulating hate speech and hate crime to be a part of such a strategy.

The impacts of online hate compound the multi-generational impacts of discrimination for many equality-seeking communities in Canada. Partners in the human rights community will continue to collectively challenge hate narratives that violate fundamental rights to live free of discrimination and its harms.

Wendy Moss