Date of Defense

7-11-2024 10:00 AM

Location

E1-1021

Document Type

Dissertation Defense

Degree Name

Doctor of Philosophy (PhD)

College

CIT

Department

Information Security

First Advisor

Dr. Marton Gergely

Keywords

Social media, online gaming, artificial intelligence, natural language processing, large language models, content moderation, hate speech, harassment.

Abstract

With the increase in popularity of online communities, such as social media platforms, online games, and chatroom servers, there is a need to improve chat and content moderation. Platforms have reported an increase in the prevalence of toxic behavior and hate speech. Meanwhile, moderators report difficulties in keeping up with the amount of data to review, as well as the type of content they are exposed to, which further harms their own mental health. The main objective of this work is to address the challenges that exist within online communities given the rising prevalence of hate speech, while lifting some of the burden of reviewing incidents and applying accountability off human moderators. Leveraging the evolution of Large Language Models (LLMs), a Natural Language Processing (NLP)-based solution is prototyped that monitors chat messages passing through a server and classifies them according to their content. To do this, several LLMs are selected based on the reviewed literature and tested against a dataset containing chat messages sourced from various platforms. The best-performing LLM is then selected to run on a simulated chat server, where it monitors chat messages and maintains a record of classifications. The model comparison showed that DistilBERT, trained and evaluated on two separate datasets dedicated to training and testing, performed the best with an accuracy of 79%, higher than the other models achieved. When embedded server-side within the chat simulation, the model maintained its 79% accuracy when tested again on a separate test set containing 1,000 entries. These results suggest that leveraging LLMs for content moderation is promising, as demonstrated by the prototype moderator presented in this work. With further optimization and more data, the model's performance can improve further and adapt to the environment it is monitoring.
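
As a rough illustration of the server-side moderation step described above, the sketch below loads a DistilBERT text-classification model through the Hugging Face transformers pipeline, flags incoming chat messages, and keeps a log of classifications. The checkpoint, label names, and threshold are stand-ins (a publicly available sentiment model rather than the toxicity classifier trained in this work), so the snippet shows the shape of the approach rather than the dissertation's actual implementation.

# Illustrative sketch only: a DistilBERT checkpoint fine-tuned for sentiment
# (POSITIVE/NEGATIVE) stands in for the toxicity classifier described in the
# abstract; the pipeline usage and logging pattern are what the sketch shows.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",  # stand-in checkpoint
)

moderation_log = []  # server-side record of classifications

def moderate(message: str, threshold: float = 0.8) -> bool:
    """Classify one chat message and return True if it should be flagged."""
    result = classifier(message)[0]  # e.g. {"label": "NEGATIVE", "score": 0.97}
    flagged = result["label"] == "NEGATIVE" and result["score"] >= threshold
    moderation_log.append({"message": message, **result, "flagged": flagged})
    return flagged

# Example: run a couple of messages through the moderator.
for msg in ["good game everyone", "you are worthless, quit playing"]:
    print(msg, "->", "flagged" if moderate(msg) else "ok")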

Title

TACKLING TOXICITY AND HARASSMENT IN ONLINE ENVIRONMENTS THROUGH THE USE OF ARTIFICIAL INTELLIGENCE
