Date of Defense

November 12, 2024, 1:30 PM

Location

E1-1012

Document Type

Thesis Defense

Degree Name

Master of Science in Information Security

College

CIT

Department

Information Security

First Advisor

Prof. Mohammad Mehedy Masud

Keywords

Large Language Models, NLP, Security, Risk Rating, OWASP, Fine-Tuning, Defense Mechanisms

Abstract

This thesis examines the use of Large Language Models (LLMs) in education, focusing on improving performance and implementing strong security measures. The research has two main goals: developing an effective lecture summarization technique using LLMs, and identifying and addressing security vulnerabilities in LLM applications according to OWASP (Open Web Application Security Project) guidelines. For the first goal, we propose a framework for fine-tuning LLMs on real lecture datasets and compare the performance of different LLMs. For the second goal, we conduct a thorough review of the application dataflow of the proposed framework and identify several vulnerabilities, which we categorize as high, medium, or low risk. We also propose countermeasures to these vulnerabilities and demonstrate their efficacy. The study thus offers a framework for securely integrating LLMs into educational applications, addressing critical security concerns while harnessing the models' efficiency.

Nov 12th, 1:30 PM

A SECURE AND EFFECTIVE FRAMEWORK FOR KEY CONCEPT MINING FROM EDUCATIONAL CONTENT USING LARGE LANGUAGE MODELS
