Title

TRANSFORMER BASED DEEP LEARNING MODEL FOR SIGN LANGUAGE RECOGNITION

Date of Defense

April 23, 2024, 11:30 AM

Location

F1-1164

Document Type

Thesis Defense

Degree Name

Master of Science in Electrical Engineering (MSEE)

College

College of Engineering

Department

Electrical Engineering

First Advisor

Dr. Qurban Ali Andal Memon

Abstract

Sign language recognition research aims to develop systems and tools that can interpret and translate sign language into text or spoken language. Over the past two decades, the challenges faced in this domain have been multifaceted, and this thesis addresses some of them through various approaches. The foremost challenge addressed is the reduction in recognition accuracy, which is tackled using a transformer-based deep learning architecture together with preprocessing steps that include augmentations and transformations; these augmentations and transformations helped increase the size of the dataset. Specifically, in-house signs were recorded by different persons to obtain initial results. The generated video frames captured facial expressions and the fingers of both hands, and the frames were later stacked. The model was subsequently validated on generic sign language datasets to demonstrate the improvement in accuracy. To produce results, the model was trained and tested on a set of frames, and comparisons with existing works are tabulated.
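
The pipeline outlined in the abstract (augmenting video frames, stacking them into a clip, and classifying the clip with a transformer) can be illustrated with a minimal sketch. The code below is an assumption-laden illustration rather than the author's implementation: the frame size, number of classes, model depth, and augmentation choices are all placeholders.

```python
# Minimal sketch of the kind of pipeline the abstract describes:
# augment video frames, stack them into a clip tensor, and classify
# the clip with a transformer encoder. All hyperparameters here
# (frame size, number of classes, depth) are illustrative assumptions,
# not values taken from the thesis.
import torch
import torch.nn as nn
import torchvision.transforms as T

# Augmentations/transformations that enlarge the effective dataset.
augment = T.Compose([
    T.RandomHorizontalFlip(p=0.5),
    T.RandomRotation(degrees=10),
    T.ColorJitter(brightness=0.2, contrast=0.2),
])

class SignTransformer(nn.Module):
    """Classify a stack of video frames with a transformer encoder."""
    def __init__(self, num_classes=50, d_model=256, num_frames=16):
        super().__init__()
        # Embed each 3x224x224 frame into a single d_model-dim token.
        self.frame_embed = nn.Sequential(
            nn.Conv2d(3, d_model, kernel_size=16, stride=16),  # patchify
            nn.AdaptiveAvgPool2d(1),   # pool patches to one token per frame
            nn.Flatten(1),
        )
        self.pos = nn.Parameter(torch.zeros(1, num_frames, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, clip):  # clip: (batch, frames, 3, H, W)
        b, f = clip.shape[:2]
        tokens = self.frame_embed(clip.flatten(0, 1)).view(b, f, -1)
        encoded = self.encoder(tokens + self.pos)  # attend across frames
        return self.head(encoded.mean(dim=1))      # average over frames

# Stack augmented frames into one clip and classify it.
frames = torch.stack([augment(torch.rand(3, 224, 224)) for _ in range(16)])
logits = SignTransformer()(frames.unsqueeze(0))  # shape: (1, num_classes)
```

Treating each frame as one token keeps the sequence short (one token per frame) so the encoder attends across time; a full video transformer would instead attend over spatial patches as well, at much higher cost.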
