Date of Defense
23-4-2024 11:30 AM
Location
F1-1164
Document Type
Thesis Defense
Degree Name
Master of Science in Electrical Engineering (MSEE)
College
College of Engineering
Department
Electrical Engineering
First Advisor
Dr. Qurban Ali Andal Memon
Abstract
Sign language recognition research aims to develop systems and tools that can interpret and translate sign language into text or spoken language. Over the past two decades, the challenges faced in this domain have been multifaceted, and this thesis addresses some of them through several approaches. The foremost challenge addressed is low recognition accuracy, which this work tackles with a transformer-based deep learning architecture combined with preprocessing steps that include augmentations and transformations. These augmentations and transformations increased the size of the dataset. Specifically, in-house signs were recorded by different persons to obtain initial results; the generated video frames, which captured facial expressions and the fingers of both hands, were later stacked. The model was then validated on generic sign languages to confirm the improvement in accuracy. To produce results, the model was trained and tested on a set of frames, and comparisons with existing works are tabulated.
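For illustration, the pipeline the abstract describes (per-frame augmentation, stacking frames into a sequence, and a transformer-based classifier) could be sketched as below in PyTorch. This is a minimal sketch under stated assumptions: every class name, dimension, and hyperparameter here is illustrative and is not taken from the thesis.

import torch
import torch.nn as nn
from torchvision import transforms

# Hypothetical augmentation/transformation step: flips, small rotations, and
# resizing applied to frames to enlarge the training data, as the abstract suggests.
frame_augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(degrees=10),
    transforms.Resize((224, 224)),
])

class SignTransformer(nn.Module):
    # Illustrative transformer encoder over a stacked sequence of frame embeddings.
    def __init__(self, num_classes: int, d_model: int = 256, num_frames: int = 16):
        super().__init__()
        # Per-frame embedding: flatten each RGB frame and project it to d_model.
        self.embed = nn.Linear(3 * 224 * 224, d_model)
        # Learned positional encoding over the temporal (frame) axis.
        self.pos = nn.Parameter(torch.zeros(1, num_frames, d_model))
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, num_frames, 3, 224, 224) -- the "stacked" frame sequence.
        b, t = frames.shape[:2]
        x = self.embed(frames.flatten(2)) + self.pos[:, :t]
        x = self.encoder(x)               # attend across the frame sequence
        return self.head(x.mean(dim=1))   # pool over time, classify the sign

# Usage sketch: augment one 16-frame clip, then classify a small batch.
clip = frame_augment(torch.randn(16, 3, 224, 224))
model = SignTransformer(num_classes=50)
logits = model(torch.randn(2, 16, 3, 224, 224))   # -> shape (2, 50)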
Title
TRANSFORMER BASED DEEP LEARNING MODEL FOR SIGN LANGUAGE RECOGNITION