Scalable ASL sign recognition using model-based machine learning and linguistically annotated corpora

Files
18005_Paper.pdf (852.42 KB)
Published version
Date
2018-05-12
Authors
Metaxas, Dimitris
Dilsizian, Mark
Neidle, Carol
Version
OA Version
Citation
Dimitris Metaxas, Mark Dilsizian, and Carol Neidle. 2018. "Scalable ASL Sign Recognition using Model-based Machine Learning and Linguistically Annotated Corpora." In Proceedings of the 8th Workshop on the Representation & Processing of Sign Languages: Involving the Language Community, Language Resources and Evaluation Conference (LREC 2018). Miyazaki, Japan, 2018-05-12.
Abstract
We report on the high success rates of our new, scalable, computational approach to sign recognition from monocular video, which exploits linguistically annotated ASL datasets with multiple signers. We recognize signs using a hybrid framework that combines state-of-the-art learning methods with features based on what is known about the linguistic composition of lexical signs. We model and recognize the sub-components of sign production, with attention to handshape, orientation, location, and motion trajectories, as well as non-manual features, and we combine these within a Conditional Random Field (CRF) framework. This makes the sign recognition problem robust and scalable, and feasible with smaller datasets than purely data-driven methods require. On a 350-sign vocabulary of isolated, citation-form lexical signs from the American Sign Language Lexicon Video Dataset (ASLLVD), including both 1- and 2-handed signs, we achieve a top-1 accuracy of 93.3% and a top-5 accuracy of 97.9%. The high probability that the correct sign appears among our top 5 candidates opens the door to practical applications: a sign lookup tool could present the user with 5 candidate signs, in decreasing order of likelihood, and ask the user to select the intended one.
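As a rough illustration of the top-5 lookup idea described in the abstract, the sketch below shows a minimal, hypothetical ranking function: it assumes per-sign scores are available for each modeled sub-component (handshape, orientation, location, movement, non-manual features), combines them under a naive independence assumption, and returns the 5 best candidates. This is not the paper's CRF implementation; the function name, the component labels, and the scoring scheme are illustrative stand-ins.

import math
from typing import Dict, List, Tuple

# Sub-components of sign production modeled in the paper; the labels
# here are illustrative placeholders.
COMPONENTS = ["handshape", "orientation", "location", "movement", "nonmanual"]

def rank_signs(component_scores: Dict[str, Dict[str, float]],
               vocabulary: List[str],
               k: int = 5) -> List[Tuple[str, float]]:
    """Return the k most likely signs, best first, by combined log-score.

    component_scores maps each component name to a dict of
    {sign: P(observed evidence for that component | sign)}.
    """
    ranked = []
    for sign in vocabulary:
        # Sum per-component log-probabilities: a naive independence
        # assumption standing in for the paper's CRF, which instead
        # models interactions among the sub-components.
        score = sum(math.log(component_scores[c].get(sign, 1e-12))
                    for c in COMPONENTS)
        ranked.append((sign, score))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

A lookup tool built on such a ranking would display the 5 returned candidates in decreasing order of likelihood and let the user pick the intended sign, matching the usage scenario sketched in the abstract.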
License
Attribution-NonCommercial 4.0 International