Can Automated Gesture Recognition Support the Study of Child Language Development?

Abstract

Children’s prelinguistic gestures play a central role in their communicative development. Early gesture use has been shown to be predictive of both concurrent and later language ability, making the identification of gestures in video data at scale a potentially valuable tool for both theoretical and clinical purposes. We describe a new dataset consisting of videos of 72 infants interacting with their caregivers at 11 and 12 months, annotated for the appearance of 12 different gesture types. We propose a model based on deep convolutional neural networks to classify these gestures. The model achieves 48.32% classification accuracy overall, but with significant variation between gesture types. Critically, we found strong (0.7 or above) rank order correlations between by-child gesture counts from human and machine coding for 7 of the 12 gestures (including the key gestures of declarative pointing, hold outs and gives). Given the challenging nature of the data (recordings of many different dyads in different environments engaged in diverse activities), we consider these results a very encouraging first attempt at the task, and evidence that automatic or machine-assisted gesture identification could make a valuable contribution to the study of cognitive development.
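To make the by-child comparison concrete, the following is a minimal sketch of a rank order correlation between human and machine gesture counts. It assumes Spearman's coefficient with average ranks for ties; the abstract does not say which rank correlation was used. The function names and the counts are hypothetical and are not data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    if len(x) != len(y):
        raise ValueError("both coders must provide one count per child")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all counts are equal")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical per-child counts of one gesture type (illustrative only).
human_counts = [3, 0, 7, 2, 5, 1]
machine_counts = [4, 1, 6, 1, 9, 2]
rho = spearman(human_counts, machine_counts)
print(f"rank order correlation: {rho:.2f}")
```

A rank-based measure asks whether the machine orders children by gesture frequency in the same way as human coders, which can hold even when per-instance classification accuracy is modest.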
