Zipfian word frequencies support statistical word segmentation

Abstract

Word frequencies in natural language follow a Zipfian distribution. However, artificial language experiments meant to simulate language acquisition generally use uniform word frequency distributions. In the present study, we examine whether a Zipfian frequency distribution influences adult learners' word segmentation performance. Using two experimental paradigms (a forced-choice task and an orthographic segmentation task), we show that human statistical learning abilities are robust enough to identify words from exposures with widely varying frequency distributions. Additionally, we report a facilitatory effect of Zipfian distributions on word segmentation performance in the orthographic segmentation task, both in segmenting trained material and in generalizing to novel material. Zipfian distributions increase the chances for learners to apply their knowledge in processing a new speech stream.

