This research addresses how meaningful interpretations of real-world perceptual stimuli are generated. According to a widespread framework we will call the features-first view, a stimulus is initially encoded via semantically laden, symbol-like properties that are compared with stored category representations to find the best match. Alternative theoretical perspectives challenge the features-first view, but there has been no direct empirical test. In our experiment, participants were shown photographic images of everyday objects and asked to judge as quickly as possible whether a verbal descriptor matched the picture. We varied the delay between image and descriptor and found that basic-level category labels were verified faster than clearly manifested descriptions of physical or functional properties. Thus, people know the category of a stimulus before they know its semantic properties. The present evidence suggests that the category is used to derive a property-level description of the meaning of the stimulus, not vice versa.