Everyday interactions often depend on reasoning about space and time: collaborators need to know where events take place – and in what order – to communicate driving directions, build pieces of furniture, or carry out strategic operations in military and sports settings, for example (Núñez & Cooperrider, 2013). A simple set of driving directions may require a listener to interpret and reason about the spatial relations – such as "next to" and "behind" – and the temporal relations – such as "after" and "during" – that a speaker describes. The speaker may also use gestures to substitute for, supplement, or disambiguate linguistic descriptions (Holle & Gunter, 2007; Perzanowski, Schultz, & Williams, 1998). Such rapid, rich, and productive interactions are transient and difficult to analyze behaviorally, and so they pose a challenge for experimenters.