We quantified the multiscale clustering in the acoustics of infant prelinguistic vocalizations, its development across the first two years of life, and its relationship to caregivers’ vocalizations in fifteen infant-caregiver dyads. The multiscale structure of infant utterances, spanning from seconds to over one hour, was similar to that of caregivers, providing new evidence for the multiscale clustering of prelinguistic utterances. Additionally, we found that the multiscale clustering statistics of infant and caregiver vocalizations matched even when controlling for other vocalization properties such as volubility, and that matching was greater for infant speechlike vocalizations than for non-speechlike vocalizations. The matching of multiscale clustering statistics between two people is termed complexity matching. We discuss the notion of hierarchical rhythmic organization in infant language development. Complexity matching sheds new light on the coupling between infant and caregiver vocalizations and may help to advance current theories of infant speech development.