Neural networks have been used extensively to model perceptual learning and the effects of discrimination training on generalization, as well as to explore natural classification mechanisms. Here we assess how well existing models account for the time course of the generalization shifts that occur when individuals learn to distinguish sounds. A set of simulations demonstrates that commonly used single-layer networks do not predict transitory shifts in generalization over the course of training, but that such dynamics can be accounted for when the output functions of these networks are modified to mimic the properties of cortical tuning curves. The simulations further suggest that careful selection of training criteria can allow for more precise predictions of learning-related shifts in generalization gradients in behavioral experiments.