n-task Learning: Solving Multiple or Unknown Numbers of Reinforcement Learning Problems

Abstract

Temporal difference (TD) learning models can perform poorly when the optimal policy cannot be determined solely from sensory input. Converging evidence from studies of working memory suggests that humans form abstract mental representations that align with significant features of a task, allowing such conditions to be overcome. The n-task learning algorithm (nTL) extends TD models by using abstract representations to form multiple policies around a common set of external inputs. These external inputs are combined conjunctively with an abstract input that comes to represent attention to a task. nTL is used to solve a dynamic categorization problem marked by frequently alternating tasks. The algorithm learns the correct number of tasks, as well as when to switch from one task representation to another, even when inputs are identical across all tasks. Task performance is shown to be optimal only when an appropriate number of abstract representations is used.
