Maarten Speekenbrink (University College London) and Emmanouil Konstantinidis (University College London)
Decision-making in noisy and changing environments requires a fine balance between exploiting knowledge about good courses of action and exploring the environment in order to improve upon this knowledge. We present an experiment in which participants made repeated choices between options for which the average rewards changed over time. Comparing a number of computational models of participants' behaviour in this task, we find evidence that a substantial number of participants balanced exploration and exploitation by considering the probability that an option offers the maximum reward out of all the available options.
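To make the idea of "the probability that an option offers the maximum reward" concrete, the following is a minimal illustrative sketch, not the models fitted in the paper. It assumes the decision maker holds independent Gaussian beliefs about each option's reward (a mean and a standard deviation per option) and estimates each option's chance of being the best by Monte Carlo sampling. The function names `prob_max` and `choose`, and all parameter values, are hypothetical.

```python
import random


def prob_max(means, sds, n_samples=20000, seed=0):
    """Estimate, for each option, the probability that it yields the highest reward.

    Assumption (not from the paper): beliefs about the options are independent
    Gaussians with the given means and standard deviations.
    """
    rng = random.Random(seed)
    wins = [0] * len(means)
    for _ in range(n_samples):
        # Draw one plausible reward per option from the current beliefs.
        draws = [rng.gauss(m, s) for m, s in zip(means, sds)]
        # Credit the option whose draw is largest.
        best = max(range(len(draws)), key=draws.__getitem__)
        wins[best] += 1
    return [w / n_samples for w in wins]


def choose(means, sds, rng):
    """Pick an option with probability equal to its chance of being the best.

    A single joint draw from the beliefs, followed by an argmax, selects each
    option exactly that often.
    """
    draws = [rng.gauss(m, s) for m, s in zip(means, sds)]
    return max(range(len(draws)), key=draws.__getitem__)


if __name__ == "__main__":
    # Option 0 looks better on average; option 1 is far more uncertain.
    probs = prob_max(means=[1.0, 0.0], sds=[0.1, 5.0])
    print(probs)
    print(choose([1.0, 0.0], [0.1, 5.0], random.Random(1)))
```

Under these assumptions, an option with a lower expected reward but high uncertainty still receives a sizeable share of choices, which is one way exploration can arise without a separate exploration bonus.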
Uncertainty and exploration in a restless bandit task