# Solving nonogram puzzles by reinforcement learning

- Frederic Dandurand,
*Université de Montréal*
- Denis Cousineau,
*Université d'Ottawa*
- Thomas R. Shultz,
*McGill University*

## Abstract

We study solvers of nonogram puzzles, which are good examples of
constraint-satisfaction problems. Given an optimal module for solving a single
line, we compare the performance of three algorithmic solvers for selecting the
order in which to solve lines with that of a reinforcement-learning (RL) solver.
The RL solver uses the reduction of distance to the goal as its reward. We
compare two methods for storing the qualities (Q values) of state-action pairs:
a lookup table and a connectionist function approximator. We find that RL
solvers learn near-optimal solutions that also outperform a heuristic solver
based on the explicit, general rules often given to nonogram players. Only RL
solvers that use a connectionist function approximator generalize their
knowledge, generating good solutions on about half of unseen problems; RL
solvers based on lookup tables generalize to none of these untrained problems.
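The tabular variant described above can be sketched as a standard Q-learning update in which the reward is the reduction of distance to the goal. This is a minimal illustration, not the authors' implementation; the state encoding, action set, and parameter values (`alpha`, `gamma`) here are assumptions.

```python
from collections import defaultdict

def q_update(q, state, action, next_state, next_actions,
             dist_before, dist_after, alpha=0.1, gamma=0.9):
    """One Q-learning update for a (state, action) pair stored in a
    lookup table; reward = reduction of distance to the goal."""
    reward = dist_before - dist_after
    # Value of the best action available from the successor state
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    q[(state, action)] += alpha * (reward + gamma * best_next
                                   - q[(state, action)])
    return q[(state, action)]

# Toy usage: choosing to solve line 2 reduced distance to goal from 5 to 3
q = defaultdict(float)
q_update(q, state="s0", action=2, next_state="s1",
         next_actions=[0, 1], dist_before=5, dist_after=3)
```

Replacing the lookup table `q` with a trained function approximator that maps state-action features to Q values is what allows generalization to unseen puzzles.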
