%0 Journal Article
%A Erik J Peterson
%A Timothy D Verstynen
%T A way around the exploration-exploitation dilemma
%D 2019
%R 10.1101/671362
%J bioRxiv
%P 671362
%X For all animals the decision to explore comes with a risk of getting less. For example, a foraging bee might find less nectar, or a hunting hawk less prey. This loss is often formalized as regret. It’s been mathematically proven that exploring an uncertain world with a specific goal always has some regret. This is why exploration-exploitation can be a dilemma. Given this proof, we wondered if the common advice to “focus on learning and not the goal” might have mathematical merit. So we re-imagined exploration in the dilemma as an open-ended search for any new information. We then developed a new minimal description of information value, which generalizes existing ideas like curiosity, novelty and information gain. We use this description to model the dilemma as a competition between strategies that maximize reward and information independently. Here we prove this competition has a no-regret solution. When we study this solution in simulation – using classic bandit tasks – it outperforms standard approaches, especially when rewards are sparse.
%U https://www.biorxiv.org/content/biorxiv/early/2019/11/06/671362.full.pdf