User profiles for Hubert Soyer

Hubert Soyer

DeepMind
Verified email at google.com
Cited by 8519

Progressive neural networks

…, NC Rabinowitz, G Desjardins, H Soyer… - arXiv preprint arXiv …, 2016 - arxiv.org
Learning to solve complex sequences of tasks--while both leveraging transfer and avoiding
catastrophic forgetting--remains a key obstacle to achieving human-level intelligence. The …

IMPALA: Scalable distributed deep-RL with importance weighted actor-learner architectures

L Espeholt, H Soyer, R Munos… - International …, 2018 - proceedings.mlr.press
In this work we aim to solve a large collection of tasks using a single reinforcement learning
agent with a single set of parameters. A key challenge is to handle the increased amount of …

Learning to reinforcement learn

JX Wang, Z Kurth-Nelson, D Tirumala, H Soyer… - arXiv preprint arXiv …, 2016 - arxiv.org
In recent years deep reinforcement learning (RL) systems have attained superhuman
performance in a number of challenging task domains. However, a major limitation of such …

Learning to navigate in complex environments

P Mirowski, R Pascanu, F Viola, H Soyer… - arXiv preprint arXiv …, 2016 - arxiv.org
Learning to navigate in complex environments with dynamic elements is an important
milestone in developing AI agents. In this work we formulate the navigation question as a …

Vector-based navigation using grid-like representations in artificial agents

…, MJ Chadwick, T Degris, J Modayil, G Wayne, H Soyer… - Nature, 2018 - nature.com
Deep neural networks have achieved impressive successes in fields ranging from object
recognition to complex games such as Go [1, 2]. Navigation, however, remains a substantial …

Prefrontal cortex as a meta-reinforcement learning system

…, Z Kurth-Nelson, D Kumaran, D Tirumala, H Soyer… - Nature …, 2018 - nature.com
Over the past 20 years, neuroscience research on reward-based learning has converged
on a canonical model, under which the neurotransmitter dopamine ‘stamps in’ associations …

Multi-task deep reinforcement learning with PopArt

M Hessel, H Soyer, L Espeholt, W Czarnecki… - Proceedings of the …, 2019 - ojs.aaai.org
The reinforcement learning (RL) community has made great strides in designing algorithms
capable of exceeding human performance on specific tasks. These algorithms are mostly …

Grounded language learning in a simulated 3D world

…, F Hill, S Green, F Wang, R Faulkner, H Soyer… - arXiv preprint arXiv …, 2017 - arxiv.org
We are increasingly surrounded by artificially intelligent technology that takes decisions and
executes actions on our behalf. This creates a pressing need for general means to …

V-MPO: On-policy maximum a posteriori policy optimization for discrete and continuous control

…, JT Springenberg, A Clark, H Soyer… - arXiv preprint arXiv …, 2019 - arxiv.org
Some of the most successful applications of deep reinforcement learning to challenging
domains in discrete and continuous control have used policy gradient methods in the on-policy …

Making efficient use of demonstrations to solve hard exploration problems

…, B Shahriari, M Denil, M Hoffman, H Soyer… - arXiv preprint arXiv …, 2019 - arxiv.org
This paper introduces R2D3, an agent that makes efficient use of demonstrations to solve
hard exploration problems in partially observable environments with highly variable initial …