Google Scholar

User profiles for Hubert Soyer

Hubert Soyer

DeepMind

Verified email at google.com

Cited by 8519

Progressive neural networks

…, NC Rabinowitz, G Desjardins, H Soyer… - arXiv preprint arXiv …, 2016 - arxiv.org

Learning to solve complex sequences of tasks--while both leveraging transfer and avoiding
catastrophic forgetting--remains a key obstacle to achieving human-level intelligence. The …

Cited by 2714

IMPALA: Scalable distributed deep-RL with importance weighted actor-learner architectures

L Espeholt, H Soyer, R Munos… - International …, 2018 - proceedings.mlr.press

In this work we aim to solve a large collection of tasks using a single reinforcement learning
agent with a single set of parameters. A key challenge is to handle the increased amount of …

Cited by 1520

Learning to reinforcement learn

JX Wang, Z Kurth-Nelson, D Tirumala, H Soyer… - arXiv preprint arXiv …, 2016 - arxiv.org

In recent years deep reinforcement learning (RL) systems have attained superhuman
performance in a number of challenging task domains. However, a major limitation of such …

Cited by 993

Learning to navigate in complex environments

P Mirowski, R Pascanu, F Viola, H Soyer… - arXiv preprint arXiv …, 2016 - arxiv.org

Learning to navigate in complex environments with dynamic elements is an important
milestone in developing AI agents. In this work we formulate the navigation question as a …

Cited by 937

Vector-based navigation using grid-like representations in artificial agents

…, MJ Chadwick, T Degris, J Modayil, G Wayne, H Soyer… - Nature, 2018 - nature.com

Deep neural networks have achieved impressive successes in fields ranging from object
recognition to complex games such as Go [1, 2]. Navigation, however, remains a substantial …

Cited by 690

Prefrontal cortex as a meta-reinforcement learning system

…, Z Kurth-Nelson, D Kumaran, D Tirumala, H Soyer… - Nature …, 2018 - nature.com

Over the past 20 years, neuroscience research on reward-based learning has converged
on a canonical model, under which the neurotransmitter dopamine ‘stamps in’ associations …

Cited by 604

Multi-task deep reinforcement learning with PopArt

M Hessel, H Soyer, L Espeholt, W Czarnecki… - Proceedings of the …, 2019 - ojs.aaai.org

The reinforcement learning (RL) community has made great strides in designing algorithms
capable of exceeding human performance on specific tasks. These algorithms are mostly …

Cited by 287

Grounded language learning in a simulated 3D world

…, F Hill, S Green, F Wang, R Faulkner, H Soyer… - arXiv preprint arXiv …, 2017 - arxiv.org

We are increasingly surrounded by artificially intelligent technology that takes decisions and
executes actions on our behalf. This creates a pressing need for general means to …

Cited by 279

V-MPO: On-policy maximum a posteriori policy optimization for discrete and continuous control

…, JT Springenberg, A Clark, H Soyer… - arXiv preprint arXiv …, 2019 - arxiv.org

Some of the most successful applications of deep reinforcement learning to challenging
domains in discrete and continuous control have used policy gradient methods in the on-policy …

Cited by 103

Making efficient use of demonstrations to solve hard exploration problems

…, B Shahriari, M Denil, M Hoffman, H Soyer… - arXiv preprint arXiv …, 2019 - arxiv.org

This paper introduces R2D3, an agent that makes efficient use of demonstrations to solve
hard exploration problems in partially observable environments with highly variable initial …

Cited by 84
