User profiles for Hubert Soyer
Hubert Soyer, DeepMind. Verified email at google.com. Cited by 8519
Progressive neural networks
Learning to solve complex sequences of tasks--while both leveraging transfer and avoiding
catastrophic forgetting--remains a key obstacle to achieving human-level intelligence. The …
IMPALA: Scalable distributed deep-RL with importance weighted actor-learner architectures
In this work we aim to solve a large collection of tasks using a single reinforcement learning
agent with a single set of parameters. A key challenge is to handle the increased amount of …
Learning to reinforcement learn
In recent years deep reinforcement learning (RL) systems have attained superhuman
performance in a number of challenging task domains. However, a major limitation of such …
Learning to navigate in complex environments
Learning to navigate in complex environments with dynamic elements is an important
milestone in developing AI agents. In this work we formulate the navigation question as a …
Vector-based navigation using grid-like representations in artificial agents
Deep neural networks have achieved impressive successes in fields ranging from object
recognition to complex games such as Go [1, 2]. Navigation, however, remains a substantial …
Prefrontal cortex as a meta-reinforcement learning system
Over the past 20 years, neuroscience research on reward-based learning has converged
on a canonical model, under which the neurotransmitter dopamine ‘stamps in’ associations …
Multi-task deep reinforcement learning with PopArt
The reinforcement learning (RL) community has made great strides in designing algorithms
capable of exceeding human performance on specific tasks. These algorithms are mostly …
Grounded language learning in a simulated 3D world
We are increasingly surrounded by artificially intelligent technology that takes decisions and
executes actions on our behalf. This creates a pressing need for general means to …
V-MPO: On-policy maximum a posteriori policy optimization for discrete and continuous control
Some of the most successful applications of deep reinforcement learning to challenging
domains in discrete and continuous control have used policy gradient methods in the on-policy …
Making efficient use of demonstrations to solve hard exploration problems
This paper introduces R2D3, an agent that makes efficient use of demonstrations to solve
hard exploration problems in partially observable environments with highly variable initial …