Abstract
Transformer language models are today’s most accurate models of language processing in the brain. Here, using fMRI-measured brain responses to 1,000 diverse sentences, we develop a GPT-based encoding model and use this model to identify new sentences that are predicted to drive or suppress responses in the human language network. We demonstrate that these model-selected ‘out-of-distribution’ sentences indeed drive and suppress activity of human language areas in new individuals (an 86% increase and a 98% decrease relative to the average response to diverse naturalistic sentences). A systematic analysis of the model-selected sentences reveals that surprisal and well-formedness of linguistic input are key determinants of response strength in the language network. These results establish the ability of brain-aligned models to noninvasively control neural activity in higher-level cortical areas, such as the language network.
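To make the approach concrete, below is a minimal sketch of the encoding-model and sentence-selection logic, assuming ridge regression over language-model sentence embeddings. The feature extractor, data shapes, and candidate pool here are illustrative placeholders, not the study's actual pipeline.

```python
"""Sketch: fit an encoding model from LM features to fMRI responses, then
rank candidate sentences by predicted response. All specifics are assumptions."""
import numpy as np
from sklearn.linear_model import RidgeCV


def sentence_features(sentences, dim=768, seed=0):
    """Placeholder for LM-derived sentence embeddings (e.g., averaged hidden
    states of a GPT-style model). Random vectors keep the sketch runnable."""
    rng = np.random.default_rng(seed)
    return np.stack([rng.standard_normal(dim) for _ in sentences])


# 1) Fit the encoding model: LM features -> language-network fMRI response.
train_sentences = [f"training sentence {i}" for i in range(1000)]  # stand-in for the 1,000 stimuli
X_train = sentence_features(train_sentences)
y_train = np.random.default_rng(1).standard_normal(len(train_sentences))  # stand-in for measured responses

encoder = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X_train, y_train)

# 2) Score a large pool of candidate sentences and keep the extremes:
#    highest predicted response ("drive") and lowest ("suppress").
candidates = [f"candidate sentence {i}" for i in range(50_000)]
predicted = encoder.predict(sentence_features(candidates, seed=2))

drive_idx = np.argsort(predicted)[-10:]    # predicted to maximally drive the network
suppress_idx = np.argsort(predicted)[:10]  # predicted to maximally suppress it

print([candidates[i] for i in drive_idx])
print([candidates[i] for i in suppress_idx])
```

In the abstract's framing, the top- and bottom-ranked sentences would then be presented to new participants to test whether the predicted increases and decreases in language-network activity are actually observed.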
Competing Interest Statement
The authors have declared no competing interest.
Footnotes
Revision of introduction and discussion.