RT Journal Article
SR Electronic
T1 Reconstructing the cascade of language processing in the brain using the internal computations of a transformer-based language model
JF bioRxiv
FD Cold Spring Harbor Laboratory
SP 2022.06.08.495348
DO 10.1101/2022.06.08.495348
A1 Sreejan Kumar
A1 Theodore R. Sumers
A1 Takateru Yamakoshi
A1 Ariel Goldstein
A1 Uri Hasson
A1 Kenneth A. Norman
A1 Thomas L. Griffiths
A1 Robert D. Hawkins
A1 Samuel A. Nastase
YR 2022
UL http://biorxiv.org/content/early/2022/06/09/2022.06.08.495348.abstract
AB Piecing together the meaning of a narrative requires understanding not only the individual words but also the intricate relationships between them. How does the brain construct this kind of rich, contextual meaning from natural language? Recently, a new class of artificial neural networks—based on the Transformer architecture—has revolutionized the field of language modeling. Transformers integrate information across words via multiple layers of structured circuit computations, forming increasingly contextualized representations of linguistic content. In this paper, we deconstruct these circuit computations and analyze the associated “transformations” (alongside the more commonly studied “embeddings”) at each layer to provide a fine-grained window onto linguistic computations in the human brain. Using functional MRI data acquired while participants listened to naturalistic spoken stories, we find that these transformations capture a hierarchy of linguistic computations across cortex, with transformations at later layers in the model mapping onto higher-level language areas in the brain. We then decompose these transformations into individual, functionally specialized “attention heads” and demonstrate that the emergent syntactic computations performed by individual heads correlate with predictions of brain activity in specific cortical regions. These heads fall along gradients corresponding to different layers, contextual distances, and syntactic dependencies in a low-dimensional cortical space. Our findings provide a new basis for using the internal structure of large language models to better capture the cascade of cortical computations that support natural language comprehension. Competing Interest Statement: The authors have declared no competing interest.
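
The abstract distinguishes layer-wise “embeddings” from the headwise “transformations” produced by each attention sublayer. As a rough illustration of how such features might be extracted from a pretrained transformer, the sketch below uses the Hugging Face transformers library with a BERT model; the model choice, variable names, and the mapping of “transformations” to per-head attention outputs are assumptions made for illustration, not a reproduction of the authors' pipeline.

```python
# Illustrative sketch (not the authors' code): capture layer-wise hidden states
# ("embeddings") and split each attention sublayer's output into per-head
# contributions ("transformations") using forward hooks on a pretrained BERT.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

num_heads = model.config.num_attention_heads
head_dim = model.config.hidden_size // num_heads

# Each hook stores the attention sublayer output (before the output projection),
# reshaped so the concatenated feature axis is split back into individual heads.
captured = {}

def make_hook(layer_idx):
    def hook(module, inputs, outputs):
        context = outputs[0]  # (batch, seq_len, hidden_size)
        b, t, _ = context.shape
        captured[layer_idx] = context.view(b, t, num_heads, head_dim)
    return hook

hooks = [
    layer.attention.self.register_forward_hook(make_hook(i))
    for i, layer in enumerate(model.encoder.layer)
]

with torch.no_grad():
    inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
    out = model(**inputs)

for h in hooks:
    h.remove()

embeddings = out.hidden_states  # tuple: (num_layers + 1) x (batch, seq, hidden)
transformations = captured      # dict: layer index -> (batch, seq, heads, head_dim)
print(len(embeddings), transformations[0].shape)
```

In an encoding-model setting like the one the abstract describes, features of this kind would then be regressed against fMRI responses per cortical parcel; that regression step is not shown here.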