Abstract
There is an important challenge in systematically interpreting the internal representations of deep neural networks. This study introduces a multi-dimensional quantification and visualization approach which can capture two temporal dimensions of a model learning experience: the “information processing trajectory” and the “developmental trajectory.” The former represents the influence of incoming signals on an agent’s decision-making, while the latter conceptualizes the gradual improvement in an agent’s performance throughout its lifespan. Tracking the learning curves of a DNN enables researchers to explicitly identify the model appropriateness of a given task, examine the properties of the underlying input signals, and assess the model’s alignment (or lack thereof) with human learning experiences. To illustrate the method, we conducted 750 runs of simulations on two temporal tasks: gesture detection and natural language processing (NLP) classification, showcasing its applicability across a spectrum of deep learning tasks. Based on the quantitative analysis of the learning curves across two distinct datasets, we have identified three insights gained from mapping these curves: nonlinearity, pairwise comparisons, and domain distinctions. We reflect on the theoretical implications of this method for cognitive processing, language models and multimodal representation.
Author summary Deep learning networks, specifically recurrent neural networks (RNNs), are designed for processing incoming signals sequentially, making them intuitive computational systems for studying cognitive processing that involves dynamic contexts. There has been a tradition in the fields of machine learning and neuro-cognitive science to examine how a system (either humans or models) represents information through various computational and statistical techniques. Our study takes this one step further by devising a technique for examining the “learning curves” of deep learning networks utilizing the sequential representations as part of RNNs’ architectures. Just as humans develop learning curves when solving problems, the introduced method captures both how incoming signals help improve decision-making and how a system’s problem-solving abilities enhance when encountering the same situation multiple times throughout its lifespan. Our study selected two distinct tasks: gesture detection and emotion tweet classification, to illustrate the insights researchers can draw from mapping models’ learning curves. The proposed method hinted that gesture learning experiences are smoother, while language learning relies on sudden knowledge gains during processing, corroborating the findings from previous literature.
Competing Interest Statement
The authors have declared no competing interest.