PT - JOURNAL ARTICLE
AU - Katherine R. Storrs
AU - Roland W. Fleming
TI - Unsupervised Learning Predicts Human Perception and Misperception of Specular Surface Reflectance
AID - 10.1101/2020.04.07.026120
DP - 2020 Jan 01
TA - bioRxiv
PG - 2020.04.07.026120
4099 - http://biorxiv.org/content/early/2020/04/07/2020.04.07.026120.short
4100 - http://biorxiv.org/content/early/2020/04/07/2020.04.07.026120.full
AB - Gloss perception is a challenging visual inference that requires disentangling the contributions of reflectance, lighting, and shape to the retinal image [1–3]. Learning to see gloss must somehow proceed without labelled training data as no other sensory signals can provide the ‘ground truth’ required for supervised learning [4–6]. We reasoned that paradoxically, we may learn to infer distal scene properties, like gloss, by learning to compress and predict spatial structure in proximal image data. We hypothesised that such unsupervised learning might explain both successes and failures of human gloss perception, where classical ‘inverse optics’ cannot. To test this, we trained unsupervised neural networks to model the pixel statistics of renderings of glossy surfaces and compared the resulting representations with human gloss judgments. The trained networks spontaneously cluster images according to underlying scene properties such as specular reflectance, shape and illumination, despite receiving no explicit information about them. More importantly, we find that linearly decoding specular reflectance from the model’s internal code predicts human perception and misperception of glossiness on an image-by-image basis better than the true physical reflectance does, better than supervised networks explicitly trained to estimate specular reflectance, and better than alternative image statistic and dimensionality-reduction models.
Indeed, the unsupervised networks correctly predict well-known illusions of gloss perception caused by interactions between surface relief and lighting [7,8] which the supervised models totally fail to predict. Our findings suggest that unsupervised learning may explain otherwise inexplicable errors in surface perception, with broader implications for how biological brains learn to see the outside world.
Highlights
We trained unsupervised neural networks to synthesise images of glossy surfaces
They spontaneously learned to encode gloss, lighting and other scene factors
The networks correctly predict both errors and successes of human gloss perception
The findings provide new insights into how the brain likely learns to see