Research Article: A Model for Statistical Regularity Extraction from Dynamic Sounds

Date Published: March 27, 2019


Author(s): Benjamin Skerritt-Davis, Mounya Elhilali.


To understand our surroundings, we effortlessly parse our sound environment into sound sources, extracting invariant information, or regularities, over time to build an internal representation of the world around us. Previous experimental work has shown the brain is sensitive to many types of regularities in sound, but theoretical models that capture the underlying principles of regularity tracking across diverse sequence structures remain scarce. Existing efforts often focus on sound patterns rather than the stochastic nature of sequences. In the current study, we employ a perceptual model for regularity extraction based on a Bayesian framework that posits the brain collects statistical information over time. We show this model can be used to simulate various results from the literature with stimuli exhibiting a wide range of predictability. The model provides a useful tool both for interpreting existing experimental results under a unified framework and for generating predictions for new experiments using more complex stimuli.

Partial Text

Regularity extraction is an essential aspect of auditory object perception, in which the brain extracts useful information from sounds over time to interpret our surroundings [1]. This ability is often studied using deviance detection experiments: listeners are presented with a sequence of sounds exhibiting some regularity, and responses to sounds that conform to the regularity are compared with responses to sounds that deviate from it for signs of detection in the brain [2]. Behavioral and neural evidence has shown the brain is sensitive to a variety of regularities, with the mismatch negativity (MMN) as a typical marker of deviance detection in electroencephalography (EEG) and magnetoencephalography (MEG) [3].

We use the Dynamic Regularity Extraction (D-REX) model presented in [9] to simulate findings in the literature. This model is based on a Bayesian inference framework designed to perform sequential predictions in dynamic sequences containing unknown changes in underlying statistical structure [10, 11].
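To make the idea of sequential prediction under unknown changes concrete, the following is a minimal sketch of a generic Bayesian run-length filter. It is not the D-REX implementation: the function name `run_length_filter`, the constant change probability `hazard`, and the choice of a Gaussian with unknown mean and known variance are all assumptions made here for illustration. Each hypothesis corresponds to a possible time since the last change, and the prediction for the next sound is a weighted mixture over those hypotheses.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def run_length_filter(xs, hazard=0.05, prior_mean=0.0, prior_var=10.0, obs_var=1.0):
    """Illustrative run-length filter (hypothetical helper, not D-REX).

    Returns, for each observation, its predictive probability given the past
    and the most probable number of observations since the last change.
    """
    weights = [1.0]           # P(run length | data so far)
    means = [prior_mean]      # posterior mean of the Gaussian mean, per hypothesis
    variances = [prior_var]   # posterior variance of the Gaussian mean, per hypothesis
    predictive, map_run = [], []
    for x in xs:
        # Predictive density of x under each run-length hypothesis.
        likes = [normal_pdf(x, m, v + obs_var) for m, v in zip(means, variances)]
        p_x = sum(w * l for w, l in zip(weights, likes))
        predictive.append(p_x)
        # Either the current run continues (growth) or a change occurs.
        grown = [w * l * (1.0 - hazard) for w, l in zip(weights, likes)]
        changed = hazard * p_x
        weights = [changed] + grown
        total = sum(weights)
        weights = [w / total for w in weights]
        # Conjugate update of each hypothesis; a fresh run restarts from the prior.
        new_means, new_vars = [prior_mean], [prior_var]
        for m, v in zip(means, variances):
            post_var = 1.0 / (1.0 / v + 1.0 / obs_var)
            new_means.append(post_var * (m / v + x / obs_var))
            new_vars.append(post_var)
        means, variances = new_means, new_vars
        map_run.append(max(range(len(weights)), key=weights.__getitem__))
    return predictive, map_run


# Toy sequence: values near 0, then an abrupt shift to values near 5.
xs = [0.1, -0.1] * 15 + [5.1, 4.9] * 15
probs, runs = run_length_filter(xs)
print(probs[29], probs[30])  # predictive probability just before and at the shift
print(runs[29], runs[31])    # most probable run length before and after the shift
```

In this toy example, the first sound after the shift is poorly predicted, and the most probable run length resets shortly afterward, which is the qualitative behavior expected of a model that tracks changes in underlying statistics.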

We collected surprisal responses from the D-REX model to stimuli found in the deviance and change detection literature. Stimuli range in predictability to show the capacity of the model to capture a variety of phenomena under a single framework. Using different sets of statistics in the model (via the dimensionality D), we can ascertain the statistics that are sufficient (the “simplest explanation”) for responses observed in the brain.
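Surprisal here can be read as the negative log probability of each sound given the sounds that preceded it. The sketch below illustrates that quantity with a deliberately simple stand-in: a running Gaussian fit with a plug-in variance estimate, which is an assumption made for illustration and not the predictive distribution used by D-REX. The function name `surprisal_trace` and its prior parameters are likewise hypothetical. The example contrasts the same deviant value placed at the end of a low-variance context and a high-variance context.

```python
import math


def surprisal_trace(xs, prior_mean=0.0, prior_var=1.0, prior_weight=1.0):
    """Surprisal -log p(x_t | x_1..x_{t-1}) under a running Gaussian fit.

    Hypothetical helper: the prior acts like `prior_weight` pseudo-observations.
    """
    n = prior_weight
    mean = prior_mean
    m2 = prior_var * prior_weight  # running sum of squared deviations
    out = []
    for x in xs:
        # Plug-in predictive variance, inflated for uncertainty in the mean.
        pred_var = (m2 / n) * (1.0 + 1.0 / n)
        out.append(0.5 * math.log(2.0 * math.pi * pred_var)
                   + 0.5 * (x - mean) ** 2 / pred_var)
        # Welford-style update of the running mean and squared deviations.
        n += 1.0
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return out


# Same deviant (3.0) after a predictable context and after a variable context.
narrow = [0.0, 0.2, -0.2, 0.1, -0.1] * 4 + [3.0]
wide = [0.0, 2.0, -2.0, 1.0, -1.0] * 4 + [3.0]
print(surprisal_trace(narrow)[-1], surprisal_trace(wide)[-1])
```

Under these assumptions, the deviant is more surprising after the predictable context than after the variable one, which mirrors the logic of comparing responses across stimuli that differ in predictability.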

The D-REX model uses statistical descriptions of sound sequences to replicate findings across a wide swath of the regularity extraction literature. While these statistical descriptions may not be necessary to replicate these findings, the model provides a unified interpretation that is more generalizable toward natural sounds, where randomness and noise abound. Beyond retrospective interpretation of existing results, the D-REX model can be used to guide future experiments probing the temporal processing of more complex sounds. Moreover, because the model employs a Bayesian framework that is agnostic to probability distributions and underlying statistics, it offers a generalized platform for exploring the sufficient statistics underlying regularity tracking in the auditory system, and for contrasting different interpretations of behavioral and neurophysiological results on the parsing of complex sound sequences.