Stephen Porges (Polyvagal Perspective and Sound Sensitivity Research)

The International Misophonia Research Network (IMRN).


Long Title: Documenting the acoustic features that elicit subjective experiences related to pathogen, predator, danger, and safety

Principal Investigator(s): Stephen W. Porges, Ph.D., Department of Psychiatry, University of North Carolina

Background: The Center for Emotion and Attention at the University of Florida developed a database of sounds, the International Affective Digitized Sound system (IADS). The IADS (Bradley & Lang, 2007) provides a set of acoustic stimuli for experimental investigations of emotion and attention. The acoustic stimuli, similar to the visual stimuli that constitute the International Affective Picture System (IAPS), were rated along two primary dimensions: affective valence (ranging from pleasant to unpleasant) and arousal (ranging from calm to excited). Categorizing affective qualities along these dimensions is an empirical approach: it makes no theoretical assumptions about the evolutionary or neurophysiological processes that may cause certain sounds to trigger subjective responses along the valence and arousal dimensions.
In contrast to this empirical approach, the Polyvagal Theory makes predictions based on acoustic properties. The Polyvagal Theory proposes that subjective responses to sounds are initially (before associative learning) based on two features of the acoustic signal: pitch and variation in pitch. The theory articulates that for mammals there is a frequency band of perceptual advantage in which social communication occurs. It is within this frequency band that acoustic “safety” cues are conveyed.
Consistent with the theory, safety is signaled when the pitch of the acoustic signal is modulated within this band. Thus, a monotone within this band is not sufficient to signal safety. Moreover, the theory proposes that low frequency monotone sounds (e.g., dog’s bark, lion’s roar, large truck, and thunder) are inherent signals of predator and high frequency monotone sounds are inherent signals of pain and danger (e.g., shrill cries of babies or someone who is being injured).
The frequency band of perceptual advantage is functionally defined by the physics of the middle ear structures. During the evolutionary transition from ancient reptiles to mammals, the middle ear bones became detached from the jawbone. This functionally enabled mammals to communicate via vocalizations in a frequency band that could not be detected by the ancient reptiles. Operationally, in humans this frequency band extends from about 500 Hz to about 4000 Hz. The second and third formants of both male and female human speech always fall within these frequencies, and in many cases so does the first formant. Basically, we can’t understand the meaning of speech without processing the formants, and this difficulty is a feature of many individuals with auditory processing difficulties.

Polyvagal hypotheses:

Low frequency sounds without modulation will trigger a subjective fear response to flee.

High frequency sounds without modulation will trigger a subjective alerting response of pain or danger.

Modulated sounds within the frequency band of perceptual advantage will overlap with the frequencies of social communication that signal safety. To signal safety, the sounds will need to be within this frequency band and to be highly modulated, simulating vocal prosody.

Pathogen stimuli, such as coughing and sneezing, will have an acoustic signature overlapping with predator and/or danger signals.
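The first three hypotheses can be read as a simple decision rule over two features, pitch and variation in pitch. The sketch below is only an illustration of that reading, not the investigators' method. The 500 Hz and 4000 Hz band edges come from the text above; the function name, the use of the standard deviation of pitch as the measure of modulation, and the 50 Hz cutoff are assumptions made for the example.

```python
# Band of perceptual advantage in humans, as stated in the text.
BAND_LOW_HZ = 500.0
BAND_HIGH_HZ = 4000.0


def classify_sound(mean_pitch_hz, pitch_sd_hz, modulation_cutoff_hz=50.0):
    """Map pitch and pitch variation onto the hypothesized signal categories.

    modulation_cutoff_hz is a hypothetical threshold: a sound counts as
    "modulated" when its pitch standard deviation reaches this value.
    """
    modulated = pitch_sd_hz >= modulation_cutoff_hz
    if mean_pitch_hz < BAND_LOW_HZ and not modulated:
        return "predator"  # low frequency, no modulation
    if mean_pitch_hz > BAND_HIGH_HZ and not modulated:
        return "danger"  # high frequency, no modulation
    if BAND_LOW_HZ <= mean_pitch_hz <= BAND_HIGH_HZ and modulated:
        return "safety"  # modulated, inside the band
    # The hypotheses do not assign a category to the remaining cases,
    # e.g. a monotone inside the band is "not sufficient to signal safety".
    return "unspecified"
```

Under these assumptions a 120 Hz monotone is labeled "predator", while a 1000 Hz tone is labeled "safety" only when its pitch varies enough to pass the cutoff.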

We propose to evaluate the acoustic features of the stimuli presented in the IADS. Specifically, we will extract the same features from the stimuli that we have extracted from mammalian vocalizations. In our research we have demonstrated that vocalizations convey the physiological state of the individual. Thus, based on the intonation of a vocalization, a conspecific would be able to determine whether the vocalizing animal was dangerous or a potential mate. Humans still use similar acoustic cues: a loud, low frequency monotone voice is intimidating regardless of the words being spoken, while a modulated voice in the frequency band of perceptual advantage is calming (e.g., mother’s lullaby).

TASK 1 – We will calculate the following parameters on each of the IADS stimuli: fundamental frequency, duration of components or vocalizations, slope of the pitch (rising or falling tone of the stimulus), variance of the fundamental about the slope, bandwidth of the spectrogram at 50% of the peak energy, and spectral tilt (a measure of the energy density balance across frequencies). In previously published studies based on research conducted in my laboratory, several of these variables, when extracted from infant and prairie vole vocalizations, have been related to autonomic state.
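Several of the Task 1 parameters can be sketched once a pitch contour and an energy spectrum have been estimated for a stimulus. The example below is a minimal illustration under stated assumptions, not the laboratory's extraction procedure: it assumes the pitch contour and spectrum are already available as lists, computes the variance as the mean squared residual about a least-squares line, takes the bandwidth as the span of frequencies whose energy is at least half the peak, and expresses spectral tilt as the slope of level in dB against frequency. All function names are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def pitch_slope_and_variance(times_s, f0_hz):
    """Slope of the pitch contour (Hz/s) and variance of f0 about that slope."""
    slope, intercept = fit_line(times_s, f0_hz)
    residuals = [f - (intercept + slope * t) for t, f in zip(times_s, f0_hz)]
    variance = sum(r * r for r in residuals) / len(residuals)
    return slope, variance


def half_peak_bandwidth(freqs_hz, energy):
    """Span (Hz) of the frequencies whose energy is >= 50% of the peak."""
    cutoff = 0.5 * max(energy)
    above = [f for f, e in zip(freqs_hz, energy) if e >= cutoff]
    return max(above) - min(above)


def spectral_tilt(freqs_hz, energy):
    """Slope of level (dB) against frequency (Hz); energy must be positive."""
    level_db = [10.0 * math.log10(e) for e in energy]
    slope, _ = fit_line(freqs_hz, level_db)
    return slope
```

A steadily rising contour yields a positive slope with near-zero variance about it, whereas a highly modulated contour yields a large variance even when the overall slope is flat. The second case is the one the safety hypothesis depends on.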

TASK 2 – The derived acoustic parameters will be related to the standardized subjective ratings provided in the IADS manual. A variety of psychometric models will be used to relate the newly extracted acoustic features to the subjective rating.
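The proposal does not name the psychometric models to be used. As the simplest possible stand-in, the sketch below computes a Pearson correlation between one extracted acoustic feature and one subjective rating across stimuli. The function name is hypothetical, and the choice of a correlation is an assumption made for illustration rather than the analysis the investigators specify.

```python
import math


def pearson_r(feature, rating):
    """Pearson correlation between an acoustic feature and a subjective rating.

    Both arguments are equal-length lists with one value per stimulus.
    """
    n = len(feature)
    mean_f = sum(feature) / n
    mean_r = sum(rating) / n
    sff = sum((f - mean_f) ** 2 for f in feature)
    srr = sum((r - mean_r) ** 2 for r in rating)
    sfr = sum((f - mean_f) * (r - mean_r) for f, r in zip(feature, rating))
    return sfr / math.sqrt(sff * srr)
```

In use, `feature` might hold the pitch variance about the slope for each stimulus and `rating` the corresponding mean valence rating from the IADS manual. A coefficient near zero would argue against that feature carrying affective information on its own.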

Goal: The goal is to provide an acoustic model of stimuli that subjectively trigger defense (predator, danger) or safety and calmness. A secondary goal is to illustrate that the affective dimensions of valence and arousal can be translated into acoustic features that have a neurophysiological substrate and a phylogenetic history.

We will provide an explanation of why some sounds are perceived as pleasing and calming and why other sounds are frightening or signal danger. The explanation will be based on how our nervous system processes and categorizes acoustic features into predator, danger, and social (safety) signals, a process occurring through sensory processing pathways (i.e., neuroception) that are too important to be dependent on conscious decisions (i.e., perception). We will use the subjective ratings provided in the manual as the criteria to support our hypotheses.

Updates on Dr. Porges’ Study