Reverse Correlation Uncovers More Complete Tinnitus Spectra

Goal: This study validates an approach to characterizing the sounds experienced by tinnitus patients via reverse correlation, with potential for characterizing a wider range of sounds than currently possible. Methods: Ten normal-hearing subjects assessed the subjective similarity of random auditory stimuli and target tinnitus-like sounds (“buzzing” and “roaring”). Reconstructions of the targets were obtained by regressing subject responses on the stimuli, and were compared for accuracy to the frequency spectra of the targets using Pearson's r. Results: Reconstruction accuracy was significantly higher than chance across subjects: buzzing: 0.52 ± 0.27 (mean ± s.d.), t(9) = 5.766, p < 0.001; roaring: 0.62 ± 0.23, t(9) = 5.76, p < 0.001; combined: 0.57 ± 0.25, t(19) = 7.542, p < 0.001. Conclusion: Reverse correlation can accurately reconstruct non-tonal tinnitus-like sounds in normal-hearing subjects, indicating its potential for characterizing the sounds experienced by patients with non-tonal tinnitus.


I. INTRODUCTION
Tinnitus, the perception of sound in the absence of any corresponding external stimulus, affects up to 50 million people in the U.S. [1], a third of whom experience functional cognitive impairment and diminished quality of life [2], [3]. Clinical guidelines for tinnitus management involve targeted exposure to external sounds as part of sound therapy or cognitive behavioral therapy [4]. Critically, patient outcomes improve when the employed external sounds are closely informed by the patient's internal tinnitus experience [5], [6], [7], [8]. However, existing strategies for characterizing tinnitus sounds, such as pitch matching (PM), are best suited for patients whose tinnitus resembles pure tones (e.g., ringing) [9], [10], [11], [12]. There is a need for methods to characterize tinnitus sounds in the estimated 20-50% of patients with nontonal (e.g., buzzing, roaring) tinnitus [13], [14].
Nontonal tinnitus sounds are presumed to be complex and heterogeneous [12], although few characteristics have been firmly established. Therefore, we base our approach on reverse correlation (RC), an established behavioral method [15], [16], [17] for estimating internal perceptual representations that is unconstrained by prior knowledge about the representations themselves. RC asks participants to render subjective judgments over ambiguous stimuli, and reconstructs the latent representation by regressing subject responses onto the stimuli. RC is closely related to Wiener theory, which has inspired "white-noise" approaches to system characterization in physiology [18], [19] and engineering [20].
Here, we validate RC as a method for characterizing individuals' internal representations of tinnitus more completely. To that end, normal-hearing participants completed an augmented RC experiment, comparing random stimuli to a target tinnitus-like sound, yielding frequency spectrum estimates of their tinnitus representation. The estimated spectra were subsequently validated against the target tinnitus sounds. Our results demonstrate, for the first time, that tinnitus-like sounds with complex spectra can be accurately estimated using RC.

II. METHODS
A. Stimuli
The frequency space of the stimuli was partitioned into b = 8 Mel-spaced frequency bins, which divide the range [100, 13,000] Hz into contiguous segments of equal amplitude (i.e., "rectangular" bins). Reconstruction detail increases with b, but b = 8 provides a good approximation to the chosen target sounds (cf. Fig. 3).
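As an illustration of the partitioning described above, the following minimal Python sketch computes bin edges that are equally spaced on the Mel scale between 100 Hz and 13,000 Hz. The particular Hz-to-Mel conversion used here (2595 log10(1 + f/700)) is an assumption, since the text does not state which Mel formula was used, and the function names are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """Convert Hz to mels (assumed formula: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_bin_edges(f_min=100.0, f_max=13000.0, n_bins=8):
    """Return n_bins + 1 bin edges (Hz), equally spaced on the Mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / n_bins
    return [mel_to_hz(m_min + k * step) for k in range(n_bins + 1)]

edges = mel_bin_edges()
# Contiguous bins: the upper edge of one bin is the lower edge of the next.
for low, high in zip(edges[:-1], edges[1:]):
    print(f"{low:9.1f} Hz to {high:9.1f} Hz")
```

Because the Mel scale is compressive, bins of equal Mel width become progressively wider in Hz toward the high-frequency end of the range.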
For each stimulus, between 2 and 7 bins (inclusive) were randomly "filled" with a power of 0 dB, and the remaining "unfilled" bins were assigned −100 dB. All frequencies were assigned random phase. The inverse Fourier transform of the constructed spectrum yields a 500 ms stimulus waveform.¹
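The bin-filling step can be sketched as follows. This is a simplified illustration rather than the study's MATLAB implementation: it assumes the number of filled bins is drawn uniformly from 2 to 7, it produces only the per-bin power levels, and it omits the random-phase assignment and inverse Fourier transform that produce the waveform. The function and constant names are hypothetical.

```python
import random

FILLED_DB = 0.0       # power assigned to "filled" bins
UNFILLED_DB = -100.0  # power assigned to "unfilled" bins

def random_bin_powers(n_bins=8, min_filled=2, max_filled=7, rng=random):
    """Return the per-bin powers (dB) of one random stimulus."""
    # Assumption: the count of filled bins is uniform on [min_filled, max_filled].
    n_filled = rng.randint(min_filled, max_filled)
    # Choose which bins are filled, without replacement.
    filled = set(rng.sample(range(n_bins), n_filled))
    return [FILLED_DB if k in filled else UNFILLED_DB for k in range(n_bins)]

rng = random.Random(0)  # fixed seed so the example is repeatable
stimuli = [random_bin_powers(rng=rng) for _ in range(200)]
print(stimuli[0])
```

Each row of `stimuli` is the binned description of one trial's stimulus, so 200 rows correspond to the trials collected for one target sound.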

B. Target Sounds
Two spectrally complex and complementary target sounds ("buzzing" and "roaring") were downloaded from the American Tinnitus Association [21] and truncated to 500 ms in duration (power-spectral densities are displayed in the bottom subplots of Fig. 3).

C. Experiment
Ten (n = 10) normal-hearing subjects listened to A-X trials containing a target sound (A) followed by a stimulus (X). X was randomly generated for each trial, while A remained the same within a block of 100 trials (Fig. 1). Subjects completed two (2) blocks per target sound (p = 200 trials per target sound for each subject). Subjects were told that some stimuli had A embedded in them, and were instructed to respond "yes" to such stimuli, otherwise "no." Subjects listened over earphones at a self-determined comfortable level. Presentation level was not recorded in this study. Procedures were approved by the UMass IRB.

D. Reconstruction
A subject performing $p$ RC trials with $b$ frequency bins produces a stimulus matrix $\Psi \in \mathbb{R}^{p \times b}$ and a response vector $y \in \{1, -1\}^{p}$, where $1$ corresponds to a "yes" response and $-1$ to a "no." RC classically assumes the subject response model:
$$y = \operatorname{sign}(\Psi x), \qquad (1)$$
where $x \in \mathbb{R}^{b}$ is the subject's internal representation of interest (i.e., of their tinnitus). Inverting this model yields:
$$\hat{x} = \frac{1}{p} \Psi^{\top} y, \qquad (2)$$
which is a restricted form of the Normal equation under the assumption that the stimulus dimensions are uncorrelated [16].
¹ Software for the experiments and analysis was written in MATLAB and is freely available at https://github.com/alec-hoyland/tinnitus-reconstruction/
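To make the reconstruction step concrete, the sketch below simulates a noiseless subject who responds with the sign of the stimulus-representation product and then forms the classical RC estimate as the response-weighted average of the stimuli, (1/p) Ψᵀy. It is illustrative only: the representation `x_true` is invented, and the stimuli are zero-mean Gaussian so that the stimulus dimensions are uncorrelated, which differs from the binary filled/unfilled stimuli used in the study.

```python
import random

def reconstruct(stimuli, responses):
    """Classical RC estimate: x_hat = (1/p) * Psi^T y."""
    p = len(stimuli)     # number of trials (rows of Psi)
    b = len(stimuli[0])  # number of frequency bins (columns of Psi)
    return [sum(stimuli[i][j] * responses[i] for i in range(p)) / p
            for j in range(b)]

def sign(value):
    """Map a real number to a +1 / -1 response."""
    return 1 if value >= 0 else -1

rng = random.Random(1)
# Hypothetical internal representation over b = 8 bins (not from the study).
x_true = [1.0, 0.5, 0.0, -0.5, -1.0, -0.5, 0.0, 0.5]
# Assumption: zero-mean, uncorrelated Gaussian stimuli, for illustration only.
psi = [[rng.gauss(0.0, 1.0) for _ in x_true] for _ in range(2000)]
# Noiseless responses following the assumed model y = sign(Psi x).
y = [sign(sum(a * c for a, c in zip(row, x_true))) for row in psi]

x_hat = reconstruct(psi, y)
print([round(v, 2) for v in x_hat])
```

The estimate recovers the shape of `x_true` only up to a positive scale factor, which is why a scale-invariant measure such as Pearson's r is a natural choice for judging accuracy.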

E. Validation
The experimental paradigm allows for direct validation of the reconstructions $\hat{x}_{\text{buzzing}}$ and $\hat{x}_{\text{roaring}}$. We represent the spectra of the target sounds as vectors $s_{\text{buzzing}} \in \mathbb{R}^{b}$ and $s_{\text{roaring}} \in \mathbb{R}^{b}$, using the same frequency bins as the stimuli, with the power in each bin equal to the mean power at frequencies within that bin. Pearson's r between each target spectrum ($s_{\text{buzzing}}$, $s_{\text{roaring}}$) and its corresponding reconstruction quantifies reconstruction accuracy. One-sample t-tests were performed on the Fisher-transformed Pearson's r values across subjects to assess whether mean accuracy differed significantly from zero.
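The validation statistics can be sketched in a few lines of Python. The example computes Pearson's r, applies the Fisher transformation z = atanh(r), and forms a one-sample t statistic against zero. The accuracy values in `r_values` are hypothetical placeholders and are not the study's data. The p-value is omitted because the Python standard library has no Student's t distribution; a statistics package (for example, `scipy.stats.ttest_1samp`) could supply it.

```python
import math

def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (c - mean_v) for a, c in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((c - mean_v) ** 2 for c in v))
    return cov / (norm_u * norm_v)

def fisher_z(r):
    """Fisher transformation, which makes r values closer to normally distributed."""
    return math.atanh(r)

def one_sample_t(values, mu0=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean == mu0."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    return (mean - mu0) / math.sqrt(var / n), n - 1

# Hypothetical per-subject reconstruction accuracies (NOT the study's data).
r_values = [0.2, 0.4, 0.5, 0.6, 0.7]
t_stat, dof = one_sample_t([fisher_z(r) for r in r_values])
print(f"t({dof}) = {t_stat:.3f}")
```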

F. Synthetic Subjects
To establish bounds on human performance, additional experiments were run with two simulated subjects who give either ideal or random responses. Each experiment ran for p = 200 trials and was repeated 1000 times.
The ideal subject gives responses following:
$$y_i = \begin{cases} 1 & \text{if } \Psi_i s \geq Q(x, \Psi s) \\ -1 & \text{otherwise} \end{cases} \qquad (3)$$
for $i \in \{1, \ldots, p\}$, where $Q(x, y)$ is the quantile function for $x \in [0, 1]$ of the similarity calculation $\Psi s$, and $\Psi_i$ is the $i$th row of $\Psi$. Thus, the ideal subject has precise knowledge of every stimulus and responds according to (3). The random subject responds $y_i \in \{1, -1\}$ with uniform random probability, thus ignoring the stimulus entirely.

III. RESULTS
Fig. 2 shows the distribution of Pearson's r for human, ideal, and random subject responses. Human accuracy is significantly higher than chance and, for some subjects, approaches the ideal case. Accuracy from the random subject was 0.00 ± 0.44 (mean ± s.d.) for buzzing and 0.00 ± 0.39 for roaring, while mean accuracy from human responses was significantly different from 0 in all conditions: buzzing: 0.52 ± 0.27, t(9) = 5.766, p < 0.001; roaring: 0.62 ± 0.23, t(9) = 5.76, p < 0.001; combined: 0.57 ± 0.25, t(19) = 7.542, p < 0.001. From Fig. 2, it appears that the distribution of buzzing results differs from that of the roaring results; however, the difference is not statistically significant (two-way ANOVA across subjects, F(13) = 2.94, p > 0.05, and target signals, F(1) = 2.44, p > 0.05). Fig. 3 plots the most accurate human reconstructions over the target sound spectra.
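The two synthetic subjects described in Section F can be sketched as follows. The ideal subject thresholds the similarity between each stimulus and the target spectrum at a quantile of all similarities, and the random subject ignores the stimuli. Several details are assumptions made for illustration: the quantile level defaults to 0.5 (the median), the target spectrum `target` is invented, filled and unfilled bins are encoded as 1.0 and 0.0, and the quantile uses linear interpolation between order statistics.

```python
import random

def similarities(stimuli, target):
    """Similarity of every stimulus (row of Psi) to the target spectrum: Psi s."""
    return [sum(a * c for a, c in zip(row, target)) for row in stimuli]

def quantile(values, level):
    """Empirical quantile, linearly interpolated between order statistics."""
    ordered = sorted(values)
    pos = level * (len(ordered) - 1)
    low = int(pos)
    high = min(low + 1, len(ordered) - 1)
    frac = pos - low
    return ordered[low] * (1 - frac) + ordered[high] * frac

def ideal_responses(stimuli, target, level=0.5):
    """Respond +1 when similarity meets the quantile threshold, else -1."""
    # Assumption: level=0.5 is an illustrative choice of quantile.
    sims = similarities(stimuli, target)
    threshold = quantile(sims, level)
    return [1 if s >= threshold else -1 for s in sims]

def random_responses(n_trials, rng):
    """Respond +1 or -1 with equal probability, ignoring the stimulus."""
    return [rng.choice((1, -1)) for _ in range(n_trials)]

rng = random.Random(2)
# Hypothetical binned target spectrum over b = 8 bins (not from the study).
target = [0.9, 0.7, 0.4, 0.2, 0.1, 0.3, 0.6, 0.8]
# p = 200 random stimuli with 2 to 7 filled bins (1.0 = filled, 0.0 = unfilled).
stimuli = []
for _ in range(200):
    filled = set(rng.sample(range(8), rng.randint(2, 7)))
    stimuli.append([1.0 if k in filled else 0.0 for k in range(8)])

y_ideal = ideal_responses(stimuli, target)
y_random = random_responses(len(stimuli), rng)
print(y_ideal.count(1), y_random.count(1))
```

Repeating this simulation many times and reconstructing from each response vector would give distributions of accuracy for the two synthetic subjects, against which human accuracy can be compared.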

IV. CONCLUSION
Our results show that RC can accurately reconstruct the frequency spectrum of tinnitus-like sounds relevant to non-tonal tinnitus, and therefore represent a proof of concept for using RC to characterize non-tonal tinnitus. Subjects completed the required number of trials within ten minutes, indicating that this procedure could be conducted within a single clinical visit. RC may therefore be useful as the basis for a clinical assay to characterize a wider variety of tinnitus percepts than currently possible. Reconstruction accuracies observed here are below the simulated ideal, which may be attributed to noisy responses universally observed in applications of RC, and which may be mitigated by further optimizing the experimental protocol, stimulus generation, and reconstruction method. For example, recent approaches to improving RC reconstruction methods can boost efficiency, noise robustness and overall accuracy [22]. Future work will focus on more comprehensive validation of this approach, using larger sample sizes, more target sounds, and stricter control of sound presentation level.