We propose a collaborative filtering (CF) model to predict user satisfaction in SDS evaluation. Inspired by the use of CF in recommendation systems, where a user's preference for a new item is assumed to resemble that for similar items rated previously, we adapt the idea to predict user evaluations of unrated dialogs based on the ratings received by similar dialogs. Dialog ratings are gathered through crowdsourcing on Amazon Mechanical Turk. A reference baseline is provided by a linear regression model (LRM) based on the PARADISE framework. We present two versions of the CF model. First, the item-based collaborative filtering model (ICFM) clusters rated dialogs and builds an LRM for each cluster; the rating of an unseen dialog is predicted by the LRM of its most similar cluster. Second, the extended ICFM (EICFM) separates dialog features into user-related and system-related groups and builds separate LRMs for each group. Experimental results on dialogs from the Let's Go! system show that both ICFM and EICFM significantly improve the proportion of variability explained by the LRM. We also demonstrate the generalizability of the CF model to a new dialog corpus from the systems in the Spoken Dialog Challenge (SDC) 2010.
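The cluster-then-regress idea behind ICFM can be sketched as follows. This is a minimal illustration, not the paper's implementation: the k-means routine, the feature vectors, the cluster count `k`, and the synthetic data in the usage example are all assumptions for demonstration; the paper's actual features and clustering setup are described in its later sections.

```python
# Sketch of an item-based CF model (ICFM): cluster rated dialogs,
# fit one linear regression model (LRM) per cluster, and rate an
# unseen dialog with the LRM of its most similar (nearest) cluster.
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's k-means over dialog feature vectors (illustrative)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return centers, labels

def fit_icfm(X, y, k=2):
    """Cluster dialogs X (features) and fit an LRM on ratings y per cluster."""
    centers, labels = kmeans(X, k)
    models = []
    for j in range(k):
        mask = labels == j
        Xb = np.c_[X[mask], np.ones(mask.sum())]  # append bias column
        w, *_ = np.linalg.lstsq(Xb, y[mask], rcond=None)
        models.append(w)
    return centers, models

def predict_icfm(x, centers, models):
    """Predict a rating for an unseen dialog x via its nearest cluster's LRM."""
    j = int(np.argmin(((centers - x) ** 2).sum(-1)))
    return float(np.r_[x, 1.0] @ models[j])
```

For example, fitting on feature vectors drawn from two well-separated groups, each with its own linear rating function, lets each cluster's LRM specialize to its group; a single global LRM would have to average the two relationships, which is the effect the abstract's variability-explained comparison measures.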