In this paper, we present one possible way of analyzing social media conversational data in order to better understand customers. Ultimately, our goal is to analyze customer behavior as it is expressed in free-form conversations and to extract from it commercially valuable information about the customer. In this study, we concentrate on using statistical techniques to analyze this unstructured data at two levels: 1) at the level of the words used in the conversation and 2) at the level of the abstract concepts to which those words map. The goal of such a statistical analysis is twofold. First, the statistically significant terms used by users, and the concepts associated with those terms, provide insight into a user's interests that commercial services can use, for example, to target advertisements. Second, knowledge of how a customer's interests and hobbies evolve can be exploited commercially by retailers, media and entertainment companies, telecommunications companies, and others. We describe a general framework for the analysis of social media data and then apply this framework to the statistical analysis of the language of tweets.
Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.