Victims, volunteers, and relief organizations are increasingly using social media to report and act on large-scale events, as witnessed in the extensive coverage of the 2010–2012 Arab Spring uprisings and the 2011 Japanese tsunami and nuclear disasters. Twitter® feeds consist of short messages, often in a nonstandard local language, requiring novel techniques to extract relevant situation awareness data. Existing approaches to mining social media are aimed at searching for specific information or identifying aggregate trends, rather than providing narratives. We present CrisisTracker, an online system that efficiently captures distributed situation awareness reports in real time, based on social media activity during large-scale events such as natural disasters. CrisisTracker automatically tracks sets of keywords on Twitter and constructs stories by clustering related tweets on the basis of their lexical similarity. It integrates crowdsourcing techniques, enabling users to verify and analyze stories. We report our experiences from an 8-day CrisisTracker pilot deployment in 2012 focused on the Syrian civil war, which processed, on average, 446,000 tweets daily and reduced them to consumable stories through analytics and crowdsourcing. We discuss the effectiveness of CrisisTracker based on usage and feedback from 48 domain experts and volunteer curators.
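To make the idea of clustering tweets into stories by lexical similarity concrete, the following is a minimal sketch, not CrisisTracker's actual implementation. The abstract does not specify the similarity measure or the clustering procedure, so the sketch assumes Jaccard similarity over word sets and a greedy single-pass assignment. The names `tokenize`, `jaccard`, and `cluster_tweets`, as well as the `threshold` value, are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase a tweet, drop URLs, and return its set of word tokens.

    \\w is Unicode-aware for str patterns, so non-Latin scripts are kept.
    """
    text = re.sub(r"https?://\S+", " ", text.lower())
    return set(re.findall(r"[#@]?\w+", text))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_tweets(tweets, threshold=0.5):
    """Greedily group tweets into stories (assumed procedure).

    Each tweet joins the existing story whose first tweet it most
    resembles, provided the similarity reaches `threshold`; otherwise
    it starts a new story.
    """
    stories = []  # each story: {"tokens": set, "tweets": list}
    for tweet in tweets:
        tokens = tokenize(tweet)
        best, best_sim = None, 0.0
        for story in stories:
            sim = jaccard(tokens, story["tokens"])
            if sim > best_sim:
                best, best_sim = story, sim
        if best is not None and best_sim >= threshold:
            best["tweets"].append(tweet)
        else:
            stories.append({"tokens": tokens, "tweets": [tweet]})
    return stories
```

Comparing every tweet against every story is quadratic in the worst case. At the volumes reported above, a real-time system would need an approximate or indexed similarity search, which this sketch deliberately leaves out.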
Note: The Institute of Electrical and Electronics Engineers, Incorporated is distributing this Article with permission of the International Business Machines Corporation (IBM) who is the exclusive owner. The recipient of this Article may not assign, sublicense, lease, rent or otherwise transfer, reproduce, prepare derivative works, publicly display or perform, or distribute the Article.