This paper describes a multi-method approach to delineating a “real world” community of practice from a large-N dataset derived from the social networking site Twitter. The starting point is previous qualitative research on a virtual community of independent (“indie”) developers who create software for Apple's Macintosh and iPhone platforms. Indie developers have been active on Twitter from an early stage, and they use it to sustain interactions between peers, to exchange technical information, and to conduct viral “echo chamber” marketing. The publicly available Twitter API is used to mine a network consisting of several million edges, which is reduced through several pruning methods to a still large network of roughly 1 million edges. The fast greedy algorithm is then used to detect subgraphs within this large network. Triangulation with qualitative data shows that the fast greedy algorithm is able to distill meaningful communities from a large, noisy and ill-delineated network. The accuracy of this approach raises the question of its value for businesses and market research, since it offers opportunities to identify and monitor target audiences at a fine-grained level. However, we should be wary of the serious consequences with regard to privacy and ethics. The proposed multi-method approach allows micro-level inferences to be drawn from a macro dataset, inferences of which the individual Twitter user might be completely unaware. The results could have consequences for the anonymity of key persons behind the scenes of social and political movements, or of any other communities whose members are active on Twitter or other social networks.
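To illustrate the kind of community detection referred to above, the following is a minimal sketch of greedy modularity agglomeration, the idea underlying the fast greedy algorithm. It is not the implementation used in this study: the function name `greedy_communities` and the toy edge list are hypothetical, the graph is assumed to be simple and undirected (no self-loops or duplicate edges), and the sketch rescans all community pairs at every step instead of using the heap-based bookkeeping that makes the real algorithm fast on networks of millions of edges.

```python
from collections import defaultdict


def greedy_communities(edges):
    """Naive greedy modularity agglomeration on a simple undirected graph.

    Every node starts as its own community. The pair of connected
    communities whose merge raises modularity the most is merged
    repeatedly until no merge yields a positive gain.
    """
    m = len(edges)
    e = defaultdict(dict)    # e[c][d]: fraction of edge ends linking c and d
    a = defaultdict(float)   # a[c]: fraction of edge ends attached to c
    members = {}             # community label -> set of nodes
    for u, v in edges:
        e[u][v] = e[u].get(v, 0.0) + 1.0 / (2 * m)
        e[v][u] = e[v].get(u, 0.0) + 1.0 / (2 * m)
        a[u] += 1.0 / (2 * m)
        a[v] += 1.0 / (2 * m)
        members.setdefault(u, {u})
        members.setdefault(v, {v})

    while True:
        best_gain, best_pair = 0.0, None
        for c in e:
            for d, e_cd in e[c].items():
                # Modularity change when merging communities c and d.
                gain = 2.0 * (e_cd - a[c] * a[d])
                if gain > best_gain + 1e-12:
                    best_gain, best_pair = gain, (c, d)
        if best_pair is None:
            break  # no merge improves modularity any further

        c, d = best_pair
        # Merge d into c: redirect d's links to c and drop d.
        for nbr, w in e.pop(d).items():
            del e[nbr][d]
            if nbr != c:
                e[c][nbr] = e[c].get(nbr, 0.0) + w
                e[nbr][c] = e[nbr].get(c, 0.0) + w
        a[c] += a.pop(d)
        members[c] |= members.pop(d)

    return sorted((sorted(s) for s in members.values()), key=lambda s: s[0])


# Toy example (hypothetical data): two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(greedy_communities(toy_edges))
```

On the toy graph the two triangles are recovered as separate communities, because merging across the bridge edge would lower modularity. For a network of the size described here, an optimized library implementation would be the practical choice.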