Compression of Graphical Structures: Fundamental Limits, Algorithms, and Experiments

2 Author(s)
Yongwook Choi (J. Craig Venter Inst., Rockville, MD, USA); W. Szpankowski

Information theory traditionally deals with “conventional data,” be it text, image, or video data. However, databases of various sorts have come into existence in recent years for storing “unconventional data,” including biological data, social data, web data, topographical maps, and medical data. In compressing such data, one must consider two types of information: the information conveyed by the structure itself, and the information conveyed by the data labels implanted in the structure. In this paper, we attempt to address the former problem by studying the information content of graphical structures (i.e., unlabeled graphs). As a first step, we consider the Erdős-Rényi graphs $G(n,p)$ over $n$ vertices, in which edges are added independently and randomly with probability $p$. We prove that the structural entropy of $G(n,p)$ is $\binom{n}{2}h(p) - \log n! + o(1) = \binom{n}{2}h(p) - n\log n + O(n)$, where $h(p) = -p\log p - (1-p)\log(1-p)$ is the entropy rate of a conventional memoryless binary source. Then, we propose a two-stage compression algorithm that asymptotically achieves the structural entropy up to the $n\log n$ term (i.e., its first two leading terms). Our algorithm runs either in time $O(n^2)$ in the worst case for any graph or in time $O(n+e)$ on average for graphs generated by $G(n,p)$, where $e$ is the average number of edges. To the best of our knowledge, this is the first provably (asymptotically) optimal graph compressor for Erdős-Rényi graph models. We use combinatorial and analytic techniques such as generating functions, the Mellin transform, and poissonization to establish these findings. Our experiments confirm the theoretical results and also show the usefulness of our algorithm for some real-world graphs such as the Internet, biological networks, and social networks.
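To get a feel for the size of the terms in the entropy formula above, the following minimal Python sketch evaluates $\binom{n}{2}h(p)$ and $\log n!$ numerically. It is an illustration only, not the paper's algorithm. The function names are hypothetical, logarithms are taken base 2 (an assumption; the abstract does not state the base), and the $o(1)$ term is dropped, so the result is meaningful only in the asymptotic regime where the stated formula applies.

```python
import math


def binary_entropy(p: float) -> float:
    """h(p) = -p log2 p - (1 - p) log2(1 - p), in bits; h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def log2_factorial(n: int) -> float:
    """log2(n!) computed via lgamma to avoid forming the huge integer n!."""
    return math.lgamma(n + 1) / math.log(2)


def labeled_entropy(n: int, p: float) -> float:
    """Entropy of the labeled graph G(n, p): C(n, 2) independent edge bits."""
    return math.comb(n, 2) * binary_entropy(p)


def structural_entropy_estimate(n: int, p: float) -> float:
    """Leading terms C(n, 2) h(p) - log2(n!) of the structural entropy.

    The o(1) correction is ignored (assumption), so for small n or for p
    near 0 or 1 this estimate can be inaccurate or even negative.
    """
    return labeled_entropy(n, p) - log2_factorial(n)


if __name__ == "__main__":
    n, p = 1000, 0.5
    print("labeled entropy (bits):     ", labeled_entropy(n, p))
    print("log2(n!) saving (bits):     ", log2_factorial(n))
    print("structural estimate (bits): ", structural_entropy_estimate(n, p))
```

The difference between the labeled and structural values is the $\log n!$ term, which grows like $n\log n$: it is the number of bits saved by not encoding which of the $n!$ vertex labelings was used.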

Published in:

IEEE Transactions on Information Theory (Volume 58, Issue 2)