<?xml version="1.0" ?>
<rss version="2.0">
	<channel>
		<title><![CDATA[ IBM Journal of Research and Development - new TOC ]]></title>
		<link>http://ieeexplore.ieee.org</link>
		<description>TOC Alert for Publication# 5288520</description>
		<year>2013</year>
		<month>May</month>
		<day>23</day>
		<item>
			<title><![CDATA[Cover 1]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517294]]></link>
			<description><![CDATA[ ]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517294]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>C1</startPage>
			<endPage>C1</endPage>
			<fileSize>1380</fileSize>
			<authors><![CDATA[]]></authors>
		</item>
		<item>
			<title><![CDATA[Cover 2]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517312]]></link>
			<description><![CDATA[ ]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517312]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>C2</startPage>
			<endPage>C2</endPage>
			<fileSize>6</fileSize>
			<authors><![CDATA[]]></authors>
		</item>
		<item>
			<title><![CDATA[Table of Contents]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517321]]></link>
			<description><![CDATA[ ]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517321]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>1</startPage>
			<endPage>2</endPage>
			<fileSize>52</fileSize>
			<authors><![CDATA[]]></authors>
		</item>
		<item>
			<title><![CDATA[Preface: Massive-scale analytics]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517308]]></link>
			<description><![CDATA[ ]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517308]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>0:1</startPage>
			<endPage>0:4</endPage>
			<fileSize>60</fileSize>
			<authors><![CDATA[Soffer, A.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Governing Big Data: Principles and practices]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517298]]></link>
			<description><![CDATA[As data-intensive decision making is being increasingly adopted by businesses, governments, and other agencies around the world, most organizations encountering a very large amount and variety of data are still contemplating and assessing their readiness to embrace &#x201C;Big Data.&#x201D; While these organizations devise various ways to deal with the challenges such data brings, the impact and importance of Big Data to information quality and governance programs should not be underestimated. Drawing upon implementation experiences of early adopters of Big Data technologies across multiple industries, this paper explores the issues and challenges involved in the management of Big Data, highlighting the principles and best practices for effective Big Data governance.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517298]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>1:1</startPage>
			<endPage>1:13</endPage>
			<fileSize>2548</fileSize>
			<authors><![CDATA[Malik, P.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Trends and outlook for the massive-scale analytics stack]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517306]]></link>
			<description><![CDATA[Massive-scale analytics (MSA) applications are characterized by the large amount of data that they process and the complexity of algorithms used to process the data. The ideal MSA system will not only support processing of large amounts of data but also offer a high degree of parallelism and support scheduling and resource allocation of complex workloads. Designers of MSA systems must provide three necessities: programming abstractions, runtime systems, and hardware. Historically, two communities have undertaken the task of designing MSA systems: the database community, which has argued for an SQL (Structured Query Language)-influenced processing paradigm, and the high-performance computing community, which has focused on developing infrastructures for highly efficient, but complex, parallel implementations. These two communities have developed disparate technologies to meet the necessities of MSA systems, and the solutions provided by the individual communities are not completely satisfactory. In this paper, we attempt to characterize the strengths and weaknesses of the approaches of these two communities at all levels of the MSA stack, characterize implications with respect to resource management within the MSA system, and define how an MSA system should be designed.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517306]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>2:1</startPage>
			<endPage>2:11</endPage>
			<fileSize>1932</fileSize>
			<authors><![CDATA[Ghoting, A.N.;Gunnels, J.A.;Kambadur, P.;Pednault, E.P.;Squillante, M.S.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Understanding system design for Big Data workloads]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517337]]></link>
			<description><![CDATA[This paper explores the design and optimization implications for systems targeted at Big Data workloads. We confirm that these workloads differ from workloads typically run on more traditional transactional and data-warehousing systems in fundamental ways, and, therefore, a system optimized for Big Data can be expected to differ from these other systems. Rather than only studying the performance of representative computational kernels, and focusing on central-processing-unit performance, this paper studies the system as a whole. We identify three major phases in a typical Big Data workload, and we propose that each of these phases should be represented in a Big Data systems benchmark. We implemented our ideas on two distinct IBM POWER7&#x00AE; processor-based systems that target different market sectors, and we analyze their performance on a sort benchmark. In particular, this paper includes an evaluation of POWER7 processor-based systems using MapReduce TeraSort, which is a workload that can be a &#x201C;stress test&#x201D; for multiple dimensions of system performance. We combine this work with a broader perspective on Big Data workloads and suggest a direction for a future benchmark definition effort. A number of methods to further improve system performance are proposed.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517337]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>3:1</startPage>
			<endPage>3:10</endPage>
			<fileSize>877</fileSize>
			<authors><![CDATA[Hofstee, H.P.;Chen, G.C.;Gebara, F.H.;Hall, K.;Herring, J.;Jamsek, D.;Li, J.;Li, Y.;Shi, J.W.;Wong, P.W.Y.;]]></authors>
		</item>
		<item>
			<title><![CDATA[A platform for eXtreme Analytics]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517309]]></link>
			<description><![CDATA[With the rapid increase in the volume of data that enterprises are producing, enterprises are adopting large-scale data processing platforms such as Hadoop&#x00AE; to store, manage, and run deep analytics to gain actionable insights from their &#x201C;big data.&#x201D; At IBM Research - Almaden, we have been helping enterprise customers build solutions exploiting data-intensive analytics. Our deep experience with actual users has led to an extensive understanding of the platform requirements needed to support these solutions, and our goal is to provide a powerful analytics platform, which we call eXtreme Analytics Platform (XAP), that can be used to create solutions for customer problems that have not been economically feasible to solve until now. XAP provides Jaql [i.e., JavaScript&#x00AE; Object Notation (JSON) query language, a scripting language to specify data flows, tools, and techniques to optimize the runtime execution of these flows], an improved task scheduler, connectors to data warehouses, and libraries for advanced analytics. Many of these technologies have been transferred to the IBM InfoSphere BigInsights&#x2122; product. In this paper, we describe the overall design principles and technology of XAP.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517309]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>4:1</startPage>
			<endPage>4:11</endPage>
			<fileSize>950</fileSize>
			<authors><![CDATA[Balmin, A.;Beyer, K.;Ercegovac, V.;McPherson, J.;Ozcan, F.;Pirahesh, H.;Shekita, E.;Sismanis, Y.;Tata, S.;Tian, Y.;]]></authors>
		</item>
		<item>
			<title><![CDATA[GPFS-SNC: An enterprise cluster file system for Big Data]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517331]]></link>
			<description><![CDATA[A new class of data-intensive applications commonly referred to as Big Data applications (e.g., customer sentiment analysis based on click-stream logs) involves processing massive amounts of data with a focus on semantically transforming the data. This class of applications is massively parallel and well suited for the MapReduce programming framework that allows users to perform large-scale data analyses such that the application execution layer handles the system architecture, data partitioning, and task scheduling. In this paper, we introduce GPFS-SNC (General Parallel File System for Shared Nothing Clusters), a scalable file system that operates over a cluster of commodity machines and direct-attached storage and meets the requirements of analytics and traditional applications that are typically used together in analytics solutions. The architecture extends an existing enterprise cluster file system to support these emerging classes of workloads by applying five innovative optimizations: 1) locality awareness to allow compute jobs to be scheduled on nodes where the data resides, 2) metablocks that allow large and small block sizes to co-exist in the same file system to meet the needs of different types of applications, 3) write affinity that allows applications to dictate the layout of files on different nodes in order to maximize both write and read bandwidth, 4) pipelined replication to maximize use of network bandwidth for data replication, and 5) distributed recovery to minimize the effect of failures on ongoing computation.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517331]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>5:1</startPage>
			<endPage>5:10</endPage>
			<fileSize>1861</fileSize>
			<authors><![CDATA[Jain, R.;Sarkar, P.;Subhraveti, D.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Toward a scale-out data-management middleware for low-latency enterprise computing]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517300]]></link>
			<description><![CDATA[Emerging transactional workloads from Internet and mobile commerce require low-latency, massive-scale, and integrated data analytics to enhance user experience and to improve up-selling opportunities. These analytics require new application platforms that must be able to absorb large volumes of data, provide low-latency access to the data, and cache data objects to improve access times in distributed environments. This paper reports on recent technologies built at IBM Research to address challenges in data access latency, data ingestion, and caching in the exemplary context of an online product recommendation application. We describe three technologies related to the issues and optimizations of key-value data object store and access. First, we describe the architecture of a global secondary index to greatly improve data access latency of Hadoop&#x2122; Database (HBase&#x2122;), an open-source key-value distributed data store. Second, we present an in-memory write-ahead log feature on HBase that significantly improves write operations for high-volume data ingestion. Third, we detail an innovative distributed caching system that exploits low-latency interconnects to use hash maps of data keys on each server for local lookup, while data resides and is accessed across clustered systems. The distributed cache can achieve a 100- to 1,000-fold performance gain over many caching methods. These technologies together form some necessary building blocks for a next-generation data-centric middleware for integrated transaction and analytic workloads.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517300]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>6:1</startPage>
			<endPage>6:14</endPage>
			<fileSize>2257</fileSize>
			<authors><![CDATA[Fong, L.L.;Gao, Y.;Guerin, X.R.;Liu, Y.G.;Salo, T.;Seelam, S.R.;Tan, W.;Tata, S.;]]></authors>
		</item>
		<item>
			<title><![CDATA[IBM Streams Processing Language: Analyzing Big Data in motion]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517299]]></link>
			<description><![CDATA[The IBM Streams Processing Language (SPL) is the programming language for IBM InfoSphere&#x00AE; Streams, a platform for analyzing Big Data in motion. By &#x201C;Big Data in motion,&#x201D; we mean continuous data streams at high data-transfer rates. InfoSphere Streams processes such data with both high throughput and short response times. To meet these performance demands, it deploys each application on a cluster of commodity servers. SPL abstracts away the complexity of the distributed system, instead exposing a simple graph-of-operators view to the user. SPL has several innovations relative to prior streaming languages. For performance and code reuse, SPL provides a code-generation interface to C++ and Java&#x00AE;. To facilitate writing well-structured and concise applications, SPL provides higher-order composite operators that modularize stream sub-graphs. Finally, to enable static checking while exposing optimization opportunities, SPL provides a strong type system and user-defined operator models. This paper provides a language overview, describes the implementation including optimizations such as fusion, and explains the rationale behind the language design.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517299]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>7:1</startPage>
			<endPage>7:11</endPage>
			<fileSize>1275</fileSize>
			<authors><![CDATA[Hirzel, M.;Andrade, H.;Gedik, B.;Jacques-Silva, G.;Khandekar, R.;Kumar, V.;Mendell, M.;Nasgaard, H.;Schneider, S.;Soule, R.;Wu, K.-L.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Real-time analysis and management of big time-series data]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517301]]></link>
			<description><![CDATA[The ability to process and analyze large volumes of time-series data is in increasing demand in various domains including health care, finance, energy and utilities, transportation, and cybersecurity. Despite the broad use of time-series data worldwide, the design of a system to easily manage, analyze, and visualize large multidimensional time series, with dimensions on the order of hundreds of thousands, is still a challenging endeavor. This paper describes the Streaming Time-Series Analysis and Management (STAM) system as a solution to this problem. STAM provides the capability to glean actionable information from continuously changing time series with thousands of dimensions, in real time. STAM exploits the IBM InfoSphere&#x00AE; Streams platform and allows for general-purpose large-scale time-series analytics for applications including anomaly detection, modeling, smoothing, forecasting, and tracking. In addition, the system provides user-friendly tools for managing, deploying, and initiating analytics on large-scale data streams of interest, and provides a web-based graphical visualization interface that allows highlighting of events of interest with interactive menus. In this paper, we describe the system and illustrate its use in a large-scale system-monitoring application.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517301]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>8:1</startPage>
			<endPage>8:12</endPage>
			<fileSize>2416</fileSize>
			<authors><![CDATA[Biem, A.;Feng, H.;Riabov, A.V.;Turaga, D.S.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Novel document detection for massive data streams using distributed dictionary learning]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517286]]></link>
			<description><![CDATA[Given the high volume of content being generated online, it becomes necessary to employ automated techniques to separate out the documents belonging to novel topics from the background discussion, in a robust and scalable manner (with respect to the size of the document set). We present a solution to this challenge based on sparse coding, in which a stream of documents (where each document is modeled as an <formula formulatype="inline"><tex Notation="TeX">$m$</tex></formula>-dimensional vector <formula formulatype="inline"><tex Notation="TeX">$y$</tex></formula>) can be used to learn a dictionary matrix <formula formulatype="inline"><tex Notation="TeX">$A$</tex></formula> of dimension <formula formulatype="inline"><tex Notation="TeX">$m \times k$</tex></formula>, such that the documents can be approximately represented by a linear combination of a few columns of <formula formulatype="inline"><tex Notation="TeX">$A$</tex></formula>. If a new document cannot be represented with low error as a sparse linear combination of these columns, then this is a strong indicator of novelty of the document. We scale up this approach to handle millions of documents by parallelizing sparse coding and dictionary learning, and by using the alternating-directions method to solve the resulting optimization problems. We conduct our experiments on high-performance computing clusters with differing architectures and evaluate our approach on news streams and streaming data from Twitter&#x00AE;. Based on the analysis, we share our insights on the distributed optimization and machine architecture that can help the design of exascale systems supporting data analytics.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517286]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>9:1</startPage>
			<endPage>9:15</endPage>
			<fileSize>1476</fileSize>
			<authors><![CDATA[Kasiviswanathan, S.P.;Cong, G.;Melville, P.;Lawrence, R.D.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Big Data text-oriented benchmark creation for Hadoop]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517313]]></link>
			<description><![CDATA[Massive-scale Big Data analytics is representative of a new class of workloads that justifies a rethinking of how computing systems should be optimized. This paper addresses the need for a set of benchmarks that system designers can use to measure the quality of their designs and that customers can use to evaluate competing systems offerings with respect to commonly performed text-oriented workflows in Hadoop&#x2122;. Additions are needed to existing benchmarks such as HiBench in terms of both scale and relevance. We describe a methodology for creating a petascale data-size text-oriented benchmark that includes representative Big Data workflows and can be used to test total system performance, with demands balanced across storage, network, and computation. Creating such a benchmark requires meeting unique challenges associated with the data size and its often unstructured nature. To be useful, the benchmark also needs to be sufficiently generic to be accepted by the community at large. Here, we focus on a text-oriented Hadoop workflow that consists of three common tasks: categorizing text documents, identifying significant documents within each category, and analyzing significant documents for new topic creation.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517313]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>10:1</startPage>
			<endPage>10:6</endPage>
			<fileSize>99</fileSize>
			<authors><![CDATA[Gattiker, A.;Gebara, F.H.;Hofstee, H.P.;Hayes, J.D.;Hylick, A.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Platform and applications for massive-scale streaming network analytics]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517319]]></link>
			<description><![CDATA[The ability to analyze massive amounts of network traffic data in real time is becoming increasingly important for communication service providers, as it enables them to optimize use of their service infrastructure and develop innovative revenue-generating opportunities. In particular, the real-time analysis of perishable user traffic (which is not stored because of privacy, regulatory, and other constraints) can provide insights into the use of applications and services by telecommunication subscribers. In this paper, we describe the design and implementation of a novel system for real-time analysis of network traffic based on IBM InfoSphere&#x00AE; Streams, a scalable stream-processing platform, which provides access and analysis with respect to the data objects and communication patterns of users at the application layer, in contrast to simple packet- and flow-based analysis that most current systems provide. We discuss our design considerations for such a system and further describe analytics applications developed to showcase its capabilities: online identification of most-frequent objects, online social network discovery, and real-time sentiment analysis. We also present performance results from a pilot deployment of this platform and its applications that analyzed Internet traffic generated by users at a large corporate research lab.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517319]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>11:1</startPage>
			<endPage>11:13</endPage>
			<fileSize>6140</fileSize>
			<authors><![CDATA[Zerfos, P.;Srivatsa, M.;Yu, H.;Dennerline, D.;Franke, H.;Agrawal, D.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Scalable community detection in massive social networks using MapReduce]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517336]]></link>
			<description><![CDATA[In this paper, we present a community-detection solution for massive-scale social networks using MapReduce, a parallel programming framework. We use a similarity metric to model the community probability, and the model is designed to be parallelizable and scalable in the MapReduce framework. More importantly, we propose a set of degree-based preprocessing and postprocessing techniques named DEPOLD (DElayed Processing of Large Degree nodes) that significantly improve both the community-detection accuracy and performance. With DEPOLD, delaying analysis of 1% of high-degree nodes to the postprocessing stage reduces both processing time and storage space by one order of magnitude. DEPOLD can be applied to other graph-clustering problems. Furthermore, we design and implement two similarity calculation algorithms using MapReduce with different computation and communication characteristics in order to adapt to various system configurations. Finally, we conduct experiments with publicly available datasets. Our evaluation demonstrates the effectiveness, efficiency, and scalability of the proposed solution.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517336]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>12:1</startPage>
			<endPage>12:14</endPage>
			<fileSize>3986</fileSize>
			<authors><![CDATA[Shi, J.;Xue, W.;Wang, W.;Zhang, Y.;Yang, B.;Li, J.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Visual analysis of large-scale network anomalies]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517342]]></link>
			<description><![CDATA[The amount of information flowing across communication networks has rapidly increased. The highly dynamic and complex networks, represented as large graphs, make the analysis of such networks increasingly challenging. In this paper, we provide a brief overview of several useful visualization techniques for the analysis of spatiotemporal anomalies in large-scale networks. We make use of community-based similarity graphs (CSGs), temporal expansion model graphs (TEMGs), correlation graphs (CGs), high-dimension projection graphs (HDPGs), and topology-preserving compressed graphs (TPCGs). CSG is used to detect anomalies based on community membership changes rather than individual nodes and edges and therefore may be more tolerant to the highly dynamic nature of large networks. TEMG transforms network topologies into directed trees so that efficient search is more likely to be performed for anomalous changes in network behavior and routing topology in large dynamic networks. CG and HDPG are used to examine the complex relationship of data dimensions among graph nodes through transformation in a high-dimensional space. TPCG groups nodes with similar neighbor sets into mega-nodes, thus making graph visualization and analysis more scalable to large networks. All the methods target efficient large-graph anomaly visualization from different perspectives and together provide valuable insights.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517342]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>13:1</startPage>
			<endPage>13:12</endPage>
			<fileSize>10215</fileSize>
			<authors><![CDATA[Liao, Q.;Shi, L.;Wang, C.;]]></authors>
		</item>
		<item>
			<title><![CDATA[A statistical approach to mining customers' conversational data from social media]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517283]]></link>
			<description><![CDATA[In this paper, we present one possible way of analyzing social media conversational data in order to better understand customers. Ultimately, our goal is to analyze customer behavior as it is expressed in free-form conversations and extract from it commercially valuable information about the customer. In this study, we concentrate on using statistical techniques for analyzing this unstructured data at two levels: 1) at the level of the words used in the conversation and 2) by mapping those words to abstract concepts. The goal of such a statistical analysis is twofold. First, the statistically significant terms used by the users and the concepts associated with them provide insight on a user's interests that commercial services can use, for example, in order to target advertisements. In addition, knowing the evolution of a customer's interests and hobbies can be exploited commercially by retailers, media and entertainment companies, telecommunications companies, and more. In this paper, we describe a general framework for the analysis of social media data and, in turn, the application of the framework to the statistical analysis of the language of tweets.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517283]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>14:1</startPage>
			<endPage>14:13</endPage>
			<fileSize>2289</fileSize>
			<authors><![CDATA[Konopnicki, D.;Shmueli-Scheuer, M.;Cohen, D.;Sznajder, B.;Herzig, J.;Raviv, A.;Zwerling, N.;Roitman, H.;Mass, Y.;]]></authors>
		</item>
		<item>
			<title><![CDATA[A real-time stream storage and analysis platform for underwater acoustic monitoring]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517318]]></link>
			<description><![CDATA[We describe a distributed, real-time system for the collection and analysis of underwater acoustic data. The system uses a number of preprocessing steps to classify and detect acoustic events and to identify and compensate for gaps in the data stream. Different event-detection techniques are applied in a distributed manner on the incoming data stream from each sensor to aid in the indexing and storage of the data. Other event-detection techniques process multiple simultaneous streams to identify and classify events of interest. Building upon the deployed system, a stream analytical platform provides data handling, preprocessing, and analytics in real time. These analytics identify and classify anthropogenic, environmental, and animal noise (a significant amount of which occurs outside the audible range of human hearing) and ascertain the direction of the noise source.]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517318]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>15:1</startPage>
			<endPage>15:10</endPage>
			<fileSize>9434</fileSize>
			<authors><![CDATA[Hayes, J.P.;Kolar, H.R.;Akhriev, A.;Barry, M.G.;Purcell, M.E.;McKeown, E.P.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Cover 3]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517341]]></link>
			<description><![CDATA[ ]]></description>
			<pubDate><![CDATA[May-July  2013]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6517341]]></guid>
			<volume>57</volume>
			<issue>3/4</issue>
			<startPage>C3</startPage>
			<endPage>C3</endPage>
			<fileSize>6</fileSize>
			<authors><![CDATA[]]></authors>
		</item>
	</channel>
</rss>