<?xml version="1.0" ?>
<rss version="2.0">
	<channel>
		<title><![CDATA[ Computer Architecture Letters - new TOC ]]></title>
		<link>http://ieeexplore.ieee.org</link>
		<description>TOC Alert for Publication# 10208 </description>
		<year>2012</year>
		<month>February </month>
		<day>10</day>
		<item>
			<title><![CDATA[Cover 1]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=6087357]]></link>
			<description><![CDATA[ ]]></description>
			<pubDate><![CDATA[July-Dec.  2011]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=6087357]]></guid>
			<volume>10</volume>
			<issue>2</issue>
			<startPage>c1</startPage>
			<endPage>c1</endPage>
			<fileSize>101</fileSize>
			<authors><![CDATA[]]></authors>
		</item>
		<item>
			<title><![CDATA[Cover 2]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=6087358]]></link>
			<description><![CDATA[ ]]></description>
			<pubDate><![CDATA[July-Dec.  2011]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=6087358]]></guid>
			<volume>10</volume>
			<issue>2</issue>
			<startPage>c2</startPage>
			<endPage>c2</endPage>
			<fileSize>115</fileSize>
			<authors><![CDATA[]]></authors>
		</item>
		<item>
			<title><![CDATA[Heterogeneity in “Homogeneous” Warehouse-Scale Computers: A Performance Opportunity]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5887296]]></link>
			<description><![CDATA[The class of modern datacenters recently coined as “warehouse scale computers” (WSCs) has traditionally been embraced as homogeneous computing platforms. However, due to frequent machine replacements and upgrades, modern WSCs are in fact composed of diverse commodity microarchitectures and machine configurations. Yet, current WSCs are designed with an assumption of homogeneity, leaving a potentially significant performance opportunity unexplored. In this paper, we investigate the key factors impacting the available heterogeneity in modern WSCs, and the benefit of exploiting this heterogeneity to maximize overall performance. We also introduce a new metric, opportunity factor, which can be used to quantify an application's sensitivity to the heterogeneity in a given WSC. For applications that are sensitive to heterogeneity, we observe a performance improvement of up to 70% when employing our approach. In a WSC composed of state-of-the-art machines, we can improve the overall performance of the entire datacenter by 16% over the status quo.]]></description>
			<pubDate><![CDATA[July-Dec.  2011]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5887296]]></guid>
			<volume>10</volume>
			<issue>2</issue>
			<startPage>29</startPage>
			<endPage>32</endPage>
			<fileSize>145</fileSize>
			<authors><![CDATA[Mars, J.;Lingjia Tang;Hundt, R.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Packet Chaining: Efficient Single-Cycle Allocation for On-Chip Networks]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5887297]]></link>
			<description><![CDATA[This paper introduces packet chaining, a simple and effective method to increase allocator matching efficiency and hence network performance, particularly suited to networks with short packets and short cycle times. Packet chaining operates by chaining together packets destined to the same output, to reuse the switch connection of a departing packet. This allows an allocator to build up an efficient matching over a number of cycles, like incremental allocation, but not limited by packet length. For a 64-node 2D mesh at maximum injection rate and with single-flit packets, packet chaining increases network throughput by 15% compared to a conventional single-iteration separable iSLIP allocator, outperforms a wavefront allocator, and gives throughput comparable to an augmenting paths allocator. Packet chaining achieves this performance with a cycle time comparable to a single-iteration separable allocator. Packet chaining also reduces average network latency by 22.5%. Finally, packet chaining increases IPC by up to 46% (16% average) for application benchmarks because short packets are critical in a typical cache-coherent CMP. These are considerable improvements given the maturity of network-on-chip routers and allocators.]]></description>
			<pubDate><![CDATA[July-Dec.  2011]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5887297]]></guid>
			<volume>10</volume>
			<issue>2</issue>
			<startPage>33</startPage>
			<endPage>36</endPage>
			<fileSize>139</fileSize>
			<authors><![CDATA[Michelogiannakis, G.;Nan Jiang;Becker, D.U.;Dally, W.J.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Exploring the Interaction Between Device Lifetime Reliability and Security Vulnerabilities]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5934666]]></link>
			<description><![CDATA[As technology scales, device reliability is becoming a fundamental problem. Even though manufacturing test can guarantee product quality, permanent faults that appear in the field, caused by various types of wear-out and failure modes, are becoming an increasingly important and real problem. Such wear-out creates permanent faults in devices during their lifetime, after release to the user. In this paper, we perform a formal investigation of the impact of permanent faults on security, examine empirical evidence, and demonstrate a real attack. Our results show that permanent stuck-at faults may leave security holes in microprocessors. We show that an adversary with knowledge of a fault can launch attacks that obtain critical secrets such as a private key in 30 seconds.]]></description>
			<pubDate><![CDATA[July-Dec.  2011]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5934666]]></guid>
			<volume>10</volume>
			<issue>2</issue>
			<startPage>37</startPage>
			<endPage>40</endPage>
			<fileSize>142</fileSize>
			<authors><![CDATA[Chen-Han Ho;Staus, G.;Ullmer, A.;Sankaralingam, K.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Fault-Tolerant Vertical Link Design for Effective 3D Stacking]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5940983]]></link>
			<description><![CDATA[Recently, 3D stacking has been proposed to alleviate the memory bandwidth limitation arising in chip multiprocessors (CMPs). As the number of integrated cores in the chip increases, access to external memory becomes the bottleneck, thus demanding larger amounts of memory inside the chip. The most accepted solution for implementing vertical links between stacked dies is to use Through Silicon Vias (TSVs). However, TSVs are exposed to misalignment and random defects, compromising the yield of the manufactured 3D chip. A common solution to this problem is over-provisioning, which increases area and cost. In this paper, we propose a fault-tolerant vertical link design. With its adoption, fault-tolerant vertical links can be implemented in a 3D chip design at low cost without the need to add redundant TSVs (no over-provisioning). Preliminary results are very promising, as the fault-tolerant vertical link design increases switch area by only 6.69% while the achieved interconnect yield tends to 100%.]]></description>
			<pubDate><![CDATA[July-Dec.  2011]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5940983]]></guid>
			<volume>10</volume>
			<issue>2</issue>
			<startPage>41</startPage>
			<endPage>44</endPage>
			<fileSize>128</fileSize>
			<authors><![CDATA[Hernandez, C.;Roca, A.;Flich, J.;Silla, F.;Duato, J.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Experience with Improving Distributed Shared Cache Performance on Tilera's Tile Processor]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5953732]]></link>
			<description><![CDATA[This paper describes our experience with profiling and optimizing physical locality for the distributed shared cache (DSC) in Tilera's Tile multicore processor. Our approach uses the Tile Processor's hardware performance measurement counters (PMCs) to acquire page-level access pattern profiles. A key problem we address is imprecise PMC interrupts. Our profiling tools use binary analysis to correct for interrupt “skid,” thus pinpointing individual memory operations that incur remote DSC slice references and permitting us to sample their access patterns. We use our access pattern profiles to drive page homing optimizations for both heap and static data objects. Our experiments show we can improve physical locality for 5 out of 11 SPLASH2 benchmarks running on 32 cores, enabling 32.9% to 77.9% of DSC references to target the local DSC slice. To our knowledge, this is the first work to demonstrate page homing optimizations on a real system.]]></description>
			<pubDate><![CDATA[July-Dec.  2011]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5953732]]></guid>
			<volume>10</volume>
			<issue>2</issue>
			<startPage>45</startPage>
			<endPage>48</endPage>
			<fileSize>377</fileSize>
			<authors><![CDATA[Inseok Choi;Minshu Zhao;Xu Yang;Yeung, D.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Multilevel Cache Modeling for Chip-Multiprocessor Systems]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5962329]]></link>
			<description><![CDATA[This paper presents a simple analytical model for predicting on-chip cache hierarchy effectiveness in chip multiprocessors (CMPs) for a state-of-the-art architecture. Given the complexity of this type of system, we use rough approximations, such as the empirical observation that the re-reference timing pattern follows a power law and the assumption of a simplistic delay model for the cache, in order to provide a useful model for the memory hierarchy responsiveness. This model enables the analytical determination of average access time, which makes it useful for pruning the design space before sweeping the vast design space of this class of systems. The model is also useful for predicting cache hierarchy behavior in future systems. The fidelity of the model has been validated using a state-of-the-art, full-system simulation environment, on a system with up to sixteen out-of-order processors with cache-coherent caches and using a broad spectrum of applications, including complex multithreaded workloads. This simple model can predict a near-optimal on-chip cache distribution while also estimating how future systems running future applications might behave.]]></description>
			<pubDate><![CDATA[July-Dec.  2011]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5962329]]></guid>
			<volume>10</volume>
			<issue>2</issue>
			<startPage>49</startPage>
			<endPage>52</endPage>
			<fileSize>440</fileSize>
			<authors><![CDATA[Prieto, P.;Puente, V.;Gregorio, J.-A.;]]></authors>
		</item>
		<item>
			<title><![CDATA[On Supporting Rapid Thermal Analysis]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5962328]]></link>
			<description><![CDATA[Detailed thermal analysis is usually performed exclusively at design time since it is a computationally intensive task. In this paper, we introduce a novel methodology for fast, yet accurate, thermal analysis. The introduced methodology is supported in software by a new open source tool that enables hierarchical thermal analysis with adaptive levels of granularity. Experimental results demonstrate the efficiency of our approach, since it leads to an average reduction of the execution overhead of up to 70% with a penalty in accuracy ranging between 2% and 8%.]]></description>
			<pubDate><![CDATA[July-Dec.  2011]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=5962328]]></guid>
			<volume>10</volume>
			<issue>2</issue>
			<startPage>53</startPage>
			<endPage>56</endPage>
			<fileSize>377</fileSize>
			<authors><![CDATA[Siozios, K.;Rodopoulos, D.;Soudris, D.;]]></authors>
		</item>
		<item>
			<title><![CDATA[Cover 3]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=6087359]]></link>
			<description><![CDATA[Provides instructions and guidelines to prospective authors who wish to submit manuscripts.]]></description>
			<pubDate><![CDATA[July-Dec.  2011]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=6087359]]></guid>
			<volume>10</volume>
			<issue>2</issue>
			<startPage>c3</startPage>
			<endPage>c3</endPage>
			<fileSize>115</fileSize>
			<authors><![CDATA[]]></authors>
		</item>
		<item>
			<title><![CDATA[Cover 4]]></title>
			<link><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=6087360]]></link>
			<description><![CDATA[ ]]></description>
			<pubDate><![CDATA[July-Dec.  2011]]></pubDate>
			<guid><![CDATA[http://ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=6087356&arnumber=6087360]]></guid>
			<volume>10</volume>
			<issue>2</issue>
			<startPage>c4</startPage>
			<endPage>c4</endPage>
			<fileSize>101</fileSize>
			<authors><![CDATA[]]></authors>
		</item>
	</channel>
</rss>
