# IBM Journal of Research and Development

## Volume 57 Issue 1/2 • Jan.-March 2013

IBM Blue Gene/Q

Key technologies and applications of the IBM Blue Gene/Q supercomputer are described in an overview article and 14 technical papers. This highly scalable system can achieve a performance of 20 PFLOPS and yet top the list of the most power-efficient supercomputers. Papers in this issue describe the compute chip, the five-dimensional network that interconnects the compute nodes, the packaging and power technologies that support more than 1,000 nodes in a rack, the hardware and software architectures, the methodologies for high performance and low power, and the initial experiences with applications run by Lawrence Livermore and Argonne National Laboratories as well as Columbia and Edinburgh Universities.

• ### [Front cover]

Publication Year: 2013, Page(s): C1
• ### [Front inside cover]

Publication Year: 2013, Page(s): C2

Publication Year: 2013, Page(s):1 - 2
• ### The IBM Blue Gene project

Publication Year: 2013, Page(s):0:1 - 0:6

This paper provides a short overview of the IBM Blue Gene® project and an introduction to all of the papers in this issue of the IBM Journal of Research and Development.

• ### Design of the IBM Blue Gene/Q Compute chip

Publication Year: 2013, Page(s):1:1 - 1:13

The heart of a Blue Gene®/Q system is the Blue Gene/Q Compute (BQC) chip, which combines processors, memory, and communication functions on a single chip. The Blue Gene/Q Compute chip has 16 + 1 + 1 processor cores, each with a quad single-instruction, mult...

• ### Packaging the IBM Blue Gene/Q supercomputer

Publication Year: 2013, Page(s):2:1 - 2:13

The IBM Blue Gene®/Q supercomputer is designed for highly efficient computing for problems dominated by floating-point computation. Its target mean time between failures for a 96-rack, 98,304-node system is three days, allowing tasks requiring computation for many days to run at scale, with little time wasted on checkpoint-restart operations. This paper describes various elements of the com...

• ### Design for low power and power management in IBM Blue Gene/Q

Publication Year: 2013, Page(s):3:1 - 3:11

In this paper, we explain the techniques used in IBM Blue Gene®/Q Compute chips to achieve high energy efficiency. Architectural techniques include the choice of a power-efficient, throughput-oriented processor core with a SIMD (single-instruction, multiple-data) floating-point unit, as well as multiple frequency domains for moving data. Design techniques include clock gating and the use of...

• ### Application-level power and performance characterization and optimization on IBM Blue Gene/Q systems

Publication Year: 2013, Page(s):4:1 - 4:17

In order to understand application-level power/performance tradeoffs on current computer systems, runtime monitoring capabilities are needed. Specifically, very fine-grained monitoring capabilities are needed to gain detailed insights on power and performance behavior. Performing fine-grained application-level characterizations not only helps fine-tune application code, but it also increases the c...

• ### IBM Blue Gene/Q system software stack

Publication Year: 2013, Page(s):5:1 - 5:12

The principal focus areas for system software on the IBM Blue Gene®/Q include ultrascalability and high reliability while delivering the full performance capability of the hardware to applications. The Blue Gene/Q system software has achieved these goals while adding functionality and flexibility compared with previous versions of Blue Gene®. Whereas part of the software stack was im...

• ### Modeling, validation, and co-design of IBM Blue Gene/Q: Tools and examples

Publication Year: 2013, Page(s):6:1 - 6:12

Major architectural innovations in the compute node have been introduced in the IBM Blue Gene®/Q, including programmable Level 1 (L1) cache data prefetching units to hide memory access latency, hardware support for transactional memory (TM) and speculative execution (SE), an enhanced five-dimensional integrated torus network, and a high-performance quad floating-point SIMD (single-instructi...

• ### IBM Blue Gene/Q memory subsystem with speculative execution and transactional memory

Publication Year: 2013, Page(s):7:1 - 7:12

The memory subsystem of the IBM Blue Gene®/Q Compute chip features multi-versioning and access conflict detection. Its ordered and unordered transaction modes implement both speculative execution (SE) and transactional memory (TM). Blue Gene/Q's large shared second-level cache serves as storage for speculative versions, allowing up to 30 MB of speculative state for the 64 threads of a Blue ...

• ### Experimenting with low-overhead OpenMP runtime on IBM Blue Gene/Q

Publication Year: 2013, Page(s):8:1 - 8:8

As newer supercomputers continue to increase the number of threads, there is growing pressure on applications to exploit more of the available parallelism in their codes, including coarse-, medium-, and fine-grain parallelism. OpenMP™ is one of the dominant shared-memory programming models and is well suited for exploiting medium- and fine-grain parallelism. OpenMP research has focused on a...

• ### Determination of performance characteristics of scientific applications on IBM Blue Gene/Q

Publication Year: 2013, Page(s):9:1 - 9:12

The IBM Blue Gene®/Q platform presents scientists and engineers with a rich set of hardware features such as 16 cores per chip sharing a Level 2 cache, a wide SIMD (single-instruction, multiple-data) unit, a five-dimensional torus network, and hardware support for collective operations. An especially important feature is that the cores have four “hardware threads,” which makes...

• ### Massive data analytics: The Graph 500 on IBM Blue Gene/Q

Publication Year: 2013, Page(s):10:1 - 10:11

Graph algorithms are becoming increasingly important for biology, transportation, business intelligence, and a wide range of commercial workloads. Most graph algorithms stress to the limit various architectural aspects of conventional machines. The memory access patterns are irregular, with little spatial locality and data reuse. The amount of computation per loaded byte is very small, typically i...

• ### Science at LLNL with IBM Blue Gene/Q

Publication Year: 2013, Page(s):11:1 - 11:18

Lawrence Livermore National Laboratory (LLNL) has a long history of working with IBM on Blue Gene® supercomputers. Beginning in November 2001 with the joint announcement of a partnership to expand the Blue Gene research project (including Blue Gene®/L and Blue Gene®/P), the collaboration extends to this day with LLNL planning for the installation of a 96-rack Blue Gene®...

• ### Argonne applications for the IBM Blue Gene/Q, Mira

Publication Year: 2013, Page(s):12:1 - 12:11

A varied collection of scientific and engineering codes has been adapted and enhanced to take advantage of the IBM Blue Gene®/Q architecture and thus enable research that was previously out of reach. Computational research teams from a number of disciplines collaborated with the staff of the Argonne Leadership Computing Facility to assess which of Blue Gene/Q's many novel features could be ...

• ### Co-design of the IBM Blue Gene/Q Level 1 prefetch engine with QCD

Publication Year: 2013, Page(s):13:1 - 13:10

In order to optimize performance of the IBM Blue Gene®/Q memory system for real scientific applications, the VHDL (Very-high-speed integrated circuits Hardware Description Language) design and optimization of the Level 1 prefetch engine were undertaken in a co-design exercise with assembler kernels that form the critical loops in quantum chromodynamics delivering improved performance for bo...

• ### Early experiences with scientific applications on the IBM Blue Gene/Q supercomputer

Publication Year: 2013, Page(s):14:1 - 14:9

We report early experiences with porting highly complex scientific applications to the IBM Blue Gene®/Q platform. In addition, we report our progress in porting performance analysis tools that are deemed to be key in helping users understand massively parallel, massively threaded applications. Porting proved to be quite a smooth process. Although in this early study we did not use the full ...

• ### A framework for electric vehicle charging-point network optimization

Publication Year: 2013, Page(s):15:1 - 15:9

Electric vehicles (EVs) are often suggested as an effective green energy technology to reduce gasoline consumption and emissions. When preparing for the widespread adoption of EVs, a critical problem is to plan an optimal charging-point network that could best serve the customers as well as save costs. In this paper, a demand-based optimization model of an EV charging-point network is constructed.

• ### [Back inside cover]

Publication Year: 2013, Page(s): C3

## Aims & Scope

The IBM Journal of Research and Development is a peer-reviewed technical journal, published bimonthly, which features the work of authors in the science, technology and engineering of information systems.


## Meet Our Editors

Editor-in-Chief
Rachel D'Annucci Henriquez
IBM T. J. Watson Research Center