Popular Internet services are hosted by multiple geographically distributed data centers. The location of the data centers has a direct impact on the services' response times, capital and operational costs, and (indirect) carbon dioxide emissions. Selecting a location involves many important considerations, including its proximity to population centers, power plants, and network backbones; the source of the electricity in the region; the electricity, land, and water prices at the location; and the average temperatures at the location. As there can be many potential locations and many issues to consider for each of them, the selection process can be extremely involved and time-consuming. In this paper, we focus on the selection process and its automation. Specifically, we propose a framework that formalizes the process as a non-linear cost optimization problem, together with approaches for solving the problem. Based on the framework, we characterize areas across the United States as potential locations for data centers, and delve deeper into seven interesting locations. Using the framework and our solution approaches, we illustrate the selection trade-offs by quantifying the minimum cost of (1) achieving different response times, availability levels, and consistency times, and (2) restricting services to green energy and chiller-less data centers. Among other interesting results, we demonstrate that the intelligent placement of data centers can save millions of dollars under a variety of conditions. We also demonstrate that the selection process is most efficient and accurate when it uses a novel combination of linear programming and simulated annealing.
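To make the idea of searching over candidate locations concrete, the following is a minimal toy sketch of simulated annealing applied to site selection. It is not the formulation used in this paper: the cost figures, latency matrix, latency target, penalty, and the function names `total_cost` and `anneal` are all hypothetical, and the linear programming step is replaced here by a simple nearest-site assignment to keep the example self-contained.

```python
import math
import random

# Hypothetical data, not taken from the paper: fixed cost per candidate
# site (in $M) and latency (ms) from each site to three population centers.
SITE_COST = [10.0, 14.0, 9.0, 12.0]
LATENCY = [
    [20, 80, 90],
    [70, 25, 60],
    [85, 75, 30],
    [40, 45, 50],
]
MAX_LATENCY = 50   # assumed response-time target (ms)
PENALTY = 100.0    # assumed cost added per center that misses the target


def total_cost(selected):
    """Fixed cost of the chosen sites plus a penalty for unmet latency."""
    if not any(selected):
        return float("inf")  # at least one data center is required
    cost = sum(c for c, s in zip(SITE_COST, selected) if s)
    for center in range(len(LATENCY[0])):
        # Stand-in for an LP step: serve each center from its nearest
        # selected site.
        best = min(LATENCY[i][center] for i, s in enumerate(selected) if s)
        if best > MAX_LATENCY:
            cost += PENALTY
    return cost


def anneal(steps=2000, temp=10.0, cooling=0.995, seed=0):
    """Search over site subsets by flipping one site per step."""
    rng = random.Random(seed)
    current = [True] * len(SITE_COST)
    current_cost = total_cost(current)
    best, best_cost = current[:], current_cost
    for _ in range(steps):
        candidate = current[:]
        i = rng.randrange(len(candidate))
        candidate[i] = not candidate[i]
        cand_cost = total_cost(candidate)
        delta = cand_cost - current_cost
        # Always accept improvements; accept a worse move with a
        # probability that shrinks as the temperature drops.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
        temp *= cooling
    return best, best_cost


if __name__ == "__main__":
    sites, cost = anneal()
    print("selected sites:", [i for i, s in enumerate(sites) if s])
    print("total cost ($M):", cost)
```

In this toy instance the penalty term plays the role of a response-time constraint, so the search trades the fixed cost of opening more sites against the cost of missing the latency target. A fuller treatment would replace the nearest-site rule with an actual linear program for load assignment.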