During its lifecycle, a virtual machine (VM) can be scheduled for execution at geographically disparate cloud locations, depending on the cost of computation and the load at these locations. However, relocating a live VM across high-latency, low-bandwidth wide area networks (WANs) within a 'reasonable' time is nearly impossible because of the large size of the VM image. In this paper, we address this problem by combining VM scheduling strategies with VM replication strategies. In particular, we propose to replicate a VM image selectively across different cloud sites, designate one replica as the primary copy, and propagate the incremental changes made at the primary copy to all remaining replicas. The replica placement strategies are based on factors that influence long-term costs, such as the average per-unit cost of storage and the average per-unit cost of computation at different cloud sites, as well as the 'end-user' latency requirements associated with the VMs. We propose to offset the additional storage requirements of replication by exploiting the commonality that naturally exists among different VM images, using de-duplication techniques. A key issue in this integrated replication and scheduling context, whose goal is to minimize the latency of live VM migration across WANs, is the design of a good replica placement algorithm that minimizes additional storage requirements. We address this issue as part of our integrated replication and scheduling architecture, called Cloud Spider. We discuss the trade-offs involved in designing a replica placement algorithm and propose an algorithm that factors in the de-duplication ratios between pairs of VM images when deciding where to place replicas. Preliminary experiments show extremely promising results.
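To make the placement idea concrete, the following is a minimal illustrative sketch, not the Cloud Spider algorithm itself, which the abstract does not specify. It assumes a simple greedy heuristic: sites that violate the end-user latency bound are discarded, and the remaining sites are ranked by storage cost (reduced by the best pairwise de-duplication ratio against images already stored there) plus compute cost. All names (`Site`, `effective_storage`, `place_replicas`), the cost units, and the scoring formula are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A cloud site; all fields are illustrative assumptions."""
    name: str
    storage_cost: float   # average per-unit cost of storage
    compute_cost: float   # average per-unit cost of computation
    latency_ms: float     # latency seen by the VM's end users
    hosted: set = field(default_factory=set)  # names of images stored here


def effective_storage(vm, size, site, dedup):
    """Extra storage needed at `site` after de-duplicating `vm` against
    the best-matching image already hosted there.

    `dedup` maps frozenset({vm_a, vm_b}) to the fraction of shared
    content (0.0 to 1.0); missing pairs are treated as sharing nothing.
    """
    best = max(
        (dedup.get(frozenset((vm, other)), 0.0) for other in site.hosted),
        default=0.0,
    )
    return size * (1.0 - best)


def place_replicas(vm, size, sites, dedup, k, max_latency_ms):
    """Greedily pick up to `k` sites for replicas of `vm`.

    Sites exceeding the latency bound are excluded; the rest are ranked
    by an assumed score: de-duplicated storage cost plus compute cost.
    """
    candidates = [s for s in sites if s.latency_ms <= max_latency_ms]

    def score(site):
        storage = site.storage_cost * effective_storage(vm, size, site, dedup)
        return storage + site.compute_cost

    chosen = sorted(candidates, key=score)[:k]
    for site in chosen:
        site.hosted.add(vm)  # record the new replica
    return [site.name for site in chosen]
```

In this sketch, a site with a higher nominal storage price can still be preferred when it already holds an image that shares most of its content with the new one, which is the trade-off the abstract attributes to de-duplication-aware placement.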
Date of Conference: 23-26 May 2011