Cloud computing is a powerful technique for dramatically increasing hardware utilization and for scaling up computational infrastructure instantly. To improve resource utilization, a cloud resource management system can suspend or migrate some running jobs or virtual machines in order to reserve resources for others. Because virtual machines are large, moving whole virtual machines is too costly, so application-level migration is a possible alternative. We are interested in using checkpointing and recovery (C&R) techniques to stop and restart running programs on virtual machines. This paper introduces the event-based checkpointing tool (EBC), with which users can easily checkpoint and restart running programs in cloud systems. EBC is based on an event-driven architecture and supports both sequential programs and parallel programs such as MPI programs. EBC is a useful tool for supporting migration as well as scheduling in cloud systems.
Date of Conference: 1-3 Aug. 2012