Skip to Main Content
Efficient retrieval of replicated data from multiple disks is a challenging problem. Traditional retrieval techniques assume that replication is done at a single site using homogeneous disk arrays having no initial load or network delay. Recently, generalized retrieval algorithms are proposed to cover heterogeneous disk arrays, initial loads, and network delays. Generalized retrieval algorithms achieve the optimal response time retrieval schedule by performing multiple runs of a maximum flow algorithm. Since the maximum flow algorithm is used as a black box technique, flow values of the previous runs cannot be conserved to speed up the process. In this paper, we propose integrated maximum flow algorithms for the generalized optimal response time retrieval problem. Our first algorithm uses Ford-Fulkerson method and the second algorithm uses Push-relabel algorithm. Besides the sequential implementations, a multi-threaded version of the push-relabel algorithm is also implemented. Proposed algorithms are investigated using various replication schemes, query types, query loads, disk specifications, and system delays. Experimental results show that the sequential integrated push-relabel algorithm runs up to 2.5X faster than the black box version. Furthermore, parallel integrated push-relabel implementation achieves up to 1.7X speed up (~1.2X on average) over the sequential algorithm using two threads, which makes the integrated algorithm up to 4.25X (~3X on average) faster than its black box counterpart.