Large-scale data grid systems (LDGS) facilitate collaborative sharing of large collections (petabytes of data and hundreds of millions of objects) containing files, databases, and data streams that are geographically distributed across heterogeneous resources and multiple administrative domains. An LDGS provides a "universal view" of the distributed data, resources, users, and methods, and hides the idiosyncrasies and heterogeneity of the underlying infrastructure and protocols, which enhances user collaboration. To improve transparency, an "open policy" system is needed through which data providers and administrators can describe the exact processes and policies that implement LDGS services. We consider policies and processes to be the essential defining characteristics of a productive LDGS collaboration. We have implemented an LDGS, called the integrated Rule-Oriented Data System (iRODS), which provides a universal view while enabling an open policy environment for publishing descriptions of the available services. The open policy environment is supported by a distributed workflow/rule engine. The services are encoded as rules in a high-level workflow language that transparently describes the underlying functionality. Well-defined semantics control how the workflow functions, called micro-services, are composed to map to the desired client-level actions. In this paper, we describe the iRODS system from the "universal view" and "open policy" perspectives and demonstrate that it scales to managing more than 10 million files.
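The idea of encoding a service as a rule that composes micro-services can be illustrated with a minimal sketch. This is not the iRODS rule language or its API: the `RuleEngine` class, the `checksum` and `replicate` functions, and the `"put"` action name are all hypothetical, chosen only to show how a published rule can both describe and execute a chain of small functions behind a client-level action.

```python
from typing import Callable, Dict, List

# Hypothetical micro-service signature: takes a context dict, returns it updated.
MicroService = Callable[[dict], dict]


def checksum(ctx: dict) -> dict:
    """Illustrative micro-service: store a simple checksum of the object's bytes."""
    ctx["checksum"] = sum(ctx["data"]) % 256
    return ctx


def replicate(ctx: dict) -> dict:
    """Illustrative micro-service: record one additional replica of the object."""
    ctx["replicas"] = ctx.get("replicas", 1) + 1
    return ctx


class RuleEngine:
    """Toy rule engine (an assumption, not the iRODS engine)."""

    def __init__(self) -> None:
        # Maps a client-level action name to an ordered chain of micro-services.
        self.rules: Dict[str, List[MicroService]] = {}

    def publish(self, action: str, chain: List[MicroService]) -> None:
        """Register the rule that implements a client-level action."""
        self.rules[action] = list(chain)

    def describe(self, action: str) -> List[str]:
        """Open policy: report which micro-services implement the action."""
        return [ms.__name__ for ms in self.rules[action]]

    def apply(self, action: str, ctx: dict) -> dict:
        """Run the micro-services of the rule in order on the context."""
        for ms in self.rules[action]:
            ctx = ms(ctx)
        return ctx


engine = RuleEngine()
engine.publish("put", [checksum, replicate])
result = engine.apply("put", {"data": b"abc"})
description = engine.describe("put")
```

In this sketch, `describe` stands in for the open policy aspect (anyone can inspect what a service does), while `apply` stands in for the workflow engine that executes the composed micro-services.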