Summary form only given. We present a sophisticated and efficient parallel scheme for the DIRECT global optimization algorithm of Jones et al. (1993). Although several sequential implementations of this algorithm have been applied successfully to large-scale multidisciplinary design optimization (MDO) problems, few parallel versions of DIRECT have adequately addressed algorithm characteristics such as a single starting point, an unpredictable workload, and strong data dependency. These characteristics raise several design issues, including domain decomposition, data access and management, and workload balancing. A hierarchical parallel scheme has been developed to address these challenges at three levels. Each level is supported by parallel and distributed data structures that are used to access shared data sets, distribute workload, or exchange messages. Parameter estimation problems in systems biology provide an ideal application context for the present work. Global nonlinear parameter estimation results obtained on a 200-node Linux cluster are given for a cell cycle model for frog eggs.