The Global Arrays toolkit is a library that lets programmers write parallel programs that use large arrays distributed across processing nodes through the Aggregate Remote Memory Copy Interface (ARMCI). OpenMP is an application programming interface that supports shared-memory multiprocessing on many architectures and platforms. On symmetric multiprocessor (SMP) systems, the Global Arrays toolkit exposes to programmers a programming model quite similar to that provided by OpenMP. In this study, we extend our investigation of the performance of a parallel application implemented with the Global Arrays toolkit and with OpenMP in a Grid computing environment. The investigation focuses on the case in which an SMP cluster is included in the Grid computing environment. Multi-level parallelism, together with multi-level topology-aware techniques, is used in both implementations. We find that the performance of the evaluated application implemented with Global Arrays is comparable to that of the same application implemented with OpenMP. This implies that programmers can port a Global Arrays application directly to an SMP cluster without a drop in performance compared to the native implementation.