Distributed shared memory (DSM) is an important technology that provides programmers with the underlying execution mechanism for shared-memory programs. To improve DSM performance, recent studies have introduced compiler assistance, in which the compiler generates code for dependency analysis and communication. This paper proposes a high-performance DSM, called Offloaded-DSM, in which dependency analysis and communication are offloaded to the cluster network. In Offloaded-DSM, the host machine can concentrate on the application's own computation while the network maintains coherency in parallel. In a preliminary evaluation, Offloaded-DSM reduces execution time by up to 32% on eight nodes and exhibits good scalability.