I. Introduction
Learning tasks on graph-structured data usually require extracting representations from both node features and graph structure. For this purpose, graph neural networks (GNNs) [1]-[8] compute a new representation for each node by aggregating the feature vectors or representations of its neighbors. Inspired by the success of convolutional neural networks (CNNs) in computer vision [9]-[10], researchers began to extend the convolution operation to graphs as an aggregation method. Most of these works were based on spectral graph convolutional neural networks (GCNNs) [11]-[12]. Graph convolutional networks (GCNs), introduced in [3], simplified this basic framework and thereby improved both scalability and classification performance. GCNs have since been refined and applied in a variety of application scenarios [13]-[14].
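To make the notion of neighborhood aggregation concrete, the following is a minimal sketch of a single aggregation round on a toy graph. It is an illustration only and is not the method of any cited work: the function name `mean_aggregate`, the choice of an unweighted mean that includes the node's own features, and the example graph and feature values are all assumptions made here. Practical GNN layers typically also apply learned weights and a nonlinearity, which are omitted.

```python
from typing import Dict, List


def mean_aggregate(
    features: Dict[int, List[float]],
    neighbors: Dict[int, List[int]],
) -> Dict[int, List[float]]:
    """One round of neighborhood aggregation (hypothetical helper).

    Each node's new representation is the element-wise mean of its own
    feature vector and the feature vectors of its neighbors.
    """
    updated = {}
    for node, own in features.items():
        # Collect the node's own vector together with its neighbors' vectors.
        group = [own] + [features[n] for n in neighbors.get(node, [])]
        dim = len(own)
        updated[node] = [
            sum(vec[i] for vec in group) / len(group) for i in range(dim)
        ]
    return updated


# Toy path graph 0 - 1 - 2 with 2-dimensional features (made-up values).
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
neighbors = {0: [1], 1: [0, 2], 2: [1]}

new_features = mean_aggregate(features, neighbors)
for node, vec in new_features.items():
    print(node, vec)
```

After one round, each node's vector already mixes in information from its immediate neighbors; applying the function repeatedly would propagate information across longer paths in the graph.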