I. Introduction
Mental disorders, such as autism spectrum disorder (ASD) and attention deficit/hyperactivity disorder (ADHD), have become a growing global public health concern. Their high prevalence places increasing pressure on health care services [1], [2]. Over the past decades, computer-aided diagnosis (CAD) approaches have been developed to address the shortage of psychiatrists by automatically analyzing high-resolution medical images, e.g., functional magnetic resonance imaging (fMRI) [3], [4]. fMRI makes it possible to investigate aberrant neurobiological functions in mental disorders by detecting small changes in blood flow [5], [6]. Recently, deep learning-based CAD (DL-CAD) approaches, e.g., long short-term memory (LSTM) networks [7], gated recurrent units (GRU) [8], and Hopfield neural networks [9], have achieved decent performance in mental disorder diagnosis. However, successfully training deep learning models typically requires a large number of training samples.