Automatic Design System With Generative Adversarial Network and Convolutional Neural Network for Optimization Design of Interior Permanent Magnet Synchronous Motor

The optimal design of interior permanent magnet synchronous motors requires a long time because finite element analysis (FEA) is performed repeatedly. To solve this problem, many researchers have used artificial intelligence to construct a prediction model that can replace FEA. However, because the training data are generated by FEA, it takes a very long time to obtain a sufficient amount of data, making it impossible to train a large-scale prediction model. Here, we propose a method for generating a large amount of data from a small number of FEA results using machine learning. An automatic design system with a deep generative model and a convolutional neural network is then constructed. With its sufficient data, the proposed system can handle three topologies and three motor parameters in a wide range of current vector regions. The proposed system was applied to multi-objective optimization design, with the optimization completed in 13–15 seconds.


I. INTRODUCTION
INTERIOR permanent magnet synchronous motors (IPMSMs) are widely adopted in electric vehicles and industrial robots because of their high output, efficiency, and reliability [1], [2], [3]. However, a major problem with IPMSMs is their long optimization period, which is caused by the high degree of freedom in their design and the use of finite element analysis (FEA). Many researchers have tried to solve this problem: some proposed automatic design with optimization methods that cope efficiently with the high degree of freedom [4], [5], [6], [7], [8], while others proposed optimization methods with surrogate models to replace time-consuming FEA [9], [10], [11], [12], [13].
Because the permanent magnet (PM) is embedded, an IPMSM rotor can take a large number of geometries, so various design alternatives need to be considered during optimization. Bonthu et al. [4] minimized torque ripple and cogging torque by optimizing the notch shape of the rotor surface of a permanent-magnet-assisted synchronous reluctance motor. Islam et al. [5] optimized two rotor design parameters at multiple output points of an IPMSM using the response surface method. Zheng et al. [6] performed multi-objective optimization of an IPMSM with rare-earth PMs and ferrite PMs using the response surface method. These size optimizations with computer-aided design (CAD) were effective for optimizing the shape for a given topology. However, it is difficult to deal with multiple topologies because the geometries depend on the initial shape. To solve this problem, many studies have proposed rotor design based on topology optimization. Ishikawa et al. [7] minimized PM volume using multi-material topology optimization for an asymmetric IPMSM rotor. Sato et al. [8] applied multi-material topology optimization to an IPMSM rotor using a normalized Gaussian network. Although these methods produce completely new topologies, some topologies cannot be manufactured, and a very large number of candidate solutions must be considered in the optimization because of the huge design space. There is thus a need for a method that generates manufacturable design alternatives with multiple topologies in a small design variable space. This study uses a generative adversarial network (GAN) to meet this need. A GAN, proposed by Goodfellow et al. [14], is a deep generative model that uses two deep neural networks. The GAN has the advantage of dimensionality reduction from multidimensional images to a small latent variable space, and it can integrally represent design alternatives for various motor topologies.
FEA, a numerical modeling method, is generally used to calculate the characteristics of an IPMSM. FEA can be used to obtain very accurate operating characteristics of IPMSMs, but it is time-consuming. Many researchers have thus investigated the construction of prediction models using artificial intelligence (AI) to reduce analysis time while maintaining the accuracy of FEA. Dhulipati et al. [9] used support vector regression (SVR) to train a prediction model for a six-phase IPMSM. Hao et al. [10] trained a model to learn the relationship between the design parameters and torque ripple of an IPMSM using radial basis function networks, and used the model for optimization. Pan et al. [11] used XGBoost to learn the relationship between the torque characteristics and the structural parameters of permanent magnet arc motors, and used the model for optimization. These conventional machine-learning-based approaches are mainly used for size optimization. On the other hand, some studies use deep learning for topology optimization. Barmada et al. [12] used a convolutional neural network (CNN) to learn the relationship between the shape and torque characteristics of synchronous reluctance motors, and used the CNN for optimization. Asanuma et al. [13] trained a model to learn the relationship between the topology near the rotor surface and the torque characteristics of an IPMSM using transfer learning with CNNs. The CNN has the advantage of achieving highly accurate predictions from complex topological information by feature extraction, making it suitable for predicting the characteristics of design alternatives generated by a GAN.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
These conventional AI-based modeling methods achieve high accuracy but take into account only specific geometries or one current vector condition, which limits the range of applications they can serve. The small scale of these modeling techniques is due to the difficulty in obtaining sufficient data. For example, when training a prediction model to replace FEA, the training data are generally generated by FEA. Assuming that the FEA for generating training data takes 12.4 minutes to analyze the speed-torque characteristics of an IPMSM [15], generating 100000 datasets by FEA would take more than two years.
This research solves the problem of obtaining sufficient data and aims to construct a general-purpose IPMSM design system that applies deep learning models to design and modeling. Fig. 1 shows an overview of the present study. First, for data acquisition, we use semi-supervised learning where the training data are generated using machine learning. In [15], the authors proposed a method for training a prediction model that can accurately predict the speed-torque characteristics of double-layered IPMSMs from a small number of design parameters and FEA results using machine learning. This prediction model can be used to calculate the operating characteristics of various double-layered IPMSMs from their design parameters to generate a large dataset in a short time. Thus, by constructing a machine learning model that is limited to a certain typical rotor shape from a small dataset, and applying the data generation process to various shapes, we can quickly obtain sufficient data.
Using the generated data, an automatic rotor core design system based on the GAN and the CNN is constructed. The two trained deep learning models are then used for multi-objective optimization design. It is shown that the design time is significantly reduced. The contributions of this paper can be summarized as follows.
a) A quick training data generation method for large-scale deep learning is proposed.
b) A deep generative model that integrally represents different topologies in the latent space is applied.
c) A prediction model of FEA that can be applied in a wide range of current vector regions is presented.
d) An automatic design system for IPMSMs that enables quick design is proposed.
The rest of this paper is organized as follows. Section II describes the proposed method for quickly generating training data. Section III describes deep learning based on the generated dataset. Section IV shows the results of torque maximization and PM volume minimization using the proposed automatic design system. Section V summarizes the results.
Fig. 1. Overview of the present study. $d_i$ is the design parameters for each rotor shape. Blue arrows represent the prediction flow used to generate the training data described in Section II. Red arrows represent the data flow of the proposed rotor core design system described in Section III.

II. OBTAINING TRAINING DATA
Semi-supervised learning with machine learning is used to quickly generate a sufficient amount of data for training the deep learning models. In this section, we describe the process of generating training data.
The target of this study is an IPMSM with 8 poles and 48 slots of distributed winding stators. Fig. 2 shows the rotor topologies of 2D-, V-, and Nabla-structure IPMSMs [2], [16]. In this study, we generate data based on these three types of topologies. The permanent magnet is NMX-S49CH, and the iron core is 10JNEX900. See [2] for detailed information on the stator geometry, body size, and other parameters.
First, the relationship between rotor shape and speed-torque characteristics is learned using the method proposed in [15]. For the prediction model in [15], the input variables are the design parameters, such as the PM thickness, and current conditions. Although the objective is to predict the speed-torque characteristics, the prediction models do not directly predict the speed-torque characteristics; instead, we set three motor parameters (the PM flux linkage $\Psi_a$ and the d- and q-axis inductances $L_d$ and $L_q$) as the prediction targets to improve accuracy. The average torque $T$ and the limit speed $N_{lim}$ are obtained from the predicted motor parameters as follows.
where $P_n$ is the number of pole pairs, $i_d$ and $i_q$ are the d- and q-axis currents, respectively, and $V_{om}$ is the maximum induced voltage. The FEA conditions for the training data for the prediction models were generated according to (3).
where $U(a, b)$ is a random variable with a uniform distribution on the open interval $(a, b)$, $d_i$ is the i-th design parameter, and $d_{lwr}$ and $d_{upr}$ are its lower and upper bounds, respectively. The numbers of design parameters are 11, 5, and 8 for 2D, V, and Nabla, respectively. The relationship between the armature current $I_a$, the current phase angle $\beta$, and $i_d$ and $i_q$ is as follows.
The shapes were generated by CAD based on the randomly generated design parameters, and FEA was performed under the randomly generated current conditions to generate the training data for prediction models. The objective of the prediction models is to accurately learn the function f in the following equation. See [15] for more details.
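The quantities defined in this section can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation of [15]: it uses the textbook d-q axis torque expression, neglects winding resistance when computing the limit speed, expresses the speed in min^-1, and assumes that $\beta$ is measured from the q-axis so that $i_d = -I_a \sin\beta$ and $i_q = I_a \cos\beta$. The function names, the parameter bounds, and the numerical motor parameters are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_design(lower, upper, rng):
    """Draw one design-parameter vector, each entry uniform between its bounds."""
    return [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]


def dq_currents(i_a, beta_deg):
    """Assumed convention: beta is the current phase angle measured from the q-axis."""
    beta = math.radians(beta_deg)
    return -i_a * math.sin(beta), i_a * math.cos(beta)


def average_torque(p_n, psi_a, l_d, l_q, i_d, i_q):
    """Magnet torque plus reluctance torque in N*m."""
    return p_n * (psi_a * i_q + (l_d - l_q) * i_d * i_q)


def limit_speed(p_n, psi_a, l_d, l_q, i_d, i_q, v_om):
    """Speed in min^-1 at which the induced voltage reaches v_om (resistance neglected)."""
    psi_o = math.hypot(psi_a + l_d * i_d, l_q * i_q)  # flux linkage magnitude
    omega_e = v_om / psi_o                            # electrical angular speed, rad/s
    return omega_e / p_n * 60.0 / (2.0 * math.pi)


rng = random.Random(0)
design = sample_design([2.0, 10.0], [6.0, 30.0], rng)  # hypothetical bounds
i_d, i_q = dq_currents(i_a=100.0, beta_deg=30.0)
# Illustrative motor parameters (4 pole pairs for an 8-pole machine), not values from the paper
torque = average_torque(4, 0.1, 0.3e-3, 0.8e-3, i_d, i_q)
speed = limit_speed(4, 0.1, 0.3e-3, 0.8e-3, i_d, i_q, v_om=300.0)
print(design, round(torque, 1), round(speed))
```

In this reading, a trained prediction model plays the role of $f$: it maps a sampled design vector and a current condition to $(\Psi_a, L_d, L_q)$, after which the two helper functions turn those parameters into a point on the speed-torque characteristic.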
The prediction models described above are constructed for the three topologies. Table I shows the machine learning methods used to predict the characteristics for each topology, where GPR is Gaussian process regression. The values in parentheses are the coefficients of determination $r^2$ for the test data. For the training data, we used a total of 26209 randomly generated shapes (2D: 8256; V: 7927; Nabla: 10026), all of which were analyzed under random current vector conditions. The software JMAG-Designer 19.1 was used for the FEA. The prediction accuracies for Nabla were the lowest among the three topologies, despite Nabla having the largest amount of training data. A detailed discussion of the differences in prediction accuracy between topologies is beyond the focus of this study and will be the subject of future research, but the effects of the imbalance in prediction accuracy for each topology and each parameter are described in Section IV.
The trained prediction models and the randomly generated conditions according to (3) were then substituted into (5) to generate shape and motor parameter pairs for 55000 shapes for each topology, for a total of 165000 shapes. Table II summarizes the number of datasets used for training in each phase. The reason for generating 55000 shapes is to stabilize the images generated by the GAN explained in Section III-B. Because the motor parameters change nonlinearly due to the effect of magnetic saturation, we predicted the change in motor parameters versus the current vector condition for each shape. Because the maximum armature current of the IPMSM used in this research is 232 A, the characteristic data generated by the prediction are the discrete PM flux linkage for armature currents ranging from 5 to 235 A in 5-A increments and the discrete d- and q-axis inductance characteristics for d- and q-axis currents ranging from 5 to 235 A in 10-A increments. Here, the PM flux linkage is assumed to be independent of the current phase [15].
The data augmentation method described above predicted the characteristics for 623 current vector conditions for each shape, meaning that the FEA results for 102795000 conditions (165000 shapes × 623 conditions/shape) were predicted from FEA results for only 24000 conditions. The prediction of 102795000 data points was completed in a total of 3.6 hours, which shows that sufficient data can be obtained in a practical amount of time.
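The counts quoted above can be checked with a short calculation. The split of the 623 conditions into 47 flux-linkage points and 24 × 24 inductance points is one reading of the current grids described in this section that reproduces the stated total; it is an inference and is not stated explicitly in the text. The FEA-only estimate uses the 12.4 minutes per analysis quoted in Section I.

```python
# PM flux linkage: armature current 5..235 A in 5-A steps
n_flux = len(range(5, 236, 5))            # 47 grid points
# Inductances: d- and q-axis currents each 5..235 A in 10-A steps
n_axis = len(range(5, 236, 10))           # 24 grid points per axis
n_inductance = n_axis * n_axis            # 576 (i_d, i_q) pairs

conditions_per_shape = n_flux + n_inductance
n_shapes = 3 * 55_000                     # three topologies, 55000 shapes each
n_predictions = n_shapes * conditions_per_shape

# FEA-only estimate from Section I: 12.4 min per analysis, 100000 analyses
fea_years = 12.4 * 100_000 / (60 * 24 * 365)

print(conditions_per_shape, n_predictions, round(fea_years, 2))
# prints: 623 102795000 2.36
```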

III. AUTOMATIC DESIGN SYSTEM
Using the dataset created in Section II, we trained the deep learning models for the automatic rotor shape design system. The automatic design system consists of two types of deep learning model, one for design and the other for characteristic prediction. This section describes the training methods and the results for these two deep learning models.

A. Material Representation for Rotor Core
First, we describe the numerical representation of the motor shape. There are two types of numerical representation of motor shapes, namely that used in [15] (see Section II) that specifies the design parameters of the shape, denoted as the parameter representation, and that used in topology optimization that specifies the material at each coordinate, denoted as the material representation. The parameter representation can represent only one topology depending on the reference shape and is unsuitable for a system that handles multiple topologies in an integrated manner. For example, the design parameters used for the 2D, V, and Nabla structures are 11-, 5-, and 8-dimensional, respectively, making it difficult to handle different topologies with a given parameter representation. Therefore, in the proposed automatic design system, the motor shape is represented numerically by the material representation. Fig. 3 shows the material representation method used in the proposed system. An electromagnetic steel sheet, a PM, or air is specified for each pole coordinate of the rotor, and the three materials are assigned to the RGB (red, green, blue) values of the image as one-hot vectors, respectively, to represent the rotor shape in the image, as shown in the right image in Fig. 3. Because the shape considered in this study is d-axis symmetric and there is no magnet or air layer near the shaft, only half of the geometry in the circumferential direction and 60% of the geometry in the radial direction are converted into images. The magnetization direction (angle) of the PM is represented by the difference in the brightness of the blue color by inputting the normalized value of the angle information $d_{PM}$, which is calculated as follows: where $\theta_{PM} \in [-90^\circ, 90^\circ]$ is the angle of the magnetization direction of each PM.
In the actual image generation process for the material representation, the material information at each coordinate was extracted and converted into an image for the rotor shape whose dimensions were parametrically generated by CAD in JMAG-Designer. Note that due to the coordinate transformation, the straight part of the rotor shape becomes a gentle curve in the image.
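The material representation described in this subsection can be illustrated with a small rasterizer. This is only a sketch under several assumptions that are not taken from the text: steel is mapped to the red channel and air to the green channel, the sampling grid is polar over half a pole pitch (22.5° for 8 poles) and the outer 60% of the radius, and the brightness map from $\theta_{PM}$ to the blue channel is a hypothetical linear placeholder chosen so that PM pixels never become black. The helper `toy_rotor` is a made-up geometry standing in for the CAD model.

```python
STEEL, PM, AIR = "steel", "pm", "air"


def encode_pixel(material, theta_pm_deg=0.0):
    """Return (R, G, B) with one active channel per material.

    Assumed channel order: steel -> R, air -> G, PM -> B.
    The PM brightness map (0.5..1.0) is a placeholder, not the paper's formula.
    """
    if material == STEEL:
        return (1.0, 0.0, 0.0)
    if material == AIR:
        return (0.0, 1.0, 0.0)
    if material == PM:
        if not -90.0 <= theta_pm_deg <= 90.0:
            raise ValueError("magnetization angle must lie in [-90, 90] degrees")
        return (0.0, 0.0, 0.5 + 0.5 * (theta_pm_deg + 90.0) / 180.0)
    raise ValueError(f"unknown material: {material}")


def rasterize(material_at, r_outer, size=256, radial_fraction=0.6, half_pole_deg=22.5):
    """Sample material_at(r, phi_deg) on a polar grid into a 3 x size x size image."""
    image = [[[0.0] * size for _ in range(size)] for _ in range(3)]
    r_inner = r_outer * (1.0 - radial_fraction)
    for row in range(size):
        r = r_inner + (r_outer - r_inner) * (row + 0.5) / size
        for col in range(size):
            phi = half_pole_deg * (col + 0.5) / size
            material, theta = material_at(r, phi)
            for ch, value in enumerate(encode_pixel(material, theta)):
                image[ch][row][col] = value
    return image


def toy_rotor(r, phi_deg):
    """Made-up geometry: one magnet slot with an air barrier at its end."""
    if 60.0 <= r <= 65.0:
        return (PM, 0.0) if phi_deg <= 15.0 else (AIR, 0.0)
    return (STEEL, 0.0)


image = rasterize(toy_rotor, r_outer=80.0, size=32)
print(len(image), len(image[0]), len(image[0][0]))
```

Because the rows follow the radius and the columns follow the angle, a straight magnet edge in Cartesian coordinates maps to a curve in such an image, which matches the remark about the coordinate transformation.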

B. Training of Generative Adversarial Network
This study uses the lightweight GAN [17] to generate rotor images in the material representation. The training data for the lightweight GAN are the 165000 images converted from the shapes parametrically generated in Section II. Each image is a 3 × 256 × 256 tensor.
To determine the parameters of the GAN, this study uses the Fréchet Inception Distance (FID) [18]. FID measures the overall semantic realism of the synthesized images. We let the GAN randomly generate 50000 images and computed the FID between the generated images and the whole training dataset. Fig. 4 shows the number of parameters of the models and the calculated FID for different latent variable dimensions. To reduce the FID, the dimension of the latent variable needs to be increased, but the computational cost also increases with the number of parameters. Thus, the dimension of the latent variable was set to $2^8 = 256$. Fig. 5 shows an example of a shape generated by the lightweight GAN. The output images of the GAN clearly show the three types of rotor shape. All of these images were sampled from the same latent variable space, indicating that the use of the GAN allows us to handle a wide variety of shapes in a unified manner. In addition, the huge design space can be reduced to the 256 dimensions of the latent variable space, and undesignable shapes can be eliminated.

C. Training of Convolutional Neural Network
This study uses a CNN to predict the motor characteristics from rotor images generated by the GAN. Fig. 6 shows the architecture of the CNN used in this study. The regression CNN is built based on ResNet-18 [19]. It is a multi-task learning architecture that simultaneously predicts three motor parameters for a single shape. Because each motor parameter is nonlinear with respect to the current vector, the characteristic data generated in Section II were approximated by polynomial equations using the least-squares method. The coefficients of the approximation equation were used as the prediction target. The approximation equations are shown below.
where $w_i^p$ ($p \in \{\Psi_a, L_d, L_q\}$) is the coefficient, and $I_a$ is calculated as follows: $I_a = \sqrt{i_d^2 + i_q^2}$. As shown in Fig. 6, the number of output nodes in fully connected layers was determined according to the number of coefficients in (4)-(6).
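The least-squares approximation described above can be sketched as follows. The exact polynomial forms used for $\Psi_a$, $L_d$, and $L_q$ are not reproduced here, so the example fits a second-degree polynomial in the normalized armature current as an illustrative stand-in; the degree, the normalization by 235 A, and the synthetic flux-linkage curve are all assumptions. The helper names `solve_linear` and `fit_polynomial` are hypothetical.

```python
def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_polynomial(xs, ys, degree):
    """Least-squares coefficients w_0..w_degree via the normal equations."""
    n = degree + 1
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    return solve_linear(ata, aty)


i_max = 235.0
currents = [float(i) for i in range(5, 236, 5)]   # 5..235 A in 5-A steps
x = [i / i_max for i in currents]                 # normalized armature current
# Synthetic flux-linkage curve standing in for the generated data (made-up numbers)
psi = [0.10 - 0.02 * v - 0.01 * v * v for v in x]
w = fit_polynomial(x, psi, degree=2)
print([round(c, 6) for c in w])
```

Under this reading, the fitted coefficient vector `w` is what the CNN is trained to output for each motor parameter, so one forward pass yields the characteristic over the whole current range instead of a single operating point.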
The true values of the coefficients of each motor parameter for 165000 shapes were calculated to minimize the squared errors from the motor parameters in each of the 623 conditions for each shape generated in Section II. From the normalized 165000 datasets (combinations of shapes and approximation equation coefficients), 120000 were used as training data, 30000 were used as validation data, and 15000 were used as test data. Adam was used for optimization, the weight decay was 0.0001, the learning rate was 0.001, and the mini-batch size was 16. The loss function $L_{CNN}$ is defined as follows.
where $k_p$ is the coefficient for balancing losses, MSE is a function that returns the mean squared error, $w_p$ is the true value of the weight vector for each parameter, and $\hat{w}_p$ is the CNN prediction of $w_p$. In this study, we set the coefficients $(k_{\Psi_a}, k_{L_d}, k_{L_q}) = (3, 2, 1)$. Fig. 7 shows the prediction accuracy of the trained multi-task CNN on the test data, and Fig. 8 shows the training and validation errors. Note that this prediction accuracy is not for the FEA results, but for the data generated by machine learning. First, the validation error was almost the same as the training error, indicating no tendency to overfit. A comparison of the prediction results indicates that the accuracy was high except for the PM flux linkage and d-axis inductance in the Nabla structure. The low prediction accuracy of these two characteristics is due to the prediction error generated during data generation because their prediction accuracy in Table I is also low. Because a perfect prediction of the data with errors implies a deviation from the true data by FEA, this result in Fig. 7 is reasonable. Prediction accuracy for FEA is discussed in Section IV.
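The multi-task loss described above can be written compactly. The sketch below assumes that $L_{CNN}$ is a weighted sum of the per-parameter mean squared errors, which is the natural reading of the symbol definitions but is not confirmed by an equation in this text; the dictionary keys and the sample coefficient vectors are hypothetical.

```python
def mse(pred, true):
    """Mean squared error between two equal-length coefficient vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)


# Balancing coefficients (k_Psi_a, k_L_d, k_L_q) = (3, 2, 1) as set in this study
LOSS_WEIGHTS = {"psi_a": 3.0, "l_d": 2.0, "l_q": 1.0}


def multitask_loss(pred, true, weights=LOSS_WEIGHTS):
    """Assumed form: sum over motor parameters of k_p * MSE(w_p, w_hat_p)."""
    return sum(k * mse(pred[name], true[name]) for name, k in weights.items())


# Made-up coefficient vectors for one shape
true = {"psi_a": [0.10, -0.02], "l_d": [0.30, 0.01], "l_q": [0.80, -0.05]}
pred = {"psi_a": [0.11, -0.02], "l_d": [0.30, 0.03], "l_q": [0.75, -0.05]}
print(multitask_loss(pred, true))

# A one-hot weight vector reduces the loss to a single task
single_task = multitask_loss(pred, true, {"psi_a": 1.0, "l_d": 0.0, "l_q": 0.0})
print(single_task)
```

With one-hot weights the gradient from the other two heads vanishes, which corresponds to the non-multi-task baseline compared in Table IV.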
To demonstrate the high performance of the proposed CNN, we compare its performance with that of other conventional CNNs: ResNet-34 [19], ResNet-50 [19], VGG-11 [20] with batch normalization, and AlexNet [21]. Table III compares the prediction results and the model parameters for 10 training runs, where only the networks on the input side from the 1000-dimensional fully connected layer were changed while keeping the multi-task architecture in Fig. 6 fixed. All the training parameters were the same. The proposed CNN based on ResNet-18 and the CNN based on ResNet-34 had nearly equal validation errors for all motor parameters and had lower prediction errors than the other CNNs. In addition, the proposed model has fewer parameters and therefore requires less training time. Therefore, the proposed architecture is superior in both performance and computational cost.
Then, to verify the effectiveness of the proposed CNN's multi-task architecture, we compare the prediction accuracy of the CNNs when trained with different loss coefficients. Table IV compares their prediction results for 10 training runs, where one-hot coefficient vectors mean that the learning is not multi-task. The comparison shows that the multi-task architecture reduces the verification errors for all motor parameters because it suppresses overfitting. Furthermore, the setting with larger error weights for $\Psi_a$ and $L_d$ efficiently reduced the verification error, although with a slightly larger verification error in $L_q$.
Fig. 9. Overall configuration of automatic design system. The blue part is used for shape optimization, and the red part is used for optimal current vector search for a given shape.

IV. OPTIMIZATION DESIGN
The combination of the deep generative model and the characteristic prediction model in Section III leads to an automatic design system, as shown in Fig. 9. This section demonstrates the usefulness of the system by performing a multi-objective optimization design in the 256-dimensional latent variable space of the generative model.

A. Problem
In this study, the design goals are to minimize the volume of the PMs and maximize the maximum torque under the torque constraint. The problem setup is as follows: where $V_{PM}$ is the volume of a PM for each candidate solution and $T^{Max}_{pred}$ is the predicted maximum torque for each candidate solution. These parameters are normalized by the initial values $V_{PM}^{init}$ and $T^{Max}_{init}$, respectively. The volume of the PM was defined as a percentage of the image area. $w_1$ and $w_2$ are weight coefficients; in this setup, $(w_1, w_2) = (1, 1)$. The constraint condition $g_i$ is a torque constraint for $n$ required operating points, which is multiplied by a coefficient $\alpha = 1.03$ to consider the prediction error. The torque prediction results are given as the results of maximum power control at the required speed as follows: where $I_{am}$ is the maximum armature current, and $T_{CNN}$ and $N_{CNN}$ are the torque and limit speed calculated by substituting the motor parameters predicted by the CNN into (1) and (2), respectively. The solution for maximum power control was obtained by a brute-force search. NSGA-II [22] was used as the optimization algorithm, and the framework pymoo was used for the implementation [23]. The population size was set to 100 and the number of offspring was set to 10. Latin hypercube sampling was used for sampling the initial population, the tournament method was used for selection, simulated binary crossover was used for crossover, and polynomial mutation was used for mutation. The termination condition was set to 100 generations.
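The brute-force search for maximum power control and the torque constraint can be sketched as below. Several points are assumptions rather than statements from the text: the search grid over $(I_a, \beta)$, the d-q convention $i_d = -I_a \sin\beta$, $i_q = I_a \cos\beta$, the resistance-free limit speed, and the constraint form in which the predicted torque must reach $\alpha$ times each required torque. The callable `params_at` is a hypothetical stand-in for the CNN-predicted polynomial model, and `constant_params` uses made-up values that ignore saturation.

```python
import math


def torque(p_n, psi_a, l_d, l_q, i_d, i_q):
    return p_n * (psi_a * i_q + (l_d - l_q) * i_d * i_q)


def limit_speed(p_n, psi_a, l_d, l_q, i_d, i_q, v_om):
    psi_o = math.hypot(psi_a + l_d * i_d, l_q * i_q)
    if psi_o == 0.0:
        return math.inf                      # no voltage limit at zero flux linkage
    return v_om / psi_o / p_n * 60.0 / (2.0 * math.pi)


def max_torque_at_speed(params_at, n_req, i_am, p_n, v_om, n_i=58, n_beta=90):
    """Brute-force search over (I_a, beta) for the largest torque reachable at n_req."""
    best = 0.0
    for a in range(1, n_i + 1):
        i_a = i_am * a / n_i                 # current amplitude up to the limit I_am
        for b in range(n_beta + 1):
            beta = math.radians(90.0 * b / n_beta)
            i_d, i_q = -i_a * math.sin(beta), i_a * math.cos(beta)
            psi_a, l_d, l_q = params_at(i_d, i_q)
            if limit_speed(p_n, psi_a, l_d, l_q, i_d, i_q, v_om) >= n_req:
                best = max(best, torque(p_n, psi_a, l_d, l_q, i_d, i_q))
    return best


def satisfies_requirements(params_at, points, i_am, p_n, v_om, alpha=1.03):
    """Assumed constraint: predicted torque >= alpha * required torque at every point."""
    return all(
        max_torque_at_speed(params_at, n_req, i_am, p_n, v_om) >= alpha * t_req
        for n_req, t_req in points
    )


def constant_params(i_d, i_q):
    return 0.1, 0.3e-3, 0.8e-3               # hypothetical (Psi_a, L_d, L_q)


low = max_torque_at_speed(constant_params, 1000.0, 232.0, 4, 300.0)
high = max_torque_at_speed(constant_params, 11000.0, 232.0, 4, 300.0)
print(round(low, 1), round(high, 1))
```

In an optimizer such as NSGA-II, a check of this kind would be evaluated once per candidate shape and per required operating point, which is why a cheap surrogate for the motor parameters matters for the total design time.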

B. Optimization Results
To verify the robustness of the proposed system, this study performed optimization under three conditions. Table V summarizes the conditions, and the corresponding figures show the characteristics of all the populations generated in the optimization process under each condition. The maximum armature voltage was set to 507 V.
In Condition 1, the maximum armature current was set to 232 A, and the optimization was performed at two required operating points, (3000 min$^{-1}$, 197 Nm) and (11000 min$^{-1}$, 40 Nm), which were determined based on the reference motor in [2]. The solution population transitioned to satisfy the constraints; all the individuals in the final population satisfied the constraints. The Nabla structure most easily produced the maximum torque. Under these severe torque requirements, almost all of the Pareto solutions have the Nabla structure.
In Condition 2, the maximum armature current was fixed at 232 A and the torque constraint was relaxed from that in Condition 1. Optimization was performed at two required operating points, (3000 min$^{-1}$, 170 Nm) and (11000 min$^{-1}$, 40 Nm). In Condition 2, many candidate solutions for all three topologies satisfy the relaxed torque constraint. The V structure reduced the volume of PMs the most while maintaining high torque. The 2D structure, which was designed for high efficiency [2], does not appear in the Pareto solutions for Condition 2.
In Condition 3, the armature current limit was reduced from 232 to 104 A and the two required operating points were set to (1000 min$^{-1}$, 100 Nm) and (9000 min$^{-1}$, 30 Nm). In this condition, the maximum torque requirement was reduced, but the current limit was also reduced, resulting in a tighter torque requirement. As in Condition 1, many Nabla structures were selected because they obtain the torque easily. Thus, the optimization design of the proposed system can be performed under various current limits.
To verify the optimality of the obtained solutions, we performed comparative experiments with different multi-objective optimization algorithms: RNSGA-II [24], NSGA-III [25], UNSGA-III [26], and RNSGA-III [27]. Table VI compares the minimum fitness of the last population for each optimization, where the reference points for RNSGA-II and RNSGA-III were set to (0.82, -1.17) and (0.68, -1.07), and the reference directions for NSGA-III and UNSGA-III were set in 8 segments based on the Das-Dennis method. The other basic optimization parameters were common. The comparison showed that NSGA-II is superior to the other multi-objective optimization methods for this problem setting.
FEA was conducted on three candidate solutions selected from the Pareto solutions for each condition. Fig. 12 shows the prediction results and the FEA results for the speed-torque characteristics of the selected candidate solutions, where solutions A-I correspond to the solutions in Fig. 10 and the blue points represent the required operating points for each condition.
The shape images of the Pareto solutions are all clear and designable, confirming the effectiveness of the trained GAN. A comparison of the FEA results with the CNN prediction results indicates that the prediction accuracy of the speed-torque characteristics is very high for all candidate solutions. The FEA results for all Pareto solutions satisfy the required operating point, which means that the prediction error is less than 3%. These results imply that the poor prediction accuracy for the Nabla structure in Table I was improved by the CNN. In other words, the CNN recognized and eliminated as noise the insensitive error and standard error generated in the process of data generation by SVR and GPR, respectively. This shows that the proposed data generation method can tolerate a certain degree of inaccuracy in a machine learning method.
Finally, we discuss the optimization time. Fig. 13 shows a histogram of the design time for 100 optimization designs under each condition. For the calculations, a computer with an Intel Core i7-9700K CPU, 32.0 GB of RAM, and an NVIDIA GeForce RTX 2070 SUPER (8 GB) GPU was used. The proposed system can design a shape that satisfies the requirements in 13-15 seconds, effectively reducing the optimization time compared to that for the conventional optimization calculation for the same scale (generally several days to several weeks).

V. CONCLUSION
This paper proposed a deep learning technique for optimizing IPMSM rotors. The results can be summarized as follows.
a) We proposed a method for quickly generating a large amount of FEA data for training large-scale deep learning models using machine learning specific to each topology. This method generated 102795000 training data from 26209 FEA results. Also, the proposed data generation method could tolerate a certain degree of inaccuracy in a machine learning method.
b) The proposed generative model can be used to design rotor topologies for three IPMSMs with high precision and can represent different topologies in a unified 256-dimensional latent variable space.
c) We proposed a prediction model that can quickly and accurately predict motor parameters in a wide range of current vectors with various geometries.
d) We proposed an automatic rotor design system for IPMSMs using two deep learning models. The proposed system can be used for various required operating points and current ranges. The time required for design optimization was only 13-15 seconds.
Once the proposed automatic design system is trained, the design optimization under various conditions can be performed many times in a short time, leading to a significant reduction in the design and development time of IPMSMs. In the future, we will study the characteristics not covered in this paper, such as loss and vibration. Additionally, other magnetic materials and rotor topologies will be examined to expand the scope of the proposed automatic design system.
The dataset generated in Section II is available at IEEE DataPort [28] and the Python implementation of the characteristics prediction models is available at GitHub [29].