ViTab Transformer Framework for Predicting Induced Electric Field and Focality in Transcranial Magnetic Stimulation

Transcranial magnetic stimulation (TMS) is a non-invasive therapeutic technique for neurological diseases based on electromagnetic induction. To support new clinical applications and improve the efficacy of TMS for existing neurological disorders, this study develops a deep learning-based prediction model as an alternative to time-consuming electromagnetic (EM) simulation software. The main limitation of existing prediction models is that they consider very few input parameters of a standard coil, such as coil type and coil position, when predicting the electric field value. To overcome this limitation, a transformer-based prediction model, termed the ViTab transformer, is developed in this work to predict the electric field (E-max), the focality or area of stimulation (S-half), and the volume of stimulation (V-half) from several input parameters: the source MRI images, coil type, coil position, rate of change of current, brain tissue conductivities, and coil distance from the scalp. The proposed framework combines a vision transformer and a tab transformer to handle both image and tabular data. The prediction performance of the model is evaluated using the coefficient of determination ($R^2$ score) for E-max, V-half, and S-half in the testing phase. The obtained $R^2$ scores for E-max, V-half, and S-half are 0.97, 0.87, and 0.90, respectively. The results indicate that the ViTab transformer model can predict the electric field as well as the focality more accurately than current state-of-the-art methods. The reduced computation time, together with the high prediction accuracy, suggests that the ViTab transformer can assist neuroscientists and neurosurgeons in planning TMS treatment in the near future.



I. INTRODUCTION
Transcranial magnetic stimulation (TMS) is a non-invasive neuromodulation method that employs a strong magnetic field to stimulate neurons in the brain cortex. Since the introduction of TMS in 1985, this therapy has been used successfully to treat a wide range of neurological symptoms. The United States Food and Drug Administration (FDA) has, over the years, approved TMS for the treatment of various neuropsychiatric disorders such as major depressive disorder (MDD), obsessive-compulsive disorder (OCD), and other brain-related diseases. TMS is used when medication-based treatment does not work effectively for a neurological disorder. Its major advantages include its non-invasive nature, minor side effects, and low cost [1], [2], [3], [4], [5], [6].
TMS works on the basis of Faraday's law of electromagnetic induction: pulses of current flow through metallic coils placed on the human head and produce a time-varying magnetic field. The changing magnetic field then generates an electric field in the brain cortex, raises neurotransmitter levels, and improves neural connectivity. The generated electric field must be strong enough to depolarize the target neurons responsible for the neurological disease [7], [8], [9]. The depth (i.e., the field penetration distance from the vertex) and the focality (i.e., the area of stimulation) of the produced electric field are two additional parameters that determine the efficiency and side effects of TMS therapy [10], [11], [12].
The design of a new coil and the performance evaluation of a coil in a new application are carried out in simulation software. Iterative computer simulation of TMS helps to calculate the electric field strength within the brain cortex before the coil is applied clinically. Several electromagnetic (EM) software packages are available for manual TMS simulation, such as Sim4Life and SimNIBS. These programs go through a number of stages while estimating the electric field. The first step is to create a three-dimensional model of the human head from MRI images. Creating a three-dimensional human head model by segmenting MRI images takes about 8 to 10 hours [13], [14], [15], [16], [17], [18]. After that, the electric field intensity is computed using finite element analysis (FEA) of the volume conductor model (VCM). The calculation time for estimating the electric field depends on the voxel size of the tissue model [19], [20]. The estimated electric field intensity cannot be generalized from the result of a single head model, since the structure of the human head varies between subjects. Moreover, optimization with iterative simulation takes a significant amount of computation time.
Since estimating the electric field using the VCM requires a significant amount of computation time, a deep learning-based method can be employed as an alternative to reduce it. Several recently published works require far less computation time than conventional EM software by incorporating simulation data into deep learning architectures [13], [14], [15], [16], [17], [18], [21]. However, in the reported works, several parameters, including the distance of the coil from the scalp, the conductivity of the tissue [22], [23], [24], and the rate of current change [25], [26], are not taken into consideration as inputs for predicting the electric field with the deep learning-based model. For instance, Drakaki et al. [26] demonstrated that the induced electric field is linearly proportional to the rate of current change, which in turn depends on the coil model, stimulator model, and pulse intensity. By adjusting the rate of current change, clinicians can tailor the treatment to achieve desired outcomes, such as stimulating specific brain regions or modulating neural activity. Moreover, for safety reasons, this parameter needs to be controllable rather than fixed at an identical value for different neural disease cases in which the required activity threshold differs [27], [28]. Likewise, the studies [22], [23], [24] have demonstrated that the generated electric field changes significantly when the tissue conductivity is modified. For training purposes, it is therefore necessary to include the tissue conductivity as an input parameter with a wide range of variation that accounts for the conceivable practical uncertainties. If this is not done, the model can produce false predictions after deployment. Moreover, in real-world situations, Electrical Impedance Tomography (EIT) or Magnetic Resonance Electrical Impedance Tomography (MREIT) [24], [29], [30] measurements of brain-tissue electrical conductivity assist in identifying anomalies associated with neurological disorders. The model's prediction of the E-field for a given brain illness therapy would be incorrect if it did not take the MREIT-measured conductivity as an input parameter. Therefore, these input parameters play an essential role in predicting the output electric field: if one of them changes, the stimulation output also changes [22], [23], [24], [25], [26].
Accordingly, this study proposes a transformer-based model called ViTab to predict the electric field intensity (E-max), area of stimulation (S-half), and volume of stimulation (V-half) directly from MRI scans by considering a variety of factors: the subject-specific MRI image, coil type, coil position, rate of change of current, conductivity of brain tissues, and coil distance from the skin. A dataset made up of pairs of input parameters and the corresponding output values is used to learn the mapping from the input parameters to E-max, S-half, and V-half. This prediction model can accurately estimate the induced electric field in the head model.
The major contributions of this study are highlighted as follows:
• A larger dataset is generated by considering different input parameters, such as the human head model, coil type, coil position, coil distance from the skin, rate of change of current, and brain tissue conductivities, which supports effective field estimation.
• Different inputs, comprising an image feature, two categorical features, and five numerical features, and numerical outputs comprising E-max, V-half, and S-half are included in the database.
• A new prediction model, termed the ViTab transformer, is developed to learn sequential information from the input features for estimating E-max, S-half, and V-half instead of using electromagnetic software.
• The proposed model outperforms the existing state-of-the-art prediction models.
The organization of this article is as follows. Section I describes the fundamentals of transcranial magnetic stimulation, how it operates, and the limitations of electromagnetic software for computing electric fields in TMS. Section II presents the latest research on TMS simulation and its drawbacks. Section III describes the proposed architecture in detail. The results of the proposed model are discussed and analyzed in Section IV, along with a comparison to published works. The summary of the article and recommendations for future work are included in Section V.

II. RELATED WORKS
Motivated by the time complexity of commercial EM simulation software, a few researchers have recently focused on using deep learning-based networks to predict the electric field in a TMS system. For instance, Yokota et al. [13] suggested a deep neural network (DNN) model to predict the electric field of a figure-of-eight (fo8) TMS coil. The FreeSurfer software is used to segment the T1- and T2-weighted MRI images and create a 3D model of a human head. Finite element analysis is then performed in SimNIBS to estimate the induced electric field for several coil positions. Following that, a dataset is produced using a VCM for various MRI scans and coil positions, and this dataset is used to train the DNN model. The outcome demonstrates that the DNN model provides output quite similar to the SimNIBS EM software computation. However, the model's drawback is that it can only predict the electric field for a single type of TMS coil; if the coil type is altered, the model does not function correctly. Another limitation is that the estimation accuracy of the DNN model depends on the quality of the MRI scans, so its performance can deteriorate if the image quality deteriorates.
Moreover, a DNN model is proposed by Sathi et al. [14] to calculate the electric field inside a human head phantom model by considering different coil design parameters. In this work, the COMSOL Multiphysics software is used to build the dataset for training the DNN model. The VCM of a two-shell human head is created to compute the electric field for the halo-V assembly (HVA) coil with the variation of six design parameters. The DNN model is subsequently trained on the newly produced dataset of 100 samples. In this study, some characteristics, including conductivity and rate of change of current, are not taken into account. Moreover, a realistic human head model is not considered, which prevents the network from producing an accurate output in real-world scenarios. This analysis also ignores crucial output factors such as focality and V-half. Meanwhile, Tashil et al. [17] proposed a deep convolutional neural network (DCNN) to predict induced electric fields from the T1- and T2-weighted MRI of 11 healthy persons. The DCNN model could predict induced electric fields accurately, but its main drawback is that it could do so only for a fixed coil position. At the same time, Hongming et al. [18] developed a self-supervised deep learning model that can accurately predict induced electric fields for different coil positions of a single coil type. Subsequently, Guoping et al. [16] also proposed a DNN model based on T1-weighted isotropic and anisotropic MRI images, different coil types, different coil positions, and variation of the rate of change of current. This study did not consider the coil distance from the scalp or changes of conductivity as input parameters. In another study, Afuwape et al. [15] created a deep CNN model to estimate the strength of the produced electric field across sixteen distinct coil types directly from the patient's MRI data. T1-weighted MRI scans are used to create the 3D human head model. The induced electric fields for several coils in a 3D human head are calculated using the Sim4Life program, and a dataset of 3200 samples is produced from the simulated results. The produced dataset is used to train the CNN model, which can estimate the induced electric field and V-half. Due to the small number of samples in the dataset, the model's V-half prediction accuracy is extremely low. Another drawback of this model is that it only takes into account coil types and coil positions, neglecting other parameters that affect the electric field, such as the rate of change of current, the conductivity of brain tissues, and the distance from the scalp.
Most of the earlier deep learning works on TMS [13], [14], [15], [16], [17], [18] have focused on predicting the electric field using the coil type, coil position, and MRI image as input parameters. However, these works ignore vital factors such as the conductivities of the brain tissues, the distance of the coil from the skin, and the rate of current change. These characteristics are important for accurate electric field estimation, even though the majority of recent studies have treated them as having default values. Considering the limitations of existing works, this study generates a new database that covers different factors: MRI image types, coil configurations, coil positions on the human head, coil distance from the skin, and conductivity of brain tissues, including white matter, gray matter, and scalp conductivity. Moreover, no research has yet been found that attempts to estimate the induced electric field, focality, and V-half by taking all relevant elements into account simultaneously. Another advantage of this work is the use of a transformer-based model, which learns contextual information from the input parameters quickly and improves the prediction accuracy over existing state-of-the-art models.

III. METHODOLOGY
Fig. 1 illustrates the overall workflow used to predict the induced field of a TMS coil. The process begins with the generation of data samples through the conventional simulation process in SimNIBS. After that, a ViTab transformer model is trained on the generated data. The performance of the ViTab model is then assessed on a testing dataset comprising previously unseen data for the final prediction of E-max, V-half, and S-half of the TMS coil. A detailed explanation of the entire prediction process is presented in the following subsections.

A. Conventional Simulation Process
The data samples are generated with the SimNIBS software, in which the VCM of the human head and the different coil parameters are set up for finite element analysis (FEA). The resulting output samples are then paired with the corresponding input samples to fit the transformer-based prediction model.
1) Human Head Models: The T1-weighted MRI scans of 32 healthy individuals ranging in age from 20 to 70 years are collected from the Western University Centre for Functional and Metabolic Mapping (CFMM) [31]. With the aid of MATLAB software, 3D human head models are generated from the individual MRI scans in SimNIBS by segmenting the images into several anatomical layers, including bone, cerebrospinal fluid (CSF), eyes, gray matter (GM), ventricles, and white matter (WM).
2) Finite Element Analysis: Following the creation of the 3D VCM, the simulation starts by determining the rate of change of the magnetic vector potential $A$ of the fitted TMS coil [32], denoted $dA/dt$, which is calculated using the Biot-Savart law under the quasi-static approximation [33], as shown in Equation (1).
where $B$ is the magnetic flux density in the static case, and $r$ and $r'$ denote spatial position vectors. The magnetic vector potential $A$ is obtained from the magnetic flux density, with $V$ denoting integration over all of space. Moreover, $\sigma$, $\vec{n}$, and $\varphi$ represent the tissue conductivity, the normal vector to the tissue surface, and the electric scalar potential, respectively.
Then, the induced electric field in the 3D volume conductor head model, $E = -\nabla\varphi - \partial A/\partial t$, is calculated by solving Equation (2) with the Neumann boundary condition shown in Equation (3). Because the frequency content of TMS pulses is confined to frequencies lower than around 10 kHz, the assumption is made that E-fields fluctuate slowly with time [29]. As a result of these computations, each tetrahedral element of the head mesh is assigned an electric field value.
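Equations (1) to (3) are referenced in this subsection but do not appear in the text as reproduced here. As a hedged reference only, the quasi-static formulation commonly used in TMS field modeling can be written with the symbols defined above as follows; the exact expressions and numbering in the original may differ, and the integration variable $r'$ over the volume $V$ is an assumption based on the surrounding definitions.

```latex
\begin{align}
% (1) Vector potential recovered from the flux density (Coulomb gauge),
%     integrating over all of space V
A(\mathbf{r}) &= \frac{1}{4\pi} \int_{V}
    \frac{B(\mathbf{r}') \times (\mathbf{r} - \mathbf{r}')}
         {\lvert \mathbf{r} - \mathbf{r}' \rvert^{3}} \, dV' \\
% (2) Current conservation, \nabla \cdot (\sigma E) = 0,
%     with E = -\nabla\varphi - \partial A / \partial t
\nabla \cdot \left( \sigma \nabla \varphi \right) &=
    -\nabla \cdot \left( \sigma \frac{\partial A}{\partial t} \right) \\
% (3) Neumann condition: no current leaves through the outer surface
\sigma \left( \nabla \varphi + \frac{\partial A}{\partial t} \right)
    \cdot \vec{n} &= 0
\end{align}
```

Under this reading, the second relation follows from substituting $E = -\nabla\varphi - \partial A/\partial t$ into $\nabla \cdot (\sigma E) = 0$, and the third states that the normal component of the current density $\sigma E$ vanishes on the boundary.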
After the simulation, three output parameters are produced: the 99th percentile of the induced electric field (E-max), the 50th percentile volume of stimulation (V-half), and the focality (S-half). Fig. 2 aids in understanding E-max (99th percentile), D-half (50th percentile), V-half (50th percentile), and S-half (50th percentile). In this figure, $x_1^{Emax_{99th}}$ and $x_2^{Emax_{50th}}$ denote the locations (points) at which the 99th percentile (E-max) and the 50th percentile of the induced electric field are found. Equation (4) gives the distance between these two points, which is called D-half. The volume of stimulation (V-half) is then found using Equation (5), which calculates the volume between the 99th and 50th percentiles of the induced electric field. Finally, Equation (6) gives the focality (S-half) of the TMS coil in terms of V-half and D-half.
Here, D-half indicates the distance to the brain region at which the induced electric field is half of E-max, and V-half and S-half denote the volume of stimulation and the area of stimulation (focality), respectively.
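The output quantities described above can be illustrated with a minimal sketch that post-processes a simulated field given per mesh element. It is not the authors' implementation: the function name `focality_metrics`, the choice of the two reference points, and the ratio S-half = V-half / D-half are assumptions made for illustration, since Equations (4) to (6) are not reproduced in the text.

```python
import numpy as np

def focality_metrics(e_mag, volumes, centroids):
    """Return (E-max, D-half, V-half, S-half) for one simulated field.

    e_mag     : (n,) field magnitude per mesh element, in V/m
    volumes   : (n,) element volumes
    centroids : (n, 3) element centre coordinates
    """
    e_mag = np.asarray(e_mag, dtype=float)
    volumes = np.asarray(volumes, dtype=float)
    centroids = np.asarray(centroids, dtype=float)

    # E-max: 99th percentile of the field magnitude, as stated in the text.
    e_max = np.percentile(e_mag, 99)

    # Elements whose field is at least half of E-max.
    above_half = e_mag >= 0.5 * e_max

    # V-half: total volume of those elements.
    v_half = volumes[above_half].sum()

    # ASSUMPTION: x1 is the element closest in value to E-max, and x2 is the
    # farthest element from x1 that still reaches half of E-max.
    x1 = centroids[np.argmin(np.abs(e_mag - e_max))]
    d_half = np.linalg.norm(centroids[above_half] - x1, axis=1).max()

    # ASSUMPTION: focality is taken as the ratio of V-half to D-half.
    s_half = v_half / d_half
    return e_max, d_half, v_half, s_half

# Toy example: four unit-volume elements stacked along the z axis.
e_max, d_half, v_half, s_half = focality_metrics(
    e_mag=[100.0, 80.0, 50.0, 10.0],
    volumes=[1.0, 1.0, 1.0, 1.0],
    centroids=[[0, 0, 0], [0, 0, 1], [0, 0, 2], [0, 0, 3]],
)
```

In the toy example, three of the four elements reach half of E-max, so V-half is 3 volume units and D-half spans two element spacings.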

B. Dataset Generation & Pre-Processing
For a head model (T1-weighted MRI image, $I$) together with the input parameters coil type ($x_1$), coil position ($x_2$), rate of change of current ($x_3$), coil distance ($x_4$), WM conductivity ($x_5$), GM conductivity ($x_6$), and scalp conductivity ($x_7$), the output parameters E-max (>100 V/m) [35], [36], V-half, and S-half are recorded. Consequently, a database is created with a combination of image, categorical, and numerical data types for the input features, and numerical data types for the output predictions. A total of 1024 samples are collected from the simulation software to create the dataset $D$ in CSV format, where $N$ denotes the number of samples in the database. After creating the database, pre-processing is performed on the image and numerical input features. Images are augmented by rotating each image clockwise by 0°, 45°, 90°, 180°, and 270°, as indicated in Fig. 3, to increase the total number of samples and help improve the model's accuracy [37], [38]. Fig. 4 shows a correlation heatmap, generated with the Seaborn Python library, in which the correlation between the augmented images and the original image is measured as ∼0.47. The newly generated images therefore differ from the original image by more than 0.50, which qualifies the augmented images as new content with different pixel information. After image augmentation, the final database contains a total of 5120 samples. A summary of the database, with the ranges of values, is presented in Table I.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.

TABLE I INTERPRETATION OF INPUT AND OUTPUT FEATURES

In addition, normalization is applied only to the numerical data to keep the values within a range of 0 to 1. A visualization of the normalized numerical data is presented in Fig. 5. The database is split into training and testing datasets with a ratio of 80:20, so that the training and testing datasets contain 4096 and 1024 samples, respectively. The training dataset is then fitted to the ViTab transformer model to learn the complex relationship between input and output pairs, whereas the testing dataset is used to evaluate what the model has learned.
The vision transformer (ViT) processes the input MRI image data, whereas the tab transformer processes the numerical and categorical parts of the input data. The combination of two different transformers allows the ViTab transformer design to learn the order of the input characteristics and to build a complex relationship between input and output features [39], [40], [41], [42]. Each component of the ViTab transformer is described in the following.
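The sample counts, the min-max normalization of the numerical features, and the 80:20 split described above can be sketched as follows. This is a minimal illustration with NumPy only; the helper names, the random placeholder feature values, and the seed are assumptions, not part of the original pipeline.

```python
import numpy as np

def min_max_normalize(x):
    """Scale each column of x to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(axis=0), x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return (x - lo) / span

def train_test_split_indices(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into train and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]

# Sample counts stated in the text: 1024 simulations, 5 rotation angles.
n_original = 1024
rotations_deg = [0, 45, 90, 180, 270]
n_total = n_original * len(rotations_deg)

# Placeholder values standing in for the five numerical features x3..x7.
numeric = np.random.default_rng(1).uniform(0.1, 2.0, size=(n_total, 5))
numeric_scaled = min_max_normalize(numeric)

train_idx, test_idx = train_test_split_indices(n_total, test_fraction=0.2)
```

The sketch normalizes before splitting, which mirrors the order given in the text; computing the minimum and maximum on the training portion only would be a stricter alternative.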

C. ViTab Prediction Framework
1) ViT Transformer: The working principle of the ViT is to attend to every region of an image in order to identify the relationships within the resulting sequence. To accomplish this, the transformer block of the model incorporates a multi-head self-attention mechanism, in which several self-attention operations are performed simultaneously to generate a contextual embedding vector based on how strongly one feature or patch is related to another. As the transformer block accepts a 1D encoded input vector, patch encoding of the 2D MRI scans is performed before they are fed to the transformer block.
a) Patch encoder: The central axial MRI image $I$, with dimensions $64 \times 64 \times 3$, is divided into patches $I_p \in \mathbb{R}^{N \times (P^2 \cdot C)}$, where $P$ represents the patch size and the total number of patches is $N = (64 \times 64)/P^2$. Then, a linear projection $E \in \mathbb{R}^{(P^2 \cdot C) \times p_d}$ is applied to flatten each patch and embed it with a dimension of $p_d$. The information about the position of each embedded patch is carried by a positional embedding $E_{pos}$, which is combined with the patch embedding. The output of the patch encoder, $Z_0 \in \mathbb{R}^{N \times p_d}$, is obtained after combining with $E_{pos}$.
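The patch encoding step can be sketched in a few lines of NumPy. The patch size `P = 8`, the embedding width `P_D = 32`, and the random weights are hypothetical values chosen for illustration, and the positional embedding is added element-wise here (as in the standard ViT), which is an assumption consistent with the stated output shape $N \times p_d$.

```python
import numpy as np

def encode_patches(image, patch_size, proj, pos_emb):
    """Split an (H, W, C) image into flattened patches, project, add positions."""
    h, w, c = image.shape
    p = patch_size
    # (H/p, p, W/p, p, C) -> (H/p, W/p, p, p, C) -> (N, p*p*C)
    patches = image.reshape(h // p, p, w // p, p, c).transpose(0, 2, 1, 3, 4)
    patches = patches.reshape(-1, p * p * c)
    # Linear projection E, then positional embedding E_pos (assumed additive).
    return patches @ proj + pos_emb

rng = np.random.default_rng(0)
P, C, P_D = 8, 3, 32                    # hypothetical patch size and width
image = rng.random((64, 64, C))         # stand-in for the 64 x 64 x 3 MRI slice
N = (64 * 64) // P**2                   # number of patches
E = rng.standard_normal((P * P * C, P_D)) * 0.02
E_pos = rng.standard_normal((N, P_D)) * 0.02

Z0 = encode_patches(image, P, E, E_pos)  # shape (N, P_D)
```

In a trained model, `E` and `E_pos` would be learned parameters rather than random draws.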
b) Transformer block: The main goal of the transformer block is to attend to every input patch embedding vector and determine how much one vector affects another. To this end, the transformer block uses multiple heads of self-attention to attend to the input embedding sequence before passing the results to a feed-forward network. Self-attention computes three parametric matrices (query, key, and value) for each input using projection matrices, where $k$ and $v$ are the dimensions of the key and value, respectively. The query, key, and value matrices are obtained as follows.
After calculating the parametric matrices, the most relevant features are found by the attention function $f_0$, and the individual attention outputs are combined and projected to obtain the multi-head self-attention output $m_{vh}$, where $h$ represents the number of heads. The final output of the ViT transformer, $V_T$, is then obtained by passing the normalized output of $m_{vh}$ through a multilayer perceptron (MLP) layer.
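As an illustration of the mechanism described above, the following sketch implements standard scaled dot-product multi-head self-attention in NumPy. It is the textbook formulation, not necessarily the exact variant used by the authors; the sizes, the equal key and value dimensions, and the random weights are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(z, w_q, w_k, w_v, w_o):
    """z: (N, d); w_q, w_k, w_v: (h, d, d_k); w_o: (h * d_k, d)."""
    # Per-head query, key, and value projections.
    q = np.einsum("nd,hdk->hnk", z, w_q)
    k = np.einsum("nd,hdk->hnk", z, w_k)
    v = np.einsum("nd,hdk->hnk", z, w_v)
    d_k = q.shape[-1]
    # Attention weights: how strongly each patch attends to every other patch.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_k)   # (h, N, N)
    weights = softmax(scores, axis=-1)
    heads = weights @ v                                 # (h, N, d_k)
    # Concatenate the heads and project back to the model width.
    concat = heads.transpose(1, 0, 2).reshape(z.shape[0], -1)
    return concat @ w_o, weights

rng = np.random.default_rng(0)
N, D, H, D_K = 64, 32, 4, 8            # hypothetical sizes
z0 = rng.standard_normal((N, D))       # stand-in for the patch encoder output
w_q, w_k, w_v = (rng.standard_normal((H, D, D_K)) * 0.1 for _ in range(3))
w_o = rng.standard_normal((H * D_K, D)) * 0.1

m_vh, attn = multi_head_self_attention(z0, w_q, w_k, w_v, w_o)
```

Each row of `attn` sums to one, so every output vector is a weighted average of the value vectors of all patches.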
2) Tab Transformer: The tabular data $X^{Table}$ comprises two types of features: categorical features $X^{Table}_{Cat}$ and numerical features $X^{Table}_{Num}$. The coil type ($x_1$) and coil position ($x_2$) form the categorical features, $X^{Table}_{Cat} = \{x_1, x_2\}$, and the numerical features are $X^{Table}_{Num} = \{x_3, x_4, x_5, x_6, x_7\}$. Of these two groups, the categorical features are passed to the Tab transformer model to reveal the contextual information of the feature classes. The basic building block of the Tab transformer is a stack of transformer blocks with column embedding.
a) Column embedding: Column embedding is used to represent each class of the categorical features as an embedding vector, where the $i$-th categorical feature $X^{Table}_{Cat_i}$ has $d_i$ classes. These classes are converted into numerical values so that they can be passed to the transformer blocks. The embedding $e_{\phi_i} \in \mathbb{R}^{(d_i+1) \times l}$, with dimension $l$, for the $i$-th feature with $d_i$ classes generates the following embedded output.
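A column embedding of this kind amounts to one lookup table per categorical column. The sketch below assumes hypothetical class counts for coil type and coil position and an embedding width of 16; the extra row per table ($d_i + 1$) is kept as in the text, and treating it as a reserved slot for an unknown class is an assumption.

```python
import numpy as np

def build_column_embeddings(n_classes_per_column, dim, seed=0):
    """One (d_i + 1, dim) lookup table per categorical column."""
    rng = np.random.default_rng(seed)
    return [rng.standard_normal((d + 1, dim)) * 0.02
            for d in n_classes_per_column]

def embed_categorical(x_cat, tables):
    """Map integer class codes (batch, n_cols) to (batch, n_cols, dim)."""
    x_cat = np.asarray(x_cat)
    return np.stack(
        [tables[j][x_cat[:, j]] for j in range(x_cat.shape[1])], axis=1
    )

# Hypothetical class counts: 4 coil types, 8 coil positions; width l = 16.
N_COIL_TYPES, N_COIL_POSITIONS, L = 4, 8, 16
tables = build_column_embeddings([N_COIL_TYPES, N_COIL_POSITIONS], L)

# Three samples, each with (coil type code, coil position code).
x_cat = np.array([[0, 3], [2, 7], [1, 0]])
E_phi = embed_categorical(x_cat, tables)   # shape (3, 2, 16)
```

The resulting sequence of two embedded columns per sample is what the subsequent transformer block attends over.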
b) Transformer block: The output of the column embedding, $E_\phi$, is fed into the transformer block based on multi-head attention $m_{Th}$, which generates query, key, and value vectors for each column-embedded vector. After that, the output of $m_{Th}$ is concatenated with the input $E_\phi$. The resulting output, $R_c$, is then passed to an MLP unit. The output of the transformer block, $T_T$, is created by adding the MLP's output to $R_c$.
3) Final Prediction: The outputs of the ViT and Tab transformers are flattened so that all of the features are turned into a 1D vector. After that, this vector is concatenated with the normalized numerical features, $N_n$, to generate the concatenated output $Y_{ViTab}$. Then, an MLP with 7 dense layers of 6000-2000-512-256-28-64-32 sequential nodes is applied to $Y_{ViTab}$. Finally, an output dense layer of 3 units is employed for the final prediction of E-max, V-half, and S-half. The following equations are used for the prediction.
For all 7 dense layers, the rectified linear unit (ReLU) activation function is used to add non-linearity to the regression model, and a linear activation function is used for the final output layer. The selected parameters of the ViTab transformer model are summarized in Table II. The parameter values were chosen after several training runs, based on the best model performance.
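The fusion and regression head described above can be sketched as a plain forward pass. The layer widths are those listed in the text; everything else (the function names, the random He-style initialization, and the shapes of the stand-in transformer outputs) is a hypothetical illustration in NumPy rather than the Keras model used in the study.

```python
import numpy as np

LAYER_WIDTHS = [6000, 2000, 512, 256, 28, 64, 32]  # widths as listed in the text
N_OUTPUTS = 3                                      # E-max, V-half, S-half

def init_mlp(input_dim, widths, n_outputs, seed=0):
    """Random float32 weights and zero biases for each dense layer."""
    rng = np.random.default_rng(seed)
    dims = [input_dim, *widths, n_outputs]
    params = []
    for fan_in, fan_out in zip(dims[:-1], dims[1:]):
        scale = np.float32(np.sqrt(2.0 / fan_in))
        w = rng.standard_normal((fan_in, fan_out), dtype=np.float32) * scale
        params.append((w, np.zeros(fan_out, dtype=np.float32)))
    return params

def predict(vit_out, tab_out, numeric, params):
    """Flatten both transformer outputs, append numeric features, run the MLP."""
    batch = vit_out.shape[0]
    y = np.concatenate(
        [vit_out.reshape(batch, -1), tab_out.reshape(batch, -1), numeric],
        axis=1,
    ).astype(np.float32)
    for w, b in params[:-1]:
        y = np.maximum(y @ w + b, 0.0)   # dense layer with ReLU
    w, b = params[-1]
    return y @ w + b                     # linear output layer

rng = np.random.default_rng(1)
vit_out = rng.standard_normal((2, 16, 8))   # hypothetical ViT output per sample
tab_out = rng.standard_normal((2, 2, 8))    # hypothetical Tab output per sample
numeric = rng.random((2, 5))                # normalized x3..x7

params = init_mlp(16 * 8 + 2 * 8 + 5, LAYER_WIDTHS, N_OUTPUTS)
y_vitab = predict(vit_out, tab_out, numeric, params)   # shape (2, 3)
```

With trained weights, each row of `y_vitab` would hold the three predicted outputs for one sample.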

D. Hyperparameter Settings
Selecting the model hyperparameters is a vital task for obtaining good results from any deep learning model. The hyperparameters selected for the proposed ViTab framework are shown in Table III. The Adam optimizer with a learning rate of 0.0001 is chosen for compiling the model. Moreover, the mean square error loss function is used to compute the regression loss over 100 epochs. For both training and testing, the Google Colab platform with Python 3.8.15 is used. The NumPy 1.3.5 and Scikit-learn 1.0.2 packages are used for dataset preparation and model evaluation. In addition, the Keras 2.9.0 and TensorFlow 2.9.2 frameworks [43] are employed for model implementation.

IV. RESULT AND DISCUSSION
In this section, the results obtained from extensive experiments on the developed database are presented. To ensure a reliable prediction, a 4-fold cross-validation technique is applied, in which the data samples are divided randomly 4 times, using about 80% as the training set and the remaining 20% as testing data. At every fold, data shuffling is performed to ensure non-repeated samples during training and testing. Finally, the results are obtained by averaging the results of the proposed model over the folds.
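The validation procedure can be sketched as four shuffled 80:20 splits whose scores are averaged. This reading (repeated random splits rather than four disjoint folds) is an assumption based on the 80% and 20% figures in the text; `fit_and_score` is a hypothetical callback standing in for training and evaluating the model.

```python
import numpy as np

def repeated_holdout_splits(n_samples, n_repeats=4, test_fraction=0.2, seed=0):
    """Yield (train_idx, test_idx) pairs from freshly shuffled indices."""
    rng = np.random.default_rng(seed)
    n_test = int(round(n_samples * test_fraction))
    for _ in range(n_repeats):
        order = rng.permutation(n_samples)
        yield order[n_test:], order[:n_test]

def average_score(n_samples, fit_and_score, **split_kwargs):
    """Run fit_and_score on every split and return the mean and the raw scores."""
    scores = [fit_and_score(train_idx, test_idx)
              for train_idx, test_idx
              in repeated_holdout_splits(n_samples, **split_kwargs)]
    return float(np.mean(scores)), scores

def dummy_fit_and_score(train_idx, test_idx):
    """Placeholder: returns the test fraction instead of a real model score."""
    return len(test_idx) / (len(train_idx) + len(test_idx))

mean_score, fold_scores = average_score(5120, dummy_fit_and_score)
```

Replacing `dummy_fit_and_score` with a function that trains the model on `train_idx` and returns, for example, the test $R^2$ gives the averaged figure reported per output.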

A. Quantitative Analysis
The quantitative analysis of the proposed model is performed using four metrics: coefficient of determination ($R^2$), mean absolute error (MAE), mean square error (MSE), and mean absolute percentage error (MAPE). Table IV summarizes these evaluation metrics for E-max, S-half, and V-half prediction with the ViTab model. In terms of $R^2$, E-max is predicted most accurately, with a value of 0.97, compared with S-half and V-half. The individual prediction performances based on (25) to (28) are described in detail in the following.
Here, $y_i$ is the actual simulated value, $\hat{y}_i$ is the predicted value, $\bar{y}$ is the mean of the simulated values, and $n$ is the total number of samples.
• E-max: In the training and testing phases, the model for E-max prediction provides $R^2$ values of 0.98 and 0.97, respectively. An $R^2$ this close to 1 indicates that the model explains nearly all of the variance between the actual and predicted values and that its predictions fall within a reasonable range. The MAPE, which is 1.340% for the training phase and 1.509% for the testing phase, is likewise within an acceptable range.
• V-half: The $R^2$ values of V-half for the training and testing datasets are 0.93 and 0.87, respectively. The model provides results that are near the real output value by properly generalizing the connection between the input characteristics and the actual output. The MAPE values are 2.946% during the training phase and 3.943% during the testing phase, demonstrating how effectively this model can perform the prediction. For both MAE and MSE, the error is comparatively very low.
• S-half: The training $R^2$ score for the S-half prediction is 0.96, and the testing $R^2$ score is 0.90. This suggests that the model can generalize the link between the input characteristics and the actual output well enough to provide output that is close to the true value. As further evidence of the model's effectiveness, the MAPE is 2.039% in the training phase and 3.197% in the testing phase.
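The four metrics can be computed directly from their standard definitions, using the symbols defined above. The sketch below is a NumPy illustration with made-up values; it is not taken from the study's evaluation code, which is stated to rely on Scikit-learn.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R2, MAE, MSE, and MAPE (in percent) for one output variable."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)                        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": np.mean(np.abs(err)),
        "MSE": np.mean(err ** 2),
        "MAPE": 100.0 * np.mean(np.abs(err / y_true)),
    }

# Made-up simulated and predicted values, for illustration only.
metrics = regression_metrics(
    y_true=[100.0, 120.0, 140.0, 160.0],
    y_pred=[102.0, 118.0, 143.0, 158.0],
)
```

MAPE divides by the simulated value, so it is only meaningful when the targets are non-zero, as is the case for field magnitudes above the stated 100 V/m threshold.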
The parity plots of the actual and predicted values of E-max, V-half, and S-half are shown in Fig. 7. The plots indicate that the ViTab transformer model predicts accurately, because the simulated and predicted values lie close to the fitted line. Therefore, the proposed model can predict the E-max, V-half, and S-half values accurately.

B. Qualitative Analysis
A sample is taken from the created dataset to analyze the model performance qualitatively. Table V shows how well the ViTab transformer model predicts the values of E-max, V-half, and S-half for the sample data. Table V indicates that the predicted values for E-max and S-half are quite close to the actual values. On the other hand, the gap between the actual and predicted values is somewhat larger for V-half, which is consistent with its $R^2$ score being lower than those of the other two outputs.

C. Computational Time Analysis
Table VI reports the computation time for the ViTab transformer model and the SimNIBS EM software. The ViTab model requires 0.025 s on a GPU and 0.033 s on a CPU. In SimNIBS, by contrast, each simulation requires 7 min 40 s, excluding the time needed to generate the VCM; when VCM generation is included, the required time is 8 to 10 hours. The ViTab transformer model is therefore preferred over the SimNIBS EM software because of its lower computation time.
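To put these timings in perspective, the speed-up per simulation can be worked out directly from the figures quoted above. The short calculation below uses only those numbers and excludes the VCM generation time; the variable names are illustrative.

```python
# Timings quoted in the text, converted to seconds.
simnibs_seconds = 7 * 60 + 40              # 7 min 40 s per simulation
vitab_seconds = {"GPU": 0.025, "CPU": 0.033}

# Ratio of simulation time to prediction time for each device.
speedup = {device: simnibs_seconds / t for device, t in vitab_seconds.items()}

for device, factor in speedup.items():
    print(f"{device}: about {factor:,.0f} times faster per simulation")
```

Under these figures, a single prediction is on the order of ten thousand times faster than a single FEA run, before accounting for the hours needed to build the head model.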

D. Model Comparison
To assess the proposed ViTab model against existing state-of-the-art models such as DNN and CNN, these models are also trained on the developed database. Table VII summarizes their performance for E-max, V-half, and S-half prediction in both the training and testing phases. This table shows that the proposed ViTab transformer model outperforms the DNN and CNN models by a significant margin for identical input features. For E-max prediction, the test $R^2$ scores are 0.86, 0.89, and 0.97 for the DNN, CNN, and ViTab transformer models, respectively. For V-half, the DNN and CNN models both reach a testing $R^2$ score of 0.83, whereas the ViTab model reaches 0.87. For S-half, the testing $R^2$ scores are 0.84, 0.86, and 0.90 for DNN, CNN, and ViTab transformer, respectively. Therefore, across all performance indices, the ViTab transformer model performs better than the other models because it acquires sequential and contextual knowledge of the input characteristics, whereas the other models struggle to capture this relationship properly.
Moreover, Table VIII shows a comparison between this study and the existing studies [14], [15] on TMS coil-induced field prediction. This study works with two types of data: structured data (numerical and categorical) and unstructured data (images). From these two types of data, the proposed model can predict the area, volume, and intensity of the stimulation. Compared with [15], the ViTab transformer model achieves higher $R^2$ scores of 0.97 and 0.87 for electric field and V-half prediction, respectively. The accurate prediction of these three output parameters could aid in an effective treatment process for neurological patients, since the process depends not only on the intensity of the induced electric field but also on the area and volume of the stimulation. In addition, the proposed model can predict the induced electric fields more accurately than the existing works by considering a wider variety of input coil parameters. In terms of the prediction score, $R^2$, the proposed ViTab model achieves the highest score among the compared works.

TABLE VIII COMPARISON BETWEEN EXISTING STUDIES AND PROPOSED ViTab TRANSFORMER MODEL
V. CONCLUSION

This study proposes a novel ViTab transformer model for TMS coil-induced field prediction that can simultaneously process both tabular and image-type data. In addition, a new database comprising 5120 samples is developed in this work by considering additional input properties, such as the rate of change of current and the conductivity of different tissue media, that significantly influence the electric field generated by the TMS coil. Compared to the existing models, the ViTab transformer model performs better in estimating the electric field, V-half, and S-half of the TMS coil, with test R² scores of 0.97, 0.87, and 0.90, respectively. This improved electric field prediction could help reduce side effects and unwanted stimulation by assisting neurosurgeons prior to TMS therapy. Moreover, it could support neuroscientists' TMS research on novel coil design and on analyzing the implications of practical uncertainties across different human cases in ViTab transformer-based electromagnetic software. Although the proposed model performs well for continuous value prediction, determining the direction of the electric field distribution is also important, so future work could add a segmentation task in parallel with the proposed regression network.

1) Human Head Models: The T1-weighted MRI scans of 32 healthy individuals ranging in age from 20 to 70 years are collected from the western university centre for functional and

Fig. 2. Visualization of the TMS coil-induced electric field distribution at points between the 99th and 50th percentiles of the distribution.

Fig. 3. Image data augmentation at different angles of clockwise rotation.

Fig. 6 represents the proposed prediction model, named the ViTab transformer, which combines a vision transformer and a tabular transformer to handle both image and tabular data. The inputs to these two distinct architectures comprise an image [I], categorical features (coil type [x1] and coil position [x2]), and numerical features (rate of change of current [x3], coil distance [x4], WM conductivity [x5], GM conductivity [x6], and scalp conductivity [x7]).
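The input layout described here can be sketched as a small data structure. This is a minimal illustration in Python, assuming one image plus the seven features [x1] to [x7]. The class name `TmsSample`, the helper `split_inputs`, the category vocabularies, and all values are hypothetical and do not come from the paper's implementation.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical category vocabularies; the actual coil types and positions
# in the paper's database are not listed in this section.
COIL_TYPES = ["coil_a", "coil_b"]
COIL_POSITIONS = ["pos_1", "pos_2", "pos_3"]


@dataclass
class TmsSample:
    image: List[List[float]]    # I: 2-D image as rows of pixel values
    coil_type: str              # x1 (categorical)
    coil_position: str          # x2 (categorical)
    current_rate: float         # x3: rate of change of current
    coil_distance: float        # x4: coil distance from the scalp
    wm_conductivity: float      # x5: white matter conductivity
    gm_conductivity: float      # x6: grey matter conductivity
    scalp_conductivity: float   # x7: scalp conductivity


def split_inputs(sample):
    """Return (image, categorical indices, numerical vector).

    The image would feed the vision branch; the categorical indices and
    numerical vector would feed the tabular branch.
    """
    categorical = [
        COIL_TYPES.index(sample.coil_type),
        COIL_POSITIONS.index(sample.coil_position),
    ]
    numerical = [
        sample.current_rate,
        sample.coil_distance,
        sample.wm_conductivity,
        sample.gm_conductivity,
        sample.scalp_conductivity,
    ]
    return sample.image, categorical, numerical


# Hypothetical sample; values are placeholders without physical meaning.
sample = TmsSample(
    image=[[0.0, 0.5], [0.5, 1.0]],
    coil_type="coil_b",
    coil_position="pos_3",
    current_rate=1.0,
    coil_distance=2.0,
    wm_conductivity=0.1,
    gm_conductivity=0.2,
    scalp_conductivity=0.3,
)
image, categorical, numerical = split_inputs(sample)
```

Mapping categorical features to integer indices is a common step before an embedding layer in tabular transformers. Whether the proposed model encodes them this way is not stated in this section.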

Fig. 7. Parity plot of training and test data for E-max, V-half, and S-half prediction.

TABLE V ACTUAL VS PREDICTED VALUES FOR A SINGLE SAMPLE OF THE DATABASE

Table VII presents the loss and accuracy results of E-max,

Algorithm 1 ViTab Transformer Model for Predicting the Output of TMS Coil
S-half mean squared error loss: $S\text{-half}_{Loss} = L_{MSE}(S\text{-half}, Y_{S\text{-half}})$
2. Testing stage
Predicted value: $\{E\text{-max}; V\text{-half}; S\text{-half}\}_{test\_pred} = Y_{Final\_ViTab}(Sample_{image,\,categorical,\,numerical})$
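The mean squared error loss named in Algorithm 1 can be illustrated with a short Python sketch. It assumes each of the three outputs (E-max, V-half, S-half) is scored with its own MSE against its targets. How the individual losses are weighted or combined during training is not recoverable from this fragment, so the sketch reports them separately. The function names and numbers are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def per_output_losses(targets, predictions):
    """Compute one MSE loss per output name (e.g. E-max, V-half, S-half)."""
    return {name: mse(targets[name], predictions[name]) for name in targets}


# Hypothetical targets and predictions, for illustration only.
targets = {
    "E-max": [1.0, 2.0],
    "V-half": [10.0, 20.0],
    "S-half": [5.0, 7.0],
}
predictions = {
    "E-max": [1.0, 2.0],
    "V-half": [11.0, 19.0],
    "S-half": [5.0, 9.0],
}
losses = per_output_losses(targets, predictions)
```

Keeping the losses separate makes it easy to see which output the model fits worst before deciding how to combine them.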

TABLE VII PERFORMANCE EVALUATION OF THE PROPOSED ViTab TRANSFORMER OVER THE EXISTING DEEP LEARNING MODELS