Diffusion Barrier Prediction of Graphene and Boron Nitride for Copper Interconnects by Deep Learning

The continued downscaling of interconnects must be accompanied by ultra-thin diffusion barrier layers, which are used to suppress Cu diffusion into the dielectrics. Unfortunately, conventional barrier layers with thicknesses less than 4 nm fail to perform well. With the advent of 2D layered materials, graphene and hexagonal boron nitride have been proposed as alternative Cu diffusion barriers with thicknesses of ≈1 nm. However, defects such as vacancies may evolve into a Cu diffusion path, which is a challenging problem in the design of diffusion barrier layers. The energy barrier for a Cu atom diffusing through a di-vacancy defect in graphene and hexagonal boron nitride is calculated by density functional theory. It is found that graphene offers a higher energy barrier to Cu than hexagonal boron nitride. The higher energy barrier is attributed to the stronger interaction between Cu and C atoms in graphene, as shown by the charge density difference and Bader's charge. Furthermore, we use the energy barriers of different vacancy structures to generate a dataset for machine learning. Our trained convolutional neural network is used to predict the energy barrier of Cu migration through randomly configured defected graphene and hexagonal boron nitride with $R^{2}$ of >99% for a $4 \times 4$ supercell. These results provide guidance on choosing between 2D materials as barrier layers and on applying deep learning to predict 2D barrier performance.


I. INTRODUCTION
According to the International Technology Roadmap for Semiconductors (ITRS) [1], the device size has been shrinking continuously, from 22 nm in 2012 to 14 nm in 2014, 10 nm in 2016, 7 nm in 2018 and 5 nm in 2020. Since the 130 nm technology node, Cu has become the dominant interconnect material owing to its superior properties compared to Al, such as low resistivity and electromigration resistance [2]. However, the high diffusivity of Cu makes it susceptible to diffusion into the surrounding dielectrics [3]-[6]. Cu diffusion may cause a short-circuit between neighboring Cu interconnects, or a deep-level trap [5] in the transistors underneath.
The associate editor coordinating the review of this manuscript and approving it for publication was Wai-Keung Fung.
In addition to Cu diffusion, the size effect also presents a major issue in Cu interconnects. When the thickness of an interconnect is comparable to the mean free path of an electron (39 nm for Cu at room temperature), the resistivity increases significantly compared to bulk Cu. The size effects include different types of scattering phenomena such as electron surface scattering, grain boundary scattering, surface roughness scattering and impurity scattering [7]-[9].
To prevent Cu diffusion, a barrier must be integrated into the interface between Cu and the dielectric. Conventional barrier materials such as Ta, TaN and TiN [10]-[12] have been used to block the diffusion between Cu and the dielectric. However, their relatively high resistivity and their deficient blocking properties at thicknesses of several nanometers make it vital to search for an alternative. In addition to the scaling-down issues for these materials, their adhesion to Cu is not ideal and presents a challenge. Therefore, other materials such as Ta have been integrated between the diffusion barrier and Cu.
With the advent of nanotechnology, 2D materials can play an important role in IC technology. For instance, embedding a 2D material in the interconnect as a barrier layer would be crucial due to the continuous scaling-down of the back-end-of-the-line (BEOL). Fortunately, many 2D materials were found to exhibit good blocking properties. Among these are graphene and hexagonal boron nitride (hBN). Many studies showed that graphene has excellent barrier properties [13]-[15], due to its high impermeability and diffusion blockage properties [16], [17]. Moreover, graphene can reduce the surface scattering of Cu, which makes it suitable as a liner layer [18], [19]. On the other hand, hBN, which has almost the same hexagonal structure as graphene with a lattice mismatch of 1.5% [20], was found to be capable of suppressing Cu diffusion into dielectrics [21] with a thickness of ≈ 1 nm.
For a perfect 2D layer, Cu atoms experience a large energy barrier (≈ 31 eV) [22] when penetrating through the basal plane, which makes it a perfect barrier layer. However, defects in 2D materials such as vacancies, grain boundaries and edges can have a significant influence on the mechanical, optical, thermal, and electrical properties of the material [23], [24]. The presence of defects in 2D materials is inevitable during the synthesis and transfer processes and can affect the overall performance of the barrier between the metal and the dielectric in the BEOL, especially for Cu, which can diffuse fast into dielectrics. Different types of defects may lead to different Cu diffusion behavior. It is intuitive that the larger the defect size, the lower the energy barrier for a Cu atom to cross the layer, as shown in Ref. [25], which can be attributed to the weak interaction between the crossing Cu atom and the atoms neighboring the defect.
Recently, machine learning (ML) has shown great potential in the prediction of material properties. For instance, a convolutional neural network (CNN) was able to identify phases and phase transitions of matter [26], which makes it possible to exploit CNNs in predicting properties of materials with arbitrary structures, such as predicting the bandgap of configurationally hybridized graphene and hexagonal boron nitride [27], or assisting in designing new materials [28]. Furthermore, a combination of machine learning and material databases successfully predicted several properties of stoichiometric inorganic crystalline materials, such as material classification, bandgap energy and heat capacity [29]. In another study, a combination of an analytical solution and molecular dynamics was used to train shallow and deep neural networks to predict the fracture stress of graphene samples [30].
Although many 2D materials have been demonstrated as barrier layers for Cu diffusion, a qualitative and quantitative comparison has not yet been explored. Here we aim to investigate the electronic interaction between a diffusing Cu atom and a graphene/hBN layer, which ultimately assists in choosing between graphene and hBN as a barrier layer. Furthermore, with the help of density functional theory (DFT), we generated three sets of data, i.e., training, validation and test sets. Each set consists of the structure configurations of the 2D layer, each represented as a 2D matrix, and their corresponding energy barriers. We restricted the defect type to single mono-vacancy and double mono-vacancy, due to the relatively small supercell.
This paper is organized as follows: Section II investigates graphene and hBN as barrier layers in Cu interconnects. Section III applies ML to predict the barrier performance for Cu diffusion through the defected 2D layer. Finally, the conclusions are summarized in Section IV.

II. GRAPHENE AND HEXAGONAL BORON NITRIDE AS BARRIER LAYERS IN Cu INTERCONNECTS
In this section, we investigate graphene and hBN as barrier layers for Cu atom diffusion through a defect in the sheet. The defect is assumed to be a di-vacancy in the 2D layer. The di-vacancy defect is large enough for a Cu atom to pass through, which helps to capture the atomic interaction between the 2D layer and the diffusing atom. All our calculations were performed within density functional theory (DFT) as implemented in the QUANTUM ESPRESSO [31] simulation package. All details of our simulations are listed in Section V.

A. ADSORPTION OF Cu ON GRAPHENE AND hBN
We first validate our model by calculating the adsorption energy of Cu on perfect graphene. Three possible adsorption sites were investigated, i.e., top, bridge and hollow. During the relaxation, adatoms were only allowed to move in the direction perpendicular to the graphene basal plane, while the relaxation of all C atoms was unrestricted. The relaxation procedure was stopped when the Hellmann-Feynman forces on all atoms were smaller than $10^{-2}$ eV/Å. The adsorption geometry is obtained from the positions of the atoms after relaxation. The adatom height is defined as the difference between the z coordinate of the adatom and the average of the z coordinates of all C atoms in the graphene layer. Of the three adsorption sites considered, the site with the largest adsorption energy in magnitude (minimum total energy) is referred to as the favored site. The adsorption energies of Cu on perfect graphene are shown in Table 1. The preferred adsorption site was found to be the bridge site with −0.264 eV, which is in good agreement with previous reports [32], followed by the top and hollow sites with −0.263 and −0.12 eV, respectively.
The existence of intrinsic defects such as vacancies is unavoidable in 2D layers grown by the chemical vapor deposition (CVD) method. These defects are susceptible to further enlargement during the transfer process and the subsequent treatment. Among the different paths for the Cu atom, the path perpendicular to the basal plane of the 2D layer is considered the fastest diffusion path [33].
The adsorption of Cu on a mono-vacancy layer is investigated by removing a single atom from a perfect layer; the Cu atom is then positioned above the defect and the whole structure is relaxed. For hBN, the vacancy atom may be either B or N. Since the B and N radii are different, the Cu atom is expected to be adsorbed at different heights with different adsorption energies. Table 2 shows the heights and the adsorption energies for the Cu atom. It is observed that the Cu atom is adsorbed above the defect center at a height of 1.41 Å for graphene, 1.37 Å for hBN with a B vacancy, and 1.8 Å for hBN with an N vacancy.
We further calculate the interaction between a Cu atom and the 2D layer with a di-vacancy. For the graphene layer, the Cu atom is adsorbed exactly at the center of the di-vacancy with a C-Cu distance of 1.9 Å, as shown in Fig. 1a. The adsorption energy is calculated to be −5.1 eV. The lower (more negative) adsorption energy means that adsorption of the Cu atom on the graphene layer with a di-vacancy is more favorable. The same behavior is observed for the hBN layer, except that the symmetry is broken at the di-vacancy and the Cu atom is adsorbed with B-Cu and N-Cu distances of 2.04 and 1.88 Å, respectively (Fig. 1b). The adsorption energy is −5.94 eV, which is in good agreement with the literature [34]. It can be noticed that the Cu-N distance is shorter than the Cu-B distance, as shown in Fig. 1b. Therefore, it is expected that Cu has a stronger interaction with the neighboring N atoms when crossing the hBN basal plane, which will determine the energy barrier behavior as discussed later.
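The site-selection rule and the adatom-height definition above can be illustrated with a minimal sketch. The adsorption energies are the values quoted in the text; the function names and the z coordinates in the usage lines are hypothetical and only serve as an illustration.

```python
# Adsorption energies (eV) of Cu on perfect graphene, as quoted in the text.
adsorption_energy_ev = {"bridge": -0.264, "top": -0.263, "hollow": -0.12}


def favored_site(energies):
    """Return the site with the most negative adsorption energy."""
    return min(energies, key=energies.get)


def adatom_height(z_adatom, z_layer_atoms):
    """Height of the adatom above the mean z coordinate of the layer atoms."""
    mean_z = sum(z_layer_atoms) / len(z_layer_atoms)
    return z_adatom - mean_z


site = favored_site(adsorption_energy_ev)
ranking = sorted(adsorption_energy_ev, key=adsorption_energy_ev.get)
print(site, ranking)

# Hypothetical coordinates (in Å), chosen only to exercise the function.
print(adatom_height(2.0, [0.0, 0.1, -0.1]))
```

The bridge and top sites differ by only 1 meV in the quoted values, so the ranking between them is sensitive to the convergence settings of the underlying calculation.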

B. ENERGY BARRIER CALCULATION
A perfect 2D layer is highly impermeable to atomic species [16]. The diffusion barrier for Cu atom translocation through the hollow site of the graphene basal plane is as large as ≈ 31 eV [22], which makes it a perfect barrier layer for Cu diffusion. However, the presence of a vacancy defect may provide a columnar diffusion path for Cu due to the low electron density at the vacancy. The energy barrier that opposes the passage of Cu atoms through the defect is calculated using the nudged elastic band (NEB) method [35], which is implemented in the QUANTUM ESPRESSO package. To compare graphene and hBN as barrier layers, a Cu atom is positioned 3.5 Å above the defect center. The atom is then forced to move towards the di-vacancy center in the direction perpendicular to the basal plane of the 2D layer, and the energy barrier along the diffusion path is recorded as shown in Fig. 2. The energy barrier for the Cu atom increases as it gets closer to the basal plane and reaches its maximum value at the defect center, with 6.3 and 7.3 eV for hBN and graphene, respectively. In other words, a Cu atom requires a higher energy to diffuse through defected graphene than through defected hBN, which suggests that graphene outperforms hBN as a barrier layer for Cu.
The higher energy barrier of graphene can be explained by calculating the charge density difference (CDD) of the adsorbed Cu atom on the 2D layer with a di-vacancy relative to the pristine layer and the isolated Cu atom according to the formula
$$\Delta\rho = \rho_{\mathrm{2DL|Cu}} - \rho_{\mathrm{2DL}} - \rho_{\mathrm{Cu}}$$
where $\rho_{\mathrm{2DL|Cu}}$ and $\rho_{\mathrm{2DL}}$ are the charge densities of the 2D layer with and without a Cu atom, respectively, and $\rho_{\mathrm{Cu}}$ is the charge density of the isolated Cu atom in the same geometry. As shown in Fig. 3, there is considerable electron density transfer between the Cu atom and the nearby atoms for both graphene and hBN. However, there is a strong electron density depletion around the Cu atom at the graphene di-vacancy, as shown in Fig. 3a, which suggests a strong interaction between the C and Cu atoms. The Cu atom also experiences a similar electron density depletion at the hBN di-vacancy, with less depletion at the Cu-B side, as shown in Fig. 3b.
To elucidate the charge rearrangement between the Cu atom and the surrounding atoms, we consider the electronegativity of the chemical constituents: Cu is the least electronegative (1.90), followed by B (2.04), C (2.55) and N (3.04). Thus, the Cu adatom tends to donate charge to the surrounding atoms with higher electronegativity. Therefore, the CDD isosurfaces show more charge depletion from Cu towards the N, C and B atoms, in that order. Although the CDD explains the interaction between the Cu atom and the other atoms qualitatively, there is still a need to quantify the charge transferred between the atoms. With this aim, Bader's charge analysis [36] was used to calculate the donated (gained) charge at each atom, as shown in Fig. 3c and Fig. 3d. At the defect region in both structures, Cu always donates charge, with +0.81e and +0.58e for graphene and hBN, respectively. The atoms that gain the most charge are the N atoms, with −2.36e, while the B atoms donate the most, with +2.14e. On the other hand, all C atoms neighboring the Cu atom gain charge, as their charges are −0.12e and −0.19e. Therefore, the higher energy barrier for graphene can be attributed to the stronger interaction of the Cu atom with the graphene layer, as explained by the CDD isosurfaces and Bader's charge. The relatively low electronegativity of the B atoms results in less interaction with the Cu atom, and thereby hBN offers a lower energy barrier than graphene.
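As an illustration of how a barrier is read off an NEB profile, the sketch below takes the energies of the images along the path and reports the highest one relative to the first image. Only the peak values (7.3 eV for graphene, 6.3 eV for hBN) come from the text; the intermediate image energies and the function name are hypothetical placeholders, not computed data.

```python
def energy_barrier(image_energies_ev):
    """Barrier = highest image energy relative to the first (initial) image."""
    reference = image_energies_ev[0]
    return max(e - reference for e in image_energies_ev)


# Relative energies (eV) from 3.5 Å above the defect down to its center.
# Intermediate values are made up for illustration; only the last entry
# of each list (the peak at the defect center) is taken from the text.
graphene_profile = [0.0, 0.4, 1.9, 4.6, 7.3]
hbn_profile = [0.0, 0.3, 1.5, 3.9, 6.3]

barriers = {
    "graphene": energy_barrier(graphene_profile),
    "hBN": energy_barrier(hbn_profile),
}
better_barrier = max(barriers, key=barriers.get)
print(barriers, better_barrier)
```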
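A minimal sketch of the CDD evaluation follows, assuming the usual definition in which the densities of the layer without Cu and of the isolated Cu atom are subtracted point by point from the density of the combined system, all sampled on the same real-space grid. The flattened grids and their values are hypothetical; in practice they would come from the DFT post-processing output.

```python
def charge_density_difference(rho_total, rho_layer, rho_cu):
    """Point-wise CDD: rho(layer + Cu) - rho(layer) - rho(Cu).

    All three densities must be sampled on the same grid (flattened here).
    """
    if not (len(rho_total) == len(rho_layer) == len(rho_cu)):
        raise ValueError("densities must share the same grid")
    return [t - l - c for t, l, c in zip(rho_total, rho_layer, rho_cu)]


# Hypothetical densities on a three-point grid (arbitrary units).
rho_total = [0.5, 1.0, 0.25]
rho_layer = [0.25, 0.5, 0.25]
rho_cu = [0.5, 0.25, 0.0]

cdd = charge_density_difference(rho_total, rho_layer, rho_cu)
print(cdd)  # negative entries mark depletion, positive entries accumulation
```

Because the three systems hold the same total number of electrons, the CDD should sum to approximately zero over the grid, which is a convenient sanity check.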

III. ENERGY BARRIER PREDICTION BY MACHINE LEARNING FOR GRAPHENE AND hBN

A. CONSTRUCTION OF THE VCN
In this paper, we employ the VGG16 convolutional network (VCN) [37] for predicting the energy barrier of graphene and hBN for Cu. Fig. 4 shows a schematic of the VCN, which combines convolution layers and fully-connected (FC) layers into one model. In our VCN model, the input matrix corresponding to the 2D structure is fed to the input layer. The input matrix is transformed into feature maps, which are passed from one layer to the next until the last convolution layer. The data is then down-sampled by a max pooling layer, whose output serves as the input to the fully-connected layers.
The last FC layer acts as the output layer, which contains the predicted energy barrier (more details about the VCN are given in supplementary note A, and the detailed hyperparameters are shown in Table S1).
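The data flow described above (convolution, down-sampling by max pooling, then a fully-connected output) can be sketched as a single forward pass in plain Python. This is a toy illustration only: it uses one 3×3 filter, one pooling step and one FC layer with hand-picked weights, whereas the actual VCN follows the much deeper VGG16 layout with trained parameters. All function names and weights here are hypothetical.

```python
def conv2d_same(x, kernel):
    """3x3 convolution (cross-correlation) with zero padding."""
    rows, cols = len(x), len(x[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    r, c = i + di, j + dj
                    if 0 <= r < rows and 0 <= c < cols:
                        acc += x[r][c] * kernel[di + 1][dj + 1]
            out[i][j] = acc
    return out


def relu(x):
    return [[max(0.0, v) for v in row] for row in x]


def max_pool_2x2(x):
    """Down-sample by taking the maximum of each 2x2 block."""
    return [[max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
             for j in range(0, len(x[0]) - 1, 2)]
            for i in range(0, len(x) - 1, 2)]


def dense(features, weights, bias):
    return sum(f * w for f, w in zip(features, weights)) + bias


def predict_barrier(x, kernel, weights, bias):
    """conv -> ReLU -> max pool -> flatten -> FC, returning one scalar."""
    fmap = max_pool_2x2(relu(conv2d_same(x, kernel)))
    flat = [v for row in fmap for v in row]
    return dense(flat, weights, bias)


# Toy 4x4 input with one vacancy (0) and untrained, hand-picked parameters.
structure = [[1, 1, 1, 1], [1, 0, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
identity_kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
output = predict_barrier(structure, identity_kernel, [0.25] * 4, 0.0)
print(output)
```

With untrained weights the output has no physical meaning; the sketch only shows how a structure matrix is reduced to a single regression value.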

B. PREPARATION OF THE DATASET AND NETWORK TRAINING
After the neural network is built, it is trained on the generated dataset, which consists of the input structural matrices and their corresponding energy barrier values. In order to prepare a sufficiently large dataset for the neural network, we conducted DFT calculations to obtain the energy barrier of a Cu atom for all defected structures with a mono-vacancy or a double mono-vacancy. Structural information must be well described to serve as input data to train the neural network (NN). The material descriptor, which contains the structural details of the material, is crucial [38] for training and affects the model accuracy. Since the materials under consideration have a 2D structure, it is suitable to use a 2D matrix to represent the structure of graphene and hBN. One intuitive way to construct the input matrix is to represent the C atom in graphene by ''1'' and the vacancy defect by ''0'', as shown in Fig. 5a & 5c. Similarly, the B and N atoms are represented by ''1'' and ''2'', respectively, as in Fig. 5b. The matrix representation of the 2D structure helps the NN to capture the features of the topologically defected structure. During the training process, the NN learns the energy barrier of a mono-vacancy layer to a Cu atom, how the position of the vacancy affects the value of the energy barrier, and how two mono-vacancies interact with each other.
In our model, we consider a 4 × 4 supercell of graphene and hBN. First, we assume a single mono-vacancy in the supercell, which gives 32 structures. Every structure has a single mono-vacancy, as shown in Fig. S1(a-c) (Supplementary Material). Then, we assume a double mono-vacancy in the supercell, which gives an additional 496 structures, as shown in Fig. S1(d-f) (Supplementary Material). After the construction of the dataset, it is split into training, validation and test sets. The training set is used to adjust the weights and biases during the training phase, while the validation set is used to avoid over-fitting of the network. The test set provides a measure of the model performance; thus, it is not seen by the model during training. The split is performed randomly. The mean square error (MSE) is used as the criterion to terminate the training of the model.
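The encoding described above can be sketched as follows. The codes (0 for a vacancy, 1 for C or B, 2 for N) are taken from the text; the 4×8 matrix shape for the 32-atom supercell and the checkerboard placement of B and N are assumptions made for this illustration, since the exact layout is only given in Fig. 5.

```python
VACANCY = 0
CARBON = 1
BORON = 1
NITROGEN = 2


def pristine_graphene(rows, cols):
    """Matrix of 1s: every lattice site holds a C atom."""
    return [[CARBON] * cols for _ in range(rows)]


def pristine_hbn(rows, cols):
    """Alternating B (1) and N (2); the checkerboard layout is an assumption."""
    return [[BORON if (r + c) % 2 == 0 else NITROGEN for c in range(cols)]
            for r in range(rows)]


def with_vacancies(matrix, sites):
    """Return a copy of the matrix with the given (row, col) sites set to 0."""
    defected = [list(row) for row in matrix]
    for r, c in sites:
        defected[r][c] = VACANCY
    return defected


graphene = pristine_graphene(4, 8)
hbn = pristine_hbn(4, 8)
graphene_double = with_vacancies(graphene, [(0, 0), (2, 3)])
hbn_b_vacancy = with_vacancies(hbn, [(1, 1)])  # (1 + 1) % 2 == 0, a B site

for row in graphene_double:
    print(row)
```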
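The structure counts quoted above follow from simple combinatorics: 32 sites give 32 single mono-vacancies and C(32, 2) = 496 double mono-vacancies, 528 structures in total. The sketch below enumerates them and performs a random split. The test-set size of 16 follows from the 512 training-plus-validation samples mentioned later in the text; the validation fraction, the seed and the function name are assumptions for illustration.

```python
import random
from itertools import combinations

N_SITES = 32  # atoms in the 4 x 4 supercell

single = [(i,) for i in range(N_SITES)]
double = list(combinations(range(N_SITES), 2))
all_configs = single + double


def random_split(items, n_test, val_fraction, seed=0):
    """Shuffle once, then cut into train / validation / test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    test = shuffled[:n_test]
    rest = shuffled[n_test:]
    n_val = int(round(val_fraction * len(rest)))
    return rest[n_val:], rest[:n_val], test


# val_fraction = 0.2 is a hypothetical choice, not a value from the text.
train, val, test = random_split(all_configs, n_test=16, val_fraction=0.2)
print(len(single), len(double), len(all_configs))
print(len(train), len(val), len(test))
```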

C. ENERGY BARRIER PREDICTION FOR GRAPHENE AND hBN BY DEEP LEARNING
After the training phase, the test set is fed to the neural network to predict the energy barrier of the defected 2D material. The energy barrier of a Cu atom depends on the size of the defected area and on the surrounding atoms. That is, the number of vacancies and their positions affect the total energy barrier. For example, if the vacancies are located next to each other, the Cu atoms interact with each other and experience a higher energy barrier in graphene (≈ 12.45 eV), as shown in Fig. S2(a), whereas the energy barrier is fixed when there is a single C mono-vacancy. A connected double mono-vacancy can be formed at the boundary between two supercells with a double mono-vacancy, as shown in Fig. S3, due to the periodic boundary conditions. However, its energy barrier is the same as that of a double mono-vacancy within the same supercell. The presence of two atom types (B and N) in hBN results in two energy barrier values for a single mono-vacancy, depending on the vacancy atom. Clearly, there is a link between any atom and its neighboring atoms and, consequently, the energy barrier. As convolutional layers can extract features of matrix elements and their neighbors, the VCN detects these linkages, extracts the features of each structure and relates them to the output energy barrier during the NN training and validation. Although the training and validation datasets are relatively small (512 samples for each material), it can be noticed that the output is almost identical to the targeted energy barrier for both graphene and hBN, as shown in the lift chart in Fig. 6, except for the first two hBN samples with a low diffusion barrier (Fig. 6b). The lift chart shows the output of the test set (expected) and the energy barriers predicted by the VCN. We further calculated the mean absolute error (MAE), the coefficient of determination ($R^{2}$) and the root mean square error (RMSE).
These metrics were calculated using the test set, which provides an independent measure of the performance of the VCN, since it is not used during the training or validation phases. As provided in Table 3, the network shows a good performance for graphene, with MAE, MSE and RMSE of 0.07, 0.008 and 0.09, respectively. However, for hBN, these metrics are relatively high. We attribute the lower prediction performance of the network for hBN to the input representation matrix. Since hBN has two atom types (B and N) and graphene has only one (C), the VCN needs more data to reach a comparable performance. The prediction performance can be improved by considering a larger supercell and dataset size. Increasing the dataset size decreases the chances of overfitting and provides more atomic features in the input matrices.
Due to the extremely high computational cost, it is impractical to compute the energy barrier of all possible defect configurations in a 2D material. As shown previously, the defect may occur anywhere in the lattice, which results in a huge number of defect configurations. The diffusion properties and energy barrier of Cu are determined by the defect type and location. Therefore, it remains crucial to know the energy barrier of Cu for all configurations. The previous results show that it is possible to exploit ML to enhance the performance of 2D materials as barrier layers in the BEOL by predicting their energy barrier to Cu at a reasonable computational cost. Similarly, the barrier performance of other classes of 2D materials, such as transition metal dichalcogenides (TMDs), can be predicted. The structure of TMDs such as MoS$_2$ and WSe$_2$ can be represented by a 3D matrix, since it consists of a transition metal layer (Mo, W) sandwiched between two chalcogen layers (S, Se). This paves the way for a thorough comparison study among 2D materials as barrier layers.
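For reference, the four metrics named above can be computed from the expected and predicted barriers as in the sketch below. The formulas are the standard definitions; the sample values are hypothetical and are not results from Table 3.

```python
import math


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    return math.sqrt(mse(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical expected and predicted energy barriers (eV).
expected = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 5.0]
print(mae(expected, predicted), mse(expected, predicted),
      rmse(expected, predicted), r2(expected, predicted))
```

Since RMSE is the square root of MSE, the graphene values quoted in the text are mutually consistent: the square root of 0.008 is about 0.089, which rounds to the reported 0.09.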
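One possible form of the 3D matrix mentioned above is a stack of three 2D planes, chalcogen, metal and chalcogen. The sketch below is purely illustrative: the integer codes, the plane size and the function names are hypothetical, as the text does not specify an encoding for TMDs.

```python
# Hypothetical integer codes; the text does not define an encoding for TMDs.
VACANCY, METAL, CHALCOGEN = 0, 1, 2


def pristine_tmd(rows, cols):
    """Return a 3 x rows x cols nested list: chalcogen / metal / chalcogen."""
    layer_codes = (CHALCOGEN, METAL, CHALCOGEN)
    return [[[code] * cols for _ in range(rows)] for code in layer_codes]


def remove_atom(structure, layer, row, col):
    """Return a copy of the structure with one site replaced by a vacancy."""
    copy = [[list(r) for r in plane] for plane in structure]
    copy[layer][row][col] = VACANCY
    return copy


mos2 = pristine_tmd(4, 4)
defected = remove_atom(mos2, layer=0, row=1, col=2)  # top-layer S vacancy
for plane in defected:
    print(plane)
```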

IV. CONCLUSION
In this paper, we investigate defected graphene and hBN as barrier layers for Cu diffusion in the BEOL. The diffusing Cu atom experiences a higher energy barrier when crossing the di-vacancy defect of graphene. Charge density difference and Bader's charge show that the Cu-C interaction in graphene is stronger than the Cu-B/N interaction in hBN, which suggests that graphene is preferable to hBN as a barrier layer in Cu interconnects. Moreover, hundreds of DFT simulations are carried out to generate the energy barriers of configurationally defected graphene and hBN. These datasets are used to train, validate and test our NN model. Our trained model shows a good prediction performance, which may stimulate the application of machine learning to predict the performance of different classes of 2D barrier layers.

V. SIMULATION DETAILS
All our calculations were performed within density functional theory (DFT) as implemented in the QUANTUM ESPRESSO [31] simulation package. We used the generalized gradient approximation (GGA) in the parametrization by Perdew, Burke and Ernzerhof [39] and the projector augmented wave (PAW) method [40], [41]. A cut-off energy of 600 eV and Gaussian smearing with a width of σ = 0.025 eV for the occupation of the electronic levels were used. A Γ-centered Monkhorst-Pack 10×10×1 k-point mesh is used. Convergence tests were conducted for the chosen cut-off energy, Gaussian smearing and k-point mesh. All structures are modeled as 4 × 4 supercells, with periodic boundary conditions applied in the x-y plane. The repeated 2D layers are separated from each other by 20 Å of vacuum. The adsorption energies are calculated as