Interface Scattering as a Nonlocal Transport Phenomenon in Semiconductors

Macroscopically non-local effects are common in electron transport in semiconductor devices, occurring whenever the mean free path and/or the deBroglie wavelength are not small compared to geometry/flow length scales. When such effects are important, standard diffusion-drift (DD) theory becomes inaccurate and in need of revision, with the best known example being density-gradient (DG) theory wherein a gradient term is added to the electron gas equation of state to approximate the effect of quantum non-locality. Here we consider a similarly motivated gradient correction to the electron-lattice interaction that accounts for non-locality in the transport physics. Versions of DD and DG theory with this correction are discussed, and then are applied to the analysis of long-channel field-effect transistors where they are found to provide a physics-based approach for modeling and understanding interface scattering.


I. INTRODUCTION
Macroscopic (or continuum) approaches are widely used to model the electrical behavior of semiconductor devices, with diffusion-drift (DD) theory serving as the workhorse for the last seven decades [1]. But DD theory has its limits, and especially when the physical interactions represented within it can no longer be reasonably described as local and instantaneous. In such cases the theory can be improved by adding correction terms to its material response functions in order to better capture the non-local and/or history-dependent effects. The best known example of this type of corrected DD theory is density-gradient (DG) theory in which a gradient correction to the electron gas equation of state is used to model quantum non-locality effects [2]. In the present paper we introduce an analogous correction into the response function representing the electron-lattice interaction to allow non-local scattering effects to be captured. We then illustrate the corrected versions of DD and DG theory with an application to long-channel FETs operating under low-field conditions.
Although the approach to semiconductor modeling developed herein can serve as a phenomenology for fitting experimental data, as in earlier work [2] we emphasize its status as a theory, grounded in physical principles, and therefore having predictive power. The non-local effects of interest typically have a classical physics origin associated with the electron mean free path (mfp) not being small compared to geometric/flow length scales. These effects are of course automatically included in microscopic treatments such as in Monte Carlo simulation, e.g., [3], however, the implications of a long mfp for macroscopic descriptions have not been well explored in a semiconductor context. But this is not true in gas dynamics [4] where a seminal result is the Chapman-Enskog expansion that provides a series approximation to the Boltzmann equation (say, for a monatomic hard-sphere gas) in which the small parameter is the Knudsen number, i.e., the ratio between the mfp and a characteristic length scale [5]. The lowest order approximation in this series gives the Euler equations of fluid mechanics, while at first-order one finds the Navier-Stokes equations that describe viscous fluids. Beyond this, the second-order and third-order approximations are known as the Burnett [6] and super-Burnett [7] equations, respectively. These latter FIGURE 1. Depictions of the scattering and average velocity profiles for three different flow situations near surfaces: (a) ordinary fluid mechanics driven by a pressure head, with no slip at the boundary, and with a viscous interaction associated with particle-particle scattering; (b) electron flow driven by an electric field, with diffuse scattering at the surface, and with a long mfp (associated with electron-lattice scattering), and a short deBroglie wavelength; and (c) same as (b) but with a short mfp and a long deBroglie wavelength. The length of the arrows corresponds to the mfp. theories add correction terms to the Navier-Stokes description and they "should" have given better representations of the behavior at higher Knudsen numbers (e.g., in high Mach number flows that exhibit shocks). Unfortunately, experience has not borne out this expectation as the Burnett and super-Burnett equations have been found to be intrinsically unstable [8], [9]. This conclusion eventually led researchers to develop "regularization" procedures to tame the instabilities [10], or alternatively to abandon the Boltzmann equation basis altogether and instead create better behaved equations using methods of classical field theory [11]. Motivated by the success of the latter work, we here employ a classical field theory approach [12] to generalize DD/DG theory so as to describe non-local transport effects.
As noted the non-local electron transport phenomena being studied here usually have the classical Knudsen origin, but they could also arise quantum mechanically whenever the deBroglie wavelength is not small compared to geometric/flow length scales. This is the same fundamental source as the quantum effects of concern in DG theory [2], and just as in that case we again have the apparent paradox of a theory based on classical physical principles being used to describe quantum effects. As explained in [2], this is possible because continuum theories are concerned with populations of particles rather than single particles, and violations of the physical laws at the particle level are not relevant so long as they do not have macroscopically significant consequences. The key difference between DG theory and the theory presented here is that the quantum non-locality effect corrected for in DG theory was associated with the description of the electron gas itself whereas in this paper we are concerned with non-locality in the interaction between the electron gas and the lattice. This difference has implications for the derivations in that the former could exploit equilibrium thermodynamics and variational calculus [2], whereas the latter is a non-equilibrium property that is constrained only by the second law of thermodynamics.
The most important manifestations of non-local transport in semiconductor devices are in high-field and quasi-ballistic situations, and these cases are obviously worthy of attention given their practicality and the fact that they are generally not well modeled by existing continuum descriptions such as energy transport theory [13], [14]. However, we leave their examination to future work, and instead concentrate here on the much simpler situation of a long-channel field-effect transistor (FET) under low-field conditions. This problem is of course ordinarily modeled using DD (or DG) theory, yet the fact that the electron mfp (and perhaps the deBroglie wavelength) is not small compared to the inversion layer thickness means that non-local transport effects should be manifested and that the equations describing the flow should be modified. To illustrate the physics we are concerned with, in Fig. 1 we depict three hypothetical boundary layer flows, each dominated by a different scattering mechanism. Fig. 1a depicts the situation of ordinary Navier-Stokes fluid mechanics in which the flow of fluid particles is driven by an applied pressure head and no slip is assumed at the boundary. The effect of the boundary is transmitted into the flow by inter-particle scattering that is expressed macroscopically as a viscous stress in the gas and results in the non-uniform average velocity profile shown. Fig. 1b instead pictures a flow of electrons inside a semiconductor that is driven by an applied voltage and in which it is assumed that there is no electron-electron scattering (and hence no possibility of a viscous interaction), that the deBroglie wavelength is short enough that quantum non-locality plays no role, and that the wall scattering is diffuse. If the mfp were also short, then the average velocity profile would be uniform, however in Fig. 1b the mfp is assumed to be longer so that the effect of the diffuse wall scattering extends non-locally out into the flow, again producing a non-uniform average velocity profile as shown. Lastly, in Fig. 1c we consider a second electron flow in which all is the same as in Fig. 1b except that the mfp is now assumed short and the deBroglie wavelength is instead taken to be long. The latter is depicted in Fig. 1c by the smearing of the green color associated with each electron (i.e., a quantum mechanical wave-packet) which corresponds to the probability of it being at any given location. Carrying this Born interpretation further, we can say that the diffuse scattering by the boundary will be non-local with a probability proportional to the overlap of a given electron's 'spread' with the boundary potential [15]. This non-local scattering could again produce a non-uniform average velocity profile like that shown in the Fig. 1c.
In DG theory the (quantum) non-locality is approximated by introducing into DD theory a correction term involving a spatial gradient [2]. Similarly, both the Boltzmann/Burnett [5] and classical field theory [11] developments of gas dynamics for the Knudsen regime represent the non-local effect associated with a long mfp by extra spatial gradient terms. And this same basic approach is followed here as well, with the main difference (as already noted) being that in both DG theory and the Knudsen descriptions of gas dynamics the gradient terms enter the theories via the material response functions that describe the gases themselves, whereas for describing non-local mfp and/or quantum effects on scattering in semiconductor transport the crucial gradient correction instead enters into the representation of the electron-lattice interaction. When this gradient correction is applied to DD theory, we here term the new description velocity gradient (VG) theory; when applied to DG theory, we instead call the new description generalized gradient (GG) theory.
One further question regarding the transport theory developed here is whether or not electron viscosity needs to be considered as has occasionally been presumed in the semiconductor device literature including by the first author [16], [17], [18], [19]. In other words, is the physics of Fig. 1a relevant to understanding electron transport in semiconductors? Research on the viscosity of conduction electron gases in solids dates back to pioneering theoretical efforts of Abrikosov and Khalatnikov [20] and of Gurzhi [21], [22]. Intriguingly, the latter paper suggested that electron viscosity could actually lower the resistance of a point contact, a possibility that is referred to as the Gurzhi effect or "superballisticity". To access this physics it is necessary that the mfp associated with electron-electron scattering be small compared to all other length scales, and this represents an extreme experimental challenge as it requires a conducting material of very high quality (to minimize defect scattering) as well as low temperatures (to suppress phonon scattering) and ultra-small geometries (e.g., a point contact). Indeed these difficulties doomed all efforts (e.g., [23]) until 2017 when Polini and co-workers [24], [25], [26] found that the near-ideal transport characteristics of graphene when interposed between boron nitride layers allowed one not only to observe the "superballistic" Guzhi effect, but also to measure the electron viscosity in metallic graphene and to witness other hydro phenomena such as vortices [27]. Since this graphene breakthrough, viscous phenomena have also been seen in high-quality GaAs/AlGaAs heterostructures at low temperature [28]. Overall these discoveries have been remarkable, but for us their import is simply that viscous effects are negligible in all ordinary semiconductor devices operating at room or higher temperatures. Hence, in this paper we assume electron viscosity effects are negligible.
The paper is organized as follows. In Section II we outline the continuum theory of electron transport in semiconductors in general terms. The implications of thermodynamics are covered in Section III, and then in Section IV we specify the material response function describing the crucial electronlattice interaction. In Section V we use the gradient-corrected DD or DG equations (i.e., VG or GG theory) to perform numerical simulations of an FET and compare the results with those obtained using the conventional DD/DG approach. The paper ends in Section VI with a summary and remarks on the broader implications for semiconductor device modeling.

II. MACROSCOPIC ELECTRON TRANSPORT THEORY
Because microscopic derivations of fluid descriptions for the Knudsen regime have proved problematic in gas dynamics (as noted in the Introduction), we here follow [2] and [11] in developing our continuum electron transport theory using methods of classical field theory [12].
Employing a classical field theory approach, we start by postulating the existence of an electron continuum and a lattice continuum that are characterized by various densities such as charge/mass, momentum, etc. that are functions of space and time. (A hole continuum would of course be added if holes are to be modeled). We now force these continua to obey the classical laws of physics starting with the conservation of charge/mass for the electron gas (in integral form): where V is a fixed volume with surface S and outward unit normal vector n, n is the electron density, u is the electron velocity field vector, and the right side is zero because we assume fully ionized impurities, no photogeneration, etc. When the field variables are differentiable, the divergence theorem and the arbitrariness of the volume V allow (1a) to be reduced to the following partial differential equation (PDE): where d n /dt = ∂/∂t + u · ∇ is the material derivative. Next, constraining the electron continuum to obey the conservation of linear momentum yields: where m and q are the mass and charge of an electron, E is the electrostatic field, E n is the force per charge exerted by the lattice on the electron gas, and the forces within the electron gas are represented by an electron pressure p n and an optional electron stress σ n that is included if the DG effect is deemed significant [2]. When the field variables are differentiable we can use (1b), the divergence theorem, and the arbitrariness of the volume V to obtain: One last constraining balance law on the electron gas is the conservation of angular momentum. For convenience of presentation we here omit discussion of it, noting only that its sole additional consequence is that the DG stress tensor must be symmetric, i.e., σ n = (σ n ) T . The lattice continuum would in general also be similarly constrained by balance laws (see, e.g., [29]), however, for this paper the lattice equations are trivially satisfied because we assume that the impurities are fully ionized and that the lattice is rigid. Lastly, since E is the electrostatic field it will obey (in differential form): where D = E + P is the electric displacement, P is the polarization, N D and N A are doping densities, and p eq is the hole density assumed here to be equilibrated.
In this work we follow [19] in including two different aspects of the electron-lattice interaction physics by taking E n to be the sum of two terms, E nd + E nr . The contribution E nd (where the 'd' stands for 'dissipative') describes the drag force exerted by the lattice on the electron gas resisting its relative motion, while E nr (where the 'r' stands for 'recoverable') describes the lattice diffraction effect that is responsible for the electron effective mass. The latter is realized by choosing [19] so that (2b) transforms to: where m n ≡ (1 − λ r )m is the electron effective mass. As discussed in [2] and elsewhere, in semiconductor physics it is convenient to define an electron chemical potential ϕ n that obeys ∇p n = −qn∇ϕ n so that (4b) can be converted from an equation in terms of pressure into one in terms of chemical potential: is the electrochemical potential (or quasi-Fermi level) and (3) 1 has also been used. Since we are focused on lowfield behavior, we can generally neglect the electron inertia term on the left side of (4c). If we can also ignore the DG correction, then the DD approximation results when the material response function for E nd can be accurately characterized as local and instantaneous (see below). A final aspect of the continuum description not discussed here in detail is a consistent set of boundary conditions that are also derived from the integral forms as applied across boundaries where the field variables are not necessarily differentiable so that (1)-(4) do not follow. These conditions are necessary to formulate well-posed boundary value problems within the theory, and one specific condition is discussed in Section V.

III. THERMODYNAMIC CONSIDERATIONS
The equations of the previous section represent the constraints imposed by the conservation laws of charge/mass, linear momentum, and angular momentum, and by the electrostatics on the field variables. As always in classical field theory [12] these general laws are not sufficient to describe specific semiconductor behavior, and to complete the description one must add certain constitutive equations that quantify the responses of the particular materials to imposed forces. The forms of these additional equations are fairly arbitrary, but they must not violate the conservation laws, the laws of thermodynamics, or various requirements of symmetry and invariance. In addition there is a bias toward simple-but-not-too-simple expressions (e.g., local and instantaneous) for the sole reason that the overall model will tend to be most useful when the materials can be accurately described by such functions [2].
In developing material response functions in classical field theory, thermodynamics plays a central role [12]. In this regard we note that conventional electron transport theory is sometimes formulated with the electron gas having a different temperature than the lattice [30], and this implies that the electron gas and the lattice are being treated as distinct thermodynamic entities. However, this rarely seems justified in semiconductors as it would appear to require strong electron-electron scattering and weak electron-lattice scattering, and the former is usually seen only in heavily doped regions where an accurate transport description is generally not needed. Electron-lattice scattering is instead highly variable ranging from negligible in ballistic devices to dominant in devices governed by DD theory. When weak, the electron gas can indeed be driven far from equilibrium, however, the lack of electron-electron scattering means that the gas's thermal energy will remain small and almost all its excess macroscopic energy will be mechanical.
Based on the foregoing we treat the electron-lattice system as a single entity thermodynamically, and hence begin by writing a single energy conservation law for the combined system: where e n and e l are stored energy densities of the electron gas and the lattice, q is the heat flux vector, E · (dP/dt) is the rate work is done by the electrostatic field on the lattice polarization, and − n · up n and qnu · E are the local rates work is done on the electron gas by the pressure p n and by the electrostatic field, respectively. As discussed in [2], when the DG effect is significant, the stress σ n and an associated self-equilibrating double-pressure η n also do work on the electron gas with the rates given by n·σ n ·u and − n·η n ∇ ·u, respectively. Assuming the field variables are differentiable, one can re-write (5a) as: where (4b) has been used and I is the identity matrix. The pressure and double-pressure are non-dissipative and the stress is too since electron viscosity has been assumed negligible, and thus (5b) divides cleanly into purely recoverable terms on the left side and purely dissipative terms on the right side. For quasistatic processes for which the right side is zero, the recoverability of the left side implies integrability, and hence the existence of an entropy function η with where ρT is the integrating factor and T is the single temperature of the system. Combining (5b) and (6), we then write the Clausius-Duhem inequality for the combined system as: where the right hand side is the rate of entropy production and the requirement that it be non-negative is the local statement of the second law of thermodynamics in the field theory.
In DG theory the stress in the electron gas σ n is asserted to depend only on the electron density and the density-gradient, and not on individual components of the strain-gradient in the electron gas. For this to be true, the coefficient of the velocity terms in (5b) and (6) must vanish, which implies: where we take η n = n ∇n in order to ensure the symmetry of the stress tensor (as required to conserve angular momentum). Defining a free energy function ρχ ≡ ρe l −ρTη−P·E, we can then re-cast (6) where n ≡ ∇n · ∇n. Next we assume e n and e l are individually characterized by equations of state: Inserting (9)  Equation (10a) should be satisfied for all possible values of the field variables, and it therefore follows that p n = m n n 2 ∂e n ∂n n = 2m n n 2 ∂e n ∂ n which are termed the recoverable material response functions and which connect these quantities to the equations of state in (9). If the dependence of e n on n in (9) 1 is negligible, then an expression for e n (n) can be developed from equilibrium statistical mechanics; using (10b) 1 one in turn finds p nDD (n) and ϕ nDD (n) appropriate for Maxwell-Boltzmann or Fermi-Dirac cases, and the equations of Section II then reduce to DD theory. And if e n also depends on n , then using (10b) 2 one obtains the quantum-corrected equations of DG theory [2]. In addition, there are dissipative response functions for q and E nd and their forms should be such that the inequality in (7) is maintained under all conditions. On this basis we conclude that the primary dependences are which for (7) to always be satisfied must individually obey

IV. MATERIAL RESPONSE FUNCTION DESCRIBING THE ELECTRON-LATTICE INTERACTION
In this section we discuss the specification of the material response function (11a) 1 that characterizes the dissipative interaction of the electron gas with the lattice. (For all the other response functions in (9) and (11a) 2 we use conventional expressions, e.g., (9) 2 assumes a linear dielectric with permittivity ε d .) The electron-lattice interaction is mediated by scattering (by phonons, defects, and impurities) and the basic effect is a drag felt by the electron gas as it moves through the lattice. The usual form assumed in DD/DG theory is a linear instantaneous proportionality to the velocity u: In this expression μ n is the electron mobility that in general can depend on the electric field, the doping density, etc.; in this paper we include only a standard expression to describe velocity saturation. The negative sign (with a positive mobility) is needed to satisfy (11b) 1 , and it corresponds to the drag force always resisting the motion.
As discussed earlier, to represent non-local effects of the type depicted in Figs. 1b,c in (12a), we introduce a gradient correction. Based on (11a) 1 , this will be a dependence on the velocity gradient and can be thought of as representing the non-locality by effectively including the next term in a Taylor-like expansion. A rate correction as in [19] is not relevant for Section V and so the simplest form that also preserves rotational invariance reduces to: where the length λ VG is a material parameter characterizing the non-local VG effect. In keeping with Figs. 1b and 1c, the added term can contain both classical (mfp) and quantum (deBroglie) contributions, with the classical contribution usually dominating so that λ VG can be expected to be of the order of the mfp. It should be emphasized that in (12b) μ n and λ VG are "bulk" parameters that characterize the electron gas interaction with the lattice with the latter quantifying the strength of its response to velocity gradients. As such, they can vary with electric field or other variables (e.g., in μ n to account for velocity saturation), though for the low-bias conditions of most interest in this paper they will be essentially constant. 1 Importantly, as "bulk" parameters their values should not be affected by a nearby surface as in the case of MOSFET channel transport studied in Section V. Lastly, it should be clear that the form in (12b) is no more than a lowest order approximation and could well require revision, e.g., if the derivative term were large (12b) could erroneously change sign and become an unphysical accelerating force. As noted in the Introduction, we call the modified DD theory that incorporates (12b) velocity-gradient (VG) theory, and when both the DG and VG corrections are included, we instead refer to it as generalized-gradient (GG) theory. Because the DG and VG corrections describe different physical phenomena, they need not both be important simultaneously, and thus a first modeling decision is to choose which of the four descriptions to use for a given problem (i.e., DD, DG, VG, or GG theory). This choice and application details are illustrated in the next section in an analysis of inversion layer transport.

V. APPLICATION TO A SILICON MOSFET
To analyze a device situation using DD, DG, VG, and GG theory one must formulate and solve boundary value problems within each theory. To this end, we here formulate the problem in GG theory, and then can find the equations for the other theories simply by taking appropriate limits.
1. An interesting feature of the semiconductor transport equations is that in steady-state (12a) says that E nd is irrotational (i.e., ∇ × E nd = −∇ × (u/μ n ) = 0), but that u is not unless μ n is constant (or more precisely only when u × ∇μ n = 0). On this basis it seems reasonable to require that the VG correction also not generate rotation in (u/μ n ). A form that meets this constraint is E nd = −(u/μ n ) + λ 2 VG ∇ 2 (u/μ n ). For the low-field problems of interest in this paper, the difference between this equation and (12b) is negligible, but this would not necessarily be true at high fields. It is undoubtedly correct to assume that the long-channel MOSFET of Fig. 2 is in a scattering-dominated regime so that the inertia term in the momentum balance equation (4c) can be ignored. Using (12b) it can then be re-written as: As noted earlier, in this work μ n and λ 2 VG can mostly be regarded as constant, and so taking the curl of (13a) results in: where ω = ∇ × u is the vorticity. 2 This equation shows that the range of the VG effect on vorticity is set by the distance λ VG . A second equation comes from combining (1b) in steady-state with (13a) to find: Using a vector identity this can be re-written as where the approximation comes from neglecting the dilatational VG correction to the driving potential, an assertion that should be accurate except at very low source-drain biases. This approximation is especially valuable here because it also (i) eliminates the explicit appearance of u and thereby simplifies the equations, and (ii) avoids certain irrelevant problems that arise in modeling the drain transport where for convenience in this paper we are improperly treating the incoming channel carriers and the cold drain electrons as a single population [31]. Next, assuming the simplest forms for (10b) 1 and (10b) 2 , we have the DG equation [2] ∇ · (2b n ∇s) = s nDG + ϕ nDD (s) − ψ (15) which is a PDE for s ≡ √ n and b n is the DG coefficient. From (15) it is evident that the distance λ DG ≡ (qb n /k B T) 2. If μ n is not constant and we modified the equation for E nd as stated in footnote 1, then one would re-define the vorticity vector as ω ≡ ∇ ×(u/μ n ) which would obey an equation like (13b). sets the range of the DG effect. And lastly, from (3) the electrostatic PDE obeyed by ψ is where the holes (in the bulk of the n-channel FET) are treated as non-degenerate (given their low density) and in equilibrium (with N s D taken to be the source donor density because we here assume the zero of potential is at the source contact). The associated screening (Debye) length is where n c is a characteristic density; two relevant values are the depletion width L depl D obtained when n c = N A and the inversion layer thickness L inv D obtained when n c is the average inversion layer density. Finally, after the above PDEs are solved, one can obtain the electron velocity field using (13a) as where the approximation is the same as that used in reaching (14b). Since we here model the FET of Fig. 2 in two dimensions only the z-component of the vorticity (ω z ) needs to be solved for, and in this case (17) reduces to while (14b) takes the form ∇ · μ n n∇ nDG ∼ = λ 2 VG nω z,y ,x − λ 2 VG nω z,x ,y (18b) The coupled differential system of GG theory in 2D then consists of (13b), (15), (16), and (18b) to be solved simultaneously for ω z , nDG , s, and ψ. For DG, VG, and DD theories the equations corresponding to (13) - (18) are reached simply by setting λ VG or b n or both to zero, respectively.
In developing boundary conditions to be applied to the foregoing systems of PDEs, we assume initially that the MOSFET has the ideal geometry shown in Fig. 2. Among other things this means that we ignore the possibility of surface roughness at the Si-SiO 2 interface (see the Appendix), or at least an explicit treatment thereof, and instead take the interface to be a straight line. The boundary conditions to be applied at this and other interfaces for solving the equations for DG n , s, and ψ are the same as those used in DD/DG theory [2]. As for the boundary condition on vorticity, at ohmic contacts and at all outer boundaries of the semiconductor we assume no vorticity injection, i.e., ω z = 0. More complicated is the vorticity boundary condition at the Si-SiO 2 interface as this has to represent the range of microscopic behaviors from specular to diffuse surface scattering [15], [32]. Macroscopically, the former is equivalent to "full slip" (u x,y = 0, where the coordinate system is as shown in Fig. 2) while the latter corresponds to "no slip" (u x = 0) of the electron gas at the interface. To relate these conditions to vorticity, we apply (17) at the interface and write the lateral velocity component as: In the case of "no slip" (u x = 0), (19a) implies ω z,y = (μ n DG n,x /λ 2 VG ), while for "full slip" we should have u x = μ n DG n,x , and (19a) then implies ω z,y = 0. A simple vorticity boundary condition that allows one to interpolate between these two limits is: where the dimensionless surface scattering parameter α can have any value between 0 (full slip) and 1 (no slip). In this formula, it should be emphasized that μ n and λ VG are "bulk" parameters (as noted earlier) that should not change value when applied at the boundary in (19b). To solve the above equations in the geometry of Fig. 2 requires numerical methods. As was the case for DG theory [2], we have found that a well conditioned approach to solving the GG equations is to switch from the density to a 'Slotboom variable' by changing variables from s to v using s = S 0 exp( qv k B T ). And as is well known, a convenient variable set for solving the DD equations comes from using a numerical reversion of ϕ n (n) to eliminate n, and leaving DD n and ψ as the solution variables. We use this same approach and variables (with the addition of ω z ) to solve the equations of VG theory. In all cases, the equations so transformed have been programmed in Comsol [33], and solved using continuation procedures to obtain convergence.
We now turn to analyzing the MOSFET problem and comparing results obtained using the various theories. In so doing it should be remembered that for this case most conventional continuum modeling employs DD theory, and works well if the mobility is treated as a gate-bias-dependent fitting parameter called the 'effective' or 'channel' mobility (μ eff n ). DG theory is also sometimes used, especially with very thin oxides and when one wishes to model the inversion layer location more precisely, e.g., for estimating capacitance. For our new analyses we begin by comparing various length scales of the problem, namely L inv D , L depl D , λ DG , λ VG , the device length L, and the oxide thickness t ox . A first point is that for the devices of interest L (here taken to be 0.5μm) is not much larger than L depl D , and therefore the full electrostatic equation (16) must be solved. Second, since the mfp is generally larger than the deBroglie wavelength, the VG effect should be dominated by the classical mechanism, and it seems reasonable to suggest that λ VG ∼ mfp or roughly 10nm [33]. And since λ VG is then comparable to (or larger than) L inv D , we conclude that for modeling inversion layer transport it is important to include the VG correction. Lastly since λ DG is small compared to L depl D , L, t ox (here taken to be 20nm), λ VG , and even L inv D , it seems justifiable here to ignore the DG correction; note however that this would not be true if the oxide was instead very thin. Thus, on theoretical grounds, VG theory would seem to be most appropriate for modeling the large MOSFET of Fig. 2. To see all of this in numerical simulation, we next turn to comparing FET analyses made using the DD, DG, VG, and GG theories.
We begin by illustrating a GG solution in Fig. 3 with plots of n, ω z , and u x as functions of position in the FET. For this simulation we have taken λ VG to be 10nm, and α arbitrarily chosen to be 0.8 (which as seen in Fig. 3b is effectively a no-slip condition). The bias condition in these plots is V D = 0.8V which is not a low-field situation, and given this we introduced a simple velocity saturation into the mobility model with a low field mobility of 1360cm 2 /Vsec and a saturation velocity of 10 7 cm/sec. The density plot in Fig. 3a shows the quantum confinement effect produced by the DG correction, Figure 3b plots the velocity in the x-direction and shows a velocity profile with the expected boundary layer effect (much like that depicted in Fig. 1b), and Fig. 3c shows the vorticity generated by the diffuse interface scattering. These observations are highlighted in the density and velocity profiles along a cutline across the boundary layer shown in Fig. 4.
As is well known, the experimentally derived effective mobility μ eff n of channel carriers is generally found to be much lower than the bulk mobility, with the reduction typically being 50% or more in the case of silicon FETs. Furthermore, μ eff n has been found to vary with gate voltage, and extensive experimentation with Si MOSFETs has shown this variation to be "universal" in the sense that if μ eff n (V G ) is plotted as a function of an effective vertical electric field defined by then it is essentially independent of doping density and substrate bias [34]. In (20) the value of β depends on the device type, e.g., for (100) orientation β = 1 2 for n-channel and 1 3 for p-channel [34], and other orientations have been studied as well [35]. As an example, the "universal" curve for n-channel (100) FETs taken from [34] is shown as the blue squares in Fig. 5. The question for us is whether this "universal" behavior can be understood using the VG treatment of surface scattering. To explore this, we take V D = 0.1V and allow V G to vary, and then from VG simulations extract the variation of μ eff n with E eff . As seen in Fig. 6 (green dashed line), we find that good agreement with experiment can be obtained when the effective field is low by assuming α = 0.54 and λ VG = 10nm. This agreement at low E eff does not, however, extend to higher fields with the simulated result not dropping nearly as strongly as is seen experimentally. Additional simulations (not shown) indicate that this fit cannot be improved significantly simply by changing the values of the material constants λ VG and α.
From our numerical studies, the most likely explanation (see the Appendix for a brief discussion of alternatives) for the inadequate fit of the VG simulation (green dashed curve in Fig. 5) to experiment (blue squares) is that the value of the surface scattering parameter α in (19b) is not constant as was assumed, but instead increases with the strength of the confinement. The idea is that stronger confinement will result in more frequent surface collisions, which are thus more effective in reducing slip (when these scatterings are diffusive). To represent this, we used the normal electric field just inside the semiconductor surface as a measure of the strength of the confinement, and then made α depend linearly on it using: where the unit normal points into the semiconductor, the max function means the electric field term is present only when the field is confining, and the parameter E slip gauges the strength of this surface interaction effect on the slip. Using this formula with α 0 = 0.54 and E slip = 3.4 MV/cm, we now obtain the excellent fit displayed in Fig. 5 (red curve). It should be noted that in this simulation the mobility is essentially constant (since V D = 0.1 and hence the saturation effect is negligible) and includes no additional "fitting" terms. Moreover, when E eff = 1 MV/cm, (n · E) surf = 1.85 MV/cm, and therefore according to (21), the confinement has increased α from its low-field value of 0.54 to 0.83. This decrease in the amount of slip in going from weak to strong confinement seems physically plausible. One basis for assessing the physical fidelity of the model embodied in (21) is to test its "universality" in the sense of [34], [35], i.e., is VG theory's prediction of μ eff versus E eff essentially unchanged if, for example, the substrate doping is varied? This is studied in Fig. 6 where we have varied N A from 3x10 14 cm -3 to 3x10 16 cm -3 , and indeed we find little variation in the curve except at low effective fields.
As a final set of simulations, in Fig. 7 we compare DD, DG, VG, and GG results for the drain current as a function

FIGURE 7.
Comparison of IV curves computed using DD, DG, VG, and GG theories; note that the former two effectively assume "full slip" while the latter two have little slip at the Si-SiO 2 interface as dictated by the model involving (21). These simulations show the DG correction to be of secondary importance when computing the drain current when t ox is relatively large.
of the applied drain voltage. In these simulations we again incorporate velocity saturation as explained earlier. The DD and DG results are effectively the cases with full slip (specular scattering) whereas the VG and GG results correspond to the realistic case with minimal slip based on the model in (21). The former would constitute the conventional DD/DG treatment if the results were scaled to match the VG/GG results by fitting the mobility, and thereby defining effective mobilities μ eff n . That the DG correction is of secondary importance, as was asserted earlier based on a scaling argument (since t ox is relatively large), is evident from comparisons of the DD and DG simulations and the VG and GG results shown in Fig. 7. We note that the DD current is slightly higher than the DG current because of a DG-induced threshold voltage shift, and the GG current is slightly larger than the VG current because of the additional effect the DG repulsion has in reducing surface scattering.

VI. CONCLUSION
The DD description of electron transport has had a remarkably long reign as the standard for detailed engineering-oriented simulations of semiconductor devices. Undoubtedly this is due to its being a classical field theory (rather than being a microscopic theory like one based on the Boltzmann equation), a modeling approach renowned for being parsimonious, efficient, and accurate [12]. However, in the case of the DD description the last attribute often results from fitting rather than correct physics, and while this phenomenological usage is of undoubted practical value, it is still important to understand the actual range of the DD approach as a physical theory. Limits on DD theory's range can of course originate in flaws in its very foundation (e.g., it is not always justified to treat the conduction electrons as a continuum), but error more often enters via the specific material response functions used. This is seen in mobility models, of which a wide variety have been proposed, and a good choice (e.g., to represent velocity saturation) is obviously critical for accuracy. A trickier class of constitutive errors arises when non-local effects become significant and gradient corrections need to be considered. This is well illustrated by DG theory in which the particular response function describing the electron gas gets corrected to account for effects of quantum non-locality that are manifested when the electron deBroglie wavelength is not small compared to the geometry or flow length scales of a device [2]. The present paper goes beyond DG theory to treat another type of non-local effect with two noteworthy aspects: (a) Non-local effects can arise not only quantum mechanically as in DG theory, but can also originate classically when the electron mfp is not small compared to the geometry or flow length scales of a device. This type of non-locality has long been studied in gas dynamics where it characterizes the Knudsen regime [4], [5], [6]. Semiconductor reseachers have been aware of this physics as well [13], [31], however its impact on macroscopic transport theory has not been systematically explored [19]. (b) Non-local effects (of classical or quantum origin) can affect the transport theory not only through the need for gradient corrections to the description of the fluid itself (e.g., an electron gas in DG theory [2] or a molecular gas in Knudsen gas dynamics [4], [5], [6], [11]), but for electron transport in solids it can also necessitate corrections to the response function that characterizes the fluid-solid interaction. To incorporate this physics we followed [2] and [11] in using methods of classical field theory [12], and created new versions of DD and DG theory that we call velocity-gradient (VG) theory and generalized-gradient (GG) theory, respectively. The forms of both the DG and VG corrections are agnostic as to whether the underlying mechanism is classical or quantum mechanical since the macroscopic derivations rely solely on macroscopic principles of thermodynamics and invariance (as well as on the underlying continuum assumption). As was the case with DG theory [2], [37], the exact form of the VG correction proposed herein should be regarded as a provisional lowestorder expression to be verified or emended by comparing with appropriate experiments or microscopic simulations [38]. Beyond this, the VG and GG theories as presented here might well have value as phenomenologies much as is true of the DD and DG theories [2].
To illustrate the VG and GG theories we applied them to the analysis of inversion layer transport in a MOSFET. The relevance to this case stems from the fact that the mfp is not small compared to the inversion layer thickness. Our study yields a new physics-based approach for treating the surface scattering of channel carriers that is shown to provide reasonable agreement with the known behavior of silicon MOSFETs [34], [35] when the gradient parameters are chosen properly and the effect of the confinement on the slip at the interface is accounted for. This is of interest as physics, but its practical value is less clear given the well established use of DD theory as a phenomenology wherein the effective channel mobility is used as a fitting parameter. However, it may be of more value for the analogous but more complicated situations of SOI FETs [38] and FinFETs [39]. In addition, there are other applications where non-local transport effects are even more dominant such as those involving high-field and nanoscopic transport, and in future we look to investigate the utility of the VG and GG theories for such problems. This seems especially worthwhile given that the DD and DG theories continue to be widely used for analyzing mainstream semiconductor devices despite the extreme level of scaling of recent technology generations [40], [41] as well as for devices in materials such as GaN [42] and SiC [43]. Often these applications would seem to challenge the underlying assumptions of DD theory, and if one desires more than a phenomenology to curve-fit experiment, then extensions of DD/DG theory like VG/GG theory will become increasingly germane.

APPENDIX OTHER POSSIBLE EXPLANATIONS FOR THE "UNIVERSAL" CHANNEL MOBILITY IN SILICON
One alternative explanation for the lack of agreement in Fig. 5 between the VG simulation with constant α (green dashed curve) and experiment (blue squares) is that, as the effective field increases, a higher-order correction term to (12b) is becoming significant. Of course having an extra coefficient will mean that better fits to data can be obtained, however, we have been unable to discover a form for the correction that produces excellent agreement with experiment, and especially with the approximately linear variation of μ n that is observed when E eff is large.
A second hypothesis regarding the disagreement of Fig. 5 is that the known roughness of the Si-SiO 2 interface [A1] could be responsible, with the conception being that under conditions of strong confinement the channel carriers would feel the roughness more strongly. To explore this we modified the geometry of Fig. 2 to include varying amounts of sinusoidal roughness, re-ran the calculations, and indeed could find better agreement when the roughness amplitude was about 1nm (data not shown). However, these fits also never gave the near-constant slope of the μ eff plot at high E eff . Whether some more complicated roughness profile could lead to more satisfactory results is not known.