Improved Chebyshev Spectral Method Modeling for Vector Radiative Transfer in Atmospheric Propagation

An improved spectral coefficient method based on Chebyshev polynomials of the second kind is employed to solve the vector radiative transfer (VRT) equation under the assumption of parallel-plane atmosphere and Rayleigh scattering. The solver is extended to consider a Lambertian surface at the bottom of the atmosphere (BOA). The computational properties of the proposed algorithm are analyzed, and the validity of the implemented method is tested against a publicly available benchmark dataset.

A plethora of numerical methods exist to solve the related integro-differential equations; in-depth collections of various solving approaches can be found in [15], [16], [17], [18], and [19].Among them, the spectral coefficient method based on Chebyshev polynomials of the first kind presented in [20] (and extended to time-dependent and inhomogeneous cases in [21]) is a particularly elegant, straightforward, and versatile technique.It is based on a discrete ordinate approach in which the spatial dependence of the Stokes vector is approximated Emanuele Tavanti was with the Department of Electrical, Electronic, Telecommunications Engineering and Naval Architecture (DITEN), University of Genoa, 16145 Genoa, Italy.He is now with the Department of Information Engineering (DII), University of Pisa, 56122 Pisa, Italy (e-mail: emanuele.tavanti@unipi.it).
Color versions of one or more figures in this article are available at https://doi.org/10.1109/TAP.2024.3378831.
Digital Object Identifier 10.1109/TAP.2024.3378831through a summation of first kind Chebyshev polynomials.The final step of this algorithm consists in solving a linear system of equations in which the coefficient matrix is almost banded and sparse (in contrast with the collocation spectral methods [22], [23]), resulting in memory-efficient storage and the possibility to use specifically optimized routines.This method has been employed with the Henyey-Greenstein phase function and Mie scattering.The latter is described by the Mie theory, which provides the exact electromagnetic field when a plane wave impinges on a homogeneous sphere of isotropic material.Also two approximations to describe the scattering for a spherical particle have, however, been identified, namely the Rayleigh and optical approximations [24], [25].These approximations and the Mie scattering can be distinguished by defining the size parameter x = 2πa/λ 0 , where a is the radius of the sphere and λ 0 is the wavelength of the incident radiation in the vacuum (a more general approach exists for surrounding mediums other than vacuum), and m is the refractive index of the material of which the sphere is made.If x ≪ 1 and |mx| ≪ 1, the Rayleigh scattering approximation can be used.On the other hand, if x ≫ 1, the optical approximation is feasible.Between these two approximations, the Mie scattering holds, and a proper description of the electromagnetic field is given by the Mie theory (this theory can be used to have a NOT approximated computation of the electromagnetic field, as previously stated).
In the context of the atmospheric propagation of light and waves at radio frequencies, the conditions x ≪ 1 and |mx| ≪ 1 hold in several scenarios of particular interest [26]; hence, the Rayleigh scattering is noteworthy.At the best of the authors' knowledge, the method in [20], [21] has not been developed for the Rayleigh scattering.Moreover, the approach in [20], [21] considers a totally non-reflecting surface at the bottom of a parallel-plane medium, namely a "black surface"; in other to better account for the reflecting properties of a real ground, a Lambertian surface [27], [28], [29] should be consider.However, the extension of the aforementioned algorithm to manage a Lambertian surface does not appear to be straightforward.Indeed, the final linear system involves the Chebyshev expansion coefficients of the spatial derivative of the Stokes vector as unknowns, leading to a not very agile enforcement of the conditions of the Lambertian surface.
In the present article, the improved spectral scheme reported in [30], which leads to almost banded linear systems of equations involving directly the Chebyshev expansion coefficients of the Stokes vector as unknowns, is applied to solve the VRT equation in a parallel-plane atmosphere.The physical phenomenon assumed at the base of the electromagnetic propagation is the Rayleigh scattering.The related boundary value problem (BVP) is, moreover, modified in order to account for a Lambertian surface at the bottom of the atmosphere (BOA), which, thanks to the kind of involved unknowns, is easily numerically enforced by means of boundary bordering.Beyond the physical interest of the proposed algorithm, it has noteworthy computational properties, specifically regarding the sparseness of the aforementioned linear systems of equations and its multigrain parallelizability [31].This latter is uncommon in the landscape of the available VRT solvers, which could not be parallelizable (e.g., the widely adopted Successive Orders of Scattering [32], [33]) or could not be flexible enough to exploit parallel hardware.An analysis of the computational properties will be presented, and this will also provide some insights on how to efficiently implement the solver.
The article is organized as follows.In Section II, Chandrasekhar's development of the VRT equation under the assumption of parallel-plane atmosphere and Rayleigh scattering is summarized.Right after, the numerical approach is presented, and the Lambertian surface is introduced compatibly with Chandrasekhar's setting and the employed discretization.An analysis of the computational properties is also shown.In Section III, the accuracy of the method is tested against the publicly available benchmark dataset commented in [34].Finally, conclusions are drawn in Section IV.

II. VRT IN A PARALLEL-PLANE ATMOSPHERE AND RAYLEIGH SCATTERING
The first part of this Section is dedicated to fundamental equations for the radiation problem of interest.Thereafter, the discretization approach to solve them numerically is presented.Proper boundary conditions will be defined and enforced, and the selection of a proper number of Chebyshev polynomials of the second kind will be discussed.Finally, the computational characteristics of the resulting code will be highlighted.

A. VRT Equation for Parallel-Plane Atmosphere and Rayleigh Scattering
The VRT equation in a parallel-plane atmosphere can be stated as follows [1]: where i = [I l , I r , U, V ] T is the modified Stokes vector [1], [2], τ = ∞ z ρσ t dz ′ is the optical thickness (ρ is the number of particles in a unit volume and σ t is the extinction cross section), ν = cos θ ∈ [−1, 1] is the cosine of the zenith angle θ , φ ∈ [0, 2π ) is the azimuthal angle, and P is the phase matrix (defined later).It is important to highlight that the phase matrix embeds the information regarding the considered kind of scattering; in the context of the present text, it describes mathematically the effect of the scattering in the Rayleigh regime on the Stokes vector.The top of the atmosphere (TOA) is located at τ = 0, whereas the BOA is at τ = τ 1 .A radiation beam impinges on the TOA from the propagation direction identified by the zenith angle π − θ 0 (corresponding to cos(π − θ 0 ) = − cos(θ 0 ) = −ν 0 ) and the azimuth angle φ 0 .The net flux of the impinging radiation with respect to a unit area perpendicular to the direction of propagation is In the case of Rayleigh scattering, the phase matrix can be decomposed as follows [1]: where, setting φ = φ ′ − φ In this case, the solution to (1) can be formulated as follows [1]: Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

B. Discrete Ordinate Approach
A first step toward the numerical implementation consists in the discretization of the ν and ν ′ variables, known as ordinates.Choosing as discrete values the abscissae of a double-Gauss-Legendre quadrature of 2N L order [35], (17) can be approximated as [20] where w j are the corresponding weights.The double-Gauss-Legendre quadrature is chosen in place of the classic Gauss-Legendre one because its set of abscissae is clustered toward the critical ordinate ν = 0 other than ν = ±1 [35].
In this way, the integro-differential equation has been reduced to a system of 2N L ordinary differential equations (ODEs).

C. Chebyshev Coefficient Spectral Method
At first, the change of variable Thereafter, the ξ function is approximated by a partial sum of a Chebyshev series [20] as where T k is the first kind of Chebyshev polynomial of order k.
The same expansion is applied to the exponential term in ( 19) From now on, the differentiating scheme reported in [30] will be pursued.Thanks to the relationship where U k is the second kind of Chebyshev polynomial of order k, and the partial sum in (20), the derivative of ξ appearing in (19) can be expressed as Moreover, the relationship allows to restate the function ξ with the base U k as follows: A similar expansion also holds for the exponential term in ( 21) Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
+ 1 2 By inserting ( 23), (25), and ( 26) in (19), it is possible to exploit the orthogonality between the elements of the base U k to collect a set of N C equations for each ν i .Also, by stacking the equations for the whole set of ordinates ν i , the following system of equations can be built: where ν is a column-vector containing N C repetitions of the vector ν(m) , and S (1) with ℵ = 2N L m × 2N L m, 0 ℵ and I ℵ zero and identity matrices, respectively, having dimension k indicating the kth element of the vector in argument, and diag N C (•) building a block-diagonal matrix where each block is given by the matrix in argument.
It is important to note that the last 2N L m rows of D0 (the differentiation operator) contain only zero entries.Consequently, the corresponding equations can be removed since they do not introduce information.In order to obtain a solvable problem, additional equations are added by exploiting the boundary conditions, as detailed in Section II-D.In accordance with [30], the last 2N L m rows are, moreover, permuted with the first 2N L m ones, so the linear system is almost upper triangular.

D. Boundary Conditions
At the TOA, there is no diffused radiation entering the atmosphere.Hence, the following condition must be enforced: Because of (7), this influences the boundary conditions endowing ( 10), ( 13)-( 16) for the functions ǐ(0) , (1) , (2) , V (0) , and V (1) , respectively.Through ( 8), ( 9), (11), and ( 12), since the condition at the TOA must hold for every ν < 0 and φ ∈ [0, 2π ), it can be seen that ( 35) turns out in the following boundary conditions: Because of the expansion (20), τ = 0 ⇔ s = −1 and T k (−1) = (−1) k , each of these conditions can be translated to A similar condition would be applied at the BOA if a totally absorbing ground (black surface) was considered; however, a more realistic setting is given by a partially reflecting ground.This can be done considering a Lambertian surface [36], which can be modeled as where Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
where F is the reflection matrix, g is the albedo of the surface, and ζ is given by stacking N L times the vector f.From ( 39) to ( 40) on, the + and − signs in the superscripts indicate that the vectors contain functions evaluated in positive and negative ordinates v, respectively; in the specific case of î+ and î− , these contain Stokes vectors that refer to upwelling and downwelling diffuse radiation, respectively.Since the herein adopted mathematical description of the Lambertian surface comes from [36], where the alternative Stokes parameterization [I, Q, U, V ] T is considered, the matrix T lr →I Q allows us to adapt that description to our setting, transforming Qualitatively speaking, (38) shows that the upwelling diffused radiation at τ 1 (namely, the diffused radiation reflected by the surface, contained in î+ (τ 1 , φ)) and the downwelling diffused radiation again at τ 1 (namely, the diffused radiation impinging on the surface, contained in î− (τ 1 , φ)) are related by means of two quantities: the reflection matrix F, and the beam impinging at the TOA, which undergoes an exponential attenuation (given by e −(τ 1 /ν 0 ) ) when traversing the atmosphere.It is important to note that for g = 0, (38) collapses to where the implication is due to the invertibility of Tlr→I Q .This equation means that there is not an upwelling diffused radiation from the ground; therefore, for g = 0 the Lambertian surface turns out to be a black one.Since (38) must hold for every |ν| and φ ∈ [0, 2π ), inspecting (8), ( 9), (11), and (12), it can be seen that ( 38) must be imposed only through ǐ(0) , whereas the other unknown functions must be constrained as if a black surface was at the ground.Therefore, the following set of boundary conditions must hold at the BOA: where and ζ 0 is obtained by stacking N L times the vector It is important to note that ǐ(0),+ and ǐ(0),− are intrinsically different from î+ and î− that appear in (38); indeed, these last contains full Stokes vectors, as mentioned shortly after (47), whereas the first ones contains ǐ(0) vectors, which appear in the decomposition of the Stokes vectors [as can be seen in ( 7) and ( 8)].
Considering that τ = τ 1 ⇔ s = 1, and T k (1) = 1, the following equation is achieved for the Chebyshev expansion of ǐ(0) : where On the other hand, (49) turns out to the following relationship for the functions (1) , (2) , V (0) , and V (1) : where a k ≡ a k with m = 1.From ( 37) and (59), the following block matrix equation for the function ǐ(0) can be built: where with k = 0, . . ., N C − 1.When g = 0, the surface turns out to be black, as introduced previously, and (61) becomes where Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
For the functions (1) , (2) , V (0) , and V (1) , from ( 37) and (60), the following block matrix equation holds for any g (m = 1): where Equations ( 61), or (63) when g = 0, and (65) can substitute the zero rows cited at the end of Section II-C (resulting in a boundary bordering), completing the linear system to solve.These are solved by means of QR factorizations to preserve numerical stability, as asserted in [30].It is noteworthy that the QR factorization can be implemented by means of Givens rotations or Householder reflections [37], [38]: the Givens rotations are indicated in [30] for the numerical stability, but many implementations (such as in LAPACK [39] and MAT-LAB [40], [41]) use the Householder reflections, since they are computationally more efficient [38].The numerical stability can, however, be achieved in practice with the Householder reflections too, as asserted in [38]; thus, our code also adopts this kind of implementations.

E. Selection of the Number of Chebyshev Polynomials
One critical task of the presented technique is the selection of N C .An adaptive QR factorization is proposed in [30] to solve the linear systems and automatically choose the optimal number of Chebyshev polynomials; however, since the values of N C needed to accurately solve the problem of interest are significantly lower than the ones resulted in [30], as will be seen in the Section III, a much simpler approach is herein applied.The linear systems associated with the functions ǐ(0) , (1) , (2) , V (0) , and V (1) are dealt with separately.The number N C is incremented by N C+ units, and the system is rebuilt and solved until the following condition is met: where ϵ is a user-defined threshold and a old is the solution of the linear system with the previous number of Chebyshev polynomials (a old is zero-padded to match the length of a).
The first property comes from the two levels of parallelization that can be implemented simultaneously, namely a coarse and fine grained one.The coarse-grained parallelism regards equations ( 10), ( 13)- (16).Indeed, these are independent of each other, and so they can be solved separately.The fine-grained parallelism is about the construction of the coefficient matrix in (27).Fig. 1 exemplifies the structure of this matrix.It is visible that the almost banded pattern with an upper boundary bordering is given by the introduction of (65).Each nonnull block of the coefficient matrix, delimited by the cells of the dotted grid in the figure, is independent of each other, as can be seen by the single components ( 30)-( 33), ( 62), (64), (66).These blocks can, thus, be built separately, and this can happen for each equation ( 10), ( 13)-( 16); therefore, this fulfills the concept of multigrain parallelism.
The sparseness of the coefficient matrix is clear again from Fig. 1.As an example, let us consider N L = 40, N C = 70, and m = 1 (this combination of parameters corresponds to the case related to the last row of Table VII for the function (1)  in Section III).The sparsity, given as the ratio between the nonzero values of the coefficient matrix and its total number of entries, is about 3%.This enables the adoption of available dedicated data structures to avoid the allocation of useless and big amount of space in the RAM [42].
It is interesting to study theoretically the computational burden of the proposed solver.To highlight the computationally heaviest steps of the algorithm, we look for an upper bound of the number of FLoating-point OPerations (FLOPs).The real number of FLOPs will be significantly lower in practice.The computation of this upper bound is reported in the Appendix, where it can be seen that the dominant term is the one containing N 3 C N 3 L , leading to an asymptotic arithmetic complexity O(N 3 C N 3 L ).This, however, is due to the hypothetical adoption of the classic QR factorization for dense matrices, as explained in the Appendix.Thanks to the availability of the previously cited specific implementations for sparse matrices [40], [41], this computational burden can be significantly reduced in practice.
is observed at the zenith angles having |ν obs | in the set 0.02, 0.06, 0.1, 0.16, 0.2, 0.28, 0.32, 0.4, 0.52, 0.64, 0.72, 0.84, 0.92, 0.96, 0.98, 1 (68) and azimuth angles φ obs ∈ {0, 30, 60, 90, 120, 150, 180}[ • ].The net flux of the impinging beam with respect to a unit area perpendicular to the direction of propagation is π f = π [0.5, 0.5, 0, 0] T .Since f V = 0, from (9) it can be seen that V is always zero; therefore, the following analysis will focus on the first three Stokes parameters only.It is important to note that the Stokes vector considered in [34] is defined differently with respect to the one herein adopted.In particular, the relationship between the two parameterizations is In the following, the Stokes vector obtained from the proposed method will be transformed to the convention reported in (69), in order to directly compare the results with [34].
It is important to highlight that the benchmark dataset involves only unpolarized impinging electromagnetic radiation, as can be seen from f U = f V = 0; however, the proposed computational method can be applied to impinging radiation having any kind of polarization since there are no constraints on the vector f.Beyond the tabulated values of τ available in [34], the optical thickness is, moreover, the input of the algorithm that enables the numerical study of the electromagnetic propagation in the presence of wanted atmospheric phenomena; as an example, the effect of the Rayleigh scattering for a wave traversing a rainfall can be surveyed feeding the code with an optical thickness computed by means of a drop-size distribution (see e.g., [2], [28]).
The increment N C+ = 10 and the threshold ϵ = 10 −3 are adopted.In order to establish the accuracy of the presented algorithm, the following relative error metric is computed (derived from [43], [44], [45]): where p C ∈ {I C , Q C , U C } is the reference value taken from [34], whereas pC ∈ { Ĩ C , QC , Ũ C } is the corresponding Stokes parameter estimated by the proposed method.The maximum value of p C over v obs and φ obs is used in the denominator of (70) to avoid possible divisions by zero or by very low values of p C .To summarize the performance of the presented algorithm, Table I reports the averages μ and the standard deviations σ of E computed among ν obs , φ obs , and ν 0 , keeping fixed τ 1 , g = 0 (black surface) and observing the emergent radiation at the level TOA (the combinations of variables and parameters (τ 1 , ν obs , φ obs ; ν 0 , g), which lead to zero value at the denominator of (70) are excluded).The same kind of information is reported in Tables II-VI for the level BOA and g = 0, TOA and g = 0.25, BOA and g = 0.25, TOA and g = 0.8, BOA and g = 0.8, respectively.As can be seen, the relative error is quite low for each considered scenario,  showing the proposed technique allows to achieve very accurate results in all the considered cases.VII, moreover, reports the numbers of Chebyshev polynomials of the second kind to achieve the condition (67) for the linear systems associated with the functions (1) , (2) , and ǐ(0) .As expected, higher optical thicknesses Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.EQUATIONS ASSOCIATED WITH FUNCTIONS (1) , (2) , AND Ǐ(0) (ν 0 = 1) require higher N C to properly describe the spatial dependence of the diffused radiation.
To show some examples of the output provided by the proposed method, Figs.2-7 illustrate the Stokes parameters among ν obs with g ∈ {0, 0.25, 0.8} at the TOA and BOA levels, when τ 1 = 1, ν 0 = 1, φ obs = 0. Finally, Fig. 8 reports the computational times against the optical thickness for the ground albedo g ∈ {0, 0.25, 0.8} for the most computationally demanding elevation of the impinging beam, namely the lowest one ν 0 = 0.1.The code ran in the MATLAB environment on a PC equipped with a CPU Intel Core i7-11700F and 64 GB of RAM.The peak usage of RAM has been 68 MB.As can be seen, the execution time increases as the optical thickness grows; this is related to the increasing N C required for growing τ 1 , as reported in Table VIII.Fig. 8 shows that the calculation time for g = 0 is significantly lower than g = 0.25 and 0.8.This happens because for g = 0 the surface at the BOA is black, and so (62) is substituted by (64), as outlined in   Section II-D; the matrices in (64) are purely diagonal matrices, whereas (62) are not.Therefore, g = 0 leads to a lower computational burden for the solver of the linear system of equations.When g ∈ {0.25, 0.8}, and so (62) is used, the computational times are essentially the same; also trying with   g closer to the extreme values 0 and 1 (the results are not reported here for the sake of brevity), the computational times do not change appreciably, indicating that the computational burden is weakly dependent on g when this is positive.This

TABLE VIII NUMBER OF CHEBYSHEV POLYNOMIALS TO SOLVE THE LINEAR SYSTEMS
OF EQUATIONS ASSOCIATED WITH THE FUNCTIONS (1) , (2) , AND Ǐ(0) (ν 0 = 0.1) fact is also supported by Tables VII and VIII, where the variations of N C among g are few or even absent.The computational times have been also compared with the ones provided in [44], which reports execution times on the same benchmark for the case τ 1 = 1, g = 0, and φ = 90 • .Although details concerning the computing platform are not indicated, even considering the lowest execution time for which the reported numerical accuracies in [44] are comparable with the ones reported in the present article, this is about 20 min against about 24 s needed by our method in the worst case, namely at the most computationally demanding elevation of the impinging beam, i.e., ν 0 = 0.1.It is, however, worth remarking that, as noted in [45] and [46], the framework of VRT comparisons would be significantly impaired by factors such as different implementations (programming languages, compilers, . ..) and different hardware/software environments, so that execution times are not commonly reported in the papers presenting VRT codes.
As a further comparison, we assess how much the improved spectral method [30] affects the computational burden in place of the original spectral scheme, implementing the latter in the same software and hardware environment.Since the original VRT solver [20], [21] does not consider the Rayleigh scattering, the phase matrix is accordingly substituted.The comparison can be done only for g = 0, since the original method considers only a black surface at the BOA.For the most computationally demanding case, given by τ 1 = 1 and ν 0 = 0.1, the execution time of the original method is 27 s, and the peak usage of RAM is 65 MB; therefore, the improved spectral scheme has not grown the computational burden of the original algorithm.

IV. CONCLUSION
An improved spectral coefficient method based on Chebyshev polynomials of the second kind has been applied to solve the VRT equation, which is a precious tool in the scope of atmospheric propagation for telecommunication and remote sensing.The Rayleigh scattering has been considered at the base of the propagation phenomena and a parallel-plane atmosphere is assumed.A Lambertian surface is, moreover, considered at the BOA to simulate a realistic environment.The reported discretization approach leads to a few sparse and almost banded linear systems of equations to be solved.An analysis of the computational properties is presented, and the algorithm has been tested against a publicly available benchmark dataset, showing very good accuracy in a wide range of cases.

APPENDIX ANALYSIS OF THE ARITHMETIC COMPUTATIONAL COMPLEXITY
An analysis of the arithmetic computational complexity of the developed procedure is provided in this Appendix.Since the solver modifies the value of N C iteratively (as shown in Section II-E), the upper bound of FLOPs for a single iteration (i.e., for the current value of N C ) is evaluated.We, moreover, perform the analysis focusing generically on (17), which models any of the five equations ( 10), ( 13)-( 16) in which the VRT problem is decomposed.Let us start with the analysis of the coefficient matrix in (27).The matrix D0 does not involve FLOPs to be built.The matrix product diag(ν) D0 needs products, because of the simple structures of the two matrices.
Because of the product with the scalar 2/τ 1 , the number of FLOPs to compute the first addend of the coefficient matrix in ( 27) is The matrix Ŝ(m) 0 does not need FLOPs.The third addend of the coefficient matrix in (27) requires the matrix ˆ .This, from (33), needs 4N 2 L evaluations of the m × m matrix function (ν, ν ′ ), and the resulting matrices must be multiplied by the weights of the double-Gauss-Legendre quadrature, giving a total number of FLOPs equal to where N is the number of FLOPs to evaluate (ν, ν ′ ) (it depends on which of the five equations ( 10), ( 13)-( 16) is selected).Once that ˆ is available, diag N C ( ˆ ) Ŝ(m) 0 can be performed.To obtain its number of FLOPs, the key point is to note that Ŝ(m) 0 can be seen as the sum of a diagonal matrix Ŝ(m) 0,d and a matrix Ŝ(m) 0,sd having only the 4N L mth superdiagonal: The product between the block diagonal matrix diag N C ( ˆ ) and multiplications.On the other hand, the product diag multiplications.After the products, the two resulting matrices have nonoverlapping nonnull blocks; hence, their sum can be achieved without resorting to additions.Overall, the number of FLOPs for diag The number of FLOPs to compute the coefficient matrix in ( 27) is, therefore, given by the sum of ( 72), (73), and (78) Let us now focus on the known vector in (27), with its expression shown in (34).Therein, we find the vector b(τ 1 , ν 0 ).This collects the expansion coefficients of (26), which are given by the inner products between exp(−τ 1 (s+1)/(2ν 0 )) and the basis functions U k (s); depending on how these integrals are computed, we indicate with N b the number of FLOPs to get b(τ 1 , ν 0 ).Another involved quantity is the m × 1 vector e(ν, ν 0 , f); this varies if (10) or one of ( 13)-( 16) is selected, so we generically indicate the number of FLOPs to get e(ν, ν 0 , f) with N e .About S (1) 0 , it does not need FLOPs.After these preliminary clues, the evaluation of ê(ν 0 , f, τ 1 ) requires the product S (1)  0 b(τ 1 , ν 0 ); because of the structure of S (1)  0 , each entry of the resulting vector implies three FLOPs (two multiplications and one addition), so 3N C FLOPs are performed.Therefore, from (34), the number of FLOPs to compute the known vector in (27) is Let us know focus on the enforcement of the boundary conditions, given by (61), (63), and (65).In (61), the matrices W k and the vector ř must be computed; however, from (62), it can be seen that the matrices W k require FLOPs only to get FT lr →I Q , which does not depend on k.The quantity T lr →I Q does not involves FLOPs, whereas F does.In particular, in (54) each block F−i has one multiplication [see (55)]; thus, for the matrix in square brackets in (54), N L FLOPs are performed (the first row of blocks is repeated).Given the product with Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
the scalar −2g, 2N L FLOPs are done for F. Now, we look at FT lr →I Q .From (54) and (52), it can be seen that the resulting matrix is (neglecting the constant −2g) It can be built with previously computed quantities, and so further FLOPs are not needed.About ř, we look at (56).
Because of the structures of T lr →I Q and ζ 0 , it can be seen that the resulting vector of T lr →I Q ζ 0 has the quantity 2 f 0 in one each three entries, whereas the remaining elements are zero.Thus, it can be built with only one FLOP.This is doubled after the product with the scalar gν 0 exp(−τ 1 /ν 0 ).The action of S is limited to the selection of entries, and so the number of FLOPs for ř is only 2; therefore, the number of FLOPs to enforce (61) is When g = 0, (63) must be forced in place of (61), but, since (63) is considerably simpler than (61) and involves fewer FLOPs, only (82) is used in the following to compute the total upper bound.About the boundary condition (65), FLOPs are not needed.The number of FLOPs to build the system (27), endowed with the boundary conditions, can, therefore, be evaluated as the sum of (79), (80), and (82), namely Now, the computational burden of solving (27) must be considered.As indicated at the end of Section II-D, this system is solved using the QR decomposition with Householder reflections.Rewriting (27) in the compact form Ca = k, the following steps need to be analyzed.
1) C is factorized in the matrices Q and R.
2) Right-multiplying with Q T , we get Ra = Q T k.
3) The system is solved by means of back substitution.As previously pointed out, C is a sparse matrix.Specialized QR factorization methods for this kind of matrix having remarkable efficiency are available in MATLAB, as reported in [40] and [41]; regrettably, in these references, an analysis about the number of FLOPs seems to be not available.Looking for an upper bound, the classic QR factorization with Householder reflections for dense matrices is, therefore, considered.This involves 4n 3 /3 FLOPs, where n here is the number of rows or columns of a square matrix [37], and, in our setting, this number of FLOPs traduces to Finally, we are able to provide an upper bound of the number of FLOPs for the full method, summing (83) and (86) (87)

Manuscript received 27
June 2023; revised 2 February 2024; accepted 5 March 2024.Date of publication 27 March 2024; date of current version 7 May 2024.This work was supported in part by Italian Ministry of Education and Research (MIUR) under the PRIN 2017 Project GREEN TAGS "Chipless radio frequency identification (RFID) for GREEN TAGging and Sensing" under Grant 2017NT5W7Z.(Corresponding author: Emanuele Tavanti.)

Fig. 1 .
Fig. 1.Structure of the coefficient matrix with upper boundary bordering for (1) (i.e., m = 1) when N L = 40, N C = 10.The indexes of the rows are at the left of the matrix, whereas the indexes of the columns are at the bottom.

4 3 ( 3 (
2N L m N C ) 3 .(84)The step 2 is a matrix-vector product needing a number of FLOPs equal to2N L m N C (4N L m N C − 1).(85)The back substitution in the step 3 needs (2N L m N C ) 2 FLOPs[37]; therefore, an upper bound of the number of FLOPs to solve the linear systems of equations is4 2N L m N C ) 3 + 2N L m N C (4N L m N C − 1) + (2N L m N C ) 2 .(86)

TABLE I AVERAGE
μ AND STANDARD DEVIATION σ AT LEVEL TOA AND g = 0 TABLE II AVERAGE μ STANDARD DEVIATION σ AT LEVEL BOA AND g = 0 TABLE III AVERAGE μ AND DEVIATION σ AT LEVEL TOA AND g = 0.25

TABLE IV μ
AND STANDARD DEVIATION σ AT LEVEL BOA AND g = 0.25

TABLE V AVERAGE
μ AND STANDARD DEVIATION σ AT LEVEL TOA AND g = 0.8

TABLE VI AVERAGE
μ AND STANDARD DEVIATION σ AT LEVEL BOA AND g = 0.8

TABLE VII NUMBER
OF CHEBYSHEV POLYNOMIALS TO SOLVE THE LINEAR SYSTEMS