Portfolio Optimization Using a Consistent Vector-Based MSE Estimation Approach

This paper is concerned with optimizing the weights of the global minimum-variance portfolio (GMVP) in high-dimensional settings where both observation and population dimensions grow at a bounded ratio. Optimizing the GMVP weights is highly influenced by the data covariance matrix estimation. In a high-dimensional setting, it is well known that the sample covariance matrix is not a proper estimator of the true covariance matrix since it is not invertible when we have fewer observations than the data dimension. Even with more observations, the sample covariance matrix may not be well-conditioned. This paper determines the GMVP weights based on a regularized covariance matrix estimator to overcome the abovementioned difficulties. Unlike other methods, the proper selection of the regularization parameter is achieved by minimizing the mean squared error of an estimate of the noise vector that accounts for the uncertainty in the data mean estimation. Using random-matrix-theory tools, we derive a consistent estimator of the achievable mean squared error that allows us to find the optimal regularization parameter using a simple line search. Simulation results demonstrate the effectiveness of the proposed method when the data dimension is larger than, or of the same order of, the number of data samples.


Introduction
Decision making with regard to investment in the stock market has become increasingly complex because of the dynamic nature of the stocks available to investors and the advent of new unconventional and risky options [1]. Over the years, the portfolio optimization problem has attracted the attention of many signal-processing researchers due to its close relationship to the field. The portfolio optimization problem aims at achieving the maximum possible return with the lowest volatility [2]. The economist Harry Markowitz introduced modern portfolio theory, or mean-variance analysis, and the associated mean-variance portfolio (MVP), in [3]. Other portfolios, such as the global MVP and the maximum Sharpe ratio portfolio (MSRP), have been proposed as improvements of the MVP. Portfolio optimization utilizes the available financial data to reach conclusions regarding the allocation of wealth to each of the available stocks. The most important quantity in portfolio optimization is the data covariance matrix (CM).
CM estimation in the classical signal processing framework relies on asymptotic statistics in which the number of observations, n, is assumed to grow large compared to the population dimension, p, i.e., n/p → ∞ as n → ∞ [4]. However, many practical applications, such as finance, bioinformatics and data classification, require an estimate of the CM when the data dimension is large compared to the sample size [5]. In such cases, it is well known that the default estimator, i.e., the empirical sample covariance matrix (SCM), is usually ill-conditioned, leading to poor performance.
In the case where p > n, the SCM is not invertible, whereas for p < n, the SCM is invertible but might be ill-conditioned, which substantially increases the estimation error. In other words, for a large p, it is not guaranteed in practice that the number of observations is sufficient to develop a well-conditioned CM estimator [6]. Such scenarios have motivated researchers to look into estimation problems in the high-dimensional regime [4].
In scenarios with limited data, a regularized SCM (RSCM) estimator of the following general form is widely used [5]:

$$\hat{\Sigma}_{\mathrm{RSCM}} = \beta \hat{\Sigma} + \gamma I, \tag{1}$$

where $\hat{\Sigma}$ is the SCM defined in (7) (further ahead), and $\beta, \gamma \in \mathbb{R}^{+}$ are the regularization, or shrinkage, parameters. These parameters can be determined by minimizing the mean squared error (MSE), which results in the oracle shrinkage parameters, $\beta_o$ and $\gamma_o$, as follows [5,6]:

$$(\beta_o, \gamma_o) = \arg\min_{\beta, \gamma > 0} E\left[ \left\| \beta \hat{\Sigma} + \gamma I - \Sigma \right\|_F^2 \right], \tag{2}$$

where $\|\cdot\|_F$ denotes the Frobenius matrix norm. The estimation of $(\beta_o, \gamma_o)$ based on (2) depends on the true CM, $\Sigma$. To circumvent this issue, Ledoit and Wolf [6] proposed a distribution-free consistent estimator of $(\beta_o, \gamma_o)$ in high-dimensional settings. The work in [5] assumes that the observations are from an unspecified elliptically symmetric distribution. The consistent estimator proposed in [7] uses a hybrid CM estimator based on Tyler's M-estimator and the Ledoit-Wolf shrinkage estimator, which suits a global minimum variance portfolio (GMVP) influenced by outliers.
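As a rough illustration of this kind of regularization, the following NumPy sketch assumes the linear-shrinkage form β·SCM + γ·I. The function name `rscm` and the shrinkage values are illustrative choices, not taken from the paper. It shows that with fewer observations than dimensions the SCM is rank-deficient, while the regularized estimate is positive definite.

```python
import numpy as np

def rscm(scm, beta, gamma):
    """Linear shrinkage of a sample covariance matrix: beta * scm + gamma * I."""
    p = scm.shape[0]
    return beta * scm + gamma * np.eye(p)

rng = np.random.default_rng(0)
p, n = 50, 30                      # more dimensions than observations
Y = rng.standard_normal((n, p))    # rows are observations
scm = np.cov(Y, rowvar=False)      # p x p sample covariance, 1/(n-1) scaling

reg = rscm(scm, beta=0.8, gamma=0.2)    # illustrative shrinkage values (assumed)
rank_scm = np.linalg.matrix_rank(scm)   # at most n - 1 < p, so scm is singular
cond_reg = np.linalg.cond(reg)          # finite, since reg is positive definite
```

Any γ > 0 bounds the smallest eigenvalue away from zero, which is why the choice of the shrinkage parameters, rather than invertibility itself, becomes the central question.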
A similar approach based on the M-estimator is proposed in [8], considering n > p with fully automated selection of the shrinkage parameters. The minimum variance portfolio estimator in [9] is based on certain sparsity assumptions imposed on the inverse of the CM. The work presented in [10] proposes a different RSCM estimator by manipulating the expression of the GMVP weights.
In this paper, we propose a single-parameter CM estimator. Instead of minimizing the MSE of the CM estimate, as in (2), we minimize the MSE of the estimate of the sample noise vector. We utilize random matrix theory (RMT) tools to obtain a consistent estimator of this MSE. The value of the regularization parameter γ is selected as the one that minimizes the estimated MSE. By choosing to minimize the MSE of the noise vector's estimate, we account for the inaccuracy of estimating the true mean.

Global Minimum Variance Portfolio
We consider a time series comprising $y_1, y_2, \cdots, y_L$ logarithmic returns of p financial assets over a certain investment period. We assume that the vectors $y_t$ ($t = 1, 2, \cdots, L$) are independent and identically distributed (i.i.d.) and are generated according to the following stochastic model [11]:

$$y_t = \mu_t + \Sigma_t^{1/2} x_t, \tag{3}$$

where $\mu_t \in \mathbb{R}^{p \times 1}$ and $\Sigma_t \in \mathbb{R}^{p \times p}$ are the mean and the CM of the asset returns over the investment period, and $x_t$ is an i.i.d. random noise vector of zero mean and identity CM. For simplicity, we drop the subscript t from $\mu_t$ and $\Sigma_t$.
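To make the data model concrete, the sketch below draws synthetic returns as the mean plus the symmetric square root of the CM applied to standard Gaussian noise. The Gaussian noise, the helper name `simulate_returns`, and the Toeplitz test covariance with entries 0.6^|i-j| are assumptions made for illustration only.

```python
import numpy as np

def simulate_returns(mu, sigma, n, rng):
    """Draw n observations y_t = mu + sigma^{1/2} x_t with x_t ~ N(0, I)."""
    p = mu.shape[0]
    evals, evecs = np.linalg.eigh(sigma)
    sigma_half = (evecs * np.sqrt(evals)) @ evecs.T   # symmetric square root
    X = rng.standard_normal((p, n))                   # columns are noise vectors x_t
    return mu[:, None] + sigma_half @ X               # p x n matrix of returns

p, n = 5, 200_000
idx = np.arange(p)
sigma = 0.6 ** np.abs(idx[:, None] - idx[None, :])    # assumed test covariance
mu = np.ones(p)
Y = simulate_returns(mu, sigma, n, np.random.default_rng(1))
```

With this many samples, the sample mean and sample covariance of `Y` are close to `mu` and `sigma`, which is a quick sanity check on the generator.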
For the investment period of interest, we define w ∈ R p as the asset holdings vector, also known as the weight vector.
The GMVP minimizes the portfolio variance under a single-period investment horizon, such that the weight vector is normalized by the outstanding wealth [11], i.e.,

$$\min_{w} \; w^T \Sigma w \quad \text{subject to} \quad 1_p^T w = 1, \tag{4}$$

where $1_p$ is a column vector of p ones. The solution of (4) can be obtained by using the Lagrange-multipliers method, which results in the optimum weights [7]:

$$w_{\mathrm{GMVP}} = \frac{\Sigma^{-1} 1_p}{1_p^T \Sigma^{-1} 1_p}. \tag{5}$$

The CM in (5) is unknown and should be estimated. As stated earlier, the SCM estimate does not perform well because it is usually ill-conditioned; hence, we apply the RSCM estimator and (5) becomes

$$\hat{w}_{\mathrm{GMVP}} = \frac{\hat{\Sigma}_{\mathrm{RSCM}}^{-1} 1_p}{1_p^T \hat{\Sigma}_{\mathrm{RSCM}}^{-1} 1_p}, \tag{6}$$

where $\hat{\Sigma}_{\mathrm{RSCM}}$ is the RSCM, which can take the form of (1), for example. In the following section, we develop an RSCM estimation method and properly set the value of its regularization parameter.
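The closed-form GMVP weights can be computed with a single linear solve. The following is a minimal sketch on a toy two-asset covariance; the function name `gmvp_weights` and the numbers are illustrative assumptions.

```python
import numpy as np

def gmvp_weights(cov):
    """GMVP weights cov^{-1} 1 / (1^T cov^{-1} 1), computed via a linear solve."""
    ones = np.ones(cov.shape[0])
    z = np.linalg.solve(cov, ones)   # cov^{-1} 1 without forming the inverse
    return z / (ones @ z)

cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])       # toy two-asset covariance (assumed values)
w = gmvp_weights(cov)                # equals [8/11, 3/11] for this matrix
variance = w @ cov @ w               # portfolio variance of the GMVP
```

For this toy matrix the resulting portfolio variance is below the variance of either asset held alone, as expected from a minimum-variance solution.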

The Proposed Consistent Vector-Based MSE Estimator
The SCM, $\hat{\Sigma}$, and the sample mean, $\hat{\mu}$, can be estimated from the n past return observations as follows:

$$\hat{\Sigma} = \frac{1}{n-1} \sum_{j=t-n}^{t-1} (y_j - \hat{\mu})(y_j - \hat{\mu})^T, \tag{7}$$

$$\hat{\mu} = \frac{1}{n} \sum_{j=t-n}^{t-1} y_j. \tag{8}$$

We notice that computing $\hat{\Sigma}$ using (7) involves the sample mean, not the true mean. This can worsen performance, especially for a small number of observations. Subtracting $\hat{\mu}$ from both sides of (3), we obtain

$$y_t - \hat{\mu} = \left( \hat{\Sigma}^{1/2} + \Delta \right) x_t + \delta, \tag{9}$$

where $\delta \triangleq \mu - \hat{\mu}$ and $\Delta = \Sigma^{1/2} - \hat{\Sigma}^{1/2}$. Eq. (9) can be viewed as a linear model with bounded uncertainties in both $\hat{\Sigma}^{1/2}$ and $\hat{\mu}$ [12]. We seek an estimate, $\hat{x}_t$, that performs well for any allowed perturbation $(\Delta, \delta)$ by formulating the following min-max problem [12]:

$$\hat{x}_t = \arg\min_{x} \; \max_{(\Delta, \delta)} \left\| \left( \hat{\Sigma}^{1/2} + \Delta \right) x - \left( y_t - \hat{\mu} - \delta \right) \right\|_2, \tag{10}$$

where the maximization is over the allowed (bounded) perturbations. A unique solution can exist, which takes the form [12]

$$\hat{x}_t = \left( \hat{\Sigma} + \gamma I \right)^{-1} \hat{\Sigma}^{1/2} \left( y_t - \hat{\mu} \right). \tag{11}$$

Here, $\hat{x}_t$ is a function of γ, which, when properly set, leads to the best estimate of $x_t$. It is easy to recognize that $(\hat{\Sigma} + \gamma I)^{-1}$ can be used as an estimator of the CM inverse, i.e.,

$$\hat{\Sigma}_{\gamma}^{-1} = \left( \hat{\Sigma} + \gamma I \right)^{-1}. \tag{12}$$

Such an estimator is widely used in the literature, e.g., [13,14,15,16,17,18,19,20,21], to name a few. The optimal value of γ in $\hat{\Sigma}_{\gamma}^{-1}$ is the one that minimizes the MSE of estimating $x_t$. That is, with

$$\mathrm{MSE}(\gamma) = E\left[ \left\| \hat{x}_t - x_t \right\|_2^2 \right], \tag{13}$$

we choose the optimal $\gamma_o$ as follows:

$$\gamma_o = \arg\min_{\gamma} \mathrm{MSE}(\gamma). \tag{14}$$

The choice of minimizing the MSE is reasonable because, under certain conditions, the minimization problem in (4) and the minimum MSE are equivalent [22], [23]. Unlike the other methods, the uncertainty in estimating the mean is incorporated in (13). We expect the effect of this uncertainty to be high when we have a limited number of observations. Also, unlike the methods that are based on (2), when we search for the optimal γ that minimizes (13), we actually estimate the inverse of the CM rather than the CM itself. This is important because the inverse is what we use in (6).

The normalized (by n) expression of the MSE, Eq. (15), is derived in Appendix 7. We observe that (15) is expressed in terms of the unknown quantity, $\Sigma$. A direct plugin formula, i.e., substituting $\Sigma$ with $\hat{\Sigma}$ in (15), yields the plugin estimator (16). However, the estimator in (16) is inconsistent in the regime where n and p grow at a constant rate [24]. To clarify, Fig. 1 plots an example of the derived MSE(γ) (15) and the plugin estimate (16) over a wide range of values of γ. It is clear that the plugin strategy does not locate the minimum MSE. Instead, as the figure depicts, the plugin estimation method selects an improper γ that corresponds to a high MSE.
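To illustrate how the noise-vector estimate depends on γ, the sketch below assumes the regularized least-squares form x_hat = (S + γI)^{-1} S^{1/2} (y - mu_hat), with S the SCM scaled by 1/(n-1), and evaluates the empirical squared error on simulated data where the true noise is known. The function name `estimate_noise`, the data settings, and the γ grid are illustrative assumptions; this is a simulation check, not the paper's MSE expression.

```python
import numpy as np

def estimate_noise(Y, gamma):
    """Regularized estimate of the noise vectors from a p x n return matrix Y.

    Assumed form: x_hat = (S + gamma*I)^{-1} S^{1/2} (y - mu_hat), S being the SCM.
    """
    p, n = Y.shape
    mu_hat = Y.mean(axis=1, keepdims=True)
    centered = Y - mu_hat
    S = centered @ centered.T / (n - 1)
    evals, evecs = np.linalg.eigh(S)
    evals = np.clip(evals, 0.0, None)          # guard tiny negative round-off
    # (S + gamma I)^{-1} S^{1/2} shares eigenvectors with S
    filt = np.sqrt(evals) / (evals + gamma)
    return (evecs * filt) @ (evecs.T @ centered)

rng = np.random.default_rng(2)
p, n = 40, 60
X = rng.standard_normal((p, n))                # true noise, known only in simulation
Y = 1.0 + X                                    # mean 1_p and identity CM for simplicity
gammas = [1e-3, 1e-1, 1.0, 10.0]
errors = [np.sum((estimate_noise(Y, g) - X) ** 2) / n for g in gammas]
```

Working in the eigenbasis of S avoids forming a matrix square root and an inverse separately, and makes it cheap to sweep many values of γ.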
Figure 1: The MSE, Eq. (15), the plugin estimator, Eq. (16), the asymptotic curve, Eq. (17), and the consistent MSE, Eq. (22). The results are generated from Gaussian data that follows (3) with p = n = 300, $[\Sigma]_{i,j} = 0.6^{|i-j|}$ and $\mu = 1_p$.

As an alternative remedy, we seek a consistent estimator of (15) by leveraging tools from RMT. To this end, we first need to obtain an asymptotic expression of (15), which requires Assumption 1 to hold.

Assumption 1 leads to Theorem 1, which gives the asymptotic expression (17) of (15). In it, $\delta_1$ is the unique positive solution to a system of equations, and $\delta_2$ is obtained similarly.
Proof: see Appendix 8.

Now, we are in a position to reveal the consistent estimator of (15).
Theorem 2 Under Assumption 1, the consistent estimator of (15) is given by (22), where $\hat{\delta}_1$ and $\hat{\delta}_2$ are the consistent estimators of $\delta_1$ and $\delta_2$, respectively.
Proof: see Appendix 9.
Returning to Fig. 1, which compares the derived MSE with the asymptotic formula (17) and the consistently estimated MSE (22), it can be seen clearly that the consistent MSE is more suitable for obtaining the value of γ that minimizes (15).
A closed-form solution for γ in (22) is infeasible, so we rely on a line search, in which we search for the γ that minimizes (22) within a predefined range.
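A line search of this kind can be sketched as a scan over a log-spaced grid. In the sketch below, the grid bounds, the grid size, and the stand-in objective `toy_mse` are assumptions made for illustration; in the method described here the objective would be the consistent MSE estimate evaluated at γ.

```python
import numpy as np

def line_search(objective, lo=1e-4, hi=1e2, num=400):
    """Return the log-spaced grid point in [lo, hi] that minimizes objective."""
    grid = np.logspace(np.log10(lo), np.log10(hi), num)
    values = np.array([objective(g) for g in grid])
    best = int(np.argmin(values))
    return grid[best], values[best]

def toy_mse(gamma):
    """Stand-in objective with a known minimizer at gamma = 2 (not the paper's MSE)."""
    return (np.log(gamma) - np.log(2.0)) ** 2 + 1.0

gamma_o, mse_o = line_search(toy_mse)
```

A log-spaced grid is a natural choice when the useful range of γ spans several orders of magnitude; the resolution of the answer is limited by the grid spacing.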

Summary of the Proposed VB-MSE (Vector-Based MSE) Method for Portfolio Optimization
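1. From the historical data, estimate $\hat{\Sigma}$ using Eq. (7).
2. Find $\gamma_o$ by a line search that minimizes the consistent MSE estimate (22).
3. Use $\gamma_o$ to compute $\hat{\Sigma}_{\gamma_o}^{-1} = (\hat{\Sigma} + \gamma_o I)^{-1}$.
4. Calculate $\hat{w}_{\mathrm{GMVP}}$ from (6) by using $\hat{\Sigma}_{\gamma_o}^{-1}$ in place of $\hat{\Sigma}_{\mathrm{RSCM}}^{-1}$.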
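The steps above can be sketched end to end as follows. The consistent MSE estimate itself is not reproduced here, so the sketch takes it as a user-supplied callable `mse_hat(gamma, S, n)`; that interface, the function name `vb_mse_gmvp`, the 1/(n-1) SCM scaling, and the placeholder objective in the usage lines are all assumptions made for illustration.

```python
import numpy as np

def vb_mse_gmvp(Y, mse_hat, gammas):
    """Estimate the SCM, pick gamma, and form GMVP weights for a p x n matrix Y.

    mse_hat(gamma, S, n) is a user-supplied callable standing in for the
    consistent MSE estimate, which is not reproduced in this sketch.
    """
    p, n = Y.shape
    centered = Y - Y.mean(axis=1, keepdims=True)
    S = centered @ centered.T / (n - 1)                        # SCM
    gamma_o = min(gammas, key=lambda g: mse_hat(g, S, n))      # line search over the grid
    z = np.linalg.solve(S + gamma_o * np.eye(p), np.ones(p))   # (S + gamma_o I)^{-1} 1
    return z / z.sum(), gamma_o                                # normalized GMVP weights

def placeholder_mse(gamma, S, n):
    """Placeholder objective only; NOT the paper's estimator."""
    return (gamma - 1e-4) ** 2

rng = np.random.default_rng(3)
Y = 0.01 * rng.standard_normal((30, 20))       # 30 assets, 20 days of synthetic returns
w, gamma_o = vb_mse_gmvp(Y, placeholder_mse, np.logspace(-6, -2, 41))
```

Solving the linear system directly gives the regularized inverse applied to the ones vector, which is all that the weight formula requires.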

Performance Evaluation
As is conventional in the financial literature, we implement the out-of-sample strategy defined in terms of a rolling window method (see [7]). On a particular day t, the training window for CM estimation is formed from the previous n days, i.e., from t − n to t − 1, to design the portfolio weights, $\hat{w}_{\mathrm{GMVP}}$. The portfolio returns in the following 20 days are computed based on these weights. Next, the window is shifted 20 days forward and the returns for another 20 days are computed. The same procedure is repeated until the end of the data. Finally, the realized risk is computed as the standard deviation of the returns. The data used in our evaluation come from different stock market indices. From Fig. 2 (a)-(f), we can conclude that, on average, the proposed VB-MSE method compares favorably to all the benchmark methods tested in this paper. The method is also more consistent over the various datasets.
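The rolling-window procedure can be sketched as below. The function names, the synthetic returns, and the fixed-γ weight rule `ridge_gmvp` are illustrative stand-ins rather than the benchmark or proposed methods, and the sketch reports the plain standard deviation of the out-of-sample returns without annualization.

```python
import numpy as np

def rolling_realized_risk(returns, n, weight_fn, hold=20):
    """Out-of-sample realized risk with a rolling training window.

    returns   : T x p array of daily returns (rows are days)
    n         : training window length
    weight_fn : maps an n x p training block to a length-p weight vector
    hold      : number of days the weights are held before re-estimation
    """
    T = returns.shape[0]
    realized = []
    for t in range(n, T, hold):
        w = weight_fn(returns[t - n:t])            # train on days t-n .. t-1
        realized.append(returns[t:t + hold] @ w)   # next `hold` days (fewer at the end)
    return np.std(np.concatenate(realized))

def ridge_gmvp(train, gamma=1e-4):
    """Stand-in weight rule: GMVP with a fixed-gamma regularized SCM."""
    S = np.cov(train, rowvar=False)
    z = np.linalg.solve(S + gamma * np.eye(S.shape[0]), np.ones(S.shape[0]))
    return z / z.sum()

rng = np.random.default_rng(4)
R = 0.01 * rng.standard_normal((500, 10))          # synthetic daily returns
risk = rolling_realized_risk(R, n=100, weight_fn=ridge_gmvp)
```

Passing the weight rule as a function keeps the backtest loop independent of the estimator, so different covariance estimators can be compared on identical windows.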

Conclusion
In this paper, we have proposed a regularized covariance matrix estimator for high-dimensional settings. The proposed method searches for the optimal regularization parameter based on a consistent estimator of the MSE of the estimated vector. Portfolio optimization results from real financial data show that the proposed method performs reasonably well and outperforms a host of benchmark methods.

Mathematical Tools
For convenience, we write Equation (3) in matrix form, where $X = [x_1 \; x_2 \; \cdots \; x_n]$ with $x_i \sim \mathcal{N}(0, I_p)$. We need to express the SCM in (7) in an appropriate matrix form as well, in terms of a matrix $B \in \mathbb{R}^{p \times n}$. It can be immediately recognized from (7) that B is

Figure 2: Annualized realized risk versus training window length for different stock indices.

Fig. 2 (a) plots the result of the S&P 100 index from 2 Jan. 2015 to 30 Dec. 2016. As can be seen from the figure, the proposed VB-MSE method outperforms all the other methods except at n = 80 and 100, where it is slightly worse than Ell1-RSCM and Ell3-RSCM. Similarly, VB-MSE has superior performance in Fig. 2 (b), which plots the result from 7 Jan. 2014 to 31 Dec. 2015; however, at n = 20 and 80, Ell1-RSCM and Ell3-RSCM perform better. The realized risk for the HSI index from 1 Jan. 2016 to 27 Dec. 2017 is depicted in Fig. 2 (c). The proposed method has comparable performance to Quest 1, Ell1-RSCM and Ell3-RSCM at n = 20, 40 and 60, but it outperforms all the methods for 100 < n ≤ 340. The results of the XMI index from 4 Jan. 2016 to 29 Dec. 2017 and from 10 Jan. 2014 to 31 Dec. 2015 are shown in Fig. 2 (d) and Fig. 2 (e), respectively. Overall, in both figures, VB-MSE is the best performing method. Finally, Fig. 2 (f) plots the realized risk of the S&P 500 index from 10 Jan. 2015 to 31 Dec. 2017. The figure shows clearly that the proposed method outperforms the other methods when 200 ≤ n ≤ 400.