Investigation of Optimal Data Encoding Parameters Based on User Preference for Cloud Storage

The erasure encoding scheme creates multiple coded data and parity fragments to protect data from losses. Nowadays, most storage systems, such as cloud storage, utilize erasure coding to attain superior data consistency, reliability, and availability. Most of the existing literature focuses on either the cost of recovery or the overhead due to redundant storage, without considering the interests of the users, such as high reliability and low storage cost. We believe that the storage service provider should choose an appropriate encoding scheme with optimal values of two encoding parameters, i.e., the numbers of data fragments and parity fragments. The values of these encoding parameters depend on the size of the input data and the Quality of Service (QoS) requirements of the users, such as storage efficiency, availability, and recoverability, and they play a crucial role in providing higher reliability and lower storage cost. Therefore, in this paper, we investigate how to identify optimal parameters that provide higher reliability and lower storage cost while considering the users' preferences. We analyze the Reed-Solomon coding scheme from the perspectives of storage overhead, probability of data availability, data recoverability, and storage efficiency to identify the optimal values of the encoding parameters. We performed experiments on the Reed-Solomon encoding schemes, and the results are reported.


I. INTRODUCTION
Cloud storage service providers such as Dropbox, Microsoft, Google, and Amazon allow both individual and enterprise customers to store massive amounts of data in their datacenters [1], [2]. However, storage systems may often experience unavailability of customer data due to hardware and software failures. The erasure encoding scheme [3] creates multiple coded fragments of data and parity to protect the data from such losses. Nowadays, many Cloud Service Providers (CSPs) employ Erasure Coding (EC) to mitigate unexpected faults and minimize data unavailability [4], [5]. However, the EC encoding scheme makes its encoding decision based on a static threshold value, and such static threshold-based encoding may suffer from storage overheads and fragility in a dynamic demand scenario.
The main focus of the work in this paper is to study how to provide the services effectively, benefiting both the CSP and the customers. We believe that a cloud system is cost-effective and productive if the CSP deploys optimal encoding parameters (as suggested in our approach). Reliability and performance are two crucial issues in cloud storage that influence service quality: they determine how well a CSP can maximize service performance and availability, minimize the impact of service failures, and enhance business continuity [6]. If the CSP mis-tunes the parameters, the storage system may experience unavailability of customer data, mostly during peak times, which may lead to loss of customers and, subsequently, impact the business metrics [6], [7]. In the absence of an optimal strategy, the total operational and storage costs of providing the services to the users may increase considerably; it is therefore more profitable for the CSP to adopt our strategy. To expand and leverage their business and improve users' trust, CSPs should incorporate the optimal strategy in their storage service, which helps in cost reduction.

A. MOTIVATION
CSPs store user data in their datacenters. Since these datacenters are located across the world, users may often encounter data unavailability. Moreover, the CSPs need to select appropriate encoding parameters for data of different sizes to fulfill the dynamic demands of the users and to optimize the overall system costs while providing high reliability. The selection of suitable parameters is crucial, and the criteria for selecting them depend on the size of the input data and other Quality-of-Service (QoS) requirements such as storage efficiency, availability, and recoverability. Keeping these points in mind, in this paper we aim to select the optimal parameters to achieve data availability and recoverability.

B. CONTRIBUTIONS
The main contributions of this research are highlighted below:
• We propose an algorithm for investigating the optimal encoding parameters that meet the users' expectations.
• We explore the QoS requirements and summarize the findings according to the user's requirements and file size.
• We perform extensive real-time experiments and discuss the performance analysis of the operations.

C. ORGANIZATION
The rest of the paper is organized as follows. The notations and function definitions are given in Section II. Section III introduces erasure coding techniques and summarizes the related work. Section IV discusses the aspects of selecting the optimal encoding parameters for erasure coding. The proposed methodology is presented in Section V. The results and analysis, together with the experimental setup, are presented in Section VI. Finally, we conclude the paper in Section VII.

II. PRELIMINARIES
In this section, Table 1 summarizes the notations used throughout the paper; we then describe the set of function definitions used in our proposed work.

A. FUNCTION DEFINITIONS
• κ ←− getFileCategory( ): This function takes the file as input. It computes the size of the file, and then determines and returns the file category (κ) based on that size, i.e., small, medium, or large (see the sketch after this list).
• δ(α,β) ←− getEncodingPairs(ρ, κ): This function takes the user preferences and the file category as input, and then determines and returns the optimal encoding pair(s) from the set of suitable encoding pairs (refer to Table 7) by calling Algorithm 1.
• (δα, δβ) ←− getSeparateFragments(δγ): This function splits the input set δγ into two parts, δα and δβ, where δα is the set of α data fragments and δβ is the set of β parity fragments.
• count ←− getPairsCount(δ(α,β)): This function counts and returns the number of pairs in the set of encoding pairs.
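As an illustration, the following minimal Python sketch shows one way getFileCategory could be realized; the size thresholds are assumptions derived from the file classes used later in Section IV (small ≈ 256 KB, medium ≈ 512-1024 KB, large ≈ 256 MB and above), not the paper's exact cut-offs.

```python
# A hypothetical sketch of getFileCategory; the thresholds are assumptions
# based on the size classes in Section IV, not values fixed by the paper.
def get_file_category(file_bytes: bytes) -> str:
    size = len(file_bytes)
    if size <= 256 * 1024:          # up to 256 KB
        return "small"
    if size <= 1024 * 1024:         # up to 1024 KB
        return "medium"
    return "large"                  # e.g., 256 MB or 512 MB samples
```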

III. BACKGROUND AND RELATED WORK
In this section, first, we discuss the Erasure Coding (EC) scheme and its variants, and then we review the related works based on EC mechanisms.

A. ERASURE CODING
The EC technique encodes the input data D into α data fragments (D0, D1, D2, ..., Dα−1) and β parity fragments (C0, C1, C2, ..., Cβ−1), for a total of γ (= α + β) fragments. The Maximum Distance Separable (MDS) property [8], [9] of EC provides error recovery and data reconstruction despite the unavailability of any β fragments [10]. The recovery process uses the parity fragments to reconstruct the corrupted or unavailable fragments from the remaining data and parity fragments. We use the phrase fragment unavailable to indicate data loss, failure, corruption, or an unreachable storage server.
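To make the α/β/γ relationship concrete, the following toy Python sketch (our illustration, not the paper's implementation) shows the degenerate case β = 1, where the single parity fragment is the XOR of the α data fragments; MDS codes such as Reed-Solomon generalize this to arbitrary β.

```python
# Toy sketch: alpha data fragments plus one XOR parity fragment (beta = 1).
# Any single lost data fragment can be rebuilt from the survivors.
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes, alpha: int):
    size = -(-len(data) // alpha)                    # ceiling division
    padded = data.ljust(alpha * size, b"\0")         # pad to a multiple of alpha
    frags = [padded[i * size:(i + 1) * size] for i in range(alpha)]
    parity = reduce(xor_bytes, frags)                # C0 = D0 ^ D1 ^ ... ^ D(alpha-1)
    return frags, parity

def recover(frags, parity, lost: int) -> bytes:
    survivors = [f for i, f in enumerate(frags) if i != lost] + [parity]
    return reduce(xor_bytes, survivors)              # XOR of survivors restores the loss

frags, parity = encode(b"erasure coding demo data", alpha=4)
assert recover(frags, parity, lost=2) == frags[2]
```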
The following subsections present the widely used Reed-Solomon [11] and Cauchy Reed-Solomon [12] erasure codes, which are used in this work for the comparison and analysis of encoding pairs.

B. REED-SOLOMON (RS) CODES
Consider a typical storage system as shown in Figure 1, in which the symbol Di denotes the data disks (0 ≤ i < α) and Cj denotes the parity disks (0 ≤ j < β). RS coding must satisfy the condition γ ≤ 2^w + 1 [11], where every strip is a w-bit word with w ∈ {8, 16, 32, 64}. Each word is a number between 0 and 2^w − 1. RS coding applies a γ × α generator matrix and operates in the Galois field GF(2^w) to perform operations such as addition, subtraction, multiplication, and division on these words [9].
A Vandermonde matrix is used to construct the Generator Matrix (GM). A codeword is computed by multiplying the GM with the α data words, yielding the α data words together with the β coding words, as shown in Figure 2. The recovery process performs matrix inversion and multiplication to solve the resulting set of independent linear equations. Additions in GF(2^w) are performed with bitwise XOR, but the multiplications are more complex and expensive [9], [11].
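As a brief illustration of the field arithmetic involved (our own sketch, not tied to any specific RS library), the following Python snippet implements addition and multiplication in GF(2^8); the primitive polynomial 0x11d is a common choice and is assumed here.

```python
# Sketch of GF(2^8) arithmetic (w = 8). Addition is XOR; multiplication is
# carry-less "Russian peasant" multiplication reduced modulo the field
# polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d, an assumed common choice).

def gf_add(a: int, b: int) -> int:
    return a ^ b                      # addition and subtraction coincide

def gf_mul(a: int, b: int, poly: int = 0x11D) -> int:
    result = 0
    while b:
        if b & 1:
            result ^= a               # add a shifted copy of a
        b >>= 1
        a <<= 1
        if a & 0x100:                 # degree reached 8: reduce
            a ^= poly
    return result

assert gf_add(0x57, 0x57) == 0        # every element is its own additive inverse
assert gf_mul(0x57, 0x01) == 0x57     # 1 is the multiplicative identity
assert gf_mul(0x02, 0x80) == 0x1D     # x * x^7 wraps: x^8 = 0x1d (mod 0x11d)
```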

C. CAUCHY REED-SOLOMON (CRS) CODES
CRS [12] replaces the Vandermonde matrices with Cauchy matrices and reduces the expensive multiplication operations of RS codes. CRS uses additional XOR operations to minimize the number of multiplications of RS codes. This alteration converts the generator matrix G^T from a γ × α matrix of w-bit words into a wγ × wα matrix of bits, and CRS coding performs operations over whole strips rather than single w-bit data words.
Each strip contains w fragments, where w must satisfy the constraint γ ≤ 2^w. The fragment size must be a multiple of the machine word size to achieve high performance. The encoding process then involves only XOR operations: each coding fragment is constructed as the XOR of all data fragments that have a one bit in the corresponding coding-fragment row of G^T. The process is depicted in Figure 3, which illustrates how the last coding fragment is created as the XOR of all data fragments identified by the last row of G^T [12].

D. RELATED WORK
Several techniques have been proposed in the literature for efficient cloud storage. The three-way replication technique [2], [10] supports easy access and repairability in storage, but it increases the storage overhead cost: it requires two-thirds of the raw storage capacity to store redundant data. In contrast, the erasure coding technique reduces the Mean Time To Data Loss (MTTDL) and the redundancy in storage; see [14], [15], and [16] for more details. Many authors have proposed EC schemes (e.g., [11], [12], [17], [18]) to achieve fault tolerance and reliability in storage systems. Schnjakin et al. [13] compared several EC schemes along with their supporting libraries such as [19]-[21], and determined an appropriate encoding algorithm for the Cloud-RAID system. Schuman and Plank [22] presented the encoding and decoding performance of a few erasure codes. While several authors evaluated performance based on bandwidth and memory operations, others aimed to reduce the number of accesses needed to reconstruct the original data. Still others focused on aspects such as read performance (e.g., [14], [23], [24]), coding performance (e.g., [22], [25]), fault tolerance (e.g., [26], [27]), proof of retrievability (e.g., [28]-[30]), and reliability (e.g., [31], [32]). Table 2 summarizes various works that employ erasure coding for different purposes.
The reliability performance of erasure codes and their variants for several storage systems is studied in [14], [15], [33]-[37]. The reliability of erasure codes is evaluated in [16], [34], [38] based on Markov models, and the estimation of Mean Time To Data Loss (MTTDL) for reliability analysis is studied in [16]. However, Greenan et al. [39] scrutinized the feasibility issues in modeling modern storage systems for reliability analysis using MTTDL and Markov models. Based on their study, they introduced the NOrmalized Magnitude of Data Loss (NOMDL) metric for reliability analysis, which is based on the failure and repair characteristics of the servers.
In the work proposed in this paper, to achieve reliability we rely on data partitioning and fragment placement according to the encoding parameters of the erasure codes. The input data is partitioned in order to minimize the impact of data loss, and each fragment is placed on a distinct server to maximize the tolerance against server failures.
In summary, we reviewed the literature most relevant to erasure coding. Most of the existing works have focused on repair bandwidth, read latency, and storage overheads, whereas our work focuses on investigating the optimal encoding parameters of erasure codes based on aspects such as parameter selection and user preferences. To the best of our knowledge, the existing literature has not explored the identification of appropriate encoding parameters for efficient cloud storage. Therefore, our goal in this paper is to analyze and pick the optimal encoding parameters of erasure codes.

IV. ASPECTS OF PARAMETERS SELECTION
The selection of optimal parameter values plays a crucial role in attaining better storage efficiency, reliability, and availability, in addition to reducing storage cost. In this section, we discuss the criteria for selecting the optimal encoding parameters for erasure coding.
The basic aspects of parameter selection for providing quality of service matching the users' expectations depend on the number of fragments created, the size of each fragment, and the file size. Hence, we primarily incorporate the input file size and the user preferences to investigate the optimal values of the encoding parameters that achieve the QoS requirements. We classify the input files into three categories based on their sizes: small (256 KB), medium (512 KB or 1024 KB), and large (256 MB or 512 MB). In practice, both the sizes and the number of categories can vary.

A. USER PREFERENCES
The users specify their preferences in terms of availability of the resources, efficiency of the performance, recoverability of the data, and acceptable storage overhead. These preferences predominantly influence the storage cost for the users. The users express the current demand for each of these preferences as HIGH, AVERAGE, or LOW. For example, if a user's preference for availability is HIGH, it will probably influence the storage cost; if a user's preference for efficiency is HIGH, it will probably influence the storage overhead. We therefore experimented with the possible combinations of user preferences across the demand levels HIGH, AVERAGE, and LOW.
To investigate the effectiveness of the encoding parameters and to satisfy the current demand of the users, we study the performance of EC in Section VI based on the decisive factors discussed next.
(i) Probability of Data Availability
The probability of data availability (P_avail) [10] can be computed using Equation 1:

P_avail = Σ_{i=0}^{β} [C(M, i) · C(N − M, γ − i)] / C(N, γ)    (1)

In this equation, C(M, i) denotes the number of ways in which the inaccessible fragments can be placed on the M unavailable servers; C(N − M, γ − i) is the number of ways in which the accessible fragments can be placed on the available servers; and C(N, γ) is the total number of ways in which the γ fragments can be placed on the N servers. Generally, CSPs store two complete copies, which provides data availability with probability 0.99 [10]. Assuming N = 10^5 and M = 4500, applying EC with γ = 24 fragments and rate r = 1/2 yields a probability of availability, P_avail, of approximately 1. In this paper, we analyzed the availability of various (α, β) pairs to maximize availability.
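The following short Python sketch (ours, assuming the hypergeometric model of Equation 1) reproduces the example above:

```python
# Sketch of Equation 1: probability that at most beta of the gamma fragments
# land on the M unavailable servers out of N total servers.
from math import comb

def p_avail(N: int, M: int, alpha: int, beta: int) -> float:
    gamma = alpha + beta
    return sum(comb(M, i) * comb(N - M, gamma - i)
               for i in range(beta + 1)) / comb(N, gamma)

# Example from the text: N = 10^5, M = 4500, gamma = 24 at rate r = 1/2.
print(p_avail(100_000, 4_500, alpha=12, beta=12))   # ~1.0
```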

(ii) Rate of Encoding
The data encoding rate (r) [10] is calculated as the number of data fragments (α) divided by the total number of erasure-coded fragments (γ), i.e., r = α/γ, where γ > α.
The number of redundant fragments β represents the fault-tolerance capability of the encoding parameters. Moreover, the encoding rate increases the storage cost by a factor of f = 1/r, called the boost factor.

(iii) Storage Efficiency
The storage efficiency (η) is calculated as the ratio of the number of data fragments (α) to the total number of erasure-coded fragments (γ), i.e., η = α/γ. In this paper, we evaluated the efficiency of various (α, β) pairs to analyze the performance of EC.
(iv) Recoverability
The optimal number of redundant fragments is another important parameter, as it provides recoverability. The EC encoding generates γ fragments and can recover the actual data from any α of them. Therefore, it can tolerate up to β = γ − α unavailable fragments. We used the recoverability metric to maximize data reliability and minimize the impact of data loss.

(v) Storage Overhead
The EC encoding introduces a certain fragment overhead for data-recovery purposes. This overhead is caused by the addition of the β parity fragments. The estimated storage overhead is computed as the ratio of the number of redundant fragments to the number of data fragments, i.e., β/α.
However, erasure coding generates γ fragments of the same size for a given data size, so the total size of the resulting fragments is the product of one fragment's size and γ. Practically, using this total fragment size and the data size, we can compute the space overhead as

space overhead = (γ · s_frag − s_data) / s_data,    (5)

where s_frag is the size of one fragment and s_data is the size of the input data. In our experiments, we used Equation 5 to measure the storage overhead.
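A compact Python sketch (our illustration; the padding rule for the fragment size is an assumption) ties these decisive factors together for one example pair:

```python
# Sketch: decisive factors for an example (alpha, beta) pair.
alpha, beta = 10, 4
gamma = alpha + beta

r = alpha / gamma             # (ii) encoding rate
f = 1 / r                     # boost factor on storage cost
eta = alpha / gamma           # (iii) storage efficiency
est_overhead = beta / alpha   # (v) estimated storage overhead

# Practical space overhead per Equation 5, assuming the file is padded so it
# splits into alpha equal fragments (padding rule assumed for illustration).
s_data = 512 * 1024                        # 512 KB input file
s_frag = -(-s_data // alpha)               # one fragment's size (ceiling)
space_overhead = (gamma * s_frag - s_data) / s_data
print(r, f, eta, est_overhead, space_overhead)   # 0.714..., 1.4, 0.714..., 0.4, ~0.4
```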

V. PROPOSED METHODOLOGY
This section presents the procedure for investigating the optimal encoding parameters, followed by the fragment-generation procedure.
In this approach, a user is expected to provide the input file along with the preferences to the CSP for the selection of the optimal encoding parameters and the generation of coded fragments, as shown in Figure 4. The following steps show the corresponding sequence of operations.
1) The CSP first sends the file as input to both the investigation process module and the EC block. The investigation process module initially calculates the size of the input file and then uses the preferences to decide the encoding (α, β) pair.
2) The EC block receives the optimal pair values from the investigation process to encode the input file.
3) The EC scheme encodes the input file based on the selected encoding (α, β) pair to generate the α data fragments and β parity fragments.
The selection of the optimal encoding parameters and the generation of fragments are divided into two phases, as discussed below.

A. PHASE I: INVESTIGATION PROCEDURE
In this phase, an optimal encoding pair is selected from the set of suitable pairs, as shown in Algorithm 1.
Algorithm 1 takes the set of encoding pairs δ(α,β) and the file category κ as input, and it returns the optimal encoding pair (α, β). The algorithm performs the following operations:
i. Initially, it calls the function getPairsCount, which takes δ(α,β) as input and returns the number of pairs in the input set.
ii. If only a single pair exists in the set δ(α,β), or the file belongs to the small size category, it returns the first pair of the input set by calling getPair(1, δ(α,β)).
iii. If exactly two pairs exist in the set δ(α,β), it returns the second pair of the input set by calling getPair(2, δ(α,β)).
iv. If exactly three pairs exist in the set δ(α,β) and the file belongs to the medium size category, it returns the second pair of the input set by calling getPair(2, δ(α,β)); otherwise (the file belongs to the large size category), it returns the third pair by calling getPair(3, δ(α,β)).
v. If the set δ(α,β) contains more than three pairs and the file belongs to the medium size category, it returns the average pair from the input set by calling getAveragePair(δ(α,β)); otherwise (the file belongs to the large size category), it returns the large pair by calling getLargePair(δ(α,β)).
A Python sketch of this selection logic is shown below.
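The sketch below is our reading of Algorithm 1; getAveragePair and getLargePair are realized with assumed tie-breaking rules (middle of the set, and largest γ = α + β, respectively), since the text does not spell these out.

```python
# Sketch of Algorithm 1 (selection of the optimal encoding pair).
# Pairs are (alpha, beta) tuples; indices below are 1-based as in the text.

def get_pair(i, pairs):
    return pairs[i - 1]                       # getPair(i, delta)

def select_optimal_encoding_pair(pairs, category):
    count = len(pairs)                        # getPairsCount(delta)
    if count == 1 or category == "small":
        return get_pair(1, pairs)             # case ii
    if count == 2:
        return get_pair(2, pairs)             # case iii
    if count == 3:                            # case iv
        return get_pair(2, pairs) if category == "medium" else get_pair(3, pairs)
    if category == "medium":                  # case v: getAveragePair (assumed: middle)
        return pairs[count // 2]
    return max(pairs, key=sum)                # getLargePair (assumed: largest gamma)

pairs = [(3, 2), (5, 3), (6, 4), (12, 8)]
print(select_optimal_encoding_pair(pairs, "large"))   # (12, 8)
```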

B. PHASE II: FRAGMENT GENERATION PROCEDURE
Phase II executes Algorithm 2, which generates the coded fragments based on the user's preference and the input file. Algorithm 2 takes the user's preference ρ and the file as input, and it returns a pair containing the sets of data and parity fragments. The algorithm performs the following operations:
i. It calls the function getFileCategory with the input file to get the file category κ.
ii. It then calls the function getEncodingPairs with the preference ρ and the file category κ as input to get the set of encoding pairs δ(α,β).
iii. It then selects the optimal encoding pair (α, β) ∈ δ(α,β) by calling the selectOptimalEncodingPair function.
iv. It then calls the function executeErasureCoding with the α and β values as input to generate the set of γ fragments.
v. The getSeparateFragments function is then executed, which separates the sets of data and parity fragments (δα, δβ) from the set δγ.
vi. Finally, the algorithm returns the pair (δα, δβ) containing the sets of data and parity fragments.
A sketch of this pipeline is shown below.
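The following sketch wires the helper functions together in the order Algorithm 2 prescribes; get_encoding_pairs and execute_erasure_coding are stubs here (their internals, the Table 7 lookup and the EC library call, lie outside this phase), and get_file_category and select_optimal_encoding_pair are reused from the earlier sketches.

```python
# Sketch of Phase II (Algorithm 2): from user preference and file to fragments.

def get_encoding_pairs(preference, kappa):
    # Stub for the Table 7 lookup (example data, assumed for illustration).
    return [(3, 2), (5, 3), (6, 4), (12, 8)]

def execute_erasure_coding(file_bytes, alpha, beta):
    # Stub: a real implementation would call an RS/CRS library here.
    return [b""] * (alpha + beta)

def generate_fragments(preference, file_bytes):
    kappa = get_file_category(file_bytes)                          # step i
    candidates = get_encoding_pairs(preference, kappa)             # step ii
    alpha, beta = select_optimal_encoding_pair(candidates, kappa)  # step iii
    fragments = execute_erasure_coding(file_bytes, alpha, beta)    # step iv
    data_frags, parity_frags = fragments[:alpha], fragments[alpha:]  # step v
    return data_frags, parity_frags                                # step vi
```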

VI. RESULTS AND ANALYSIS
In this section, we first discuss the experimental setup with its packages and libraries, followed by several performance analyses and results.

A. EXPERIMENTAL SETUP
The experimental setup comprises a desktop machine with an Intel® Core™ i7-3770 CPU @ 3.40 GHz. The availability results are summarized in Tables 3 and 4.
We analyze the data availability for various input (α, β) pair values. Table 3 demonstrates the data availability for different values of N ∈ {10^5, 10^6, 10^7} and the average availability at M = 10%. Since the ranges overlap, we take the average over the N values to see how the results are affected by M ∈ {10%, 20%, 30%, 40%, 50%}, which is summarized in Table 4. We choose a large number of unavailable servers to stress the availability analysis. The probability of data availability for N = 10^5, 10^6, and 10^7 is nearly the same; hence, we deduce that the data availability is independent of the total number of servers and reaches an equilibrium state. The best availability is achieved at M = 10%, where it lies between 0.9658393 and 0.9999999. Figure 5 summarizes the data-availability probability of the erasure encoding parameters at M = 20% for the total number of servers, and the results show the highest availability at EC (12,12) and EC (12,10).

D. RELIABILITY ANALYSIS
Important considerations in cloud storage are the data recoverability metric, minimizing the impact of data loss, and server failures. Increasing the number of redundant fragments increases the recoverability of corrupted fragments and builds a more durable system. Each fragment can be placed on a distinct server to maximize the tolerance against server failures; to recover corrupted fragments, α fragments must be retrieved from storage. To capture the probability of data loss and the normalized magnitude of data loss, we use the discrete-event simulator SimEDC [48] with its default configuration of 4 TiB disk capacity per node. The simulation is based on generating failure and repair events in a production datacenter. It measures the probability of data loss (PDL) based on the number of permanently lost fragments over a mission period (10 years in our case); the lowest PDL represents the greatest tolerance against data corruption. It also reports the NOrmalized Magnitude of Data Loss (NOMDL), which measures the expected amount of data lost, normalized to the storage capacity. Figure 6 shows the reliability metrics for various (α, β) value pairs of RS codes. Since the NOMDL values in the graph are very small relative to the PDL values, we scale NOMDL by 10^4 in the graph.
Redundant fragments increase the capacity for data recovery and decrease the impact of data loss or server failures. The pairs (3,2), (8,3), and (10,3) have a limited number of redundant fragments, which may lead to a high chance of inaccessibility due to failures. Likewise, (8,3) and (10,3) have the lowest probability of data availability according to Table 4. As can be seen in Figure 6, the pairs (3,2), (8,3), and (10,3) have relatively higher PDL than the others. From Figure 6 and the number of parity fragments, we observe that the pairs RS (12,12) and RS (12,10) have the lowest PDL and NOMDL and the highest recoverability among all pairs, so these pairs offer better data reliability and high accessibility.

E. ENCODING EFFICIENCY AND STORAGE COST ANALYSIS
We plot the estimated storage efficiency and overheads for various input (α, β) pair values in Figure 7. From Figure 6 and Figure 7, we observe that EC (12,12) achieves high reliability, whereas EC (10,3) provides high efficiency and low storage overhead.

F. SPACE OVERHEAD ANALYSIS
We analyze the space overhead of the RS and CRS schemes for various input (α, β) values, performing experiments on file samples of different sizes. Figure 8(a) and Figure 8(b) show the space overhead for input files of size 256 KB and 512 MB, respectively. The CRS scheme is relatively expensive compared with RS encoding for the smaller file, as shown in Figure 8(a). The space overheads of the two schemes are nearly the same for the larger file but slightly higher for the smaller file; the reason is that the encoding library adds an 80-byte header to each fragment. Figure 9 presents the overall space overhead, based on the implementation results, for various input (α, β) values of RS encoding. The results show that RS (10,3) and RS (12,12) have the lowest and the highest space overhead, respectively.

G. EXECUTION COST ANALYSIS
In this subsection, we analyze the execution cost for sample files of size 256 KB, 512 KB, 1024 KB, 256 MB, and 512 MB using the Python timeit module. We perform each encoding operation 1000 times and report the average execution time, summarized in Table 5.
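A minimal sketch of this timing methodology follows; the encoding call is a stand-in for the actual RS/CRS library invocation.

```python
# Sketch: average the cost of an encoding operation over 1000 runs with timeit.
import timeit

def encode_sample():
    ...  # stand-in for the RS/CRS encoding call on one sample file

runs = 1000
total = timeit.timeit(encode_sample, number=runs)
print(f"average encoding time: {total / runs:.6f} s")
```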
We observe from Table 5 that all the operations require very little time, between 0.00021 and 2.41 µs, for executing the encoding operation on all the sample files. Further, we analyze the execution cost of the encoding operations for the sample files and illustrate the behavior for the small and large files in Figure 10(a) and Figure 10(b), respectively. We observe that the RS scheme takes more computation time than the other schemes, and that the Jerasure RS and CRS schemes require almost equal computation time. Hence, we conclude that the Jerasure RS and CRS encoding schemes are computationally more efficient than the RS scheme.
The performance of the EC encoding operations for various input (α, β) pair values, in terms of availability, recoverability, efficiency, and storage overhead, is summarized in Table 6. We perform the availability analysis at M = 20% and classify availability as low at 0.7, average at 0.8, and high at 0.9. We assess data recoverability based on the percentage of additional fragments: recoverability is low if the percentage < 50, average if 50 ≤ percentage < 100, and high if the percentage ≥ 100. We achieve high availability and high efficiency at EC (8,4) and EC (12,6) with average recoverability and storage overhead. Further, we achieve high efficiency and low storage overhead at EC (8,3) and EC (10,4) with average availability. Moreover, EC (12,8) achieves high availability with average efficiency, recoverability, and storage overhead. Hence, either of the pairs (8,4) or (12,6) is a suitable choice for designing a highly available, efficient, and fault-tolerant storage system with average storage overhead. Based on the aforementioned findings, the input preferences, classified below, define the user requirements for the encoding of the file.
• Preference-I: High availability with average efficiency, recoverability, and storage overhead.
• Preference-II: High efficiency and lower storage overhead with an average availability.
• Preference-III: High availability and high efficiency with average recoverability and storage overhead.
• Preference-IV: High availability and high recoverability with average efficiency.
We have applied the following rules, based on Algorithm 1, to assign the encoding pair values to different files depending on the number of available encoding pair values:
• Case-I: If only one pair is available, then assign that pair to all the files irrespective of their size.
• Case-II: If two pairs are available, then assign the smaller pair value to small and medium size files, and the higher pair value to large size files.
• Case-III: If three pairs are available, then assign the smaller pair value to small files, the medium pair to medium size files, and the higher pair value to large size files.
• Case-IV: If more than three pairs are available, then assign the smaller pair value to small files, the higher pair value to large files, and any one of the remaining pair values to medium size files on a first-come, first-served basis.
Table 7 illustrates the above rules for assigning the encoding pair values. Suppose that, for a given input file and user preference, we get multiple (α, β) pair values, say (3,2), (5,3), (6,4), and (12,8), for encoding a particular file. We then select the pair with the minimum number of γ (= α + β) fragments for a small file, an average number of γ fragments for a medium file, and the maximum number of γ fragments for a large file. Thus, we choose the encoding pair values (3,2), {(5,3) or (6,4)}, and (12,8) for small, medium, and large files, respectively.

VII. CONCLUSION
The CSPs provide Storage-as-a-Service to end users. To prevent data unavailability due to server failures and software or hardware faults, they use EC. The encoding scheme provides a durable and recoverable system with high reliability and low storage cost based on the dynamic selection of encoding parameters according to the user preferences. Our findings provide direction for selecting the optimal encoding pairs of erasure coding suited to a specific user's requirements. The investigation revealed that the probability of availability reached up to 0.9999999 when ten percent of the servers were unavailable. It was also observed that the encoding operations require very little time, between 0.00021 and 2.41 µs, for all the sample files. Furthermore, we explored the QoS requirements and summarized the findings according to the users' requirements and file sizes. RS (12,12) and RS (12,10) provided high reliability and also offered better data recovery and high accessibility; however, the results showed that both RS (12,12) and RS (12,10) have the highest space overhead. We achieved high availability and high efficiency at EC (8,4) and EC (12,6) with average recoverability and storage overhead; high efficiency and low storage overhead at EC (8,3) and EC (10,4) with average availability; and high availability with average efficiency, recoverability, and storage overhead at EC (12,8). Hence, either of the pairs (8,4) or (12,6) is a suitable choice for designing a highly available, efficient, and fault-tolerant storage system with average storage overhead, subject to a specific user's specifications; these choices may change based on the user's preferences. The results indicate that, by utilizing the optimal selection of encoding values, we can significantly reduce unnecessary space and computation overheads and fulfill the users' QoS requirements while providing higher availability and reliability, essentially reducing the overall system cost.