MOBIUS: Model-Oblivious Binarized Neural Networks

A privacy-preserving framework in which a computational resource provider receives encrypted data from a client and returns prediction results without decrypting the data, i.e., an oblivious neural network or encrypted prediction, has been studied for machine learning systems that provide prediction services. In this work, we present MOBIUS (Model-Oblivious BInarized neUral networkS), a new system that combines Binarized Neural Networks (BNNs) and secure computation based on secret sharing as tools for scalable and fast privacy-preserving machine learning. BNNs improve computational performance by binarizing values in training to $-1$ and $+1$, while secure computation based on secret sharing provides fast and versatile computation on encrypted data via modulo operations with a short bit length. However, combining these tools is not trivial because their operations have different algebraic structures and the use of BNNs generally degrades prediction accuracy. MOBIUS uses improved procedures of BNNs and secure computation that have compatible algebraic structures without degrading prediction accuracy. We implemented MOBIUS in C++ using the ABY library (NDSS 2015). We then conducted experiments using the MNIST dataset, and the results show that MOBIUS can return a prediction within 0.76 seconds, which is six times faster than SecureML (IEEE S\&P 2017). MOBIUS allows a client to request encrypted prediction and allows a trainer to obliviously publish an encrypted model to a cloud provided by a computational resource provider, i.e., without revealing the original model itself to the provider.

I. INTRODUCTION
1) Background: Machine learning methods are widely used in various situations, such as healthcare, manufacturing, and financial services. Consequently, privacy has become a serious concern in the use of big data. In general, the following two features are important for practical use of machine learning: (1) make a prediction oblivious without downgrading performance; and (2) guarantee the security of a trained model.
In the first feature, the application of a privacy-preserving mechanism to a prediction is necessary for guaranteeing the privacy of a client. However, a privacy-preserving mechanism may downgrade the throughput of a model and may thus not be used because of poor performance. To solve this dilemma, a privacy-preserving scheme that does not downgrade performance of a model is necessary.
In the second feature, different privacy-preserving frameworks have been proposed but a framework that hides a model itself, i.e., making the model oblivious, has not been proposed. According to the Recht hypothesis [1], deep neural networks work well because they memorize most of their training data. Several machine learning systems provide prediction as a service in the cloud, and the aforementioned problem strongly affects the trustworthiness of a resource provider who manages a cloud. For example, a resource provider can extract information or even leak a model that he/she receives and manages. Therefore, a trainer who owns a dataset and trains a model has to completely trust a resource provider who provides a prediction service. Consequently, a trainer who wants to maintain privacy will hesitate to outsource machine learning services unless he/she completely trusts a resource provider. To solve this problem, a model should be encrypted to prevent unauthorized entities, including a resource provider, from accessing the model itself.
2) Motivating Example: The main goal of this work is to create a system that provides encrypted prediction as well as encryption of a model in such a way that other entities, including a service provider, cannot access the model itself.
We call this the model-oblivious problem.
In the model-oblivious problem, there are three entities, namely, a trainer, a resource provider, and a client. A trainer trains a model with plaintexts, encrypts the model, and then uploads the encrypted model to a cloud provided by a resource provider. When the client utilizes a model, he/she accesses the cloud. Because the model is encrypted, neither the resource provider nor a client can extract information from the model. Similarly, a client can encrypt input data that will be given to the resource provider. Figure 1 shows an example scenario describing the intuition behind the model-oblivious problem. Consider a scenario that includes a hospital, a cloud server, and doctors as the trainer, the resource provider, and clients, respectively. The hospital trains a model with datasets it collected, encrypts the model, and then publishes the encrypted model on a cloud server, such as AmazonEC2, to make it publicly available to doctors. The cloud server can then execute a prediction for an input provided by a doctor by using the encrypted model without decryption. With the encrypted model, situations where the cloud server tries to extract information from the model or use the model for other purposes can be prevented. In addition, as an equally important measure, the amount of [...]
We note that works on oblivious prediction [2]-[9] and encrypted training [10]-[14] have not discussed or implied the features of the model-oblivious problem. To the best of our knowledge, SecureML [15] is the only other system that considers the model-oblivious problem, and our goal is to construct a faster system without sacrificing prediction accuracy.
3) Cryptographic Approach: In this work, we focus on the use of cryptography for guaranteeing the security of a trained model. One of the possible solutions to training while preserving privacy is differential privacy [16], which can prevent a trained model from leaking an individual record by perturbing the records with randomized noise. Given this capability, many works on neural networks use differential privacy [11], [17]. There are also works on further applications of differential privacy, e.g., data collection on an untrusted server [18], [19] or general function release [20]. However, according to Dowlin et al. [10], the notion of differential privacy is not useful in the prediction phase. Moreover, preventing unauthorized entities from accessing a model is outside the scope of differential privacy. On the other hand, cryptography can rigorously control authorized access to only users with the correct secret information. We therefore construct a system that uses cryptography to encrypt a trained model and prevent unauthorized entities from accessing the trained model.

4) Contribution:
In this work, we propose a new system named Model-Oblivious BInarized neUral networkS (MOBIUS), which enables scalable encrypted prediction and encryption of a trained model. MOBIUS uses binarized neural networks (BNNs) [21] and secure computation based on secret sharing as its main tools. BNNs are neural networks whose weight matrices and activation outputs are binarized to +1 or −1. Avoiding real numbers reduces the computational time of operations on these values. Secure computation based on secret sharing distributes input data from a client as shares such that an individual share leaks nothing about the original data, and it can evaluate the data without reconstructing them via the homomorphism of the shares. The bit length of shares can be shortened in comparison with conventional cryptography, and thus the resulting secure computation can perform better than other cryptographic tools, such as fully homomorphic encryption (FHE) [22].
We note that our contribution is non-trivial. We improved the algorithms used in BNNs to make them compatible with the algebraic structures of secure computation. The bit-shift method used in the original BNNs [21] downgrades prediction accuracy. Batch normalization [23] is used for improving the accuracy but it is based on real numbers and is incompatible with secure computation, which is based on integers. These problems can also potentially downgrade computational performance (See Section VI-A for details). To overcome these limitations, we first improve the algorithm of BNNs, particularly the process of batch normalization, to make use of integers and make them compatible with secure computation. Both the accuracy and the computational time can be thus improved.
We present the construction of MOBIUS utilizing secure computation based on secret sharing and the improved BNNs and its implementation in C++ using the ABY library [24]. We conducted experiments using the MNIST dataset, and the results show that MOBIUS can perform a prediction within 0.76 seconds, which is six times faster than SecureML [15] even without optimizing our implementation (See Section VI-C for details).

5) Related Works:
The closest work is SecureML [15]. The main motivation of SecureML was to provide scalable encrypted training, i.e., solving the model-oblivious problem was not its main goal. Moreover, encrypted training is out of the scope of this work. As related works on combining BNNs with cryptography, TAPAS [14] and FHE-DiNN [12], both based on FHE, have been concurrently proposed. FHE-DiNN utilizes discretized neural networks whose domains are defined from −w to +w, but its experiments were conducted with values from −1 to +1, exactly as in BNNs. These works aim to provide fast computation of FHE [22], [25] in BNNs, and they do not discuss the model-oblivious problem.

II. PRELIMINARIES
In this section, we provide background on neural networks and secure computation to help in understanding our work.

A. Binarized Neural Network
Binarized neural networks (BNNs) [21] were proposed to reduce computational overhead by minimizing data sizes. To do this, the values in a neural network are binarized to +1 or −1, which reduces the required computational resources.
The original work on BNNs [21] described methods to binarize three protocols, namely, full connection, batch normalization, and activation, which are required in standard neural networks. Full connection computes matrix multiplications between vectors and weight matrices. Batch normalization makes the distribution for nodes uniform in the training phase and contributes to speeding up both training and prediction.
Activation applies non-linear processing to output vectors, and a sign function is utilized in BNNs. Among the protocols described above, batch normalization has adopted a bit-shift method to be computed in a binarized form, which is different from well-known batch normalization algorithms [23], because the operations in well-known batch normalization algorithms require real numbers, consequently creating a bottleneck in the computations.

B. Cryptographic Preliminaries
In this section, we describe the notations and terminologies used in secure computation based on secret sharing.
1) Secret Sharing: A t-out-of-n secret sharing scheme over a finite domain D consists of the following two algorithms: Share takes $x \in D$ as input and outputs $[\![x]\!]_1, \ldots, [\![x]\!]_n \in D$; Reconst takes $[\![x]\!]_1, \ldots, [\![x]\!]_t \in D$ as input and outputs $x \in D$. In these algorithms, for $i \in \{1, \ldots, n\}$, $[\![x]\!]_i$ is called the i-th share of x. We denote $[\![x]\!] = ([\![x]\!]_1, \ldots, [\![x]\!]_n)$ as their shorthand. Any set of fewer than t shares of x over the t-out-of-n secret sharing scheme jointly gives no information on x, whereas any t or more shares jointly determine x by using Reconst. The secret sharing schemes that have been proposed typically have finite domains, e.g., a residue class ring $\mathbb{Z}_M$ modulo integer M > 1 or ℓ-length binary strings [26]. An i-th share of an ℓ-dimensional vector $v = (x_1, \ldots, x_\ell)$ over a domain D consists of the i-th shares of its components and is denoted by $[\![v]\!]_i = ([\![x_1]\!]_i, \ldots, [\![x_\ell]\!]_i)$. Analogously, an i-th share of a matrix is defined in the same way. Therefore, a secret sharing scheme over vectors, matrices, and tensors, among others, can be defined.
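As a concrete illustration of Share and Reconst, the following minimal sketch uses additive secret sharing over $\mathbb{Z}_M$, which is an n-out-of-n scheme (t = n). The choice of additive sharing, the function names, and the default n = 2 are assumptions made for illustration and are not taken from the construction above.

```python
import secrets

M = 2 ** 32  # ring modulus (assumed); any integer M > 1 works for this sketch


def share(x, n=2, modulus=M):
    """Split x into n additive shares that sum to x modulo `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % modulus)
    return shares


def reconst(shares, modulus=M):
    """Recombine all n additive shares into the secret."""
    return sum(shares) % modulus


def share_vector(v, n=2, modulus=M):
    """The i-th share of a vector is the vector of i-th shares of its components."""
    per_component = [share(x, n, modulus) for x in v]
    return [[comp[i] for comp in per_component] for i in range(n)]


def reconst_vector(vector_shares, modulus=M):
    """Recombine component-wise across the n share vectors."""
    return [sum(column) % modulus for column in zip(*vector_shares)]
```

Each individual share is uniformly distributed in $\mathbb{Z}_M$, so any n − 1 shares reveal nothing about x; only the sum of all n shares does.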
2) Secure Computation based on Secret Sharing: We define the sub-protocols of secure computation that we utilize in our work. The following computations are defined over a residue class ring $\mathbb{Z}_M = \{0, \ldots, M - 1\}$ modulo integer M. Several efficient implementations of the protocols have been provided [24], [27]. In our implementation, we utilize the ABY library [24], which is based on a two-party setting (See Section VI-A for details), and supports secure computation over a residue class ring modulo integer $M = 2^m$ (m = 8, 16, 32, or 64). Here, Half can be instantiated by the use of CMP although it is not originally included in the ABY library.

C. Security and Network Settings
In this paper, we focus on the semi-honest adversary. More precisely, we consider an adversary who follows the protocols but tries to learn the client's or the trainer's data. As mentioned above, our proposed protocol involves three parties: the client, the service provider, and the trainer. Note that there are n servers in the cloud hosted by the service provider.
The trainer locally trains with plaintexts, i.e., non-encrypted training, and constructs a model of a BNN. The trainer then computes shares of the model with respect to an underlying t-out-of-n secure computation scheme and uploads the resulting shares to the servers. Hence, the adversary cannot learn the model as long as it corrupts fewer than t servers.
The client computes shares of its prediction query on the trainer's model with respect to the underlying t-out-of-n secure computation scheme, and then sends the resulting shares to the cloud. At least t servers jointly execute the prediction protocol, taking the shares of the model and the shares of the query as input, and then output the result. Hence, the adversary cannot learn the client's query as long as it corrupts fewer than t servers.
However, similar to previous proposals [5], [15], we do not aim to hide the size of client's query, the network architecture of trainer's model, and which secure computation protocols are used. The authors of MiniONN [5] suggested that such information can be protected by adding dummy layers, which can also be integrated with our proposed protocol.
Finally, we assume the use of secure channel, which can be instantiated by the transport layer security (TLS) [28]. This setting is the same as that in other literature [5], [15].

1) Technical Problem:
This work aims to create a system that achieves both the performance and the security of a trained model by using BNNs. The values in the operations of BNNs are binarized into +1 or −1 and may seem to be compatible with the algebraic structures of secure computation. However, the processes of the original batch normalization [23] that improve the performance of neural networks are linear operations in real numbers, making them incompatible with secure computation in integers.
2) Transformation into Integers: To solve the compatibility problem, we transform the parameters of batch normalization into linear operations in integers by truncating the lower digits of the parameters and then multiplying them by a constant. We know heuristically that such a transformation has little influence on the accuracy because errors are reset by the nonlinear processing in the activation function that follows the batch normalization. In particular, the possibility that the truncation of digits changes the sign of the output of batch normalization (i.e., from positive to negative or vice versa) and thereby influences the activation function is negligible. The output of the batch normalization in the output layer is identical to that of BNNs, and the index of the maximum value in the output vector is finally obtained as the prediction result. The possibility that the index of the maximum value is changed is negligible, and thus the truncation of digits does not affect the prediction result. In actual applications, the sizes of the parameters can be chosen such that the decline in the accuracy of a trained BNN model is minimal.
The method described above solves the incompatibility problem between the algebraic structures of the operations of BNNs and secure computation. Moreover, this method achieves a higher accuracy than the bit-shift method in the original BNNs [21] because the standard batch normalization can clip distribution with a higher accuracy. Finally, we can construct MOBIUS by combining an efficient and scalable secure computation based on secret sharing and the improved BNNs.

IV. BINARIZED NEURAL NETWORKS COMPATIBLE WITH SECURE COMPUTATION
In this section, we propose improved BNNs to be used in MOBIUS. First, we discuss the difference between the proposed BNNs and the original BNNs [21]. Then, we describe the algorithms used in the proposed BNNs. Finally, we instantiate an architecture for MNIST, a large database of handwritten digits, as a concrete example of the proposed BNNs.

A. Avoiding Shift-Based Batch Normalization
As described in the previous section, our batch normalization uses only integers. Let $\gamma^{(i)}$, $\beta^{(i)}$, $\mu^{(i)}$, and $\sigma^{(i)}$ be learned parameters and $\epsilon$ a small positive value. The result $\hat{x}^{(i)}$ of ordinary batch normalization is given by Equation (1). By replacing coefficients, Equation (1) can be transformed into Equation (2), a linear expression in the input with real-valued coefficients $s'^{(i)}$, $t'^{(i)}$. By substituting integers $s^{(i)}$, $t^{(i)}$ for $s'^{(i)}$, $t'^{(i)}$ using an appropriate integer q, called the scale parameter, we obtain an alternative integer $\hat{x}'^{(i)}$ for $\hat{x}^{(i)}$. Although the value of q can be determined layerwise or even nodewise, the same q is used in every node for brevity in this paper. As q increases, the deterioration of BNN prediction accuracy decreases. However, the bit length of the modulus M also increases with q, consequently increasing memory requirements and calculation costs. Therefore, q should be as small as possible while still maintaining high prediction accuracy.
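The transformation can be illustrated with a small numerical sketch. It assumes the standard batch normalization form $\gamma (x - \mu) / \sqrt{\sigma^2 + \epsilon} + \beta$ with $\sigma$ treated as a standard deviation, the folded coefficients $s' = \gamma / \sqrt{\sigma^2 + \epsilon}$ and $t' = \beta - s'\mu$, and rounding of $q s'$ and $q t'$ to the nearest integer. These exact formulas, the rounding rule, and all function names are assumptions for illustration; the paper's own equations may differ in detail.

```python
import math


def batch_norm_real(x, gamma, beta, mu, sigma, eps=1e-5):
    """Ordinary (real-valued) batch normalization of a single node."""
    return gamma * (x - mu) / math.sqrt(sigma ** 2 + eps) + beta


def fold_batch_norm(gamma, beta, mu, sigma, eps=1e-5):
    """Fold the four parameters into a real-valued linear map s' * x + t'."""
    s_real = gamma / math.sqrt(sigma ** 2 + eps)
    t_real = beta - s_real * mu
    return s_real, t_real


def to_integer_params(s_real, t_real, q=10_000):
    """Scale by q and round, giving integer coefficients (s, t)."""
    return round(s_real * q), round(t_real * q)


def batch_norm_int(x, s, t):
    """Integer-only batch normalization; the result approximates q times the real output."""
    return s * x + t
```

Because the integer output is approximately q times the real-valued output and q is positive, its sign agrees with the real-valued result unless that result is very close to zero, which is the property the subsequent sign activation relies on.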

B. Improved Binarized Neural Networks
In this section, we describe the binary full connection, batch normalization, and activation algorithms used in the proposed BNNs. The binary full connection algorithm is shown in Algorithm 1. This algorithm takes an integer vector a and a learned weight matrix W as inputs, then outputs the result of matrix multiplication W a.

Algorithm 1 BinaryFullConnection
The batch normalization algorithm is shown in Algorithm 2. This algorithm takes an integer vector c, which is usually an output of binary full connection, and batch normalization parameters s ′ , t ′ as inputs, then outputs the result of batch normalization. Batch normalization parameters s ′ , t ′ are obtained as described in IV-A.

Algorithm 2 BatchNormalization
Input: $c \in \mathbb{Z}^d$: input vector; $s', t' \in \mathbb{Z}^d$: batch normalization parameters
Output: $b \in \mathbb{Z}^d$
Procedure:
1: …

The activation algorithm is shown in Algorithm 3. This algorithm takes an integer vector b as input, then outputs a binary vector that represents the signs of the elements of b.
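The three plaintext building blocks described in this section can be sketched as follows. The element-wise affine form of batch normalization and the convention that the sign of 0 is +1 are assumptions, since the algorithm bodies are not reproduced here; the function names are illustrative.

```python
def binary_full_connection(a, W):
    """Return W a for an integer vector a and a weight matrix W with entries in {-1, +1}."""
    return [sum(w * x for w, x in zip(row, a)) for row in W]


def batch_normalization(c, s, t):
    """Element-wise integer affine map b_k = s_k * c_k + t_k (assumed form)."""
    return [sk * ck + tk for sk, ck, tk in zip(s, c, t)]


def activation(b):
    """Sign function; mapping 0 to +1 is an assumed convention."""
    return [1 if x >= 0 else -1 for x in b]
```

A hidden layer is then the composition `activation(batch_normalization(binary_full_connection(a, W), s, t))`, which uses only integer additions, multiplications, and a comparison.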

C. Binarized Neural Networks for MNIST dataset
In Section IV-B, we described the algorithms used in the proposed BNNs. To use BNNs for learning or predicting data, we need to instantiate a concrete architecture and determine the entire procedure. We instantiate an architecture for MNIST dataset image classification (See Section VI-B for details on the MNIST dataset).
Consider a typical architecture with an input layer of size 784, two hidden layers of size d, and an output layer of size 10, as shown in Figure 2. In the hidden layers, the full connection, batch normalization, and activation algorithms are executed in order. In the output layer, only the full connection and batch normalization algorithms are executed. In this architecture, even though the maximum value index of the output vector is the result of the prediction, we omit this process because the proposed method is designed to return output vectors as the result of secure computation.

Fig. 2. Architecture of the BNNs for MNIST dataset

Algorithm 4 Binarized Neural Network for MNIST
The weight matrices $W_1$, $W_2$, and $W_3$ used in Algorithm 4 are learned parameters, and the batch normalization parameters $s'_j$, $t'_j$ can be calculated as described in Section IV-A. For d = 128 and d = 1000 (d is the size of the hidden layers), we confirmed experimentally that the deterioration of prediction accuracy on test data is negligible when the scale parameter is q = 10,000. Therefore, we use q = 10,000 in all experiments in this work.
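The overall flow of the 784-d-d-10 architecture can be sketched as below. The weights and batch normalization parameters are random placeholders rather than a trained model, the input is a random vector of values in 0 to 255 standing in for an MNIST image, and the layer ordering follows the description above (sign activation on the hidden layers only). All names are illustrative.

```python
import random


def bnn_forward(x, layers):
    """Run full connection and batch normalization per layer; sign activation on hidden layers only."""
    a = x
    for idx, (W, s, t) in enumerate(layers):
        c = [sum(w * v for w, v in zip(row, a)) for row in W]   # full connection
        b = [sk * ck + tk for sk, ck, tk in zip(s, c, t)]       # integer batch normalization
        is_hidden = idx < len(layers) - 1
        a = [1 if v >= 0 else -1 for v in b] if is_hidden else b
    return a


def random_layer(rng, d_out, d_in, q=10_000):
    """Placeholder parameters: binary weights and integer (s, t) on the scale of q."""
    W = [[rng.choice((-1, 1)) for _ in range(d_in)] for _ in range(d_out)]
    s = [rng.randint(1, q) for _ in range(d_out)]
    t = [rng.randint(-q, q) for _ in range(d_out)]
    return W, s, t


rng = random.Random(0)
d = 128
layers = [random_layer(rng, d, 784), random_layer(rng, d, d), random_layer(rng, 10, d)]
x = [rng.randint(0, 255) for _ in range(784)]

scores = bnn_forward(x, layers)                          # output vector of size 10
prediction = max(range(10), key=lambda k: scores[k])     # arg max, done outside the network
```

The final arg max is kept outside `bnn_forward` to mirror the design choice of returning the output vector itself.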

V. MOBIUS DESIGN
In this section, we describe the design of MOBIUS. We first describe share generation of a trained model in the preprocessing phase, and then show its main algorithms.
MOBIUS is composed of protocols we call secure full connection, secure batch normalization, and secure activation. The main sequences of these protocols are almost the same as those described in the previous section, but we utilize secure computation in the internal processes.

A. Secret Sharing a Model
We first construct shares of the parameters learned in plaintext, except for those of batch normalization, by using the secret sharing described in Section II-B. In this construction, let M be the modulus of the secret sharing. For any integer a, the value to be shared is a itself if a ≥ 0 and M + a if a < 0. Hereinafter, we regard $0 \le a \le \lfloor M/2 \rfloor$ as a non-negative integer and $\lfloor M/2 \rfloor < a < M$ as a negative integer. The learned weight matrices $W_i$ (i = 1, ..., L − 1) are shared using secret sharing and are stored on the servers in a distributed manner. The parameters of the batch normalization are computed as described in Section IV-A, and the resulting parameters $s_i$, $t_i$ (i = 1, ..., L − 1) are stored on each server as shares by using the secret sharing. Finally, the size information $(L, n_0, \ldots, n_L)$ is not shared, i.e., it is stored in plaintext.
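The representation of signed integers in $\mathbb{Z}_M$ can be sketched as follows. The helper names `encode` and `decode` and the choice M = 2^32 are assumptions for illustration.

```python
M = 2 ** 32  # modulus of the secret sharing (assumed value)


def encode(a, modulus=M):
    """Map a signed integer to Z_M: a itself if a >= 0, and M + a if a < 0."""
    low = -(modulus - 1 - modulus // 2)   # most negative representable value
    high = modulus // 2                   # largest non-negative representable value
    if not low <= a <= high:
        raise ValueError("value does not fit in the signed range of Z_M")
    return a % modulus


def decode(r, modulus=M):
    """Read 0 <= r <= floor(M/2) as non-negative and floor(M/2) < r < M as negative."""
    return r if r <= modulus // 2 else r - modulus
```

Addition and multiplication modulo M commute with this encoding as long as the true result stays within the representable range, which is why M must grow with the scale parameter q.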

B. Model-Oblivious Prediction
The construction of a prediction protocol for MNIST in MOBIUS is shown in Algorithm 5. The secure full connection, secure batch normalization, and secure activation are denoted by SecureFC, SecureBN, and SecureAct, respectively. Moreover, for any matrix X, X i,j indicates an element of the i-th row and j-th column and X i indicates an element of the i-th column.

Algorithm 5 SecureBinaryNN for MNIST
Input: $\mathrm{input} \in \mathbb{Z}_M^{784}$: shares of input vectors
Output: $\mathrm{output} \in \mathbb{Z}_M^{10}$: prediction results
Procedure:

The secure full connection protocol is described in Algorithm 6. The matrix multiplication between shares is computed similarly as in Algorithm 1.

Algorithm 6 SecureFullConnection
1: …
2: for j = 0 to $d_{in}$ do
3-4: …
5: end for
6: end for

The secure batch normalization protocol is described in Algorithm 7. Although the original batch normalization [23] requires computations of root or division, the secure batch normalization protocol can be performed with only addition and multiplication by performing the computation in Equation (2) in advance.

Algorithm 7 SecureBatchNormalization

A. Implementation
1) Language and Library: MOBIUS is implemented in C++ with the ABY library [24] for secure computation. The ABY library is a secure computation framework with a two-party setting, and contains three types of shares, namely, Arithmetic, Boolean, and Yao. These shares have different operations, and the ABY library provides efficient conversions between them. We refer the readers to the paper [24] for details on the ABY framework.
We briefly describe several parts related to our implementation below. Arithmetic shares can be used in arithmetic operations, such as addition and multiplication. Therefore, the secure full connection and secure batch normalization are implemented with Arithmetic shares. In terms of share size, the ABY library offers four bit lengths for the modulus, i.e., 8, 16, 32, and 64 bits. Although we omit the details due to space limitations, the MNIST dataset can be handled with the 32-bit parameter. The secure activation requires a comparison operation of secure computation, which can be computed with Boolean or Yao shares. In the ABY library, Arithmetic shares cannot be directly converted into Boolean shares, i.e., the Arithmetic shares are first converted to Yao shares and then from Yao shares to Boolean shares. Since using Boolean shares would therefore require two conversions, we used Yao shares in the secure activation. Besides, according to the benchmark of the ABY library [24], a comparison operation using Yao shares can be computed faster than one using Boolean shares. We hence implement the secure activation with Yao shares.
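To illustrate why addition and multiplication suffice for the secure full connection and secure batch normalization, the following sketch evaluates one neuron on two-party additive shares. Addition is local, and multiplication uses a Beaver multiplication triple. The trusted dealer that generates triples here is a simplification for illustration, and none of these function names belong to the ABY library; this is not how the ABY API is invoked.

```python
import secrets

M = 2 ** 32  # 32-bit ring, matching the share size used for MNIST above


def share(x):
    """Two-party additive sharing of x over Z_M."""
    r = secrets.randbelow(M)
    return r, (x - r) % M


def reconst(x0, x1):
    return (x0 + x1) % M


def add_shares(x, y):
    """Addition is local: each party adds the shares it holds."""
    return (x[0] + y[0]) % M, (x[1] + y[1]) % M


def beaver_triple():
    """Trusted-dealer stand-in: shares of random a, b and of c = a * b."""
    a, b = secrets.randbelow(M), secrets.randbelow(M)
    return share(a), share(b), share(a * b % M)


def mul_shares(x, y):
    """Multiply shared values with one triple; only the masked values d and e are opened."""
    a, b, c = beaver_triple()
    d = reconst((x[0] - a[0]) % M, (x[1] - a[1]) % M)  # d = x - a
    e = reconst((y[0] - b[0]) % M, (y[1] - b[1]) % M)  # e = y - b
    # x * y = c + d*b + e*a + d*e; the public term d*e is added by party 0 only.
    z0 = (c[0] + d * b[0] + e * a[0] + d * e) % M
    z1 = (c[1] + d * b[1] + e * a[1]) % M
    return z0, z1


def secure_neuron(w_shares, a_shares, s_share, t_share):
    """One output node: shared dot product followed by the shared affine map s * c + t."""
    acc = (0, 0)
    for w, a in zip(w_shares, a_shares):
        acc = add_shares(acc, mul_shares(w, a))
    return add_shares(mul_shares(s_share, acc), t_share)
```

Negative values are passed in as their residues modulo M, so a result such as −4 is reconstructed as M − 4.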
2) Overview of Implementation: We show the implementation flow of shares in Figure 3. In the ABY library, Yao shares cannot be converted directly into Arithmetic shares, and thus Yao shares need to be converted to Boolean shares first, and then from Boolean shares to Arithmetic shares. We note that even though Boolean shares can be used in the secure activation, the flow shown in Figure 3 provides the fastest implementation. In the algorithms of the original BNNs [21], secure activation should be performed with Boolean shares because of the bit-shift operations. The extra conversion to Boolean shares and the overhead operations of Boolean shares may downgrade the computational performance.
The implementation of MOBIUS was created by simply using the available ABY library and is therefore not optimized, unlike SecureML [15]. Therefore, the performance in the following experiments can be improved by optimizing the implementation. We plan to publish our source code for subsequent works. The training phase is out of the scope of this work, and therefore a model is trained in advance. Shares of the model and input from a client are generated by the PutSIMDINGate function of the ABY library.

Notes on Table I: The second column refers to the capability to make a model oblivious. The third column refers to the evaluation of the model as output at the prediction phase. The final column refers to the total elapsed execution time in real time, i.e., including both off-line computation and on-line computation. Although several source codes of the other systems were obtained, we could not execute them, and the values in the table were obtained from their papers.

1) Machine Environments:
We conducted experiments with the MNIST dataset using the algorithms described in Section IV-C on two AmazonEC2 c4.8xlarge machines, both of which run Linux and have 60 GB of RAM. The two machines are hosted in the same region, which corresponds to a LAN setting. The bandwidth is 1 GB/s, and the neural network has two hidden layers with 128 neurons in each layer. This setting is identical to that of SecureML [15]. We also utilize the sign function as the activation function. The neural network is fully connected. We then compare the performance of our protocol with SecureML and other state-of-the-art protocols with cryptography [5], [12], [14].
2) Dataset: The MNIST dataset contains 70,000 images of handwritten digits from 0 to 9. In particular, the MNIST dataset has 60,000 training samples and 10,000 test samples, each with 784 features representing the 28×28 pixels of the image. Each feature is a grayscale value between 0 and 255.

C. Results
The experimental results are shown in Table I, Figure 4, and Figure 5. Table I shows a comparison of different protocols based on capability to make a model oblivious, accuracy, and computational time. Figure 4 shows a comparison of MOBIUS and the original BNNs based on prediction accuracy with respect to the number of neurons. Figure 5 shows a comparison of MOBIUS and the original BNNs based on computational time for prediction with respect to the number of neurons.
As shown in Table I, MOBIUS, which combines BNNs and secure computation based on secret sharing, is the fastest system despite having the capability to encrypt a model. We again note that the performance of MOBIUS may be improved further by optimizing our implementation. As shown in Figure 4, MOBIUS has better prediction accuracy than the original BNNs because it uses improved BNNs that do not use bit-shift operations. Finally, as shown in Figure 5, the computational time of MOBIUS appears to be linear with respect to the number of neurons, although it is 100 times longer than that of the original BNNs. We can thus approximately estimate the performance for any number of neurons.

VII. CONCLUSION
In this work, we presented MOBIUS (Model-Oblivious BInarized neUral networkS), a system that enables scalable encrypted prediction and encryption of a trained model. As our main technical contribution, we presented new algorithms of BNNs that are compatible with secure computation by representing all parameters in integers and removing the bit-shift method used in the original BNNs [21]. We then designed the main construction of MOBIUS with secure computation based on Arithmetic shares and Yao shares. We also conducted experiments using the MNIST dataset, and the results show that MOBIUS achieves higher computational performance and higher accuracy than SecureML, which is the only other system that considers the model-oblivious problem. As future work, we plan to conduct experiments on more complicated datasets, such as CIFAR10.