A Communication-efficient Local Differentially Private Algorithm in Federated Optimization

Federated optimization, wherein several agents in a network collaborate with a central server to achieve an optimal social cost over the network without exchanging information among agents, has attracted significant interest from the research community. In this context, agents demand resources based on their local computation. Because optimization parameters such as states, constraints, or objective functions are exchanged with a central server, an adversary may infer sensitive information about agents. We develop a differentially private additive-increase and multiplicative-decrease algorithm to allocate multiple divisible shared heterogeneous resources to agents in a network. The developed algorithm provides a differential privacy guarantee to each agent in the network. The algorithm requires no inter-agent communication, and agents need not share their cost functions or their derivatives with other agents or a central server; however, they share their allocation states with a central server that keeps track of the aggregate consumption of resources. The algorithm incurs very little communication overhead: for m heterogeneous resources in the system, the asymptotic upper bound on the communication complexity is O(m) bits at a time step. Furthermore, if the algorithm converges in K time steps, the upper bound on the communication complexity is O(mK) bits. The algorithm can find applications in several areas, including smart cities, smart energy systems, and resource management in sixth-generation (6G) wireless networks with privacy guarantees. We present experimental results to check the efficacy of the algorithm, and we present empirical analyses of the trade-off between privacy and algorithm efficiency.


I. INTRODUCTION
As the number of mobile and IoT devices and the volume of data increase, the risk of privacy leaks also increases. Moreover, because of privacy concerns, users may not wish to share their on-device data, which could be detrimental to many applications' usability. Federated optimization is a step toward resolving this issue. In federated optimization, resource-constrained clients (agents) coordinate with a central server to solve an optimization problem without exposing their private data [1], [2]. Federated optimization has been used in federated learning, which enables clients to collaboratively learn models without sharing their private on-device data. Instead, clients share the learned parameters with a central server. The central server aggregates the updates sent by the clients to update the global model [3]. More information is available in recent comprehensive surveys [4], [5].
Although federated optimization provides privacy to agents, stronger privacy guarantees can be obtained by combining it with privacy techniques [3] such as differential privacy [6]–[8], federated f-differential privacy, which guarantees differential privacy to agents against an individual adversary and a group of adversaries [9], and secure multiparty computation [10], a cryptography-based technique. Briefly, differential privacy provides a certain privacy guarantee to users storing data in a shared database [11], [12]. It allows an analyst to learn aggregate statistics on the database without compromising the privacy of participating users. This is done by adding randomness to the data via a trusted curator, following a certain probability distribution. Local differential privacy is proposed for scenarios where the curator is untrusted and may act as an adversary. In local differential privacy, users randomize their data before storing it in the database [13], [14].
In federated optimization, agents make decisions based on their local on-device computations. Because they exchange certain optimization parameters, such as states, constraints, objective functions, derivatives of objective functions, or multipliers, with the central server, an adversary may infer sensitive information about agents [15], [16]. Furthermore, communication overhead between clients/agents and the central server is another major issue in federated optimization [17]–[19]. Privacy-preserving federated optimization solutions have applications in many areas, including healthcare [20], smart cities [21], the Internet of Things [22], [23], supply chain management (collective supply chain risk prediction) [24], and smart energy systems [25]. Interested readers can refer to the recent survey [5] for more applications. In addition, because of the massive growth of resource-constrained IoT devices and the requirements of low-latency communications, scalability, and quality of user experience in sixth-generation (6G) wireless networks [26], communication-efficient privacy-preserving federated optimization solutions will have many promising applications, along with resource management.
Most of the work on federated optimization with differential privacy involves machine learning and deep learning techniques [6]–[9]; we could not find any work on federated optimization with differential privacy for allocating multiple shared heterogeneous resources. This paper fills this gap. We develop a local differentially private additive-increase multiplicative-decrease algorithm, called the LDP-AIMD algorithm, for multi-resource allocation in federated settings that provides privacy guarantees to agents in the network with little communication overhead. Additionally, the agents in the network achieve close-to-optimal allocations, and the network achieves a close-to-optimal social cost. We consider multivariate cost functions coupled through the allocation of multiple heterogeneous resources; such optimization problems are more challenging to solve than single-variable problems. Furthermore, single-resource allocation solutions applied to multi-resource settings are not efficient and provide sub-optimal solutions [27]. Finally, the developed algorithm provides privacy guarantees for the partial derivatives of the agents' cost functions. The LDP-AIMD algorithm is motivated by ideas from the additive-increase and multiplicative-decrease (AIMD) algorithm [28], [29]. The AIMD algorithm was proposed for congestion control in the Transmission Control Protocol (TCP) and is widely used in resource allocation. AIMD has recently been used to solve optimization problems in which there is no requirement for inter-agent communication, but agents coordinate with a central server to solve the problem [30], [31]. The algorithm incurs little communication overhead over the network. However, there is no prior work on AIMD with differential privacy for federated optimization.
The LDP-AIMD algorithm consists of two phases: additive increase and multiplicative decrease. In the additive increase phase, agents continually increase their resource demands until they receive a one-bit capacity constraint notification from a central server. The central server broadcasts the one-bit capacity event notification when the aggregate resource consumption reaches its capacity. After receiving the notification, agents enter the multiplicative decrease phase and decrease their resource demands. They calculate their resource demands in the multiplicative decrease phase based on the average allocation, noisy derivatives of the cost function, and other parameters. The noise added to the derivatives of the cost function is drawn from a certain probability distribution. In line with prior work [30], [32], [33], we consider that the cost function of each agent is convex, continuously differentiable, and increasing in each variable. The cost functions are coupled through the allocation of multiple divisible heterogeneous resources. LDP-AIMD provides a differential privacy guarantee for the partial derivatives of the agents' cost functions and obtains close-to-optimal values for multiple resources. Additionally, the solution incurs very little communication overhead. For m heterogeneous divisible resources in the system, the asymptotic upper bound on the communication complexity is O(m) bits at a time step. Furthermore, if the algorithm converges in K time steps, the upper bound on the communication complexity is O(mK) bits.
The AIMD algorithm and its variants are implemented in several real-world applications, such as electric vehicle (EV) charging [34], [35], the sharing economy [36], optimal power generation in micro-grids [37], decentralized power sharing [38], and collaborative cruise control systems [30]; other applications can be found in [32], [39]. Additionally, federated optimization, federated learning, and local differential privacy are used in many real-world applications [40], [41]. For example, Google deployed a local differentially private model called RAPPOR [42] to collect statistics on users' Chrome web browser usage and settings [43] without compromising the privacy of participating users. Microsoft deployed local differential privacy mechanisms to collect users' telemetry data to enhance user experiences with privacy guarantees to the users [14]. Apple deployed local differential privacy techniques on its servers to learn new words generated on users' devices that are not in the device dictionary; these words are classified into groups with privacy guarantees to the users [44]. Furthermore, Google uses federated learning in Gboard (Google's virtual keyboard) for next-word prediction [45] and emoji prediction from typed text [46]. NVIDIA and Massachusetts General Brigham Hospital developed a federated learning-based model to predict COVID-19 patients' oxygen needs during the recent pandemic [47].

A. PAPER CONTRIBUTIONS
As discussed previously, although there are several works on federated optimization with differential privacy, most of this work deals with training a global model in machine learning application areas (federated learning). Our work, however, is the first on allocating multiple heterogeneous divisible resources in federated settings with differential privacy guarantees, which brings many opportunities and challenges. The main contributions of this paper are as follows: □ We model the multi-resource allocation problem as a federated optimization problem that provides differential privacy guarantees to agents in the network. In the model, each agent has a multivariate cost function wherein each variable represents the allocation of a resource. Such optimization problems are more challenging to solve than single-variable optimization problems. □ Our algorithm is based on the additive-increase multiplicative-decrease (AIMD) algorithm; it incurs little communication overhead, provides close-to-optimal solutions, and guarantees differential privacy. Existing AIMD-based resource allocation solutions do not automatically provide differential privacy guarantees to participating agents [30], [31], [34], [35], [48].
In the model, agents do not need to share their private information with other agents or a server. The agents randomize the partial derivatives of their cost functions before calculating their resource demands. In the model, we consider a central server that keeps track of aggregate resource consumption at each time instant and broadcasts a one-bit notification in the network when aggregate resource consumption reaches resource capacity. Although the central server knows the aggregate resource consumption, it does not know the partial derivatives of the cost functions or the cost functions of the agents. Furthermore, the solution incurs negligible communication overhead; the server broadcasts a one-bit capacity constraint notification in the network when aggregate resource demands reach resource capacity. Additionally, the communication complexity is independent of the number of agents in the network. For m resources in the system, the asymptotic upper bound on the communication complexity is O(m) bits at a time step. Furthermore, if the algorithm converges in K time steps, the upper bound on the communication complexity is O(mK) bits. □ We show theoretical results for the LDP-AIMD algorithm for noise drawn from Gaussian and Laplacian distributions. □ We present experimental results to check the empirical effectiveness of the algorithm. We observe that the algorithm provides (ϵ_1i + ϵ_2i)-local differential privacy and (ϵ_1i + ϵ_2i, δ_1i + δ_2i)-local differential privacy guarantees to an agent for two shared resources in the system, respectively, depending on the probability distribution of the noise. We also analyze the trade-off between privacy and the algorithm's efficiency.

B. OUTLINE
The paper is organized as follows: Section II presents the notations used in the paper and a brief background on the AIMD algorithm and the deterministic AIMD algorithm. Section III presents the problem formulation. Section IV presents our developed algorithm, the LDP-AIMD algorithm. Additionally, theoretical results on the LDP-AIMD algorithm with noise drawn from the Laplacian and Gaussian distributions are presented in that section. Section V presents the experimental setup and results. Section VI describes the related work on distributed resource allocation, distributed optimization, federated optimization, and differential privacy. Section VII provides the conclusion of the paper, and finally, the Appendix presents the proofs of the theorems stated in Section IV.

A. NOTATIONS
Let us consider that n agents in a multi-agent system collaborate to access m shared divisible heterogeneous resources and wish to minimize the social cost of the system. Let the capacities of the resources be C_1, C_2, . . ., C_m, respectively. Let the set of real numbers be denoted by R, the set of positive real numbers by R_+, and the set of m-dimensional vectors of real numbers by R^m. For i = 1, 2, . . ., n and j = 1, 2, . . ., m, let the amount of resource j allocated to agent i be denoted by x_ji ∈ [0, C_j]. Furthermore, suppose that allocating resources incurs a certain cost to each agent, captured in the agent's cost function. Let the cost function of agent i be denoted by f_i : R^m_+ → R_+. Let the cost function f_i be twice continuously differentiable, strictly convex, and increasing in each variable. Let ν = 0, 1, 2, . . . denote the time steps, and let x_ji(ν) denote the instantaneous allocation of resource j to agent i at time step ν.
For n agents in the network, we model the multi-resource allocation problem as the following federated optimization problem (the displayed problem is reconstructed from the surrounding definitions):

Problem II.1: minimize Σ_{i=1}^{n} f_i(x_1i, . . ., x_mi) subject to Σ_{i=1}^{n} x_ji = C_j and x_ji ≥ 0, for j = 1, 2, . . ., m.

Let the solution to this optimization problem be denoted by x*_ji. We assume that the solution is strictly positive. Furthermore, let x* = (x*_11, . . ., x*_mn). As the cost function f_i is strictly convex and the constraint sets are compact, there exists a unique optimal solution [30]. We assume that all agents have strictly convex cost functions that are differentiable and increasing. Furthermore, the agents are cooperative and demand the resources with their true valuations, with the aim of minimizing the overall cost to the network.

B. AIMD ALGORITHM
We present the fundamental additive-increase and multiplicative-decrease (AIMD) algorithm. The AIMD algorithm consists of two phases: additive increase and multiplicative decrease. In the additive increase (AI) phase, an agent increases its resource demand linearly by a constant α_j ∈ (0, C_j], called the additive increase factor, until it receives a one-bit capacity event notification from the central server. The central server tracks the aggregate resource consumption and broadcasts the one-bit capacity event notification when the total resource consumption reaches its capacity. For time steps ν = 0, 1, 2, . . ., the additive increase phase is formulated as

x_ji(ν + 1) = x_ji(ν) + α_j.

After receiving the capacity event notification, an agent responds in a probabilistic way to decrease its resource demand. Suppose that the k'th capacity event occurs at time step ν. Then, for 0 ≤ β_j < 1 and a back-off factor B_j ∈ {β_j, 1} drawn from a certain probability distribution, the multiplicative decrease (MD) phase is formulated as

x_ji(ν + 1) = B_j x_ji(ν).

After the multiplicative decrease phase, agents re-enter the AI phase until they receive the next capacity event notification, and they continue in this manner. Interested readers can refer to [30] and [32] for details on AIMD and AIMD-based resource allocation models.
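For concreteness, the AI/MD cycle for a single shared resource can be sketched as follows. This is a minimal illustration, not the paper's implementation: the 50% back-off probability, the three agents, and all parameter values are our assumptions.

```python
import random

def aimd_step(demands, capacity, alpha, beta):
    """One synchronous AIMD step for a single shared resource.

    AI phase: every agent adds alpha to its demand.  When the server detects
    that the aggregate demand has reached capacity (a one-bit capacity event),
    each agent independently backs off to beta * demand with probability 0.5
    (an illustrative choice for the paper's 'certain probability distribution')
    or keeps its demand unchanged (MD phase)."""
    demands = [x + alpha for x in demands]                    # AI phase
    if sum(demands) >= capacity:                              # capacity event
        demands = [x * (beta if random.random() < 0.5 else 1.0) for x in demands]
    return demands

random.seed(1)
demands = [0.0] * 3
for _ in range(5000):
    demands = aimd_step(demands, capacity=5.0, alpha=0.01, beta=0.7)
# After many steps, the aggregate demand oscillates just below the capacity.
```

The sawtooth behaviour (linear growth, multiplicative back-off at capacity events) is exactly the TCP congestion-control pattern the section describes.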

C. DETERMINISTIC AIMD ALGORITHM
We briefly describe here the deterministic AIMD algorithm of [29], which is the starting point of the LDP-AIMD algorithm. The deterministic AIMD algorithm consists of two phases: the additive increase and the multiplicative decrease phases. In the additive increase phase, an agent increases its resource demand linearly by the additive increase factor α_j ∈ (0, C_j] until it receives a one-bit capacity event notification from the central server, sent when the aggregate demand for a resource reaches its capacity. For time steps ν = 0, 1, 2, . . ., the additive increase phase is formulated as

x_ji(ν + 1) = x_ji(ν) + α_j.

After receiving the capacity event notification, agents decrease their resource demands in a deterministic way using the multiplicative decrease factor 0 ≤ β_j < 1 and the scaling factor 0 < λ_ji ≤ 1. Let k_j denote the capacity event of resource j, for j = 1, 2, . . ., m. Let x̄_ji(k_j) denote the average allocation of resource j to agent i at capacity event k_j. For all i, j, and k_j, the average allocation is calculated as

x̄_ji(k_j) = (1 / k_j) Σ_{l=1}^{k_j} x_ji(t_l),    (4)

where t_l denotes the time at which the l'th capacity event of resource j occurs. Suppose that the k_j'th capacity event occurs at time step ν. Then the multiplicative decrease phase is formulated as

x_ji(ν + 1) = (λ_ji(k_j) β_j + 1 − λ_ji(k_j)) x_ji(ν).

Agent i calculates λ_ji(k_j) for resource j as in (6). Moreover, after decreasing their demands, agents again start increasing them linearly by α_j until they receive the next capacity event notification. This process repeats over time so that the long-term average allocations attain the optimal values.

Let ∂f_i/∂x_ji(x_1i, . . ., x_mi) denote the partial derivative of the cost function f_i(·) with respect to x_ji. For k_j ∈ N, let t_kj denote the time at which the k_j'th capacity event occurs. Let x̄_1i(t_k1) denote the average allocation of agent i over the capacity events for resource 1 until time instant t_k1, and x̄_2i(t_k1) the average allocation of agent i over the capacity events for resource 2 until time instant t_k1. Recall that x̄_ji(t_kj) is also denoted by x̄_ji(k_j); it is calculated as in (4), for j = 1, 2, . . ., m. Additionally, let Γ_j > 0 denote the normalization factor of resource j, for j = 1, 2, . . ., m. It is chosen such that 0 < λ_ji(k_j) ≤ 1 holds. For all i and j, at the k_j'th capacity event, λ_ji(k_j) is obtained as follows:

λ_ji(k_j) = Γ_j · (∂f_i/∂x_ji)(x̄_1i(t_kj), . . ., x̄_mi(t_kj)) / x̄_ji(t_kj).    (6)

Furthermore, each agent runs its own copy of the algorithm to demand shared resources. It has been shown experimentally that, with appropriate values of λ_ji(k_j), α_j, β_j, and Γ_j, the long-term average allocations of the agents converge to the optimal values, that is, x̄_ji(k_j) → x*_ji as k_j → ∞, and the system achieves the optimal social cost Σ_{i=1}^{n} f_i(x*_1i, . . ., x*_mi). In the deterministic AIMD algorithm, an adversary may obtain the actual allocations of an agent and may infer the derivatives of the agent's cost function, or the cost function itself. In contrast, the LDP-AIMD algorithm provides privacy guarantees to agents on the partial derivatives of their cost functions and obtains a close-to-optimal social cost.
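Under one plausible reading of the deterministic update, the multiplicative decrease is the expected value of a probabilistic back-off applied with probability λ_ji(k_j), where λ_ji(k_j) is the normalized partial derivative Γ_j · (∂f_i/∂x_ji) / x̄_ji evaluated at the average allocations. A per-event computation might then look like this sketch; the function names are ours.

```python
def scaling_factor(partial_deriv, avg_alloc, gamma):
    """lambda_ji(k_j) = Gamma_j * (df_i/dx_ji at the average allocations)
    / (average allocation of resource j).  Gamma_j must be chosen so the
    result lies in (0, 1]."""
    lam = gamma * partial_deriv / avg_alloc
    assert 0.0 < lam <= 1.0, "choose a smaller normalization factor Gamma_j"
    return lam

def md_update(x, lam, beta):
    """Deterministic multiplicative decrease:
    x(nu+1) = (lam*beta + (1 - lam)) * x(nu),
    i.e. the expectation of backing off to beta*x with probability lam."""
    return (lam * beta + (1.0 - lam)) * x

# Illustrative values: derivative 4.0, average allocation 2.0, Gamma = 0.1.
lam = scaling_factor(partial_deriv=4.0, avg_alloc=2.0, gamma=0.1)   # 0.2
x_next = md_update(x=3.0, lam=lam, beta=0.7)                        # 3*(0.14+0.8)
```

Agents with a larger normalized derivative back off more strongly, which is what steers the long-term averages toward the optimum.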

III. PROBLEM FORMULATION
We aim to protect the privacy of the agents' partial derivatives of their cost functions with a certain privacy guarantee. Let us assume that agent i stores its private information in a data-set D_i, for i = 1, 2, . . ., n. Let k_1 capacity events of resource 1 and k_2 capacity events of resource 2 occur until time instant t_k. Recall that, for k ∈ N, x̄_1i(t_k) denotes the average allocation of agent i over the capacity events for resource 1 until time instant t_k (see Equation (4)), and x̄_2i(t_k) denotes the average allocation of agent i over the capacity events for resource 2 until time instant t_k. Note that, for simplicity of notation, we consider two resources here.
For i = 1, 2, . . ., n, let D_1,i(t_k) denote the set of partial derivatives of the cost function of agent i up to the k_1'th capacity event of resource 1 (occurring until time instant t_k); we define it as

D_1,i(t_k) = { (∂f_i/∂x_1i)(x̄_1i(t_l), x̄_2i(t_l)) : l = 1, 2, . . ., k_1 }.

Additionally, let D_2,i(t_k) denote the set of partial derivatives of the cost function of agent i up to the k_2'th capacity event of resource 2 (occurring until time instant t_k); we define it as

D_2,i(t_k) = { (∂f_i/∂x_2i)(x̄_1i(t_l), x̄_2i(t_l)) : l = 1, 2, . . ., k_2 }.

Let ∆q_1i denote the sensitivity for resource 1, and let ∆q_2i denote the sensitivity for resource 2. We define the p-norm sensitivity as follows.
Definition III.1 (p-norm Sensitivity). For i = 1, 2, . . ., n, let q_1i(l) = (∂f_i/∂x_1i)(x̄_1i(t_l), x̄_2i(t_l)) denote the partial derivative of agent i for resource 1 at the l'th capacity event, and let q_2i(l) be defined analogously for resource 2. For p ∈ N, the p-norm sensitivity metric for agent i for resource 1 is defined as

∆q_1i = max_{1 ≤ l < k_1} ( |q_1i(l + 1) − q_1i(l)|^p )^{1/p}.

Analogously, the p-norm sensitivity metric ∆q_2i for resource 2 is defined.
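A hedged sketch of the sensitivity computation, interpreting the metric as the largest p-norm distance between partial derivatives at consecutive capacity events (Section V computes it this way with p = 2); the function name and sample values are illustrative.

```python
def p_norm_sensitivity(derivs, p=2):
    """p-norm sensitivity of a sequence of scalar partial-derivative values
    observed at consecutive capacity events: the largest p-norm difference
    between successive elements.  For scalars, every p yields the maximum
    absolute difference."""
    return max(abs(a - b) ** p for a, b in zip(derivs[1:], derivs[:-1])) ** (1.0 / p)

# Hypothetical derivative history for one agent and one resource.
delta_q = p_norm_sensitivity([1.0, 1.5, 1.4])   # max(|0.5|, |-0.1|) = 0.5
```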
For agent i, i = 1, 2, . . ., n, we define local ϵ_i-differential privacy as follows.

Definition III.2 (Local ϵ_i-differential privacy). Let f_i : R^2_+ → R_+ be the cost function of agent i. Let the set D_i be the input values to the privacy mechanism, defined as D_i = D_1,i(t_k) ∪ D_2,i(t_k). Let S be the sample space. Furthermore, let M_qi : S × D_i → R be a privacy mechanism, and let q_i : D_i → R be the query on D_i. Then, for ϵ_i > 0, all input values d, d′ ∈ D_i, and output values η_i ∈ R, if the following holds,

P(M_qi(d) = η_i) ≤ exp(ϵ_i) P(M_qi(d′) = η_i),

then M_qi is called an ϵ_i-local differential privacy mechanism.
Here, ϵ i is the privacy loss bound of agent i.
Definition III.3 (Local (ϵ_i, δ_i)-differential privacy). Let f_i : R^2_+ → R_+ be the cost function of agent i. Let the set D_i be the input values to the mechanism, defined as D_i = D_1,i(t_k) ∪ D_2,i(t_k). Let S be the sample space. Furthermore, let M_qi : S × D_i → R be a privacy mechanism, and let q_i : D_i → R be the query on D_i. Then, for ϵ_i, δ_i > 0, all input values d, d′ ∈ D_i, and all output values η_i ∈ R, if the following holds,

P(M_qi(d) = η_i) ≤ exp(ϵ_i) P(M_qi(d′) = η_i) + δ_i,

then M_qi is called an (ϵ_i, δ_i)-local differential privacy mechanism.
Here, ϵ_i and δ_i are the privacy loss bounds; δ_i denotes the probability that the output of the mechanism M_qi varies by more than a multiplicative factor of exp(ϵ_i) when applied to the data-sets D_i. Smaller values of δ_i and ϵ_i imply that more privacy is preserved. Note that for δ_i = 0, the mechanism preserves ϵ_i-local differential privacy.
Recall that the average allocation is calculated as in (4); let the vector x̄ ∈ (R^n_+)^m denote (x̄_11, . . ., x̄_mn). Also recall that the solution to federated optimization Problem II.1 is denoted by x* = (x*_11, . . ., x*_mn). In this paper, we develop a local differentially private AIMD algorithm, the LDP-AIMD algorithm. LDP-AIMD solves the federated optimization Problem II.1 with close-to-optimal values and provides a privacy guarantee on the partial derivatives of the cost functions of the agents. That is, for two resources in the system, j = 1, 2 and k_j ∈ N, we obtain

x̄_ji(k_j) ≈ x*_ji as k_j → ∞,

with a (local) differential privacy guarantee on the partial derivatives of the cost function of each agent and with negligible communication overhead.

IV. THE LDP-AIMD ALGORITHM
This section presents the local differentially private additive-increase multiplicative-decrease (LDP-AIMD) algorithm.
As we discussed earlier, in the deterministic AIMD algorithm (described in Section II-C), adversaries may infer the actual resource allocations of agents and may also infer the partial derivatives of an agent's cost function. Although the agents make their decisions in the deterministic AIMD algorithm based on their local computations, they remain prone to privacy breaches. The LDP-AIMD algorithm bridges this gap: it provides a strong privacy guarantee on the partial derivatives of the agents' cost functions, and hence on their cost functions, and obtains a close-to-optimal social cost.

A. LDP-AIMD ALGORITHM FOR MULTIPLE RESOURCES
In the LDP-AIMD algorithm, each agent works as a curator and randomizes the partial derivatives of its cost function. The LDP-AIMD algorithm has two phases similar to the classical AIMD algorithm: additive increase (AI) and multiplicative decrease (MD). We denote the additive increase factor by α_j > 0, the multiplicative decrease factor by 0 ≤ β_j < 1, and the normalization factor by Γ_j > 0, for j = 1, 2, . . ., m. Each agent runs its LDP-AIMD algorithm to demand shared resources. Additionally, we consider a central server that keeps track of the aggregate allocation of resources and broadcasts a one-bit capacity event notification when the aggregate demand reaches the capacity of the resource. The central server initializes the parameters Γ_j, α_j, and β_j with the desired values and sends them to agents when they join the network. The algorithm works as follows: after joining the network, agents start increasing their demands for shared resource j linearly by the additive increase factor α_j > 0 until they receive a one-bit capacity event notification from the central server. The central server broadcasts this notification when the aggregate resource demand reaches resource capacity. We illustrate the system diagram in Figure 1.
For discrete time steps ν = 0, 1, 2, . . ., we describe the AI phase as

x_ji(ν + 1) = x_ji(ν) + α_j.

After receiving the capacity event notification from the central server, an agent decreases its demands using the multiplicative decrease factor 0 ≤ β_j < 1, the average resource allocations, and the noisy scaling factor λ̃_ji. The noisy scaling factor λ̃_ji(k_j) is calculated using the partial derivatives of the cost function f_i of agent i and noise drawn from a certain probability distribution. Let d_ji(k_j) be a random variable denoting the noise added to the partial derivative of the cost function of agent i for resource j; it is drawn from a certain probability distribution at the k_j'th capacity event. At the k_j'th capacity event, for i = 1, 2, . . ., n and j = 1, 2, . . ., m, we obtain λ̃_ji(k_j) as follows:

λ̃_ji(k_j) = Γ_j · ( (∂f_i/∂x_ji)(x̄_1i(k_j), . . ., x̄_mi(k_j)) + d_ji(k_j) ) / x̄_ji(k_j).    (11)

The value of the normalization factor Γ_j is chosen such that 0 < λ̃_ji(k_j) ≤ 1.
Let us suppose that the k_j'th capacity event occurs at time step ν. Additionally, let S_j(ν) ∈ {0, 1} denote the capacity event notification for resource j, j = 1, 2, . . ., m; it is updated to S_j(k_j) = 1 when the k_j'th capacity event occurs. The multiplicative decrease phase is formulated as follows:

x_ji(ν + 1) = (λ̃_ji(k_j) β_j + 1 − λ̃_ji(k_j)) x_ji(ν).    (12)

After decreasing their resource demands, the agents again start to increase their demands linearly until they receive the next capacity event notification; this process is repeated over time. By following this procedure, agents achieve privacy guarantees on their partial derivatives, and their long-term average allocations reach close to the optimal allocations. The differentially private AIMD algorithm for the agents is presented in Algorithm 2, and the algorithm of the central server is presented in Algorithm 1. The LDP-AIMD algorithm provides privacy guarantees to agents, the average allocations asymptotically reach close to the optimal values, and the system incurs negligible communication overhead. Note that the more noise is added to the partial derivatives of the cost functions, the more privacy is guaranteed, but the average allocations converge farther from the optimal point, and the solution's efficiency decreases.
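Putting the noisy scaling factor (11) and the decrease step (12) together, one multiplicative-decrease update might look as follows. The Laplace sampler, the clipping of λ̃ into (0, 1], and all numeric values are our illustrative assumptions; the paper instead chooses Γ_j so that λ̃_ji stays in (0, 1].

```python
import math
import random

def laplace_noise(scale, rng=random):
    """Draw Laplace(0, scale) noise via inverse-CDF sampling
    (Python's standard library has no Laplace sampler)."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def ldp_md_step(x, avg_alloc, partial_deriv, beta, gamma, noise):
    """Noisy scaling factor (Eq. (11)) followed by the multiplicative
    decrease (Eq. (12)).  The clip into (0, 1] stands in for the paper's
    requirement that Gamma_j keeps the factor in that range."""
    lam = gamma * (partial_deriv + noise) / avg_alloc
    lam = min(max(lam, 1e-12), 1.0)
    return (lam * beta + (1.0 - lam)) * x

random.seed(0)
x_next = ldp_md_step(x=3.0, avg_alloc=2.0, partial_deriv=4.0,
                     beta=0.7, gamma=0.1, noise=laplace_noise(scale=0.5))
```

Because the factor λ̃β_j + (1 − λ̃) always lies in [β_j, 1), the noisy update still strictly decreases the demand at a capacity event, so the AIMD sawtooth is preserved under randomization.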
A similar formulation for the expected value of agents' opinions is used in [49] in the context of social dynamical systems.
We state the following asymptotic upper bound for the communication complexity of the algorithm.
Remark IV.1 (Communication complexity). For a multi-agent system with m resources, the communication complexity in the worst-case scenario is O(m) bits at a time step. Moreover, if the algorithm converges in K time steps, the communication complexity is O(mK) bits.
Let ∆q_ji denote the sensitivity of the partial derivatives of the cost function of agent i for resource j. We now present the privacy guarantees of the LDP-AIMD algorithm for noise drawn from the Laplacian and Gaussian distributions.

B. NOISE DRAWN FROM LAPLACIAN DISTRIBUTION
In this subsection, we obtain the result for the Laplace distribution. Agent i adds the noise d_ji(t_k) to the partial derivatives of its cost function to calculate its scaling factors (see (11)). The noise is drawn from the Laplacian distribution with probability density function

p_ji(z) = (1 / (2∆q_ji/ϵ_ji)) exp( −|z| / (∆q_ji/ϵ_ji) ),

with location 0 and scale parameter ∆q_ji/ϵ_ji, for ϵ_ji > 0.
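A minimal sketch of the Laplace mechanism with scale ∆q_ji/ϵ_ji as described above; the sampler uses inverse-CDF sampling since Python's standard library has no Laplace draw, and the query value, sensitivity, and ϵ below are illustrative.

```python
import math
import random
import statistics

def laplace_mechanism(value, sensitivity, epsilon, rng=random):
    """epsilon-local-DP release of a scalar query: add Laplace noise with
    location 0 and scale Delta-q / epsilon."""
    scale = sensitivity / epsilon
    u = rng.random() - 0.5
    return value - scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

random.seed(42)
# Release the same (hypothetical) derivative value many times to inspect the
# noise: Laplace(0, b) has mean 0 and variance 2*b**2; here b = 1.0/0.5 = 2.
samples = [laplace_mechanism(0.0, sensitivity=1.0, epsilon=0.5)
           for _ in range(100_000)]
```

Smaller ϵ means a larger scale ∆q/ϵ, i.e. more noise and stronger privacy, which is the trade-off the experiments in Section V quantify.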
For two resources in the system, agent i obtains an (ϵ_1i + ϵ_2i)-differential privacy guarantee, as stated in the following result.
Theorem IV.2. Let D_i be the data universe of the partial derivatives of the cost function f_i of agent i. Let M_q1i and M_q2i denote the privacy mechanisms for resources 1 and 2, respectively, of agent i. Let A_i denote the privacy mechanism for the coupled resources. Moreover, for two resources in the multi-agent system, let η_1i, η_2i ∈ R denote the outputs and ϵ_1i, ϵ_2i > 0 the privacy loss bounds of the privacy mechanisms M_q1i and M_q2i, respectively. Additionally, let d_1i and d_2i represent noise drawn from the Laplace distribution with location 0 and variance 2(∆q_ji/ϵ_ji)^2 used in the privacy mechanisms; then the coupled privacy mechanism A_i is (ϵ_1i + ϵ_2i)-differentially private.

Proof. It is presented in the Appendix.

C. NOISE DRAWN FROM GAUSSIAN DISTRIBUTION
This subsection presents the results for Gaussian noise added to an agent's partial derivatives of its cost function to calculate its scaling factor (see (11)). The noise d_ji(t_k) is drawn from the Gaussian distribution with probability density function

p_ji(z) = (1 / (σ_ji √(2π))) exp( −z^2 / (2σ_ji^2) ),

with zero mean and standard deviation σ_ji ≥ (∆q_ji/ϵ_ji) √(2 ln(1.25/δ_ji)); agent i obtains an (ϵ_1i + ϵ_2i, δ_1i + δ_2i)-differential privacy guarantee, as we obtain in the following result.
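The standard-deviation calibration σ ≥ (∆q/ϵ)√(2 ln(1.25/δ)) can be checked numerically as a sketch; plugging in the sensitivities and privacy parameters reported in the experiments of Section V (∆q_1 = 1.32, ∆q_2 = 2.53, ϵ = 0.2, δ = 0.01) reproduces the reported values σ_1 ≈ 20.50 and σ_2 ≈ 39.31.

```python
import math

def gaussian_sigma(sensitivity, epsilon, delta):
    """Smallest standard deviation satisfying the Gaussian-mechanism bound
    sigma >= (Delta-q / epsilon) * sqrt(2 * ln(1.25 / delta))."""
    return (sensitivity / epsilon) * math.sqrt(2.0 * math.log(1.25 / delta))

sigma1 = gaussian_sigma(1.32, 0.2, 0.01)   # ~20.51 for resource 1
sigma2 = gaussian_sigma(2.53, 0.2, 0.01)   # ~39.31 for resource 2
```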
Theorem IV.3. Let D_i be the data-sets of the partial derivatives of the cost function f_i of agent i. Let M_q1i and M_q2i denote the privacy mechanisms for resources 1 and 2, respectively, of agent i. Let A_i denote the privacy mechanism for the coupled resources. Moreover, for two resources in the multi-agent system, let η_1i, η_2i ∈ R denote the outputs and ϵ_1i, ϵ_2i > 0 the privacy loss bounds of the privacy mechanisms M_q1i and M_q2i, respectively. Additionally, let d_1i and d_2i be noise drawn from the Gaussian distribution with mean zero and standard deviations σ_1i and σ_2i, respectively, used in the privacy mechanisms; then the coupled privacy mechanism A_i is (ϵ_1i + ϵ_2i, δ_1i + δ_2i)-differentially private.
Proof. The proof sketch is presented in the Appendix.
We make the following remark on adding the noise to the additive increase phase.
Remark IV.4. We expect similar results when noise is added in the additive increase phase; however, we expect slower convergence. Furthermore, as our algorithm aims to preserve the privacy of the partial derivatives of the cost functions of the agents, we add noise in the multiplicative decrease phase only.

V. RESULTS
In this section, we describe the experimental setup for the LDP-AIMD algorithm and present the results. The results show that, with appropriate noise levels, the long-term average allocations reach close to the optimal values with a privacy guarantee on the derivatives of the cost functions of the agents. Furthermore, we analyze the trade-off between privacy loss and the algorithm's accuracy: the more noise we add, the better the privacy, but the lower the accuracy.

A. SETUP
We consider six agents (the number is an arbitrary choice) that share two resources with capacities C_1 = 5 and C_2 = 6. The agents initialize their allocations as x_1i(0) = x_2i(0) = 0, for i = 1, 2, . . ., 6. We chose the additive increase factors α_1 = 0.01 and α_2 = 0.0125, and the multiplicative decrease factors β_1 = 0.70 and β_2 = 0.6. Additionally, we chose the normalization factors Γ_1 = Γ_2 = 1/1000. To create different cost functions, we use uniformly distributed integer random variables a_i ∈ [10, 30], b_i ∈ [15, 35], h_i ∈ [10, 20], and g_i ∈ [15, 25], for i = 1, 2, . . ., 6. The cost functions are listed in (13). Agents 1 and 2 demand the resources using the cost function in (13)(i); analogously, agents 3 and 4 use the cost function in (13)(ii), and agents 5 and 6 use the cost function in (13)(iii). Recall that, for agent i, ∆q_1i and ∆q_2i represent the sensitivities of the partial derivatives of the cost function with respect to resource 1 and resource 2, respectively. For a fixed time step t_k, let k_1 capacity events of resource 1 and k_2 capacity events of resource 2 occur until t_k. We add the same level of noise for all agents sharing a resource; to do so, we calculate the maximum sensitivity until time step t_k for resource 1 as ∆q_1(t_k) = max{∆q_11(t_k), . . ., ∆q_1n(t_k)}, and analogously for resource 2 as ∆q_2(t_k) = max{∆q_21(t_k), . . ., ∆q_2n(t_k)}.
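The per-resource maximum sensitivity ∆q_j(t_k) = max_i ∆q_ji(t_k) used above can be sketched as follows; the derivative histories are hypothetical, and we interpret each per-agent sensitivity as the largest p-norm difference between consecutive partial derivatives (as in Section V-B's l_2-norm computation).

```python
def max_sensitivity(per_agent_derivs, p=2):
    """Delta-q_j(t_k) = max over agents i of Delta-q_ji(t_k), so the same
    noise level can be used for every agent sharing resource j."""
    def per_agent(derivs):
        # largest p-norm difference between consecutive derivative values
        return max(abs(a - b) ** p
                   for a, b in zip(derivs[1:], derivs[:-1])) ** (1.0 / p)
    return max(per_agent(d) for d in per_agent_derivs)

# Hypothetical derivative histories for two agents on one resource.
dq = max_sensitivity([[1.0, 1.5, 1.4],     # agent 1: sensitivity 0.5
                      [2.0, 2.9, 3.0]])    # agent 2: sensitivity 0.9
```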

B. MAIN RESULTS
We now present experimental results to demonstrate the efficacy of the LDP-AIMD algorithm. With Gaussian noise added to the partial derivatives of the agents' cost functions, each agent obtains an (ϵ_1i + ϵ_2i, δ_1i + δ_2i)-differential privacy guarantee; with Laplacian noise, each agent obtains an (ϵ_1i + ϵ_2i)-differential privacy guarantee. In both cases, the agents' long-term average allocations are close to the optimal values.

1) With Gaussian noise:
This subsection presents the results obtained by adding Gaussian noise with mean zero and standard deviation σ_ji, j = 1, 2, to the partial derivatives of the agents' cost functions. In this case, the sensitivity is calculated as the l_2-norm of the difference of consecutive partial derivatives of the cost functions. The standard deviation for agent i is calculated as σ_1i = (∆q_1i/ϵ_1i)√(2 ln(1.25/δ_1i)) for resource 1 and, similarly, as σ_2i = (∆q_2i/ϵ_2i)√(2 ln(1.25/δ_2i)) for resource 2. For fixed δ_1i = δ_1u = δ_1 and σ_1i = σ_1u = σ_1, and the sensitivity ∆q_1 = max{∆q_11, ..., ∆q_1n}, all agents have the same privacy loss, ϵ_1i = ϵ_1u = ϵ_1, for i, u = 1, 2, ..., 6. Similarly, for fixed δ_2i = δ_2u = δ_2 and σ_2i = σ_2u = σ_2, and the sensitivity ∆q_2 = max{∆q_21, ..., ∆q_2n}, we have ϵ_2i = ϵ_2u = ϵ_2, for i, u = 1, 2, ..., 6. In the experiment, Gaussian noise with mean zero and standard deviation σ_1 is added to the partial derivatives of the cost functions with respect to resource 1; analogously, Gaussian noise with mean zero and standard deviation σ_2 is added to the partial derivatives with respect to resource 2.
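The standard-deviation formula above is the standard Gaussian-mechanism calibration and can be evaluated directly; a minimal sketch (the helper name `gaussian_sigma` is ours):

```python
import math

def gaussian_sigma(sensitivity, epsilon, delta):
    """Gaussian-mechanism calibration:
    sigma = (sensitivity / epsilon) * sqrt(2 * ln(1.25 / delta))."""
    return (sensitivity / epsilon) * math.sqrt(2.0 * math.log(1.25 / delta))

# the experiment's resource-2 setting: sensitivity 2.53, epsilon 0.2, delta 0.01
print(round(gaussian_sigma(2.53, 0.2, 0.01), 2))  # → 39.31
```

This reproduces the standard deviations used in the experiment below (σ_2 = 39.31, and likewise σ_1 ≈ 20.5 for resource 1).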
For resource 1, we chose δ_1 = 0.01 with l_2-norm sensitivity ∆q_1 = 1.32 and Gaussian-noise standard deviation σ_1 = 20.50 to obtain the privacy loss ϵ_1 = 0.2. Likewise, for resource 2, we chose δ_2 = 0.01 with l_2-norm sensitivity ∆q_2 = 2.53 and Gaussian-noise standard deviation σ_2 = 39.31 to obtain the privacy loss ϵ_2 = 0.2. As the random variable (noise) d_ji(k_j) follows a Gaussian distribution with σ_ji(k_j) = (∆q_ji(k_j)/ϵ_ji)√(2 ln(1.25/δ_ji)), the mechanism is (ϵ_1i + ϵ_2i, δ_1i + δ_2i)-local differentially private (see Theorem 1, Section 4). The initial few partial derivatives are not considered while calculating the maximum sensitivities because they depend on the initialization states. Figure 2 illustrates the evolution of average allocations of resources for a few selected agents. We observe in Figure 2 that the average allocations of resources reach close to the optimal values over time with the chosen privacy losses and sensitivities; the noise is drawn from the Gaussian distribution. For comparison with the optimal values, we solved the optimization problem using a solver; the optimal values are denoted by the dashed straight lines of the respective colors in Figure 2.
Figure 3 presents the evolution of the partial derivatives of the cost functions with respect to a resource as shaded error bars. We observe that the partial derivatives gather closer to each other over time; that is, the error values decrease over time. Thus, the partial derivatives of the cost functions with respect to a resource reach a consensus over time, and hence the long-term average allocations are close to the optimal values with the chosen privacy losses and sensitivities. Figure 5 shows the corresponding results with Gaussian noise of different standard deviations; note that the larger the standard deviation, the larger the Gaussian noise. The absolute difference between the long-term average allocations and the optimal allocations increases as the standard deviation increases, and the errors in the shaded error bars of the partial derivatives also increase. Thus, the long-term average allocations move farther from the optimal values as the noise increases.
Over time, the difference between consecutive partial derivatives of the cost functions decreases; in Figure 4, we observe that the l_2-sensitivities of the resources converge over time. Moreover, Figure 6(a) shows the ratio of the total cost obtained by the LDP-AIMD algorithm at the last capacity event of the experiment to the total optimal cost, evaluated with different standard deviations over several experiments. As the standard deviation, and thus the noise, increases, the ratio of the total costs increases; hence, privacy improves but accuracy degrades.
The dependence of the ratio of the total cost obtained by the algorithm at the last capacity events of the experiment to the total optimal cost on the privacy parameters ϵ_1, ϵ_2 and δ_1, δ_2 is illustrated in Figure 6(b). We observe that, keeping δ_1 fixed, when ϵ_1 is small (better privacy), the ratio of the costs is larger and hence less accuracy is provided (analogously for δ_2 and ϵ_2). Furthermore, beyond a particular value of ϵ_1 and ϵ_2, the ratio of the costs changes little as ϵ_1 and ϵ_2 increase; we observe that values of ϵ_1 and ϵ_2 greater than 0.1 change the ratio of the costs very little. Similarly, when the values of δ_1 and δ_2 are small, the ratio of the costs is large; however, beyond a particular threshold, the ratio of the costs does not change much as δ_1 and δ_2 increase.

FIGURE 8: Evolution of the absolute difference between the average allocations and the optimal allocations of selected agents, and the evolution of the partial derivatives of the cost functions with Laplacian noise, l_1-sensitivities ∆_1 = 5.9 and ∆_2 = 6.34, and privacy budgets ϵ_1 = ϵ_2 = 0.1.

2) With Laplacian noise:
This subsection presents the results for (ϵ_1i + ϵ_2i)-local differential privacy (see Theorem 2, Section 4) obtained by adding Laplacian noise to the partial derivatives of the agents' cost functions. In this case, the sensitivity is calculated as the l_1-norm of the difference of consecutive partial derivatives of the cost functions. Over time, the difference between consecutive partial derivatives decreases, as illustrated in Figure 7; that is, the l_1-sensitivities of the resources converge over time. Figure 8 illustrates the evolution of the average allocation of resources and the evolution of shaded error bars of the partial derivatives of the agents' cost functions with added Laplacian noise. As in the Gaussian case, the absolute difference between the long-term average allocations and the optimal allocations increases as the noise scale increases, and the errors in the shaded error bars of the partial derivatives also increase; the long-term average allocations move farther from the optimal values as the noise increases. The evolution of the communication cost in bits with Laplacian noise is illustrated in Figure 9; it is the number of capacity events over the time steps. We observe that the communication cost increases (approximately) linearly with the time steps, in line with our observation that the longer the time to convergence, the higher the communication cost.
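The Laplace perturbation can be sketched as follows; `laplace_perturb` is a hypothetical helper of ours, and the scale b = ∆q/ϵ is the standard ϵ-LDP Laplace calibration:

```python
import math
import random

def laplace_perturb(value, sensitivity, epsilon, rng):
    """epsilon-LDP Laplace mechanism: add Laplace(0, b) noise with scale
    b = sensitivity / epsilon to a partial derivative before reporting it."""
    b = sensitivity / epsilon
    u = rng.random() - 0.5  # uniform on [-1/2, 1/2)
    # inverse-CDF sampling: X = -b * sgn(u) * ln(1 - 2|u|)
    noise = -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return value + noise

# the experiment's resource-1 setting: sensitivity 5.9, epsilon 0.1 → scale b = 59
rng = random.Random(0)
noisy_derivative = laplace_perturb(0.0, 5.9, 0.1, rng)
```

With this scale, the reported derivatives have mean absolute deviation b = 59, which matches the qualitative observation that smaller ϵ (more noise) pushes the average allocations farther from the optimal values.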

VI. RELATED WORK
This section briefly presents the related work on distributed resource allocation, distributed optimization, federated optimization, and differential privacy.

A. RESOURCE ALLOCATION
Divisible resource allocation is studied by Ghodsi et al. [27]; they propose a dominant resource fairness algorithm for allocating multiple divisible resources. Recently, Nguyen et al. [50] proposed a market-based algorithm to allocate multiple resources in fog computing. Furthermore, Fossati et al. [51] proposed a multi-resource allocation scheme for network slicing in the context of 5G networks. A distributed resource allocation scheme for 5G vehicle-to-vehicle communication was proposed by Alperen et al. [52], and an envy-free fair allocation mechanism for allocating multiple computing resources of servers was proposed in [53]. Additionally, interested readers can refer to Shang's work [54] on a distributed graph-theoretic opinion formation model for social dynamical systems; the model provides resilience to malicious and Byzantine actors in the network. A survey on the allocation of multiple resources can be found in [55]. An interesting work by Zhou et al. on nonlinear dynamic switching heterogeneous multi-agent systems can be found in [56]. They propose a leaderless approach to obtain consensus on position and velocity states. Their approach is communication efficient and does not require sharing auxiliary dynamic states to obtain consensus; however, some agent interactions are required. The major difference between our work and [56] is that their approach is leaderless, whereas we consider a central server that keeps track of the aggregate consumption of resources and sends notifications when the capacity constraint is reached. Furthermore, our approach provides differential privacy guarantees.

B. DISTRIBUTED OPTIMIZATION
Tsitsiklis and co-authors proposed the distributed optimization problem in their seminal work [57], [58]. To solve a distributed optimization problem, agents in a network need to share their states or multipliers with at least one neighbor, which may compromise the agents' privacy. Some of the consensus-based distributed optimization approaches are sub-gradient methods [59], [60], gossip algorithms for distributed averaging [61], [62], dual averaging [63], [64], and broadcast-based approaches [65], [66], to name a few. Moreover, privacy-preserving distributed optimization has been studied in [64], [67].

C. FEDERATED OPTIMIZATION
Federated learning is a distributed machine learning technique in which several agents (clients) collaborate to train a global model without sharing their local on-device data. Each agent updates the global model with its local dataset and parameters and shares the updates with the central server. The central server aggregates the agents' updates to update the global model [3], [4], [68]. The underlying optimization techniques are called federated optimization [1], [2].
One of the most popular and widely used federated optimization techniques is FederatedAveraging (FedAvg) by McMahan and co-authors [3], which is based on stochastic gradient descent. At each iteration, a fixed number of agents are selected at random from the network. Each selected agent performs stochastic gradient descent on its local data, starting from the global model, for a fixed number of epochs, the same for every agent, and then communicates its averaged gradients to the central server. The central server takes a weighted average of the agents' gradients and updates the global model. The process is repeated until the model is trained.
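The round structure just described can be illustrated with a toy scalar least-squares model; this is a simplified sketch under our own assumptions (function name, model, and all parameters are ours), not the algorithm of [3]:

```python
import random

def fedavg_round(w_global, client_data, lr=0.1, epochs=5, frac=0.5,
                 rng=random.Random(0)):
    """One FedAvg-style round for a toy scalar model w with per-point loss
    (w - y)^2 / 2: sample a fraction of clients, run local SGD starting from
    the global model, then average local models weighted by data size."""
    m = max(1, int(frac * len(client_data)))
    selected = rng.sample(range(len(client_data)), m)  # random client subset
    local_models, weights = [], []
    for c in selected:
        w = w_global
        for _ in range(epochs):
            for y in client_data[c]:
                w -= lr * (w - y)  # local SGD step; gradient is (w - y)
        local_models.append(w)
        weights.append(len(client_data[c]))
    # server: data-size-weighted average of the local models
    return sum(w * n for w, n in zip(local_models, weights)) / sum(weights)

# four toy clients whose data means differ; the global model settles
# between the client optima as rounds proceed
data = [[1.0] * 10, [2.0] * 10, [3.0] * 10, [4.0] * 10]
w = 0.0
for _ in range(25):
    w = fedavg_round(w, data)
```

Because only a random subset of clients participates each round, the global model hovers around the average of the sampled clients' optima rather than converging to a single point, which is the behavior that variants like FedProx and SCAFFOLD (discussed below) aim to control.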
Another federated optimization technique is FedProx [69], a generalization of FedAvg. In FedProx, the number of local epochs is not fixed across iterations as in FedAvg; it varies, and partial updates by agents are averaged to update the global model. FedProx is designed for heterogeneous devices and datasets, i.e., data that are not independent and identically distributed (non-IID), and the authors provide convergence guarantees. Sattler and co-authors [70] also proposed a federated optimization technique for non-IID data. A few other federated optimization techniques are FedNova [71], SCAFFOLD [72], Overlap-FedAvg [73], federated composite optimization [74], and federated adaptive optimization [1]. Interested readers can refer to [4], [5] and the papers cited therein for a detailed discussion of advances and future research directions in federated optimization and federated learning.

D. DIFFERENTIAL PRIVACY
Han et al. [15] developed a differentially private distributed algorithm to allocate divisible resources. They consider convex and Lipschitz continuously differentiable cost functions, and noise is added to the constraints of the optimization problem. In another work on differentially private distributed algorithms, Huang et al. [75] also consider convex cost functions; however, noise is added to the cost functions of the agents. Olivier et al. [16] also developed distributed differentially private algorithms for the optimal allocation of resources. Differential privacy in the context of dynamical systems is studied in [76], [77], and [78]; interested readers may refer to the recent books by Le Ny [79] and Farokhi [80] for further details. Fioretto et al. [81] developed a differentially private mechanism for Stackelberg games. Duchi et al. developed the local differential privacy mechanism [82]. Furthermore, Dobbe et al. [25] proposed a local differential privacy mechanism to solve convex distributed optimization problems, and Chen et al. [83] proposed local and shuffle differentially private models.

VII. CONCLUSION
We developed a local differentially private additive-increase and multiplicative-decrease (LDP-AIMD) multi-resource allocation algorithm for federated settings in which inter-agent communication is not required. A central server aggregates the resource demands of all agents and broadcasts a one-bit notification in the network when a capacity constraint is violated; the algorithm thus incurs little communication overhead. Agents in the network receive close-to-optimal allocations with differential privacy guarantees on the partial derivatives of their cost functions. We presented numerical results to show the efficacy of the developed algorithm, along with an analysis of different privacy parameters and the algorithm's accuracy. The communication complexity of the algorithm is independent of the number of agents in the network: for m shared resources, in the worst case, m bits are required to communicate the capacity constraint notifications. The work can be extended to other privacy models, for example, the shuffle model. It would be interesting to establish theoretical results on the convergence and the rate of convergence of the LDP-AIMD algorithm, and to implement the developed algorithm in real-world applications.

We obtain the following for resource 1: let k_1 capacity events for resource 1 and k_2 capacity events for resource 2 occur until time instant t_k. As the noise is drawn from the Laplace distribution and ||f^o_1,i(t_k) − f'^o_1,i(t_k)||_1 denotes the l_1-sensitivity ∆q_1i(t_k), we obtain the corresponding privacy bound; similarly for resource 2. For the coupled privacy mechanism (proof of Theorem IV.3): for fixed j and i, the proof of (ϵ_ji, δ_ji)-differential privacy is similar to the proof of Theorem A.1 of [12]; we then use the joint probability for multiple resources and follow steps similar to the proof of Theorem 4.1.
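The Laplace-mechanism bound invoked above is the standard one; a sketch in the paper's notation, for resource 1, a single capacity event, and a reported value y (with noise scale ∆q_1i(t_k)/ϵ_1i):

```latex
\frac{\Pr\!\left[f^{o}_{1,i}(t_k) + d_{1i} = y\right]}
     {\Pr\!\left[f'^{o}_{1,i}(t_k) + d_{1i} = y\right]}
= \exp\!\left(\frac{\epsilon_{1i}}{\Delta q_{1i}(t_k)}
    \left(\bigl|y - f'^{o}_{1,i}(t_k)\bigr| - \bigl|y - f^{o}_{1,i}(t_k)\bigr|\right)\right)
\le \exp\!\left(\frac{\epsilon_{1i}\,\bigl\|f^{o}_{1,i}(t_k) - f'^{o}_{1,i}(t_k)\bigr\|_{1}}
                    {\Delta q_{1i}(t_k)}\right)
\le \exp(\epsilon_{1i}),
```

where the first inequality uses the triangle inequality and the second uses the definition of the l_1-sensitivity; composing the analogous bound for resource 2 gives the (ϵ_1i + ϵ_2i) guarantee.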

FIGURE 1: The local differentially private AIMD (LDP-AIMD) system diagram for multi-resource allocation. Here, C.E. represents the broadcast of a one-bit capacity event notification in the network.

FIGURE 2: Evolution of average allocations for selected agents: (a) Gaussian noise with mean zero and standard deviation 20.5 added to the partial derivatives of the cost functions for resource 1; (b) Gaussian noise with mean zero and standard deviation 39.31 added to the partial derivatives of the cost functions for resource 2. Dashed straight lines show the optimal values (obtained by the solver).

FIGURE 3: (a) Evolution of derivatives of cost functions with respect to resource 1 of all agents; (b) evolution of derivatives of cost functions with respect to resource 2 of all agents, for a single simulation. The noise is drawn from the Gaussian distribution.

FIGURE 5: Evolution of the average allocation of resources and shaded error bars of partial derivatives of the cost functions of agents with added Gaussian noise of different standard deviations.

FIGURE 6: The ratio of the total cost obtained by the LDP-AIMD algorithm at the last capacity events to the total optimal cost, (a) evaluated with different standard deviations σ_1 and σ_2; (b) evaluated at different values of the privacy parameters ϵ_1 and ϵ_2. Here, K_1 denotes the experiment's last capacity event for resource 1, and K_2 denotes the last capacity event for resource 2.