The Communication Value of a Quantum Channel

There are various ways to quantify the communication capabilities of a quantum channel. In this work we study the communication value (cv) of a channel, which describes the optimal success probability of transmitting a randomly selected classical message over the channel. The cv also admits a dual interpretation as the classical communication cost for zero-error channel simulation using non-signaling resources. We first provide an entropic characterization of the cv as a generalized conditional min-entropy over the cone of separable operators. Additionally, the logarithm of a channel's cv is shown to be equivalent to its max-Holevo information, which can further be related to channel capacity. We evaluate the cv exactly for all qubit channels and for the Werner-Holevo family of channels. While the cv of every classical channel is multiplicative under tensor product, this is no longer true for quantum channels in general. We provide a family of qutrit channels for which the cv is non-multiplicative. On the other hand, we prove that any pair of qubit channels has multiplicative cv when used in parallel. Even more strongly, all entanglement-breaking channels and the partially depolarizing channel are shown to have multiplicative cv when used in parallel with any channel. We then turn to the entanglement-assisted cv and prove that it is equivalent to the conditional min-entropy of the Choi matrix of the channel. A final component of this work investigates relaxations of the channel cv to other cones, such as the set of operators having a positive partial transpose (PPT). The PPT cv is analytically and numerically investigated for well-known channels such as the Werner-Holevo family and the dephrasure family of channels.


I. INTRODUCTION
A noisy communication channel prohibits perfect transmission of messages from the sender (Alice) to the receiver (Bob). While there are a number of ways to quantify the noise of a channel, perhaps the simplest is in terms of a guessing game. Suppose that with uniform probability Alice randomly chooses a channel input and sends it to Bob over the channel. Based on the channel output, Bob tries to guess Alice's input with the greatest probability of success. In this game, Bob's optimal strategy is to perform maximum likelihood estimation based on the channel output, and his optimal performance is captured by the channel's communication value,

cv(P) := Σ_y max_{x∈[n]} P(y|x). (1)

It is then straightforward to see that (1/n)cv(P) is the largest success probability of correctly identifying the input x based on the output y, when x is drawn uniformly from [n]. The quantity cv(P) is thus a natural measure for how well a channel P transmits data on the single-copy level. The goal of this paper is to better understand the channel cv in different communication settings.
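As a concrete aid, Eq. (1) is straightforward to evaluate from a channel's transition matrix. The following is a minimal sketch in plain Python; the binary symmetric channel used below is an illustrative choice, not an example taken from the text:

```python
# cv of a classical channel: cv(P) = sum_y max_x P(y|x), as in Eq. (1).
# P is given as a list of rows P[x][y] of conditional probabilities.

def cv_classical(P):
    n_inputs, n_outputs = len(P), len(P[0])
    return sum(max(P[x][y] for x in range(n_inputs)) for y in range(n_outputs))

# Binary symmetric channel with flip probability 0.1:
bsc = [[0.9, 0.1],
       [0.1, 0.9]]
print(cv_classical(bsc))   # 1.8, i.e. optimal guessing probability 1.8/2 = 0.9
```

For the noiseless channel on n symbols every column maximum equals 1, so cv = n and the guessing probability is 1, consistent with the interpretation above.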
The channel cv also emerges in the problem of zero-error channel simulation [1]–[4]. In the general task of channel simulation, Alice and Bob attempt to generate one channel P using another channel Q combined with pre- and post-processing [5]–[9]. Interesting variations of this problem arise when different types of resources are used to coordinate the pre- and post-processing of Q. For example, these resources could be shared randomness [6], [10], shared quantum entanglement [11]–[15], or a non-signaling side-channel [1], [2], [4]. The latter refers to a general bipartite channel that prohibits communication from one party to the other. When Q = id_r is the identity map on [r], the goal is to perfectly simulate P using r noiseless messages from Alice to Bob, along with the auxiliary resource. For a given class of resources, the smallest number r needed to accomplish this simulation is called the communication cost of P (also referred to as the signaling dimension of P in Refs. [16], [17]). It turns out that cv(P) is a lower bound on the communication cost when Alice and Bob have access to shared randomness [17]. In fact, this lower bound is tight when Alice and Bob are allowed to use non-signaling resources [1]. Combining this discussion with the previous paragraph, we thus have two dual interpretations of the communication value: (1/n)cv(P) as an optimal guessing probability and cv(P) as an optimal simulation cost. This is no coincidence, since Eq. (1) is the dual formulation of the linear program characterizing the communication cost for perfectly simulating P using classical non-signaling resources (see Section VIII for more details).
The goal of this paper is to understand the communication value of quantum channels. Formally, a quantum channel is described by a completely positive trace-preserving (CPTP) map N mapping density operators ρ_A on Hilbert space H_A to density operators N(ρ_A) on Hilbert space H_B. Every quantum channel is able to generate a family of classical channels by encoding classical data into quantum objects. Namely, for each x ∈ [n], Alice prepares a quantum state ρ_x and sends it through the channel to Bob's side. Upon receiving N(ρ_x), Bob performs a quantum measurement, described by a general positive operator-valued measure (POVM) {Π_y}_{y∈[n']}, and regards his measurement outcome as the decoded classical data. The induced classical channel then has the form

P(y|x) = Tr[Π_y N(ρ_x)]. (2)

How noisy this channel is depends on the state encoding {ρ_x}_{x∈[n]} and the measurement decoding {Π_y}_{y∈[n']}, and ideally one chooses the states and measurement to minimize the error in data transmission. We define the cv of N in terms of the classical channels it can generate.
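To make Eq. (2) concrete, the sketch below builds the induced classical channel from a Kraus representation, an encoding, and a POVM. The dephasing channel and the computational-basis encoding/decoding are illustrative assumptions, not examples taken from the text:

```python
import numpy as np

# Induced classical channel P(y|x) = Tr[Pi_y N(rho_x)], cf. Eq. (2),
# for a qubit dephasing channel with computational-basis encoding/decoding.

def apply_channel(kraus, rho):
    return sum(K @ rho @ K.conj().T for K in kraus)

def induced_channel(kraus, states, povm):
    return np.array([[np.trace(E @ apply_channel(kraus, rho)).real
                      for E in povm] for rho in states])

ket0, ket1 = np.array([1., 0.]), np.array([0., 1.])
proj = lambda v: np.outer(v, v.conj())

dephasing = [proj(ket0), proj(ket1)]   # Kraus operators of full dephasing
states = [proj(ket0), proj(ket1)]      # encoding rho_x = |x><x|
povm = [proj(ket0), proj(ket1)]        # decoding measurement

P = induced_channel(dephasing, states, povm)
print(P)   # the 2x2 identity: this encoding/decoding pair is noiseless
```

Scanning over encodings and decodings of the induced channels and evaluating Eq. (1) on each is exactly the optimization that defines cv(N).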
Analogous to the classical case, (1/n)cv_{n→n}(N) quantifies the largest success probability attainable in an n-input guessing game using the channel N. The quantity cv(N) also has a dual interpretation as the classical communication cost for simulating any classical channel generated by N when Alice and Bob have access to non-signaling resources.
By taking multiple copies of the channel, one can consider the cv capacity, defined as CV(P) = lim_{k→∞} (1/k) log cv(P^{⊗k}) and CV(N) = lim_{k→∞} (1/k) log cv(N^{⊗k}) in the classical and quantum cases, respectively. It is not difficult to see that log cv(P) is an additive quantity, and so CV(P) = log cv(P). On the other hand, as we show below, log cv(N) is non-additive in general for quantum channels. A primary objective of this paper is to understand when additivity of log cv(N) (equivalently, multiplicativity of cv(N)) holds and when it does not. One of our main results is that multiplicativity always holds for qubit channels, whereas it does not for qutrits. One can also study the cv of channels that are enhanced by auxiliary resources shared between the sender and receiver. In the quantum setting, it is natural to consider entanglement-assisted channel communication, which we take up in Section VI.

This paper is structured as follows. We begin in Section II by introducing the notation used in this manuscript and reviewing some preliminary concepts. Section III takes a deeper dive into the definition of channel communication value and relates it to the geometric measure of entanglement and other information-theoretic quantities such as the conditional min-entropy. Section IV-A focuses on qubit channels and provides an analytic expression for the cv in terms of the correlation matrix of the channel's Choi matrix. The Werner-Holevo family of channels is introduced in Section IV-B and its cv is computed. The question of cv multiplicativity is taken up in Section V, with examples of both multiplicativity and non-multiplicativity being presented. Notably, the cv capacity is shown to take a single-letter form for entanglement-breaking channels, Pauli qubit channels, and the general depolarizing channel. Section VI introduces the notion of entanglement-assisted communication value and relates it to the conditional min-entropy of the Choi matrix.
Relaxations of the communication value are considered in Section VII, with a particular focus on the PPT communication value and the computable examples it supports. In Section VII-B we describe a procedure for numerically estimating the cv of a given channel, and we provide a link to our developed software package, which performs this estimation. Finally, Section VIII provides a discussion of our results as they relate to channel capacity and zero-error channel simulation.

II. NOTATION AND PRELIMINARIES
This paper considers exclusively finite-dimensional quantum systems, represented by Hilbert spaces H_A, H_B, etc. The collection of positive operators acting on Hilbert space H_A will be denoted by Pos(A), which consists of all Hermitian operators Herm(A) acting on A with a non-negative eigenvalue spectrum. The subset of these operators having unit trace constitutes the collection of density operators for system A, and we denote this set by D(A). We write ‖·‖_∞ and ‖·‖_1 to indicate the spectral and trace norms of elements in Pos(A), respectively. For bipartite systems, an operator Ω ∈ Pos(AB) is called separable if it can be expressed as a positive combination of product states, Ω = Σ_i p_i |α_i⟩⟨α_i| ⊗ |β_i⟩⟨β_i| with p_i ≥ 0, |α_i⟩⟨α_i| ∈ D(A), and |β_i⟩⟨β_i| ∈ D(B), and we let SEP(A : B) denote the set of all separable operators on systems A and B. Classical systems can be incorporated into this framework by demanding that the density matrix of every classical state be diagonal in a fixed basis. In general, we will label a classical system by X or Y.
Quantum channels provide the basic building blocks of any dynamical system. Mathematically, they are represented by CPTP maps, and we denote the set of CPTP maps from system A to B by CPTP(A → B). The set CPTP(A → B) is isomorphic to the subset of Pos(AB) consisting of operators whose reduced density operator on system A is the identity. Specifically, for every N ∈ CPTP(A → B) its Choi matrix is the associated operator J_N ∈ Pos(AB) given by

J_N := (id ⊗ N)(φ⁺),

where id is the identity map and φ⁺ = Σ_{i,j=1}^{d_A} |ii⟩⟨jj| is proportional to the normalized d_A-dimensional maximally entangled state, which we write as |Φ⁺⟩ = (1/√d_A) Σ_{i=1}^{d_A} |ii⟩. The fact that N is completely positive assures that J_N ≥ 0, and the trace-preserving condition means that Tr_B J_N = I_A, where I is the identity operator. On the other hand, if Tr_A J_N = I_B then N is a unital map, meaning that N(I_A) = I_B. More generally, we say a map is sub-unital if N(I_A) ≤ I_B.
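The Choi matrix and its trace-preserving condition can be checked numerically. Below is a sketch assuming a Kraus representation of the channel is available; the amplitude-damping parameters are an illustrative choice:

```python
import numpy as np

# Choi matrix J_N = (id (x) N)(phi+), with phi+ = sum_{ij} |ii><jj| unnormalized,
# so J_N = sum_{ij} |i><j| (x) N(|i><j|).

def choi(kraus, d_in):
    d_out = kraus[0].shape[0]
    J = np.zeros((d_in * d_out, d_in * d_out), dtype=complex)
    for i in range(d_in):
        for j in range(d_in):
            Eij = np.zeros((d_in, d_in), dtype=complex)
            Eij[i, j] = 1.0
            NEij = sum(K @ Eij @ K.conj().T for K in kraus)
            J += np.kron(Eij, NEij)
    return J

def partial_trace_B(J, d_in, d_out):
    # trace out the second (output) tensor factor
    return np.trace(J.reshape(d_in, d_out, d_in, d_out), axis1=1, axis2=3)

# Example: amplitude damping with decay probability 0.3 (illustrative choice)
g = 0.3
K0 = np.array([[1, 0], [0, np.sqrt(1 - g)]])
K1 = np.array([[0, np.sqrt(g)], [0, 0]])
J = choi([K0, K1], 2)
print(np.allclose(partial_trace_B(J, 2, 2), np.eye(2)))  # True: N is trace-preserving
```

Complete positivity shows up as J ≥ 0, which can likewise be checked from the eigenvalues of J.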
An important subclass of channels are known as entanglement-breaking (EB). These are characterized by the property that (N_{A→B} ⊗ id_C)(ρ_AC) ∈ SEP(B : C) for all ρ_AC ∈ Pos(AC). It is not difficult to see that N ∈ CPTP(A → B) is EB if and only if J_N ∈ SEP(A : B). For any subset S ⊂ Herm(A), we let S* := {Z ∈ Herm(A) | Tr[ZW] ≥ 0 for all W ∈ S} denote the dual cone of S. As a final bit of notation, we write exp(x) and log(x) to mean 2^x and log_2(x), respectively.

III. CHARACTERIZING THE COMMUNICATION VALUE
which follows from the fact that we can always trivially split a POVM to increase the number of outcomes, Π → (1/2)Π + (1/2)Π, and we can always enlarge the input set {ρ_x}_{x=1}^m by adding the same state ρ_x multiple times. Finally, we note that the optimization defining cv(N) stabilizes once the number of outcomes reaches d_B², where d_B is the dimension of the output system. This follows from the fact that any POVM on a d_B-dimensional system can always be decomposed into a convex combination of extremal POVMs, each with at most d_B² outcomes [20], and the cv can always be attained with one of these extremal measurements.
It is also not difficult to see that 1 ≤ cv(N) ≤ min{d_A, d_B} for any channel N. The lower bound holds by considering a constant encoding ρ_x = ρ (for all x), in which case Bob succeeds with probability 1/n by random guessing. Similarly, the upper bound follows from the inequalities (11), where N†: B → A is the adjoint map of N, so that {N†(Π_x)}_x will always be a valid POVM on Alice's system. Notice that when cv(N) = 1, Bob can do no better than randomly guessing Alice's input. The following proposition characterizes the type of channel for which this is the case. Proposition 1. cv(N) = 1 iff N is a replacer channel; i.e. there exists a fixed state σ such that N(ρ) = σ for all states ρ.

A. Communication Value via Conic Optimization
In general cv(N) is difficult to compute. This can be seen more explicitly by casting cv(N) as an optimization over the separable cone SEP(A : B), membership of which is NP-hard to decide [22]. Nevertheless, expressing cv(N) as an optimization over SEP(A : B) leads to computable upper bounds, since there are well-known relaxations of the set SEP(A : B) that are easier to handle analytically.
Proposition 2. For any channel N ∈ CPTP(A → B),

cv(N) = max{Tr[Ω_AB J_N] | Ω_AB ∈ SEP(A : B), Tr_A[Ω_AB] = I_B}. (12)

Proof. By Eq. (8) we have cv(N) = max Tr[(Σ_x ρ_x^T ⊗ Π_x) J_N], where Σ_x ρ_x^T ⊗ Π_x satisfies the conditions of Eq. (12). Conversely, any Ω_AB ∈ SEP(A : B) satisfying these conditions can be written as Ω_AB = Σ_x |ψ_x⟩⟨ψ_x|^T ⊗ ω_x with |ψ_x⟩ a pure state. The condition Tr_A[Ω_AB] = I_B implies that {ω_x}_x constitutes a POVM.
Note that strong duality holds for the conic program here, and so Proposition 2 can be cast in dual form as

cv(N) = min{Tr[σ_B] | I_A ⊗ σ_B − J_N ∈ SEP*(A : B)}. (14)

In Section VII, we will explore different relaxations to this problem by considering outer approximations of SEP(A : B).
B. An Entropic Characterization of Communication Value

1) The Conditional Separable Min-Entropy: An alternative but related manner of characterizing the communication value is in terms of the min-entropy or variations of it. Equation (14) might strike the reader as closely resembling the conditional min-entropy of J_N. Recall that the conditional min-entropy of a positive bipartite operator ω_AB is given by

H_min(A|B)_ω = −min_{σ_B ∈ D(B)} D_max(ω_AB ‖ I_A ⊗ σ_B),

where D_max(µ‖ν) = min{λ | µ ≤ 2^λ ν} [19]. Here ≤ denotes a generalized inequality over the convex cone of positive semidefinite operators; i.e. X ≤ Y iff Y − X ∈ Pos(AB). Equivalently, we can combine the two minimizations in the definition of H_min to write

exp(−H_min(A|B)_ω) = min{Tr[σ_B] | I_A ⊗ σ_B − ω_AB ∈ Pos(AB)}.

Comparing with Eq. (14), we see that cv is recovered by changing the cone from Pos(AB) to SEP*(A : B). Let us denote the cone inequality over SEP*(A : B) by ≤_{SEP*}; i.e. X ≤_{SEP*} Y iff Y − X ∈ SEP*(A : B). Then we can introduce a restricted conditional min-entropy.

Definition 2.
The conditional separable min-entropy of a positive bipartite operator ω_AB is defined as

H_min^sep(A|B)_ω := −min_{σ_B ∈ D(B)} D_max^sep(ω_AB ‖ I_A ⊗ σ_B),

where D_max^sep(µ‖ν) = min{λ | µ ≤_{SEP*} 2^λ ν}. By Eq. (14), we therefore have cv(N) = exp(−H_min^sep(A|B)_{J_N}). The separable min-entropy enjoys a data-processing inequality under one-way LOCC from Bob to Alice. The latter consists of any bipartite map Φ ∈ CPTP(AB → A'B') implemented by a measurement on Bob's side whose outcome is communicated to Alice, who then applies a local map conditioned on the outcome. In fact, we can prove the data-processing inequality under an even larger class of operations.
Proof. Since Φ† preserves separability, we must have Φ(Q) ∈ SEP*(A' : B') for all Q ∈ SEP*(A : B). Hence, any feasible pair (σ, λ) in the minimization of H_min^sep(A|B)_ω yields a feasible pair in the minimization of H_min^sep(A'|B')_{Φ(ω)}. The maps of Proposition 3 include those of the form Φ = M ⊗ N, where M is sub-unital and N is CPTP. These maps are known to satisfy the data-processing inequality for the standard min-entropy [25]. However, we suspect that Proposition 3 includes maps for which the standard min-entropy data-processing inequality does not hold.
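While H_min^sep requires an optimization over the separable cone, the underlying quantity D_max itself is easy to evaluate: for full-rank ν one has the closed form D_max(µ‖ν) = log‖ν^{−1/2} µ ν^{−1/2}‖_∞. A small numerical sketch (the operators chosen here are illustrative):

```python
import numpy as np

# D_max(mu || nu) = min{ lam : mu <= 2^lam nu }.  For full-rank nu this equals
# log2 of the largest eigenvalue of nu^{-1/2} mu nu^{-1/2}.

def dmax(mu, nu):
    w, V = np.linalg.eigh(nu)
    nu_inv_sqrt = V @ np.diag(w ** -0.5) @ V.conj().T
    M = nu_inv_sqrt @ mu @ nu_inv_sqrt
    return np.log2(np.linalg.eigvalsh(M).max())

rho = np.diag([0.75, 0.25])
sigma = np.eye(2) / 2
print(dmax(rho, sigma))   # log2(1.5) ≈ 0.585
```

Computing H_min(A|B), by contrast, still requires minimizing D_max(ω‖I ⊗ σ) over states σ, which is a semidefinite program rather than a single eigenvalue computation.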
We can apply Proposition 3 to the processing of Choi matrices. However, in this case not all maps Φ satisfying the conditions of Proposition 3 are physically meaningful. Specifically, we require the additional condition that Tr_{B'}Φ(P) = I_{A'} for all operators P for which Tr_B P = I_A. This assures that Φ maps Choi matrices to Choi matrices. One particular class of maps having this form are those in which Φ is a product of a positive unital map and a CPTP map, i.e. Φ = E†_pre ⊗ E_post. In this case, E_pre and E_post are pre- and post-processing maps for a given channel, respectively. As a consequence of Proposition 3 we therefore observe the following corollary, which can also be seen directly from the definition of communication value.

Corollary 2.
Communication value is non-increasing under pre-and post-processing of the channel.
Note that for classical systems X and Y, we have Pos(XB) = SEP(X : B) and Pos(AY) = SEP(A : Y). These correspond to classical-to-quantum and quantum-to-classical channels, respectively, and in these cases the separable min-entropy coincides with the standard conditional min-entropy.
2) The max-Holevo Information: The cv can be further related to the max-Holevo information of a channel, χ_max(N). This quantity has been introduced in the study of "sandwiched" Rényi divergences [26], [27], and it is defined by a maximization over all cq states ρ_XA = Σ_x p(x)|x⟩⟨x| ⊗ ρ_x^A. By quasi-convexity of D_max (see [28]), it follows that we can restrict attention to pure states ρ_x^A = |ψ_x⟩⟨ψ_x|^A in the definition of χ_max. Letting U be the unitary such that U|x⟩ = |ψ_x⟩, the maximization over ρ_XA can then be replaced by a maximization over U. We use this simplification to prove a relationship between channel χ_max and conditional H_min^sep.

Theorem 1. For any channel N_{A→B}, log cv(N) = χ_max(N).

Proof. Using Eq. (22) we obtain the chain of equalities in Eq. (24), where ∆_{U^T}(τ) = Σ_x |x⟩⟨x| U^T τ U* |x⟩⟨x| is a completely dephasing map applied after the rotation U^T. The last equality in Eq. (24) follows from the fact that Pos(XB) = SEP(X : B), as noted above. Then by data-processing (Proposition 3), the separable min-entropy can only increase under this map, for any σ_B and unitary U on system A. Hence from the definitions it follows that χ_max(N) ≤ log cv(N). To prove the reverse inequality, for arbitrary σ_B let λ_0 denote the attained minimum; since λ_0 is a minimizer, there must exist some product state |α⟩|β⟩ saturating the corresponding cone inequality. Let U be any unitary that rotates {|x⟩}_{x=1}^{d_A} such that U|1⟩ = |α*⟩. Since this holds for all σ_B and we are maximizing over U, we obtain log cv(N) ≤ χ_max(N), completing the proof.

We close this section by providing an alternative proof of Theorem 1. Instead of going through the Choi matrix, the following argument relies on a characterization of cv in terms of maximizing the min-entropy over encodings. In some sense this is intuitive, as the communication value optimizes minimal-error discrimination, which the min-entropy characterizes [19]. For this reason, the conceptual underpinning of this alternative derivation may be of interest in other applications.
Let {ρ_x^A} denote a collection of states for some alphabet X, let ρ_XA be the cq state defined using {ρ_x^A}, let ρ_U be the maximally mixed state on the relevant space, and let ρ_XB := (id_X ⊗ N)(ρ_XA). Starting from (14), the second equality uses X_AB ∈ SEP* ⇔ ⟨α|⟨β|X|α⟩|β⟩ ≥ 0 for all unit vectors |α⟩, |β⟩, together with the action of the channel in terms of the Choi matrix; the third is by using the uniform probability on X; the fourth is by the definition of the min-entropy; the fifth uses the maximally mixed state ρ_U; and the final equality is by definition.

C. Communication Value in Terms of Singlet Fraction
Another advantage of viewing cv(N) in terms of a restricted min-entropy is that it provides an alternative operational interpretation of the communication value in terms of the singlet fraction, which it inherits from the min-entropy conic program. Recall that for a bipartite density matrix ω_AB, its d_A-dimensional singlet fraction is defined as

F_{d_A}(ω) := max_U ⟨Φ⁺_{d_A}|(I_A ⊗ U) ω_AB (I_A ⊗ U)†|Φ⁺_{d_A}⟩,

where the maximization is taken over all unitaries applied to system B [29].

Proof. This follows the proof of the operational interpretation of the min-entropy [19], and we walk through the argument again here to exemplify that the only change is the restriction to the separable cone. Proposition 2 shows that cv(N) is the maximum value of Tr[Ω_AB J_N], where Ω_AB is the Choi matrix of a unital entanglement-breaking map, i.e. Ω_AB = J_M for some entanglement-breaking unital map M.

The channel cv is thus a measure of the maximum singlet fraction achievable when using an entanglement-breaking (EB) channel to recover the singlet after the action of N. In this setting, non-multiplicativity means that the optimal EB channel Ψ changes when N is used in parallel, in such a way that the achievable singlet fraction increases. This suggests that cv(N) is a measure of entanglement preservation.
where we have used the definitions of the Choi matrix and the adjoint map, and Φ⁺ is the normalized maximally entangled state. Noting that the adjoint of an entanglement-breaking map is entanglement-breaking, and the adjoint of a unital map is trace-preserving, we obtain the desired expression, where the last line follows from the fact that the maximization over local unitaries in the definition of F_{d_A} can be absorbed into the maximization over entanglement-breaking channels.
Equation (32) yields the interpretation of cv(N) as quantifying how well a maximally entangled state can be recovered after Alice sends one half of |Φ⁺_{d_A}⟩ to Bob over the channel N, when Bob is limited to performing an entanglement-breaking channel as post-processing error correction (see Fig. III-C). In Section VI, it will be shown that the entanglement-assisted communication value is characterized by exp(−H_min(A|B)_{J_N}), and it thus has a similar operational interpretation, except that Bob is now able to perform an arbitrary quantum channel to try to recover the maximally entangled state. Moreover, in Section VII, when we consider relaxations of cv to other cones, we will find that the PPT min-entropy H_min^PPT also retains this operational interpretation, but with the recovery relaxed to the use of co-positive maps.

D. The Geometric Measure of Entanglement and Maximum Output Purity
The channel cv is closely related to the geometric measure of entanglement (GME) [30], [31]. For a bipartite positive operator ω_AB, its GME is defined as

Λ²(ω_AB) := max{⟨α|⟨β| ω_AB |α⟩|β⟩ : |α⟩ ∈ H_A, |β⟩ ∈ H_B unit vectors}.

We can phrase this as a conic program, and for a channel N with Choi matrix J_N this program can be expressed in dual form. Comparing dual feasible points of this program with those of Eq. (14), one finds

Λ²(J_N) ≤ cv(N) ≤ d_A Λ²(J_N). (38)

While these relationships are somewhat obvious from the formulation of the problem, it is nice to see them explicitly falling out of the two conic programs. It is known that Λ²(J_N) is equal to the maximum output purity of the channel N [32], [33], which is defined as

ν_∞(N) := max_{ρ ∈ D(A)} ‖N(ρ)‖_∞.

To see the equivalence, first note that this supremum is attained for a pure-state input due to convexity of the operator norm. An alternative way to prove this equality is using [32]. In that work it is established that ν_∞(N) = Λ²(|N⟩), where |N⟩ is an un-normalized vector induced by the Kraus representation of N. One can in fact show that |N⟩ is a purification of the Choi matrix, though this has not been stated previously to the best of our knowledge. As the GME of a pure state is the same as the GME of the pure state with a single register traced off [34], one sees that while ν_∞(N) = Λ²(J_N) is a lower bound on cv(N) by Eq. (38), in general this bound will not be tight. Hence the communication value captures a property of a quantum channel that is distinct from maximum output purity. In fact, we have the following.
Specifically, ν_∞(N) = cv(N) if and only if N is a replacer channel with pure output. Proof. If N is a replacer channel, say N(ρ) = |β⟩⟨β| for all states ρ, then clearly ν_∞(N) = cv(N) = 1. On the other hand, suppose that ν_∞(N) = cv(N), and let |α⟩|β⟩ := argmax Λ²(J_N). Then for an arbitrary state ρ_A ∈ D(A), one can construct an operator Ω_AB satisfying Tr_A Ω_AB = I_B and Ω_AB ∈ SEP(A : B), so that it is a feasible solution for the optimization of cv(N) in Proposition 2. Hence for all ρ, the quantity cv(N) is bounded below by ν_∞(N) plus a second, non-negative term. But by the assumption ν_∞(N) = cv(N), the second term must vanish for all ρ. This means that N is a replacer channel, outputting |β⟩⟨β| for all its trace-one inputs.
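As a small sanity check on the relation ν_∞(N) = Λ²(J_N), consider the qubit depolarizing channel D(ρ) = λρ + (1 − λ)I/2, for which both quantities equal λ + (1 − λ)/2. The value λ = 0.6 below is an illustrative choice, and the random search is a numerical spot check rather than a general solver:

```python
import numpy as np

# For D(rho) = lam*rho + (1-lam)*I/2, every pure input has output spectrum
# {lam + (1-lam)/2, (1-lam)/2}, so nu_inf = lam + (1-lam)/2, matching Lambda^2(J_D).

lam = 0.6
rng = np.random.default_rng(0)

def depolarize(rho):
    return lam * rho + (1 - lam) * np.eye(2) / 2

best = 0.0
for _ in range(200):
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    v /= np.linalg.norm(v)
    best = max(best, np.linalg.eigvalsh(depolarize(np.outer(v, v.conj()))).max())

print(best)   # ≈ 0.8 = lam + (1-lam)/2, independent of the pure input chosen
```

For this channel, cv = 1 + λ = 1.6 exceeds ν_∞ = 0.8, illustrating that the lower bound in Eq. (38) is generally not tight.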

IV. EXAMPLES
Having characterized the channel cv in a variety of different ways, we now focus on the problem of computing it. In general, this is a challenging task. Here we provide closed-form solutions for arbitrary qubit channels and for the Werner-Holevo family of channels. The latter will also provide a useful case study when we consider relaxations of the communication value in Section VII.

A. Qubit Channels
Every qubit channel N induces an affine transformation on the Bloch vector of the input state. In more detail, every positive qubit operator ρ can be written as ρ = γ(I + r · σ), where r ∈ R³ has norm no greater than one. Then when N acts on ρ, it induces an affine transformation r ↦ Ar + c, with A being some 3 × 3 real matrix and c ∈ R³. Now, let Ω = Σ_k α_k^T ⊗ β_k be an arbitrary two-qubit separable operator with Tr[α_k] = 1 and Σ_k β_k = I. We give them Bloch-sphere representations α_k = (1/2)(I + a_k · σ) and β_k = γ_k(I + b_k · σ), where the constraint Σ_k β_k = I requires Σ_k γ_k = 1 and Σ_k γ_k b_k = 0. It is easy to see that the resulting maximization is attained by taking b_1 and b_2 as anti-parallel unit vectors aligned with the left singular vector of A corresponding to its largest singular value, and likewise for a_1 and a_2 with respect to the right singular vector. Additionally, taking γ_k = 1/2 for k = 1, 2 satisfies all the conditions. Hence we have the following.
Theorem 2. For any qubit channel N inducing the Bloch-sphere transformation r ↦ Ar + c, cv(N) = 1 + σ_max(A), where σ_max(A) is the largest singular value of A.
Remark. For a unital channel N, the Bloch vector c is zero. Since we are then able to obtain the largest value of γ_k b_k^T A a_k for each value of k, it follows that cv(N) = 2Λ²(J_N); i.e., the upper bound in Eq. (38) is tight.
Example: Pauli Channels. As a nice example of Theorem 2, consider the family of Pauli channels, which consists of any qubit channel having the form

N(ρ) = p_0 ρ + p_1 XρX + p_2 YρY + p_3 ZρZ,

where {X, Y, Z} are the standard Pauli matrices and Σ_{i=0}^3 p_i = 1. We can write the Choi matrix as a mixture of the four Bell states |Φ_i⟩. It is easy to see that the correlation matrix of J_N is diagonal, with entries whose magnitudes are |1 − 2(p_i + p_j)| for the pairs i < j drawn from {1, 2, 3}. Therefore, by Theorem 2 we can conclude that

cv(N) = 2(p↓_1 + p↓_2),

where p↓_1 and p↓_2 are the two largest probabilities of Pauli gates.
Notice that cv(N) will equal its largest value of two if and only if there are no more than two Pauli gates applied with nonzero probability in N. In particular, when p↓_1 = p↓_2 = 1/2 the channel is entanglement-breaking; in fact, it is a classical channel. Hence, this example shows that a channel's communication value captures a property distinct from its ability to transmit entanglement.
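The Pauli-channel formula can be verified numerically from Theorem 2. The sketch below builds the Bloch (correlation) matrix A_{ij} = Tr[σ_i N(σ_j)]/2 and compares 1 + σ_max(A) against 2(p↓_1 + p↓_2); the probability vector is an illustrative choice, ordered here as (p_0, p_x, p_y, p_z):

```python
import numpy as np

# cv of a Pauli channel via Theorem 2: cv = 1 + sigma_max(A), where
# A_{ij} = Tr[sigma_i N(sigma_j)]/2 is the channel's correlation matrix.

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.diag([1.0, -1.0]).astype(complex)
paulis = [I2, X, Y, Z]

def cv_pauli(p):
    N = lambda r: sum(pi * P @ r @ P.conj().T for pi, P in zip(p, paulis))
    A = np.array([[np.trace(paulis[i] @ N(paulis[j])).real / 2
                   for j in (1, 2, 3)] for i in (1, 2, 3)])
    return 1 + np.linalg.svd(A, compute_uv=False).max()

p = [0.5, 0.3, 0.15, 0.05]
print(cv_pauli(p))               # 1.6
print(2 * sum(sorted(p)[-2:]))   # 1.6 = 2(p1 + p2), the two largest probabilities
```

For the uniform mixture p = (1/4, 1/4, 1/4, 1/4) the correlation matrix vanishes and cv = 1, consistent with Proposition 1 since that channel is a replacer.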

B. Werner-Holevo Channels
The Werner-Holevo family of channels [32], [35] is defined by W_{d,λ} := λΦ_0 + (1 − λ)Φ_1, where

Φ_0(X) = (1/(d+1))[Tr(X)I + X^T], Φ_1(X) = (1/(d−1))[Tr(X)I − X^T].

This implies the Choi matrix is given by

J_{W_{d,λ}} = λ(2/(d+1))Π_+ + (1 − λ)(2/(d−1))Π_−,

where Π_+ = (1/2)(I + F) and Π_− = I − Π_+ are the projectors onto the symmetric and anti-symmetric subspaces, with F the swap operator. Proposition 6. The communication value of the Werner-Holevo channel is given by cv(W_{d,λ}) = d(d + 1 − 2λ)/(d² − 1) for λ ≤ (1+d)/(2d), and cv(W_{d,λ}) = 2λd/(d + 1) otherwise. Proof. Using the U ⊗ U-invariance of the Choi matrix [36], we can apply the "twirling map" T_UU to the optimizer while leaving the cv invariant. Furthermore, since T_UU preserves the constraints on Ω_AB, we can conclude that without loss of generality the optimizer is given by X := T_UU(Ω_AB) = xI_AB + yF for some choice of x, y, i.e. it is an element of the U ⊗ U-invariant space of operators.
As the spaces of PPT and SEP UU-invariant operators are the same [37], we can relax the optimization program to only require X ∈ PPT. As is shown in (102), this means we require that X satisfies X ≥ 0, Γ_B(X) ≥ 0, and Tr_A[X] = I_B. We will therefore convert these into linear constraints on x and y.
Note that {Π_+, Π_−} define an orthogonal basis for the space spanned by {I, F}. Therefore we can write X = (x + y)Π_+ + (x − y)Π_−, and the positivity constraints on X read

x + y ≥ 0, x − y ≥ 0.

Similarly, Γ_B(X) = xI_AB + yΦ⁺, where Φ⁺ is the unnormalized maximally entangled state. Since Φ⁺/d is a rank-one projector, {Φ⁺/d, I_AB − Φ⁺/d} gives an orthogonal decomposition of the space spanned by {I_AB, Φ⁺}, and the partial-transpose positivity constraints simplify to

x ≥ 0, x + dy ≥ 0.

The objective function is given by Tr[X J_{W_{d,λ}}] = dx + d(2λ − 1)y, and the trace condition Tr_A[X] = I_B reads dx + y = 1. Combining these, we have the linear program of maximizing dx + d(2λ − 1)y subject to the above constraints. Eliminating y = 1 − dx, the constraints reduce to 1/(d+1) ≤ x ≤ d/(d² − 1); if x satisfies these constraints, (58) is also satisfied. Thus we have reduced the LP to maximizing 1 + (1 − dx)(d(2λ − 1) − 1) over this interval. Taking the derivative of the objective function, one finds that for λ ≤ (1+d)/(2d) the derivative is positive. Therefore the maximum is attained at x = d/(d² − 1), yielding cv(W_{d,λ}) = d(d + 1 − 2λ)/(d² − 1); otherwise it is attained at x = 1/(d+1), yielding cv(W_{d,λ}) = 2λd/(d + 1).
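Since the reduced problem is a linear objective over an interval, its optimum sits at an endpoint, which gives a quick numerical check of Proposition 6. The sketch below is built directly on the reduction above, not an independent SDP solve:

```python
import numpy as np

# Reduced Werner-Holevo LP: maximize d*x + d*(2*lam-1)*y with y = 1 - d*x
# and 1/(d+1) <= x <= d/(d^2-1).  A linear objective on an interval is
# maximized at an endpoint, so we just compare the two endpoint values.

def cv_wh_lp(d, lam):
    obj = lambda x: d * x + d * (2 * lam - 1) * (1 - d * x)
    return max(obj(1 / (d + 1)), obj(d / (d ** 2 - 1)))

def cv_wh_closed(d, lam):
    if lam <= (d + 1) / (2 * d):
        return d * (d + 1 - 2 * lam) / (d ** 2 - 1)
    return 2 * lam * d / (d + 1)

for d in (2, 3, 4):
    for lam in np.linspace(0, 1, 11):
        assert abs(cv_wh_lp(d, lam) - cv_wh_closed(d, lam)) < 1e-12

print(cv_wh_closed(3, 0.0))   # 1.5: the antisymmetric Werner-Holevo channel W_{3,0}
```

Note that both branches give cv = 1 at λ = (1 + d)/(2d), where W_{d,λ} reduces to the completely depolarizing replacer channel, consistent with Proposition 1.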
In Section VII, we generalize this derivation to determine the PPT relaxation of cv for the n-fold Werner-Holevo channels.

V. MULTIPLICATIVITY OF CV
We next consider how the communication value behaves when we combine two or more channels. The cv is multiplicative for two channels N and M if cv(N ⊗ M) = cv(N )cv(M).
When multiplicativity holds, it means that an optimal strategy for guessing channel inputs involves using uncorrelated inputs and measurements across the two channels. A concrete example of non-multiplicativity is given by the Werner-Holevo family of channels, as proven in Section V-D. In general, it is a hard problem to decide whether two channels have a multiplicative communication value. More progress can be made when relaxing this problem to the PPT cone, and we conduct such an analysis in Section VII. Here we resolve the question of multiplicativity in a few special cases.

A. Entanglement-Breaking Channels
Our first result shows that non-multiplicativity can arise only if the channel is capable of transmitting entanglement. Theorem 3. If N is entanglement-breaking, then cv(N ⊗ M) = cv(N)cv(M) for any channel M. Proof. Since N is EB, its Choi matrix has the separable form J_N = Σ_λ |α_λ⟩⟨α_λ| ⊗ |β_λ⟩⟨β_λ|, and the dual optimization of cv (i.e. Eq. (14)) can then be expressed accordingly. By applying Theorem 3 iteratively across n copies of an entanglement-breaking channel, we obtain a single-letter formula for the cv capacity.

B. Covariant Channels
We next turn to channels that have a high degree of symmetry. To study the question of multiplicativity, it will be helpful to use the relationship between cv and the GME. The following is a powerful result proven in Ref. [33] regarding multiplicativity of the GME. We say an operator ρ_AB is component-wise non-negative if there exists an orthonormal product basis {|i, j⟩}_{i,j} such that ⟨i, j|ρ|i', j'⟩ ≥ 0 for all i, j, i', j'. Lemma 1 ([33]). If ρ_AB is component-wise non-negative, then Λ²(ρ_AB ⊗ σ_{A'B'}) = Λ²(ρ_AB)Λ²(σ_{A'B'}) for any positive operator σ_{A'B'}.
For example, the Choi matrix of the identity channel, φ⁺_d = J_{id_d}, is component-wise non-negative in the computational basis. Therefore, by the previous lemma we have Λ²(φ⁺_d ⊗ J_N) = Λ²(φ⁺_d)Λ²(J_N) for any channel N. As we will now show, this sort of multiplicativity can be readily extended to the communication value for channels with symmetry.
Let G be any group with an irreducible unitary representation {U_g}_g on C^d. Then, as we did in Eq. (49), let T_UU denote the bipartite group twirling map with respect to G. We say a channel N is G-covariant if N(U_g ρ U_g†) = U_g N(ρ) U_g† for all g ∈ G and all ρ. On the level of Choi matrices, this covariance is equivalent to invariance of J_N under the corresponding twirl. Theorem 4. Let N and N' be channels on systems of dimension d and d' such that N or Γ ∘ N ∘ Γ is G-covariant and N' or Γ ∘ N' ∘ Γ is G'-covariant, for groups G and G' with irreducible unitary representations. If J_N or J_N' is component-wise non-negative, then cv(N ⊗ N') = cv(N)cv(N'). Proof. Let |α⟩⟨α|_A ⊗ |β⟩⟨β|_B be a product operator with trace equal to d satisfying dΛ²(J_N) = ⟨α, β|J_N|α, β⟩. Suppose now that either N or Γ ∘ N ∘ Γ is G-covariant.
In either case we have Tr[Ω_AB J_N] = dΛ²(J_N), where Ω_AB = T†_UU(|α, β⟩⟨α, β|). Note that Ω_AB has trace equal to d, and since {U_g} is an irrep, we have Tr_A Ω_AB = I. Hence cv(N) ≥ dΛ²(J_N), and an analogous argument for N' establishes that cv(N') ≥ d'Λ²(J_N'). Therefore, cv(N ⊗ N') ≥ dd'Λ²(J_N)Λ²(J_N') = dd'Λ²(J_N ⊗ J_N'), where the last equality follows from Lemma 1. However, by the upper bound in Eq. (38) this inequality must be tight, which implies the desired multiplicativity.
Using this theorem, we can compute the cv capacity for certain channels.
Corollary 4. Suppose that N or Γ ∘ N ∘ Γ is G-covariant for some group G with an irreducible unitary representation, and that J_N is component-wise non-negative. Then CV(N) = log cv(N). Proof. It suffices to prove that cv(N^{⊗n}) = cv(N)^n. This follows directly from Theorem 4 by letting N' = N^{⊗(n−1)} and taking the group to be the n-fold direct product of G with itself.
For example, in a qubit system all Pauli channels satisfy the conditions of Corollary 4. These are channels of the form

N_Pauli(ρ) = p_0 ρ + p_1 σ_x ρσ_x + p_2 σ_z ρσ_z + p_3 σ_y ρσ_y, (68)

and they are covariant with respect to the Pauli group. Moreover, J_{N_Pauli} can always be converted into a matrix with non-negative entries by local unitaries. Hence using Eq. (46) we have CV(N_Pauli) = log cv(N_Pauli). As another example, consider the d-dimensional partially depolarizing channel D_{d,λ} given by

D_{d,λ}(ρ) = λρ + (1 − λ)Tr[ρ] I/d.

The channel Γ ∘ D_{d,λ} ∘ Γ is G-covariant with respect to the full unitary group on C^d [38]. The Choi matrix is given by J_{D_{d,λ}} = λφ⁺_d + (1 − λ)I ⊗ I/d, which is clearly component-wise non-negative. Thus by Corollary 4, CV(D_{d,λ}) = log cv(D_{d,λ}). We remark that the Werner-Holevo family of channels introduced in Section IV-B fails to satisfy Corollary 4, since their Choi matrices are not component-wise non-negative. In fact, their cv is non-multiplicative, as we will see below. Nevertheless, Theorem 4 can be applied to a Werner-Holevo channel used in parallel with another channel that is component-wise non-negative. For example, when trivially embedding W_{d,λ} into a larger system we have multiplicativity: cv(W_{d,λ} ⊗ id_{d'}) = cv(W_{d,λ}) · d'. This result is perhaps surprising, since multiplicativity appears not to hold when we relax the cv optimization to the cone of PPT operators, as is shown in Section VII-C1 (see Fig. 5).
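The component-wise non-negativity invoked above is immediate to verify for the depolarizing Choi matrix J_{D_{d,λ}} = λφ⁺_d + (1 − λ)I ⊗ I/d. A short sketch (d = 3 and λ = 0.7 are illustrative values):

```python
import numpy as np

# Choi matrix of the partially depolarizing channel in the computational basis,
# with phi+ the unnormalized projector sum_{ij} |ii><jj| of trace d.

def choi_depolarizing(d, lam):
    phi_plus = np.zeros((d * d, d * d))
    for i in range(d):
        for j in range(d):
            phi_plus[i * d + i, j * d + j] = 1.0
    return lam * phi_plus + (1 - lam) * np.eye(d * d) / d

J = choi_depolarizing(3, 0.7)
print(np.all(J >= 0))   # True: every entry is non-negative, so Lemma 1 applies
```

The only nonzero off-diagonal entries sit at positions (ii, jj) with value λ ≥ 0, and the diagonal entries are sums of non-negative terms, so no basis change is even needed here.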

C. Qubit Channels
In Section IV-A we derived an explicit formula for the communication value of qubit channels. Namely, cv(N ) = 1 + σ max (N), where N is the correlation matrix of J N . Here we show that the cv is multiplicative when using two qubit channels in parallel.
Theorem 5. For any pair of qubit channels M and N, cv(M ⊗ N) = cv(M)cv(N). Proof. Without loss of generality we may assume that the correlation matrices M = diag[m_1, m_2, m_3] and N = diag[n_1, n_2, n_3] are diagonal, since an arbitrary channel can always be converted into this form by performing appropriate pre- and post-SU(2) rotations on the channel, which do not change the communication value. Define the operator Z_{BB'} satisfying Tr[Z_{BB'}] = cv(M)cv(N). By the dual characterization of cv given in Eq. (14), we will have proven cv(M ⊗ N) ≤ cv(M)cv(N) if we can show the corresponding operator inequality for an arbitrary two-qubit state. Note that we have the action M(I) = I + c · σ, M(σ_x) = m_x σ_x, M(σ_y) = −m_y σ_y, M(σ_z) = m_z σ_z, and likewise for the action of N. When comparing with Eq. (75), we see that Eq. (76) reduces to Eq. (78). To prove this inequality, we note that the non-unital components of M and N effectively do not appear here. That is, let M̃ and Ñ be the unital CPTP maps defined by the Choi matrices with the non-unital parts removed. Letting |ψ⟩ denote an eigenvector of largest eigenvalue for the operator on the LHS of Eq. (78), we can bound its eigenvalue using the fact that the GME is multiplicative for unital qubit channels (Lemma 1), along with Eq. (43). This proves Eq. (78).
A natural question is whether Theorem 5 can be generalized to the case in which only one of the channels is a qubit channel. Unfortunately, the proof of Theorem 5 relies heavily on the Pauli representation of qubit channels, and we therefore only conjecture that qubit channels possess an even stronger form of multiplicativity: cv(M ⊗ N) = cv(M)cv(N) for any qubit channel M and any other channel N.

D. Non-Multiplicativity in Qutrits
In the previous sections we identified examples of channels for which the communication value is multiplicative. We now provide an example of channels demonstrating non-multiplicativity. Our construction uses the Werner-Holevo channels, which were previously known to exemplify non-additivity of a channel's minimum output purity [32], [33]. Specifically, the channel W_{d,0} has a Choi matrix proportional to the projector onto the anti-symmetric subspace. The entanglement properties of this operator have been well studied [33], [39], [40]. In particular, Zhu et al. have computed its one- and two-copy geometric measures of entanglement, given in Eqs. (82) and (83). Equation (83) is strictly larger than the square of Eq. (82) whenever d ≥ 3. Furthermore, the maximization in Eq. (83) is attained whenever |α⟩_{AA'} and |β⟩_{BB'} are maximally entangled states. Thus, we consider the separable operator built from {|ϕ^+_k⟩}_{k=1}^{d^2}, an orthonormal basis of maximally entangled states for C^d ⊗ C^d. This operator satisfies the conditions of Proposition 2. Hence, we conclude that cv(W_{d,0} ⊗ W_{d,0}) > cv(W_{d,0})^2.
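The single-copy geometric overlap underlying Eq. (82) can be checked numerically: for any product state |a⟩|b⟩ one has ⟨ab|Π_−|ab⟩ = (1 − |⟨a|b⟩|²)/2 ≤ 1/2, where Π_− is the antisymmetric projector. The following sketch (our own check, not from the paper's software) samples random product states and confirms the bound:

```python
import numpy as np

d = 3
SWAP = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        SWAP[i * d + j, j * d + i] = 1.0
P_anti = (np.eye(d * d) - SWAP) / 2  # projector onto the antisymmetric subspace

# overlap of random product states |a>|b> with P_anti never exceeds 1/2
rng = np.random.default_rng(0)
max_overlap = 0.0
for _ in range(200):
    a = rng.normal(size=d) + 1j * rng.normal(size=d); a /= np.linalg.norm(a)
    b = rng.normal(size=d) + 1j * rng.normal(size=d); b /= np.linalg.norm(b)
    ab = np.kron(a, b)
    max_overlap = max(max_overlap, (ab.conj() @ P_anti @ ab).real)
print(max_overlap)  # bounded by 1/2, approached as <a|b> -> 0
```

The two-copy value in Eq. (83) exceeds the square of this single-copy bound, which is exactly the source of the non-multiplicativity exploited above.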

Remark. In Section VII, we use the fact that the spaces of PPT and SEP UUVV-invariant operators are equivalent [37] to numerically determine the range of λ for which cv(W_{d,λ} ⊗ W_{d,λ}) is non-multiplicative.

VI. ENTANGLEMENT-ASSISTED CV
We next generalize the communication scenario and allow the sender and receiver to share entanglement. Remarkably, this added resource simplifies the problem immensely. In what follows, we allow Alice and Bob to share an entangled state ϕ_{A'B'} that can be used to increase the channel cv. The most general entanglement-assisted protocol is as follows (see Fig. 2). For input x ∈ [n], Alice performs a CPTP map E_x ∈ CPTP(A' → A) on her half of the entangled state ϕ_{A'B'}. System A is then fed into the channel, and Bob finally performs a POVM {Π^{BB'}_y}_{y∈[n']} on systems BB'. The induced channel has transition probabilities given by P(y|x) = Tr[Π^{BB'}_y (N ∘ E_x ⊗ id_{B'})(ϕ_{A'B'})]. (87) Note that this scenario corresponds to the one used in superdense coding [18]. The entanglement-assisted channel cv can now be defined.
Theorem 6. For any channel N ∈ CPTP(A → B), cv*(N) = max{Tr[σ_{AB} J_N] : σ_{AB} ∈ Pos(A : B), Tr_A σ_{AB} = I_B}, where Pos(A : B) denotes the positive cone on AB. Moreover, cv*(N) is attained using a d_A-dimensional maximally entangled state. In other words, the restriction of σ_{AB} to the separable cone (cf. Eq. (12)) is removed when considering the entanglement-assisted problem.
Proof. It is clear that we need only consider pure states in the supremum, since cv(P) is convex-linear with respect to ϕ_{A'B'}. Let |ϕ⟩_{A'B'} be arbitrary. We first show that, without loss of generality, we can take |ϕ⟩ to be maximally entangled. Recall Nielsen's theorem [41] (see also [42]): a maximally entangled state |Φ^+⟩ can be converted into |ϕ⟩ by a one-way LOCC protocol in which Bob measures Kraus operators {M_k}_k and Alice applies a correcting unitary U_k, yielding (U_k ⊗ M_k)|Φ^+⟩ = √p(k)|ϕ⟩ for each outcome k. We use the fact that cv is achieved by minimal error discrimination, i.e. cv*(P) = ∑_x P(x|x). Notice that {(I ⊗ M_k)^† Π_x (I ⊗ M_k)}_{k,x} constitutes a set of POVM elements on BÃ'. This follows from the fact that {M_k}_k are Kraus operators for a CPTP map, and so the dual of this map, X → ∑_k M_k^† X M_k, is unital. Likewise, letting U_k(·) := U_k(·)U_k^† denote a unitary channel, the collection {E_x ∘ U_k}_{x,k} forms a family of encoding maps. Therefore, we can express Eq. (90) in terms of concatenated encoders Ê_z and decoders Π̂_z. This shows that we can restrict attention to shared maximally entangled states. Furthermore, without loss of generality, we can assume that d_{A'} ≥ d_A. The reason is that the transformation |Φ^+⟩ → |ϕ⟩ is always possible for any d_{A'} ≥ d_A; so we could have just as well used the same argument with system A and arrived at Φ^+_{AÃ'} in Eq. (91).
We next take Kraus-operator decompositions E_z(·) = ∑_k N_{z,k}(·)N_{z,k}^†, with each N_{z,k} : A' → A. Hence Tr_A Ω_{AB} = I_B is a necessary condition on the operator Ω_{AB} such that cv*(P) = Tr[Ω_{AB} J^{AB}_N]. Let us now show that it is also sufficient.
Consider any positive operator Ω_{AB} such that Tr_A Ω_{AB} = I_B. Introduce the generalized Pauli operators on system A, explicitly given by U_{m,n} = ∑_{k=0}^{d_A−1} e^{i 2π m k/d_A} |k ⊕ n⟩⟨k|, where m, n = 0, …, d_A − 1 and addition is taken modulo d_A. It is easy to see that ∆(·) := (1/d_A) ∑_{m,n} U_{m,n}(·)U_{m,n}^† is a completely depolarizing map; i.e. ∆(X) = Tr[X] I. Hence, the elements {(1/d_A)(U^A_{m,n} ⊗ id_B)(Ω_{AB})}_{m,n} form a valid POVM on AB. Therefore, we can construct an entanglement-assisted protocol as follows. Let Alice and Bob share a d_A-dimensional maximally entangled state |Φ^+⟩. Alice applies the unitary encoding map on system A given by U^T_{m,n}(·) := U^T_{m,n}(·)U^*_{m,n} and sends her system through the channel N. When Bob performs the POVM just described on systems ÃB, the obtained score is ∑_{m,n} P(m, n|m, n) = Tr[Ω_{AB} J_N]. The key idea in this equation is that the unitary encoding U_{m,n} performed on Alice's side is canceled by exactly one POVM element on Bob's side. This completes the proof of Theorem 6.
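The depolarizing identity ∆(X) = Tr[X] I used in the proof is easy to verify directly. The sketch below (our own illustration; the helper name weyl is ours) constructs the generalized Pauli (Heisenberg-Weyl) operators U_{m,n} and checks that averaging over all d² of them with the stated 1/d_A normalization completely depolarizes an arbitrary operator:

```python
import numpy as np

def weyl(d, m, n):
    """Generalized Pauli U_{m,n} = sum_k exp(2*pi*i*m*k/d) |k+n mod d><k|."""
    U = np.zeros((d, d), complex)
    for k in range(d):
        U[(k + n) % d, k] = np.exp(2j * np.pi * m * k / d)
    return U

d = 4
X = np.arange(d * d, dtype=complex).reshape(d, d)  # arbitrary test operator
out = sum(weyl(d, m, n) @ X @ weyl(d, m, n).conj().T
          for m in range(d) for n in range(d)) / d
print(np.allclose(out, np.trace(X) * np.eye(d)))  # Delta(X) = Tr[X] I
```

Since the U_{m,n} are unitary and the average reproduces Tr[X] I, the conjugated elements of Ω_{AB} indeed sum to the identity, so they form a valid POVM as claimed.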
Remark. The achievability protocol in the previous proof is essentially the original superdense coding protocol applied to a d_A-dimensional input channel.
Theorem 6 shows that the ea cv can be computed using semidefinite programming. Here we provide a family of channels for which it can be computed even more easily. Corollary 5. Let N ∈ CPTP(A → B) be any channel such that J_N has an eigenvector |ϕ⟩_{AB} of largest eigenvalue λ_max(J_N) whose reduced state satisfies ϕ_B = I/d_B. Then cv*(N) = d_B λ_max(J_N). Proof. Choose σ_{AB} = d_B |ϕ⟩⟨ϕ|_{AB} in Theorem 6. Clearly this choice is optimal.
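Assuming the closed form cv*(N) = d_B λ_max(J_N) implied by Corollary 5 (our reading of the corollary), a quick numerical check for the depolarizing channel D_{d,λ} is possible: its Choi matrix has the maximally entangled vector as top eigenvector, so its marginal is maximally mixed and the corollary applies, giving cv*(D_{d,λ}) = λ d² + 1 − λ (for λ = 1 this recovers the superdense-coding value d²):

```python
import numpy as np

d, lam = 3, 0.6
# Choi matrix J = lam * phi_plus + (1 - lam) * I (x) I / d
phi = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        phi[i * d + i, j * d + j] = 1.0
J = lam * phi + (1 - lam) * np.eye(d * d) / d

w, v = np.linalg.eigh(J)          # ascending eigenvalues
top = v[:, -1]                    # eigenvector of largest eigenvalue
T = top.reshape(d, d)
rho_B = T.T @ T.conj()            # reduced state of the top eigenvector on B
cv_star = d * w[-1]
print(cv_star)                    # equals lam*d**2 + 1 - lam
```

The eigenvector test confirms ϕ_B = I/d, so the hypothesis of Corollary 5 is met here.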
In addition, a closed-form solution can easily be deduced for all qubit channels (Theorem 7).
Proof. Using Theorem 6, we can write Ω = ½(I + r · σ) ⊗ I + ∑_{i,j} t_{ij} σ_i ⊗ σ_j. On the other hand, up to local unitaries, the Choi matrix of a channel N can be expressed in Pauli form, and the last inequality follows from the fact that |t_{ii}| ≤ 1 since Ω ≥ 0. The theorem is proven by recalling that the |a_i| are the singular values of the correlation matrix A.
It is worthwhile to compare Theorems 2 and 7. Together they show that shared entanglement between sender and receiver cannot offer a multiplicative enhancement in the cv larger than the dimension. In general, we conjecture the following. Conjecture: For any channel N ∈ CPTP(A → B), the ratio cv*(N)/cv(N) is bounded by the channel dimension.

VII. RELAXATIONS ON THE COMMUNICATION VALUE
In previous sections we have made use of the fact that the communication value can be expressed as a conic optimization problem (Proposition 2). It was noted that, in general, this problem is hard to solve, but if the dimension of the Choi matrix is sufficiently small, we can relax the cone SEP(A : B) to the PPT cone PPT(A : B) and still determine cv(N) (Corollary 1). Moreover, in Section III-B, we used the optimization program of H_min to justify characterizing cv(N) by a restricted min-entropy, and in Section VI we saw the relationship between H_min and cv*. In all of these cases, we have considered the same optimization program and simply varied the cone to which the variable is restricted. That is, we have considered the general conic program

maximize: Tr[Ω_{AB} J_N]
subject to: Tr_A(Ω_{AB}) = I_B
            Ω_{AB} ∈ K,

where cv(N) corresponds to K = SEP(A : B) and cv* corresponds to K = Pos(A ⊗ B). It follows that whenever we pick a cone K such that SEP(A : B) ⊂ K, we obtain an upper bound on cv(N). Throughout the rest of this section, when considering a relaxation SEP(A : B) ⊂ K, we denote the value of the optimization program by cv_K(N). In this section we primarily consider the PPT relaxation, K = PPT(A : B). We also discuss the relaxation to the k-symmetric cone, which is particularly relevant because it is known to converge to the separable cone as k goes to infinity [43].

A. Multiplicativity of Tensored PPT Operators over the PPT Cone
We begin with the relaxation to the PPT cone. The primary advantage of this relaxation is that the problem becomes a semidefinite program, and so pre-existing software may be used to find the optimal value. One may derive the primal and dual problems as in Eqs. (102) and (103), where Γ_B is the partial transpose map on the B space. This SDP satisfies strong duality, as can be verified using Slater's condition.
With this established, we will now present a special multiplicativity property of the PPT relaxation, cv PPT .
Proof. Let R ∈ PPT(A_1 : B_1) and Q ∈ PPT(A_2 : B_2), and let (Y_1, Y_2) and (Y_1', Y_2') be the dual optimizers for R and Q, respectively, so that from (103) we have the corresponding dual feasibility conditions. Define R' := Γ_{B_1}(R) and Q' := Γ_{B_2}(Q), which are both positive operators by assumption. Then we have a chain of equalities in which the first line follows from (104), the third from how the partial transpose over multiple systems decomposes, and the fourth from linearity. Note that R' and Q' are positive as R and Q are PPT. Moreover, Y_2 and Y_2' are positive by (103). Thus the whole argument of Γ_{B_1B_2} is a positive semidefinite operator. Therefore the combined operator is a feasible point of the dual problem for R ⊗ Q, and it achieves the value Tr(Y_1)Tr(Y_1'). If we let X_1 and X_2 be optimizers for the primal problems for R and Q respectively, then X_1 ⊗ X_2 is clearly a feasible point for the primal problem for R ⊗ Q that achieves the value Tr(RX_1)Tr(QX_2). By strong duality of the individual programs, Tr(RX_1)Tr(QX_2) = Tr(Y_1)Tr(Y_1'), so the proposed primal and dual points are optimal. This completes the proof.

It is interesting to note that we do not know whether cv_PPT is multiplicative when only one of the channels is co-positive, which would be a stronger claim. This is relevant because we conjecture that cv(N ⊗ M) is multiplicative if either of the channels is entanglement breaking. The analogous statement is known to hold for maximal p-norms for p ≥ 1 [44], but even the weaker case of multiplicativity in which both channels are entanglement-breaking remains open; it would mirror Corollary 6, but for separable Choi matrices and the separable cone optimization.
Relation to the k-Symmetric Extendable Cone: Given the multiplicativity of tensors of PPT operators for cv_PPT, one might hope this property extends to cv_{Sym_k} with k-symmetric extendable operators, where an operator Ω_{AB} belongs to Sym_k if it admits an extension Ω_{AB_1…B_k} that is invariant under permutations of the B systems and satisfies Tr_{B_2…B_k}[Ω_{AB_1…B_k}] = Ω_{AB}. Note that the k-symmetric extendable operators form a cone defined by semidefinite constraints. Moreover, it is known that lim_{k→∞} Sym_k = SEP(A : B) [43]. One can then attempt to extend Theorem 8 to this setting by deriving the dual program for cv_{Sym_k}: min: Tr(W), where the indexing of π_j is given by a chosen bijection between the index set [k!] and the permutations in S_k. However, the proof method for Theorem 8 does not seem to extend naturally, due to the permutations of the spaces.

B. Numerical Evaluation of the Communication Value
To numerically support this work, we developed the CVChannel.jl software package, which is publicly available on GitHub [45]. This Julia [46] software package provides tools for bounding the communication value of quantum channels and certifying their non-multiplicativity. Our software is built upon the disciplined convex programming package Convex.jl [47], and our numerical results are produced using the splitting conic solver (SCS) [48]. For more details, the curious reader should review the software documentation and source code found in our GitHub repository [45].
The communication value is difficult to compute in general, but it can be bounded relatively efficiently. CVChannel.jl provides the following methods for bounding cv(N). An upper bound on cv(N) is computed via the dual formulation of the PPT relaxation of the communication value, Eq. (103): cv(N) ≤ cv_PPT(N).
While cv_PPT(N) is a natural upper bound on cv(N), we consider the dual specifically so that we take a conservative approach to numerical error: numerical error in minimizing the dual results in a looser upper bound. In general, when considering upper bounds we work with the dual problem, and when considering lower bounds we work with the primal problem. Since the SDP satisfies strong duality, this practice minimizes false positives.

For a lower bound on cv(N), we take a biconvex optimization approach to the problem. This "see-saw" technique has been applied to similar problems in [49], [50], although our implementation remains distinct. To begin, an ensemble of pure quantum states {|ψ_x⟩}_{x=1}^n is initialized at random according to the Haar measure. Then, the following procedure is iterated:
1) With the states fixed, the POVM measurement {Π_y}_y is numerically optimized as a semidefinite program.
2) With the optimal measurement {Π_y} fixed, we compute the optimal ensemble of quantum states {ρ_x}, taking each ρ_x to be a maximal-eigenvalue eigenstate of N^†(Π_x), where N^† is the adjoint channel and ||·||_∞ denotes the largest eigenvalue; the resulting score is ∑_x ||N^†(Π_x)||_∞.
Repeating this procedure results in a set of optimized states and measurements. To improve the see-saw optimization, the procedure is simply performed many times with randomly initialized states. Combining these techniques, we numerically bound the communication value from both sides. To numerically certify that quantum channels N and M are non-multiplicative, we need to compute a see-saw lower bound on cv(N ⊗ M) exceeding cv_PPT(N) · cv_PPT(M) + ε, where ε > 0 is a conservative bound on the numerical error. One drawback of this procedure is its susceptibility to false negatives, due to the fact that the PPT relaxation is a loose upper bound on the communication value.
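The state-update step of the see-saw can be sketched in a few lines (a minimal illustration in Python rather than the package's Julia; the helper names adjoint_apply and state_update are ours). Given a fixed POVM, the optimal signal states are top eigenvectors of the adjoint channel applied to each POVM element, and the achieved score is ∑_x ||N^†(Π_x)||_∞:

```python
import numpy as np

def adjoint_apply(kraus, X):
    """Adjoint channel: N^dag(X) = sum_k K_k^dag X K_k."""
    return sum(K.conj().T @ X @ K for K in kraus)

def state_update(kraus, povm):
    """See-saw state step: for each POVM element Pi_x, the optimal state is
    the top eigenvector of N^dag(Pi_x); returns states and the score."""
    states, score = [], 0.0
    for Pi in povm:
        w, v = np.linalg.eigh(adjoint_apply(kraus, Pi))
        psi = v[:, -1]                       # eigenvector of largest eigenvalue
        states.append(np.outer(psi, psi.conj()))
        score += w[-1]                       # ||N^dag(Pi_x)||_inf
    return states, score

# example: qubit dephasing channel with the computational-basis POVM
p = 0.2
Z = np.diag([1.0, -1.0])
kraus = [np.sqrt(1 - p) * np.eye(2), np.sqrt(p) * Z]
povm = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
_, score = state_update(kraus, povm)
print(score)  # lower bound on cv; equals 2 for the dephasing channel
```

In the full routine this step alternates with an SDP over the POVM; here one iteration already attains the dephasing channel's cv of 2.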

C. Examples
Having established properties of the PPT relaxation of the communication value, we investigate channels which are known in other settings to admit non-multiplicative behaviour. In particular, we look at the family of Werner-Holevo channels [32], the dephrasure channel [51], and the Siddhu channel [52]. We see that the Werner-Holevo channel is non-multiplicative over a range of parameters, but the dephrasure and Siddhu channels, which are known for their superactivation of coherent information, are always multiplicative for the communication value. In some sense this should not be surprising, as the communication value captures a notion of using the quantum channel to transmit classical information, whereas the coherent information measures the ability to transmit quantum information. Nonetheless, it exemplifies how different the coherent information and the communication value are as measures.
Werner-Holevo Channels: In Section IV-B, we showed how to determine the cv for the Werner-Holevo channels. In this section we extend that method to construct a linear program (LP) for determining cv_PPT for n Werner-Holevo channels run in parallel. We then use this to show non-multiplicativity of cv(W_{d,λ} ⊗ W_{d,λ}) as a function of λ, as well as the non-multiplicativity of cv_PPT for more copies of the channel. We note our derivation assumes the dimension is the same for all channels, but a generalization is straightforward.

Proposition 7. For n Werner-Holevo channels there is a linear program, max{⟨a, c⟩ : Ac ≥ 0, Bc ≥ 0, ⟨g, c⟩ = 1}, which obtains the value of cv_PPT(⊗_{i=1}^n W_{d,λ_i}). Moreover, there exists an algorithm to generate the constraint data a, A, B, g which takes at most O(n 2^{2n}) steps.
Derivation of Constraints. Let Π_0 := Π_+ and Π_1 := Π_−; this labelling will simplify notation. We are interested in cv_PPT(⊗_{i=1}^n W_{d,λ_i}). Recalling the objective function of (102), we can twirl Ω by moving the symmetry of the Werner-Holevo channels onto Ω. This results in Ω = ∑_{s∈{0,1}^n} c_s R_s, where the R_s are the invariant basis operators built from F_0 and F_1. The constraint on the sign arises because F_0 = I = Π_0 + Π_1 and F_1 = Π_0 − Π_1, so a term is negative iff s(i) = j(i) = 1. Combining these, we can express Ω as a linear combination of orthogonal subspaces with coefficients stored in a vector c ∈ R^{2^n}. With the operator decomposed into mutually orthogonal subspaces, we just need to convert the constraints of (102) into constraints on c.
Guaranteeing positivity of Ω is equivalent to guaranteeing that the weight on each orthogonal subspace in (113) is non-negative. As multiple elements of c can place weight on multiple subspaces, the constraint is that the relevant linear combination of c is non-negative for each subspace. Thus the positivity constraints may be written as Ac ≥ 0, where A ∈ R^{2^n × 2^n} is the matrix storing the sign information (−1)^{s(i)∧j(i)} for all s, j.
The PPT constraints correspond to Ω^Γ ≥ 0. Noting that F^Γ = Φ^+, the unnormalized maximally entangled operator, we have Ω^Γ = ∑_{s∈{0,1}^n} c_s ⊗_{i∈[n]} X_{s(i)}, where the operators X_{s(i)} are defined in terms of Φ^+ and its orthogonal complement Φ^⊥. In other words, we have decomposed Ω^Γ into a linear combination of a set of orthogonal subspaces.¹ Again, we only need to store the constraints on c, which in this case amounts to the power of d involved and whether the coefficient is zero. By the definition of X_{s(i)}, there is no weight on a subspace for c_s iff the i-th element in the tensor is Φ^⊥ and s(i) = 1; otherwise the weight is given by d^{−(2^n − w(s))}, where w(·) is the Hamming weight of the string s. Thus the PPT constraints may be written as Bc ≥ 0, where B ∈ R^{2^n × 2^n}.
Recalling Ω = ∑_{s∈{0,1}^n} c_s R_s together with Tr_A(F) = I_B and Tr_A(I) = d I_B, the partial trace condition reduces to ⟨g, c⟩ = 1, where g ∈ R^{2^n} and g(s) = d^{n−w(s)}.
Finally, we have the objective function. We write it as an inner product ⟨a, c⟩, where ϕ_i(j(i)) is the same as ζ_i(j(i)) except without the normalization constant; we may thus define a as the argument of the large parentheses. This completes the derivation of the LP. Finally, we note that to construct the constraints one needs to run nested loops of sizes 2^n, 2^n, and n, which results in the O(n 2^{2n}) steps of the algorithm.
Using these numerics, we can examine the behaviour of the PPT relaxation of the communication value of the n-fold Werner-Holevo channel (Fig. 3). We see that the non-multiplicativity over the PPT cone grows exponentially with n and that all non-multiplicativity dies out at λ = 0.3 in every case. We note it is known that for the tensor product of two Werner states, the space of PPT operators is the same as the space of separable operators. In this case, we therefore see non-multiplicativity of the true communication value for the Werner-Holevo channels.
1) PPT Relaxation of Werner-Holevo with the Identity: An immediate corollary of Theorem 4 is that the Werner-Holevo channel run in parallel with the identity channel of any dimension is multiplicative; that is, cv(W_{d,λ} ⊗ id_d) = d · cv(W_{d,λ}). However, here we find that this is not the case for cv_PPT, which is non-multiplicative, exhibiting a clear separation between the cv and its relaxation. This separation is shown for W_{d,0} ⊗ id_d in Fig. 5 and is determined using the following proposition.

(Figure caption: here we see the non-multiplicativity of tensoring the Werner-Holevo channel with itself; the blue line characterizes the multiplicativity of cv rather than just cv_PPT.)

Proposition 8. The PPT communication value of the Werner-Holevo channel run in parallel with an identity channel, cv_PPT(W_{d,λ} ⊗ id_d), is given by a linear program.

Derivation. The derivation is similar to the previous cv_PPT LP derivations; we additionally use VV-covariance for the identity channel. Consider the channel W_{d,λ} ⊗ id_d, where W_{d,λ} is defined in (47). Then J_W ⊗ φ^+_d is UUVV-covariant, and so any feasible operator σ_{AB} in the cv_PPT SDP may be twirled accordingly. This shows the gap between cv(W_{d,λ} ⊗ id_d) and cv_PPT(W_{d,λ} ⊗ id_d).

Dephrasure Channel: We next consider the dephrasure channel N_{p,q}(ρ) = (1 − q)((1 − p)ρ + p ZρZ) + q Tr[ρ] |e⟩⟨e|, where p, q ∈ [0, 1]. The interesting aspect of the dephrasure channel is that in some parameter regime it admits superadditivity of coherent information [51]. We first present its communication value, cv(N_{p,q}) = cv_PPT(N_{p,q}) = 2 − q.
Proof. We prove this by constructing feasible operators in the primal and dual problems which achieve this value. First we note the Choi matrix, whose only off-diagonal entries are γ := (1 − q)(1 − 2p). For the primal problem, we may choose X = |00⟩⟨00| + |11⟩⟨11| + ½ I_A ⊗ |e⟩⟨e|. This clearly satisfies Tr_A(X) = I_B, it is PPT as it is diagonal, and ⟨X, J(N_{p,q})⟩ = 2 − q. For the dual problem, let Y_1 and Y_2 be chosen appropriately. Then Y_1 is clearly Hermitian, and Y_2 ≥ 0, as its eigenvalues are κ ± γ ≥ |γ| ± γ ≥ 0 together with 0 of multiplicity 4. One may then calculate from these expressions that the dual constraints are satisfied, so we have constructed a feasible choice. Finally, Tr(Y_1) = 2 − q completes the proof.
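The primal feasibility check can be made concrete numerically. The sketch below (our own illustration; the dephrasure definition follows [51], and the diagonal primal point is one valid choice consistent with the proof) builds the Choi matrix of N_{p,q} and evaluates the diagonal feasible point against it:

```python
import numpy as np

def choi_dephrasure(p, q):
    """Choi matrix of N_{p,q}(rho) = (1-q)((1-p)rho + p Z rho Z)
    + q Tr[rho] |e><e|, with output space span{|0>, |1>, |e>}."""
    Z = np.diag([1.0, -1.0])
    e = np.zeros((3, 3)); e[2, 2] = 1.0
    def N(rho):
        out = np.zeros((3, 3), complex)
        out[:2, :2] = (1 - q) * ((1 - p) * rho + p * Z @ rho @ Z)
        return out + q * np.trace(rho) * e
    J = np.zeros((6, 6), complex)
    for i in range(2):
        for j in range(2):
            E = np.zeros((2, 2)); E[i, j] = 1.0
            J[3 * i:3 * i + 3, 3 * j:3 * j + 3] = N(E)
    return J

p, q = 0.3, 0.4
J = choi_dephrasure(p, q)
# diagonal primal point: X = |00><00| + |11><11| + (I_A / 2) (x) |e><e|
X = np.zeros((6, 6))
X[0, 0] = X[4, 4] = 1.0       # |0>_A|0>_B and |1>_A|1>_B
X[2, 2] = X[5, 5] = 0.5       # half weight on the erasure flag
val = np.trace(X @ J).real
print(val)  # 2 - q
```

The partial trace of X over A is the 3×3 identity, and the achieved value 2 − q matches the dual bound Tr(Y_1) = 2 − q in the proof.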
Note that the above implies the 'dephasing' aspect of the dephrasure channel is irrelevant. This is in some sense intuitive, as the dephasing cannot hurt the classical information if the optimal strategy sends data in the classical basis. Indeed, it is easy to see that the above value is achieved using the signal states {|0⟩⟨0|, |1⟩⟨1|} and the projective-measurement decoder {|0⟩⟨0| + ½|e⟩⟨e|, |1⟩⟨1| + ½|e⟩⟨e|}: for either signal state one guesses correctly with probability (1 − q) + q/2, conditioned on the state sent. As one might expect, in such a situation the communication value of the channel is multiplicative with itself. As we require an upper bound, we verify this by an exhaustive numerical search using the dual problem of cv_PPT. Theorem 9. cv(N_{p,q}^{⊗2}) = cv(N_{p,q})^2, i.e. the dephrasure channel's communication value is multiplicative.

Proof.
A search over the dual problem of cv_PPT(N_{p,q}^{⊗2}) for p, q on the grid {0, 0.01, …, 1} always returns a value within numerical error of cv(N_{p,q})^2. As the dual problem always yields an upper bound on cv_PPT, and cv_PPT is an upper bound on cv, we may conclude that the dephrasure channel's communication value is multiplicative.
Siddhu Channel: Finally, we consider the family of channels N_s defined in [52], with parameter s ∈ [0, 1/2]. This channel is known to have non-additive coherent information over its entire parameter range when tensored with itself. However, we now show that the communication value of the channel is multiplicative with itself over the whole range. Proof. As with the dephrasure channel, we prove this by constructing upper and lower bounds that coincide.
For an upper bound, we consider the dual problem (103) of cv_PPT. First note that the Choi matrix can be written in a form that is manifestly positive semidefinite, having eigenvalues 2 and 0 with multiplicity eight. It is then easy to determine suitable dual operators, where δ = α − √s and ϵ = β − √(1 − s). One may verify that the resulting operator has eigenvalues 1 and 0 with multiplicity eight. Thus it is feasible and Tr(Y_1) = 2. Noting that cv(N_s) ≤ cv_PPT(N_s) and combining with the matching lower bound, we have 2 ≤ cv(N_s) ≤ 2, which completes the proof.

VIII. RELATIONSHIP TO CAPACITIES AND NO-SIGNALLING
As noted in the introduction, cv(N) captures the classical communication cost to perfectly simulate every classical channel induced by N using non-signalling (NS) resources. This is because, for a classical channel P, the one-shot classical communication cost for zero-error simulation with classical NS resources, κ_0^NS, is given by ∑_y max_x p(y|x) [1, Theorem 16]. Noting that cv(P) = ∑_y max_x p(y|x), it follows that κ_0^NS = cv(P). Furthermore, due to the multiplicativity of cv for classical channels, the no-signalling-assisted zero-error simulation capacity is also given by κ_0^NS, as was remarked in the original paper. Moreover, it is easy to show that the classical capacity of a classical channel is bounded as [1, Remark 17] C(P) ≤ log cv(P) = χ_max(P), where we have used Theorem 1 in the last equality. Losing the single-letter property, it is easy to generalize this bound to arbitrary quantum channels by using the Holevo-Schumacher-Westmoreland theorem [53], [54], and whenever N satisfies weak multiplicativity for cv, such as for entanglement-breaking channels, this reduces to a single-letter upper bound.
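For classical channels the quantities above are easy to compute directly; cv(P) = ∑_y max_x p(y|x) is just a column-maximum sum, and multiplicativity under tensor products can be checked with a Kronecker product. A short sketch (the helper name classical_cv is ours):

```python
import numpy as np

def classical_cv(P):
    """cv of a classical channel given as a matrix P[y, x] = p(y|x):
    sum over outputs y of the best input x."""
    return P.max(axis=1).sum()

# binary symmetric channel with flip probability 0.1
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
print(classical_cv(P))                 # 1.8; guessing success = cv/n = 0.9
print(classical_cv(np.kron(P, P)))     # 1.8**2: cv is multiplicative here
```

The Kronecker product of the transition matrices is exactly the parallel use of the two channels, which makes the classical multiplicativity statement easy to test for any pair of stochastic matrices.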
In the entanglement-assisted regime, these relationships persist. First we recall that the SDP for min-entropy is multiplicative, and so CV*(N) = cv*(N) for an arbitrary quantum channel N. This aligns with the fact that the entanglement-assisted capacity of a quantum channel, C_E(N), is single-letter while the unassisted capacity is not. Continuing the parallels, cv*(N) gives the classical communication cost to perfectly simulate N with a quantum no-signalling resource [2]. Given the above, a natural question is whether one can find bounds on the entanglement-assisted capacity C_E(N) in terms of cv*(N). Indeed, this can be done by using the definition of cv* and the fact that cv* is characterized by minimal error discrimination (as in Eq. (31)): cv*(N) = sup_ρ |X| exp(−H_min(X|BA')_{(id_X ⊗ N ⊗ id_{A'})(ρ)}), where the supremum is over states ρ_{XAA'} such that ρ_X is uniform and the state is homogeneous on register A [55], [56]. It follows by the same manipulations used in Eq. (31) that log cv*(N) = χ_{E,max}(N), where χ_{E,max} is the entanglement-assisted max-Holevo information, which is straightforward to define using [27], [55], [56]. Since the entanglement-assisted capacity equals the regularized entanglement-assisted Holevo information, we conclude that C_E(N) ≤ log cv*(N) = χ_{E,max}(N), where the regularization disappears because cv*(N) is always multiplicative.