The platypus of the quantum channel zoo

Understanding quantum channels and the strange behavior of their capacities is a key objective of quantum information theory. Here we study a remarkably simple, low-dimensional, single-parameter family of quantum channels with exotic quantum information-theoretic features. As the simplest example from this family, we focus on a qutrit-to-qutrit channel that is intuitively obtained by hybridizing a simple degradable channel and a completely useless qubit channel. Such hybridizing makes this channel's capacities behave in a variety of interesting ways. For instance, the private and classical capacities of this channel coincide and can be explicitly calculated, even though the channel does not belong to any class for which the underlying information quantities are known to be additive. Moreover, the quantum capacity of the channel can be computed explicitly, provided that a clear and compelling conjecture is true. This "spin alignment conjecture," which may be of independent interest, is proved in certain special cases, and additional numerical evidence for its validity is provided. Finally, we generalize the qutrit channel in two ways, and the resulting channels and their capacities display similarly rich behavior. In the companion paper [Phys. Rev. Lett. 130, 200801 (2023); arXiv:2202.08377], we further show that the qutrit channel demonstrates superadditivity when transmitting quantum information jointly with a variety of assisting channels, in a manner unknown before.


Introduction
Quantum channels model noisy communication links between quantum parties. The channel noise affecting signals can be mitigated by encoding the messages across many channel uses. The highest rate at which information can be sent reliably is known as a capacity of the channel. Depending on the type of information to be transmitted, we obtain different capacity quantities; for example, a quantum channel may be used to transmit classical, quantum, or private classical information. The capacities in each case are the classical capacity C, the quantum capacity Q, and the private classical capacity P, measured in bits per channel use, qubits per channel use, and private bits per channel use, respectively. The various capacities of a quantum channel quantify its usefulness in the respective communication setting.
There is a variety of synergy effects that may occur during a quantum channel's transmission of the various types of (quantum, private, classical) information. These include super-additivity of coherent information [45,13], private information [51], Holevo information [21], superactivation of quantum capacity [53], and private communication at a rate above the quantum capacity [27,25,36] (see Section 2 for details). These nonadditive effects enable exciting and novel communication protocols, but at the same time they obscure a succinct mathematical characterization of the corresponding quantum channel capacities.
Remarkably, such nonadditivities appear to be common; some are exhibited even for simple channels such as the depolarizing channel. On the other hand, for certain classes of quantum channels the information quantities listed above can be additive, thus simplifying their information-theoretical characterization. Unfortunately this is only known to be true in a few special cases. For example, the coherent information of a PPT channel is additive and indeed zero [27]. The only other channel classes with known additive coherent information are the degradable [11] and antidegradable channels, which as a result have a so-called single-letter formula for their quantum capacity. These channels also have the pleasing property of having their private capacity equal to the quantum capacity [50]; as a result, also the private capacity is given by a single-letter formula for these channels. There exists a smattering of channels whose Holevo information is additive, and therefore these channels have a single-letter formula for their classical capacity [28,30,42,31]. However, beyond special examples of proven nonadditivities and additivities, little is known about most capacities of most channels.
The best path towards a deeper understanding of nonadditivity effects in quantum information (in fact, a better understanding of quantum information itself) is to better understand and develop the menagerie of these phenomena. However, clean and clear examples of channels that isolate different aspects of nonadditivity are in short supply. As a result, over the past two decades significant effort has been dedicated to elucidating these phenomena, leading to numerous exciting findings; yet a full understanding still remains elusive. Without such understanding, we lack a theory on how to best communicate with quantum channels, and fail to answer the kinds of questions resolved in classical information theory. This is substantiated by the fact that random codes can be suboptimal; that we cannot evaluate capacities beyond special examples; that the known capacity quantities may not fully capture the communication potential of a noisy channel; and that our understanding of error correction in the quantum setting is incomplete, whether the data is classical, private, or quantum. This paper studies novel examples of additivity based on combining two well-understood but very different classes of channels. In the simplest case, we combine two qubit channels in a qutrit channel so that their inputs overlap along one dimension. The superposed identities of the subchannels used in the transmission convey additional information to the receiver and thus become an integral part of the communication sent.
Our channel is constructed as follows. We start with a degradable channel, which we know has single-letter quantum and private capacities. We then explicitly break degradability by adding an extra input state that lets the sender transmit quantum information directly to the environment but allows no additional information to be sent to the output. This makes it impossible for the channel output to simulate the environment as required for degradability, and thus takes the channel outside any class known to have additive private or coherent information. Nevertheless, the coherent information appears to remain additive, which we can show up to a very reasonable conjecture. Even more surprisingly, the private information remains additive but takes a much larger value than that of the original channel. This difference in private and quantum capacity is a clear signature of the nondegradability of the channel. The apparent additivity of coherent information must therefore be coming from some new mechanism of additivity, which we seek to understand.

Main results
We study a remarkably simple, single-parameter family of qutrit channels N s , along with generalizations M d and O of this channel family to arbitrary dimension d. These channel families exhibit many strange behaviors for quantum communication while having uncomplicated classical and private classical capacities: The classical and private classical capacities can be calculated explicitly because the underlying information quantities (Holevo and private information, respectively) are additive, even though neither channel family belongs to any of the known additivity classes. The same holds true for the quantum capacity: the coherent informations of these families are additive, provided that a certain entropy minimization conjecture is true. We give evidence for the validity of this "spin alignment conjecture" in the main text. Despite the additivity of the capacities, both channel families have strictly larger private capacity than quantum capacity. The simplicity of our channel families enables us to further generalize the study to a 3-parameter family of channels displaying rich behavior, including separations between all three capacities, which we analyze numerically.
The results in the current paper are even more surprising in the light of additional findings in a companion paper [34]: The coherent information of the channel N s tensored with an assisting channel is super-additive, for a large swath of values of s and for some generically chosen assisting channel. The super-additivity can be lifted to quantum capacity for degradable assisting channels if the spin alignment conjecture holds. The assisting channel can have positive or vanishing quantum capacity. The mechanism behind this superadditivity is novel and in particular differs from the known explanation of super-activation [53,38]. For the d-dimensional generalization M d of N 1/2 , we focus on super-additivity with a (d−1)-dimensional erasure channel and recover results similar to those for N s . In addition, even stronger qualitative results can be obtained. First, super-additivity of quantum capacity can be proved unconditionally (without the spin alignment conjecture). Second, the effect holds for all nontrivial values of the erasure probability, for an appropriately large local dimension d.

Structure of the paper
This paper is structured as follows. In Section 2 we give some general background on quantum channels, their various capacities, and special classes of channels. The main objects of this paper, the channel N s and its d-dimensional generalizations M d and O, are defined in Section 3. The channel coherent information and the quantum capacity of these channels are discussed in Sections 4 and 5, respectively. Section 6 formulates the spin alignment conjecture relative to which the coherent information of N s is additive. We then derive bounds on the quantum capacity of N s and M d in Section 7, and determine their private and classical capacities exactly in Section 8. Section 9 summarizes and further discusses the results on the capacities of N s . Finally, we present the 3-parameter generalization of N s and a numerical analysis of its capacities in Section 10. MATLAB and Python code used to obtain the numerical results mentioned above will be made available at [19].

Quantum channels
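Let $\mathcal{H}$ be a Hilbert space of finite dimension $d$. Let $\mathcal{H}^\dagger$ be the dual of $\mathcal{H}$, and let $\hat{\mathcal{H}} \cong \mathcal{H} \otimes \mathcal{H}^\dagger$ be the space of linear operators acting on $\mathcal{H}$. Let $\mathcal{H}_a$, $\mathcal{H}_b$, and $\mathcal{H}_c$ be three Hilbert spaces of dimensions $d_a$, $d_b$, and $d_c$, respectively. An isometry $E: \mathcal{H}_a \to \mathcal{H}_b \otimes \mathcal{H}_c$, i.e., a map satisfying $E^\dagger E = I_a$ (the identity on $\mathcal{H}_a$), takes an input Hilbert space $\mathcal{H}_a$ to a subspace of a pair of output spaces $\mathcal{H}_b \otimes \mathcal{H}_c$. This isometry generates a quantum channel pair $(\mathcal{B}, \mathcal{B}^c)$, i.e., a pair of completely positive trace preserving (CPTP) maps, with superoperators
$$\mathcal{B}(X) = \mathrm{Tr}_c(E X E^\dagger), \qquad \mathcal{B}^c(X) = \mathrm{Tr}_b(E X E^\dagger), \tag{1}$$
that take any element $X \in \hat{\mathcal{H}}_a$ to $\hat{\mathcal{H}}_b$ and $\hat{\mathcal{H}}_c$, respectively. Each channel in this pair $(\mathcal{B}, \mathcal{B}^c)$ may be called the complement of the other. The isometry $E$ can be written as $E = \sum_i K_i \otimes |i\rangle$, where $\{|i\rangle\}$ is an orthonormal basis of $\mathcal{H}_c$ and $K_i: \mathcal{H}_a \to \mathcal{H}_b$ are Kraus operators of $\mathcal{B}$. If the input of the isometry $E$ is restricted to a subspace $\mathcal{H}_{\bar a}$ of $\mathcal{H}_a$, then such a restricted map is still an isometry on $\mathcal{H}_{\bar a}$ and defines a pair of channels $(\bar{\mathcal{B}}, \bar{\mathcal{B}}^c)$, where $\bar{\mathcal{B}}$ and $\bar{\mathcal{B}}^c$ are called sub-channels of $\mathcal{B}$ and $\mathcal{B}^c$, respectively. When focusing on some quantum channel $\mathcal{B}$, it is common to refer to $\mathcal{H}_a$, $\mathcal{H}_b$, and $\mathcal{H}_c$ as the channel input, output, and environment, respectively. Any CPTP map (together with its complement) may be written as (1) in terms of a suitable isometry $E$.

Another representation of a CPTP map comes from its Choi-Jamiołkowski operator. To define this operator, consider a linear map $\mathcal{B}: \hat{\mathcal{H}}_a \to \hat{\mathcal{H}}_b$ and a maximally entangled state,
$$|\phi\rangle = \frac{1}{\sqrt{d_a}} \sum_{i} |i\rangle \otimes |i\rangle, \tag{2}$$
on $\mathcal{H}_a \otimes \mathcal{H}_a$. The unnormalized Choi-Jamiołkowski operator of $\mathcal{B}$ is
$$J^{\mathcal{B}}_{ab} = d_a\, (\mathcal{I}_a \otimes \mathcal{B})(|\phi\rangle\langle\phi|), \tag{3}$$
where $\mathcal{I}_a$ denotes the identity map acting on $\hat{\mathcal{H}}_a$. The linear map $\mathcal{B}$ is CPTP if and only if the above operator is positive semidefinite and its partial trace over $\mathcal{H}_b$ is the identity $I_a$ on $\mathcal{H}_a$.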
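The two representations above can be illustrated numerically. The following is a minimal sketch, not the code referred to in [19]: it builds a complementary channel pair from an isometry by partial traces and assembles the unnormalized Choi-Jamiołkowski operator block by block. The helper names `channel_pair` and `choi`, the index convention (rows of `E` ordered as output ⊗ environment), and the amplitude-damping isometry `E_ad` used as a test case are assumptions made purely for illustration.

```python
import numpy as np

def channel_pair(E, d_b, d_c):
    """Return (B, Bc): X -> Tr_c(E X E^dagger) and X -> Tr_b(E X E^dagger).

    E has shape (d_b*d_c, d_a); its row index is b*d_c + c.
    """
    def B(X):
        Y = (E @ X @ E.conj().T).reshape(d_b, d_c, d_b, d_c)
        return np.einsum('ikjk->ij', Y)      # trace out the environment c
    def Bc(X):
        Y = (E @ X @ E.conj().T).reshape(d_b, d_c, d_b, d_c)
        return np.einsum('kikj->ij', Y)      # trace out the output b
    return B, Bc

def choi(B, d_a):
    """Unnormalized Choi operator sum_ij |i><j| (x) B(|i><j|) on H_a (x) H_b."""
    blocks = []
    for i in range(d_a):
        row = []
        for j in range(d_a):
            X = np.zeros((d_a, d_a), dtype=complex)
            X[i, j] = 1.0
            row.append(B(X))
        blocks.append(row)
    return np.block(blocks)

# Illustrative test case (an assumption, not a channel from this paper):
# qubit amplitude damping with damping probability g.
g = 0.2
E_ad = np.zeros((4, 2))
E_ad[0, 0] = 1.0              # |0> -> |0>_b |0>_c
E_ad[2, 1] = np.sqrt(1 - g)   # |1> -> sqrt(1-g) |1>_b |0>_c
E_ad[1, 1] = np.sqrt(g)       #        + sqrt(g) |0>_b |1>_c

B, Bc = channel_pair(E_ad, 2, 2)
J = choi(B, 2)
# CPTP check: J is positive semidefinite and Tr_b J equals the identity on H_a.
is_cp = np.linalg.eigvalsh(J).min() > -1e-12
is_tp = np.allclose(np.einsum('ikjk->ij', J.reshape(2, 2, 2, 2)), np.eye(2))
```

Working from the isometry rather than from a Kraus list gives the channel and its complement from the same object, which is the form used throughout the rest of the text.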

Quantum capacity
The quantum capacity $Q(\mathcal{B})$ of a quantum channel $\mathcal{B}: \hat{\mathcal{H}}_a \to \hat{\mathcal{H}}_b$ is defined as the largest rate at which quantum information can be sent faithfully through the channel. It can be expressed in terms of an entropic quantity as follows. Let $\rho_a$ denote a density operator (unit-trace positive semidefinite operator) on $\mathcal{H}_a$, and for any $\rho_a$ let $\rho_b := \mathcal{B}(\rho_a)$ and $\rho_c := \mathcal{B}^c(\rho_a)$. The coherent information (or entropy bias) of a channel $\mathcal{B}$ at a density operator $\rho_a$ is
$$\Delta(\mathcal{B}, \rho_a) := S(\rho_b) - S(\rho_c), \tag{4}$$
where $S(\rho) = -\mathrm{Tr}(\rho \log \rho)$ (we use log base 2 by default) is the von Neumann entropy of $\rho$. The channel coherent information (sometimes called the single-letter coherent information),
$$Q^{(1)}(\mathcal{B}) := \max_{\rho_a} \Delta(\mathcal{B}, \rho_a), \tag{5}$$
represents an achievable rate for sending quantum information across the channel $\mathcal{B}$, and hence $Q(\mathcal{B}) \ge Q^{(1)}(\mathcal{B})$ [37,44,10]. The maximum achievable rate is equal to the quantum capacity of $\mathcal{B}$, and is given by a multi-letter formula (sometimes called a regularized expression) [2,1,37,44,10],
$$Q(\mathcal{B}) = \lim_{n \to \infty} \frac{1}{n} Q^{(1)}(\mathcal{B}^{\otimes n}), \tag{6}$$
where $\mathcal{B}^{\otimes n}$ represents $n \in \mathbb{N}$ parallel (sometimes called joint) uses of $\mathcal{B}$. The regularization in (6) is necessary because the channel coherent information is super-additive, i.e., for any two quantum channels $\mathcal{B}$ and $\mathcal{B}'$ used together, the channel coherent information of the joint channel $\mathcal{B} \otimes \mathcal{B}'$ satisfies an inequality,
$$Q^{(1)}(\mathcal{B} \otimes \mathcal{B}') \ge Q^{(1)}(\mathcal{B}) + Q^{(1)}(\mathcal{B}'), \tag{7}$$
which can be strict [13,17,52,53,35,3,4,49,47,46]. The coherent information $Q^{(1)}(\mathcal{B})$ is said to be weakly additive if equality holds in (7) whenever $\mathcal{B}'$ is a tensor power of $\mathcal{B}$. If this equality holds for arbitrary $\mathcal{B}'$ then $Q^{(1)}(\mathcal{B})$ is said to be strongly additive.
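The entropy bias is straightforward to evaluate once the channel isometry is known. The sketch below is an illustration under stated assumptions rather than the authors' implementation: `entropy` and `entropy_bias` are hypothetical helper names, and the amplitude-damping isometry `E_ad` is an assumed test case that does not appear in this paper.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits; eigenvalues below 1e-12 are treated as zero."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def entropy_bias(E, d_b, d_c, rho_a):
    """S(rho_b) - S(rho_c) for the channel pair generated by the isometry E."""
    Y = (E @ rho_a @ E.conj().T).reshape(d_b, d_c, d_b, d_c)
    rho_b = np.einsum('ikjk->ij', Y)   # output marginal
    rho_c = np.einsum('kikj->ij', Y)   # environment marginal
    return entropy(rho_b) - entropy(rho_c)

# Assumed test case: qubit amplitude damping with damping probability 0.2.
g = 0.2
E_ad = np.zeros((4, 2))
E_ad[0, 0] = 1.0
E_ad[2, 1] = np.sqrt(1 - g)
E_ad[1, 1] = np.sqrt(g)

bias_mixed = entropy_bias(E_ad, 2, 2, np.eye(2) / 2)

# For a pure input the joint output state is pure, so both marginals
# have the same entropy and the bias vanishes.
psi = np.array([1.0, 1.0]) / np.sqrt(2)
bias_pure = entropy_bias(E_ad, 2, 2, np.outer(psi, psi))

# Noiseless qubit channel (trivial environment): the bias at I/2 is one bit.
bias_ideal = entropy_bias(np.eye(2), 2, 1, np.eye(2) / 2)
```

Maximizing `entropy_bias` over all input states would give the single-letter quantity; for a general channel that maximization is not concave, which is why the later sections first reduce it to a one-parameter problem.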

Private capacity
The private capacity $P(\mathcal{B})$ of a quantum channel $\mathcal{B}: \hat{\mathcal{H}}_a \to \hat{\mathcal{H}}_b$ is operationally defined as the largest rate at which classical information can be faithfully sent through the channel in such a way that the environment, $\mathcal{H}_c$, gains no meaningful knowledge about the information being sent. A formula for the private capacity was derived by Cai et al. [8,10] in terms of a quantity called the channel private information:
$$P^{(1)}(\mathcal{B}) := \max_{\{p_x, \rho_a^x\}} \Big[ \Delta(\mathcal{B}, \bar\rho_a) - \sum_x p_x\, \Delta(\mathcal{B}, \rho_a^x) \Big]. \tag{8}$$
Here, $\Delta(\mathcal{B}, \rho)$ denotes the coherent information of the channel $\mathcal{B}$ with respect to the state $\rho$ as defined in (4), the maximization is over all quantum state ensembles $\{p_x, \rho_a^x\}_x$, and $\bar\rho_a = \sum_x p_x \rho_a^x$ denotes the ensemble average of the density operators $\{\rho_a^x\}$ over the probability distribution $\{p_x\}$ ($p_x > 0$ and $\sum_x p_x = 1$). Restricting the maximization in (8) to an ensemble of pure states $\rho_a^x$ makes $\Delta(\mathcal{B}, \rho_a^x) = 0$, reducing (8) to (5), and resulting in a maximum value which simply equals the channel coherent information $Q^{(1)}(\mathcal{B})$. This value is at most $P^{(1)}(\mathcal{B})$, i.e.,
$$Q^{(1)}(\mathcal{B}) \le P^{(1)}(\mathcal{B}). \tag{9}$$
There are channels for which this inequality is strict; for such channels $P^{(1)}(\mathcal{B})$ cannot be obtained using an ensemble of pure states alone.
A channel's private information can also be written as
$$P^{(1)}(\mathcal{B}) = \max_{\{p_x, \rho_a^x\}} \big[ I(x; b)_\sigma - I(x; c)_\sigma \big], \tag{10}$$
where the mutual information at a quantum state $\sigma$, $I(a; b)_\sigma := S(\sigma_a) + S(\sigma_b) - S(\sigma_{ab})$, is evaluated above at
$$\sigma_{xbc} = \sum_x p_x\, [x] \otimes J \rho_a^x J^\dagger, \tag{11}$$
with the label $x$ denoting a classical register, $J$ denoting the isometry (1) of the channel $\mathcal{B}$, and the notation
$$[\psi] := |\psi\rangle\langle\psi| \tag{12}$$
representing a dyad onto any ket $|\psi\rangle$. Cai et al. [8] and Devetak [10] proved that the private information $P^{(1)}(\mathcal{B})$ is an achievable rate for private information transmission, $P(\mathcal{B}) \ge P^{(1)}(\mathcal{B})$, and furthermore that the private capacity is bounded from above by the regularized private information. As a result, we have the following coding theorem for the private capacity [8,10]:
$$P(\mathcal{B}) = \lim_{n \to \infty} \frac{1}{n} P^{(1)}(\mathcal{B}^{\otimes n}). \tag{13}$$
Except for a few special cases such as degradable channels [11,50] (for the definition see Sec. 2.6), the regularization in (13) is required. This requirement arises from super-additivity of $P^{(1)}$ [51], i.e., $P^{(1)}$ satisfies an inequality of the form in (7). From (9) it follows that each term $Q^{(1)}(\mathcal{B}^{\otimes n})$ in the limit (6) is at most $P^{(1)}(\mathcal{B}^{\otimes n})$, and thus a channel's quantum capacity $Q(\mathcal{B})$ is at most its private classical capacity $P(\mathcal{B})$ (13),
$$Q(\mathcal{B}) \le P(\mathcal{B}). \tag{14}$$

Classical capacity
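The classical capacity $C(\mathcal{B})$ of a quantum channel $\mathcal{B}: \hat{\mathcal{H}}_a \to \hat{\mathcal{H}}_b$ is defined as the largest rate at which classical information can be faithfully sent through the channel. In contrast to private information transmission discussed above, there is no security criterion involving the environment, and hence
$$P(\mathcal{B}) \le C(\mathcal{B}). \tag{15}$$
The classical capacity can be expressed in terms of an information quantity called the Holevo information $\chi(\mathcal{B})$:
$$\chi(\mathcal{B}) := \max_{\{p_x, \rho_a^x\}} I(x; b)_\sigma = \max_{\{p_x, \rho_a^x\}} \Big[ S\big(\mathcal{B}(\bar\rho_a)\big) - \sum_x p_x\, S\big(\mathcal{B}(\rho_a^x)\big) \Big], \tag{16}$$
where $\bar\rho_a = \sum_x p_x \rho_a^x$ and $\sigma_{xbc}$ is defined in terms of the quantum state ensemble $\{p_x, \rho_a^x\}_x$ as in (11) above. Quantum states in the maximization (16) of $\chi(\mathcal{B})$ can be chosen to be pure [61]. The classical capacity $C(\mathcal{B})$ of a quantum channel $\mathcal{B}$ can be expressed as [22,41],
$$C(\mathcal{B}) = \lim_{n \to \infty} \frac{1}{n} \chi(\mathcal{B}^{\otimes n}). \tag{17}$$
Again, apart from a few special classes of channels, the regularization of the Holevo information in (17) is necessary due to super-additivity of $\chi(\mathcal{B})$ [21,43]. Comparing (16) with (10) reveals $P^{(1)}(\mathcal{B}) \le \chi(\mathcal{B})$. As a result, each term $P^{(1)}(\mathcal{B}^{\otimes n})$ in the limit (13) is at most $\chi(\mathcal{B}^{\otimes n})$, and thus a channel's private capacity $P(\mathcal{B})$ is at most its classical capacity $C(\mathcal{B})$, as stated in (15).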

Entanglement-assisted classical capacity
Finally, we discuss the entanglement-assisted classical capacity $C_E(\mathcal{B})$ of a quantum channel $\mathcal{B}$, which is defined as the optimal rate of faithful classical information transmission when the sender and receiver have access to unlimited entanglement assisting the encoding and decoding process. It can be expressed in terms of the channel mutual information $I(\mathcal{B})$, defined as
$$I(\mathcal{B}) := \max_{\rho_a} I(a'; b)_\sigma, \tag{18}$$
with $\sigma_{a'b} = (\mathcal{I}_{a'} \otimes \mathcal{B})(\psi_{a'a})$ and $|\psi\rangle_{a'a}$ an (arbitrary) purification of the input state $\rho_a$. Note that $I(a'; b)_\sigma$ is concave in the input state $\rho_a$ and can therefore be computed efficiently [61,16]. The entanglement-assisted classical capacity of a quantum channel is equal to its channel mutual information [6]:
$$C_E(\mathcal{B}) = I(\mathcal{B}), \tag{19}$$
and is an upper bound on $C(\mathcal{B})$ by definition.
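For orientation, the relations among the capacities introduced in this section can be collected in a single chain. The display below only restates (14), (15), the achievability bound $Q \ge Q^{(1)}$, and the remark following (19); it is a summary, not an additional result.

```latex
\begin{equation*}
  Q^{(1)}(\mathcal{B}) \;\le\; Q(\mathcal{B}) \;\le\; P(\mathcal{B})
  \;\le\; C(\mathcal{B}) \;\le\; C_E(\mathcal{B}) \;=\; I(\mathcal{B}).
\end{equation*}
```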

Special channel classes
If a channel maps the identity element at its input to the identity at its output, then the channel is called a unital channel. A channel B is called degradable, and its complement B c anti-degradable, if there is another channel D such that D • B = B c [11,9]. Sometimes this channel D is called the degrading map of the degradable channel B. For any two channels B and B, each either degradable or anti-degradable, the joint channel B ⊗B has additive coherent information, i.e., equality holds in (7) [11,33]. For a degradable channel B, the coherent information ∆(B, ρ a ) is concave in ρ a [65], and thus Q (1) (B) can be computed with relative ease [16,40]. As a result the quantum capacity of a degradable channel, which simply equals Q (1) (B), can also be computed efficiently. An anti-degradable channel has no quantum capacity due to the no-cloning theorem. An instance of an anti-degradable channel is a measure-and-prepare or entanglement-breaking (EB) channel [26]. An EB channel is one whose Choi-Jamiołkowski operator (3) is separable [26]. The complement of an EB channel is called a Hadamard channel [61]. Besides anti-degradable channels, the only other known class of zero-quantum-capacity channels are entanglement binding or positive under partial-transpose (PPT) channels [27]. A channel is PPT if its Choi-Jamiołkowski operator (3) is positive under partial transpose. If a channel B c has zero quantum capacity then its complement B is called a more capable channel. A more capable channel has equal quantum and private capacity. If a more capable channel B has a complement B c with zero private capacity, then B has additive coherent information in the sense that equality holds in eq.
where $0 \le s \le 1/2$. This isometry was introduced previously by one of us in [46] with $|1\rangle$ and $|2\rangle$ in $\mathcal{H}_a$ exchanged. Its Kraus operators are unitarily equivalent to those of a channel $L_\alpha$ introduced even earlier in [56] and studied further in [58]. The isometry can be written as $F_s = \sum_i K_i \otimes |i\rangle$, where the Kraus operators $K_i: \mathcal{H}_a \to \mathcal{H}_b$ match those in Sec. IV.C of [58] if one permutes the computational basis of $\mathcal{H}_a$ as $|0\rangle_a \to |2\rangle_a \to |1\rangle_a \to |0\rangle_a$, exchanges $|1\rangle_b$ and $|2\rangle_b$, and rewrites $s$ as $\sin^2\alpha$, $0 \le \alpha \le \pi/4$. Through an equation of the form (1), the isometry (20) gives rise to a complementary pair of channels $\mathcal{N}_s: \hat{\mathcal{H}}_a \to \hat{\mathcal{H}}_b$ and $\mathcal{N}_s^c: \hat{\mathcal{H}}_a \to \hat{\mathcal{H}}_c$. This channel pair has two simple properties. The first property, proved in Sec. 3.4, is the existence of degradable sub-channels of $\mathcal{N}_s$, obtained by restricting the channel input to operators on a qubit subspace,
$$\mathcal{H}_{a_i} := \mathrm{span}\{|0\rangle, |i\rangle\}, \tag{21}$$
where $i$ is either 1 or 2. This restriction also results in an anti-degradable sub-channel of $\mathcal{N}_s^c$. Quantum states lying solely in this qubit input subspace can be used to send an equal amount of quantum and private information to the $\mathcal{N}_s$ output $\mathcal{H}_b$, but such states cannot be used to send any quantum or private information to the $\mathcal{N}_s^c$ output $\mathcal{H}_c$. The second channel property, also proved in Sec. 3.4, is the presence of a perfect sub-channel of $\mathcal{N}_s^c$, obtained by restricting the channel input to operators on a qubit subspace $\mathcal{H}_{\bar a}$ spanned by $\{|1\rangle, |2\rangle\}$. Quantum states lying solely in this qubit input subspace $\mathcal{H}_{\bar a}$ can be used to perfectly send information to the $\mathcal{N}_s^c$ output $\mathcal{H}_c$ while sending no information to the $\mathcal{N}_s$ output $\mathcal{H}_b$. This perfect transmission of a qubit to $\mathcal{H}_c$ implies that the quantum, private, and classical capacities of $\mathcal{N}_s^c$ are at least 1. Since the dimension of the channel output $\mathcal{H}_c$ is two, all these capacities of $\mathcal{N}_s^c$ are at most 1, thus
$$Q(\mathcal{N}_s^c) = P(\mathcal{N}_s^c) = C(\mathcal{N}_s^c) = 1. \tag{22}$$
Together, the intuitive picture above and the two simple properties of $\mathcal{N}_s$ and $\mathcal{N}_s^c$ help classify each of these channels.
For instance, one can easily infer that each channel in the $(\mathcal{N}_s, \mathcal{N}_s^c)$ pair is neither degradable nor anti-degradable. If $\mathcal{N}_s$ were degradable, then all its sub-channels would also be degradable. However, the sub-channel of $\mathcal{N}_s$ obtained by restricting its input to $\mathcal{H}_{\bar a}$ is anti-degradable. This anti-degradable sub-channel is the complement of a perfect (and hence degradable) sub-channel to $\mathcal{H}_c$ obtained by restricting the channel input to $\mathcal{H}_{\bar a}$. In a similar vein, $\mathcal{N}_s^c$ is not degradable. Consider a sub-channel of $\mathcal{N}_s^c$ obtained by restricting its input to $\mathcal{H}_{a_i}$. This sub-channel is anti-degradable since its complement, a sub-channel of $\mathcal{N}_s$ with input $\mathcal{H}_{a_i}$, is degradable.
Since each channel in the $(\mathcal{N}_s, \mathcal{N}_s^c)$ pair is neither degradable nor anti-degradable, neither channel is EB, because EB channels are anti-degradable. A Hadamard channel is the complement of an EB channel, and since neither channel in the complementary pair $(\mathcal{N}_s, \mathcal{N}_s^c)$ is EB, neither of them is a Hadamard channel. Another interesting class is that of more capable channels. As mentioned in Sec. 2.6, a more capable channel has a complement with no quantum capacity. Neither $\mathcal{N}_s$ nor $\mathcal{N}_s^c$ belongs to this class, because these complementary channels both have non-zero quantum capacity. An argument showing that the quantum capacity of $\mathcal{N}_s^c$ is non-zero was presented previously, see (22). An argument showing that the quantum capacity of $\mathcal{N}_s$ is non-zero is given in Sec. 4 and also in [46]. Since both $\mathcal{N}_s$ and $\mathcal{N}_s^c$ have non-zero quantum capacity, these channels are not PPT, as PPT channels have zero quantum capacity. Finally, it is easy to verify that neither channel in the $(\mathcal{N}_s, \mathcal{N}_s^c)$ pair is unital.

The $\mathcal{N}_s$ channel can also be viewed as a hybrid of two simple qubit-input channels. The first channel, $\mathcal{R}_1$, perfectly maps its input, $\mathcal{H}_1$, to the environment $\mathcal{H}_c$. The second channel, $\mathcal{R}_2^s$ for $s \in [0, 1/2]$, is a degradable channel. To define the first channel, $\mathcal{R}_1: \hat{\mathcal{H}}_1 \to \hat{\mathcal{H}}_b$, we use $\{|1\rangle, |2\rangle\}$ to label an orthonormal basis of $\mathcal{H}_1$. Then an isometry $R_1: \mathcal{H}_1 \to \mathcal{H}_b \otimes \mathcal{H}_c$,
$$R_1|1\rangle = |2\rangle \otimes |0\rangle, \qquad R_1|2\rangle = |2\rangle \otimes |1\rangle, \tag{23}$$
defines $\mathcal{R}_1(X) = \mathrm{Tr}_c(R_1 X R_1^\dagger) = \mathrm{Tr}(X)\,|2\rangle\langle 2|$. Clearly $\mathcal{R}_1$ traces out its input to a fixed pure state while sending everything to the environment. Thus $\mathcal{R}_1$ cannot send any information to its output, $Q(\mathcal{R}_1) = P(\mathcal{R}_1) = C(\mathcal{R}_1) = 0$. To define the second channel $\mathcal{R}_2^s$, we use $\{|0\rangle, |2\rangle\}$ to label an orthonormal basis of $\mathcal{H}_2$. Then $\mathcal{R}_2^s: \hat{\mathcal{H}}_2 \to \hat{\mathcal{H}}_b$ is generated by an isometry $R_2^s: \mathcal{H}_2 \to \mathcal{H}_b \otimes \mathcal{H}_c$,
$$R_2^s|0\rangle = \sqrt{s}\,|0\rangle \otimes |0\rangle + \sqrt{1-s}\,|1\rangle \otimes |1\rangle, \qquad R_2^s|2\rangle = |2\rangle \otimes |1\rangle, \tag{24}$$
where $0 \le s \le 1/2$. One can show that $\mathcal{R}_2^s$ is degradable and thus $Q^{(1)}(\mathcal{R}_2^s) = Q(\mathcal{R}_2^s) = P(\mathcal{R}_2^s)$. The channel's classical capacity is one bit.
This can be shown by noticing that a bit encoded into the inputs $|0\rangle$ and $|2\rangle$ leads to orthogonal outputs, achieving a rate of 1, which saturates the upper bound of one bit set by the input dimension of $\mathcal{R}_2^s$. Notice that $\mathcal{H}_a$ is spanned by the union of $\mathcal{H}_1$ and $\mathcal{H}_2$, and that the isometry $F_s: \mathcal{H}_a \to \mathcal{H}_b \otimes \mathcal{H}_c$, which gives rise to $\mathcal{N}_s$, can be written as a hybrid,
$$F_s|0\rangle = R_2^s|0\rangle, \qquad F_s|1\rangle = R_1|1\rangle, \qquad F_s|2\rangle = R_1|2\rangle = R_2^s|2\rangle, \tag{25}$$
of the isometries $R_1$ and $R_2^s$ that give rise to $\mathcal{R}_1$ and $\mathcal{R}_2^s$, respectively.
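The properties claimed for the hybrid channel can be checked numerically. The sketch below assumes the explicit form $F_s|0\rangle = \sqrt{s}\,|0\rangle|0\rangle + \sqrt{1-s}\,|1\rangle|1\rangle$, $F_s|1\rangle = |2\rangle|0\rangle$, $F_s|2\rangle = |2\rangle|1\rangle$, which is consistent with the properties described in this section but is written out here as an assumption; the helper names `ket`, `F`, and `outputs` are hypothetical.

```python
import numpy as np

def ket(i, d):
    """Computational basis vector |i> in dimension d."""
    v = np.zeros(d, dtype=complex)
    v[i] = 1.0
    return v

def F(s):
    """Assumed isometry F_s: qutrit H_a -> qutrit H_b (x) qubit H_c."""
    col0 = (np.sqrt(s) * np.kron(ket(0, 3), ket(0, 2))
            + np.sqrt(1 - s) * np.kron(ket(1, 3), ket(1, 2)))
    col1 = np.kron(ket(2, 3), ket(0, 2))   # |1> -> |2>_b |0>_c
    col2 = np.kron(ket(2, 3), ket(1, 2))   # |2> -> |2>_b |1>_c
    return np.column_stack([col0, col1, col2])

def outputs(E, rho, d_b=3, d_c=2):
    """Return (rho_b, rho_c): partial traces of E rho E^dagger over c and b."""
    Y = (E @ rho @ E.conj().T).reshape(d_b, d_c, d_b, d_c)
    return np.einsum('ikjk->ij', Y), np.einsum('kikj->ij', Y)

Fs = F(0.3)

# A state supported on span{|1>, |2>}: the output b is the fixed state |2><2|
# and the environment c receives the qubit intact.
psi = (ket(1, 3) + 1j * ket(2, 3)) / np.sqrt(2)
rho_b, rho_c = outputs(Fs, np.outer(psi, psi.conj()))

# Inputs |0> and |2> produce orthogonal outputs at b, so one classical bit
# can be read out perfectly.
rb0, _ = outputs(Fs, np.outer(ket(0, 3), ket(0, 3)))
rb2, _ = outputs(Fs, np.outer(ket(2, 3), ket(2, 3)))
overlap = np.trace(rb0 @ rb2).real
```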

The M d channel
We now consider a higher dimensional generalization of the isometry in (20) with s = 1/2. This generalization, G : , and U c = U * on H a , H b , and H c respectively, here * denotes complex conjugation in the standard basis of H c . The isometry G (26) is symmetrical in the sense that Due to this symmetry, both channels, M d and M c d , have the property that where ρ is any density operator on H a . The channels M d and M c d have properties similar to those of N s and N c s , respectively. Like N s , the channel M d has several degradable sub-channels. Later in Sec. 3.4 we show that a degradable sub-channel of M d is obtained by restricting its input to operators on a qubit subspace H ai (21) or by restricting the input to a qubit subspace obtained by applying U a , defined above in (27), to each state in H ai , where i is some fixed number between 1 and d − 1. Quantum states lying solely at the input of such a degradable qubit sub-channel can be used to send an equal amount of quantum and private information to the M d channel output H b but such states cannot be used to send any quantum or private information to the M A variety of channel classes were discussed in Sec. 2.6. These classes include degradable channels, anti-degradable channels, EB channels, Hadamard channels, and less noisy channels. If a sub-channel of a channel does not belong to any of these classes, then the channel itself does not belong to that same class. Notice N 1/2 is a sub-channel of M d arising from a restriction of the M d channel input to operators on a qutrit subspace spanned by {|0 , |1 , |2 }. In Sec. 3.1, we mentioned that N 1/2 is not degradable, anti-degradable, EB, Hadamard, or less noisy. As a result M d is not (anti)-degradable, EB, Hadamard, or less noisy. One can easily verify that M d , like N 1/2 , is not a unital channel or a PPT channel. This verification can be done by a direct computation. 
To rule out PPT behaviour one can also use the fact that PPT channels have zero quantum capacity, whereas $\mathcal{N}_{1/2}$, a sub-channel of $\mathcal{M}_d$, has non-zero quantum capacity, and thus $\mathcal{M}_d$ also has non-zero quantum capacity. An argument similar to the one above can be used to show that, like $\mathcal{N}_s^c$, the channel $\mathcal{M}_d^c$ is not (anti-)degradable, EB, Hadamard, less noisy, unital, or PPT.

The general O channel
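Both isometries in (20) and (26) can be viewed as special cases of an isometry $H: \mathcal{H}_a \to \mathcal{H}_b \otimes \mathcal{H}_c$, with $d_a = d_b = d$ and $d_c = d - 1$,
$$H|0\rangle = \sum_{j=0}^{d-2} \mu_j\, |j\rangle \otimes |j\rangle, \qquad H|i\rangle = |d-1\rangle \otimes |i-1\rangle \quad (1 \le i \le d-1), \tag{30}$$
where $\sum_j |\mu_j|^2 = 1$. Note that complex phases in $\mu_i$ can be absorbed in the definition of the standard basis of $\mathcal{H}_b$, so we may assume that the $\mu_i$ are real and non-negative. An exchange of $\mu_i$ and $\mu_j$ is equivalent to an exchange of $|i+1\rangle$ and $|j+1\rangle$ in $\mathcal{H}_a$ and an exchange of $|i\rangle$ and $|j\rangle$ in both $\mathcal{H}_b$ and $\mathcal{H}_c$. All these exchanges can be achieved by performing local unitaries on $\mathcal{H}_a$, $\mathcal{H}_b$, and $\mathcal{H}_c$, respectively. As a result, we restrict our attention to $\{\mu_i\}$ arranged in ascending order, i.e., $\mu_0 \le \mu_1 \le \dots \le \mu_{d-2}$. As mentioned earlier in Sec. 3.1 and Sec. 3.2, properties of the type mentioned above imply that quantum states lying in the qubit subspace $\mathcal{H}_{a_i}$ at the channel input can be used to send quantum and private information to the $\mathcal{O}$ channel output but not the $\mathcal{O}^c$ output. Additionally, quantum states lying solely in the $(d-1)$-dimensional input subspace $\mathcal{H}_{\bar a}$ perfectly send all information to the $\mathcal{O}^c$ output $\mathcal{H}_c$. Arguments similar to those above (22) can be used to show that
$$Q(\mathcal{O}^c) = P(\mathcal{O}^c) = C(\mathcal{O}^c) = \log(d-1). \tag{31}$$
From (31), it follows that $\mathcal{O}^c$ has non-zero quantum capacity and thus $\mathcal{O}^c$ does not belong to the class of less noisy channels. Using a log-singularity based argument (see Sec. 3.4), one can show that $Q^{(1)}(\mathcal{O}) > 0$. As a result, $\mathcal{O}$ is also outside the class of less noisy channels. Using arguments similar to those used for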

Proofs of channel properties
In Sec. 3.1, below eq. (20), we claimed that N s has degradable sub-channels obtained by restricting the channel input to H ai in (21). We made a similar claim about M d , a d-dimensional generalization of N 1/2 defined in Sec  (30)). Using this single proof, along with the definition of degradability and the symmetry in eq. (28), one can easily show an additional claim in Sec. 3.2: restricting the input of M d to a qubit sub-space obtained by applying U a , defined above eq. (27), to each state in H ai , results in a degradable sub-channel of M d .
Let O i be a sub-channel of O obtained by restricting the channel input to a two-dimensional sub-space H ai (see eq. (21)) of H a , where i is a fixed integer between 1 and d − 1. This sub-channel, O i , is degradable. Prior to constructing a degrading map for O i , consider an isometry H i : where H is defined in (30). This isometry H i generates the O i sub-channel and its complement O c i . Let H e be a d-dimensional Hilbert space. For any fixed integer i between 1 and d − 1, consider another isometry The equality above implies that F i is a degrading map for O i and thus O i is degradable and O c i antidegradable.
In Sec. 3.1 we claimed that N c s has perfect sub-channels obtained by restricting the channel input to H a (defined below (21) (30)). To prove that O c has a (d − 1)-dimensional sub-channel which perfectly maps its input to H c , consider where V : H a → H c is a bijection of the form satisfying Since V is a bijection, the sub-channel O in (36) perfectly maps its input space H a to its (d − 1)-dimensional output H c .

Channel coherent information
The coherent information of $\mathcal{N}_s$, $\mathcal{M}_d$, and $\mathcal{O}$ can be obtained from an optimization of the form (5). In general, this optimization is non-trivial to carry out because the entropy bias (4) is not generally concave in $\rho$. As a result, the coherent information $Q^{(1)}$ for most channels remains unknown. However, one can show that the optimizations for $Q^{(1)}$ of all three channels $\mathcal{N}_s$, $\mathcal{M}_d$, and $\mathcal{O}$ can be reduced to a one-parameter concave maximization over a bounded interval. For any $0 \le s \le 1/2$,
$$Q^{(1)}(\mathcal{N}_s) = \max_{0 \le u \le 1} \Delta(\mathcal{N}_s, \rho_a(u)), \tag{39}$$
where $\rho_a(u)$ is a one-parameter density operator of the form $\rho_a(u) = (1-u)[0] + u[2]$ and $0 \le u \le 1$. For any $d \ge 3$,
$$Q^{(1)}(\mathcal{M}_d) = \max_{0 \le u \le 1} \Delta(\mathcal{M}_d, \rho_a(u)), \tag{40}$$
where $\rho_a(u) = (1-u)[0] + u[i]$, $0 \le u \le 1$, and $i$ is any fixed integer between 1 and $d-1$. For any $\mu_0 \le \mu_1 \le \dots \le \mu_{d-2}$,
$$Q^{(1)}(\mathcal{O}) = \max_{0 \le u \le 1} \Delta(\mathcal{O}, \rho_a(u)), \tag{41}$$
where
$$\rho_a(u) = (1-u)[0] + u[d-1] \tag{42}$$
and $0 \le u \le 1$. In (39), (40), and (41) the maximization is over a density operator $\rho_a(u)$ which is supported on a subspace of the form $\mathcal{H}_{a_i}$ (21). In Secs. 3.1, 3.2, and 3.3, we showed that restricting the channel input to these $\mathcal{H}_{a_i}$ subspaces results in a degradable channel. Since the entropy bias of a degradable channel is concave in the channel input, each optimization in (39), (40), and (41) is a concave maximization. In addition, one can also show that all three coherent informations (39), (40), and (41) are strictly positive.
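As an illustration of how such a one-parameter maximization can be carried out, the sketch below evaluates the entropy bias of $\mathcal{N}_s$ on the family $\rho_a(u) = (1-u)[0] + u[2]$ and maximizes over a grid in $u$. It assumes the isometry $F_s|0\rangle = \sqrt{s}\,|00\rangle + \sqrt{1-s}\,|11\rangle$, $F_s|2\rangle = |2\rangle|1\rangle$, under which both output states are diagonal with the spectra written in the code; the function names and the grid search (rather than a dedicated concave optimizer) are choices made for illustration only.

```python
import numpy as np

def h(p):
    """Shannon entropy in bits of a (sub-normalized entries ignored) spectrum."""
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-15]
    return float(-np.sum(p * np.log2(p)))

def bias_Ns(s, u):
    """Entropy bias of N_s at rho_a(u) = (1-u)[0] + u[2], assumed isometry F_s."""
    spec_b = [(1 - u) * s, (1 - u) * (1 - s), u]    # spectrum of rho_b
    spec_c = [(1 - u) * s, (1 - u) * (1 - s) + u]   # spectrum of rho_c
    return h(spec_b) - h(spec_c)

def q1_Ns(s, grid=10001):
    """Grid maximization over u in [0, 1]; returns (max value, maximizing u)."""
    us = np.linspace(0.0, 1.0, grid)
    vals = np.array([bias_Ns(s, u) for u in us])
    k = int(np.argmax(vals))
    return float(vals[k]), float(us[k])

q_half, u_half = q1_Ns(0.5)
q_zero, u_zero = q1_Ns(0.0)
```

Because the objective is concave on this family, a grid search cannot get trapped in a spurious local maximum; a golden-section or bisection search on the derivative would reach the same value with fewer evaluations.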

Evaluating
In this subsection, we will show that where ρ a (u) is defined in (42). The above equations can be proved in three steps. The first step exploits the structure of (O, O c ) to show that restricting the input ρ a to a certain block-diagonal form preserves the optimal value in the second expression in (43). The second step reduces this new optimization further to a one parameter problem (50), using a majorization-based approach. Finally, this one-parameter problem is shown to be equivalent to finding the coherent information of O d−1 . These three steps are detailed as follows.
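Recall from (35) that $\mathcal{H}_{\bar a} = \mathrm{span}\{|1\rangle, |2\rangle, \dots, |d-1\rangle\}$. Let $\mathcal{H}_0 := \mathrm{span}\{|0\rangle\}$ and $\mathcal{H}_1 := \mathcal{H}_{\bar a}$, so that $\mathcal{H}_a = \mathcal{H}_0 \oplus \mathcal{H}_1$. Let $L_{ij}$ be the space of linear operators from $\mathcal{H}_j$ to $\mathcal{H}_i$, i.e., $L_{ij} \cong \mathcal{H}_i \otimes \mathcal{H}_j^\dagger$. Any density operator on $\mathcal{H}_a$ can be written as
$$\rho_a = \sum_{i,j} N_{ij}, \tag{46}$$
where $i, j$ are binary and $N_{ij} \in L_{ij}$ are linear operators from $\mathcal{H}_j$ to $\mathcal{H}_i$.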
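For step 1, starting from any $\rho_a$, we obtain $\tilde\rho_a$ by resetting $N_{01} = N_{10} = 0$. Using the above remark, $\tilde\rho_b$ can be obtained from $\rho_b$ by resetting all the off-diagonal elements to 0, and $\tilde\rho_b$ is majorized by $\rho_b$ (see discussion below eq. (54) and Prb. II.5.5 in [7]). Applying Schur-concavity of the von Neumann entropy (see Sec. 4.2), we have $S(\tilde\rho_b) \ge S(\rho_b)$. The above remark also implies that $\tilde\rho_c = \rho_c$, so $S(\tilde\rho_c) = S(\rho_c)$. Together, this proves the claim $\Delta(\mathcal{O}, \rho_a) \le \Delta(\mathcal{O}, \tilde\rho_a)$. Thus, to maximize the entropy bias (4) we can focus on $\rho_a$ of the form in (46) where $N_{01} = N_{10} = 0$, i.e.,
$$\rho_a = (1-u)[0] + u\,\sigma, \tag{47}$$
where $0 \le u \le 1$ and $\sigma$ is a density operator on $\mathcal{H}_{\bar a}$.

For step 2, note that the input $\rho_a$ in (47) gives the outputs
$$\rho_b = (1-u) \sum_{j=0}^{d-2} \mu_j^2\, [j] + u\,[d-1] \tag{48}$$
and
$$\rho_c = (1-u) \sum_{j=0}^{d-2} \mu_j^2\, [j] + u\, V \sigma V^\dagger, \tag{49}$$
where $V: \mathcal{H}_{\bar a} \to \mathcal{H}_c$ is the bijection $V|i\rangle = |i-1\rangle$ and the channel parameters $\mu_0 \le \mu_1 \le \dots \le \mu_{d-2}$ are fixed. Note that $S(\rho_b)$ only depends on $u$ while $S(\rho_c)$ depends on both $u$ and $\sigma$. Thus for any fixed $u$, the entropy bias $\Delta(\mathcal{O}, \rho_a)$ is maximum when $S(\rho_c)$ is minimum. We will prove in the next subsection that this minimum can always be attained when $\sigma = [d-1]$, for all relevant values of $u$ and the channel parameters $\mu_0 \le \mu_1 \le \dots \le \mu_{d-2}$. As a result,
$$Q^{(1)}(\mathcal{O}) = \max_{0 \le u \le 1} \Delta(\mathcal{O}, \rho_a(u)), \tag{50}$$
where $\rho_a(u)$ is as given in (42), obtained from setting $\sigma = [d-1]$ in (47).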
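Step 1 can be sanity-checked numerically. The sketch below assumes the isometry $H|0\rangle = \sum_j \mu_j |j\rangle|j\rangle$, $H|i\rangle = |d-1\rangle|i-1\rangle$ for the $\mathcal{O}$ channel, draws random input states, removes the coherences between $|0\rangle$ and the remaining basis states, and records the change in entropy bias. The function names, the choice $d = 4$, and the values of $\mu_j^2$ are illustrative assumptions; a random test of this kind supports the claim but does not replace the majorization argument.

```python
import numpy as np

def isometry_O(mu):
    """Assumed isometry H for the O channel; mu has d-1 real entries, sum mu^2 = 1."""
    mu = np.asarray(mu, dtype=float)
    d = mu.size + 1
    d_b, d_c = d, d - 1
    H = np.zeros((d_b * d_c, d))
    for j in range(d - 1):
        H[j * d_c + j, 0] = mu[j]              # |0> -> sum_j mu_j |j>_b |j>_c
    for i in range(1, d):
        H[(d - 1) * d_c + (i - 1), i] = 1.0    # |i> -> |d-1>_b |i-1>_c
    return H, d_b, d_c

def entropy(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def bias(H, d_b, d_c, rho):
    """Entropy bias S(rho_b) - S(rho_c)."""
    Y = (H @ rho @ H.conj().T).reshape(d_b, d_c, d_b, d_c)
    return entropy(np.einsum('ikjk->ij', Y)) - entropy(np.einsum('kikj->ij', Y))

def drop_coherences(rho):
    """Set the blocks N_01 and N_10 (coherences with |0>) to zero."""
    out = rho.copy()
    out[0, 1:] = 0
    out[1:, 0] = 0
    return out

rng = np.random.default_rng(0)
H, d_b, d_c = isometry_O(np.sqrt([0.1, 0.3, 0.6]))   # ascending mu, d = 4
gaps = []
for _ in range(50):
    A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
    rho = A @ A.conj().T
    rho /= np.trace(rho).real
    gaps.append(bias(H, d_b, d_c, drop_coherences(rho)) - bias(H, d_b, d_c, rho))
# Each entry of gaps should be non-negative up to numerical error.
```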
To finish the proof, note that ρ a (u) is supported on H a(d−1) , so, expanding the maximization to a general density operator Λ supported on H a(d−1) is nondecreasing: and the RHS of the above is (50), (51), and (52) thus gives The opposite inequality holds since O d−1 is a sub-channel of O, which establishes (43).

Majorization and entropy minimization
In this subsection, we prove the claim in the previous subsection that ρ c in (49) has minimum entropy when σ = [d − 1], for any fixed u and channel parameters µ 0 ≤ µ 1 ≤ · · · ≤ µ d−2 . We first summarize our main tools, majorization and Schur-concavity of the Shannon entropy. Let us first consider majorization of real vectors. Given a vector x in R t , let x ↓ denote the vector obtained by rearranging the entries of x in descending order. For any two vectors x, y in R t , we say that x is majorized by y, in symbols x ≺ y, if the inequality $\sum_{i=1}^{k} x^{\downarrow}_i \le \sum_{i=1}^{k} y^{\downarrow}_i$ holds for all k ≤ t and becomes an equality at k = t. The concept of majorization can be generalized to Hermitian matrices as follows.
The Shannon entropy is Schur-concave, i.e., when a probability vector p is majorized by another probability vector q, then h(p) ≥ h(q). The von-Neumann entropy of a density operator τ is equal to the Shannon entropy of v(τ ): $S(\tau) = h(v(\tau))$. Thus, like the Shannon entropy, the von-Neumann entropy is Schur-concave, i.e., if τ ≺ κ then S(τ ) ≥ S(κ).
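The two tools just summarized can be illustrated with a minimal sketch in plain Python. The helper names `is_majorized_by` and `shannon_entropy` and the example vectors `p`, `q` are hypothetical, chosen only for illustration; the check follows the partial-sum definition of majorization for real vectors given above.

```python
import math

def is_majorized_by(x, y, tol=1e-12):
    """Return True when x ≺ y, i.e. x is majorized by y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    xs = sorted(x, reverse=True)  # entries of x in descending order
    ys = sorted(y, reverse=True)  # entries of y in descending order
    sum_x = sum_y = 0.0
    for a, b in zip(xs, ys):
        sum_x += a
        sum_y += b
        if sum_x > sum_y + tol:   # a partial-sum inequality fails
            return False
    return abs(sum_x - sum_y) <= tol  # equality is required at k = t

def shannon_entropy(p):
    """Shannon entropy h(p) in bits, using the convention 0 log 0 = 0."""
    return -sum(q * math.log2(q) for q in p if q > 0)

# Hypothetical probability vectors: p is "flatter" than q.
p = [0.4, 0.3, 0.3]
q = [0.7, 0.2, 0.1]
print(is_majorized_by(p, q), shannon_entropy(p) >= shannon_entropy(q))
```

For density operators, the same check would be applied to the eigenvalue vectors of the two operators.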
We now return to the problem of minimizing the von-Neumann entropy S(ρ c ). Recall from (49) that Since S(ρ c ) is a concave function of σ, the minimum can be attained when σ is a pure state. Applying (56) to (58), we obtain where we use v(cN ) = cv(N ) for any real c > 0. Since σ is a pure state, the RHS of (59) is a vector , and from Schur-concavity of the Shannon entropy, From the expression of w, equality is attained when

Log-singularity and positivity
We shall use an ε log-singularity based method to show Q (1) (O) > 0 for any µ 0 ≤ µ 1 ≤ · · · ≤ µ d−2 and d ≥ 3. For details of the method see [46]. Let ρ(ε) be a density operator that depends on a real parameter ε. The von-Neumann entropy S(ε) = −Tr ρ(ε) log ρ(ε) is said to have an ε log-singularity if one or several eigenvalues of ρ(ε) increase linearly from 0 to leading order in ε. As a result of this singularity, S(ε) ≃ x|ε log ε|, where x > 0 is called the rate of the singularity. The entropy bias, or coherent information of O at ρ a (ε), may be concisely denoted by ∆(ε) := S b (ε) − S c (ε), where S b (ε) := S(ρ b (ε)) and S c (ε) := S(ρ c (ε)). If an ε log-singularity in S b (ε) has larger rate than the one in S c (ε), and ∆(0) = 0, then ∆(ε) > 0 for sufficiently small ε > 0, and hence Q (1) (O) > 0. In the present case, let ρ a (ε) be the density operator in (42) where u is replaced with ε. At ε = 0, ρ a (ε) is a pure state, hence ∆(0) = 0. In addition, ρ c (0) has rank d c = d − 1, and thus S c (ε) cannot have an ε log-singularity. On the other hand, S b (ε) has an ε log-singularity of rate 1, and thus Q (1) (O) > 0.
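The positivity argument can be made concrete for the qutrit channel N s (the case d = 3). The sketch below is only illustrative and rests on an assumption that is not stated in this section: that for the input (1 − ε)[0] + ε[2], the output on b has spectrum ((1 − ε)s, (1 − ε)(1 − s), ε) and the output on c has spectrum ((1 − ε)s, (1 − ε)(1 − s) + ε), as one would get from the isometry F s of eq. (20). The function names are hypothetical.

```python
import math

def entropy(spectrum):
    """Entropy in bits of a diagonal state given by its eigenvalues."""
    return -sum(x * math.log2(x) for x in spectrum if x > 0)

def entropy_bias(eps, s):
    """Delta(eps) = S_b(eps) - S_c(eps) for the ASSUMED output spectra."""
    # Assumed spectrum of rho_b(eps): one eigenvalue grows linearly from 0.
    rho_b = [(1 - eps) * s, (1 - eps) * (1 - s), eps]
    # Assumed spectrum of rho_c(eps): both eigenvalues stay positive at eps = 0.
    rho_c = [(1 - eps) * s, (1 - eps) * (1 - s) + eps]
    return entropy(rho_b) - entropy(rho_c)

s = 0.5
for eps in (1e-2, 1e-3, 1e-4):
    delta = entropy_bias(eps, s)
    # The ratio to |eps log eps| probes the rate of the singularity in S_b.
    print(eps, delta, delta / abs(eps * math.log2(eps)))
```

Under this assumption the bias vanishes at ε = 0 and is positive for small ε > 0, which is the behavior the ε log-singularity argument predicts.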

Sub-channels
Let O i be the sub-channel of O obtained by restricting the channel input to operators on H ai (21), where i is some fixed integer between 1 and d − 1. Using arguments similar to those presented in subsection 4.1, one can show where ρ ai (u) is a one-parameter density operator of the form and 0 ≤ u ≤ 1. Let S(ρ ci ) be the entropy of ρ ci = O c i (ρ ai ). Using majorization and Schur concavity arguments (similar to those leading to (60)) and the fact that µ 0 ≤ µ 1 ≤ · · · ≤ µ d−2 one can show that for all i ≤ j. For any fixed u in (63) the entropy S(ρ bi ) of ρ bi = O i (ρ ai ), is independent of i. Using (64) and the definition of the entropy bias (4) one obtains, for any i ≤ j. The relation above implies, for all i ≤ j. Using the above equation along with (43) we get

Rényi entropy
Before closing this section, we remark that to obtain (43) we used (a) monotonicity of the von-Neumann entropy S(ρ b ) under block-diagonalization of ρ a , and (b) concavity of the von-Neumann entropy to argue that pure σ minimizes S(ρ c ), and finally, we utilized majorization (see Sec. 4.2) to argue that the maximum entropy bias occurs when the input density operator has the form (42). In the definition of (4) if one replaces the von-Neumann entropy with any Rényi entropy, where 0 ≤ α ≤ ∞, then the corresponding equation (43) would still hold. We outline the reasoning here. First, monotonicity of the Schur-concave Rényi entropy is unaffected when ρ a is block-diagonalized. However, unlike the von-Neumann entropy, S α is not concave for α > 1. Thus to prove that the minimum Rényi entropy S α (ρ c ) (59) also occurs over pure states σ we employ a different strategy. Write where the quantum channel N acts as N (σ ) = (1 − u)ΥTr(σ ) + uσ , with 0 ≤ u ≤ 1. Next we use the fact that the minimum output Rényi entropy for N (σ ) occurs at a pure state input (see Sec.II in [20]). With σ being restricted to a pure state, one can now use the majorization-based argument to show that the Rényi entropy S α (ρ c ) is minimum when the input density operator has the form (42). This majorization argument is unaffected when the von-Neumann entropy is replaced by the Rényi entropy, which is also Schur-concave.

Quantum capacity of O
Subject to the spin alignment conjecture, introduced in Sec. 6, we show that To show these equalities above, we prove the third equality in (70) and infer the other two because N s and M d are special cases of O (see Sec. 3.3). Our next step is to compute the coherent information of the channel O ⊗n :Ĥ ⊗n where n ≥ 1 and ρ is a density operator on H ⊗n a = H 1 a ⊗ H 2 a ⊗ · · · ⊗ H n a , with the superscript i indicating the i th space. Using (44), express for some N (s, t) ∈ L(s, t). We will obtain the form of O ⊗n (ρ) and (O c ) ⊗n (ρ) using the expression of ρ above and the following two remarks (which generalize Remark 1). To prove remark 2, we first express any operator N (s, t) as a linear combination of operators of the form,
From remark 2, the first term on the right side of the equality in (77) has zero diagonal entries, the second term has zero off-diagonal entries, while the first term on the right side of the equality in (78) is zero. Therefore, setting N (s, t) = 0 for all s ≠ t has no effect on (O c ) ⊗n (ρ), nor on the diagonal entries of O ⊗n (ρ), while all off-diagonal entries of O ⊗n (ρ) become zero. This may increase S (O ⊗n (ρ)) but not S ((O c ) ⊗n (ρ)) so, when maximizing the entropy difference in the second equality in (71), we may restrict to density operators of the form
To proceed with the analysis, we re-express (79) as a specific convex combination of states. Let M denote a subset of {1, . . . , n} and M c the complement of M in {1, . . . , n}. Let |M | and |M c | denote the sizes of M and M c . We may use the subset M to label some of the channel uses, or some of the input systems, or some of the output systems. For any such subset M , let ω M denote a density operator acting on the corresponding subset of input spaces ⊗ i∈M H i a , where H i a denotes the input space of the i-th channel use, and let |0⟩⟨0| ⊗M c denote the pure state (|0⟩⟨0|) ⊗|M c | on the complement set of input spaces ⊗ j∈M c H j a . Using this notation, we now show that the density operator in (79) can be written as for some density operators ω M and for some x M between zero and one such that In the above and throughout, the summation over M is over all subsets of {1, . . . , n}. For an arbitrary s, let the k-th entry of s be i k . Recall that N (s, s) is a linear combination of operators of the form F i1i1 ⊗F i2i2 ⊗· · ·⊗F inin . Recall also that H 0 = span{|0⟩} so F 00 ∝ |0⟩⟨0| and for each k and where (30) and (37) where ω M = (|d − 1⟩⟨d − 1|) ⊗|M | . The above density operator is supported on a subspace H ⊗n Using arguments similar to those below eq. (50) the maximum entropy difference (71), In Sec. 3.4, we showed that this sub- −1 ), and thus the equality (86) simplifies The equality along with (43) gives the desired result

Spin alignment conjecture
Since this conjecture may be of independent interest, it is presented in a more self-contained manner, with occasional repetition of information from the previous section. Let |0⟩ and |1⟩ denote the spin up and spin down states of a spin-1/2 particle (we just call this a spin). Let We conjecture that the entropy minimization problem has an optimal solution when all spins align with one another as much as possible and align with the eigenstate corresponding to the maximum eigenvalue of Q; in other words, for all 2 n − 1 possible non-empty subsets M , Note that the minimum of (91) can be attained on pure states ω M due to concavity of the von Neumann entropy.
We have not come upon a general proof of this conjecture but we have found proofs for various special cases and numerical evidence in other cases. In what follows we briefly mention this evidence for our conjecture.

Special case n = 1
For 1 spin, the entropy minimization problem (91) takes the form where φ denotes the null set, the subscript 1 is a shorthand for M = {1}, and x 1 and x φ are arbitrary but fixed non-negative numbers that sum to 1. The above optimization problem (93), generalized to any dimension and any valid density matrix Q, can be solved using Schur-concavity of the von Neumann entropy along with a majorization inequality, see eq. (II.16) on pg. 35 in [7]. The optimal solution ω 1 can always be chosen to be the projector onto any 1-dimensional eigenspace of Q corresponding to its maximum eigenvalue. This proves our conjecture for n = 1.
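As a sanity check of the n = 1 case, the following sketch scans pure qubit states ω 1 and compares their entropies with the aligned choice. It assumes, since the displayed equations are not reproduced here, that Q = s[0] + (1 − s)[1] with s ≤ 1/2 and that the state in (93) has the form x φ Q + x 1 ω 1 ; the function names and the parametrization of ω 1 by a polar angle θ are hypothetical. The relative phase of ω 1 is omitted because Q is diagonal, so it does not affect the spectrum.

```python
import math

def binary_entropy(p):
    """Entropy in bits of the spectrum (p, 1 - p)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def entropy_n1(x1, s, theta):
    """Entropy of (1 - x1) Q + x1 |phi><phi| (ASSUMED form of the n = 1 problem).

    Q = s|0><0| + (1 - s)|1><1| and |phi> = cos(theta/2)|0> + sin(theta/2)|1>.
    """
    a = (1 - x1) * s + x1 * math.cos(theta / 2) ** 2  # <0|kappa|0>
    b = 0.5 * x1 * math.sin(theta)                    # |<0|kappa|1>|
    r = math.hypot(a - 0.5, b)                        # eigenvalues are 1/2 ± r
    return binary_entropy(min(0.5 + r, 1.0))

def aligned_is_optimal(x1, s, grid=360, tol=1e-12):
    """Check that theta = pi (omega_1 = |1><1|) is never beaten on the grid."""
    aligned = entropy_n1(x1, s, math.pi)
    return all(
        aligned <= entropy_n1(x1, s, math.pi * k / grid) + tol
        for k in range(grid + 1)
    )

print(aligned_is_optimal(0.3, 0.2))
```

For s ≤ 1/2 the largest eigenvalue of Q belongs to |1⟩, so θ = π is the aligned choice; at s = 1/2 every θ gives the same entropy.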

Special case n = 2 and s = 1/2
When there are two spins and s = 1/2, so that Q = I/2, the optimization problem in (91) takes the form where x 1 , x 2 , x 12 and x φ are arbitrary but fixed non-negative numbers that sum to 1. There is enough symmetry to assume without loss of generality that ω 1 = |1⟩⟨1|, ω 2 = |1⟩⟨1| and using the optimal solution for the high-dimensional generalization of the n = 1 case (93), ω 12 = |1⟩⟨1| ⊗ |1⟩⟨1| can be shown to be an optimal solution.
We mention in passing that when n = 2 but s is arbitrary, if we assume ω 1 , ω 2 , ω 12 to be diagonal, we can also show the optimality of the conjectured solution, by a detailed case analysis involving majorization and Schur-concavity of the von Neumann entropy.
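The diagonal n = 2 case lends itself to a small exhaustive check. The sketch below assumes (the displayed form of (91) is not reproduced here) that the state is x φ Q ⊗ Q + x 1 ω 1 ⊗ Q + x 2 Q ⊗ ω 2 + x 12 ω 12 with Q = s[0] + (1 − s)[1] and s ≤ 1/2. Since the entropy is concave in each ω M , it suffices to test computational basis states; all function names and the random sampling scheme are hypothetical.

```python
import itertools
import math
import random

def shannon_entropy(p):
    """Shannon entropy in bits, with 0 log 0 = 0."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def kron(p, q):
    """Diagonal of the tensor product of two diagonal operators."""
    return [a * b for a in p for b in q]

def basis(i, dim):
    """Diagonal of the projector onto the i-th computational basis state."""
    return [1.0 if k == i else 0.0 for k in range(dim)]

def kappa_diagonal(x, s, i1, i2, i12):
    """Diagonal of the ASSUMED n = 2 state for basis-state choices of omega_M."""
    x_empty, x_1, x_2, x_12 = x
    Q = [s, 1.0 - s]
    terms = [
        (x_empty, kron(Q, Q)),
        (x_1, kron(basis(i1, 2), Q)),
        (x_2, kron(Q, basis(i2, 2))),
        (x_12, basis(i12, 4)),
    ]
    return [sum(w * t[k] for w, t in terms) for k in range(4)]

def compare(x, s):
    """Return (entropy of the aligned choice, minimum over all 16 choices)."""
    conjectured = shannon_entropy(kappa_diagonal(x, s, 1, 1, 3))
    best = min(
        shannon_entropy(kappa_diagonal(x, s, i1, i2, i12))
        for i1, i2, i12 in itertools.product(range(2), range(2), range(4))
    )
    return conjectured, best

rng = random.Random(0)  # fixed seed so the sampled instances are reproducible
for _ in range(5):
    w = [rng.random() for _ in range(4)]
    x = [v / sum(w) for v in w]
    print(compare(x, 0.5 * rng.random()))
```

The aligned choice corresponds to indices (1, 1, 3), i.e. ω 1 = ω 2 = |1⟩⟨1| and ω 12 = |11⟩⟨11|.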

Numerical evidence
Our conjecture is backed by numerical evidence that we gathered for the optimization problem (91) with n = 2, . . . , 6. We randomly sampled probability distributions {x M } M (see (90)), restricted the optimization to pure states ω M (see comment after (92)) in order to reduce the number of free parameters, and used both gradient descent-based and global optimization techniques (particle swarm optimization). In all instances, the optimization converged to the conjectured solution in (92), i.e., with all states ω M being tensor products of (a rotated version of) the eigenvector of Q corresponding to its largest eigenvalue. Of course, it is possible that such convergence is to a local minimum rather than a global minimum, however we have not found minima with values smaller than the conjectured value.

Generalization to higher dimensions
The entropy minimization problem (91) can be generalized to higher spins (a spin-1/2 particle is a qubit and higher-spin particles correspond to qudits): where Υ is a fixed density matrix on a qudit, and {x M } a fixed distribution on the subsets of {1, . . . , n}. As before, the minimum can be attained on pure states ω M due to concavity of the von Neumann entropy. Let |γ⟩ be an eigenvector of Υ corresponding to a maximum eigenvalue. We conjecture that the entropy is minimized when all the spins are aligned with one another as much as possible; that is, for all 2 n − 1 possible non-empty subsets M .

Rényi-2 entropy variation of the conjecture holds
We discuss a side result that illustrates the intuition behind the spin alignment conjecture. Unless otherwise stated, symbols defined here are used only for this subsection. Consider a variation of the minimization problem (95) wherein the von Neumann entropy is replaced by the Rényi-2 entropy, which according to (68), is given by In this case, the minimum is attained by states given by (96), and we will outline the proof in this subsection. Using the expression for Rényi-2 entropy above, the goal to minimize S 2 (κ) in (95) is equivalent to maximizing where we have used the expression for κ in (95). Since x M x M ≥ 0 for all M, M , it suffices to show that the states given by (96) maximize each where the subscripts 1, 2, 3 indicate which systems are acted on by the operators, Θ, Z are positive semidefinite operators, and |µ , |ν are unit vectors. The problem above attains its maximum at where |θ is any eigenvector of Θ corresponding to its maximum eigenvalue, |ζ is any eigenvector of Z corresponding to its maximum eigenvalue, and |ξ is any vector in system 2.
Proof. We can take the computational bases of systems 1 and 3 to be the eigenbases of Θ and Z respectively, so they are diagonal without loss of generality. Let $\Theta = \sum_j \theta_j |j\rangle\langle j|$ and $Z = \sum_i \zeta_i |i\rangle\langle i|$ be their eigendecompositions. We can always express $|\mu\rangle = \sum_i a_i |i\rangle_1 \otimes |\alpha_i\rangle_2$ and $|\nu\rangle = \sum_j b_j |\beta_j\rangle_2 \otimes |j\rangle_3$ for some nonnegative amplitudes a i , b j and unit vectors $|\alpha_i\rangle$ and $|\beta_j\rangle$ on system 2, such that $\sum_i a_i^2 = \sum_j b_j^2 = 1$. Then, In the last line above, we use a convexity argument noting that $\{a_i^2 b_j^2\}_{i,j}$ is a probability distribution. The proposed solution $|\mu\rangle_{12} = |\theta\rangle_1 \otimes |\xi\rangle_2$, $|\nu\rangle_{23} = |\xi\rangle_2 \otimes |\zeta\rangle_3$ attains the upper bound $\max_{i,j} \zeta_i \theta_j$ and thus must be an optimal solution for the maximization problem, proving the lemma.

Quantum capacity bounds

Upper bound on Q(N s )
We now derive an analytic upper bound and a tighter numerical upper bound on the quantum capacity of N s . The analytic bound matches the SDP bound in Prop. 16 of [58] where it appears without proof. We provide a proof showing the SDP bound matches the well-known "transposition bound", which states that Q(B) ≤ log ‖T ∘ B‖ ⋄ for any channel B [23]. Here, T : X → X T denotes the transposition map, taken with respect to the same basis used to define the maximally entangled state |φ⟩ in (2). For a superoperator Ψ :Ĥ →Ĥ the diamond norm ‖Ψ‖ ⋄ is defined as The diamond norm of any linear superoperator can be computed by a semidefinite program [60], which, for T ∘ B, is given by where J B ab denotes the unnormalized Choi-Jamiołkowski operator of B (3), and T b = I a ⊗ T denotes the partial transpose with respect to system b, i.e., the transpose map T acts onĤ b .
We compare the transposition bound to another bound on Q(B) by Wang et al. [57] defined in terms of a quantity Γ(B), which is the solution of the following semidefinite program: The two bounds on the quantum capacity of B are related as follows: Proposition 2 (Holevo, Werner [23], Wang et al. [57]). For any quantum channel B, where Γ(B) is defined in (105).
The two upper bounds mentioned above yield an analytical upper bound on the quantum capacity of the channel N s . Due to unitary equivalence of the N s channel to the one in [58] (see discussion below eq. (20)), the SDP bound log Γ(N s ) matches the one stated without proof in Prop. 16 of [58].
Proof. The theorem is proved by asserting that from which the claim follows via Proposition 2.
To prove the first inequality in (108), we pick the following operators (R ab , ρ a ) in the SDP (105) for Γ(N s ): It is easy to check that R ab , ρ a ≥ 0 and tr ρ a = 1, and that ρ a ⊗ 1 b ± T b (R ab ) ≥ 0, which ensures that the pair (R ab , ρ a ) is indeed feasible in (105). To compute the objective value tr R ab J s ab with J s ab ≡ J Ns ab , observe that J s ab has the form We have ⟨ψ 1 |R ab |ψ 1 ⟩ ab = s/2 and ⟨ψ 2 |R ab |ψ 2 ⟩ ab = 1 + √(1 − s) − s/2, and hence To prove the third inequality in (108), ‖T ∘ N s ‖ ⋄ ≤ 1 + √(1 − s), we again pick feasible operators in the SDP (104) for ‖T ∘ N s ‖ ⋄ : is unitarily equivalent to the following operator in block-diagonal form: with the matrix $M = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$ and vectors As the operator on the right-hand side of (117) is manifestly positive semidefinite, the same holds for showing that Y ab and Z ab = Y ab are feasible in (104). The marginal Y a = tr b Y ab is diagonal with eigenvalues 1 + √(1 − s) (of multiplicity 2) and 1 + s/ Therefore, the SDP (104) for ‖T ∘ N s ‖ ⋄ has value at most ‖Y a ‖ ∞ = 1 + √(1 − s), which concludes the proof of the theorem.
While the bound log Γ(B) can be strictly tighter than the transposition bound log ‖T ∘ B‖ ⋄ for certain channels B [57], Theorem 2 shows that the two bounds in fact coincide for N s . The SDP upper bound log Γ(B) was recently improved by Fawzi and Fawzi [14], and evaluating the latter bound (which can again be computed by semidefinite programming) yields an even tighter bound on Q(N s ). The two bounds are compared in Fig. 1.
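To get a feeling for how the analytic bound log(1 + √(1 − s)) sits relative to the single-letter coherent information, one can compare the two numerically. The sketch below is a rough illustration only. It assumes output spectra for the input (1 − u)[0] + u[2] that are not stated in this section (they are what the isometry F s of eq. (20) would give, as we understand it), and it uses a simple grid search over u rather than an exact maximization; the function names are hypothetical.

```python
import math

def entropy(spectrum):
    """Entropy in bits of a diagonal state given by its eigenvalues."""
    return -sum(x * math.log2(x) for x in spectrum if x > 0)

def entropy_bias(u, s):
    """S(rho_b) - S(rho_c) for the ASSUMED spectra at input (1-u)[0] + u[2]."""
    rho_b = [(1 - u) * s, (1 - u) * (1 - s), u]
    rho_c = [(1 - u) * s, (1 - u) * (1 - s) + u]
    return entropy(rho_b) - entropy(rho_c)

def coherent_information_grid(s, grid=2000):
    """Grid-search estimate (a lower bound) of max over u of the entropy bias."""
    return max(entropy_bias(k / grid, s) for k in range(grid + 1))

def transposition_bound(s):
    """Analytic upper bound log2(1 + sqrt(1 - s)) on Q(N_s)."""
    return math.log2(1 + math.sqrt(1 - s))

for s in (0.1, 0.25, 0.5):
    print(s, coherent_information_grid(s), transposition_bound(s))
```

For s > 0 the analytic bound is strictly below 1, the value of the private and classical capacities discussed later in the paper.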

Upper bound on Q(M d )
We now derive an analytical upper bound on the quantum capacity of M d , which we introduced in Section 3.2 as a generalization of N 1/2 to arbitrary dimension d of the input and output Hilbert spaces. As a reminder, a channel isometry for M d is given by G : (120) The Choi-Jamiołkowski operator J M d ab of M d is given as follows: where we used the notation [ψ] ≡ |ψ ψ|. Using Proposition 2 and the SDP (104), we now derive an upper bound on the quantum capacity of the channel M d .

Theorem 4. For any d ≥ 2,
In particular,
Proof. We will prove this theorem by constructing feasible operators Y ab = Z ab in the SDP (104) for ‖T ∘ M d ‖ ⋄ satisfying ‖tr b Y ab ‖ ∞ = 1 + 1/√(d − 1), from which the claim follows by invoking Proposition 2. Consider the following ansatz for Y ab : with |Ξ = 1 We clearly have Y ab ≥ 0. Setting Z ab = Y ab , consider the second feasibility constraint in (104), Taking the Schur complement, this is equivalent to the constraint with the inverse taken on the support of Y ab . Using the form of which yields ‖Y a ‖ ∞ = 1 + 1/√(d − 1) and concludes the proof.

Private and classical capacities
The private and classical capacities of the channels N s and M d can be determined exactly. In the following, we will prove in Theorems 6 and 9 that P(N s ) = C(N s ) = 1 = P(M d ) = C(M d ). This is remarkable because neither N s nor M d belong to any of the special classes of channels for which the private or classical capacity is known to have a single-letter expression (that is, equal to the channel private information or the Holevo information; see Section 2 for the definitions of these quantities).
The upper bounds on the private and classical capacities of the channels employ the SDP technique used in prior work of Wang, Xie, and Duan [58]. That same work applied the technique to a qutrit-to-qutrit channel unitarily equivalent to N s (see discussion below eq. (20)). Hence, these prior upper and lower bounds on C (see Prop. 15 in [58]) and C E (see Prop. 1 in [56]) imply our bounds on C(N s ) in Th. 6 and C E (N s ) in Th. 8. On the other hand, our lower bound on P, the first inequality in (129) below, correctly proves the bound previously stated in [58,Prop. 16], whose proof contained a typo. The corresponding results for the M d channel do not follow from these prior works.

Capacities of N s
We will obtain a lower bound for the channel private information and the Holevo information using the following equiprobable ensemble of two quantum states: For this ensemble, the quantity ∆(B,ρ a ) − x p x ∆(B, ρ x a ) evaluates to 1 which gives a lower bound to the channel private information P (1) (N s ) ≥ 1 for any s ∈ [0, 1/2] (see (8)). Likewise, the Holevo information is 1 for this ensemble, giving a lower bound 1 ≤ χ(N s ) by (16). Using [8,10] and (15), we obtain chains of inequalities We will now show that C(N s ) ≤ 1 so we have equalities throughout the above. To this end, we employ a semidefinite programming upper bound on the classical capacity derived by Wang et al. [58]: 58]). For any quantum channel B, where β(B) is the solution of the following SDP: , log β(B) is a strong converse bound: if the classical information transmission rate exceeds log β(B), the transmission error converges to 1 exponentially fast.
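The lower bound from a two-state equiprobable ensemble can be checked with a short calculation. The ensemble used in the sketch below, [0] and s[1] + (1 − s)[2], is an assumption on our part (the displayed ensemble is not reproduced here), as is the action of the channel on diagonal inputs, which we take from the isometry F s of eq. (20) as we understand it. The function names are hypothetical.

```python
import math

def entropy(spectrum):
    """Entropy in bits of a diagonal state given by its eigenvalues."""
    return -sum(x * math.log2(x) for x in spectrum if x > 0)

def outputs(s, rho_diag):
    """Spectra of the b and c outputs for a diagonal input (ASSUMED channel action)."""
    p0, p1, p2 = rho_diag
    rho_b = [p0 * s, p0 * (1 - s), p1 + p2]
    rho_c = [p0 * s + p1, p0 * (1 - s) + p2]
    return rho_b, rho_c

def holevo_quantities(s):
    """Return (Holevo quantity at b, Holevo quantity at c) for the ASSUMED ensemble."""
    states = [[1.0, 0.0, 0.0], [0.0, s, 1.0 - s]]  # equiprobable ensemble
    outs = [outputs(s, rho) for rho in states]
    avg_b = [0.5 * (x + y) for x, y in zip(outs[0][0], outs[1][0])]
    avg_c = [0.5 * (x + y) for x, y in zip(outs[0][1], outs[1][1])]
    chi_b = entropy(avg_b) - 0.5 * (entropy(outs[0][0]) + entropy(outs[1][0]))
    chi_c = entropy(avg_c) - 0.5 * (entropy(outs[0][1]) + entropy(outs[1][1]))
    return chi_b, chi_c

for s in (0.1, 0.3, 0.5):
    chi_b, chi_c = holevo_quantities(s)
    print(s, chi_b, chi_c, chi_b - chi_c)
```

Under these assumptions the receiver learns one full bit while the environment learns nothing, so the difference of the two Holevo quantities equals 1 for every s.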
We now state and prove the main result of this section: where |ψ ab = √ s|10 + √ 1 − s|21 . We now check that (R ab , S b ) is feasible for the SDP (131). Both R ab and S b are Hermitian by construction, and one readily checks that (111), it also follows that R ab ± T b (J s ab ) ≥ 0. These observations establish feasibility of (R ab , S b ) in (131). Furthermore tr S b = 2, hence β(N s ) ≤ 2. (As a side remark, we note that (R ab , S b ) is in fact optimal for (131) since 1 ≤ C(N s ) ≤ log β(N s ) ≤ 1 by (129) and Proposition 5.) Since log β(N s ) is a strong converse bound and C(N s ) = log β(N s ) = 1, the classical capacity of N s satisfies the strong converse property. This also holds for the private capacity P(N s ) = C(N s ) = 1, which can be seen as follows (we refer to [61] for precise definitions). Let ε C denote the error for a classical information transmission code. The error ε P for a private information transmission code is defined as ε P = max(ε C , ε env ), where ε env is an additional error parameter controlling how much information the environment gains about Alice's input. Assume now that we have a private information transmission code with rate r P > 1 and error ε P . Then this code can also be regarded as a classical information transmission code with rate r C = r P > 1 and error ε C ≤ ε P . The strong converse property of C(N s ) implies that ε C → 1 as the code blocklength n increases, from which ε P → 1 follows. Hence, P(N s ) also satisfies the strong converse property, which concludes the proof. Theorems 3 and 6 imply that the quantum and private capacities of N s are strictly separated: It was recently shown in [12] that the quantity log β(·) from Theorem 5 also serves as an upper bound on the classical capacity of a quantum channel assisted by a classical feedback channel, denoted C ← (·). 
Hence, Theorem 6 immediately implies that the feedback-assisted classical capacity of N s is equal to its classical capacity, C(N s ) = 1 = C ← (N s ) for all s ∈ [0, 1/2]. Moreover, the same argument applies to the private capacity of a quantum channel assisted by a public feedback channel.
Finally, we discuss the entanglement-assisted capacity of the channel N s . Similar to the classical capacity, it is independent of the parameter s: Proof. It is easy to check that the state σ ab = (I a ⊗ N s )([ψ aa ]) with achieves I(A; B) σ = 2. The optimality of [ψ aa ] can be verified using the methods of [16].

Capacities of M d
We proved in Theorem 6 in Section 8.1 that one use of the channel N s can faithfully transmit 1 private bit (and thus also 1 classical bit) regardless of the value of s. We now prove that the d-dimensional generalization M d of N 1/2 defined in (120) retains unit private and classical capacity for any local dimension d: Moreover, both the classical and private capacity of M d satisfy the strong converse property.
Proof. The proof strategy is similar to the one used in Theorem 6. First, consider an equiprobable ensemble with the following two quantum states, and form the cqq state where G : H a → H b ⊗H c is the channel isometry for M d defined in (120), and ρ xa = 1 It is straightforward to check that I(X; B) σ = 1 and I(X; E) σ = 0, from which we obtain The claim of the theorem now follows by showing that C(M d ) ≤ 1 for all d. To this end, we once again employ the upper bound log β(M d ) from Proposition 5. Consider the following Hermitian operators R ab and S b (analogous to the operators in (133)): where |ψ ab = 1 j=1 |j a ⊗ |j − 1 b . One readily checks that R ab and S b are feasible in the SDP (131), that is, R ab ±T b (J M d ab ) ≥ 0 and 1 a ⊗S b ±T b (R ab ) ≥ 0. Furthermore, tr S b = 2 for any d, and hence β(M d ) ≤ 2. Using Proposition 5, we conclude C(M d ) ≤ log β(M d ) ≤ 1, which together with (139) gives The strong converse property for C(M d ) and P(M d ) follows in the same way as in the proof of Theorem 6.
By the same argument as in the remark after Theorem 7, Theorem 9 implies that the feedback-assisted private and classical capacities of M d are equal to their unassisted counterparts.

Discussion of capacities of the platypus channels
We now summarize the findings of Sections 4, 7 and 8 on the quantum capacity Q, private capacity P and classical capacity C of the platypus channels N s (with s ∈ (0, 1/2]) and M d (for d ≥ 3): In the above eq. (142), the left-most equality labeled by "?" is the conjectured weak additivity of the single-letter coherent information, Q (1) (N s ), which would be implied by the validity of the "spin alignment conjecture" described in Section 6. The next inequality is Theorem 3. Finally, the four equalities on the RHS of (142) come from Theorem 6 and (129). Likewise, in eq. (143) the conjectured equality labeled by "?" would be implied by the validity of the (higher-dimensional version of the) spin-alignment conjecture in Section 6.4, and the following inequality and equalities are obtained via Theorems 4 and 9, respectively. For both N s and M d , the private and classical capacity have the strong converse property, as proved in Theorems 6 and 9, respectively. These findings are remarkable for various reasons: • The private information P (1) (·) is additive for both N s and M d . The only known classes of quantum channels with additive private information are (a) "less noisy channels" B whose complementary channels have vanishing private capacity, P(B c ) = 0 [59] and of which degradable channels [11,50] are special cases; (b) anti-degradable channels; and (c) direct sums of partial traces (DSPT), a special case of the ternary ring of operators (TRO) channels [18]. We know from Section 3 that P(N c s ) = 1 for all s ∈ [0, 1/2] and P(M c d ) = log(d − 1) for d ≥ 3. Hence, neither N s nor M d are less noisy. Moreover, clearly neither of these channels is a direct sum of partial traces, so that both N s and M d fall outside all known classes of channels with additive private information.
• The Holevo information χ(·) is additive for both N s and M d . Again, both channels fall outside of all the known classes of channels with additive Holevo information: (a) entanglement-breaking channels [42]; (b) unital qubit channels [28]; (c) depolarizing channels [30]; (d) Hadamard channels [29,31]; (e) DSPT channels [18]; and (f) erasure channels [5]. Since entanglement-breaking channels have vanishing quantum capacity and both N s and M d have positive quantum capacity, the platypus channels are not entanglement-breaking. They clearly do not belong to classes (b) and (c) either. A quantum channel is Hadamard if its complementary channel is entanglement-breaking [61]. Since Q(N c s ) = 1 and Q(M c d ) = log(d−1), the complements of N s and M d cannot be entanglement-breaking, so that neither channel is Hadamard. Finally, neither N s nor M d are a DSPT nor an erasure channel.
• The quantum capacity of both N s and M d is strictly smaller than their respective private capacities, for all s ∈ (0, 1/2] and d ≥ 3. There are not too many examples of this phenomenon. The first known class is the Horodecki channels, for which the quantum capacity vanishes and the private capacity is strictly positive [25,24,39]. The smallest such example has input and output dimensions d a = d b = 3, d c = 4, and the separation is typically small. The second class is the so-called "half-rocket channels," with quantum capacity between 0.6 and 1 but private capacity log d where the input and output dimensions are This class exhibits an extensive separation of the two capacities [36]. In comparison, N s is the smallest known channel with d a = d b = 3, d c = 2 exhibiting the separation, and the separation is quite large (at least ≈ s/2 for most s of interest). A separation of the information quantities Q (1) (·) and P (1) (·) was observed for certain channels for which these quantities are also superadditive, such as the depolarizing channel [13,51] or the dephrasure channel [35]. However, due to super-additivity in these channels we do not know their exact capacities and the true separations between them.
• Reference [25] shows that a quantum state can yield a bit of classical information that is private from the environment E if and only if it is of the form where K A K B are called the key systems, S A S B are called the shield systems, |φ is the maximally entangled state on K A K B , σ is an arbitrary state on S A S B and U is a controlled unitary of the form with each U ij a unitary that depends on i and j. The key is shared between two users Alice and Bob. Alice is in possession of K A S A , Bob is in possession of K B S B , and they can generate a key by measuring along the computational basis of K A K B independently. Furthermore, if the one-way distillable key of the state is strictly greater than the one-way distillable entanglement, each of the shield systems must be nontrivial.
Since N s has 3 dimensional input and output and can send one bit privately with a single use, it can be used to make a 3 × 3 dimensional state shared by Alice and Bob that encodes one private bit. Furthermore, this state is not of the form Eq. (144). To see this, suppose the contrary. By the quantum capacity bound of N s , this state has one-way distillable entanglement strictly less than 1, so each of the shield systems must be nontrivial with at least 2 dimensions. Meanwhile, the key systems have 4 dimensions jointly, so, the total dimension exceeds 9 which is a contradiction.
The resolution is that this state is locally equivalent to a standard p-bit with each of K A , K B , S A , S B being a qubit, but for which the local ranks of both K A S A and K B S B are 3. So, our state is a p-bit, but one which has been embedded into smaller dimensional spaces than would be possible generically.
In more detail, here is the protocol to create the 3 × 3 state which distributes a private key between Alice and Bob. Alice prepares the state and applies F S (the isometry giving rise to N s ) toÃ resulting in the state The systems A, B are neither key systems nor shield systems. Consider the local isometries: Applying the above local isometries to |ν ABE results in If we trace out E from the above, we get a state of the form Eq. (144) where U 00 = U 01 = U 10 = I, U 11 is the swap operator, and σ = |00 00| + |01 01|.
• Both the private and the classical capacity of N s and M d satisfy the strong converse property. For the private capacity, the strong converse property is only known for (a) a subclass of degradable channels called generalized dephasing channels [55] (and for these channels, the quantum and private capacities coincide, Q = P); (b) DSPT channels [18]. Both N s and M d are provably non-degradable and not of DSPT form, and hence fall outside both classes. For the classical capacity, the strong converse is known for a number of channel classes: (a) erasure channels [62]; (b) depolarizing channels, unital qubit channels, and the Holevo-Werner channel [32]; (c) entanglement-breaking and Hadamard channels [63]; (d) DSPT channels [18]. By the arguments made above, neither N s nor M d belong to any of these classes.
• Finally, the coherent information Q (1) (·) is additive for both N s and M d relative to the corresponding version of the spin alignment conjecture. The known classes of channels with additive coherent information are (a) less noisy channels [59] (which includes degradable channels [11]); (b) antidegradable channels; (c) PPT channels [27]; (d) DSPT channels [18]. The channel N s is neither degradable nor anti-degradable. Since PPT channels have vanishing quantum capacity, N s and M d cannot be PPT either.
• The N s channel is unitarily equivalent to a qutrit-qutrit channel L α that was introduced in [56] to study zero-error capacities. In the follow-up work [58], the authors showed that the private and classical capacity of L α coincide (see Sec. 8 for a more detailed discussion) and satisfy the strong converse property, also noting that L α (and hence also N s ) does not belong to any of the known classes of channels with that property for the private or classical capacity listed above. They further announced (without proof) an analytical upper bound on the quantum capacity of L α separating it from the private capacity. This bound coincides with our upper bound, for which we give a full proof in Thm. 3. In our independent study we construct the related channel N s as a hybrid of two simple channels (see Sec. 3.1) and analyze in detail the additivity properties of the various information quantities of N s . Furthermore, we extend the channel construction to a larger family of channels of arbitrary dimension with similar information-theoretic properties (see Sec. 3.2 and 3.3). In the process, we also give full proofs of some of the statements announced in [58] about the capacities of the unitarily equivalent channel L α .

A 3-parameter generalization of F s
We now discuss generalizations of the channel N s and their capacities, to further our understanding of the phenomena exhibited by N s . The one-parameter isometry F s (20), for 0 ≤ s ≤ 1, has input dimension d a = 3, output dimension d b = 3, and environment dimension d c = 2. The isometry F s can be generalized by adding two parameters µ, ν ∈ [0, 1] without changing the input, output, and environment spaces, leading to an isometry V s,µ,ν : H a → H b ⊗ H c that acts as
We use W s,µ,ν to denote the resulting channel from Ĥ a to Ĥ b . The isometry V s,µ,ν becomes F s (20) when ν = µ = 1, so V s,µ,ν and W s,µ,ν indeed generalize F s and N s respectively.
We study the degradability of the channel W s,µ,ν using the framework of [48]. We call an isometry pcubed if there exists a basis of the input space that is mapped by the isometry to product states of the output space and the environment space; this special basis of the input space is not required to be orthogonal. We call a channel pcubed if it is generated by a pcubed isometry. It is straightforward to verify that, when 0 < s, µ, ν < 1, the isometry V s,µ,ν is a pcubed isometry:
where a 0 , b 0 , and c 0 are constants that respectively normalize |α i ⟩, |β i ⟩, |γ i ⟩, ω is a cube root of unity, and k 1 , k 2 , l 1 , l 2 and r are non-negative numbers related to s, µ, ν as follows:
The inner products between the states witnessing the pcubed channel also characterize its degradability. To this end, let A, B, C be the Gram matrices for the sets {|α i ⟩}, {|β i ⟩}, and {|γ i ⟩}, respectively:
Each of A, B, C has the form
where m * is the complex conjugate of m, and M = A, B, C respectively when m is set to be
respectively. As a side remark, since V s,µ,ν is an isometry, it follows that A = B * C, where * denotes the elementwise (Hadamard) product of two matrices.
The channel W s,µ,ν is degradable if and only if there is a Gram matrix D satisfying B = C * D (157). To see this, note that when such a Gram matrix D exists, there are normalized kets {|δ i ⟩} in some auxiliary Hilbert space H d such that D jk = ⟨δ j |δ k ⟩. A possible degrading map can then be generated by the pcubed isometry from H b to H c ⊗ H d that maps |β i ⟩ to |γ i ⟩ ⊗ |δ i ⟩. The converse follows from (151), since the degrading map must take |β i ⟩ to |γ i ⟩ and must be generated by an isometry. (See Sec. III.C in [48] for a detailed discussion.)
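The two Hadamard-product relations above follow from a short inner-product computation. The following is a sketch, assuming (as in the definition of a pcubed isometry) that V s,µ,ν maps |α i ⟩ to |β i ⟩ ⊗ |γ i ⟩; the symbol K for the isometry generating the degrading map is introduced here only for illustration.

```latex
% An isometry preserves inner products, also between non-orthogonal basis states.
\begin{align*}
A_{jk} = \langle \alpha_j | \alpha_k \rangle
  &= \langle \alpha_j | V^\dagger V | \alpha_k \rangle
   = \langle \beta_j | \beta_k \rangle \, \langle \gamma_j | \gamma_k \rangle
   = B_{jk} C_{jk}
  &&\Longrightarrow\quad A = B * C , \\
% A degrading isometry K with K|\beta_i> = |\gamma_i> \otimes |\delta_i> obeys the same rule.
B_{jk} = \langle \beta_j | \beta_k \rangle
  &= \langle \beta_j | K^\dagger K | \beta_k \rangle
   = \langle \gamma_j | \gamma_k \rangle \, \langle \delta_j | \delta_k \rangle
   = C_{jk} D_{jk}
  &&\Longrightarrow\quad B = C * D .
\end{align*}
```

Read in reverse, the second line is the existence argument: any positive semidefinite D with unit diagonal is the Gram matrix of some normalized kets |δ i ⟩, which is what makes K well defined.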

The isometry V s,µ,1−µ
We now consider a two-parameter subclass of isometries, W s,µ , obtained by setting ν = 1 − µ in V s,µ,ν , with s ∈ [0, 1/2] and µ ∈ [0, 1]. Following (150),
The resulting channel W s,µ = tr c (W s,µ · W † s,µ ) has two Kraus operators
As a side remark, when µ = 1, the channel W s,1 is unitarily equivalent to N s : we have N s = W s,1 (U · U † ), where the unitary U swaps |1⟩ a and |2⟩ a at the input. For the rest of the discussion we focus on s = 1/2. We will first evaluate the capacities of W 1/2,1/2 . Then, we study the degradability of W 1/2,µ , followed by a detailed numerical analysis of its capacities.
Altogether, for the quantum and private capacities we have
while for the classical capacities we have

* The channel W 1/2,1/2 can be generated by the isometry which attaches |+⟩ c to system a and, conditioned on system c being in the state |1⟩, applies the unitary K 1 to system a, which is then relabeled as system b. This isometry applied to system b generates a valid degrading map.

Keeping s = 1/2, we now return to general µ ∈ [0, 1] in studying W 1/2,µ . We first use the pcubed framework to study degradability of this subclass of channels. The values of a, b, and c in (156) are given by
Setting m in (155) to a, b, and c gives the Gram matrices A, B, and C respectively. The channel W 1/2,µ is degradable iff B = C * D for some Gram matrix D (see (157)). If such a matrix exists, it has the form given by M in (155) with m = d = 2ω(1 − 2µ)/(2 − µ), and thus is a valid Gram matrix (being positive semi-definite) when 1 + 2d^3 − 3|d|^2 ≥ 0 (see eq. (57) in [48]). Setting d = 2ω(1 − 2µ)/(2 − µ) in this inequality determines the range of µ for which W 1/2,µ is degradable.

We now evaluate lower and upper bounds on the quantum, private, and classical capacities of the channel W 1/2,µ for µ ∈ [0, 1], collected in Fig. 2. The bounds on the quantum capacity Q(W 1/2,µ ) are obtained by numerically optimizing the single-letter coherent information Q (1) (W 1/2,µ ) (solid blue line in Fig. 2) and evaluating the SDP upper bound (105) (dashed blue line in Fig. 2). Interestingly, the (possibly tighter) SDP upper bound from [14] coincides with the SDP bound (105) from [57] for all µ ∈ [0, 1]. To bound the classical capacity C(W 1/2,µ ), we numerically optimize the single-letter Holevo information χ(W 1/2,µ ) (solid green line in Fig. 2) and evaluate the SDP upper bound (131) (dashed green line in Fig. 2). The entanglement-assisted capacity C E (W 1/2,µ ) is computed using the technique developed in [16] (solid magenta line in Fig. 2). Finally, for the private capacity P(W 1/2,µ ) we numerically optimize the single-letter private information P (1) (W 1/2,µ ) (solid orange line in Fig. 2). To obtain an upper bound on the private capacity we employ the following recent result by Fawzi and Fawzi [14], which bounds the private capacity of a quantum channel in terms of a conic program:

Proposition 10 ([14]). Let B : Ĥ a → Ĥ b be a quantum channel with (unnormalized) Choi-Jamiołkowski operator J B ab . Let furthermore l ∈ N and set α = 1 + 2^(−l) . Then we have
where Ê α (B) = l 2^l − (2^l + 1) log(2^l + 1) + (2^l + 1) log T α (B), and T α (B) is the solution of the following conic program:
In the above, X H = X + X † , and an operator X ab is block-positive (with respect to the bipartition a : b) if (⟨ψ a | ⊗ ⟨φ b |) X ab (|ψ a ⟩ ⊗ |φ b ⟩) ≥ 0 for all |ψ a ⟩ ∈ H a and |φ b ⟩ ∈ H b . Hence, block-positive bipartite operators are the Choi-Jamiołkowski operators of positive maps (see, e.g., [64]).
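Block-positivity is strictly weaker than positive semidefiniteness. As a minimal illustration (not taken from [14]), the sketch below uses the two-qubit swap operator, a standard example chosen here for convenience: its expectation value on every product state is |⟨ψ|φ⟩|^2 ≥ 0, yet it has eigenvalue −1 on the antisymmetric state. All helper names are hypothetical, and sampling random product states is only a necessary check, not a proof of block-positivity.

```python
import math
import random

def random_state(dim, rng):
    """Random normalized complex vector of the given dimension."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
    norm = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / norm for x in v]

def kron(u, v):
    """Tensor product of two vectors; entry i*len(v)+j equals u[i]*v[j]."""
    return [a * b for a in u for b in v]

def expectation(X, v):
    """Quadratic form <v|X|v> for a square matrix X given as nested lists."""
    n = len(v)
    return sum(v[i].conjugate() * X[i][j] * v[j] for i in range(n) for j in range(n))

def swap_operator(d):
    """Matrix of the operator that maps |i>|j> to |j>|i> on C^d (x) C^d."""
    F = [[0.0] * (d * d) for _ in range(d * d)]
    for i in range(d):
        for j in range(d):
            F[j * d + i][i * d + j] = 1.0
    return F

rng = random.Random(0)
F = swap_operator(2)

# Block-positivity test on sampled product states |psi> (x) |phi>.
product_values = [
    expectation(F, kron(random_state(2, rng), random_state(2, rng))).real
    for _ in range(1000)
]

# The antisymmetric (entangled) state witnesses that F is not positive semidefinite.
antisymmetric = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]
antisymmetric_value = expectation(F, antisymmetric).real

print(min(product_values) >= -1e-12, antisymmetric_value)
```

Testing the constraint only on a finite list of states, as done here for illustration, enlarges the feasible set of operators; this is why a finite-sample version of a block-positivity constraint is a relaxation.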
The conic program in Proposition 10 only reduces to an SDP if d a d b ≤ 6, whereas our channel W 1/2,µ has qutrit input and output, d a = d b = 3. However, the following strategy suggested to us by Hamza Fawzi [15] may be employed to obtain an (SDP-computable) upper bound on the quantity T α (B) (and hence Ê α (B)), in turn giving an upper bound on the private capacity of a channel B via Proposition 10:

Lemma 11 ([14, 15]). With the same notation as in Proposition 10, we have the following bound on the private capacity of a quantum channel B : Ĥ a → Ĥ b :
where N ∈ N is some fixed natural number, the minimization is over sets of pure states φ i a ∈ H a , i = 1, . . . , N , and the quantity
In the above,
is the solution of the following semidefinite program:
where σ ab = ρ a ⊗ 1 b − Z H 0 .

Proof. The block positivity constraint on σ ab = ρ a ⊗ 1 b − Z H 0 translates via the Choi isomorphism to positivity of the map Ψ : Ĥ a → Ĥ b whose Choi-Jamiołkowski operator is σ ab , i.e., Ψ(χ a ) ≥ 0 for all pure states |χ a ⟩ ∈ H a . Relaxing this positivity constraint to only requiring Ψ(φ i a ) ≥ 0 for some fixed pure states φ i a , i = 1, . . . , N , now yields a maximization over a larger set of Choi-Jamiołkowski operators resp. maps Ψ, and hence we obtain

The upper bound on P(W 1/2,µ ) obtained from Lemma 11 is plotted in Fig. 2 (dotted orange line). We also performed a similar numerical analysis for the capacities of the complementary channel W c s,µ ; see Figure 3.

Figure 2 reveals a number of interesting properties of the 1-parameter channel family W 1/2,µ :
• For µ ≲ 0.8, the coherent information (solid blue line in Fig. 2) and private information (solid orange line in Fig. 2) coincide. The channel is degradable for µ ≲ 0.65 and anti-degradable for µ = 0, see Fig. 4.
• For µ ≳ 0.8, the private information (solid orange line in Fig. 2) is strictly larger than the coherent information (solid blue line in Fig. 2). For µ ≳ 0.9 the private information exceeds the SDP upper bound on the quantum capacity (dashed blue line in Fig. 2), hence giving a provable separation between quantum and private capacity.
• The upper bound on P(W 1/2,µ ) derived via Lemma 11 (dotted orange line in Fig. 2) clearly separates the private capacity from the classical capacity for all µ < 1.
• At µ ≈ 0.8 the Holevo information (solid green line in Fig. 2) has an inflection point, changing from concave to convex. For µ ≲ 0.8 the optimal Holevo information is achieved by an ensemble of three pure states, whereas for µ ≳ 0.8 four pure states are needed. This may be a signature of super-additivity of Holevo information, and will be further investigated in future work.
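The degradability range quoted in the first item can be cross-checked from the Gram-matrix criterion alone. The sketch below assumes that the form M in (155), which is not reproduced in this section, is the circulant Hermitian matrix [[1, m, m*], [m*, 1, m], [m, m*, 1]]; this assumption is consistent with the determinant condition 1 + 2d^3 − 3|d|^2 ≥ 0 quoted earlier. The function names are hypothetical. With d = rω and r = 2(1 − 2µ)/(2 − µ), the eigenvalues are 1 − r (twice) and 1 + 2r, so under this assumption D is a Gram matrix exactly when µ ≤ 2/3, broadly in line with the value µ ≲ 0.65 read off Fig. 4.

```python
import cmath
import math

OMEGA = cmath.exp(2j * math.pi / 3)  # primitive cube root of unity

def d_of_mu(mu):
    """Off-diagonal parameter d = 2*omega*(1 - 2*mu)/(2 - mu) quoted in the text."""
    return 2 * OMEGA * (1 - 2 * mu) / (2 - mu)

def gram_eigenvalues(m):
    """Eigenvalues of the ASSUMED circulant form [[1, m, m*], [m*, 1, m], [m, m*, 1]]."""
    return [1 + 2 * (m * OMEGA ** k).real for k in range(3)]

def gram_determinant(m):
    """Determinant of the same matrix: 1 + 2 Re(m^3) - 3 |m|^2."""
    return 1 + 2 * (m ** 3).real - 3 * abs(m) ** 2

def is_gram(mu, tol=1e-12):
    """True if D built from d(mu) is positive semidefinite up to tolerance."""
    return min(gram_eigenvalues(d_of_mu(mu))) >= -tol

def gram_threshold(lo=0.5, hi=1.0, iterations=60):
    """Bisect for the largest mu in [lo, hi] at which D is still a Gram matrix."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if is_gram(mid):
            lo = mid
        else:
            hi = mid
    return lo

threshold = gram_threshold()
print(f"D stops being a Gram matrix near mu = {threshold:.6f}")
```

Bisection is justified here because r decreases monotonically in µ on [0, 1], so the smallest eigenvalue 1 + 2r crosses zero only once in the search interval.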

Concluding remarks
We have studied families of channels that are very simple, yet still nontrivial in terms of their capacities. Our goal is to better understand the boundary between trivially solvable and incomprehensibly complex behavior. Our main results demonstrate how intricately narrow this boundary can be, with complex quantum effects arising in seemingly innocent and generic settings. We conclude this paper by highlighting some of these results.
The primary example of a quantum channel in this study is obtained by combining two very simple channels. As mentioned at the end of Sec. 3.1, we construct the channel of interest N s by "hybridizing" a degradable channel N 2 and a completely useless channel N 1 . The quantum, private, and classical capacities of N 1 are all zero. Meanwhile, the coherent information and various capacities of the degradable channel N 2 can be evaluated: Q (1) (N 2 ) = Q(N 2 ) = P(N 2 ) < 1 = C(N 2 ). Both channels have 2-dimensional inputs and are nothing extraordinary on their own. We "stitch" these channels together via a common joint output state to the receiver and the environment. The resulting hybrid channel N s has capacities bearing interesting relationships with those of N 2 : their coherent informations are identical, Q (1) (N s ) = Q (1) (N 2 ), and conditioned on the spin alignment conjecture, Q(N s ) = Q(N 2 ) (see Fig. 1). Meanwhile, both the private and classical capacity of N s are equal to the classical capacity of N 2 . In other words, starting from N 2 and "stitching" onto it a completely useless channel N 1 boosts the private capacity of N 2 from its quantum capacity to its strictly larger classical capacity, while all other quantities remain the same.
Figure 2: Lower and upper bounds (UB) on the capacities of the quantum channel W 1/2,µ defined via the isometry (158), plotted as functions of µ. The quantum capacity Q(W 1/2,µ ) is bounded from below by the single-letter coherent information Q (1) (W 1/2,µ ) (solid blue) and from above by the SDP bound (105) derived in [57] (dashed blue). The private capacity P(W 1/2,µ ) is bounded from below by the private information P (1) (W 1/2,µ ) (solid orange) and from above by the bound given by Lemma 11 (dotted orange). The classical capacity C(W 1/2,µ ) is bounded from below by the Holevo information χ(W 1/2,µ ) (solid green) and from above by the SDP bound (131) derived in [58] (dash-dotted green). We also plot the entanglement-assisted classical capacity C E (W 1/2,µ ) (solid magenta), computed using the technique in [16]. For µ = 1/2 the channel W 1/2,1/2 is a dephasing channel, for which the special values of the capacities from (161) are marked on the right-hand side.

Figure 3: Lower and upper bounds (UB) on the capacities of the complementary channel W c 1/2,µ of the channel W 1/2,µ defined via the isometry (158). The quantum capacity Q(W c 1/2,µ ) is bounded from below by the single-letter coherent information Q (1) (W c 1/2,µ ) (solid blue) and from above by the SDP bound (105) derived in [57] (dashed blue). The private capacity P(W c 1/2,µ ) is bounded from below by the private information P (1) (W c 1/2,µ ) (solid orange) and from above by the bound given by Lemma 11 (dotted orange). The Holevo information χ(W c 1/2,µ ) coincides with the SDP bound (131) derived in [58], and is hence (numerically) equal to the classical capacity C(W c 1/2,µ ) (solid green). We also plot the entanglement-assisted classical capacity C E (W 1/2,µ ) (solid magenta), computed using the technique in [16].

Figure 4: Degradability parameter dg(W 1/2,µ ) (blue) and antidegradability parameter adg(W 1/2,µ ) (magenta), which are computed using the SDPs from [54]. A quantum channel B is degradable iff dg(B) = 0, and anti-degradable iff adg(B) = 0.

Both the classical and private capacities of N s are, quite intuitively, equal to 1. This value as an upper bound comes from the uselessness of the input state |2⟩ for sending classical information in addition to what can already be sent using the states |0⟩ and |1⟩, a property inherited from the uselessness of N 1 . The remaining input space is 2-dimensional, so the classical capacity cannot exceed 1. This value as a lower bound comes from a simple, single-letter, perfect, private code using two signalling states: |0⟩, and a mixture of |1⟩ and |2⟩. These give rise to orthogonal states at the output but identical states at the environment. Note that this private code is made possible by the stitching of N 1 to N 2 ; in fact, this quantum "stitch" contributes to a minimal shield in the p-bit framework discussed in Sec. 9. Since the private classical rate cannot exceed the classical capacity, both private and classical capacities must be 1. It is highly non-trivial to evaluate these capacities rigorously; N s does not belong to any known class of channels with additive Holevo and private informations. These capacity calculations also prove that the channel's Holevo and private information are additive in the sense χ(N s ) = C(N s ) and P (1) (N s ) = P(N s ). This additivity, however, is shown using arguments that are very different from those used in prior works. Furthermore, in Sec. 8, we not only find the classical and private capacity of N s and its higher dimensional analogue M d , we also provide a strong converse bound for each of these capacities. The bound shows that any classical or private transmission rate exceeding the capacity incurs an error that converges to 1 exponentially fast. Such strong converse bounds are unavailable for most quantum channels even when one can compute their capacities.
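The single-letter private code described above can be checked in a few lines. The sketch below rests on two assumptions that are not stated in this section: that the isometry F s acts as |0⟩ → √s |0⟩|0⟩ + √(1−s) |1⟩|1⟩, |1⟩ → |2⟩|0⟩, |2⟩ → |2⟩|1⟩ (output system first, environment second), and that the second signalling state mixes |1⟩ and |2⟩ with weights s and 1 − s. The helper names are hypothetical.

```python
import math

def platypus_images(s):
    """ASSUMED action of F_s: images[i][b][c] is the amplitude of |b>|c> in F_s|i>."""
    r0, r1 = math.sqrt(s), math.sqrt(1 - s)
    return [
        [[r0, 0.0], [0.0, r1], [0.0, 0.0]],    # |0> -> sqrt(s)|0>|0> + sqrt(1-s)|1>|1>
        [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0]],  # |1> -> |2>|0>
        [[0.0, 0.0], [0.0, 0.0], [0.0, 1.0]],  # |2> -> |2>|1>
    ]

def marginals(psi):
    """Reduced states (output, environment) of a real amplitude table psi[b][c]."""
    rho_b = [[sum(psi[i][c] * psi[j][c] for c in range(2)) for j in range(3)]
             for i in range(3)]
    rho_c = [[sum(psi[b][i] * psi[b][j] for b in range(3)) for j in range(2)]
             for i in range(2)]
    return rho_b, rho_c

def mix(weights, matrices):
    """Convex combination of square matrices given as nested lists."""
    n = len(matrices[0])
    return [[sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(n)]
            for i in range(n)]

def trace_of_product(x, y):
    """tr(x y); it vanishes for positive semidefinite x, y with orthogonal supports."""
    n = len(x)
    return sum(x[i][j] * y[j][i] for i in range(n) for j in range(n))

s = 0.3
images = platypus_images(s)

# Signalling state 0: the pure input |0>.
out0, env0 = marginals(images[0])

# Signalling state 1: the mixture s|1><1| + (1-s)|2><2| (weights are an assumption).
out_parts, env_parts = zip(*(marginals(images[i]) for i in (1, 2)))
out1 = mix([s, 1 - s], out_parts)
env1 = mix([s, 1 - s], env_parts)

output_overlap = trace_of_product(out0, out1)
env_gap = max(abs(env0[i][j] - env1[i][j]) for i in range(2) for j in range(2))
print(output_overlap, env_gap)
```

Under these assumptions the receiver's two states have disjoint supports while the environment's two states agree, which is the property the lower bound on the private capacity relies on.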
The quantum capacity of N s is somewhat more complicated, but apparently can also be understood. In particular, by restricting to the input space of N 2 , the quantum capacities are related as Q(N 2 ) ≤ Q(N s ). Relative to the spin alignment conjecture, we find that Q(N s ) = Q(N 2 ) where Q(N 2 ) = Q (1) (N 2 ) because of its degradability. The capacity of N s can be understood this way even though it is neither degradable nor antidegradable. A rigorous proof that the capacity of N s is additive would follow from the spin-alignment conjecture. Such a proof would be qualitatively different from prior additivity proofs. Our hope is that this spin-alignment conjecture, which is at heart about the geometric structure of states minimizing entropy, will lead to further progress on additivity questions in information theory.
We finally have a family of channels that are not of the usual tractable types and yet for which we know the classical, private, and quantum capacities. Underlying this superficial simplicity, the private capacity is much higher than the quantum capacity, a signature of novel quantum effects at play. These channels, and their higher dimensional generalizations, continue to surprise. One may have thought that the weak additivities observed here would portend strong additivity. But nothing could be further from the truth. In a companion paper [34], we find that the coherent information of the channel N s tensored with an assisting channel is super-additive, for a large swath of values of s and for some generically chosen assisting channel. The super-additivity can be lifted to quantum capacity for degradable assisting channels if the spin alignment conjecture holds. The assisting channel can have positive or vanishing quantum capacity. The mechanism behind this superadditivity is novel and in particular differs from the known explanation of super-activation [53,38]. Additional super-additivity of quantum capacity that is unconditional on the spin alignment conjecture can be proved for the d-dimensional generalization M d of N 1/2 , when it is used with a (d−1)-dimensional erasure channel for all nontrivial values of the erasure probability! In contrast, the pan-additivity of our channels on their own is a fascinating finding.

A.1 N s channel
Let H a ≅ H b ≅ C 3 and H c ≅ C 2 , and s ∈ [0, 1/2]. The platypus channel N s : Ĥ a → Ĥ b can be defined as follows: