How Deep the Theory of Quantum Communications Goes: Superadditivity, Superactivation and Causal Activation

In the theory of quantum communications, a deeper structure has recently been unveiled, showing that the capacity does not completely characterize the channel's ability to transmit information, owing to phenomena (namely, superadditivity, superactivation and causal activation) with no counterpart in the classical world. Although how deep this structure goes is yet to be fully uncovered, it is crucial for the communication engineering community to grasp the implications of these phenomena for understanding and deriving the fundamental limits of communications. Hence, the aim of this treatise is to shed light on these phenomena by providing the reader with an easy-access guide to the relevant literature and the prominent results from a communication engineering perspective.

is where quantum theory comes into play in the study of communication channels [2]. As a matter of fact, any two parties wishing to exchange information should encode it in the state of some system acting as information carrier. Whenever the system exhibits a quantum nature (such as a photonic pulse propagating through an optical fiber), the propagation of the information carrier as well as the overall processing must follow the principles and the laws of quantum mechanics. Accordingly, as a generalization of channels in Shannon theory, quantum channels are introduced, linking the initial states of quantum information carriers controlled by Alice with their output states manipulated by Bob.
One surprising quantum effect, which can serve as a resource for this paradigm shift from classical to quantum communications, is quantum entanglement. This new type of correlation, with no classical counterpart, can boost communication capabilities drastically. In fact, although an entangled state shared between Alice and Bob does not, by itself, provide any communication capability [3], when used to assist a quantum channel it can enhance the performance by doubling the classical capacity, as in quantum superdense coding [4]. Even more surprisingly, it can enable the transfer of quantum information through the transmission of two classical bits, as in quantum teleportation [3,4,5].
However, quantum Shannon theory has more to offer, as summarized in Table I and pictorially represented in Figure 1. Indeed, a proper channel encoder that is allowed to encode the information (either classical or quantum) into entangled states enhances the performance achievable through a quantum channel. This potential gain is referred to as superadditivity of the quantum channel capacity, and it has been the subject of a long and heated debate in the quantum communications community [6,7,8,9].
Even more astonishingly, there exist pairs of channels that, although individually unable to transmit any amount of quantum information, are able to transmit information when used together on entangled inputs. This is known as the superactivation phenomenon [10,11,12], which shows that the quantum capacity is a strongly non-additive quantity.
Both the superadditivity and the superactivation phenomena, which have no counterpart in classical Shannon theory, raise the yet-unsolved question of how different noisy channels interact and enhance each other's capabilities, as we will highlight and discuss in the following.

| | Classical Communications | Quantum Communications |
|---|---|---|
| non-zero-capacity channels | n uses of a communication channel do not transmit more than n times the amount of information that can be transmitted with a single channel use (additivity) | n uses of a communication channel can transmit more than n times the amount of information that can be transmitted with a single channel use (superadditivity); channels combined in a quantum trajectory can transmit more information with respect to a classical placement of the same channels (causal activation) |
| zero-capacity channels | cannot transmit information, regardless of the number of uses and/or the placement of these channels | can transmit information either with a classical placement of different channels (superactivation) or by combining the channels in a quantum trajectory (causal activation) |

Table I: Classical vs quantum communications. Superadditivity, superactivation, and causal activation can enable an unparalleled boost of the capacity of a quantum channel, which is not achievable in classical communications.
But the marvels of the quantum realm are by no means limited to the unconventional phenomena of superadditivity and superactivation. Indeed, quantum Shannon theory deals with information encoded in quantum carriers, but it still considers the propagation of information through classical trajectories, so that the path taken by messages in space is always well defined, i.e., the channels are in a definite causal order.
Counter-intuitively, quantum mechanics allows quantum particles to propagate simultaneously along multiple spacetime trajectories. This ability enables a quantum information carrier to propagate through a quantum trajectory [13,14,15,16,17]. An important setup is given by a quantum trajectory where the constituent communication channels are combined in a quantum superposition of different orders, so that the causal order of the channels becomes indefinite. This unconventional placement of the channels is theoretically and experimentally implemented through the quantum switch, which is a supermap resulting from an extension of quantum mechanics known as the process matrix formalism [18,19] or, earlier, as quantum combs [20,21].
The superposition of trajectories and the quantum switch supermap have proved able to describe powerful setups for the transmission of classical/quantum information [22]. For instance, whenever Alice and Bob are restricted to use quantum channels with zero classical capacity, no classical information can be sent through any classical configuration of the channels, neither parallel nor sequential. Conversely, a causal activation¹ of the classical capacity² occurs when the channels are placed in a quantum configuration through the quantum switch, and a non-vanishing amount of information can be transmitted from Alice to Bob.
The unconventional phenomenon of causal activation led researchers to work toward the extension of quantum Shannon theory for modelling the coherent superposition of quantum channels [15], as well as the superposition of their causal orders [16], as a communication resource. This extension should not come as a surprise. Indeed, also within the "classical" quantum Shannon theory, phenomena such as superadditivity and superactivation prove that the communication potential of a channel strictly depends on the context in which it is used. Hence, genuine quantum phenomena play a paramount role for future communications, and they should be fully understood and harnessed to achieve unprecedented information transfer capacities.
¹ The term causal activation was coined in [14] to distinguish the phenomenon of activating vanishing capacities of quantum channels with indefinite causal order of channels from the known phenomenon of superactivation [16].
² Indeed, causal activation occurs also for quantum capacities, as discussed in Section VI.

A. Outline and Contribution
As mentioned above, superadditivity, superactivation, and causal activation are all phenomena affecting the fundamental notion of channel capacity, as introduced by Claude Shannon in his seminal work [1], in ways with no counterpart in classical Shannon theory. Unfortunately, the existing literature is written by and for the physics community. This still leaves a fundamental gap between the literature and the communications engineering community.
The aim of this paper is precisely to bridge this gap, by introducing the most novel, astonishing and intriguing properties of quantum communications, which can:
• provide a capacity gain for both classical and quantum information through the superadditivity phenomenon,
• provide a non-null capacity for quantum information through the superactivation phenomenon,
• provide either a capacity gain (when the individual channels exhibit non-null capacity) or a non-null capacity (when the individual channels are zero-capacity channels) for both classical and quantum information through the causal activation phenomenon, by exploiting the genuine quantum placement of quantum channels provided by quantum trajectories.
Stemming from the discussion above, in the following we shed light on the notions of superadditivity and superactivation of quantum channel capacities, as well as on the more recently discovered phenomenon of causal activation of different capacities, which accompanies the propagation of information along quantum trajectories, with the objective of allowing the reader: i) to understand the implications of these phenomena for deriving the fundamental limits of communications; ii) to grasp the challenges, as well as to appreciate the marvels, arising with the paradigm shift from designing classical communications to designing quantum communications. Throughout the manuscript, the nature of these phenomena and, in particular, the differences among the resources responsible for these advantages are elaborated. In fact, understanding these phenomena is key to grasping how different resources can be distributed through quantum networks [23] more efficiently, and how they can be used optimally in the engineering of a near-term Quantum Internet [24,25,26,27,28,29,30,31,32]. Indeed, due to the fast growth of both fields, such an understanding will provide the quantum engineering and the communications engineering communities alike with an easy-access guide to the relevant literature and to the prominent results, which will be of paramount importance for designing efficient communication protocols.
To the best of the authors' knowledge, this tutorial is the first of its kind.
The paper is structured as depicted in Figure 2. Specifically, in Section II, we provide the reader, assuming a basic background in classical Shannon theory, with a concise description of the preliminaries needed to understand and formally characterize these phenomena. Then, in Section III, we give an informal description of the three unconventional phenomena (superadditivity, superactivation and causal activation) from a communication engineering perspective. In Section IV, we first discuss the superadditivity phenomenon for one-shot capacities, i.e., Holevo information and coherent information, and then we generalize our discussion to regularized capacities. In Section V, we detail the superactivation phenomenon for quantum capacities, and we point out the rationale for its being restricted to quantum information. In Section VI, we discuss the causal activation phenomenon for different capacities, ranging from Holevo information through coherent information to classical and quantum regularized capacities. Finally, we conclude our tutorial in Section VII. Specifically, we first summarize the differences and similarities between the communication advantages of these three phenomena, in terms of the resources enabling these advantages. Then, we discuss the challenges and open problems arising with the engineering of these phenomena from a communication engineering perspective. Supplementary material is included in Appendices A-E, with the aim of providing readers outside the specialty of the article with an easy-to-consult summary of some definitions and results.

II. PRELIMINARIES
Over its almost 100-year history, quantum mechanics has not only strikingly challenged our view of Nature: its novel counter-intuitive concepts without classical counterparts [33] have also found applications in a plethora of branches of science and engineering, and have revolutionized them. This has turned quantum mechanics from a formalism built to describe certain unexplained physical phenomena (e.g., blackbody radiation and the photoelectric effect) and to fit experimental data into a machinery that can be used to develop technologies relying on quantum effects.
Here, we provide a concise introduction to the concepts and formalism needed to present and discuss the phenomena of superadditivity, superactivation and causal activation. The basic notions and the notation adopted throughout the paper are summarized in Table II and Table III, respectively.
A.1) The quantum bit: What makes quantum mechanics attractive from a communications engineering perspective? First of all, its very principles offer a novel way to treat information when it is encoded in a quantum system. Classically, two mutually exclusive states, i.e., 0 and 1, can be encoded in a bit, which is in only one of these states at any time. Conversely, suppose now that two states $|0\rangle$ and $|1\rangle$ of a quantum two-level system (for example, the polarization of a photon) are used to encode them³. In this case, the superposition principle, the cornerstone of quantum mechanics, allows going beyond the classical behavior of a bit, since the system can be in both states simultaneously. Hence, we can introduce the quantum bit (qubit), whose state $|\psi\rangle$ encodes more than simply the states $|0\rangle$ and $|1\rangle$, since it can be in a superposition of them as follows:
$$|\psi\rangle = \alpha |0\rangle + \beta |1\rangle, \tag{1}$$
with $\alpha, \beta \in \mathbb{C}$, known as amplitudes, satisfying $|\alpha|^2 + |\beta|^2 = 1$. Hence, a qubit can encode not only classical information (the states $|0\rangle$ and $|1\rangle$) but also quantum information, manifested in the coherence (carried by the complex amplitudes $\alpha$ and $\beta$) it can possess. This type of information has no classical counterpart. An important consequence of the superposition principle is a new way of processing and encoding information⁴, which can be exploited to significantly increase the security of communications and even to exchange information without actual transmission of the information carrier between the parties [36,37]. A rigorous definition of the qubit is given in Appendix A.
³ Above we utilized the bra-ket notation usually adopted for quantum states. For a proper introduction to this notation, we refer the reader to Appendix A.
⁴ An illustration of this feature is provided by the Elitzur-Vaidman bomb testing problem: we are supposed to have a bunch of bombs that are activated by a sensor absorbing a photon. Since some sensors have a defect and do not absorb photons, we have to select the working bombs from the bunch. Classically, there is no way to find out whether a bomb works properly without making it actually explode by shining light on the sensor. However, if a photon hits a half-silvered mirror before reaching the sensor, the superposition principle allows one to distinguish (probabilistically, with a success ratio as high as 33%) between the working and the faulty bombs, and to select some of the working ones without explosion [34,35].
Table III: Adopted notation and section of the manuscript where the notation is defined or introduced.
A.2) Quantum measurement: In order to retrieve data from a qubit, one has to perform a measurement of the degree of freedom encoding the information (for example, the polarization of the photon). For a qubit in a superposition state, the result of the measurement is probabilistic, according to Born's rule of quantum mechanics. For instance, for the qubit given in (1), one obtains state $|0\rangle$ with probability $|\alpha|^2$ and state $|1\rangle$ with probability $|\beta|^2$, hence retrieving at most one bit of information. Crucially, the measurement causes the state of the qubit to collapse to the measured state. Indeed, if the measurement of the qubit given in (1) has revealed the state $|0\rangle$, any further measurement will reveal the same outcome, regardless of the initial superposition. This means that the measurement irreversibly alters the state of the qubit, which thus loses the coherence previously existing between the two states $|0\rangle$ and $|1\rangle$. A formal definition of quantum measurements is given in Appendix A.
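The measurement rule described above can be illustrated numerically. The following is a minimal sketch, assuming NumPy is available; the function name `measure_qubit`, the seed and the chosen amplitudes are illustrative choices and do not come from the original text. It samples computational-basis outcomes according to Born's rule and returns the collapsed state.

```python
import numpy as np

def measure_qubit(state, rng):
    """Measure a qubit in the computational basis.

    Returns the outcome (0 or 1) and the post-measurement state.
    """
    state = np.asarray(state, dtype=complex)
    state = state / np.linalg.norm(state)       # enforce |alpha|^2 + |beta|^2 = 1
    probs = np.abs(state) ** 2                   # Born's rule
    outcome = int(rng.choice(2, p=probs))
    collapsed = np.zeros(2, dtype=complex)
    collapsed[outcome] = 1.0                     # collapse onto the measured basis state
    return outcome, collapsed

rng = np.random.default_rng(seed=0)              # hypothetical seed, for repeatability
psi = np.array([np.sqrt(0.3), 1j * np.sqrt(0.7)])  # alpha, beta (illustrative values)

# Measuring many identically prepared qubits estimates |beta|^2.
outcomes = [measure_qubit(psi, rng)[0] for _ in range(10_000)]
freq_one = float(np.mean(outcomes))

# Measuring the *same* qubit again always reproduces the first outcome.
first, post = measure_qubit(psi, rng)
repeat = [measure_qubit(post, rng)[0] for _ in range(100)]

print(f"relative frequency of outcome 1: {freq_one:.3f}")
print(f"first outcome: {first}, repeated outcomes all equal: {all(r == first for r in repeat)}")
```

The relative frequency approaches $|\beta|^2$ only over many identically prepared qubits; a single measurement yields one bit and erases the amplitudes.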
A.3) No-cloning: Classical communication protocols rely on the ability to copy the information and to transmit it to many different users. This fundamental assumption is widely exploited throughout the whole protocol stack [38]. Conversely, quantum information cannot be copied or cloned, as stated by the no-cloning theorem [39]. In simpler terms, quantum information cannot be multicast or broadcast, contrary to classical information. Consequently, the no-cloning theorem poses drastic and unconventional challenges for the design of quantum networks, as most of the known classical protocols cannot be extended to the quantum paradigm [38]. Fortunately, some of these restrictions can be circumvented in a non-trivial way by relying on the notion of entanglement and its astounding advantages.
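A standard argument, sketched here only for illustration, shows why a universal cloning device cannot exist. The symbols $U$, $|\phi\rangle$ and the blank register state $|0\rangle$ are notation introduced for this sketch. Suppose a single unitary $U$ could copy two arbitrary states $|\psi\rangle$ and $|\phi\rangle$ onto a blank register:
$$U\big(|\psi\rangle \otimes |0\rangle\big) = |\psi\rangle \otimes |\psi\rangle, \qquad U\big(|\phi\rangle \otimes |0\rangle\big) = |\phi\rangle \otimes |\phi\rangle.$$
Since unitary operations preserve inner products, taking the inner product of the two equalities gives
$$\langle \psi | \phi \rangle = \langle \psi | \phi \rangle^{2},$$
so that $\langle \psi | \phi \rangle \in \{0, 1\}$. Hence the same device can copy only states that are identical or mutually orthogonal, such as the classical states $|0\rangle$ and $|1\rangle$, whereas a generic superposition as in (1) cannot be cloned.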

A.4) Entanglement:
The superposition principle leads to a number of intriguing, genuinely quantum phenomena, including the celebrated entanglement [40]. Entanglement is a form of correlation between the parties of some (joint) system that has no classical counterpart. In his seminal paper [41], John Bell established constraints on the correlations between two systems that cannot be violated by classical correlations. These constraints can be formalized as inequalities for the statistical properties of the outcomes of measurements performed on the joint system (the most famous form of Bell inequalities is also known as the CHSH inequalities). It has been shown that quantum entanglement can violate such inequalities. This makes entanglement an invaluable resource that might beat classical resources in different communication contexts. Although the question of what entanglement is useful for remained open until the end of the last century, entanglement has eventually been harnessed to outperform classical communication protocols and to provide security for quantum key distribution [42,43]⁵. Specifically, quantum superdense coding [5] went against what was previously known in information theory to be a coding bound for classical information. Classically, if a sender, say Alice, wants to communicate a two-bit message to a receiver, say Bob, she has to use a single-bit classical channel twice. The same still holds even if Alice and Bob [...]. Conversely, if Alice and Bob share entanglement a priori, a two-bit classical message can be sent through a single use of a quantum channel. Furthermore, this protocol has been proved not only to outperform classical communication protocols, but also to be extremely secure [5,42,43]. But there is more to it. A qubit can never be transmitted using only classical channels, as the latter cannot preserve the genuine quantum coherence [44]. Luckily, the quantum teleportation protocol, the dual of superdense coding, allows for the transmission of an unknown qubit state using a two-bit classical channel [4], by exploiting again entanglement as a fundamental resource. The design of these two protocols challenged the classical notions of information theory and classical communications, and it opened the door towards a new era of quantum communications.
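The superdense coding protocol mentioned above can be checked with a few lines of linear algebra. The sketch below assumes NumPy and an ideal (noiseless) qubit channel; the encoding convention $Z^{b_1} X^{b_2}$ and the helper names `encode` and `decode` are illustrative choices, not taken from the original text. Alice acts only on her half of a shared Bell pair, sends that single qubit, and Bob recovers both bits with a Bell-basis measurement.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Shared Bell pair (|00> + |11>)/sqrt(2): first qubit held by Alice, second by Bob.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

def encode(bits):
    """Alice applies Z^b1 X^b2 to her qubit only (assumed convention)."""
    b1, b2 = bits
    op = np.linalg.matrix_power(Z, b1) @ np.linalg.matrix_power(X, b2)
    return np.kron(op, I2) @ bell

# Bell basis used by Bob, labelled by the two-bit message each state encodes.
BELL_BASIS = {
    (0, 0): np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2),
    (0, 1): np.array([0, 1, 1, 0], dtype=complex) / np.sqrt(2),
    (1, 0): np.array([1, 0, 0, -1], dtype=complex) / np.sqrt(2),
    (1, 1): np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2),
}

def decode(state):
    """Bob's Bell measurement: return the most likely two-bit label."""
    probs = {bits: abs(np.vdot(b, state)) ** 2 for bits, b in BELL_BASIS.items()}
    return max(probs, key=probs.get)

for message in BELL_BASIS:
    print(message, "->", decode(encode(message)))
```

Because the four encoded states are mutually orthogonal, Bob's measurement identifies the message with certainty, even though only one qubit travelled through the channel.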
A.5) Quantum channels: In communications engineering, information is usually encoded according to the physical medium that carries it. This physical medium is usually modeled as a classical channel, which does not take into account the quantum mechanical properties of the physical system carrying the information. Conversely, in quantum communications, quantum channels model the physical medium by considering the quantum mechanical properties of the physical carrier as well as its quantum interactions with the physical environment. The rationale for this is to keep track of the coherence present in the physical carrier, and to harness its advantages in encoding classical and quantum information alike. Indeed, quantum channels, with particular instances given by optical fibers and free space carrying quantum light, might be seen as transformations of the state of a given quantum mechanical system, inducing its evolution from an initial state (input of the channel) to a final state (output of the channel). Accordingly, classical channels might be seen as a particular class of quantum channels where quantum coherences are completely absent. This paradigm shift from classical to quantum channels affects the very concept of capacity, the quantity characterizing the performance of communication channels, as introduced in the following paragraph.
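To make the picture of a channel as a state transformation concrete, the sketch below applies a qubit dephasing channel to a superposition state. Both the Kraus-operator form $\rho \mapsto \sum_k K_k \rho K_k^\dagger$ and the dephasing channel itself are standard textbook tools assumed here for illustration; they are not introduced in this section, and the helper names are hypothetical. NumPy is assumed.

```python
import numpy as np

def apply_channel(kraus_ops, rho):
    """Output state sum_k K rho K^dagger for a channel given by Kraus operators."""
    return sum(K @ rho @ K.conj().T for K in kraus_ops)

def dephasing_kraus(p):
    """Kraus operators of a dephasing channel: Z is applied with probability p."""
    Z = np.diag([1.0, -1.0]).astype(complex)
    return [np.sqrt(1 - p) * np.eye(2, dtype=complex), np.sqrt(p) * Z]

# Input: the equal superposition (|0> + |1>)/sqrt(2), written as a density matrix.
plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
rho_in = np.outer(plus, plus.conj())

rho_partial = apply_channel(dephasing_kraus(0.1), rho_in)   # coherence reduced
rho_full = apply_channel(dephasing_kraus(0.5), rho_in)      # coherence erased

print("off-diagonal term, p = 0.1:", rho_partial[0, 1].real)
print("off-diagonal term, p = 0.5:", rho_full[0, 1].real)
```

The diagonal entries (the classical populations) are untouched, while the off-diagonal entries (the coherences) shrink by a factor $1 - 2p$. At $p = 0.5$ the output is diagonal, which is the sense in which a channel without coherences behaves classically.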
A.6) Channel Capacity: Capacity is an intrinsic property of communication channels, be they classical or quantum, which measures the maximum rate at which information can be reliably transferred between Alice and Bob. The capacity establishes the ultimate boundary between communication rates that are achievable in principle and those that are not. However, when quantum effects are involved, there does not exist a single notion of capacity to evaluate the performance of a quantum channel. Rather, there exist multiple, nonequivalent definitions of capacity [45,46], as introduced in Section II-B and described in detail in Sections II-D and II-E.

B. From Classical Capacity to Quantum Capacities
When a communication channel is used to communicate messages between two parties, Alice and Bob, it is fundamental to assess the channel capacity, namely, the maximal amount of information Alice and Bob could reliably transfer by choosing appropriate encoding and decoding operations⁶.
What is meant by reliable is that there is a vanishing probability that the message sent by Alice arrives at Bob with any alteration [47]. The condition of vanishing error probability is generally imposed in the asymptotic limit where infinitely long codes are allowed. In this setup, an explicit closed-form expression for the capacity of classical channels exists, which depends on the noise model given through the conditional probability $p(y|x)$ characterizing the channel, where $x$ and $y$ denote the input and the output messages, respectively. Accordingly, the classical capacity is expressed as [47]:
$$C = \max_{p(x)} I(X:Y), \tag{2}$$
where the maximization is over all probability distributions on $x$, and where $I(X:Y)$, defined in (111) reported in Appendix D, denotes the mutual information between the input and output random variables $X$ and $Y$.
[Figure] The capacity of a channel measures the maximum rate at which information can be reliably transferred between communication parties through such a channel. A classical channel can be used to send classical information only and, therefore, it is fully characterized by its classical capacity. A quantum channel can transmit either classical or quantum information, and the corresponding rates are bounded by its classical and quantum capacities, respectively.
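The maximization in (2) can be carried out numerically for small alphabets. The following is a minimal sketch, assuming NumPy; the binary symmetric channel, its flip probability of 0.11, the grid search and the function names are illustrative assumptions, not part of the original text. The result is compared with the well-known closed form $1 - H_2(p)$ for this channel.

```python
import numpy as np

def mutual_information(p_x, channel):
    """I(X:Y) in bits, where channel[x, y] = p(y|x)."""
    p_x = np.asarray(p_x, dtype=float)
    joint = p_x[:, None] * channel                 # p(x, y)
    p_y = joint.sum(axis=0)                        # output marginal
    product = p_x[:, None] * p_y[None, :]          # p(x) p(y)
    mask = joint > 0                               # skip zero-probability terms
    return float(np.sum(joint[mask] * np.log2(joint[mask] / product[mask])))

def capacity_binary_input(channel, grid=1001):
    """Brute-force maximization of I(X:Y) over binary input distributions."""
    qs = np.linspace(0.0, 1.0, grid)
    values = [mutual_information([q, 1.0 - q], channel) for q in qs]
    best = int(np.argmax(values))
    return values[best], qs[best]

flip = 0.11                                        # illustrative crossover probability
bsc = np.array([[1 - flip, flip],
                [flip, 1 - flip]])
capacity, best_q = capacity_binary_input(bsc)

h2 = -flip * np.log2(flip) - (1 - flip) * np.log2(1 - flip)
print(f"numerical capacity: {capacity:.6f} bits/use at p(x=0) = {best_q:.3f}")
print(f"closed form 1 - H2(p): {1 - h2:.6f}")
```

A grid search is used only for transparency; for larger alphabets an iterative method such as the Blahut-Arimoto algorithm would be the usual choice.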
Surprisingly, and contrary to the classical case, extending this framework to quantum channels leads to the introduction of different capacities, depending on the context in which the quantum channel is used, i.e., on whether Alice and Bob are exchanging classical, private or quantum information [45,46], as shown in Figure 3.
In the following, we will restrict our attention to: i) the classical capacity $C(\cdot)$ over quantum channels, and ii) the quantum capacity $Q(\cdot)$ over quantum channels. A general scheme for the classical/quantum capacity is shown in Figure 4. Specifically, the former capacity $C(\cdot)$ deals with the transmission of classical information through a quantum information carrier, by assuming the presence of a proper classical-to-quantum encoder $\mathcal{E}$ and decoder $\mathcal{D}$, whereas the latter capacity $Q(\cdot)$ requires the availability of a quantum-to-quantum encoder/decoder for allowing the transmission of quantum information. Furthermore, for each of the mentioned capacities, we are going to distinguish between the one-shot capacities $\chi(\cdot)$ and $I_c(\cdot)$ and the (regularized) capacities $C(\cdot)$ and $Q(\cdot)$. Specifically, the one-shot capacity restricts the encoder $\mathcal{E}$ to generate states that are separable⁷ over multiple uses of the channel, whereas the (regularized) capacity is achieved by relaxing this constraint and hence allowing the encoder to generate entangled states. Clearly, $\chi(\mathcal{N}) \le C(\mathcal{N})$ and $I_c(\mathcal{N}) \le Q(\mathcal{N})$ for any quantum channel $\mathcal{N}$ [46].
In what follows, we give in Section II-C the operational definitions of the quantum capacities used in Section IV, without making reference to the explicit structure of the channels. Afterwards, in Sections II-D and II-E, we review the important quantum coding theorems for memoryless channels, which express the capacities in terms of explicit entropic quantities.

C. Operational Capacity Definition for a Quantum Channel
The classical/quantum capacity of a quantum channel⁸ $\mathcal{N}$ is the maximum achievable rate at which information encoded in quantum carriers can be transferred reliably from Alice to Bob. As in classical Shannon theory, the rate is measured by the ratio $k/n$, where $k$ is the number of bits/qubits of information exchanged between the sender and the receiver, and $n$ is the number of uses of the communication channel.
Similarly to classical Shannon theory, the reliability condition requires that, in the asymptotic limit of channel uses (i.e., when $n \to \infty$), the fidelity⁹ $F$ between the channel input and output can be made arbitrarily close to one or, correspondingly when it comes to classical communications, that the error probability¹⁰ can be made arbitrarily close to zero.
Hence, the classical/quantum capacity of a quantum channel can be given in an operational way, depicted in Figure 4, as:
$$C(\mathcal{N}) \text{ or } Q(\mathcal{N}) = \sup \left\{ \frac{k}{n} \, : \, \lim_{n \to \infty} F\Big( |m\rangle, \; \mathcal{D}_{n \to k} \circ \mathcal{N}^{\otimes n} \circ \mathcal{E}_{k \to n}(|m\rangle) \Big) = 1 \right\}, \tag{3}$$
with the fidelity measuring the distinguishability between the input symbol $|m\rangle$ and the output symbol $\mathcal{D}_{n \to k} \circ \mathcal{N}^{\otimes n} \circ \mathcal{E}_{k \to n}(|m\rangle)$. Here $\mathcal{E}_{k \to n}$ and $\mathcal{D}_{n \to k}$ denote the encoder and the decoder, respectively, where the encoder maps the $k$-qubit/bit¹¹ message that Alice wants to share with Bob into an $n$-qubit codeword sent through the quantum channel, as described in Appendix E. Importantly, classical information can be encoded in the elements of an orthogonal basis of the Hilbert space, whereas quantum information must be encoded in the span of that orthogonal basis, owing to the genuine quantum coherence. Intuitively, when decoding information encoded in the Hilbert space, we can retrieve more classical information than quantum information. At the cost of oversimplifying, the rationale for this can be understood in terms of the no-cloning theorem: classical information can be copied, whereas quantum information cannot. Accordingly, for any given channel $\mathcal{N}$, the quantum capacity $Q(\mathcal{N})$ is upper bounded by the classical capacity $C(\mathcal{N})$ [51].
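The tension between the rate $k/n$ and the reliability requirement can be seen in the simplest classical code. The sketch below uses only the Python standard library; the repetition code, the majority-vote decoder and the flip probability of 0.1 are illustrative assumptions, not taken from the original text. One bit ($k = 1$) is sent over $n$ uses of a bit-flip channel.

```python
from math import comb

def majority_error(n, p):
    """Probability that majority voting fails for an n-fold repetition code
    (n odd) over a channel flipping each bit independently with probability p."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j)
               for j in range(n // 2 + 1, n + 1))

p = 0.1  # illustrative flip probability
table = [(n, 1 / n, majority_error(n, p)) for n in (1, 3, 5, 11)]

for n, rate, err in table:
    print(f"n = {n:2d}   rate k/n = {rate:.3f}   error probability = {err:.2e}")
```

The error probability vanishes as $n$ grows, but so does the rate. The capacity is the largest rate that can be sustained while the error still vanishes, which requires far better codes than repetition.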
Expressions for the capacities of quantum channels, in terms of entropic functions, have been provided by sophisticated coding theorems. While we refer the reader to [45,46,49,52,53,54,55,56,57] for a detailed review of the different notions of quantum channel capacity, for channels both with and without memory, and of entanglement-assisted capacities, we focus here on memoryless channels and their unassisted capacities.

D. Classical Capacity of Quantum Channels
The expression of the classical capacity of an arbitrary quantum channel $\mathcal{N}$ has been formalized by the Holevo-Schumacher-Westmoreland (HSW) coding theorem with reference to the one-shot capacity¹² $\chi(\mathcal{N})$ [58,59]. The expression resembles Shannon's formula given in (2) for the classical capacity of classical channels, as it can be expressed in terms of a maximization of the Holevo information¹³ $\chi(\{p_x, \rho_x\}, \mathcal{N})$ over the set of input ensembles $\{p_x, \rho_x\}$ encoding the classical messages. Formally, it is given by:
$$\chi(\mathcal{N}) = \max_{\{p_x, \rho_x\}} \chi(\{p_x, \rho_x\}, \mathcal{N}), \tag{4}$$
and the maximization can always be taken over pure input states, thus restricting the search space.
The operational meaning of the HSW theorem is that, given an ensemble $\{p_x, \rho_x\}$ and an integer $N$ satisfying $N \le 2^{n \chi(\{p_x, \rho_x\}, \mathcal{N})}$, one can choose $N$ $n$-qubit codewords $\rho_1, \rho_2, \ldots, \rho_N$ in separable product form $\rho_i = \rho_{i1} \otimes \cdots \otimes \rho_{in}$ and an associated decoding measurement setup, allowing Bob to discriminate between the $N$ output states $\mathcal{N}^{\otimes n}(\rho_i) = \mathcal{N}(\rho_{i1}) \otimes \cdots \otimes \mathcal{N}(\rho_{in})$ arbitrarily well in the asymptotic limit of $n$. The positive operator-valued measure (POVM)¹⁴ assigned for the measurement setup is allowed to be an entangling measurement that operates collectively on the $n$-qubit output of each codeword.
As mentioned in Section II-B, if we lift the restriction that the encoder $\mathcal{E}$ map messages only to product states, as in Figure 5a, and rather allow it to produce entangled codewords, as in Figure 5b, we obtain the classical capacity $C(\mathcal{N})$ of the quantum channel $\mathcal{N}$.
In the HSW coding theorem, this is achieved by adopting a block coding strategy, which, for any $n > 1$, allows Alice to use $n$ copies of the channel as a single extended channel $\mathcal{N}^{\otimes n}$ with associated Holevo information $\chi(\mathcal{N}^{\otimes n})$, where the maximum is taken over all input ensembles, including entangled states¹⁵, for the $n$ elementary channels. As a result, the capacity $C(\mathcal{N})$ of $\mathcal{N}$ can be obtained by taking the limit $n \to \infty$ of the associated rate $\chi(\mathcal{N}^{\otimes n})/n$. This is known as the regularization procedure of the capacity, and it allows the capacity $C(\mathcal{N})$ to be written as:
$$C(\mathcal{N}) = \lim_{n \to \infty} \frac{\chi(\mathcal{N}^{\otimes n})}{n}. \tag{5}$$
As it appears, the capacity $C(\mathcal{N})$ is not easily computed in general [58], as it requires a maximization over an unbounded number of uses of the channel. Indeed, a single-letter formula for the capacity is known only for a few types of quantum channels, e.g., the depolarizing channel [60].
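For a fixed ensemble, the Holevo information appearing in (4) is straightforward to evaluate. The sketch below assumes NumPy and the standard expression $\chi(\{p_x, \rho_x\}, \mathcal{N}) = S\big(\sum_x p_x \mathcal{N}(\rho_x)\big) - \sum_x p_x S\big(\mathcal{N}(\rho_x)\big)$, with $S$ the von Neumann entropy; this expression is stated here as an assumption, since the section defers its definition to a footnote. The depolarizing parametrization $\rho \mapsto (1-p)\rho + p\,I/2$, the value $p = 0.2$ and the function names are likewise illustrative.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho), computed from the eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]                 # 0 log 0 is taken as 0
    return float(-np.sum(evals * np.log2(evals)))

def depolarizing(rho, p):
    """Qubit depolarizing channel (assumed parametrization)."""
    return (1 - p) * rho + p * np.trace(rho) * np.eye(2) / 2

def holevo_information(probs, states, channel):
    """chi = S(average output) - average of S(individual outputs)."""
    outputs = [channel(rho) for rho in states]
    average = sum(px * out for px, out in zip(probs, outputs))
    return von_neumann_entropy(average) - sum(
        px * von_neumann_entropy(out) for px, out in zip(probs, outputs))

rho0 = np.array([[1, 0], [0, 0]], dtype=complex)   # |0><0|
rho1 = np.array([[0, 0], [0, 1]], dtype=complex)   # |1><1|
p = 0.2
chi = holevo_information([0.5, 0.5], [rho0, rho1],
                         lambda rho: depolarizing(rho, p))

h2 = lambda q: -q * np.log2(q) - (1 - q) * np.log2(1 - q)
print(f"Holevo information of the ensemble: {chi:.6f}")
print(f"1 - H2(p/2):                        {1 - h2(p / 2):.6f}")
```

For this ensemble the value equals $1 - H_2(p/2)$. Evaluating (5), by contrast, would require repeating the optimization over ensembles on $\mathcal{N}^{\otimes n}$ for growing $n$, which is what makes the regularized capacity hard in general.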

E. Quantum Capacity of Quantum Channels
Similarly to the HSW theorem, the quantum capacity theorem, widely known as the LSD theorem [49,51], expresses the quantum capacity $Q(\mathcal{N})$ in terms of a regularization of the one-shot capacity¹⁶ $I_c(\mathcal{N})$. The latter quantity expresses the maximal achievable rate through the quantum channel when the quantum-to-quantum encoder is constrained to generate separable codewords only.
Formally, the one-shot quantum capacity $I_c(\mathcal{N})$ is expressed in terms of the coherent information¹⁷ $I_c(\rho, \mathcal{N})$ of channel $\mathcal{N}$ with respect to the arbitrary state $\rho$ as:
$$I_c(\mathcal{N}) = \max_{\rho} I_c(\rho, \mathcal{N}), \tag{6}$$
where the maximization is taken over all possible input quantum states.
Figure 5(a): A scheme for the one-shot capacity $\chi(\mathcal{N}^{\otimes n})$ of the quantum channel $\mathcal{N}$ through $n$ uses of the channel $\mathcal{N}$. A set of classical messages in an alphabet $M$ is encoded by a classical-to-quantum encoder $\mathcal{E}_{C-Q}$ constrained to separable codewords, namely, $\rho_i = \rho_{i1} \otimes \cdots \otimes \rho_{in}$ for the $i$-th codeword. After transmission, a quantum-to-classical decoder $\mathcal{D}_{Q-C}$ is applied to retrieve the classical message. The decoder is a measurement given by the optimal POVM, which is allowed to act collectively on the joint output state in order to obtain a set of classical messages.
Figure 5(b): A scheme for the classical capacity $C(\mathcal{N})$ of the quantum channel $\mathcal{N}$ through $n$ uses of the channel $\mathcal{N}$. The encoder $\mathcal{E}_{C-Q}$ is not restricted to separable codewords; rather, it is allowed to encode the classical information into entangled input states $\rho_i \neq \rho_{i1} \otimes \cdots \otimes \rho_{in}$. Similarly to the scheme of the one-shot capacity, the decoder is allowed to perform entangling measurements.
As already mentioned, the one-shot capacity $I_c(\mathcal{N})$ does not fully characterize the quantum capacity $Q(\mathcal{N})$, which is the maximum achievable rate for which the fidelity of the transmitted state is arbitrarily close to one, i) over asymptotically many uses of the channel $\mathcal{N}$, and ii) with the encoder allowed to generate entangled codewords. As for the classical capacity, when a block coding strategy is used, the quantum capacity can be expressed as [61,62,63]:
$$Q(\mathcal{N}) = \lim_{n \to \infty} \frac{I_c(\mathcal{N}^{\otimes n})}{n}. \tag{7}$$
Of course, the quantum capacity $Q(\mathcal{N})$ is intractable in general, because (7) involves maximizing the coherent information over an unbounded number of channel uses. In fact, entanglement across channel uses can even increase the coherent information from zero to non-zero. One might think that a finite number of channel uses would be sufficient to calculate the capacity, i.e., that a cut-off could be imposed on the number of uses of the channel. This turns out to be completely wrong: it has been shown that, whatever value of $n$ we fix, we can always find a channel with vanishing coherent information $I_c(\mathcal{N}^{\otimes n})$ whose quantum capacity $Q(\mathcal{N})$ is nonetheless non-vanishing [64].
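The coherent information entering (6) can also be evaluated numerically for a given input state. The sketch below assumes NumPy and the standard expression $I_c(\rho, \mathcal{N}) = S(\mathcal{N}(\rho)) - S\big((\mathrm{id} \otimes \mathcal{N})(|\phi_\rho\rangle\langle\phi_\rho|)\big)$, where $|\phi_\rho\rangle$ is a purification of $\rho$; this expression is an assumption here, since the section defers the definition to a footnote. The dephasing channel, its Kraus operators and all function names are illustrative.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def coherent_information(rho, kraus_ops):
    """I_c(rho, N) = S(N(rho)) - S((id x N)(purification of rho))."""
    d = rho.shape[0]
    evals, evecs = np.linalg.eigh(rho)
    # Purification sum_i sqrt(l_i) |i>_R |e_i>_A, reference system R listed first.
    phi = sum(np.sqrt(max(l, 0.0)) * np.kron(np.eye(d)[i], evecs[:, i])
              for i, l in enumerate(evals))
    phi_proj = np.outer(phi, phi.conj())
    extended = [np.kron(np.eye(d), K) for K in kraus_ops]   # channel acts on A only
    joint_out = sum(E @ phi_proj @ E.conj().T for E in extended)
    channel_out = sum(K @ rho @ K.conj().T for K in kraus_ops)
    return entropy(channel_out) - entropy(joint_out)

def dephasing_kraus(p):
    """Dephasing channel: Z applied with probability p (illustrative example)."""
    Z = np.diag([1.0, -1.0]).astype(complex)
    return [np.sqrt(1 - p) * np.eye(2, dtype=complex), np.sqrt(p) * Z]

rho_mixed = np.eye(2, dtype=complex) / 2          # maximally mixed input
ic_low_noise = coherent_information(rho_mixed, dephasing_kraus(0.1))
ic_full_noise = coherent_information(rho_mixed, dephasing_kraus(0.5))

print(f"I_c at p = 0.1: {ic_low_noise:.6f}")
print(f"I_c at p = 0.5: {ic_full_noise:.6f}")
```

For the maximally mixed input this evaluates to $1 - H_2(p)$, which drops to zero at $p = 0.5$. The example only evaluates $I_c$ for one input on a single channel use; the difficulty stressed above lies in the maximization over inputs on $\mathcal{N}^{\otimes n}$ for unbounded $n$.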

F. Bibliographic Notes
One of the earliest uses of quantum information is classical communications over quantum channels. This research was initiated by the early work of Holevo [65], in which the Holevo bound on classical capacity was established. Later on, the Holevo information of a channel was established as a lower bound on the classical capacity independently by Schumacher and Westmoreland [58], and Holevo [59]. Classical communication in the one-shot setting has been studied by a number of authors, including Hayashi [66,67], Renes and Renner [68], Wang and Renner [69], Datta et al. [70], Matthews and Wehner [71], and Wilde [72].
The quest for determining a quantum capacity in Shannon's sense was raised by Shor [73]. Different notions of quantum communications have been established since then. The one adopted in this tutorial is based on entanglement transmission, which was defined by Schumacher [74]. The notion of subspace transmission was proposed by Barnum et al. [75]. Devetak [62] gave the definition of entanglement generation. Kretschmann and Werner [76] showed that the capacities derived from these variations are all equal. The coherent information of asymptotic uses of a quantum channel was derived as an upper bound on the quantum capacity by Schumacher [74], Barnum et al. [77], and Schumacher and Nielsen [78]. The coherent information as a lower bound on the quantum capacity was established by Lloyd [61], Shor [63], and Devetak [62]. Another proof for the achievability of the coherent capacity was provided by Hayden et al., using the decoupling lemma [79], an approach initiated by Schumacher and Westmoreland [80]. The one-shot setting of quantum capacity was treated in many papers, including Buscemi and Datta [81], Datta and Hsieh [82], Wang et al. [83], and Kianvash et al. [84].

III. QUANTUM MARVELS
In this section we present the three dazzling phenomena of superadditivity, superactivation and causal activation. An easy-access guide to the literature related to these phenomena, with the prominent results arranged as a timeline of milestones, is provided in Figure 6.

A. Superadditivity
As mentioned above, entanglement is no longer considered only as a foundational concept that breaks the operational causal explanations of correlations formulated in terms of Bell inequalities [41]. Rather, it has come to be considered as a tool with wider applications in different areas of communication engineering, and researchers continue to dig for other surprises of quantum phenomena within the field.
Astoundingly, it was found that, contrary to classical communications$^{18}$, when a quantum channel is used independently multiple times, its performance in terms of coherent information$^{19}$ [45,46,101] can be non-additive in the number of uses.

Figure 6: Timeline for milestones on superadditivity, superactivation and causal activation of quantum channels.
In other words, in classical communication scenarios, if a channel is able to transmit one bit of information per use, then the amount of information that can be transmitted with $n$ uses is no more than $n$ bits. Formally, the mutual information between the output $Y^n$ and the input $X^n$ random variables over $n$ uses of a classical channel $\{p_i(y|x)\}_{i=1}^{n}$ is always bounded by $n$ times the single-letter capacity $C$ of the channel:

$$I(X^n; Y^n) \leq n\, C.$$

That is, the use of correlated codewords, jointly sampled, does not provide any communication benefit with respect to the use of uncorrelated codewords, sampled from a product distribution.
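The classical bound can be checked numerically on a small case. The sketch below is only an illustration under assumptions that are not in the text: it takes a binary symmetric channel with flip probability 0.1 (a hypothetical choice), uses it twice, and compares the mutual information obtained with a product input and with a perfectly correlated input against twice the single-use capacity.

```python
import itertools
import math


def binary_entropy(p):
    """Binary entropy h2(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_two_uses(y, x, flip):
    """p(y|x) for two independent uses of a binary symmetric channel."""
    prob = 1.0
    for y_k, x_k in zip(y, x):
        prob *= flip if y_k != x_k else 1 - flip
    return prob


def mutual_information(p_x, flip):
    """I(X^2; Y^2) in bits for an input distribution p_x over bit pairs."""
    pairs = list(itertools.product((0, 1), repeat=2))
    p_y = {y: sum(p_x[x] * bsc_two_uses(y, x, flip) for x in pairs) for y in pairs}
    total = 0.0
    for x in pairs:
        for y in pairs:
            joint = p_x[x] * bsc_two_uses(y, x, flip)
            if joint > 0:
                total += joint * math.log2(joint / (p_x[x] * p_y[y]))
    return total


flip = 0.1  # illustrative flip probability (assumption)
capacity = 1 - binary_entropy(flip)  # single-use capacity of the BSC

# Uncorrelated (product) input: uniform over all four bit pairs.
product_input = {x: 0.25 for x in itertools.product((0, 1), repeat=2)}
# Perfectly correlated input: the same uniform bit sent twice.
correlated_input = {(0, 0): 0.5, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 0.5}

i_product = mutual_information(product_input, flip)
i_correlated = mutual_information(correlated_input, flip)
```

In this example the product input saturates the bound, while the correlated input stays strictly below it, which is the behaviour the inequality above predicts for classical channels.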
In contrast, in quantum communication scenarios, a quantum channel that can transmit a certain amount of information (classical or quantum) with one use can, when used $n$ times, send more than $n$ times that amount of information. This runs directly against the classical additive logic of $2 = 1 + 1$. Indeed, in the quantum domain, superadditivity can occur$^{20}$. This unconventional phenomenon requires the use of entanglement to encode messages, which in turn can be either classical or quantum. This is known in the literature as the superadditivity of quantum channel capacities, and it is depicted in Figure 7. The figure illustrates that when Alice and Bob use multiple instances$^{21}$ of a quantum channel $\mathcal{N}$ to communicate messages encoded in separable input states, the coherent information of the two channels together, $I_c(\mathcal{N}^{\otimes 2})$, is equal to the sum of the two individual coherent information terms, $I_c(\mathcal{N}) + I_c(\mathcal{N})$. This is trivial in classical communications$^{22}$. On the contrary, when Alice and Bob use the channel in the same way as before, but encoding messages in entangled states, the overall coherent information $I_c(\mathcal{N}^{\otimes 2})$ exceeds the sum of the coherent information of the individual channels, in the form:

$$I_c(\mathcal{N}^{\otimes 2}) > I_c(\mathcal{N}) + I_c(\mathcal{N}).$$

Indeed, a similar behaviour has been observed for the Holevo information $\chi$. This phenomenon shows how entanglement can be considered as a key factor for unravelling the unconventional potential of quantum theory when it comes to quantum communication. Equally, it highlights that this potential is not limited to quantum messages, given that quantum communications can boost the classical information transmission rates as well, as shown in Section IV-A.

B. Superactivation
More surprisingly, our rather simplistic understanding of nature is broken by quantum logic when it comes to the phenomenon of superactivation [10]. This occurs when two different quantum channels that cannot transmit any amount of information separately, i.e., zero-capacity channels [103], can transmit information when properly used together. In classical information logic the relation $2 \cdot 0 = 0 + 0$ holds, whereas this is not the case when it comes to quantum information, where the relation$^{23}$ $0 + 0 > 0$ can hold. The superactivation phenomenon, as we discuss in more detail in Section V, relies on entanglement [10]$^{24}$. This is depicted in Figure 8. This scheme shows that when the two zero-capacity channels $\mathcal{N}$ and $\mathcal{M}$, with no ability to transfer quantum information, are used on separable inputs encoding a quantum message, the coherent information of the two channels together is the sum of the two individual coherent information terms. Hence, the overall channel $\mathcal{N} \otimes \mathcal{M}$ does not allow transmission of any quantum information. On the other hand, if the quantum message is wisely encoded in an entangled state given as joint input to the channels, the overall channel $\mathcal{N} \otimes \mathcal{M}$ gains potential for the transmission of quantum information. Accordingly, the overall coherent information $I_c(\mathcal{N} \otimes \mathcal{M})$ satisfies the following inequality:

$$I_c(\mathcal{N} \otimes \mathcal{M}) > I_c(\mathcal{N}) + I_c(\mathcal{M}) = 0.$$

We note that, as for superadditivity, entanglement plays a fundamental role in enabling unparalleled phenomena in quantum communications. We further note that, in contrast to superadditivity, no superactivation phenomenon is known to exist for quantum channels conveying classical information [106], as discussed in Section V. This shows that quantum communications represent a heterogeneous communication paradigm, where the communication potential of a channel depends on the information nature of the message.

C. Causal activation
Although the quantum formalism was conceived more than a century ago [101], its surprises are still emerging to this day, and there is much more to be discovered.
Recently, quantum information theorists investigating causality in the quantum realm have discovered that quantum mechanics allows for causal order to be indefinite [18,95]. In simpler terms, causality between events (channels, from a communications engineering perspective) might not be fixed, as shown in Figure 9. If so, two communication channels, say $\mathcal{N}$ and $\mathcal{M}$, instead of affecting the information carrier in a definite causal order, i.e., either $\mathcal{M} \circ \mathcal{N}$ or $\mathcal{N} \circ \mathcal{M}$, act on the carrier in a genuinely quantum superposition of causal orders. Hence, the information carrier evolves through a quantum trajectory [107]. One example of quantum trajectories is the quantum switch [96,108], a supermap that acts on a set of channels and places them in a coherent superposition of different orders, which is a genuinely quantum placement setup.
It has been verified both theoretically [14,22,109] and experimentally [100] that the quantum switch can be used for communications, outperforming even known quantum protocols.$^{25}$ Indeed, it has been shown that there are zero-capacity channels that cannot transfer any information in the usual setups, i.e., parallel or sequential setups where the order of channels is well defined. But, when used in a quantum superposition of causal orders, these channels transmit non-vanishing information (either classical or quantum, depending on the setup). This phenomenon, also termed causal activation in the literature [111], draws its advantage from a genuinely quantum coherence between causal orders. Indeed, causal activation should be regarded as a new way of placing communication channels [15], with no similarity to classical placements such as parallel or sequential ones. In fact, as we discuss in more detail in Section VI, whereas superadditivity and superactivation exploit quantum channels combined in a classical way, causal activation exploits a new degree of freedom, namely, the quantum placement of quantum channels.

IV. SUPERADDITIVITY OF QUANTUM CHANNEL CAPACITIES
Here we detail one of the quantum marvel phenomena introduced in Section III, namely, the superadditivity.
The notion of additivity is very important, as many questions in quantum information theory reduce to the additivity properties of some key functions [93]. In this section, we discuss the additivity properties of the Holevo information and the coherent information, which are the essential elements for characterizing the capacities of quantum channels.

A. Superadditivity of Holevo information
Originally, the Holevo information was believed to be additive for all quantum channels [93], that is:

$$\chi(\mathcal{N} \otimes \mathcal{M}) = \chi(\mathcal{N}) + \chi(\mathcal{M}).$$

This would imply that the Holevo information would be a good characterization of the classical capacity in the general case, i.e., $\chi(\mathcal{N}) = C(\mathcal{N})$. This conjecture, known as the additivity conjecture, was proved to hold for some classes of quantum channels, e.g., entanglement-breaking channels [89] or the depolarizing channel [91]. Surprisingly, Hastings found the existence of a counterexample to the additivity conjecture [6], demonstrating that it does not hold in the general case. He showed that, when entangled input states are used, the Holevo information is not only weakly superadditive but exhibits a strong superadditivity property. The counterexample relies on the use of two random channels $\mathcal{N}$ and $\bar{\mathcal{N}}$:

$$\mathcal{N}(\rho) = \sum_{i=1}^{D} p_i\, U_i \rho\, U_i^{\dagger}, \qquad \bar{\mathcal{N}}(\rho) = \sum_{i=1}^{D} p_i\, \bar{U}_i \rho\, \bar{U}_i^{\dagger}, \tag{12}$$

which are complex conjugate to each other. Specifically, the channels have unitary Kraus operators $\{U_i\}_{i \in \{1,\ldots,D\}}$ and their complex conjugates $\{\bar{U}_i\}_{i \in \{1,\ldots,D\}}$. Moreover, each unitary $U_i$ is randomly sampled from a certain given random distribution. Finally, the coefficients $p_i$ in (12) are chosen randomly from another particular distribution, in such a way that the minimum output entropy of the tensor product $\mathcal{N} \otimes \bar{\mathcal{N}}$ of the two channels is strictly smaller than twice the minimum entropy of one of the channels alone. Formally, this is given by the following inequality:

$$H_{\min}(\mathcal{N} \otimes \bar{\mathcal{N}}) < 2\, H_{\min}(\mathcal{N}),$$

under the use of entangled input states to the channel $\mathcal{N} \otimes \bar{\mathcal{N}}$. This inequality proved$^{26}$ the superadditivity phenomenon of the Holevo information, demonstrating that one of the most basic questions in quantum Shannon theory still remains wide open, i.e., there exists no general closed formula for the classical capacity. This in turn shows our lack of deep understanding of classical information transmission over quantum channels.
Furthermore, it also implies that if Alice encodes the classical message she wants to communicate to Bob in an entangled state, this can help in increasing the classical capacity over the quantum channel linking Alice and Bob.This phenomenon has no counterpart in classical communications, where the capacity -quantified by the mutual information between input and output of the channel -cannot be increased even if classical correlations between subsequent input bits are exploited.

B. Superadditivity of Coherent Information
It was shown that the quantum capacity of a quantum channel is well-behaved and completely understood for the class of degradable channels, over which the coherent information is additive [90], that is:

$$I_c(\mathcal{N}^{\otimes n}) = n\, I_c(\mathcal{N}).$$

Hence the regularization can be removed and the quantum capacity can be computed by a single optimization, similarly to classical channels. However, this is not true in general, as it was proven that for some channels, e.g., the depolarizing channel [87,112], the coherent information for multiple uses of the channel for some given value of $n$ is greater than $n$ times the coherent information provided by a single use of the channel. Hence, coherent information is superadditive [7,87,64]. To illustrate this concept, let us consider the depolarizing channel $\mathcal{N}_D$, which transmits its input faithfully with probability $1 - p$ and replaces it with probability $p$ by a maximally mixed state $\pi = I/2$, where $I$ is the $2 \times 2$ identity matrix. Formally, this channel is given by:

$$\mathcal{N}_D(\rho) = (1 - p)\,\rho + p\,\pi = (1 - q)\,\rho + \frac{q}{3}\left(X \rho X + Y \rho Y + Z \rho Z\right),$$

with $q = 3p/4$ as in [46]. To check whether the coherent information is superadditive for this channel, it suffices to calculate the coherent information for a single use of the channel, and then to find a code for multiple uses of the channel whose coherent information exceeds the single-use case.

$^{26}$ Indeed, the minimum entropy of a quantum channel $\mathcal{N}$ is defined as $H_{\min}(\mathcal{N}) = \min_{\rho} S(\mathcal{N}(\rho))$ [49], where $S(\rho)$ denotes the von Neumann entropy of the state $\rho$ as detailed in Appendix D. The minimum output entropy is related to the Holevo information for irreducibly covariant quantum channels by $\chi(\mathcal{N}) = S(\mathcal{N}(I/d)) - H_{\min}(\mathcal{N})$, where $I/d$ is the maximally mixed input state, with $d$ being the dimension of the input of the channel and $I$ being the identity operator. Hence, for irreducibly covariant quantum channels, the subadditivity of the minimum entropy implies the superadditivity of the Holevo information.

To this end, we note that the state maximizing the coherent information $I_c(\rho, \mathcal{N})$ in (6) for the depolarizing channel is the maximally mixed state $\pi$ [87,112]. Equivalently, this means that the coherent information of the depolarizing channel $\mathcal{N}_D$ can be obtained, by following the scheme depicted in Figure 10, over the output state $\Phi_{BA}$, where:

$$\Phi_{BA} = (1 - q)\,\Phi^{+} + \frac{q}{3}\left(\Phi^{-} + \Psi^{+} + \Psi^{-}\right),$$

where $\Phi^{\pm}$ and $\Psi^{\pm}$ denote the density matrices of the maximally entangled states. Accordingly, from (114) and (108) reported in Appendix D, the coherent capacity $I_c(\mathcal{N}_D)$ of the depolarizing channel for a single use is given by:

$$I_c(\mathcal{N}_D) = 1 - H(\vec{q}),$$

with $\vec{q} = (1 - q, \frac{q}{3}, \frac{q}{3}, \frac{q}{3})$ denoting the vector of probabilities and $H(\vec{q}) = -(1 - q)\log_2(1 - q) - q \log_2 q + q \log_2 3$ denoting the Shannon entropy, defined in (102), of the distribution $\vec{q}$.
A plot of the single-shot coherent capacity $I_c(\mathcal{N}_D)$ of the depolarizing channel is given in Figure 11, where we see that it vanishes from a critical value of $q \approx 0.1893$. It is known that, for antidegradability reasons, the quantum capacity $Q(\mathcal{N}_D)$ of the depolarizing channel vanishes when the channel parameter $q$ satisfies $q \geq 1/4 = 0.25$ [113] and, hence, in this range the coherent information fully characterizes the quantum capacity of the channel. Conversely, whenever $q < 1/4$, the coherent information does not fully characterize the quantum capacity of the channel. Consequently, the coherent capacity of multiple uses of the channel must be computed and, in the following, we focus on a specific scenario where three uses of the channel, instead of five as in [46,87,112,114], are sufficient to prove the superadditivity of the coherent information.
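The critical value quoted above can be reproduced from the closed-form entropy $H(\vec{q})$ alone. The following is a minimal sketch, assuming only the single-use expression $1 - H(\vec{q})$ given in the text; the function names and the bisection interval are illustrative choices, not part of the original treatment.

```python
import math


def shannon_entropy_depolarizing(q):
    """H(q_vec) in bits for q_vec = (1 - q, q/3, q/3, q/3)."""
    if q <= 0.0:
        return 0.0
    if q >= 1.0:
        return math.log2(3)
    return -(1 - q) * math.log2(1 - q) - q * math.log2(q / 3)


def single_use_coherent_information(q):
    """Single-use expression 1 - H(q_vec) for the depolarizing channel."""
    return 1.0 - shannon_entropy_depolarizing(q)


def critical_q(lo=0.0, hi=0.25, tol=1e-10):
    """Bisection for the zero of 1 - H(q_vec) on [lo, hi].

    The expression is positive at q = 0 and negative at q = 1/4,
    and it decreases monotonically in between.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if single_use_coherent_information(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


q_star = critical_q()  # close to the value 0.1893 quoted in the text
```

Beyond `q_star` the expression $1 - H(\vec{q})$ is negative, so the single-use coherent information, being a maximum over input states, is zero there.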
Specifically, we focus on a (3, 1) repetition code where each qubit is transmitted with three uses of the channel $\mathcal{N}_D$, and we will show that there exist a state $\rho$ and some parametric region of the depolarizing channel $\mathcal{N}_D$ so that:

$$I_c(\rho, \mathcal{D} \circ \mathcal{N}_D^{\otimes 3} \circ \mathcal{E}) > 3\, I_c(\mathcal{N}_D), \tag{18}$$

where $\mathcal{E}$ and $\mathcal{D}$ are the encoder and decoder, respectively.

Figure 11: Coherent information for a depolarizing channel $\mathcal{N}_D$ vs. channel parameter $q$, where: i) the solid blue line denotes the coherent information $I_c(\mathcal{N}_D)$ achievable with a single use of the depolarizing channel, and ii) the dotted green line denotes the coherent information achievable with three uses of the channel for the encoder output given in (19) with proper choice of the encoder and the decoder. The plot is an illustration of the results derived in [46,87].

Let us consider as output state of the encoder $\mathcal{E}(\rho)$ and, hence, as input to the equivalent channel, the following state:

$$|\phi\rangle_{A A_1 A_2 A_3} = \frac{1}{\sqrt{2}}\left(|0\rangle_A |000\rangle_{A_1 A_2 A_3} + |1\rangle_A |111\rangle_{A_1 A_2 A_3}\right), \tag{19}$$

where $A_i$ is the input to the $i$-th use of the channel and $A$ is the reference system as in Figure 10. Furthermore, let us assume that we post-process the resulting state at the receiver with the decoder $\mathcal{D}$ shown in Figure 12. Clearly, we have that:

$$I_c(\mathcal{N}_D^{\otimes 3}) \geq I_c(\rho, \mathcal{D} \circ \mathcal{N}_D^{\otimes 3} \circ \mathcal{E}),$$

as a result of the quantum data processing inequality [46]. Due to the convexity property of the coherent information on the receiver over classical variables [46,49], the coherent information resulting from the post-processing in Figure 12 is given by the weighted average over the outcomes $s_1$ and $s_2$ of the measurements over $B_1$ and $B_2$:

$$I_c(\rho, \mathcal{D} \circ \mathcal{N}_D^{\otimes 3} \circ \mathcal{E}) = \sum_{s_1 s_2} p(s_1 s_2)\, I_c(\rho, \mathcal{D}_{s_1 s_2} \circ \mathcal{N}_D^{\otimes 3} \circ \mathcal{E}), \tag{21}$$

where $\mathcal{D}_{s_1 s_2}$ embeds the dependence of the post-processing on $s_1$ and $s_2$, i.e., whether an $X$ gate will be applied on the third qubit. For each syndrome $s_1 s_2$, there are 16 Kraus operators that can give rise to it. As an example, with probability $q^3/27$ each of the three channels will act as an $X$ channel, and the decoder, by measuring the first and second qubits as 00, will leave the third qubit unchanged. By grouping all the possibilities that give rise to a specific syndrome, say 00, we can model the overall evolution of the third qubit as going through a Pauli channel such as:

$$\mathcal{N}_{s_1 s_2}(\rho) = q^{I}_{s_1 s_2}\, \rho + q^{X}_{s_1 s_2}\, X \rho X + q^{Y}_{s_1 s_2}\, Y \rho Y + q^{Z}_{s_1 s_2}\, Z \rho Z,$$

characterized by the vector of probabilities $\vec{q}_{s_1 s_2}$, with coherent information given by:

$$I_c(\mathcal{N}_{s_1 s_2}) = 1 - H(\vec{q}_{s_1 s_2}).$$

Remarkably, it has been shown that we can pick a noise parameter $q$ from the region of Figure 11 where the coherent information of the single use of the depolarizing channel is vanishing, while the coherent information in (21) is non-vanishing. This proves (18), demonstrating a superadditive effect of the coherent information for the depolarizing channel.
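The superadditive window can also be located numerically. The sketch below is not the decoder-based derivation of the text: as an assumption, it takes the GHZ-type input $(|0\rangle_A|000\rangle + |1\rangle_A|111\rangle)/\sqrt{2}$, the usual choice for a (3, 1) repetition code, and evaluates $S(B) - S(AB)$ directly after three uses of the depolarizing channel. It relies on the observation that every Pauli error pattern maps this input to one of 16 mutually orthogonal states, labelled by the bit-flip pattern and the parity of the phase flips, so both entropies reduce to Shannon entropies. All function names are illustrative.

```python
import math
from itertools import product


def entropy_bits(probabilities):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def single_use_coherent_information(q):
    """Single-use expression 1 - H(1 - q, q/3, q/3, q/3)."""
    return 1.0 - entropy_bits([1 - q, q / 3, q / 3, q / 3])


def repetition_coherent_information(q):
    """S(B) - S(AB) for the GHZ-type input sent through three uses."""
    t = 2 * q / 3      # probability that one use flips the bit (X or Y)
    d = 1 - 4 * q / 3  # weight(I) - weight(Z) on a use without bit flip
    flip_prob = {a: t ** sum(a) * (1 - t) ** (3 - sum(a))
                 for a in product((0, 1), repeat=3)}

    # Receiver alone: a classical mixture of bit strings, where each
    # string is reached from 000 or from 111 with equal weight.
    output_b = [0.5 * (flip_prob[x] + flip_prob[tuple(1 - b for b in x)])
                for x in flip_prob]

    # Receiver plus reference: 16 orthogonal states labelled by the flip
    # pattern and the parity of phase flips. The two parities differ in
    # probability only when no bit was flipped.
    joint_ab = []
    for a, p in flip_prob.items():
        interference = d ** 3 if sum(a) == 0 else 0.0
        joint_ab += [0.5 * (p + interference), 0.5 * (p - interference)]

    return entropy_bits(output_b) - entropy_bits(joint_ab)


q_test = 0.1896  # illustrative value just above the single-use threshold
single = single_use_coherent_information(q_test)
three_uses = repetition_coherent_information(q_test)
```

For a noise parameter slightly above the single-use threshold, `single` is negative while `three_uses` is positive, so the three-use input achieves a positive coherent information where the single-use coherent information vanishes.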
Furthermore, it has also been demonstrated (non-constructively, i.e., using random codes) that there exist channels that have vanishing coherent information for arbitrary $n$-codes but a non-vanishing capacity [64], which is an even stronger argument for the necessity of regularization for the quantum capacity. Indeed, on one hand, this means that the coherent information must be regularized over unbounded uses of the channel and, hence, cannot be used to compute the capacity in general. On the other hand, since the coherent information is additive for separable input states, additivity violation also implies that entanglement can protect information from noise in a way that is not possible classically [9,115].

C. Superadditivity of Classical and Quantum Capacities
Having discussed the superadditivity of the one-shot capacities -i.e., of the Holevo information and the coherent information -we discuss now the superadditivity of the regularized capacities C(N ) and Q(N ).
One might think that some form of superadditivity for the regularized capacities could be obtained by using multiple instances of the same channel, as schematized in Figure 5b. However, regularized capacities over asymptotic uses of the same channel are always additive, regardless of the classical or quantum nature of the message. In the case of the classical capacity, this translates formally into:

$$C(\mathcal{N}^{\otimes n}) = n\, C(\mathcal{N}),$$

regardless of whether the $n$ uses of the same channel happen simultaneously in parallel or sequentially with independent uses over time. Similarly, the quantum capacity $Q(\mathcal{N})$ is additive over multiple uses of the same channel:

$$Q(\mathcal{N}^{\otimes n}) = n\, Q(\mathcal{N}).$$

This additivity property can be easily seen from the regularization of the Holevo capacity given in (5) and from the regularization of the coherent capacity given in (7). Since the additivity is established for the use of the same channel in parallel or independently over time, it is important to understand whether this holds also when different channels are considered. The answer to this question allows one to understand how different noisy channels interact and enhance each other's capabilities.
Whether it is true that:

$$C(\mathcal{N} \otimes \mathcal{M}) = C(\mathcal{N}) + C(\mathcal{M})$$

is still an open problem for the classical capacity of quantum channels. It can easily be noted by simple coding arguments that the rate $C(\mathcal{N}) + C(\mathcal{M})$ is always achievable by feeding the optimal code for each channel independently. The question of the superadditivity of the classical capacity is whether there could be a code with entangled codewords that satisfies $C(\mathcal{N} \otimes \mathcal{M}) > C(\mathcal{N}) + C(\mathcal{M})$. We should note that the superadditivity of the Holevo information of two channels$^{27}$ $\mathcal{N}$ and $\mathcal{M}$ does not guarantee the superadditivity of the overall capacity of the two channels $\mathcal{N} \otimes \mathcal{M}$ when used together. In contrast, the situation for the quantum capacity is much better understood.
The quantum capacity can be superadditive over the use of two quantum channels $\mathcal{N}$ and $\mathcal{M}$ together [16]. This can be described formally by:

$$Q(\mathcal{N} \otimes \mathcal{M}) > Q(\mathcal{N}) + Q(\mathcal{M}).$$

Furthermore, as discussed in Section V, the quantum capacity can exhibit a superactivation phenomenon, which constitutes a form of superadditivity over different zero-capacity channels, in the sense that the quantum capacity can satisfy the following inequality:

$$Q(\mathcal{N} \otimes \mathcal{M}) > 0 \quad \text{for} \quad Q(\mathcal{N}) = Q(\mathcal{M}) = 0.$$

Superactivation is not possible for the classical capacity, for reasons that we clarify in Section V-D.

V. SUPERACTIVATION OF QUANTUM CHANNEL CAPACITIES
Here we detail the second marvel phenomenon introduced in Section III, namely, superactivation. Superactivation, as mentioned at the end of Section IV, is an unexpected, genuinely quantum phenomenon that occurs when two zero-capacity quantum channels are used to transmit quantum information. Unexpectedly, superactivation can only occur when the two cooperating quantum channels are from different families, neither of which can simulate$^{28}$ the other. In the next subsections, we discuss the different nonequivalent families of quantum channels known in the literature. Subsequently, we provide examples of the phenomenon of superactivation for quantum channels from these families.

Figure 13: Superactivation of the quantum capacity from the encoder perspective. In the joint-encoding panel, the same two channels are used for the same task, but the sender's encoder $\mathcal{E}$ now has simultaneous access to the inputs of all channels being used, allowing quantum information to get through the two channels, and the receiver's decoding $\mathcal{D}$ is also performed jointly, preserving coherence.

A. Classes of Zero-Capacity Channels
At least two classes of quantum channels are known to have zero capacities (whether additional classes of zero-capacity channels exist is still an open problem). The first class is the family of antidegradable channels. Channels of this family cannot transmit quantum messages due to the no-cloning theorem, which prohibits quantum information from being duplicated [94]. As discussed in Appendix C, antidegradable channels are self-complementary, in the sense that the environment of the channel can process its outcome to get an exact copy of the receiver's output. Thus, if such a channel had a positive quantum capacity, it would violate the no-cloning theorem. An example of channels from this family is the 50% two-qubit erasure channel, which faithfully transmits a two-qubit input state half of the time and outputs an erasure flag in the rest of the cases. This channel is given by [49,46]:

$$\mathcal{N}_E^{50\%}(\rho) = \frac{1}{2}\,\rho + \frac{1}{2}\,|e\rangle\langle e|, \tag{29}$$

where $|e\rangle$ stands for the erasure flag$^{29}$.

Another family which is known to have vanishing quantum capacity is the class of PPT channels. These are channels with a Choi state that has a positive partial transpose$^{30}$, hence a PPT state. It is known that PPT states are states from which no entanglement can be distilled, even asymptotically. The reason why PPT channels have zero capacity is that no entanglement can be recovered between the sender and the receiver, even with an unbounded number of uses of the channel [3,116]. A particular example of this family is the 4-dimensional Horodecki channel $\mathcal{N}_H$, whose Kraus operators are given in (30) in terms of $I, X, Y, Z$, the $2 \times 2$ generating matrices of the Pauli group.

$^{28}$ I.e., arbitrary combinations of channels of one family cannot result in a channel from the other family [10,49,46].

$^{29}$ Mathematically, this means that $\mathcal{N}_E^{50\%}: \mathcal{L}(\mathcal{H}_1 \otimes \mathcal{H}_2) \to \mathcal{L}\left((\mathcal{H}_1 \otimes \mathcal{H}_2) \oplus \mathrm{span}\{|e\rangle\}\right)$, where $\mathcal{H}_{1,2}$ are the Hilbert spaces of the first and second qubit, respectively. Hence, this channel has a four-level input and a five-level output, where the extra output level corresponds to the erasure flag.

B. Superactivation of Quantum Capacity
Superactivation is a strong superadditivity phenomenon that occurs when two channels with vanishing individual quantum capacities, $Q(\mathcal{N}) = Q(\mathcal{M}) = 0$, belonging to different classes of zero-capacity channels, are used together. Used jointly, these channels can gain a quantum capacity enabling them to communicate quantum information, in such a way that:

$$Q(\mathcal{N} \otimes \mathcal{M}) > 0.$$

As a result, we say that the quantum capacity has been activated [10]. The phenomenon of superactivation is schematized in Figure 13.
In this context, it has been shown that, when a quantum channel is used together with a classical channel to transmit quantum information, this configuration does not increase the quantum capacity [117]. This research area has been extended to symmetric side quantum channels [118], whose use together with an arbitrary channel $\mathcal{N}$ exhibits the following single-letter expression:

$$Q(\mathcal{N} \otimes \mathcal{N}_{SS}) = \sup_{\mathcal{A} \in S} I_c(\mathcal{N} \otimes \mathcal{A}), \tag{32}$$

where $\mathcal{N}_{SS}$ is the channel of unbounded dimension satisfying the optimization over the convex set $S$ of symmetric side channels. In particular, it satisfies the following relation [10]:

$$Q(\mathcal{N} \otimes \mathcal{N}_{SS}) \geq \frac{1}{2}\, P(\mathcal{N}),$$

with $P(\mathcal{N})$ denoting the private capacity$^{31}$ of channel $\mathcal{N}$.
Combined with the fact that the known Horodecki channels have a non-vanishing private capacity, i.e., $P(\mathcal{N}) > 0$, this key result demonstrates that the capacity of Horodecki channels together with symmetric channels is non-vanishing. Namely, there exists a zero-quantum-capacity symmetric channel that, when used with a zero-quantum-capacity Horodecki channel, leads to a positive capacity. However, this result involves symmetric channels $\mathcal{N}_{SS}$ with infinite-dimensional input, given the sup in (32). Hence, further bounds for symmetric side channels with finite-dimensional inputs are needed.
Accordingly, it has been shown that when Alice and Bob use a 4-dimensional Horodecki channel $\mathcal{N}_H$ given in (30) to communicate quantum messages together with a symmetric channel given by the 50% two-qubit erasure channel $\mathcal{N}_E^{50\%}$ given in (29), the startling effect of superactivation occurs [10]. When these two channels are combined, in fact, they satisfy [10]:

$$Q(\mathcal{N}_H \otimes \mathcal{N}_E^{50\%}) \geq I_c(\rho, \mathcal{N}_H \otimes \mathcal{N}_E^{50\%}) > 0.01,$$

where $I_c(\rho, \mathcal{N}_H \otimes \mathcal{N}_E^{50\%})$ is the coherent information of the channel $\mathcal{N}_H \otimes \mathcal{N}_E^{50\%}$ over a particular input state $\rho$ whose expression can be found in [10].
The joint channel $\mathcal{N}_H \otimes \mathcal{N}_E^{50\%}$ is neither antidegradable nor PPT, having quantum capacity greater than zero. Therefore, we might interpret the gained capacity $Q(\mathcal{N}_H \otimes \mathcal{N}_E^{50\%}) > 0.01$ as the symmetry of the erasure channel being somehow broken as an effect of the private information leaked through the Horodecki channel [12].

C. Non-Convexity of Quantum Capacity
Astoundingly, another form of superactivation for the previous channels has been revealed, in terms of the non-convexity property of the quantum capacity. A channel that is a flagged convex combination of the two zero-capacity channels can be constructed. Denoting it here by $\mathcal{N}_p$, it is given by:

$$\mathcal{N}_p = p\, \mathcal{N}_H \otimes |0\rangle\langle 0| + (1 - p)\, \mathcal{N}_E^{50\%} \otimes |1\rangle\langle 1|.$$

It is a flagged$^{32}$ convex combination, which can be switched between acting as $\mathcal{N}_H$ and $\mathcal{N}_E^{50\%}$ with the aid of an ancillary qubit degree of freedom.

$^{31}$ In a nutshell, the private capacity defines the rate at which the channel can be used to send classical data that is secure against an eavesdropper with access to the environment of the channel.
To better understand the capabilities of this channel for transmitting quantum information, one has to calculate its coherent capacity over multiple uses, as its one-shot coherent capacity clearly vanishes because $\mathcal{N}_H \otimes |0\rangle\langle 0|$ and $\mathcal{N}_E^{50\%} \otimes |1\rangle\langle 1|$ are both zero-capacity channels. Its two-shot coherent capacity is derived in [10]: under symmetry restrictions on the input state $\rho$, the two-shot coherent capacity is non-vanishing over a given region of the convexity parameter $p$ [10,12].
This new channel, contrary to its constituent channels, has a non-vanishing capacity, exhibiting an extreme form of superactivation. This confirms that the communication potential of a channel depends on the context in which it is used or on what other channels are available with it. This claim will be further supported by the phenomenon of causal activation.

D. Classical Capacity
As discussed in Section V-A, quantum channels can have zero capacity for different reasons. This allows one to categorize zero-capacity channels into different classes [88,121,122]. Hence, if we use two quantum channels of different classes independently, the entanglement and coherence that might be available in the input state allow the channels to interfere with each other. Consequently, each one can leak some amount of information that the other channel does not allow. This interference between the two channels gives an equivalent channel that belongs to neither class, resulting in a noise reduction that beats the vanishing capacity of the individual quantum channels.
This cannot happen when quantum channels are used to transmit classical information, because only a channel, whose output is the same regardless of the input message, can have zero classical capacity [49].Hence, there exists only a single class of channels with zero classical capacity, and it is not possible to exploit channels of different classes to superactivate their classical capacities [45,106].

VI. CAUSAL ACTIVATION OF QUANTUM CHANNEL CAPACITIES
In ordinary quantum Shannon theory, although the information carriers obey the laws of quantum mechanics, the treatment of their propagation remains classical. Indeed, the information carriers are transmitted through a well-defined trajectory, which is assumed a priori or can be chosen randomly, for example, by tossing a coin. Recent works proposed to generalise the framework of quantum Shannon theory [15,16,17] such that not only the information or the channels, but also the placement of the channels (i.e., the trajectories along which the carriers propagate) can be treated as quantum and subjected to the superposition principle.

$^{32}$ The flagged extension of quantum channels plays an essential role in finding tight bounds for quantum channel capacities that cannot be expressed as single-letter formulae. Particular examples are the depolarizing channel and the generalized amplitude damping channel, whose capacity bounds are still an open problem for particular ranges of their noise parameters. Interested readers are referred to the following recent results [84,119,120].

Figure 14: The quantum switch supermap, where two quantum channels $\mathcal{N}$ and $\mathcal{M}$ are placed in a genuinely quantum configuration given by a coherent superposition of causal orders between the two channels [16]. Within the figure, $\rho_c$ denotes the control system, part of the switch supermap, controlling the causal order between the two channels. Whenever the control qubit is initialized in a superposed state, the two channels are placed in a coherent superposition of the two different causal orders $\mathcal{M} \circ \mathcal{N}$ and $\mathcal{N} \circ \mathcal{M}$.
In this section, we will review the possible advantages following the extension of quantum Shannon theory to include quantum trajectories, which is considered as the second quantization of classical Shannon theory [15,16].

A. Quantum Switch
A key example of quantum trajectories, which has been proven to be useful for communication, is given by the quantum switch$^{33}$ [14,22,23,95,98,99,109,123,124], illustrated in Figure 14. Such a supermap, given two quantum channels $\mathcal{N}$ and $\mathcal{M}$, generates a new configuration in which the two channels are in a coherent superposition of two different causal orders, namely, $\mathcal{M} \circ \mathcal{N}$ and $\mathcal{N} \circ \mathcal{M}$.
Formally, the quantum switch maps the two original channels $\mathcal{N}$ and $\mathcal{M}$ into a new quantum channel $\mathcal{S}_{\rho_c}(\mathcal{N}, \mathcal{M})(\cdot)$, whose output is given by:

$$\mathcal{S}_{\rho_c}(\mathcal{N}, \mathcal{M})(\rho) = \sum_{i,j} S_{ij} \, (\rho \otimes \rho_c) \, S_{ij}^{\dagger}, \tag{38}$$

where $\rho$ is the input state, $\rho_c$ is the state controlling the causal order between the two channels at hand, and $\{S_{ij}\}$ denotes the Kraus operators of the switch, given by:

$$S_{ij} = N_i M_j \otimes |0\rangle\langle 0|_c + M_j N_i \otimes |1\rangle\langle 1|_c, \tag{39}$$

with $\{N_i\}$ and $\{M_j\}$ denoting the Kraus operators of the respective channels. We should note that the structure of [...]

33 When the complete positivity of quantum combs or process tensors is restricted to non-signaling channels only, a wider class of supermaps emerges, which includes the quantum switch as a particular instance. The quantum switch supermap cannot be described by any form of a temporal process tensor or quantum comb, unless postselection on some degree of freedom of the environment is allowed [95,110].

This new resource has proven to provide advantages over the classical placement of quantum channels, violating the bottleneck inequality (115) [14,22,109,124]. The rationale for this astonishing violation is that the coherent control within the quantum switch allows the order in which the channels act on the information carrier to be entangled with a control degree of freedom. As a consequence, a constructive interference results from the coherent superposition of the causal orders of the channels, allowing for a reduction of the overall noise affecting the information carrier.
It is worth noting that the control system, whose state is fixed a priori, is crucial in the switch. Indeed, it seems to lock a considerable amount of the information present in the coherent superposition of the orders. Clearly, with no access to the measurement outcome of the control qubit at the receiver (hence, by "tracing" it out), we cannot recover that amount of information. Furthermore, given that the control qubit embeds a fixed and a priori determined quantum state, it cannot be exploited by the sender to encode information³⁴, i.e., it does not constitute a side channel [16].
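The construction of the switch from the Kraus operators of the two channels can be sketched numerically. The following is a minimal sketch, not the authors' implementation: it assumes the ordering target ⊗ control, assigns the order "$\mathcal{N}$ after $\mathcal{M}$" to the control state $|0\rangle$, and uses a bit-flip and a phase-flip channel with an arbitrary flip probability `q` purely as illustrative inputs (the helper names `switch_kraus`, `apply_switch` and `apply_channel` are hypothetical).

```python
import numpy as np

P0 = np.array([[1, 0], [0, 0]], dtype=complex)  # |0><0| on the control qubit
P1 = np.array([[0, 0], [0, 1]], dtype=complex)  # |1><1| on the control qubit

def switch_kraus(kraus_n, kraus_m):
    """Switch Kraus operators S_ij = N_i M_j (x) |0><0| + M_j N_i (x) |1><1|."""
    return [np.kron(n @ m, P0) + np.kron(m @ n, P1)
            for n in kraus_n for m in kraus_m]

def apply_channel(kraus, rho):
    """Apply a channel given by its Kraus operators."""
    return sum(k @ rho @ k.conj().T for k in kraus)

def apply_switch(kraus_n, kraus_m, rho, rho_c):
    """Apply the switched channel to the joint state rho (x) rho_c."""
    return apply_channel(switch_kraus(kraus_n, kraus_m), np.kron(rho, rho_c))

# Illustrative channels (an assumption of this sketch, not from the text).
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
q = 0.3
bit_flip = [np.sqrt(1 - q) * I2, np.sqrt(q) * X]
phase_flip = [np.sqrt(1 - q) * I2, np.sqrt(q) * Z]

# Completeness: the S_ij define a trace-preserving map on target (x) control.
completeness = sum(s.conj().T @ s for s in switch_kraus(bit_flip, phase_flip))

# With the control fixed to |0>, the switch reduces to a definite causal order.
rho = np.array([[0.5, 0.5], [0.5, 0.5]], dtype=complex)  # input |+><+|
out0 = apply_switch(bit_flip, phase_flip, rho, P0)
definite = np.kron(apply_channel(bit_flip, apply_channel(phase_flip, rho)), P0)

print(np.allclose(completeness, np.eye(4)), np.allclose(out0, definite))
```

A superposed control state, such as $|+\rangle\langle +|$, can be passed as `rho_c` to obtain the coherent superposition of the two orders.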

B. Causal Activation of Holevo Information
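The use of the quantum switch for the transfer of classical information over quantum channels has been shown to outperform the usual communication setups of quantum Shannon theory, namely, the sequential or parallel placement of channels in a causal order.

Specifically, when two completely depolarizing channels (each with vanishing Holevo information, which prohibits them from transmitting classical messages in whatever classical configuration they are used) are combined in the quantum switch, they can deliver a non-vanishing amount of classical information [14]. The completely depolarizing channel $\mathcal{N}_{CD}$ for a $d$-dimensional input $\rho$ is described by a mixture of $d^2$ mutually orthogonal unitaries³⁵ $\{U_i\}_{i=1}^{d^2}$ so that:

$$\mathcal{N}_{CD}(\rho) = \frac{1}{d^2} \sum_{i=1}^{d^2} U_i \rho U_i^{\dagger} = \mathrm{Tr}(\rho) \frac{I}{d}, \tag{40}$$

with the Kraus operators in (39) describing the quantum switch supermap given by:

$$S_{ij} = \frac{1}{d^2} \left( U_i U_j \otimes |0\rangle\langle 0|_c + U_j U_i \otimes |1\rangle\langle 1|_c \right). \tag{41}$$

When the controller is initialized in the state $\rho_c = |\psi_c\rangle\langle\psi_c|$, with $|\psi_c\rangle = \sqrt{p}\,|0\rangle + \sqrt{1-p}\,|1\rangle$, the output (38) of the quantum switch is given explicitly by:

$$\mathcal{S}_{\rho_c}(\mathcal{N}_{CD}, \mathcal{N}_{CD})(\rho) = \frac{I}{d} \otimes \left( p\,|0\rangle\langle 0|_c + (1-p)\,|1\rangle\langle 1|_c \right) + \frac{\rho}{d^2} \otimes \sqrt{p(1-p)} \left( |0\rangle\langle 1|_c + |1\rangle\langle 0|_c \right), \tag{42}$$

where $I$ is the $d \times d$ identity matrix. By accounting for (42) with $p = \frac{1}{2}$, the Holevo information achievable through the quantum switch $\mathcal{S}_{\rho_c}(\mathcal{N}, \mathcal{M})(\cdot)$ is given by:

$$\chi\left(\mathcal{S}_{\rho_c}(\mathcal{N}_{CD}, \mathcal{N}_{CD})\right) = \log d + S(\tilde{\rho}_c) - H_{\min}\left(\mathcal{S}_{\rho_c}(\mathcal{N}_{CD}, \mathcal{N}_{CD})\right), \tag{43}$$

where $S(\tilde{\rho}_c)$ is the von Neumann entropy of the reduced state of the control system $\tilde{\rho}_c = \frac{1}{2}\left( |0\rangle\langle 0| + |1\rangle\langle 1| \right) + \frac{1}{2d^2}\left( |0\rangle\langle 1| + |1\rangle\langle 0| \right)$, and $H_{\min}(\mathcal{S}_{\rho_c}(\mathcal{N}_{CD}, \mathcal{N}_{CD}))$ is the minimum output entropy of the effective channel $\mathcal{S}_{\rho_c}(\mathcal{N}_{CD}, \mathcal{N}_{CD})$, given by:

$$H_{\min}\left(\mathcal{S}_{\rho_c}(\mathcal{N}_{CD}, \mathcal{N}_{CD})\right) = -\left[ \frac{d+1}{2d^2} \log \frac{d+1}{2d^2} + \frac{d-1}{2d^2} \log \frac{d-1}{2d^2} + \frac{2(d-1)}{2d} \log \frac{1}{2d} \right]. \tag{44}$$

A plot of the Holevo information $\chi(\mathcal{S}_{\rho_c}(\mathcal{N}_{CD}, \mathcal{N}_{CD}))$ in (43), characterizing the capability of the quantum switch to transfer classical information, is given in Figure 15. It is clear from the plot that the completely depolarizing channel, which has vanishing classical capacity over arbitrarily many uses, gains a non-vanishing Holevo information³⁶ whenever two instances of the channel are used within the quantum switch. It is worthwhile to mention that the Holevo information represents just a lower bound on the regularized classical capacity achievable with the quantum switch, which is therefore non-vanishing as well. This result, although moderate in terms of capacity improvement as shown in the figure, is of crucial importance from a communication engineering perspective, since it violates one of the fundamental bounds for classical trajectories, namely the bottleneck inequality given in (115).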
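The Holevo information of two switched completely depolarizing channels can be evaluated numerically. The sketch below is a hedged illustration rather than the procedure used for Figure 15: it assumes $p = \frac{1}{2}$, the ordering target ⊗ control, entropies in bits, and that the minimum output entropy is attained on pure inputs (the function names `entropy`, `switched_cd_output` and `holevo_switch` are hypothetical).

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy in bits, ignoring numerically zero eigenvalues."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def switched_cd_output(rho, p=0.5):
    """Output of two switched completely depolarizing channels,
    ordered as target (x) control, for the control sqrt(p)|0> + sqrt(1-p)|1>."""
    d = rho.shape[0]
    diag = np.diag([p, 1 - p])
    off = np.sqrt(p * (1 - p)) * np.array([[0.0, 1.0], [1.0, 0.0]])
    return np.kron(np.eye(d) / d, diag) + np.kron(rho / d**2, off)

def holevo_switch(d):
    """log d + S(reduced control state) - (output entropy of a pure input)."""
    rho = np.zeros((d, d))
    rho[0, 0] = 1.0  # all pure inputs give the same output spectrum
    out = switched_cd_output(rho)
    ctrl = np.array([[0.5, 0.5 / d**2], [0.5 / d**2, 0.5]])
    return np.log2(d) + entropy(ctrl) - entropy(out)

for d in (2, 3, 4):
    print(d, round(holevo_switch(d), 4))
```

For qubits ($d = 2$) the sketch returns roughly 0.049 bits, and the value decreases as $d$ grows, in line with the moderate improvement mentioned above.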

C. Causal Activation of Quantum Capacity
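As for the classical capacity, there also exist quantum channels with vanishing quantum capacity that, when combined within the quantum switch, gain a non-vanishing quantum capacity [22].

An illustrative example is the entanglement breaking channel $\mathcal{N}_{EB}$ characterized by the Kraus operators $\{\frac{1}{\sqrt{2}}X, \frac{1}{\sqrt{2}}Y\}$, and whose output state is given by:

$$\mathcal{N}_{EB}(\rho) = \frac{1}{2} X \rho X + \frac{1}{2} Y \rho Y, \tag{45}$$

with $X$ and $Y$ denoting $2 \times 2$ Pauli matrices. This channel has vanishing quantum capacity $Q(\mathcal{N}_{EB}) = 0$, regardless of the adopted classical (serial or parallel) configuration, since it is anti-degradable, i.e., the output at the receiver can be obtained by post-processing the output of the environment, so that a non-zero quantum capacity would result in a violation of the no-cloning theorem, as mentioned in Section V-A.

However, the quantum switch activates its capacity to its maximum³⁷ whenever the control qubit places the channels in an equal superposition of orders, that is [22]:

$$Q\left(\mathcal{S}_{\rho_c}(\mathcal{N}_{EB}, \mathcal{N}_{EB})\right) = 1. \tag{46}$$

This astonishing result can be easily understood by considering the output of the quantum switch, given, for $\rho_c = |+\rangle\langle +|$, by:

$$\mathcal{S}_{\rho_c}(\mathcal{N}_{EB}, \mathcal{N}_{EB})(\rho) = \frac{1}{2}\, \rho \otimes |+\rangle\langle +|_c + \frac{1}{2}\, Z \rho Z \otimes |-\rangle\langle -|_c. \tag{47}$$

We can see that the outcome in (47) is equivalent to a convex combination of two flagged channels $\mathcal{I}$ and $\mathcal{Z}$, namely the identity channel and the unitary channel $\rho \mapsto Z \rho Z$, and, hence, the coherent information of the equivalent channel is simply the convex sum of the coherent information of the two flagged channels:

$$I_c\left(\mathcal{S}_{\rho_c}(\mathcal{N}_{EB}, \mathcal{N}_{EB})\right) = \frac{1}{2} I_c(\mathcal{I}) + \frac{1}{2} I_c(\mathcal{Z}) = 1. \tag{48}$$

This result is astonishing, since it not only violates the bottleneck inequality given in (115), as discussed in the previous subsection, but it also activates the capacity to its maximum value, starting from a zero-capacity channel.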
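The flagged structure of the switched entanglement breaking channels can be checked numerically. The following sketch makes several assumptions that are not stated in the text: the Kraus operators are normalized as $X/\sqrt{2}$ and $Y/\sqrt{2}$, the ordering is target ⊗ control, the control is prepared in $|+\rangle$, and the input state `psi` is an arbitrary illustrative choice. It also shows one possible decoding, in which the receiver measures the control in the $\{|+\rangle, |-\rangle\}$ basis and applies $Z$ on outcome "−".

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
P0 = np.diag([1, 0]).astype(complex)
P1 = np.diag([0, 1]).astype(complex)

plus = np.array([1, 1], dtype=complex) / np.sqrt(2)
minus = np.array([1, -1], dtype=complex) / np.sqrt(2)
proj_plus = np.outer(plus, plus.conj())
proj_minus = np.outer(minus, minus.conj())

# Entanglement breaking channel with (normalized) Kraus operators X, Y.
kraus_eb = [X / np.sqrt(2), Y / np.sqrt(2)]

# Switch Kraus operators, ordered as target (x) control.
S = [np.kron(a @ b, P0) + np.kron(b @ a, P1) for a in kraus_eb for b in kraus_eb]

psi = np.array([0.6, 0.8j])          # arbitrary illustrative input state
rho = np.outer(psi, psi.conj())
joint = np.kron(rho, proj_plus)      # control initialized in |+>
out = sum(s @ joint @ s.conj().T for s in S)

# Expected flagged form: identity flagged by |+>, phase flip flagged by |->.
expected = 0.5 * np.kron(rho, proj_plus) + 0.5 * np.kron(Z @ rho @ Z, proj_minus)

# Decoding: project the control on |+> / |->, undo Z on the "-" branch.
Pp, Pm, Zt = np.kron(I2, proj_plus), np.kron(I2, proj_minus), np.kron(Z, I2)
corrected = Pp @ out @ Pp + Zt @ Pm @ out @ Pm @ Zt
recovered = np.einsum('icjc->ij', corrected.reshape(2, 2, 2, 2))  # trace out control

print(np.allclose(out, expected), np.allclose(recovered, rho))
```

Since the input is recovered exactly for any `psi`, the sketch is consistent with the activation of the quantum capacity to its maximum value discussed above.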
Although our previous discussion explicitly shows the advantages of quantum trajectories for communications, closed-form expressions of the ultimate capacities achievable through the quantum switch are yet to be derived for generic quantum channels. In this direction, many efforts have been made to obtain tight upper and lower bounds on the quantum switch capacity. In particular, it has been shown [123] that the use of three copies of the completely depolarizing channel outperforms the bound given in (43). This has been extended to show that the asymptotic use of many copies of the completely depolarizing channel in a superposition of cyclic orders achieves perfect transmission of classical information [98,99]. Furthermore, upper and lower bounds on the quantum switch capacity have been obtained for different types of channels [14,22,98,109,124].

| | Superactivation | Causal activation |
|---|---|---|
| Entanglement | Yes: within the encoding | Yes: between the causal order of the channels and the control system |
| Type of the channels | Two different channels belonging to different zero-capacity classes | Two different or identical channels, as long as their Kraus operators do not pairwise commute/anti-commute with each other |
| Channel placement | Classical | Quantum: superposition of relative orders |
| Channels with zero classical capacity | Not activated | Activated |
| Channels with zero quantum capacity | Activated | Activated |
| Noise reduction | Always | Not always |

Table IV: Superactivation vs causal activation. Although both superactivation and causal activation arise from the phenomenon of entanglement, and they both enable information transmission even through channels with zero capacity, they exhibit fundamental differences as summarized within the table.

VII. CONCLUSIONS AND FUTURE PERSPECTIVE
In classical communications, which are based on classical information flowing through classical channels, it is widely known that the channel capacity is additive.Namely, whenever a channel cannot transmit classical information over a single use, it can never gain potential to transmit information over multiple uses or when assisted by other zero-capacity classical channels.
Conversely, the unconventional phenomena of superadditivity, superactivation and causal activation of quantum channel capacities violate known bounds and assumptions of classical Shannon theory, boosting both the classical and the quantum capacities, sometimes with astonishing gains such as those in Section V and Section VI-B.
Hence, it is of paramount importance to i) discuss the rationale for these phenomena to appear in the quantum realm, and ii) highlight open problems and research directions, both from a communication engineering perspective.

A. Discussion
A.1) The role of quantum signatures: As thoroughly discussed in the previous sections, the advantage that the phenomena of superadditivity, superactivation and causal activation provide for communications is based on the presence of entanglement, though in different disguises.
In superadditivity and superactivation, entanglement is exploited in the codewords used, enabling information carriers to be correlated while each traverses one channel. If the sender uses separable codewords, as shown in Figure 13a with reference to the superactivation phenomenon, these phenomena do not occur. Conversely, for causal activation the entanglement is manifested in the correlation between: i) the order in which the channels act on the information carrier, and ii) the degree of freedom of the control system, which does not carry any information.

Besides this key difference in the exploitation of the key resource represented by entanglement, there exists another distinction in terms of channel placement between superadditivity/superactivation and causal activation, as summarized in Table IV for superactivation versus causal activation. Specifically, the former two phenomena occur with a classical placement of channels, either through i) multiple uses of the same channel, or ii) use of different zero-capacity channels from different classes, whereas the latter occurs when a quantum trajectory is exploited, with the only restriction being that the Kraus operators of the channels do not pairwise commute or anti-commute.

37 We further note that the only qubit channels that can witness such a perfect activation of the quantum capacity are those unitarily equivalent to the entanglement breaking channel given in (45) [22].

It is worthwhile to underline that, regardless of the differences between the three phenomena, quantum channels are a fundamental requirement for these marvels to occur. This means that these phenomena have no counterpart when classical channels are used for communication.
A.2) Difference between causal activation and superadditivity/superactivation: Furthermore, it is tempting to believe that quantum channels placed in quantum trajectories provide stronger advantages than classical configurations such as those exploited by superadditivity and superactivation. However, this is not the case. Indeed, in the case of causal activation, the information carrier undergoes a superposition of two sequences of channels with different causal orders, which might result in an overall noise addition instead of a reduction. The rationale for this is that a destructive interference, rather than a constructive one, can take place. Differently, in superadditivity/superactivation, the information carriers are split between the different uses of the same channel or the different channels, such that each carrier undergoes a single operation, which can only induce a noise reduction, and never a noise amplification.

Finally, an interesting intersection between the two kinds of channel placement might be found by considering the family of flagged channels. In fact, a similarity arises between the phenomenon of superactivation in flagged convex combinations of zero-capacity channels, discussed in Section V-C, and the phenomenon of causal activation in the quantum switch. This similarity becomes clear by noticing that the channel resulting from the quantum switch of two or more channels, such as the one given in (47), is nothing else than a quantum-flagged convex combination of two channels, which might have zero capacity in particular cases.

B. Open Problems
Besides the marvelous communication advantages that the discussed phenomena enable, there are relevant issues -from the engineering perspective -that we should point out and properly discuss.
B.1) Superactivation: Primarily, superactivation is not yet fully understood, and many questions in this direction are yet to be answered. In particular, it is still important to understand whether there exist other families of zero-capacity quantum channels, besides the antidegradable and PPT families. Indeed, whether superactivation holds for other Horodecki channels without positive private capacity, or whether there are other pairs of channels that witness such an effect, besides the 50% erasure channel and the 4-dimensional Horodecki channel discussed in the text, is still an open question.

Furthermore, besides the mentioned issues arising in discrete quantum channels, there is much more to discover and to investigate in the continuous domain [126,127]. Recently, it has been shown that superactivation can be revealed in a broad range of thermal attenuator channels, even when the transmissivity is quite low or the thermal noise is high [128]. This urges further investigations of whether superactivation might occur in physically relevant circumstances of quantum Gaussian channels [129,130]. This would be a triumph for future quantum communications based on quantum properties of light.

B.2) Superadditivity:
With reference to the superadditivity phenomenon, it has been proved for channels which might be relevant in realistic scenarios. Indeed, superadditivity has been shown for a given range of the noise parameter of the depolarizing channel. Furthermore, superadditivity of the coherent information has recently been shown for the dephrasure channel, which is a concatenation of an erasure and a dephasing channel. This erasure channel can be seen as a pure-loss bosonic channel on a dual-rail qubit system, which is a good model for optical fibers.

A strong superadditivity phenomenon has been revealed in quasi-zero-capacity channels. Specifically, the quantum rocket channel (namely, a channel with $2 \log d$ input qubits and private capacity less than 2) combined with the $d$-dimensional 50% erasure channel (which has zero private capacity) can achieve a high capacity in the order of $\frac{1}{2} \log d$ [9], hence significantly larger than the capacity of the former channel. Consequently, intensive efforts are devoted to further investigations on the superadditivity of useful channels, both i) from a theoretical point of view, to serve as a laboratory for understanding quantum capacities, and ii) from a practical point of view, to harness the effect of superadditivity in near-term quantum communication technologies.

However, and differently from quantum capacities, practical and concrete examples of superadditivity of the Holevo capacity are still missing, leaving an open door for future research to reveal the usefulness of superadditivity for the transmission of classical information over quantum channels. Moreover, a full understanding of the gap between capacities of quantum channels under different constraints (namely, classical encoding-quantum decoding and quantum encoding-classical decoding) is still missing. This urges further investigation of finite blocklength coding and decoding strategies [131], and the comparison between collective measurements and LOCC (local operations and classical communication) strategies for the discrimination of product states. The latter has been thoroughly investigated recently in [132]. We should highlight that we have omitted in this manuscript the discussion of superadditivity in trade-off capacities of quantum channels. This is the capacity given by a trade-off region considering a limited assistance of quantum communication by classical communication and entanglement. It has been shown that this kind of quantum capacity exhibits a superadditivity phenomenon. The interested reader is referred to [133,134,135].

Finally, a key issue is constituted by the fact that the capacities of realistic channels, which model practical quantum communication scenarios on different platforms, are still not known. In particular, the capacity of the generalized amplitude damping channel is still not fully understood [127,136]. This channel can be seen as the qubit analogue of the bosonic thermal noise channel, and it models some of the sources of noise in superconducting circuit-based quantum computing. To this aim, many techniques for obtaining upper bounds on quantum channel capacities have been pursued. For upper bounds on the classical capacity of quantum channels, the reader is referred to [115,137,138,139,140]. Likewise, for upper bounds on the quantum capacities of quantum channels, the reader is referred to [141,142,143,144].

B.3) Causal activation:
Not very different from the previous two phenomena, there is a lot to be understood in causal activation.This phenomenon has been shown to be advantageous for some practical channels, like the entanglement breaking channel in the case of quantum information transmission discussed in Section VI-C and the completely depolarizing channel when it comes to the classical information transmission discussed in Section VI-B.Nevertheless, causal activation for continuous variable channels is still missing, which would be of paramount importance for photonic-based future quantum communications.
Another issue that the engineering of causal activation might face is represented by the coherent control of realistic channels. Basically, performing coherent control requires knowledge of the properties of the quantum channels themselves, which are obtained, not easily, by quantum process tomography [145,146]. Even more challenging, it has been shown that there are processes that break one of the key properties of quantum channels, namely complete positivity [147,148,149]. These processes cannot be described by Kraus operators and, hence, the quantum switch paradigm fails in this regard.

A possible link between superactivation and causal activation of quantum channels might be tackled through the environment-assisted communication paradigm [150,151,152]. On one hand, it has been shown that the quantum switch can be viewed as a one-way LOCC environment-assisted strategy [125], where the environment is controlled by a helper. In this context, the control qubit of the switch arises as a residual degree of freedom of the environment. This particular strategy (the quantum switch) perfectly corrects the noisy channels when it is optimal; otherwise, the quantum switch fails to perfectly mitigate the noise. It is worth mentioning that optimality is with regard to the one-way LOCC strategy maximizing the environment-assisted capacity of the corrected channel. On the other hand, it has been shown that, when the helper is allowed to use entangled states of the environment, two useless channels with zero capacity under environment assistance might activate their joint capacity [153,154]. This opens a future direction for the investigation of the link between correlated control degrees of freedom among multiple quantum switches, and the possible superactivation therein. This will help to better characterize and understand the capacity of the quantum channels used in the quantum switch.

Besides the advantages that the quantum switch can bring to point-to-point communications (namely, mitigating the noise of quantum channels by placing them in a coherent superposition of relative orders), it would be quite valuable to find practical applications for quantum networks. A first contribution toward this issue has been proposed in [155], where the indefinite causal order framework has been used to generate multipartite entanglement. Importantly, it has been shown that the application of the quantum switch can be advantageous for distributed multipartite entanglement generation between remote nodes of a quantum network. Consequently, the quantum switch may play the missing part in achieving reliable photonic multi-qubit gates or, at least, a quantum interface between different qubit technologies, mapping entangled states engineered in a particular platform (i.e., superconducting entanglement) to photonic flying qubits used for long-distance point-to-point communication [156].
These different advantages of the quantum switch suggest a new way of looking at quantum networks.Namely, new quantum internet protocol stacks are yet to be proposed [38], taking into account the coherent control in general, and the superposition of causal order paradigm in particular, laying the ground for a complete understanding of the full potential of future communication networks.

APPENDIX A QUANTUM INFORMATION BASICS: CRASH COURSE
1) Quantum bit and superposition principle: Information, either classical or quantum, can be encoded in the state of the simplest quantum mechanical system, namely, the quantum bit (qubit). Mathematically³⁸, the state of a qubit is defined as a vector $|\psi\rangle$ in a two-dimensional Hilbert space $\mathcal{H}$. Therein it is possible to choose a basis, for instance the computational basis $\{|0\rangle, |1\rangle\}$, which draws an analogy with the states 0 and 1 of a classical bit. Then, according to the superposition principle, an arbitrary state of a qubit can be expressed as a linear combination of the chosen basis states:

$$|\psi\rangle = \alpha |0\rangle + \beta |1\rangle, \tag{49}$$

where $\alpha, \beta \in \mathbb{C}$, and $|\alpha|^2 + |\beta|^2 = 1$. The state $|\psi\rangle$ in (49) is said to be in a superposition of the states $|0\rangle$ and $|1\rangle$.
2) Unitary transformations: If a quantum system (such as a qubit) is closed, it can evolve in time only under deterministic and reversible unitary transformations $U$, i.e., transformations satisfying:

$$U^{\dagger} U = U U^{\dagger} = I_{\mathcal{H}}, \tag{50}$$

where $I_{\mathcal{H}}$ is the identity on the Hilbert space $\mathcal{H}$. This means that, given the state of the system at some initial time $t_1$, its state at a certain time $t_2$ is fully determined by the corresponding unitary operator:

$$|\psi(t_2)\rangle = U(t_1, t_2) |\psi(t_1)\rangle, \tag{51}$$

which depends only on the times $t_1$ and $t_2$. Unitary transformations play a crucial role in quantum information and quantum communications since they can be seen as gates acting on a qubit. In this picture, a quantum gate has input and output ports for a qubit, and the time evolution is hidden in the relationship between them,

$$|\psi_{\text{out}}\rangle = U |\psi_{\text{in}}\rangle. \tag{52}$$

Typical examples of quantum gates widely used in quantum information are the Pauli gates

$$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \tag{53}$$

which flip the bit ($|0\rangle \leftrightarrow |1\rangle$), the phase ($\alpha|0\rangle + \beta|1\rangle \to \alpha|0\rangle - \beta|1\rangle$), or both, respectively. An important consequence of the constraint that the transformations of a closed quantum system be unitary is the celebrated no-cloning theorem (see Section II-A), which states the impossibility of creating an independent copy of an unknown quantum state. Indeed, there exists no unitary operator $U$ acting on two quantum systems able to transform the state $|\psi_1\rangle$ of one system into a copy of the state $|\psi_2\rangle$ of the other one, i.e., $U(|\psi_2\rangle \otimes |\psi_1\rangle) = |\psi_2\rangle \otimes |\psi_2\rangle$, regardless of $|\psi_2\rangle$.
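The unitarity condition and the action of the Pauli gates on a generic qubit state can be checked with a few lines of code. This is a minimal sketch; the amplitudes `alpha` and `beta` are arbitrary illustrative values, and the helper `is_unitary` is a hypothetical name.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def is_unitary(u):
    """Check U^dagger U = I numerically."""
    return np.allclose(u.conj().T @ u, np.eye(u.shape[0]))

# Illustrative amplitudes with |alpha|^2 + |beta|^2 = 1.
alpha, beta = 0.6, 0.8j
psi = np.array([alpha, beta])

bit_flipped = X @ psi      # beta|0> + alpha|1>
phase_flipped = Z @ psi    # alpha|0> - beta|1>
both_flipped = Y @ psi     # i(-beta|0> + alpha|1>), i.e. Y = iXZ

print(all(is_unitary(g) for g in (X, Y, Z)))
print(np.allclose(bit_flipped, [beta, alpha]),
      np.allclose(phase_flipped, [alpha, -beta]),
      np.allclose(both_flipped, 1j * (X @ Z @ psi)))
```

Since each gate is unitary, the norm of the state, and hence the normalization $|\alpha|^2 + |\beta|^2 = 1$, is preserved.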
3) Projective measurements: If the state of the system is unknown, certain information on it can be acquired by measuring some (observable) property of it. Mathematically³⁹, any observable is described by an operator $A$ that is self-adjoint (i.e., $A^{\dagger} = A$) and can be expanded as:

$$A = \sum_i a_i M_i, \tag{54}$$

where $\{a_i\}$ are its eigenvalues, describing the possible outcomes of the measurement, and $M_i$ are the orthogonal projectors onto the eigenvectors associated with the corresponding eigenvalues:

$$M_i M_j = \delta_{ij} M_i, \tag{55}$$

$$\sum_i M_i = I. \tag{56}$$

By measuring the observable $A$, a certain outcome $a_i$ is obtained. However, according to the quantum measurement postulate, after this measurement the system is left in the eigenstate associated with the projector $M_i$. In more detail, when a measurement is performed on a system in the state $|\psi\rangle$, the outcome $a_i$ is obtained with a probability calculated according to Born's rule:

$$P(i) = \langle \psi | M_i | \psi \rangle. \tag{57}$$

After the measurement, the system collapses into the state

$$|\psi_i\rangle = \frac{M_i |\psi\rangle}{\sqrt{P(i)}}. \tag{58}$$

We note that any following measurement of the same observable reveals again the same outcome $a_i$ and the state in (58).
Example 1. Let us consider a simple example to better present the above concepts related to quantum measurement. Specifically, let us suppose we are interested in measuring the qubit state (49) in the computational basis $\{|0\rangle, |1\rangle\}$. In this case, $M_0 = |0\rangle\langle 0|$ and $M_1 = |1\rangle\langle 1|$. By measuring the considered state and according to Born's rule, we obtain the outcome "0" with probability given by:

$$P(0) = \langle \psi | M_0 | \psi \rangle = |\alpha|^2,$$

and the outcome "1" with the probability:

$$P(1) = \langle \psi | M_1 | \psi \rangle = |\beta|^2,$$

and the system, after the measurement, is left in the state $|0\rangle$ or $|1\rangle$, respectively. We could measure the qubit in any other basis, for example, $\{|\pm\rangle = \frac{|0\rangle \pm |1\rangle}{\sqrt{2}}\}$. The corresponding observable can be constructed as

$$A = a_+ M_+ + a_- M_-,$$

where $a_+ = 1$, $a_- = -1$, and

$$M_{\pm} = |\pm\rangle\langle \pm|.$$

In this case, the measurement reveals the outcome "+" or "−" with probability

$$P(\pm) = \langle \psi | M_{\pm} | \psi \rangle = \frac{1}{2} |\alpha \pm \beta|^2;$$

for instance, both outcomes occur with the same probability $\frac{1}{2}$ when the qubit is in the state $|0\rangle$ or $|1\rangle$.

4) Mixed states and density matrix: In situations where knowledge of the actual state is lacking (for example, if the system undergoes the action of noise), the system cannot be described by a well-defined vector in Hilbert space. This means that the system is in a certain state with some probability, i.e., it has to be described by a statistical mixture of vectors in Hilbert space. Such a statistical mixture is called a mixed state (in contrast to a well-defined vector, which represents a pure state), and it can be defined formally by adopting the formalism of the density matrix. Indeed, if the system, with dimension $d$, is in one of the states $\{|\psi_i\rangle\}_{i=1}^{d}$ with corresponding probability $p_i$, the density matrix that describes its overall state is defined as

$$\rho = \sum_{i=1}^{d} p_i |\psi_i\rangle\langle \psi_i|.$$

For a pure state $|\phi\rangle$, the density matrix reduces to $\rho = |\phi\rangle\langle \phi|$. Generally speaking, any operator $\rho$ can be a density operator and describe a state of the system, as long as it fulfills the following conditions:
1) $\rho^{\dagger} = \rho$, i.e., $p_i \in \mathbb{R}$ for all $i$,
2) $\rho \geq 0$, i.e., $p_i \geq 0$ for all $i$,
3) $\mathrm{Tr}(\rho) = 1$, i.e., $\sum_{i=1}^{d} p_i = 1$.
These conditions ensure that the eigenvalues of $\rho$ can be interpreted as probabilities, namely, they are real, non-negative, and sum to unity. It is necessary to stress the crucial difference between these "classical" probabilities $p_i$ and the "quantum" ones $P(i)$. The probabilities $P(i)$ appear, due to Born's rule, when one performs a measurement on the (well-defined) state of the system, whereas the probabilities $p_i$ describe our a priori knowledge of the actual state of the system, independently of any measurement. Indeed, when a measurement of an observable $A$ is performed on a qubit in the state $\rho$, an outcome $a_i$ is revealed with the probability

$$P(i) = \mathrm{Tr}(M_i \rho),$$

leaving the system in the state

$$\rho_i = \frac{M_i \rho M_i}{\mathrm{Tr}(M_i \rho)}.$$

Example 2. Since the state in (49) is a pure state, its density matrix can be evaluated as

$$\rho = |\psi\rangle\langle \psi| = \begin{pmatrix} |\alpha|^2 & \alpha \beta^{*} \\ \alpha^{*} \beta & |\beta|^2 \end{pmatrix}.$$

On the other hand, the classical mixture of the states $|0\rangle$ and $|1\rangle$ with the probabilities $|\alpha|^2$ and $|\beta|^2$ is described by the mixed state

$$\rho_{\text{mix}} = |\alpha|^2 |0\rangle\langle 0| + |\beta|^2 |1\rangle\langle 1| = \begin{pmatrix} |\alpha|^2 & 0 \\ 0 & |\beta|^2 \end{pmatrix}.$$

When measured in the computational basis, in both cases, the qubit can be found in the state $|0\rangle$ or $|1\rangle$ with probabilities $|\alpha|^2$ and $|\beta|^2$, respectively. However, if the measurement is performed in a basis which includes $|\psi\rangle$, i.e., it answers the question "Is the qubit in the state $|\psi\rangle$ or not?", then, in the first case, the answer is always "Yes", and the measurement does not change the state of the system. In the second case, the outcome "Yes" is obtained only with the probability

$$P(\text{Yes}) = \langle \psi | \rho_{\text{mix}} | \psi \rangle = |\alpha|^4 + |\beta|^4.$$

As discussed, the crucial difference is that, in the first case, the qubit stays in a well-defined state $|\psi\rangle$ which is revealed, as we have seen, when a suitable measurement is performed. In the second case, however, the qubit a priori stays in one of the states $|0\rangle$ or $|1\rangle$ with the corresponding probabilities.
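The difference between a pure superposition and the corresponding classical mixture can be made concrete numerically. The sketch below applies Born's rule in the form $\mathrm{Tr}(M\rho)$; the amplitudes are arbitrary illustrative values, and `prob` is a hypothetical helper name.

```python
import numpy as np

# Illustrative real amplitudes with alpha^2 + beta^2 = 1.
alpha, beta = 0.6, 0.8
psi = np.array([alpha, beta], dtype=complex)

rho_pure = np.outer(psi, psi.conj())                                   # |psi><psi|
rho_mixed = np.diag([abs(alpha)**2, abs(beta)**2]).astype(complex)     # classical mixture

def prob(projector, rho):
    """Born's rule for density matrices: P = Tr(M rho)."""
    return float(np.real(np.trace(projector @ rho)))

M0 = np.diag([1, 0]).astype(complex)   # projector on |0>
M_psi = rho_pure                       # projector on |psi>

# Same statistics in the computational basis ...
p0_pure, p0_mixed = prob(M0, rho_pure), prob(M0, rho_mixed)
# ... but different answers to "is the qubit in |psi>?".
yes_pure, yes_mixed = prob(M_psi, rho_pure), prob(M_psi, rho_mixed)

print(p0_pure, p0_mixed, yes_pure, yes_mixed)
```

With these amplitudes, both states give "0" with probability $0.36$, whereas the "Yes" outcome has probability $1$ for the pure state and $|\alpha|^4 + |\beta|^4 = 0.5392$ for the mixture.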

5) POVM:
Before ending the appendix, it is important to highlight that the projective measurements introduced above, described by a set of orthogonal projectors $\{M_i\}$ satisfying conditions (55) and (56), represent a special case of the general quantum measurement postulate. However, there are important problems in quantum computation and quantum information, such as the optimal way to distinguish a set of quantum states, which require a more general tool, namely the positive operator-valued measure (POVM) formalism [101], where the measurement operators $M_i$ are not necessarily orthogonal.

Example 3. An important example of using POVMs in quantum communications is given by the problem of distinguishing between non-orthogonal states. Given a set of $N$ linearly independent states $\{|\psi_i\rangle\}$, no projective measurement can tell with certainty that a qubit has been in one of them before the measurement if they are not orthogonal. However, a wisely chosen POVM allows us to distinguish between them without error, at the price that sometimes no information about the state is revealed at all. Indeed, this can be achieved by considering a set of states $\{|\tilde{\psi}_i\rangle\}$ such that each state $|\tilde{\psi}_i\rangle$ is orthogonal to all the states of interest but $|\psi_i\rangle$ [157], i.e.,

$$\langle \tilde{\psi}_i | \psi_j \rangle \propto \delta_{ij},$$

where $\delta_{ij}$ is unity for $i = j$ and zero otherwise. Then the POVM consisting of the $N$ rescaled projectors

$$M_i \propto |\tilde{\psi}_i\rangle\langle \tilde{\psi}_i|, \quad i = 1, \ldots, N,$$

and the operator $M_{N+1} = I - \sum_i M_i$, with the scaling chosen so that $M_{N+1} \geq 0$, allows us to distinguish unambiguously between the states $\{|\psi_i\rangle\}$. Indeed, finding an outcome $i \in \{1, \ldots, N\}$ implies that the system has been in the state $|\psi_i\rangle$ before the measurement. However, finding the outcome $N + 1$, associated with the operator $M_{N+1}$, does not give any information about the state of the system at all. For example, let us assume that we have a qubit and want to distinguish between two states, $|\psi_1\rangle = |0\rangle$ and $|\psi_2\rangle = |+\rangle$. In this case, a POVM consisting of the operators

$$M_1 = \frac{\sqrt{2}}{1 + \sqrt{2}} |-\rangle\langle -|, \quad M_2 = \frac{\sqrt{2}}{1 + \sqrt{2}} |1\rangle\langle 1|, \quad M_3 = I - M_1 - M_2$$

does the job.
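The unambiguous discrimination of $|0\rangle$ and $|+\rangle$ can be verified numerically. The sketch below assumes the standard scaling factor $\sqrt{2}/(1+\sqrt{2})$ for the two conclusive POVM elements, which is one valid choice making the inconclusive element positive semidefinite; the helper names are hypothetical.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

def projector(v):
    return np.outer(v, v.conj())

# Assumed scaling: the largest value keeping M3 positive semidefinite here.
c = np.sqrt(2) / (1 + np.sqrt(2))
M1 = c * projector(minus)   # |-> is orthogonal to |+>: outcome 1 implies |0>
M2 = c * projector(ket1)    # |1> is orthogonal to |0>: outcome 2 implies |+>
M3 = np.eye(2) - M1 - M2    # inconclusive outcome

def outcome_probs(state):
    """Probabilities <state|M_k|state> for k = 1, 2, 3."""
    return [float(np.real(state.conj() @ M @ state)) for M in (M1, M2, M3)]

probs_0 = outcome_probs(ket0)
probs_plus = outcome_probs(plus)
min_eig_M3 = float(np.min(np.linalg.eigvalsh(M3)))

print(probs_0, probs_plus, min_eig_M3)
```

Outcome 2 never occurs for $|0\rangle$ and outcome 1 never occurs for $|+\rangle$, so a conclusive outcome is never wrong; the price is the inconclusive outcome 3, which occurs with probability $1 - c/2$ for either state.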
6) Composite systems and entanglement: A generic pure uncorrelated state of a composite system of $n$ qubits $\{|\psi_i\rangle\}_{i=1}^{n}$ is described by a joint quantum state belonging to a $2^n$-dimensional complex Hilbert space. To illustrate this simply, we consider a two-qubit system composed of $A$ and $B$. The two systems are described individually in the basis $\{|0\rangle, |1\rangle\}$. Accordingly, their joint state is described in the tensor product basis $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$. Any state $|\psi_{AB}\rangle$ of the joint system is given explicitly by

$$|\psi_{AB}\rangle = \alpha_0 |00\rangle + \alpha_1 |01\rangle + \alpha_2 |10\rangle + \alpha_3 |11\rangle,$$

with $\alpha_i \in \mathbb{C}$ and $\sum_i |\alpha_i|^2 = 1$. Any joint state of this composite system that cannot be written in a product form $|\psi\rangle_A \otimes |\phi\rangle_B$ as in (74) presents some form of correlations between systems $A$ and $B$. This form of correlations is called entanglement, and the corresponding state is called an entangled state. Famous examples of entangled states of two-qubit systems are the Bell pairs, given by:

$$|\Phi^{\pm}\rangle = \frac{|00\rangle \pm |11\rangle}{\sqrt{2}}, \quad |\Psi^{\pm}\rangle = \frac{|01\rangle \pm |10\rangle}{\sqrt{2}}.$$

More generally, any bipartite quantum system, no matter whether its state is pure or mixed, is said to be entangled if its state cannot be written as a convex combination (hence, a probabilistic mixture) of product states in the form:

$$\rho_{AB} = \sum_i p_i \, \rho_A^{i} \otimes \rho_B^{i}.$$

A joint state that can be written in this form is called separable. It is worth noting that separable states can have classical correlations between the systems $A$ and $B$.
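For pure two-qubit states, the product-form condition can be tested through the Schmidt decomposition: a pure state is a product state if and only if it has a single non-zero Schmidt coefficient. The sketch below is an illustration that uses this standard criterion, which is not spelled out in the text; `schmidt_coefficients` and `is_entangled` are hypothetical helper names.

```python
import numpy as np

def schmidt_coefficients(state):
    """Singular values of the 2x2 coefficient matrix alpha_{ab} of a two-qubit pure state."""
    return np.linalg.svd(np.asarray(state, dtype=complex).reshape(2, 2), compute_uv=False)

def is_entangled(state, tol=1e-12):
    """A pure state is entangled iff more than one Schmidt coefficient is non-zero."""
    return int(np.sum(schmidt_coefficients(state) > tol)) > 1

# Bell pair |Phi+> = (|00> + |11>)/sqrt(2), in the basis {|00>, |01>, |10>, |11>}.
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)

# Product state |0> (x) |+>.
ket0 = np.array([1, 0])
plus = np.array([1, 1]) / np.sqrt(2)
product = np.kron(ket0, plus)

print(is_entangled(bell), is_entangled(product))
```

For the Bell pair both Schmidt coefficients equal $1/\sqrt{2}$, which corresponds to maximal entanglement, while the product state has a single coefficient equal to one.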

APPENDIX B QUANTUM CHANNELS
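A quantum communication channel $\mathcal{N}$ is described mathematically by a completely positive trace-preserving (CPTP) map $\mathcal{C} : \rho_A \to \rho_B$ from states $\rho = \rho_A \in \mathcal{L}(\mathcal{H}_A)$, belonging to the set of density operators $\mathcal{L}(\mathcal{H}_A)$ over the input Hilbert space $\mathcal{H}_A$, to states $\rho_B \in \mathcal{L}(\mathcal{H}_B)$ on an output Hilbert space $\mathcal{H}_B$. The CPTP condition assures that the output of the map $\mathcal{C}$ is a valid density operator. In fact, it assures that

$$(I_n \otimes \mathcal{C})(\rho_{EA}) \geq 0 \quad \text{and} \quad \mathrm{Tr}(\mathcal{C}(\rho_A)) = \mathrm{Tr}(\rho_A),$$

for any ancillary system $E$ of arbitrary dimension $n$.

Example 4. The completely depolarizing channel is a widely-used quantum channel model, and it is described by the following input-output relationship:

$$\mathcal{N}_{CD}(\rho_A) = \mathrm{Tr}(\rho_A) \frac{I_d}{d},$$

where $\rho_A$ is the input state belonging to a $d$-dimensional Hilbert space $\mathcal{H}_A$, and $I_d$ is the $d$-dimensional identity matrix. The output state $\mathcal{N}_{CD}(\rho_A)$ has a unique $d$-degenerate positive eigenvalue $\frac{1}{d}$ and $\mathrm{Tr}(\mathcal{N}_{CD}(\rho_A)) = \mathrm{Tr}(\rho_A)$. As a consequence, $\mathcal{N}_{CD}$ is a positive and trace-preserving map. Moreover, it is a completely positive map. In fact, by adding an ancilla system $E$, we can consider the action of the map $I_n \otimes \mathcal{N}_{CD}$ on the entire state $\rho_{EA} \in \mathcal{L}(\mathcal{H}_E \otimes \mathcal{H}_A)$ with $\rho_A = \mathrm{Tr}_E(\rho_{EA})$. Accordingly, we obtain that:

$$(I_n \otimes \mathcal{N}_{CD})(\rho_{EA}) = \mathrm{Tr}_A(\rho_{EA}) \otimes \frac{I_d}{d}$$

has non-negative eigenvalues, since $\mathrm{Tr}_A(\rho_{EA})$ is a state. Hence, the completely depolarizing map $\mathcal{N}_{CD}$ is a quantum channel.

Example 5. Let us consider a map $T$ that transposes a state $\rho_A = \sum_{ij} p_{ij} |i\rangle\langle j|$ of the system $A$, where we fix $\{|i\rangle\}$ as a computational basis in $\mathcal{H}_A$. Its output, given by:

$$T(\rho_A) = \rho_A^{T} = \sum_{ij} p_{ij} |j\rangle\langle i|,$$

with $^{T}$ denoting the matrix transpose, obviously exhibits the same eigenvalues and trace as $\rho_A$. Hence, $T$ is positive and trace-preserving. However, let us add another system $B$ in order to check whether $T$ is completely positive. The state of the entire system reads:

$$\rho_{BA} = \sum_{ijkl} p_{ijkl} \, |i\rangle\langle j|_B \otimes |k\rangle\langle l|_A,$$

and, if $T$ acts on $A$, it becomes:

$$(I_n \otimes T)(\rho_{BA}) = \sum_{ijkl} p_{ijkl} \, |i\rangle\langle j|_B \otimes |l\rangle\langle k|_A.$$

Now let us assume $A$ and $B$ to be maximally entangled qubits (hence, $n = 2$), e.g., in the Bell state $|\Phi^{+}\rangle$, and consider again the action of the map $T$ on $A$. The entire output is given by

$$(I_2 \otimes T)(|\Phi^{+}\rangle\langle \Phi^{+}|) = \frac{1}{2} \left( |00\rangle\langle 00| + |01\rangle\langle 10| + |10\rangle\langle 01| + |11\rangle\langle 11| \right),$$

whose eigenvalues are $\pm \frac{1}{2}$. Since one of its eigenvalues is negative, $(I_2 \otimes T)(\rho_{BA})$ is not positive and, therefore, not a state. This means that the transpose map $T$ is not completely positive and, hence, it cannot represent any quantum channel. Nevertheless, the partial transpose map $(I_n \otimes T)$ plays an important role in quantum communications, lying at the core of the PPT (or Peres-Horodecki) criterion for determining entanglement.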
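The failure of complete positivity of the transpose map can be reproduced numerically on a Bell pair. This is a minimal sketch; `partial_transpose_second` is a hypothetical helper that transposes only the second qubit of a two-qubit density matrix.

```python
import numpy as np

def partial_transpose_second(rho):
    """Transpose only the second qubit of a two-qubit density matrix."""
    r = rho.reshape(2, 2, 2, 2)            # indices: (b, a, b', a')
    return r.transpose(0, 3, 2, 1).reshape(4, 4)

# Maximally entangled state |Phi+> = (|00> + |11>)/sqrt(2).
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
rho_bell = np.outer(bell, bell.conj())

# The full transpose of a single-qubit state preserves its spectrum ...
rho_single = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])
same_spectrum = np.allclose(np.linalg.eigvalsh(rho_single),
                            np.linalg.eigvalsh(rho_single.T))

# ... whereas the partial transpose of the Bell pair has a negative eigenvalue.
pt_eigs = np.sort(np.linalg.eigvalsh(partial_transpose_second(rho_bell)))

print(same_spectrum, pt_eigs)
```

The negative eigenvalue $-\frac{1}{2}$ confirms that the output is not a valid state; the same test applied to a generic bipartite state is the basis of the PPT criterion mentioned above.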
There are several ways of representing quantum channels formally, some of which will be useful in our discussion.

A. Kraus Representation
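A quantum channel $\mathcal{N}$ is described by an operator sum decomposition in Kraus operators as follows [158,159,46]:
$$\mathcal{N}(\rho) = \sum_{i=1}^{k} A_i \rho A_i^{\dagger}, \tag{85}$$
with $\rho \in \mathcal{L}(\mathcal{H}_A)$ and with $\{A_i\}_{i=1}^{k}$ being linear operators from $\mathcal{H}_A$ to $\mathcal{H}_B$ satisfying the normalization condition:
$$\sum_{i=1}^{k} A_i^{\dagger} A_i = I.$$

Example 6. The completely depolarizing channel $\mathcal{N}_{CD}$ has the following Kraus representation,
$$\mathcal{N}_{CD}(\rho) = \frac{1}{d^2} \sum_{i=1}^{d^2} \hat{U}_i \rho \hat{U}_i^{\dagger},$$
with $\{\hat{U}_i\}$ being a set of unitary operators that are mutually orthogonal, i.e., $\mathrm{Tr}(\hat{U}_i^{\dagger} \hat{U}_j) = d\,\delta_{ij}$. For a qubit ($d = 2$), the set of Pauli operators together with the identity can be chosen, $\{\hat{U}_i\} = \{I_2, X, Y, Z\}$, leading to the Kraus representation
$$\mathcal{N}_{CD}(\rho) = \frac{1}{4}\left(\rho + X\rho X + Y\rho Y + Z\rho Z\right). \tag{88}$$
In this representation, the channel can be interpreted as a noisy channel that causes a bit error ($X$), a phase error ($Z$), both errors ($Y$), or no error ($I_2$) with the same probability $p = \frac{1}{4}$.

Example 7. The qubit completely depolarizing channel (88) can be naturally generalized to the Pauli channel, which causes the above-mentioned errors with the corresponding probabilities $p_X$, $p_Y$, $p_Z$. This is the quantum channel usually adopted in quantum communications to model a noisy qubit channel, and it has the following Kraus representation,
$$\mathcal{P}(\rho) = (1 - p_X - p_Y - p_Z)\,\rho + p_X X\rho X + p_Y Y\rho Y + p_Z Z\rho Z.$$
When $p_X = p_Y = p_Z = \frac{1}{4}$, the Pauli channel reduces to the completely depolarizing channel (88). On the other hand, the choice $p_X = p_Y = p_Z = \frac{q}{3}$ leads to the depolarizing channel
$$\mathcal{N}_{D}(\rho) = (1 - q)\,\rho + \frac{q}{3}\left(X\rho X + Y\rho Y + Z\rho Z\right).$$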
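To make Examples 6 and 7 concrete, the sketch below builds the Kraus operators of a qubit Pauli channel and checks the normalization condition and the two special cases discussed above. It is only an illustration, assuming NumPy; the function names (`pauli_kraus`, `apply_channel`, `is_trace_preserving`) and the test state are hypothetical choices and do not come from the original text.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_kraus(pX, pY, pZ):
    """Kraus operators sqrt(p_i) * P_i of a qubit Pauli channel."""
    pI = 1.0 - pX - pY - pZ                    # probability of no error
    return [np.sqrt(p) * P for p, P in zip((pI, pX, pY, pZ), (I2, X, Y, Z))]

def apply_channel(kraus, rho):
    """Operator-sum form: sum_i A_i rho A_i^dagger."""
    return sum(A @ rho @ A.conj().T for A in kraus)

def is_trace_preserving(kraus):
    """Check the normalization sum_i A_i^dagger A_i = I."""
    s = sum(A.conj().T @ A for A in kraus)
    return np.allclose(s, np.eye(s.shape[0]))

# An arbitrary valid qubit state used as test input (illustrative choice).
rho = np.array([[0.8, 0.3 - 0.1j], [0.3 + 0.1j, 0.2]])

# p_X = p_Y = p_Z = 1/4: completely depolarizing channel, output I/2.
cd = pauli_kraus(0.25, 0.25, 0.25)
out_cd = apply_channel(cd, rho)

# p_X = p_Y = p_Z = q/3: depolarizing channel.
q = 0.3
dep = pauli_kraus(q / 3, q / 3, q / 3)
out_dep = apply_channel(dep, rho)
# Equivalent form, using rho + X rho X + Y rho Y + Z rho Z = 2 I for Tr(rho) = 1.
expected_dep = (1 - 4 * q / 3) * rho + (4 * q / 3) * I2 / 2
```

Writing the depolarizing output as a mixture of $\rho$ and $I_2/2$ makes it clear that $q = \frac{3}{4}$ recovers the completely depolarizing case.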

B. Isometric extension (Stinespring dilation)
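A quantum channel $\mathcal{N}$ can be described, as shown in Figure 16, by a reduced dynamics $\mathrm{Tr}_E(\cdot)$ on the isometry (i.e., a map that preserves the inner product) $U_{\mathcal{N}}$ simulating the joint evolution of the system $A$ and the environment $E$ together as [46,160]:
$$\mathcal{N}(\rho) = \mathrm{Tr}_E\left[U_{\mathcal{N}}\, \rho\, U_{\mathcal{N}}^{\dagger}\right], \tag{91}$$
where $U_{\mathcal{N}}$ is a linear operator that maps $\mathcal{H}_A$ into $\mathcal{H}_B \otimes \mathcal{H}_E$. The two descriptions (85) and (91) are equivalent in the sense that if we know one Kraus decomposition $\{A_i\}_{i=1}^{k}$ of the channel, given an orthonormal basis $\{|i\rangle_E\}$ of $\mathcal{H}_E$, the isometric extension $U_{\mathcal{N}}$ is given by:
$$U_{\mathcal{N}} = \sum_{i=1}^{k} A_i \otimes |i\rangle_E. \tag{92}$$

Example 8. For the Pauli channel $\mathcal{P}$ introduced in the previous example, and writing $p_I = 1 - p_X - p_Y - p_Z$ for the probability of no error, the set of Kraus operators is
$$\left\{\sqrt{p_I}\, I_2,\ \sqrt{p_X}\, X,\ \sqrt{p_Y}\, Y,\ \sqrt{p_Z}\, Z\right\}.$$
Therefore, its isometric extension reads
$$U_{\mathcal{P}} = \sqrt{p_I}\, I_2 \otimes |0\rangle_E + \sqrt{p_X}\, X \otimes |1\rangle_E + \sqrt{p_Y}\, Y \otimes |2\rangle_E + \sqrt{p_Z}\, Z \otimes |3\rangle_E.$$
In particular, for the completely depolarizing channel $\mathcal{N}_{CD}$, the isometric extension reduces to
$$U_{\mathcal{N}_{CD}} = \frac{1}{2}\left(I_2 \otimes |0\rangle_E + X \otimes |1\rangle_E + Y \otimes |2\rangle_E + Z \otimes |3\rangle_E\right).$$

It is important to underline that the Kraus decomposition of a quantum channel is not unique; thus, the construction of the isometric extension of the channel is not unique either. Another important concept associated with the Stinespring dilation is the complementary channel $\mathcal{N}^c$ of a quantum channel $\mathcal{N}$. The complementary channel describes the channel transmitting information to the environment rather than to the output Hilbert space $\mathcal{H}_B$, and it is given by:
$$\mathcal{N}^c(\rho) = \mathrm{Tr}_B\left[U_{\mathcal{N}}\, \rho\, U_{\mathcal{N}}^{\dagger}\right]. \tag{95}$$

Figure 16: A scheme depicting channel $\mathcal{N} : \mathcal{L}(\mathcal{H}_A) \rightarrow \mathcal{L}(\mathcal{H}_B)$ as the reduced dynamics of an isometry describing the joint evolution of the system source-receiver and the environment $E$. The figure also depicts the relations holding for degradable/antidegradable channels, with $\Omega$ denoting the corresponding post-processing map.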
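The construction of the isometric extension from a Kraus decomposition, and of the complementary channel from it, can be sketched as follows. This is an illustrative sketch assuming NumPy and the output ordering $B \otimes E$; the helper names and the chosen error probabilities are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_kraus(pX, pY, pZ):
    pI = 1.0 - pX - pY - pZ
    return [np.sqrt(p) * P for p, P in zip((pI, pX, pY, pZ), (I2, X, Y, Z))]

def isometry_from_kraus(kraus):
    """U = sum_i A_i (x) |i>_E, mapping H_A into H_B (x) H_E."""
    k = len(kraus)
    dB, dA = kraus[0].shape
    U = np.zeros((dB * k, dA), dtype=complex)
    for i, A in enumerate(kraus):
        e_i = np.zeros((k, 1))
        e_i[i, 0] = 1.0                        # environment basis vector |i>_E
        U += np.kron(A, e_i)
    return U

def channel_and_complement(U, rho, dB, dE):
    """Return (Tr_E[U rho U^dag], Tr_B[U rho U^dag])."""
    joint = (U @ rho @ U.conj().T).reshape(dB, dE, dB, dE)
    rho_B = np.einsum('aebe->ab', joint)       # trace out the environment
    rho_E = np.einsum('aeaf->ef', joint)       # trace out the receiver
    return rho_B, rho_E

kraus = pauli_kraus(0.1, 0.05, 0.2)            # illustrative error probabilities
U = isometry_from_kraus(kraus)
rho = np.array([[0.8, 0.3 - 0.1j], [0.3 + 0.1j, 0.2]])

rho_B, rho_E = channel_and_complement(U, rho, dB=2, dE=4)
kraus_output = sum(A @ rho @ A.conj().T for A in kraus)
is_isometry = np.allclose(U.conj().T @ U, np.eye(2))
```

For a Pauli channel, the diagonal of the environment state `rho_E` reproduces the error probabilities, which matches the picture of the environment recording which error occurred.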

C. Choi state of a quantum channel
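A fundamental relation between quantum channels and states is the Choi-Jamiołkowski isomorphism. This isomorphism enables a one-to-one map between an arbitrary quantum channel $\mathcal{N}$ and a density operator, referred to as $\Phi^{BA'}_{\mathcal{N}}$ in the following, in $\mathcal{L}(\mathcal{H}_B \otimes \mathcal{H}_{A'})$ on the Hilbert space $\mathcal{H}_B \otimes \mathcal{H}_{A'}$ of the joint system $BA'$, with $A'$ denoting the auxiliary system shown in Figure 17. As depicted there, this state is obtained by sending one part of the maximally entangled state $\Phi^{AA'}$ through the channel $\mathcal{N}$ and the other part through the identity channel $\mathcal{I}$:
$$\Phi^{BA'}_{\mathcal{N}} = (\mathcal{N} \otimes \mathcal{I})(\Phi^{AA'}). \tag{96}$$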
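As an illustration of the construction depicted in Figure 17, the sketch below computes the Choi state of a channel given by Kraus operators. It assumes NumPy, the ordering $B \otimes A'$, and the maximally entangled state $\frac{1}{\sqrt{d}}\sum_i |i\rangle_A |i\rangle_{A'}$; the function name `choi_state` is a hypothetical helper.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def choi_state(kraus, d):
    """(N (x) I) applied to |Phi><Phi|, |Phi> = sum_i |i>|i> / sqrt(d)."""
    phi = np.eye(d).reshape(d * d) / np.sqrt(d)   # amplitudes of |Phi>
    Phi = np.outer(phi, phi.conj())
    dB = kraus[0].shape[0]
    out = np.zeros((dB * d, dB * d), dtype=complex)
    for A in kraus:
        K = np.kron(A, np.eye(d))                 # A acts on A, identity on A'
        out += K @ Phi @ K.conj().T
    return out

# Completely depolarizing qubit channel: Kraus operators P/2 for each Pauli P.
choi_cd = choi_state([P / 2 for P in (I2, X, Y, Z)], d=2)

# Identity channel: the Choi state is the maximally entangled state itself.
choi_id = choi_state([I2], d=2)

min_eig_cd = np.linalg.eigvalsh(choi_cd).min()
rank_id = np.linalg.matrix_rank(choi_id)
```

For the completely depolarizing channel the Choi state is the product state $\frac{I_2}{2} \otimes \frac{I_2}{2}$, whereas the identity channel leaves the maximally entangled state untouched.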

CHANNELS
The definition of the complementary channel given in (95) allows us to introduce the notion of degradability of a quantum channel [46,161,162].

40 Where $\mathcal{H}_{A'}$ is isomorphic to the input Hilbert space $\mathcal{H}_A$ with dimension $d$ that, generally speaking, might be different from the dimension of $\mathcal{H}_B$.
A channel $\mathcal{N}$ is said to be degradable if the final state obtained by the environment can be obtained by post-processing the state at the receiver through a third CPTP map, as shown in Figure 16. Formally, the channel $\mathcal{N}$ is degradable if there exists a CPTP map $\Omega : \mathcal{L}(\mathcal{H}_B) \rightarrow \mathcal{L}(\mathcal{H}_E)$ satisfying the relation:
$$\mathcal{N}^c = \Omega \circ \mathcal{N}.$$
Similarly, a channel is said to be anti-degradable [163] if there exists a CPTP map $\Omega : \mathcal{L}(\mathcal{H}_E) \rightarrow \mathcal{L}(\mathcal{H}_B)$ satisfying:
$$\mathcal{N} = \Omega \circ \mathcal{N}^c.$$
Many channels are neither degradable nor anti-degradable. However, it was shown that qubit channels with a one-qubit environment are always either degradable or anti-degradable or both (symmetric) [164]. A particular example of anti-degradable channels is the set of entanglement-breaking channels [103] mentioned in Section V-A. These are the channels whose Choi state given in (96) is separable [103]. It is known that the set of anti-degradable channels is convex, that is, any convex combination of anti-degradable channels is an anti-degradable channel but, surprisingly, the set of degradable channels is not convex [121].

APPENDIX D ENTROPIC QUANTITIES
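Entropic quantities play an essential role in the study of quantum communications, as they characterize the performance of quantum channels. The von Neumann (quantum) entropy $S(\rho)$ of a quantum state $\rho$ is given by [46,163]:
$$S(\rho) = -\mathrm{Tr}(\rho \log_2 \rho) = -\sum_i \lambda_i \log_2 \lambda_i,$$
where $\{\lambda_i\}$ is the set of eigenvalues of $\rho$, i.e., the "classical" probabilities $p_i$ in its expansion (63). This generalizes the classical Shannon entropy of a random variable $X$ defined as [47]:
$$H(X) = -\sum_x p_x \log_2 p_x,$$
with $p_x$ denoting the probability that $X$ takes the value $x$.

Let $\mathcal{N} : \mathcal{L}(\mathcal{H}_A) \rightarrow \mathcal{L}(\mathcal{H}_B)$ be a quantum channel and let $A'$ be an auxiliary system evolving through $\mathcal{I}$ as shown in Figure 17, with the additional property of being a purifying system for $\rho_A$. Specifically, the auxiliary system $A'$ is chosen so that the joint state $\rho_{AA'}$, satisfying
$$\mathrm{Tr}_{A'}(\rho_{AA'}) = \rho_A,$$
is a pure state, regardless of $\rho_A$ being a pure or a mixed state. Also, let us denote the entropy of the input state $\rho_A$ as:
$$S(A) \triangleq S(\rho_A),$$
with a slight abuse of notation, given the dependence of $S(A)$ on the input state $\rho_A$, but being consistent with the literature [46,49,51]. Similarly, we denote the entropy of the output state $\rho_B = \mathcal{N}(\rho_A)$ of the channel as:
$$S(B) \triangleq S(\rho_B) = S(\mathcal{N}(\rho_A)).$$
Accordingly, the entropy of the output of the complementary channel $\mathcal{N}^c$ can be written as:
$$\begin{aligned} S(E) &\triangleq S(\mathcal{N}^c(\rho_A)) \\ &= S(\rho_E) \\ &= S(BA'). \end{aligned} \tag{108}$$
This quantity is known as the entropy of exchange [77], which refers to the amount of information leaking to the environment instead of being reliably transferred to the receiver. The relation between the different states in (108) is better understood through the isometric representation of the channel $\mathcal{N}$ as:
$$\begin{aligned} \rho_E &= \mathrm{Tr}_{BA'}\left[(U_{\mathcal{N}} \otimes I)(\rho_{AA'})\right], \\ \rho_B &= \mathrm{Tr}_{EA'}\left[(U_{\mathcal{N}} \otimes I)(\rho_{AA'})\right], \\ \rho_{BA'} &= \mathrm{Tr}_{E}\left[(U_{\mathcal{N}} \otimes I)(\rho_{AA'})\right], \end{aligned}$$
with $(U_{\mathcal{N}} \otimes I)(\rho_{AA'}) = (U_{\mathcal{N}} \otimes I)\,\rho_{AA'}\,(U_{\mathcal{N}}^{\dagger} \otimes I)$, where $U_{\mathcal{N}}$ is the isometric extension given by (92).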
Moreover, the equality between the first and the last line in (108) results from the fact that the state of the global system, given by the environment $E$, the receiver $B$ and the purifying system $A'$, is a pure state. This purity of the joint system $EBA'$ can be easily observed from the fact that the joint evolution of the input state $\rho_{AA'}$, which is pure by definition, is in fact an isometry given by $U_{\mathcal{N}} \otimes I$. Indeed, this isometry preserves purity by definition. As a consequence, the entropies of any two complementary bi-partitions of the joint output pure system must be equal. Hence, the equality $S(E) = S(BA')$ holds.
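The equality between the entropy of exchange and the entropy of the joint system formed by the receiver and the purifying system can be verified numerically. The sketch below is a hedged illustration assuming NumPy and a qubit depolarizing channel with an arbitrarily chosen parameter $q$; all helper names (`entropy`, `purify`, `isometry_from_kraus`) are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def entropy(rho):
    """Von Neumann entropy in bits, computed from the eigenvalues of rho."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]                     # convention: 0 log 0 = 0
    return float(-np.sum(lam * np.log2(lam)))

def purify(rho_A):
    """Amplitude matrix Psi[a, a'] of a purification of rho_A."""
    lam, V = np.linalg.eigh(rho_A)
    return V @ np.diag(np.sqrt(np.clip(lam, 0, None)))

def isometry_from_kraus(kraus):
    """U = sum_i A_i (x) |i>_E with output ordering B (x) E."""
    k = len(kraus)
    U = np.zeros((kraus[0].shape[0] * k, kraus[0].shape[1]), dtype=complex)
    for i, A in enumerate(kraus):
        e_i = np.zeros((k, 1))
        e_i[i, 0] = 1.0
        U += np.kron(A, e_i)
    return U

q = 0.3                                        # illustrative depolarizing parameter
kraus = [np.sqrt(1 - q) * I2] + [np.sqrt(q / 3) * P for P in (X, Y, Z)]
rho_A = np.array([[0.8, 0.3 - 0.1j], [0.3 + 0.1j, 0.2]])

# Global pure state on B, E, A' with amplitudes w[b, e, a'].
w = (isometry_from_kraus(kraus) @ purify(rho_A)).reshape(2, 4, 2)

rho_E = np.einsum('bea,bfa->ef', w, w.conj())
rho_B = np.einsum('bea,cea->bc', w, w.conj())
rho_BA = np.einsum('bea,cef->bacf', w, w.conj()).reshape(4, 4)

S_A, S_B = entropy(rho_A), entropy(rho_B)
S_E, S_BA = entropy(rho_E), entropy(rho_BA)
print(round(S_A, 4), round(S_B, 4), round(S_E, 4), round(S_BA, 4))
```

The two quantities `S_E` and `S_BA` coincide because they are entropies of complementary parts of the same global pure state.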
The previous entropies (input, output and exchanged) are the essential building blocks for many information measures in quantum communications. In the following, we focus on the measures used within the paper.
A key measure needed for our discussions in Section IV-A is the Holevo information [53]. This is a functional $\chi(\cdot, \cdot)$ of an input ensemble of states $\{p_x, \rho_x\}$ that the sender Alice inputs to the channel $\mathcal{N}$ for transmitting classical information through a quantum channel. Formally, the Holevo information of channel $\mathcal{N}$ with respect to the arbitrary input $\rho = \sum_x p_x \rho_x$ is given by:
$$\chi(\{p_x, \rho_x\}, \mathcal{N}) = S(\mathcal{N}(\rho)) - \sum_x p_x S(\mathcal{N}(\rho_x)), \tag{110}$$
where $\rho$ is the average state of the quantum ensemble encoding the classical message, given by the alphabet $\mathcal{X}$ over which the random variable $X$ takes values. It has been shown that the Holevo information provides an upper bound on the mutual information $I(X : Y)$, given by:
$$I(X : Y) = H(X) - H(X|Y), \tag{111}$$
where $X$ is the random variable describing the message $x$ to be transferred by Alice, and $Y$ is the random variable referring to the output, after a POVM is applied by Bob to estimate the value $x$. This is known as the Holevo bound [65], and is given by:
$$I(X : Y) \leq \chi(\{p_x, \rho_x\}, \mathcal{N}). \tag{112}$$
It is worth mentioning that the Holevo information is also useful for many tasks in quantum estimation and quantum discrimination, for which it was originally derived.
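As a worked illustration of (110), the sketch below evaluates the Holevo information of a qubit depolarizing channel for the ensemble made of the two computational-basis states with equal probabilities. The ensemble, the parameter $q$ and the helper names are assumptions chosen for illustration, and NumPy is assumed. For this ensemble the output states have eigenvalues $1 - \frac{2q}{3}$ and $\frac{2q}{3}$, so the result should equal $1 - h(\frac{2q}{3})$, with $h$ the binary entropy.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def entropy(rho):
    """Von Neumann entropy in bits."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log2(lam)))

def depolarizing_kraus(q):
    return [np.sqrt(1 - q) * I2] + [np.sqrt(q / 3) * P for P in (X, Y, Z)]

def apply_channel(kraus, rho):
    return sum(A @ rho @ A.conj().T for A in kraus)

def holevo_information(probs, states, kraus):
    """chi = S(N(rho)) - sum_x p_x S(N(rho_x)), with rho = sum_x p_x rho_x."""
    outs = [apply_channel(kraus, r) for r in states]
    avg_out = sum(p * o for p, o in zip(probs, outs))   # N is linear
    return entropy(avg_out) - sum(p * entropy(o) for p, o in zip(probs, outs))

def binary_entropy(x):
    return float(-x * np.log2(x) - (1 - x) * np.log2(1 - x))

# Ensemble {1/2, |0><0|}, {1/2, |1><1|} (illustrative choice).
rho0 = np.array([[1, 0], [0, 0]], dtype=complex)
rho1 = np.array([[0, 0], [0, 1]], dtype=complex)

q = 0.3
chi = holevo_information([0.5, 0.5], [rho0, rho1], depolarizing_kraus(q))
chi_closed_form = 1 - binary_entropy(2 * q / 3)
chi_noiseless = holevo_information([0.5, 0.5], [rho0, rho1], [I2])
print(round(chi, 4))
```

Note that this evaluates the Holevo information for one fixed ensemble only; no maximization over ensembles is attempted here.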
Another key measure is the quantum mutual information of channel $\mathcal{N}$ with respect to the arbitrary state $\rho = \rho_A$, given by:
$$I(\rho, \mathcal{N}) = S(A) + S(B) - S(E), \tag{113}$$
which is the quantum version of Shannon's mutual information given in (111). Similarly, a measure needed for our discussions in Section IV-B is the coherent information of channel $\mathcal{N}$ with respect to the arbitrary state $\rho$, given by [78,165,46]:
$$\begin{aligned} I_c(\rho, \mathcal{N}) &\triangleq S(\mathcal{N}(\rho)) - S(\mathcal{N}^c(\rho)) \\ &= S(B) - S(E) \\ &= -S(A'|B). \end{aligned} \tag{114}$$
It can be easily seen from the second line of (114) that the coherent information is the difference between the amount of information arriving at the receiver, given by the output entropy, and the amount of information leaked to the environment, given by the entropy of exchange. Furthermore, from the third line of the same equation, we see that the coherent information is the negative of the conditional quantum entropy $S(A'|B) = S(BA') - S(B)$. This latter quantity can be negative, in contrast to its classical counterpart, namely, the conditional entropy $H(X|Y)$. An interpretation of the negativity of this quantity has been given in the context of quantum state merging [166], where it has been shown that the negativity of the quantum conditional entropy relates to the fact that the sender and the receiver gain a potential for future quantum communications. For extensive details on the properties of the quantum mutual information and the coherent information, the reader is referred to [49,46,52,87,112,167].
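The coherent information and the quantum mutual information can be evaluated numerically for the scenario of Figure 10, where half of a maximally entangled state is sent through a depolarizing channel. The sketch below assumes NumPy, a qubit depolarizing channel with parameter $q$, and uses $S(E) = S(BA')$; the function name `informations` is hypothetical. In this case the joint output state has eigenvalues $\{1-q, \frac{q}{3}, \frac{q}{3}, \frac{q}{3}\}$, so the coherent information should equal $1 - h(q) - q\log_2 3$, with $h$ the binary entropy.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def entropy(rho):
    """Von Neumann entropy in bits."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]
    return float(-np.sum(lam * np.log2(lam)))

def informations(q):
    """Return (coherent information, quantum mutual information) for the
    maximally entangled input sent through a depolarizing channel."""
    kraus = [np.sqrt(1 - q) * I2] + [np.sqrt(q / 3) * P for P in (X, Y, Z)]
    phi = np.eye(2).reshape(4) / np.sqrt(2)        # (|00> + |11>)/sqrt(2)
    Phi = np.outer(phi, phi)
    # Joint output state on B (x) A': channel on A, identity on A'.
    rho_BA = sum(np.kron(A, I2) @ Phi @ np.kron(A, I2).conj().T for A in kraus)
    r = rho_BA.reshape(2, 2, 2, 2)                 # indices (b, a', b2, a'2)
    rho_B = np.einsum('abcb->ac', r)               # trace out A'
    rho_Ap = np.einsum('abad->bd', r)              # trace out B
    S_B, S_A, S_E = entropy(rho_B), entropy(rho_Ap), entropy(rho_BA)
    return S_B - S_E, S_A + S_B - S_E

def closed_form(q):
    h = -q * np.log2(q) - (1 - q) * np.log2(1 - q)
    return float(1 - h - q * np.log2(3))

Ic_low, I_low = informations(0.1)      # mild noise: positive coherent information
Ic_high, I_high = informations(0.3)    # stronger noise: negative coherent information
print(round(Ic_low, 4), round(Ic_high, 4))
```

The second evaluation illustrates the remark above: the coherent information, unlike a classical mutual information, can become negative.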
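We further note that both the Holevo information and the coherent information satisfy a data processing inequality. Specifically, whenever two arbitrary channels $\mathcal{N}$ and $\mathcal{M}$ are placed sequentially, they satisfy the following bottleneck inequalities:
$$\chi(\mathcal{M} \circ \mathcal{N}) \leq \min\{\chi(\mathcal{N}), \chi(\mathcal{M})\}, \qquad I_c(\mathcal{M} \circ \mathcal{N}) \leq \min\{I_c(\mathcal{N}), I_c(\mathcal{M})\}, \tag{115}$$
where $\chi(\cdot)$ and $I_c(\cdot)$ denote the corresponding quantities maximized over the channel input.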

APPENDIX E QUANTUM CODES AND RATES
An important notion, both practically and theoretically, is that of a code. Generally, if Alice and Bob want to communicate a message, they choose appropriate encoding and decoding strategies that allow them to reach their ultimate rate of communication by counteracting the effect of the noise of the communication line. Formally, this consists of an encoding map $\mathcal{E} : \mathcal{M} \rightarrow \mathcal{L}(\mathcal{H}^{\otimes n})$ from the alphabet of classical messages $\mathcal{M}$, with $k = \log_2 |\mathcal{M}|$, to a large state space of $n$ quantum carriers of information, and a decoding map $\mathcal{D} : \mathcal{L}(\mathcal{H}^{\otimes n}) \rightarrow \mathcal{M}$ from the joint state of the $n$ carriers to the alphabet $\mathcal{M}$. This is summarized in Figure 4. In the case of communicating quantum messages, the alphabet $\mathcal{M}$ above is replaced by the set of quantum states $\mathcal{L}(\mathcal{H})$ over a Hilbert space $\mathcal{H}$ of dimension $d$, and $k = \log_2 d$. Each element of the image set in $\mathcal{L}(\mathcal{H}^{\otimes n})$ is called a codeword, and the rate of the code is given by the non-negative number $R = \frac{k}{n}$. Clearly, a rate $R$ is achievable if there exists a code, i.e., an encoder $\mathcal{E}$ and a decoder $\mathcal{D}$, such that the probability of decoding the message erroneously vanishes as $n$ goes to infinity.

Figure 1: Pictorial representation of non-zero vs zero-capacity channels highlighting the different phenomena -namely, superadditivity, superactivation, and causal activation -affecting the fundamental notion of channel capacity in ways with no counterpart in the classical Shannon theory.
background
II-B. From Classical Capacity to Quantum Capacities
II-C. Operational Definition of Quantum Channel Capacities
II-D. Classical Capacity of Quantum Channels
II-E. Quantum Capacity of Quantum Channels
II-F. Bibliographic Notes
III. Quantum Marvels
III-A. Superadditivity
III-B. Superactivation
III-C. Causal Activation
IV. Superadditivity of Quantum Channel Capacities
IV-A. Superadditivity of Holevo Information
IV-B. Superadditivity of Coherent Information
IV-C. Superadditivity of Classical and Quantum Capacities
V. Superactivation of Quantum Channel Capacities
V-A. Classes of Zero Capacity Channels
V-B. Superactivation of Quantum Capacity
V-C. Non-convexity of Quantum Capacity
V-D. Classical Capacity
VI. Causal Activation of Quantum Channel Capacities
VI-A. Quantum Switch
VI-B. Causal Activation of Holevo Information
VI-C. Causal Activation of Quantum Capacity
VII. Conclusions and Future Perspectives
VII-A. Summary
VII-B. Open Problems
VIII. Appendices

Figure 3: Classical vs Quantum Capacity. The capacity of a channel measures the maximum rate at which information can be reliably transferred between communication parties through such a channel. A classical channel can be used to send classical information only and, therefore, it is fully characterized by its classical capacity. A quantum channel can transmit either classical or quantum information, and the corresponding rates are bounded by its classical and quantum capacities, respectively.

Figure 5: One-shot vs. regularized classical capacity from the encoder perspective.

Figure 7: A scheme showing superadditivity of the one-shot quantum capacity of channel $\mathcal{N}$. (a) When two instances of the channel (formally given by the tensor product $\mathcal{N}^{\otimes 2} = \mathcal{N} \otimes \mathcal{N}$) are used on separable inputs such as $|0\rangle \otimes |0\rangle$, the coherent information of the two channels together, $I_c(\mathcal{N}^{\otimes 2})$, is the sum of the two individual coherent informations, $I_c(\mathcal{N}) + I_c(\mathcal{N})$. (b) Conversely, when the two instances of the channel are used on an entangled state $|\Psi\rangle$, superadditivity of the coherent information occurs and the joint coherent information $I_c(\mathcal{N}^{\otimes 2})$ exceeds the sum of the individual coherent informations $I_c(\mathcal{N}) + I_c(\mathcal{N})$.

Figure 8: A scheme showing superactivation of the one-shot quantum capacity of two zero-capacity quantum channels $\mathcal{N}$ and $\mathcal{M}$. (a) When the two channels are used on separable inputs such as $|0\rangle \otimes |0\rangle$ encoding the quantum message, the coherent information of the two channels together, $I_c(\mathcal{N} \otimes \mathcal{M})$, is the sum of the two individual capacities $I_c(\mathcal{N})$ and $I_c(\mathcal{M})$, and hence it is identically zero. (b) When the two channels are used on an entangled state $|\Psi\rangle$ properly encoding the quantum message, superactivation of the capacity occurs and the joint coherent information $I_c(\mathcal{N} \otimes \mathcal{M})$ can be greater than zero, allowing the two channels to transmit a non-vanishing amount of quantum information.
Figure 9: A scheme showing causal activation of the coherent information for two quantum channels $\mathcal{N}$ and $\mathcal{M}$. (a) A classical sequential trajectory, where the information carrier prepared in a certain state $|\psi\rangle$ undergoes the transformation $\mathcal{M} \circ \mathcal{N}$, in which channel $\mathcal{N}$ acts on the carrier before channel $\mathcal{M}$. Both the quantum and the classical capacity of this scheme are upper bounded by the bottleneck inequality given in (115), i.e., by the minimum of the capacities of each of the two concatenated channels. (b) A quantum trajectory, which is a coherent superposition of the two classical sequential trajectories $\mathcal{N} \circ \mathcal{M}$ and $\mathcal{M} \circ \mathcal{N}$. This placement of channels is neither equivalent to a sequential trajectory, in which the channels are timelike separated, nor equivalent to a parallel placement, where the channels are spacelike separated. The overall coherent information $I_c(\mathcal{S}_{\rho_c}(\mathcal{N}, \mathcal{M}))$ of the equivalent channel $\mathcal{S}_{\rho_c}(\mathcal{N}, \mathcal{M})$, with $\rho_c$ denoting the quantum system controlling the causal order between the two channels, can violate the bottleneck inequality, i.e., $I_c(\mathcal{S}_{\rho_c}(\mathcal{N}, \mathcal{M})) > \min\{I_c(\mathcal{N}), I_c(\mathcal{M})\}$.

Figure 10: The figure shows the scenario used to derive the coherent information $I_c(\mathcal{N}_D)$ for the depolarizing channel $\mathcal{N}_D$. The quantum information is encoded into the maximally entangled input state $\Phi^{AA'}$, whose part $A$ is sent through the noisy channel $\mathcal{N}_D$, while the other part $A'$ is kept as a reference by sending it through the ideal channel $\mathcal{I}$.

Figure 12: Scheme showing the decoder D used to prove the superadditivity of the coherent information for the depolarizing channel N D .

Figure 14: The quantum switch supermap, where two quantum channels $\mathcal{N}$ and $\mathcal{M}$ are placed in a genuinely quantum configuration given by a coherent superposition of causal orders between the two channels [16]. Within the figure, $\rho_c$ denotes the control system, part of the switch supermap, controlling the causal order between the two channels. Whenever the control qubit is initialized in a superposed state, the two channels are placed in a coherent superposition of the two different causal orders $\mathcal{M} \circ \mathcal{N}$ and $\mathcal{N} \circ \mathcal{M}$.

Figure 15: The Holevo information of the effective channel $\mathcal{S}_{\rho_c}(\mathcal{N}_{CD}, \mathcal{N}_{CD})(\cdot)$ implemented through the quantum switch when $\rho_c$ is the density matrix for the state $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$. The plot is an illustration of the results derived in [14].

The CPTP condition on the map $\mathcal{C}$ assures that:
• $\mathcal{C}$ outputs a positive operator (positivity),
• for any $n$, $\mathcal{I}_n \otimes \mathcal{C}$, with $\mathcal{I}_n$ denoting the identity map on $n$-dimensional operators, outputs a positive operator (complete positivity),
• $\mathrm{Tr}(\mathcal{C}[\rho_A]) = \mathrm{Tr}(\rho_A)$ (trace preservation).
Examples 4 and 5 illustrate these concepts.

Figure 17: A scheme depicting the state $\Phi^{BA'}_{\mathcal{N}}$ of the arbitrary channel $\mathcal{N}$, obtained by sending i) one part of the maximally entangled state $\Phi^{AA'}$ through channel $\mathcal{N}$, and ii) the other part through the identity channel $\mathcal{I}$.
Alice and Bob attempt to separately use two zero-capacity channels N and M to transfer quantum states.Alice uses separate encoders E1 and E2 for each group of channels and Bob uses separate decoders D1 and D2.For any set of chosen encoding and decoding operations the transmission of information will fail due to the vanishing capacity of individual channels.