What Physical Layer Security Can Do for 6G Security

While existing security protocols were designed with a focus on the core network, enhancing the security of the B5G access network is becoming critically important. Despite the strengthening of 5G security protocols with respect to LTE, there are still open issues that have not been fully addressed. This work is articulated around the premise that rethinking the security design bottom up, starting at the physical layer, is not only viable in 6G but, importantly, arises as an efficient way to overcome security hurdles in novel use cases, notably massive machine type communications (mMTC), ultra reliable low latency communications (URLLC) and autonomous cyberphysical systems. Unlike existing review papers that treat physical layer security orthogonally to cryptography, we try to provide a few insights into the underlying connections. Discussing many practical issues, we present a comprehensive review of the state of the art in i) secret key generation from shared randomness, ii) the wiretap channel and its fundamental limits, iii) authentication of devices using physical unclonable functions (PUFs), localization and multi-factor authentication, and iv) jamming attacks at the physical layer. We conclude with the proposers' aspirations for the 6G security landscape in the era of hyper-connectivity and semantic communications.


I. INTRODUCTION
The rollout of fifth-generation (5G) mobile networks and the forthcoming sixth-generation (6G) will bring about fundamental changes in the way we communicate and access services and entertainment. In the context of security, 5G security enhancements inarguably present a big improvement with respect to LTE. However, as the complexity of the application scenarios increases with the introduction of novel use cases, notably ultra-reliable low latency communications (URLLC), massive machine type communications (mMTC) and autonomous cyberphysical systems (drones, autonomous cars, robots, etc.), novel security challenges arise that might be difficult to address using the standard paradigm of complexity based classical cryptographic solutions.
Specific use cases with open security issues are described in detail in a number of 3GPP technical reports, e.g., on the false base station attack scenario [1] and on the security issues in URLLC [2]. Indeed, for beyond 5G (B5G) systems, there exist security aspects that can be further enhanced by exploiting different approaches, as classical mechanisms either fall short in guaranteeing all the security and privacy relevant aspects or can be strengthened with mechanisms that could provide a second layer of protection.
In recent years, physical layer security (PLS) [3] has been studied and indicated as a possible way to emancipate networks from classical, complexity based security approaches. Multiple white papers on the vision for 6G incorporate physical layer security, e.g., [4]-[6], as do the 1st and 2nd Editions of the IEEE International Network Generations Roadmap (INGR) [7]. Motivated by the above, a key aim of this paper is to showcase how PLS, and in general security controls at the PHY level, can be exploited towards securing future networks.
One of the most promising and mature PLS technologies concerns the distillation of symmetric keys from shared randomness, typically in the form of wireless fading coefficients. Within the channel's coherence time, small scale fading is reciprocal, time-varying and random in nature and therefore offers a valid, inherently secure source for key agreement (KA) protocols between two communicating parties. This is pertinent to many forthcoming B5G applications that will require strong, but nevertheless lightweight, KA mechanisms, notably in the realm of the Internet of things (IoT).
With respect to authentication, there are multiple PLS possibilities, including physical unclonable functions (PUFs), wireless fingerprinting and high precision localization. Combined with more classical approaches, these techniques could enhance authentication in demanding scenarios, including (but not limited to) device to device (D2D) and Industry 4.0. Note that, according to the 6G vision of a network of (sub)networks, authentication might be required independently for access to the local (sub)network and to the core network, making the adoption of RF and device fingerprints a viable alternative for fast authentication of local wireless connections.
In parallel, mmWave and subTHz bands require the use of a huge number of antennas and pencil sharp beamforming. Consequently, a viable scenario for the wiretap channel can be substantiated, without any assumptions regarding the hardware (number of antennas, noise figure, etc.) or the position of a potential eavesdropper. Similarly, visible light communications (VLC) systems offer respective use cases. It is therefore pertinent to discuss advancements in wiretap secrecy encoders. The interplay between secrecy and privacy in finite blocklengths is another aspect that emerged from recent fundamental results in finite blocklength secrecy coding and should be highlighted.
arXiv:2212.00427v1 [cs.CR] 1 Dec 2022
Furthermore, new types of attacks have to be accounted for. In particular, there is mounting concern for potential jamming attacks and pilot contamination attacks during beam allocation and entry phases of nodes into the network [8]. Clearly, such attacks cannot be addressed with standard cryptographic tools and the required solutions can only emerge at the PHY, potentially in the form of jamming-resilient waveform and code design.
Finally, a less considered aspect relates to anomaly / intrusion detection by monitoring hardware metrics. This can be used either for distributed anomaly detection in low-end IoT networks, i.e., by monitoring memory usage, Tx and Rx time, and the debug interface of devices, or for more generalized anomaly detection of devices of untrusted manufacturers, etc. Such approaches could help lessen the monitoring overhead of centralized approaches and could provide new ways of identifying the source of the anomaly [9].
Looking at the bigger picture, future security controls will be adaptive and context-aware [10]. In this framework, rethinking the security design bottom up can provide low-cost alternatives. In particular:
1) PLS can provide information-theoretic security guarantees with lightweight mechanisms (e.g., using LDPC, Polar codes, etc.);
2) Hybrid crypto-PLS protocols can provide fast, low-footprint and low-complexity solutions for issues such as those in [1] and [2];
3) PLS can act as an extra security layer, complementing other approaches and enhancing the trustworthiness of the radio access network (RAN);
4) PLS is inherently adaptive and can leverage the context and the semantics of the data exchanged.
In the following we will provide a comprehensive review of fundamental, cutting edge results in PLS and showcase how PLS can be employed to achieve many of the standard security goals, notably confidentiality, authentication and integrity. To this end, and in order to provide a platform for a fair comparison to standard crypto schemes and a discussion on the potential advantages of hybrid PLS-crypto systems, we will first review fundamental cryptographic concepts and goals in Section II. Next, Section III gives a brief motivation on why PLS should be considered for 6G. In Section IV the wiretap channel theory will be presented (focusing on information theoretic characterizations for the finite blocklength) along with some recent results for privacy in sensing systems. Subsequently, Section V discusses the topic of secret key generation (SKG) from shared randomness and highlights two subtle points concerning the pre-processing of the observed channel coefficients and coding methods in the short blocklength; furthermore, jamming attacks and countermeasures are discussed [11], [12]. In Section VI hardware based and statistical methods used in authentication will be visited, focusing on localization based authentication [13], [14] and physical unclonable functions. Finally, future directions and the authors' aspirations for security controls at all layers in 6G will be presented in Section VII.

II. BACKGROUND CONCEPTS IN CRYPTOGRAPHY AND NETWORK SECURITY
Starting with some fundamental concepts in cryptography, we will address questions that arise in the systematic study of any system. In particular, we will provide answers to the following questions: "what do we want to achieve?"; "what is the system model?"; "what are the underlying assumptions, and what are the desirable properties?" With respect to what we aim to achieve, typically any security system aims at reaching one or more of four fundamental goals. The first goal is to provide data confidentiality, i.e., security against eavesdropping (passive attackers). The corresponding threat model involves two legitimate parties communicating in the presence of an eavesdropper. Typically, confidentiality against passive attackers is ensured with the aid of encryption. The second major goal is that of data integrity, i.e., providing guarantees that, as the data traverses the network, any modification or alteration of a message will be perceptible at the destination. The corresponding threat model involves an active attacker that, in addition to intercepting messages, also performs modifications. The third major security goal is authentication (user or device), while access control is a closely related topic. The threat model again involves an active attacker that potentially attempts to gain unauthorized access. Finally, the fourth goal is that of availability, i.e., users should not be denied services. The network should be resilient to active attacks that fall in the general category of "denial of service".
With respect to the system model, as noted above, the basic system setting includes three nodes: two legitimate parties, referred to here as Alice and Bob, and an adversarial node that is typically referred to as Eve (passive eavesdropper) or Mallory (active attacker, i.e., man-in-the-middle). To securely transmit a message (plaintext) to Bob, Alice uses a secret key to first encrypt it to a ciphertext. The ciphertext is then propagated through the transmission medium and received at Bob. Bob can decrypt the ciphertext by using the same or a different type of key, depending on the underlying algorithm.

A. Confidentiality
To perform the operations above, i.e., encryption / decryption, Alice and Bob rely on the use of ciphers. A key feature of modern block ciphers is to exploit highly non-linear operations to induce confusion, i.e., to render statistical inference attacks infeasible. A textbook example of a linear cipher that is badly broken is the substitution cipher, in which each letter of the alphabet is moved k positions to the right (or to the left), with k changing per letter. Considering the English alphabet, this results in 26! (roughly $4 \times 10^{26}$) possible keys, making a brute force attack impractical. However, due to the linearity of the operations (permutations), the letter frequencies of the plaintext are preserved in the ciphertext, and a frequency analysis of a (long enough) ciphertext suffices to guess the plaintext.
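The weakness of the substitution cipher can be illustrated with a minimal Python sketch. The function names (`make_key`, `encrypt`, `letter_histogram`), the sample sentence and the seed are hypothetical and chosen only for illustration. The sketch shows that the letter-frequency profile of the ciphertext is identical to that of the plaintext, which is the property a frequency analysis exploits.

```python
import random
import string
from collections import Counter


def make_key(seed: int) -> dict:
    """Draw a random permutation of the alphabet (one of the 26! possible keys)."""
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    random.Random(seed).shuffle(shuffled)
    return dict(zip(letters, shuffled))


def encrypt(plaintext: str, key: dict) -> str:
    """Substitute every letter; spaces and punctuation are left untouched."""
    return "".join(key.get(ch, ch) for ch in plaintext.lower())


def letter_histogram(text: str) -> list:
    """Sorted letter counts, i.e., the frequency profile without its labels."""
    counts = Counter(ch for ch in text if ch in string.ascii_lowercase)
    return sorted(counts.values(), reverse=True)


plaintext = "the enemy attacks at dawn and the defenders need to be ready"
key = make_key(seed=7)
ciphertext = encrypt(plaintext, key)

# The most frequent ciphertext letter is simply the image of the most
# frequent plaintext letter ('e' in this sentence).
top_plain = Counter(c for c in plaintext if c.isalpha()).most_common(1)[0][0]
top_cipher = Counter(c for c in ciphertext if c.isalpha()).most_common(1)[0][0]
print(top_plain, "->", top_cipher)
```

An attacker who knows typical English letter frequencies can therefore map the most frequent ciphertext letters to the most frequent plaintext letters without ever searching the key space.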
A revolutionary result in security was presented by Shannon in 1949 [15], when he demonstrated that perfect secrecy can be achieved if and only if (iff) the entropy of the secret key is greater than or equal to the entropy of the plaintext. The corresponding scheme, known as the one-time pad, is implemented by XOR-ing the plaintext with the key. Unfortunately, this requires the key to be at least as long as the data, which raises the problem of key distribution.
While the one-time pad is impractical, it provided insight into how secrecy can be achieved. In particular, it inspired the family of stream ciphers, which rely on the idea of inflating short key sequences to pseudorandom sequences of the same size as the plaintext and XOR-ing the two. This is achieved through the use of pseudorandom number generators (PRNGs). Although they cannot provide perfect secrecy (entropy cannot increase by data processing, as a consequence of the data processing inequality), their usage led to the introduction of a more practical concept, i.e., semantic security.
The definition of semantic security for PRNGs relies on the indistinguishability between their output and the output of a truly random source. More generally, semantic security ensures that a non-negligible statistical advantage cannot be accumulated by an adversary in polynomial time. For all practical purposes, if a statistical advantage happens with probability higher than $2^{-30}$, e.g., one bit is leaked in one gigabyte of data, the system is considered broken (not semantically secure).
A canonical example of modern block ciphers is the advanced encryption standard (AES). AES is a semantically secure symmetric block cipher which takes an n-bit plaintext (n = 128) and a k-bit key (k chosen from 128, 192, or 256 bits, with AES-256 considered to be quantum resistant) as input and outputs an n-bit ciphertext. AES relies on a set of substitution and permutation operations, including the use of substitution (S) boxes. A well structured S-box removes the relation and dependency between bits, making (linear or differential) cryptanalysis attacks infeasible. To allow the re-use of a single key for multiple blocks, nonces can be used. Nonces are deterministic (e.g., a counter) or random (initialization vectors), chosen such that a pair (key, nonce) never repeats. The important message here is that today's cryptographic mechanisms allow the use of a short key sequence (e.g., 96 bytes of key material in TLS v1.3) for the encryption of very long data sequences (in the order of GBs), overcoming the key distribution issue of the one-time pad.
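As a minimal sketch of the one-time pad, the following Python snippet XORs a message with a uniformly random key of the same length. The helper name `xor_bytes` and the sample messages are hypothetical. The last lines illustrate why the scheme is perfectly secret: for any other plaintext of the same length there exists a key that maps it to the same ciphertext, so the ciphertext alone does not favor one plaintext over another.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equally long byte strings."""
    if len(a) != len(b):
        raise ValueError("a one-time pad key must be exactly as long as the message")
    return bytes(x ^ y for x, y in zip(a, b))


message = b"meet at noon"
key = secrets.token_bytes(len(message))   # fresh, uniformly random, used once

ciphertext = xor_bytes(message, key)      # encryption
recovered = xor_bytes(ciphertext, key)    # decryption is the same operation

# Any other 12-byte plaintext is consistent with the same ciphertext
# under some other key, so the ciphertext reveals nothing by itself.
decoy = b"meet at dusk"
decoy_key = xor_bytes(ciphertext, decoy)
```

The length check makes the practical drawback explicit: the key material grows with the amount of data to be protected.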
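The stream cipher idea can be sketched as follows. This is a toy illustration and not a vetted cipher: the extendable-output hash SHAKE-256 from Python's `hashlib` is used here only as a convenient stand-in for a PRNG, and the names `keystream` and `stream_xor`, the 32-byte key and the 12-byte nonce are assumptions made for the example.

```python
import hashlib
import secrets


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stretch a short (key, nonce) pair into `length` pseudorandom bytes.

    SHAKE-256 acts as a stand-in PRNG for illustration only.
    """
    return hashlib.shake_256(key + nonce).digest(length)


def stream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XOR-ing the data with the keystream."""
    stream = keystream(key, nonce, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


key = secrets.token_bytes(32)                 # short secret key
nonce = secrets.token_bytes(12)               # must never repeat under the same key
message = b"sensor reading: 21.5 C; " * 50    # much longer than the key

ciphertext = stream_xor(key, nonce, message)
recovered = stream_xor(key, nonce, ciphertext)
```

Since the keystream is a deterministic function of a short key, its entropy cannot exceed that of the key, which is why such a construction can at best be semantically secure rather than perfectly secret.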

B. Data integrity
Data integrity is achieved with message authentication codes (MACs). The principle of MACs is to append a small label (tag) to each message, which validates its integrity. A MAC consists of two algorithms: signing and verification. Similarly to confidentiality schemes, there are historical examples of broken integrity algorithms in which linear functions (e.g., cyclic redundancy checks) were used to generate MACs. Modern signing algorithms (tag generation) leverage the use of secret keys and symmetric block ciphers to generate a t-bit tag for an n-bit message, with $t \ll n$. Upon reception, the verification algorithm uses the key, the received message and the tag and outputs a binary decision, i.e., the integrity check is either successful or not.
Building on the above, a naturally arising concept is that of authenticated encryption (AE), which combines both confidentiality and integrity. Various options exist on how to perform the two operations. One approach, which is always correct and provably secure, is the so called encrypt-then-sign, i.e., after a plaintext is encrypted, a tag is generated over the ciphertext. The receiver first checks the integrity and, iff the check is successful, continues with decryption.
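The signing and verification algorithms can be sketched with Python's standard `hmac` module. Note that HMAC-SHA256 is a hash based MAC and is swapped in here for convenience; it is not the block cipher based construction mentioned above. The function names `sign` and `verify` and the sample message are hypothetical.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, message: bytes) -> bytes:
    """Tag generation: a fixed-size tag for a message of arbitrary length."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Verification: binary decision, computed with a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), tag)


key = secrets.token_bytes(32)
message = b"open valve 3 for 10 seconds"
tag = sign(key, message)

accepted = verify(key, message, tag)                          # True
rejected = verify(key, b"open valve 3 for 99 seconds", tag)   # False
```

The tag here is 32 bytes regardless of the message length, which reflects the $t \ll n$ property for long messages.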
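A minimal sketch of the encrypt-then-sign order of operations is given below. It is illustrative only: the keystream is generated with SHAKE-256 as a toy stand-in for a real cipher, the tag is computed with HMAC-SHA256, and the names `seal` and `open_sealed` as well as the use of two independent keys are assumptions of this example.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(enc_key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Toy stream cipher (illustration only): XOR with a SHAKE-256 keystream."""
    stream = hashlib.shake_256(enc_key + nonce).digest(len(data))
    return bytes(d ^ s for d, s in zip(data, stream))


def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes):
    """Encrypt first, then compute the tag over the nonce and ciphertext."""
    nonce = secrets.token_bytes(12)
    ciphertext = _keystream_xor(enc_key, nonce, plaintext)
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce, ciphertext, tag


def open_sealed(enc_key: bytes, mac_key: bytes, nonce: bytes,
                ciphertext: bytes, tag: bytes):
    """Check integrity first; decrypt only if the tag is valid."""
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None                       # reject without touching the ciphertext
    return _keystream_xor(enc_key, nonce, ciphertext)


enc_key, mac_key = secrets.token_bytes(32), secrets.token_bytes(32)
nonce, ciphertext, tag = seal(enc_key, mac_key, b"telemetry frame 42")

plaintext = open_sealed(enc_key, mac_key, nonce, ciphertext, tag)
tampered = bytes([ciphertext[0] ^ 1]) + ciphertext[1:]
result_tampered = open_sealed(enc_key, mac_key, nonce, tampered, tag)
```

Because the tag covers the ciphertext, a modified ciphertext is rejected before any decryption takes place.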

C. Authentication
The process of authentication relies on digital signatures, which, in turn, are used to produce digital certificates. A digital certificate is data signed by a trusted third party (certificate authority (CA)) that ensures the authenticity of its owner. A certificate contains information about the CA, the owner of the certificate, the validity of the certificate, etc. As an example, when a user accesses a public server, the server proves its authenticity by presenting a certificate signed by a CA. To achieve mutual authentication, the user must enter a password, provide biometric data, etc.

III. MOTIVATION FOR CONSIDERING PHYSICAL LAYER SECURITY
Given the fact that all schemes discussed in the previous section are widely deployed and trusted, one question remains: What is the motivation in considering PLS?
PLS technologies can offer multiple security techniques: i) secrecy encoders for wiretap channels, ii) privacy preserving transmission, iii) secret key generation from shared randomness, iv) physical unclonable functions for device authentication, and v) localization or RF fingerprinting based authentication. While crypto solutions can provide these functionalities for current standards, they face a number of challenges when considering new and emerging technologies. First, latency requirements are getting more stringent than ever, bringing the need for faster authentication and integrity checks. Second, large scale IoT deployment requires flexible and easily scalable security solutions that can simultaneously satisfy different security levels. A third element comes from the rise of quantum computing, which creates the need for quantum secure algorithms. Finally, a fourth motivation comes from the new PHY infrastructures, where the number of operations performed at the edge is expected to rise dramatically. Therefore, it is of utmost importance to separate the security of the core network from that of the edge and to introduce new, faster and lightweight security algorithms. The statements above are complemented with the following list:
1) Regarding latency, 3GPP has recently noted that delays should be minimized in two directions: delays incurred by the communication and delays incurred due to computational overhead. A particular case where the computational overhead of current standards does not comply with the requirements is security. As an example, it has been shown that the verification of a digital signature in a vehicular networking scenario using a 400 MHz processor requires approximately 20 ms, which exceeds the tolerated delays [16]. Such results hint that revolutionary actions are needed in that direction.
2) Next, deploying billions of IoT devices is not inconceivable anymore. In 2016, it was demonstrated that a Mirai sized attack (e.g., $6 \times 10^5$ bots) is plausible. The attack was demonstrated over simple machines, e.g., water heaters; however, controlling $6 \times 10^5$ such devices can instantly change the demand in the smart grid by 3 GW, which is comparable to having access to a nuclear plant. Examples like this raise a lot of questions on the security of the IoT.
3) In 2017, NIST started the investigation on the topic of quantum resistance and post-quantum cryptography. However, as it stands now, the state of the art is based on using longer keys and increased complexity. This makes the mechanisms heavier, which conflicts with the need for low latency and low footprint. Hence, post-quantum innovations at the moment are not well aligned with the expectations for 6G networks.
4) Finally, new PHY and networking structures are being developed for the next generation of communication technologies. The central idea is to enhance the role of AI edge intelligence. This is a key component that can enable the use of PLS in 6G. More details regarding this point will be discussed in Sec. VII.
In the following sections it will be discussed how PLS technologies can be employed, and some fundamental results in the area will be shown.

A. Confidential transmission
In this section two aspects of physical layer security will be discussed, i.e., data confidentiality and data privacy. In detail, the information theoretic formulations of these problems will be investigated.
As noted in Section I, secure data transmission tends to be a higher layer issue, e.g., enabled by encryption. However, confidential data transmission becomes difficult when considering massive numbers of low cost and low complexity devices. This is where physical layer security can play an important role. The idea is that, instead of having reliability encoding, i.e., error control coding, separated from the encryption, we can use joint encoding schemes that provide both reliability and security.
This approach, known as wiretap coding, was proposed approximately half a century ago by A. Wyner [17]. Wyner looked at a three terminal wireless channel, i.e., two legitimate users, Alice and Bob, and an eavesdropper, Eve. He recognized that the channels between the terminals are not perfect, i.e., their transmission will be impacted by noise. Therefore, when Alice transmits, Bob and Eve will not see exactly what has been transmitted. Moreover, Bob and Eve will have different received signals, as they have different noisy channels. Wyner was interested in whether Alice could send a message reliably to Bob while keeping it secret from Eve. To answer this, he looked at the reliable rate to Bob versus the equivocation at Eve (the conditional entropy of the message at Eve's receiver). Note that perfect secrecy is achieved if the reliable rate at which data is transmitted to Bob equals the equivocation at Eve. To measure these quantities Wyner introduced a new metric, named secrecy capacity, which is the maximum reliable rate that equals the equivocation. He further showed that achieving positive secrecy capacity is possible; hence, confidential transmission can be performed without the use of secret keys. However, achieving positive secrecy capacity is possible iff the measurements at Eve are degraded with respect to those at Bob. A plausible example is when the signal to noise ratio (SNR) at Bob is higher than the SNR at Eve. Now, thinking about the physical layer, it is clear that the properties of radio propagation, i.e., diffusion and superposition, provide opportunities to achieve positive secrecy capacity: for example, by using the natural degradedness over time (e.g., fading), by introducing artificial degradedness at the eavesdropper (e.g., interference and jamming), or by leveraging spatial diversity (e.g., multiple antenna systems and relays can create secrecy degrees of freedom).
Based on the above, over the last fifteen years the idea of wiretap coding has been further examined considering several fundamental channel models: the broadcast channel (one transmitter, multiple receivers), the multiple access channel (multiple transmitters, one receiver) and interference channels (multiple transmitters, multiple receivers); see, e.g., [18]. To illustrate the main results in the area, this work focuses on the broadcast channel [19]. First, consider a Gaussian broadcast channel with Alice being the transmitter and Bob and Eve the receivers. Assume two messages are transmitted: $M_1$, intended for both receivers, and $M_2$, a secret message that is intended only for Bob. To define the capacity region we consider a degraded channel at Eve. In particular, it is assumed that the SNR at Bob equals 10 dB and the SNR at Eve is 5 dB. This is illustrated in Figure 1, where the horizontal axis gives the range of possible rates for the common message $M_1$ and the vertical axis gives the range of possible rates for the secret message $M_2$. The capacity region without secrecy constraints is shown with the red solid curve and the secrecy capacity region is indicated by the dashed blue curve. It can be observed that, if secrecy is required, part of the available capacity must be sacrificed in order to confuse the eavesdropper about that message. It is important to note that the amount to be sacrificed depends upon choosing a codeword that randomizes the message w.r.t. Eve, but allows Bob to successfully decode it.
Next, Figure 2 shows the impact when the SNR at Eve varies. As before, the horizontal axis gives the common rate and the vertical axis gives the secrecy rate. The arrow shows that, if the SNR at Eve decreases, the range for the common rate shrinks and the range of secrecy rates increases. On the other hand, if the SNR at Eve reaches 10 dB, the same level as Bob's SNR, the secrecy region collapses. That is, if the second receiver is not degraded, the secrecy rate becomes zero. Interestingly, things change when looking at a fading Gaussian broadcast channel. To illustrate this scenario we consider the same model, i.e., one transmitter, two receivers, one common message and one secret message, but we assume that both receivers have the same level of Gaussian noise, i.e., Bob and Eve both have 5 dB SNR. This is given in Figure 3. The difference between Bob and Eve is the fading parameter, i.e., Bob experiences Rayleigh fading with unit parameter, and Eve experiences Rayleigh fading with parameter $\sigma^2$. Note that a smaller $\sigma^2$ results in more intense fading. As before, when Eve's channel gets worse, i.e., $\sigma^2$ decreases, the range of common rates on the horizontal axis shrinks and the range of secret rates on the vertical axis increases. However, a distinction here is that if the two receivers observe statistically identical channels (this is the case when $\sigma^2 = 1$), the secrecy capacity does not collapse as in the case of the Gaussian channel. This result holds under the assumption of perfect channel knowledge and follows from the fact that fading provides additional degrees of freedom, giving Bob an advantage during the times when the other receiver experiences a deeper fade.
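The contrast between the Gaussian and the fading case can be illustrated numerically with a short Python sketch. It covers only the secret message (no common message) and rests on several assumptions that are not taken from the text: the well known Gaussian wiretap expression $[\tfrac{1}{2}\log_2(1+\mathrm{SNR}_B) - \tfrac{1}{2}\log_2(1+\mathrm{SNR}_E)]^+$ is used for the non-fading case, and for the fading case a simple achievable rate is estimated by Monte Carlo, assuming constant transmit power, perfect knowledge of both channel gains, and exponentially distributed power gains with mean 1 for Bob and mean $\sigma^2$ for Eve. The function names are hypothetical.

```python
import math
import random


def db_to_linear(db: float) -> float:
    return 10 ** (db / 10)


def gaussian_secrecy_capacity(snr_bob_db: float, snr_eve_db: float) -> float:
    """Secrecy capacity (bits per channel use) of a real Gaussian wiretap channel."""
    c_bob = 0.5 * math.log2(1 + db_to_linear(snr_bob_db))
    c_eve = 0.5 * math.log2(1 + db_to_linear(snr_eve_db))
    return max(c_bob - c_eve, 0.0)


def fading_secrecy_rate(snr_db: float, sigma2_eve: float,
                        trials: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of an achievable average secrecy rate under fading.

    Secret bits are only counted in fading states where Bob's instantaneous
    channel is better than Eve's (requires full channel knowledge).
    """
    if sigma2_eve <= 0:
        raise ValueError("sigma2_eve must be positive")
    rng = random.Random(seed)
    snr = db_to_linear(snr_db)
    total = 0.0
    for _ in range(trials):
        g_bob = rng.expovariate(1.0)               # power gain, mean 1
        g_eve = rng.expovariate(1.0 / sigma2_eve)  # power gain, mean sigma2_eve
        diff = 0.5 * math.log2(1 + g_bob * snr) - 0.5 * math.log2(1 + g_eve * snr)
        total += max(diff, 0.0)
    return total / trials


print(gaussian_secrecy_capacity(10, 5))    # degraded eavesdropper: positive
print(gaussian_secrecy_capacity(10, 10))   # equal SNRs: secrecy collapses to 0
print(fading_secrecy_rate(5, 1.0))         # identical fading statistics: still positive
print(fading_secrecy_rate(5, 0.5))         # worse channel at Eve: larger rate
```

Under these assumptions, the Gaussian secrecy capacity vanishes when both receivers have the same SNR, whereas the fading estimate remains strictly positive for $\sigma^2 = 1$, since Bob's instantaneous channel is better than Eve's in roughly half of the fading states.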
A major issue concerning the results above comes from an information theoretic perspective. In particular, they are based on the assumption of infinite coding blocklength. Hence, they concern the following scenario. Assume that a message $W$, encoded into a length-$n$ codeword, is transmitted over the channel. After passing through the wireless medium, noisy instances of the codeword are obtained by Bob and Eve. These are then fed into Bob's and Eve's decoders. The desired property in this scenario is that Bob can reconstruct the codeword perfectly while, at the same time, the leakage of the codeword to Eve is bounded by a quantity $\delta$. In the original formulation by Wyner, the considered blocklength is infinite, i.e., $n$, the number of channel uses, goes to infinity. When $n \to \infty$, the probability of error at Bob, i.e., the probability that he decodes to a $\hat{W}$ that differs from $W$, goes to zero. Additionally, the information leakage $\delta$ also goes to zero. The secrecy capacity for this case has been formulated as the difference between the mutual information between Alice, $X_A$, and Bob, $X_B$, and the mutual information between Alice and Eve, $X_E$, maximized over the channel input distribution $P_X$, i.e.,
$$C_S = \max_{P_X} \left[ I(X_A; X_B) - I(X_A; X_E) \right]. \qquad (1)$$
This is an intuitive result, i.e., achieving positive secrecy capacity relies on the degradation of Eve's channel. The limitation of this theory is that it gives only asymptotic results that are not suitable for low latency applications, such as in an IoT scenario. This opens the question: what is achievable in the non-asymptotic case? The answer depends on finite blocklength information theory. Assume we have a source $W$, which can take $1, 2, \ldots, M$ possible values, i.e., it has $\log_2 M$ bits. The source is mapped by an encoder to a sequence, $X^n$, which is then passed through a channel. Due to noise, the receiver will observe a corrupted version of the transmission, i.e., $Y^n$, which is then decoded to $\hat{W}$. If the probability that $\hat{W}$ differs from $W$ is less than a particular value, $\epsilon$, the decoder is considered able to reconstruct the original source. In systems like this, the design of $(n, M, \epsilon)$ codes is of particular interest, with $M$ the number of source symbols, $n$ the number of channel uses, and $\epsilon$ the upper bound on the reconstruction error probability at the output of the decoder. The fundamental limit for such a system is defined by the maximum $M$, denoted $M^*(n, \epsilon)$, i.e., the largest possible number of source symbols that can be transmitted through the channel in $n$ channel uses and be reconstructed at the decoder with error probability $\leq \epsilon$. Note that $\lim_{n \to \infty} \frac{1}{n} \log_2 M^*(n, \epsilon)$ gives Shannon's capacity when $\epsilon \to 0$. However, in an actual system $n$ and $\epsilon$ are finite values. Considering this, an approximation for $M^*$ was derived in [20], and it is given as
$$\frac{1}{n} \log_2 M^*(n, \epsilon) = C - \sqrt{\frac{V}{n}}\, Q^{-1}(\epsilon) + O\!\left(\frac{\log n}{n}\right), \qquad (2)$$
where $C$ is Shannon's capacity, $Q^{-1}(\cdot)$ is the inverse of the tail function of a standard Gaussian distribution, evaluated at $\epsilon$, and $V$ is the channel dispersion, which is the variance of the information density (note that Shannon's capacity is the mean of the information density).
The result from Equation (2) is illustrated in Figure 4, where an AWGN channel is assumed with SNR equal to 0 dB, $\epsilon = 10^{-3}$ and $C = 1/2$. The figure shows the upper bound and lower bound for the capacity for finite block lengths, denoted here by "Converse" and "Best achievability", respectively. Hence, the actual capacity, which remains to be found, lies between those two curves. While the gap between the curves is small for high values of n, it can be observed that for small values of n the gap remains large; hence, further work in the area is required to obtain a more precise solution.
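To give a feel for the numbers, the following Python sketch evaluates the normal approximation of [20], $C - \sqrt{V/n}\,Q^{-1}(\epsilon)$, for the AWGN setting above (SNR of 0 dB, $\epsilon = 10^{-3}$). Two points are assumptions of this sketch rather than statements from the text: the closed form used for the dispersion of the real AWGN channel, $V = \frac{P(P+2)}{2(P+1)^2}\log_2^2 e$ with $P$ the linear SNR, and the omission of the $O(\log n / n)$ remainder. The function names are hypothetical, and the values are an approximation, not the converse or achievability bounds of Figure 4.

```python
import math
from statistics import NormalDist


def awgn_capacity(snr: float) -> float:
    """Shannon capacity of the real AWGN channel in bits per channel use."""
    return 0.5 * math.log2(1 + snr)


def awgn_dispersion(snr: float) -> float:
    """Channel dispersion of the real AWGN channel in bits^2 per channel use."""
    return snr * (snr + 2) / (2 * (snr + 1) ** 2) * math.log2(math.e) ** 2


def q_inverse(eps: float) -> float:
    """Inverse of the standard Gaussian tail function Q."""
    return NormalDist().inv_cdf(1 - eps)


def normal_approx_rate(n: int, snr: float, eps: float) -> float:
    """Normal approximation C - sqrt(V / n) * Q^{-1}(eps), remainder term omitted."""
    return awgn_capacity(snr) - math.sqrt(awgn_dispersion(snr) / n) * q_inverse(eps)


snr = 1.0     # 0 dB
eps = 1e-3
for n in (100, 500, 1000, 2000):
    print(n, round(normal_approx_rate(n, snr, eps), 4))
```

Under these assumptions the approximate rate at n = 1000 is still noticeably below the capacity of 1/2 bit per channel use, and the backoff grows as the blocklength shrinks.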
Following the result for channel capacity, it has recently been shown that the secrecy capacity in the finite blocklength scenario can also be approximated [21]. Fixing the error probability at Bob, $\epsilon$, the leakage at Eve, $\delta$, and the block length $n$, an approximation for the maximal secrecy rate is given as
$$R^*(n, \epsilon, \delta) = C_S - \sqrt{\frac{V}{n}}\, Q^{-1}\!\left(\frac{\delta}{1 - \epsilon}\right) + O\!\left(\frac{\log n}{n}\right), \qquad (3)$$
where $V$ is defined similarly to the channel dispersion of (2). The result from Equation (3) is illustrated in Figure 5.
The figure considers a binary symmetric wiretap channel with crossover probability $p = 0.11$, $\delta = \epsilon = 10^{-3}$ and $C_S = 1/2$. A similar trend is observed as in the previous figure: the gap between the upper bound (Converse) and the lower bound (Best achievability) shrinks as n gets larger and widens as n gets smaller. This has also been evaluated for a Gaussian wiretap channel and the result is illustrated in Figure 6. The SNR at Bob here equals 3 dB, and the SNR at Eve equals −3 dB. It can be observed that the gap between achievability and converse is even larger for this scenario. However, what is important to mention here is that the upper bound, when considering finite block lengths, is far from the asymptotic secrecy capacity, $C_S$. This shows that research on emerging IoT technologies should not rely on asymptotic results and should focus on the investigation of short block length communications.

B. Privacy in sensing systems
Unlike secrecy, where the concern is to prevent a malicious party from getting access to the transmission, in the case of privacy the goal is to keep part of the information secret from other parties, including the legitimate receiver (Bob). A simple way to ensure there is no privacy leakage is to deny access to Bob; however, without a recipient the data source becomes useless. Therefore, it is important to study which part of the data can be shared, such that the message is successfully and securely transmitted while the privacy leakage is minimized.
This section focuses on the problem of privacy leakage with particular focus on sensing systems. Such systems include smart meters, cameras and motion sensors, i.e., devices that generate useful data for companies that provide users with a particular service (alarm, power supply, etc.). While companies can use the data to improve their services, full access to it endangers the privacy of users.
The above hints at a fundamental trade-off between privacy and usefulness of data (distortion). This is illustrated in Figure 7. If the data is completely private, i.e., its equivocation at Bob is high, the data becomes useless and it is fully distorted. Conversely, if the data is fully accessible, i.e., it has low distortion at Bob, then its equivocation goes to zero and the data is not private. Now, when considering a specific application, i.e., smart meters, the trade-off can be specified as follows: a smart meter measures the electricity usage in almost real time, hence providing users with information on their usage, but at the same time it leaks this information to the power supply company, which can use it to trace in-home activities [22]. One way to model this problem is through a hidden Gauss-Markov model. This is given in Figure 8, where the hidden state is the intermittent state, e.g., turning your toaster on, your kettle on, etc. The figure captures a smart meter trace and shows that the privacy-utility trade-off for this model can be characterized by reverse water-filling [23]. The trade-off here is defined by the water level $\phi$, such that all signals with power lower than $\phi$ are suppressed by the meter, while all signals above it are transmitted (and leaked) by the meter. Therefore, the value of $\phi$ defines the amount of privacy that the user is willing to sacrifice to increase their utility.
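The role of the water level can be sketched with the textbook reverse water-filling solution for independent Gaussian components, which is used here as a simplified stand-in for the hidden Gauss-Markov model of [23]. The component powers, the function name `reverse_water_filling` and the interpretation of the rate as leakage are assumptions made for this illustration.

```python
import math


def reverse_water_filling(variances, water_level):
    """Textbook reverse water-filling for independent Gaussian components.

    Components with power at or below the water level are suppressed
    (distortion equals their full power, zero bits revealed); components
    above it are described up to distortion `water_level`.
    """
    if water_level <= 0:
        raise ValueError("water_level must be positive")
    distortions = [min(water_level, v) for v in variances]
    rates = [0.5 * math.log2(v / d) for v, d in zip(variances, distortions)]
    return distortions, rates


# Hypothetical signal components of a meter trace (arbitrary power units)
powers = [4.0, 1.0, 0.25]

for phi in (0.5, 2.0):
    distortions, rates = reverse_water_filling(powers, phi)
    print(f"phi={phi}: leakage={sum(rates):.2f} bits, distortion={sum(distortions):.2f}")
```

Raising the water level from 0.5 to 2.0 in this toy example suppresses more components: the leaked information drops while the total distortion grows, which is the privacy-utility trade-off described above.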
Another way to approach the same problem is through control, i.e., actively controlling what the meter sees based on storage and energy harvesting [24]. This is illustrated in Figure 9, where the utility-privacy trade-off for this model is captured by measuring wasted energy versus information leakage. Presenting this control approach as a Markov model allows the efficient frontier to be determined numerically. This is given in Figure 10, where the red curve gives the optimal trade-off of wasted power versus information leakage.
Another example is competitive privacy, where there are multiple agents (Bobs), each with its own privacy-utility trade-off. On the one hand, the interacting agents are competing with one another; on the other hand, the agents have coupled measurements. In detail, each agent wants to estimate its own parameters and can help other agents by sharing data, but does not want to compromise its own privacy.
This competitive scenario can be represented as a linear measurement model [25]. Utility can be measured in terms of mean squared error on the state estimation and privacy can be measured in terms of information leakage. In fact, it has been shown that this reduces to a classical problem, known as the Wyner-Ziv problem or the distributed source coding problem. In particular, although the optimal amount of information that must be exchanged has not been discussed, it has been shown that the optimal way to exchange information is by using Wyner-Ziv coding. Next, depending on the scenario, a simple way to find the optimal amount of information is through the use of game theory.

Fig. 10. Wasted power versus information leakage when considering a control approach.
Finally, an important conclusion for this section is that information theory can help us understand the fundamental limits of security and privacy. While mainly theoretical constructs have been discussed, it is clear that there is a need to connect the theoretical analyses to real networks. Building on the above, some emerging research directions include finite blocklength analysis (short packet, low latency communication), scaling laws for large networks (channel models that consider massive networks) and practical coding schemes.

V. SECRET KEY GENERATION USING PLS
This section focuses on several aspects concerning SKG. First, it provides an overview of how to extract symmetric keys from shared randomness; then it shows how SKG can be incorporated in actual crypto systems; and finally, it discusses how the SKG process can be made resilient to active attacks.

A. Secret key generation
Generally, the SKG protocol consists of three steps: advantage distillation, information reconciliation, and privacy amplification. Assuming two legitimate parties, e.g., Alice and Bob, the steps can be summarized as follows. In the first step, Alice and Bob exchange pilot signals during the coherence time of the channel and obtain correlated observations $Z_A$ and $Z_B$, respectively. In the second step, their observations are first quantized and then passed through a distributed source code type of decoder. During this step, Alice (or Bob) shares side information, which is used by the other party to correct errors at the output of its decoder. Hence, at the end of this step both parties obtain a common binary sequence. Finally, to produce a maximum entropy key and suppress the leaked information, privacy amplification is performed. In this last step, Alice and Bob apply an irreversible compression function (e.g., a hash function) over the reconciled bit sequence. This produces a uniform key that is unobservable by adversaries.

Fig. 11. FER performance of reconciliation codes compared to the lower bound from [27] for n = 128. (From [28].)
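The three steps can be sketched end to end in Python as follows. This is a toy illustration under stated assumptions, not the scheme evaluated in [28]: a 1-bit sign quantizer, Hamming(7,4) syndromes as the side information, SHA-256 as the compression function, and a simulated channel with hypothetical noise levels. All function names are illustrative.

```python
import hashlib
import random

BLOCK = 7  # Hamming(7,4) block length; bit sequences are assumed to be multiples of 7


def quantize(samples):
    """1-bit quantizer: keep only the sign of each channel observation."""
    return [1 if x > 0 else 0 for x in samples]


def syndrome(block):
    """Hamming(7,4) syndrome: XOR of the 1-based positions that hold a 1."""
    s = 0
    for pos, bit in enumerate(block, start=1):
        if bit:
            s ^= pos
    return s


def side_information(bits):
    """Alice's public side information: one 3-bit syndrome per 7-bit block."""
    return [syndrome(bits[i:i + BLOCK]) for i in range(0, len(bits), BLOCK)]


def reconcile(bob_bits, alice_syndromes):
    """Bob corrects up to one disagreement per block using Alice's syndromes."""
    fixed = []
    for i, s_a in enumerate(alice_syndromes):
        block = bob_bits[BLOCK * i:BLOCK * (i + 1)]
        err = syndrome(block) ^ s_a   # syndrome of the disagreement pattern
        if err:
            block[err - 1] ^= 1       # flip the bit the syndrome points at
        fixed.extend(block)
    return fixed


def privacy_amplification(bits, key_bytes=16):
    """Compress the reconciled bits with a hash to obtain the final key."""
    return hashlib.sha256(bytes(bits)).digest()[:key_bytes]


if __name__ == "__main__":
    rng = random.Random(7)
    # Advantage distillation: both sides observe the same fading plus own noise.
    h = [rng.gauss(0, 1) for _ in range(40 * BLOCK)]
    z_a = [x + rng.gauss(0, 0.02) for x in h]
    z_b = [x + rng.gauss(0, 0.02) for x in h]
    a_bits, b_bits = quantize(z_a), quantize(z_b)
    # Information reconciliation: 3 of every 7 bits are revealed by the
    # syndromes, which is why the 280 bits are hashed down to 128 key bits.
    b_fixed = reconcile(b_bits, side_information(a_bits))
    k_a, k_b = privacy_amplification(a_bits), privacy_amplification(b_fixed)
    print("raw disagreements:", sum(x != y for x, y in zip(a_bits, b_bits)))
    print("keys match:", k_a == k_b)
```

A single-error-correcting code is used only to keep the sketch short; two disagreements in one block would make the keys differ, which is why stronger reconciliation codes are of interest.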
There are a few important points that need to be taken into account for the success of the SKG process. First, channel measurements represent a mixture of large scale and small scale fading components. Multiple studies have demonstrated that the large scale component is strongly dependent on the location of, and the distance between, users, which makes it predictable for eavesdroppers. Therefore, to distill a secret key, Alice and Bob should either remove this part from their measurements and generate the key using the unpredictable small scale components, or compress more during privacy amplification. This point is further discussed in Section VII. Second, the SKG protocol should follow all the steps described above, and no steps should be skipped. As an example, skipping privacy amplification would give Alice and Bob a longer key sequence; however, that key sequence would be vulnerable to different attacks [26]. Third, it is important that Alice and Bob do not transmit information related to their observations, as this could be exposed to eavesdroppers in the vicinity. Fourth, Alice and Bob should respect the coherence time and coherence bandwidth of the channel, such that their subsequent measurements are decorrelated in time and frequency. This allows them to generate random and unpredictable bit sequences. Finally, as mentioned in the previous section, further testing of short blocklength encoders is necessary in order to identify the optimal solution for SKG.
Regarding the last point, Figures 11 and 12 show a comparison between an upper bound, evaluated in [27], and the information reconciliation rates achieved using LDPC, polar and BCH codes [28]. Both figures (n = 128 and n = 512) show that polar codes with CRC and BCH codes with list decoding outperform the other approaches, making them good candidates for reconciliation decoding. Note that encoders of this type are already used in 5G for different purposes.

B. Secret key generation in hybrid crypto systems
Building on the above, we continue with a particular example of how SKG can be incorporated in hybrid cryptographic schemes. In detail, it will be discussed how to build SKG-based authenticated encryption. Three ingredients are needed to formulate this problem:
1) Alice runs the SKG procedure on her channel measurements, where $h$ represents the channel measurements, $k$ the key generated after privacy amplification and $s_A$ Alice's side information that has to be transmitted to Bob to finalize the process.
2) Before transmitting $s_A$ to Bob, Alice breaks her key into two parts, $k = \{k_e, k_i\}$, generates a ciphertext as $c = \mathrm{Es}(k_e, m)$ and signs it as $t = \mathrm{Sign}(k_i, c)$. Afterwards she transmits to Bob the concatenation $[s_A \| c \| t]$, i.e., in a single message she can transmit the side information and her message.
3) Upon receiving the above, Bob uses the side information $s_A$ to finish the SKG process, i.e., to obtain the key $k$. Then, he checks the integrity of the received ciphertext as $\mathrm{Ver}(k_i, c, t)$ and, if successful, he decrypts and obtains the message $m$.
Unlike the standard SKG scheme, where SKG is performed in parallel at both nodes and data exchange happens only after the key generation is finalized, in the scheme above Alice completes the SKG locally and then transmits, in a single go, the ciphertext, the tag, and the side information (e.g., a syndrome). Then Bob uses the syndrome to complete the SKG and performs the authenticated decryption. This small change in the standard procedure shows how PLS can be easily combined with standard crypto schemes.
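The message flow above can be sketched in Python using only the standard library. The choices below are assumptions made for illustration: `es` is a toy SHA-256 counter-mode XOR cipher standing in for Es (a vetted cipher would be used in practice), `sign`/`ver` are HMAC-SHA256, the message is carried as a tuple rather than a byte concatenation, and `finish_skg` is a hypothetical callback representing Bob's reconciliation and privacy amplification.

```python
import hashlib
import hmac


def keystream(key, n):
    """Toy keystream: SHA-256 in counter mode (illustration only)."""
    out, counter = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def es(k_e, data):
    """Toy symmetric cipher standing in for Es; XOR makes it self-inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(k_e, len(data))))


def sign(k_i, c):
    """Tag over the ciphertext, t = Sign(k_i, c), realized here as HMAC-SHA256."""
    return hmac.new(k_i, c, hashlib.sha256).digest()


def ver(k_i, c, t):
    """Constant-time check of the tag, Ver(k_i, c, t)."""
    return hmac.compare_digest(sign(k_i, c), t)


def alice_send(k, s_a, m):
    """Alice: split k = {k_e, k_i}, encrypt, sign, and bundle [s_A || c || t]."""
    k_e, k_i = k[:16], k[16:]
    c = es(k_e, m)
    return s_a, c, sign(k_i, c)


def bob_receive(packet, finish_skg):
    """Bob: finish SKG from s_A, verify the tag, then decrypt."""
    s_a, c, t = packet
    k = finish_skg(s_a)
    k_e, k_i = k[:16], k[16:]
    if not ver(k_i, c, t):
        raise ValueError("integrity check failed")
    return es(k_e, c)


if __name__ == "__main__":
    # Placeholder 32-byte key; in the scheme it comes out of privacy amplification.
    k = hashlib.sha256(b"reconciled bit sequence").digest()
    packet = alice_send(k, s_a=b"syndrome", m=b"hello Bob")
    print(bob_receive(packet, finish_skg=lambda s_a: k))
```

Verifying the tag before decrypting follows the order given in step 3.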
Such approaches bring new opportunities. For example, the scheme above opens the problem of transmission optimization. Consider a scenario with multiple subcarriers used for transmission. The subcarriers can then be split into two subsets, a subset $D$ used for transmitting encrypted data and a subset $\bar{D}$ used for transmitting side information (syndromes). This transmission scheme can be optimized considering several constraints. The first constraint comes from the world of cryptography, i.e., based on the choice of cryptographic cipher we can define the amount of data to be encrypted with a single key. This can be captured by the following constraint:

$$C_{SKG} \geq \beta C_D, \tag{5}$$

where $C_{SKG}$ defines the key generation rate, $C_D$ defines the data rate and $\beta$ is a quantity that relates the key size to the size of the data that will be encrypted, e.g., $\beta = 1$ corresponds to a one-time pad cipher. The second constraint comes from the world of information theory. It relates the necessary (side information) syndrome rate $C_R$ and the SKG rate as follows:

$$C_R \geq \kappa C_{SKG}, \tag{6}$$

where $\kappa$ defines the minimum number of reconciliation bits with respect to the key bits. It is a parameter defined by the type of the encoder/decoder used for SKG, e.g., for a rate $k/n$ block encoder $\kappa = \frac{n-k}{k}$. Further constraints that can be incorporated are a power constraint:

$$\sum_{j=1}^{N} p_j \leq NP, \quad p_j \geq 0, \tag{7}$$

and a channel capacity constraint, i.e.,

$$C_D + C_R \leq C, \tag{8}$$

where $p_j$ is the power allocated to subcarrier $j$, $N$ gives the number of subcarriers, $P$ is the power limit per subcarrier and $C$ is the total capacity of the channel. The objective of the problem can then be defined as:

$$\max \; C_D \quad \text{s.t. (5), (6), (7), and (8).} \tag{9}$$

The problem can be turned into a combinatorial optimization problem, which can be solved optimally using dynamic programming techniques or sub-optimally using heuristic approaches. Overall, this problem shows how physical layer aspects can be related to cryptographic schemes, in the form of a hybrid security scheme, and provide new opportunities for cross layer optimization.
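The combinatorial nature of the subcarrier split can be illustrated with a small brute-force Python sketch. It is neither the dynamic programming solution nor the heuristic of [29]. It assumes fixed (already power-allocated) per-subcarrier capacities, which are hypothetical values, and it assumes that the cryptographic and reconciliation constraints described above are met with the smallest admissible key rate, so that they combine into the single requirement that the syndrome rate is at least κβ times the data rate.

```python
from itertools import combinations


def best_split(capacities, beta, kappa):
    """Exhaustively choose the data subset D that maximizes the data rate C_D.

    Assumption: C_D is the sum of capacities in D, C_R the sum over the
    remaining subcarriers, and a split is feasible when C_R >= kappa*beta*C_D.
    """
    n = len(capacities)
    total = sum(capacities)
    best_rate, best_set = 0.0, ()
    for r in range(n + 1):
        for data_set in combinations(range(n), r):
            c_d = sum(capacities[j] for j in data_set)
            c_r = total - c_d
            if c_r + 1e-12 >= kappa * beta * c_d and c_d > best_rate:
                best_rate, best_set = c_d, data_set
    return best_rate, best_set


# Hypothetical per-subcarrier capacities in bits per channel use.
capacities = [3.0, 2.0, 1.5, 1.0, 0.5]
for beta in (0.1, 0.5, 1.0):
    rate, data_set = best_split(capacities, beta, kappa=0.5)
    print(f"beta={beta}: C_D={rate}, D={data_set}")
```

The exhaustive search visits all 2^N splits, so it is only practical for a handful of subcarriers; it mainly shows why Knapsack-style dynamic programming or a heuristic is needed at realistic sizes.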
The problem was solved in [29] and the main result is depicted in Figure 13. The figure shows the long term efficiency (expected sum data rate normalized to the capacity of the channel) of the proposed parallel approach, i.e., the transmission of side information and encrypted data are done simultaneously on $\bar{D}$ and $D$, respectively, versus a standard sequential transmission approach. It can be seen that, for most values of $\beta$, the parallel approach outperforms the sequential one. Another observation is that as $\beta$ increases, the efficiency decreases. This is an expected result, as a higher $\beta$ requires more frequent key generation and, hence, leaves less room for data transmission. Finally, an important result that can be observed in the graph is that the simple heuristic approach proposed by the authors for the parallel scheme gives an efficiency equivalent to that of the optimal solution obtained using dynamic programming (i.e., by treating it as a Knapsack problem).
This problem has been further investigated in [30], where a general quality of service (QoS) delay constraint was introduced. The work leverages the theory of effective capacity and identifies the maximum supported transmission rate under a delay constraint, i.e., instead of maximizing the data rate $C_D$ the problem focuses on maximizing the effective data rate $E_C(\alpha)$, where $\alpha = \frac{\theta T_f B}{\ln(2)}$, with $\theta$ being a MAC sub-layer parameter that captures the packet arrival rate and introduces a delay requirement into the problem, $T_f$ the frame duration and $B$ the bandwidth. Considering that, [30] identified the optimal power allocation policy that maximizes $E_C(\alpha)$, in which $g_0$ is a cut-off value that can be found from the power constraint and $\hat{g}_i$, $i = 1, \ldots, N$, denote the imperfectly estimated channel gains. If the system can tolerate looser delay requirements, i.e., $\theta \to 0$, the optimal power allocation converges to the well-known water-filling algorithm, and if stringent delay constraints are imposed, i.e., $\theta \to \infty$, it converges to total channel inversion. Similarly to the previous case, it has been demonstrated that the parallel approach outperforms the sequential approach, in terms of efficiency, regardless of the values of $\theta$ and $\beta$ [30].

C. Secret key generation under active attacks
The previous section discussed how SKG can be used to build authenticated encryption protocols. However, the above scheme can only be secure under the assumption that the advantage distillation phase is robust against active attacks. Therefore, this section focuses on active attacks during SKG; in particular, the injection attack is investigated. The idea of this attack is illustrated in Figure 14.
Differently from previous sections, instead of an eavesdropper, an active man-in-the-middle (MiM) attacker, referred to as Mallory, is considered. The system model assumes two legitimate users, Alice and Bob, each having a single antenna, and Mallory, who has two antennas. The goal of the attacker is to inject an equivalent signal $W$ at both Alice and Bob, such that their channel observations $Z_A$ and $Z_B$, respectively, will also include the injected signal:

$$Z_A = XH + W + N_A, \tag{12}$$
$$Z_B = XH + W + N_B, \tag{13}$$

where the channel realization between Alice and Bob is denoted by $H \sim \mathcal{CN}(0, \sigma^2)$, the signal exchanged over this channel is given as $X$, with $\mathbb{E}[|X|^2] \leq P$, the noise observations at Alice and Bob are given as $N_A, N_B \sim \mathcal{CN}(0, 1)$, and the signals injected over the links Mallory-Alice (with channel $\mathbf{H}_A$) and Mallory-Bob (with channel $\mathbf{H}_B$) are given as $W = \mathbf{H}_A^T \mathbf{P} X_J = \mathbf{H}_B^T \mathbf{P} X_J$. The received signals are equal thanks to the precoding matrix $\mathbf{P}$. A simple mathematical operation reveals that, as long as Mallory has one extra antenna compared to Alice and Bob, the design of the precoding matrix is straightforward. Overall, this is a simple attack to mount, yet its consequences are critical. As can be seen in Equations (12) and (13), by injecting the signals, Mallory adds an additional term to the shared randomness between Alice and Bob, turning it into $XH + W$. Hence, this allows Mallory to obtain partial information about the generated key.
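For the two-antenna case, one precoder that satisfies the equal-reception condition can be written down directly, as the following Python sketch shows. It is an illustrative construction rather than the design from the cited work: it assumes Mallory knows both of her channels perfectly, ignores power normalization, and uses hypothetical function names.

```python
import random


def cn(rng, var=1.0):
    """Draw one circularly symmetric complex Gaussian sample CN(0, var)."""
    s = (var / 2) ** 0.5
    return complex(rng.gauss(0, s), rng.gauss(0, s))


def injection_precoder(h_a, h_b):
    """Return a 2x1 precoder p with h_a^T p == h_b^T p.

    With d = h_a - h_b, the vector p = [d[1], -d[0]] satisfies d^T p = 0,
    so both legitimate nodes receive the same injected signal.
    """
    d = (h_a[0] - h_b[0], h_a[1] - h_b[1])
    return (d[1], -d[0])


def received(h, p, x_j):
    """Injected signal seen over channel h: W = h^T p x_j (plain transpose)."""
    return (h[0] * p[0] + h[1] * p[1]) * x_j


if __name__ == "__main__":
    rng = random.Random(1)
    h_a = (cn(rng), cn(rng))   # Mallory-Alice channel, one entry per antenna
    h_b = (cn(rng), cn(rng))   # Mallory-Bob channel
    p = injection_precoder(h_a, h_b)
    w_a, w_b = received(h_a, p, 1 + 0j), received(h_b, p, 1 + 0j)
    print("difference between injected signals:", abs(w_a - w_b))
```

With a single antenna the only solution would be the all-zero precoder, which is consistent with the requirement of one extra antenna.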
Fortunately, a simple countermeasure has been proposed in [11]. The idea is that, instead of using deterministic pilot signals $X$ as described above, Alice and Bob transmit independent and randomized probe signals $X$ and $Y$, respectively. This turns their observations into

$$Z_A = YH + W + N_A,$$
$$Z_B = XH + W + N_B,$$

which allows them to simply post-multiply by their own transmission, resulting in the following:

$$\tilde{Z}_A = X Z_A = XYH + XW + XN_A,$$
$$\tilde{Z}_B = Y Z_B = XYH + YW + YN_B,$$

where, as can be seen, $W$ is no longer part of the shared randomness. Therefore, as long as $X$ and $Y$ are uncorrelated, this simple approach can successfully reduce an injection attack to a less harmful uncorrelated jamming attack. In detail, the jamming attack has an impact on the achievable key rate but does not reveal anything about the key to Mallory.
Now, when Mallory's attack is reduced to jamming, a smart thing she can do is to act as a reactive jammer. A reactive jammer would first sense the spectrum and jam only subcarriers where she detects a transmission. Considering a multicarrier system, Mallory can choose a sensing threshold and jam only subcarriers where she detects signals with power greater than the chosen threshold. A thorough analysis of this scenario has been performed in [11], where the problem has been investigated using game theory. In fact, the scenario can be formulated as a non-cooperative zero-sum game with two players, i.e., player L (the legitimate users act as a single player) and player J (the jammer). Since player J jams only after observing the action of player L, this is formulated as a hierarchical game with L being the leader of the game and J being the follower. Note that in hierarchical games, the optimal action is the Stackelberg equilibrium (SE). What was shown in this study is that the SE depends on two things: i) the sensitivity of the receiver at player J, and more specifically how well the sensing threshold is chosen, and ii) the available power at the legitimate users. The SE is defined as:
• If the jammer has chosen its threshold badly, then, depending on the available power at the legitimate users, they would optimally either:
1) distribute their power equally below the sensing threshold and not compromise their communication, or
2) transmit with full power on all subcarriers, hence being sensed and jammed.
• If the jammer has chosen a low threshold that allows it to detect all ongoing transmissions, Alice and Bob have no choice but to transmit at full power.
Overall, SKG is a promising PLS technology and could help solve the key distribution issue for emerging 6G applications, e.g., addressing scalability for massive IoT [31].
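The effect of randomized probing on the injection attack discussed in this subsection can be checked numerically with a short Monte Carlo sketch in Python. The setup is an assumption made for illustration: unit-variance channel and injected signal, a small hypothetical noise variance, and random ±1 probes for X and Y. The function returns the sample correlation between Alice's and Bob's processed observations, and between Mallory's injected signal and Alice's processed observation.

```python
import random


def cn(rng, var=1.0):
    """Draw one circularly symmetric complex Gaussian sample CN(0, var)."""
    s = (var / 2) ** 0.5
    return complex(rng.gauss(0, s), rng.gauss(0, s))


def corr(a, b):
    """Magnitude of the normalized sample cross-correlation of two sequences."""
    num = sum(x * y.conjugate() for x, y in zip(a, b))
    den = (sum(abs(x) ** 2 for x in a) * sum(abs(y) ** 2 for y in b)) ** 0.5
    return abs(num) / den


def simulate(n, randomized, seed=0, noise_var=0.01):
    """Return (corr(Alice, Bob), corr(injected W, Alice)) over n channel uses."""
    rng = random.Random(seed)
    alice, bob, injected = [], [], []
    for _ in range(n):
        h, w = cn(rng), cn(rng)
        n_a, n_b = cn(rng, noise_var), cn(rng, noise_var)
        if randomized:
            # Independent random probes; each node multiplies by its own probe.
            x, y = rng.choice((-1, 1)), rng.choice((-1, 1))
            alice.append(x * (y * h + w + n_a))
            bob.append(y * (x * h + w + n_b))
        else:
            # Deterministic pilot X = 1 known to everyone, including Mallory.
            alice.append(h + w + n_a)
            bob.append(h + w + n_b)
        injected.append(w)
    return corr(alice, bob), corr(injected, alice)


if __name__ == "__main__":
    for mode in (False, True):
        c_ab, c_wa = simulate(20000, randomized=mode)
        print(f"randomized={mode}: corr(A,B)={c_ab:.3f}, corr(W,A)={c_wa:.3f}")
```

In the randomized case the correlation between the legitimate observations drops, reflecting the reduced key rate under jamming, while the injected signal becomes essentially uncorrelated with what Alice uses for key generation.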

VI. AUTHENTICATION USING PLS
One of the main motivations to look at PLS authentication schemes is the increasing complexity of standard crypto schemes. In fact, multiple studies have shown that there exists a trade-off between delay and the key sizes used in cryptographic schemes.
A particular example that focuses on addressing such issues is the zero-round-trip-time (0-RTT) protocol introduced in TLS version 1.3 for session resumption. The idea is based on using resumption keys to quickly resume a session, in 0-RTT, as opposed to re-authenticating users in every subsequent session. Unfortunately, it has been shown that this scheme is vulnerable to a set of attacks (e.g., replay attacks); however, the community answer was "But too big a win not to do" [32].
This section gives a hint of what PLS can do in terms of authentication for 6G systems. In particular, it first gives a brief background on physical unclonable functions (PUFs), then discusses how localization can be used as an authentication factor, and finally introduces a secure 0-RTT authentication protocol that leverages multiple PLS technologies.

A. Physical unclonable functions
PUFs can be referred to as device fingerprints. The idea is that the manufacturing of a circuit is a process with unique characteristics (e.g., due to changes in temperature or vibrations), which makes each device unique. While devices operate in a similar manner, they always have small variations in terms of delays, power-on state, jitter, etc. This gives an opportunity to leverage this uniqueness and use it for authentication.
Given that, a standard PUF-based authentication protocol follows two phases: an enrolment phase, which takes place offline, and an authentication phase, which is performed online. During the enrolment phase, a set of challenges is run on a device's PUF. A set of challenges could refer to measuring propagation delays over different propagation paths. Due to the presence of noise, these measurements are passed through a suitable encoder to generate helper data. Following that, a verifier (e.g., a server) creates a database where challenge-response pairs (CRPs) are stored along with the corresponding helper data. Next, during the online authentication phase, the verifier sends a random challenge to the device, and the device replies with a new PUF measurement. The authentication is successful if the verifier can regenerate the response saved during enrolment by using the new response and the helper data in its database. Note that, to avoid replay attacks, a CRP should not be re-used. A major advantage of the scheme above is that the device does not need to store any key information and relies only on PUF measurements. Hence, if the device is compromised (e.g., "captured by an enemy"), no useful information can be extracted.
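The two phases can be sketched in Python as follows. Everything here is a toy model under stated assumptions: the PUF is emulated by hashing a per-device byte string (a real PUF has no such stored secret), the helper data uses a code-offset construction with a 3-fold repetition code, and the class and function names are hypothetical.

```python
import hashlib
import random


def encode(bits):
    """3-fold repetition code."""
    return [b for b in bits for _ in range(3)]


def decode(bits):
    """Majority vote over each group of three bits."""
    return [1 if sum(bits[i:i + 3]) >= 2 else 0 for i in range(0, len(bits), 3)]


def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]


class ToyPUF:
    """Emulated PUF: device-specific response bits plus measurement noise."""

    def __init__(self, silicon, flip_prob=0.02):
        self._silicon = silicon            # stands in for manufacturing variation
        self._rng = random.Random(silicon)
        self._flip_prob = flip_prob

    def measure(self, challenge, n_bits=96):
        digest = hashlib.sha256(self._silicon + challenge).digest()
        bits = [(digest[i // 8] >> (i % 8)) & 1 for i in range(n_bits)]
        return [b ^ int(self._rng.random() < self._flip_prob) for b in bits]


class Verifier:
    def __init__(self):
        self._db = {}                      # challenge -> (response, helper data)

    def enrol(self, puf, challenge, rng):
        """Offline phase: store the CRP together with its helper data."""
        response = puf.measure(challenge)
        secret = [rng.getrandbits(1) for _ in range(len(response) // 3)]
        self._db[challenge] = (response, xor(response, encode(secret)))

    def authenticate(self, challenge, new_response):
        """Online phase: regenerate the enrolled response from a noisy one."""
        response, helper = self._db.pop(challenge)   # each CRP is used once
        regenerated = xor(encode(decode(xor(new_response, helper))), helper)
        return regenerated == response


if __name__ == "__main__":
    device, verifier = ToyPUF(b"device-A"), Verifier()
    verifier.enrol(device, b"challenge-1", random.Random(1))
    print("authenticated:", verifier.authenticate(b"challenge-1",
                                                  device.measure(b"challenge-1")))
```

Removing the CRP from the database on first use mirrors the requirement that a CRP is not re-used; a second attempt with the same challenge raises `KeyError` in this sketch.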

B. Location-based authentication
Localization precision is continuously increasing and the goal of 6G technologies is to achieve centimeter level accuracy. Popular approaches for fingerprinting rely on measuring received signal strength (RSS), carrier frequency offsets, I-Q imbalances, CSI measurements and more. This section presents a lightweight example of location-based authentication, through low-complexity proximity estimation.
Consider a mobile low-end device with a single antenna and low computational power. Assume that the device has a map of a premise and knows the location of the access points within this premise. A simple strategy to perform reverse authentication (i.e., the device authenticates an access point) is to move in an unpredictable manner and measure the RSS from multiple positions. As the RSS is strongly related to the distance between devices, this simple approach makes it possible to confirm the location of the access point. Typically, localization would require either the deployment of multiple nodes that measure the RSS simultaneously or advanced hardware/computational capabilities when considering a single device. The approach above does not have such requirements and can still be used as an authentication factor. In fact, the proximity detection described above can provide resilience to impersonation attacks, e.g., in the presence of a malicious access point.
Now, we summarize some open research issues in the direction of using fingerprint-based authentication. A concern that naturally arises is the resilience of such schemes to jamming and man-in-the-middle attacks, in particular, how to cope with interfering transmissions or pilot contamination attacks, both of which can alter the precision of the localization information. Another issue concerns the trustworthiness of the localization information, i.e., depending on whether we operate at short or long distance, the variability of measurements can change, hence bringing uncertainty into the system. Finally, another aspect concerns the type of application where such an approach could be useful. A good example comes from the idea presented above, i.e., reverse authentication. Reverse authentication can help in mitigating attacks that fall into the general category of false base station attacks (which are open issues in 5G). However, we note that all of these concerns must be addressed before location-based authentication technologies are deployed.
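The proximity check behind reverse authentication can be sketched in Python under a simple log-distance path-loss model. The model, its parameters (reference power, path-loss exponent, tolerance) and the coordinates are assumptions made for illustration, not values from the paper.

```python
import math
import random


def expected_rss(pos, ap, p0_dbm=-40.0, exponent=2.0):
    """Log-distance path loss: RSS in dBm at pos from an access point at ap.

    p0_dbm is the assumed RSS at 1 m; distances are clipped at 0.1 m.
    """
    d = max(math.dist(pos, ap), 0.1)
    return p0_dbm - 10.0 * exponent * math.log10(d)


def verify_access_point(positions, measured, claimed_ap, tol_db=3.0):
    """Accept if the measured RSS matches the mapped AP location on average."""
    errors = [abs(m - expected_rss(p, claimed_ap))
              for p, m in zip(positions, measured)]
    return sum(errors) / len(errors) <= tol_db


if __name__ == "__main__":
    rng = random.Random(5)
    mapped_ap, rogue_ap = (0.0, 0.0), (8.0, 6.0)
    # The device moves unpredictably and records the RSS at each position.
    positions = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(6)]
    for name, source in (("legitimate", mapped_ap), ("rogue", rogue_ap)):
        rss = [expected_rss(p, source) + rng.gauss(0, 1.0) for p in positions]
        print(name, "accepted:", verify_access_point(positions, rss, mapped_ap))
```

Because the measurement positions are chosen by the device, a transmitter at a different location cannot easily reproduce the expected RSS pattern at all of them; the tolerance would have to be tuned to the shadowing observed in a given environment.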

C. Multi-factor PLS authentication
A recent publication [14] has shown how three PLS credentials (PUFs, SKG and location fingerprints) can be combined into a multi-factor PLS-based authentication protocol. The proposed scheme uses PUFs as a mutual authentication factor between a mobile node (Alice) and a static server (Bob). The protocol is realized following a typical PUF approach, i.e., following two steps, enrolment and authentication. The use of PUFs provides several security guarantees, including protection against physical and cloning attacks. Next, Alice uses proximity estimation as a second authentication factor. This simple technique reassures her of the legitimacy of Bob and provides resistance to impersonation attacks (e.g., false base station attacks). To provide anonymity for Alice, the scheme introduces one-time alias IDs. After a successful authentication, both parties exchange resumption secrets, following a standard TLS 1.3 procedure. The resumption secrets are used for a fast 0-RTT re-authentication between Alice and Bob, i.e., session resumption (as opposed to performing a full authentication procedure). While the standard approach for session resumption is not forward secure and is vulnerable to replay attacks, the scheme in [14] uses SKG keys to randomize the resumption secrets. It is shown that adding SKG ensures both perfect forward security and resistance against replay attacks.
In general, using the physical layer for authentication is a well investigated topic. Schemes like the one above show that there are already multiple PHY schemes that can contribute to system security. Some of the research problems in the area include the design of high-entropy PUFs and accurate, privacy-preserving location-based authentication.
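The idea of randomizing a resumption secret with a fresh SKG key can be illustrated with a very small Python sketch. This is not the construction of [14]: the HMAC-based mixing, the label and the function name are assumptions chosen only to show that a per-session SKG key makes each derived resumption key different.

```python
import hashlib
import hmac


def randomized_resumption_key(resumption_secret, skg_key,
                              label=b"0-rtt resumption"):
    """Mix a fresh SKG key into the stored resumption secret (illustrative)."""
    return hmac.new(resumption_secret, label + skg_key, hashlib.sha256).digest()


if __name__ == "__main__":
    # Placeholder values; the SKG keys would come from channel measurements.
    secret = hashlib.sha256(b"resumption secret from the full handshake").digest()
    k1 = randomized_resumption_key(secret, b"skg key, session 1")
    k2 = randomized_resumption_key(secret, b"skg key, session 2")
    print("keys differ across sessions:", k1 != k2)
```

Since the derived key changes whenever the channel-derived key changes, a 0-RTT message recorded in one session would not verify under the key of a later session in this sketch.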

VII. CONCLUSIONS AND FUTURE DIRECTIONS
This paper highlights the role that PLS could play in 6G, in view of the evolution in terms of security, with the concepts of trust, context awareness, and quality of security.
6G is expected to introduce new features to communication standards, including sensing, sub-THz communication, massive MIMO, extreme beamforming, learning and actuating, ultra reliable low latency computing and more. While it is still not clear what the transition from 5G to 6G will look like, there is growing interest in the use of semantics, semantic communications, semantic compression, and context awareness in 6G.
Another perspective was introduced with quality of security (QoSec), i.e., different slices of the network have different security and privacy requirements. This brings the need for adaptive security levels. A series of questions arises based on the above: How to define other security levels? How to perform adaptive identity management? How to make an intelligent risk assessment?
PLS emerges as a contender for the next generation of security systems in 6G. One key advantage of PLS is that it is inherently adaptive. This is due to the fact that, in physical layer technologies, the secrecy outage probability can be directly tuned by adjusting the transmission rate.
In particular, wireless channels can be treated as a source of two things: uniqueness and entropy. For example, in a slow flat fading scenario (e.g., LoS), the channel could be treated as a good source of uniqueness. As discussed in Section VI, uniqueness can be easily used for authentication purposes. On the other hand, if the channel changes very fast, due to small scale fading, it could be treated as a good source of entropy. The variability of the channel can then be directly used to either distill keys or perform keyless transmission. An important observation is that if one is not available, e.g., uniqueness, then the other will be, e.g., entropy.
Following the above, an open research question is how to characterize the channel properties and, particularly, which part of the channel should be considered predictable and which unpredictable. It is not an easy question to answer, as it would require the characterization of the channel correlations in the time, frequency and space domains; but it is an important one, as it would allow the alignment of PLS metrics with semantic security metrics.
Finally, we believe it is now time to start defining security levels based on the usage of multiple elements. Here, we list several elements: 1) Criticality of information: how important the information is from the user or the network perspective; 2) Value of information for the attacker: this captures who the attacker is and how much effort they are expected to put into compromising the system; 3) System resilience: this includes the stability and repair time after an attack; 4) Threat level: the usage of context to recognize "abnormal" events (could include location, behavior and communication information); 5) QoS constraint: systems are expected to comply with a particular QoS index.
Today, the deployment of PLS in systems is still lacking traction. However, there is growing interest from industry and academia. This paper shows the potential of PLS for upcoming wireless system designs. It gives concrete examples of use cases for PLS, reaching far beyond encryption, and in doing so shows how PLS can greatly improve the security of 6G networks. For PLS, it is instrumental to characterize and exploit the wireless channel from a security point of view. A key advantage is seen in developing lightweight security solutions for low-latency and massive IoT use cases.

Fig. 2. Achievable rates for the Gaussian broadcast channel considering variable SNR at Eve.

Fig. 4. Upper and lower bounds on the capacity regions for short block length communication. SNR is equal to 0 dB and $\epsilon = 10^{-3}$. (From [20].)

Fig. 7. Trade-off between privacy and usefulness of data.

Fig. 9. Privacy-utility trade-off characterized by measuring wasted energy versus information leakage.

Fig. 14. Alice and Bob have single transmit and receive antennas and exchange pilot signals $X$ over a Rayleigh fading channel $H$. A MiM, Mallory, with multiple transmit antennas injects a pre-coded signal $\mathbf{P} X_J$, such that the received signals at Alice and Bob are equal, $W = \mathbf{H}_A^T \mathbf{P} X_J = \mathbf{H}_B^T \mathbf{P} X_J$.