SECTION I

INTRODUCTION

A LINEAR scrambler is usually used in a communication system to convert a data bit sequence into a pseudorandom sequence that is free from long strings of 1s and 0s. It is easy to implement, there is a wide variety of scrambler polynomials to choose from, and the choice of polynomial has relatively little impact on the performance of the communication system. However, based on the scrambler reconstruction technique detailed in [1], it is found in [2] that not all scrambler polynomials offer equal protection against reconstruction. In this work, we examine further the reconstruction of the feedback polynomial of a linear scrambler, assuming the source bits are encoded with forward error correction coding before being scrambled. The findings of this work are envisaged to aid the design of secure digital communication systems implemented on a flexible platform such as software defined radio (SDR). Our results point out what can be done to prevent reconstruction of a communication system; for example, various scrambler reconstruction techniques were proposed in [1], [2], [3], [4], [5]. The proposed approach will also add to the plethora of techniques for designing an intelligent receiver which can adapt itself to the different building blocks of the transmitter such as those proposed in [6], [7], [8]. It is also an extension of the results and findings on recovery of error-correcting codes which include linear block codes [9], [10], [11] and convolutional codes [12], [13], [14], [15], [16].

There are generally two types of linear scrambler, namely the synchronous scrambler and the self-synchronized scrambler. Both types usually consist of a linear feedback shift register (LFSR) whose output sequence Formula${(s_{t})}_{t\geq 0}$ is combined with the input sequence Formula${(x_{t})}_{t\geq 0}$ and the result is the scrambled sequence Formula${(y_{t})}_{t\geq 0}$, i.e., Formula TeX Source $$y_{t}=x_{t}\oplus s_{t}\quad t\geq 0\eqno{\hbox{(1)}}$$ where Formula$\oplus$ denotes modulo 2 summation. In this paper, for simplicity, only synchronous scramblers are considered. Reconstruction of a synchronous scrambler consists of reconstructing the feedback polynomial of the LFSR as well as its initial state. When some input and scrambled bits are known, the Berlekamp-Massey algorithm [3] can be used to reconstruct the feedback polynomial of the LFSR. In [4], a method is proposed to estimate the initial state of the LFSR from the scrambled sequence only, assuming that the feedback polynomial of the LFSR is also known. Recently, in [1], Cluzeau proposed an algorithm for reconstructing the feedback polynomial of the LFSR by using only the scrambled sequence. In the following, this algorithm will be referred to as Cluzeau's algorithm.
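As an illustration of (1), the following minimal Python sketch implements a synchronous scrambler. The feedback polynomial Formula$P(X)=1+X+X^{4}$, the initial state, and the helper names `lfsr` and `scramble` are assumptions chosen only for this example; they are not taken from the paper.

```python
import random
from itertools import islice

def lfsr(taps, state):
    """Yield s_0, s_1, ... for the recurrence s_t = XOR of s_{t-i}, i in taps.

    `state` holds the first L = max(taps) output bits (s_0, ..., s_{L-1}).
    """
    window = list(state)
    L = max(taps)
    assert len(window) == L and any(window), "state must be nonzero, length L"
    while True:
        yield window[0]
        new = 0
        for i in taps:
            new ^= window[L - i]      # window[L - i] holds s_{t+L-i}
        window = window[1:] + [new]

def scramble(bits, taps, state):
    """Eq. (1): y_t = x_t XOR s_t. Applying it twice restores the input."""
    return [b ^ s for b, s in zip(bits, lfsr(taps, state))]

# Hypothetical example: P(X) = 1 + X + X^4 and a biased source, Pr(x_t = 1) = 0.4.
rng = random.Random(0)
x = [1 if rng.random() < 0.4 else 0 for _ in range(64)]
y = scramble(x, taps=(1, 4), state=[1, 0, 0, 0])
x_back = scramble(y, taps=(1, 4), state=[1, 0, 0, 0])
```

Because a synchronous scrambler generates Formula$s_{t}$ independently of the data, descrambling is the same operation as scrambling, provided the receiver knows both the feedback polynomial and the initial state.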

Although Cluzeau's algorithm is much more efficient than the brute force search algorithm in the recovery of the feedback polynomials of the LFSR, it is based on the critical assumption that the source bits, which xor directly with the outputs of the LFSR, are distributed with a biased probability Formula$\hbox{Pr}(x_{t}=1)=(1/2)-\varepsilon$, where Formula$\varepsilon\neq 0$. Although this assumption usually holds for natural sources, when the source bits pass through a channel encoder before they are scrambled, the bias existing in the bit sequence might become very small. Consequently, the number of bits required to do the reconstruction becomes exorbitantly large. To deal with this problem, in this paper, a scheme is proposed to use the property of “dual words”, which are orthogonal to the codewords generated by the channel encoder, instead of the bias existing in the encoded bit sequence, to achieve reconstruction of the scrambler. It can be observed that by using the proposed scheme, the number of bits required for reconstruction is reduced drastically.

The paper is organized as follows. In Section II, Cluzeau's algorithm is reviewed. In Section III, the bias existing in the encoded bit sequence after a channel encoder is analyzed. In Section IV, the scheme to recover the feedback polynomial as well as the initial state of the LFSR in a linear scrambler placed after a channel encoder is proposed. In Section V, the problem of reconstruction of the scrambler in the presence of channel noise is investigated. Some security propositions are given in the concluding Section VI.

SECTION II

CLUZEAU'S ALGORITHM FOR RECONSTRUCTING A SYNCHRONOUS SCRAMBLER

In a synchronous scrambler, Formula$s_{t}$ is generated independently of Formula$x_{t}$ and Formula$y_{t}$, as shown in Fig. 1.

Figure 1
Fig. 1. Structure of synchronous scrambler.

Instead of brute force searching for the feedback polynomial Formula$P(X)$ directly, Cluzeau's algorithm searches for sparse multiples of Formula$P(X)$ with the degree of the sparse multiples varying from low to high. After two multiples of Formula$P(X)$ are detected, it returns the nontrivial greatest common divisor (gcd) of the two detected multiples as the detected feedback polynomial. The determination of whether a sparse polynomial is a multiple of Formula$P(X)$ or not is based on a statistical test on the absolute value of a variable Formula$Z$, which is given by Formula TeX Source $$Z=\sum_{t=i_{d-1}}^{N-1}(-1)^{z_{t}},\eqno{\hbox{(2)}}$$ where Formula$z_{t}$ is a modulo 2 summation of Formula$d$ scrambled bits, i.e., Formula$z_{t}=y_{t}\oplus\bigoplus_{j=1}^{d-1}y_{t-i_{j}}$, Formula$(0<i_{1}<i_{2}<\cdots<i_{d-1})$, and Formula$N$ is the number of bits required for the reconstruction. Let Formula$Q(X)=1+\sum_{j=1}^{d-1}X^{i_{j}}$. When Formula$Q(X)$ is a multiple of Formula$P(X)$, we have Formula TeX Source $$z_{t}=y_{t}\oplus\bigoplus_{j=1}^{d-1}y_{t-i_{j}}=x_{t}\oplus\bigoplus_{j=1}^{d-1}x_{t-i_{j}}\eqno{\hbox{(3)}}$$ since Formula$s_{t}\oplus\bigoplus_{j=1}^{d-1}s_{t-i_{j}}=0$ and Formula$y_{t}=x_{t}\oplus s_{t}$. According to the statistical analysis results given in [1], Formula$z_{t}$ is biasedly distributed with Formula$\hbox{Pr}(z_{t}=1)=(1/2)[1-(2\varepsilon)^{d}]$, if the input bits are biasedly distributed with Formula$\hbox{Pr}(x_{t}=1)=(1/2)-\varepsilon$, where Formula$\varepsilon\neq 0$. Consequently, the value of Formula$Z$, i.e., Formula$\sum_{t=i_{d-1}}^{N-1}(-1)^{z_{t}}=(N-i_{d-1})-2\sum_{t=i_{d-1}}^{N-1}z_{t}$, is Gaussian distributed with the mean value Formula$\mu$ given by Formula TeX Source $$\mu=(N-i_{d-1})(2\varepsilon)^{d}\eqno{\hbox{(4)}}$$ and the variance Formula$\sigma^{2}$ [5] given by Formula TeX Source $$\sigma^{2}\leq(N-i_{d-1})\left[1+d\left((2\varepsilon)^{2}-(2\varepsilon)^{2d}\right)\right].\eqno{\hbox{(5)}}$$

It can also be shown that when Formula$Q(X)$ is not a multiple of Formula$P(X)$, Formula$\hbox{Pr}(z_{t}=0)=1/2$, implying that Formula$Z$ has a Gaussian distribution with the mean value 0 and the variance Formula$N-i_{d-1}$. The two distributions are depicted in Fig. 2.

From Fig. 2, it can be observed that when the two distributions of Formula$Z$ have a small enough intersection, a threshold Formula$T$ can be used to determine whether Formula$Q(X)$ is a multiple of Formula$P(X)$, i.e., when Formula$\vert Z\vert<T$, Formula$Q(X)$ is not a multiple of Formula$P(X)$; otherwise, Formula$Q(X)$ is a multiple of Formula$P(X)$. The threshold Formula$T$ and the number of bits required for the reconstruction Formula$N$ depend on two factors, i.e., the false-alarm probability Formula$P_{f}$ and the nondetection probability Formula$P_{n}$.
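The statistic in (2) can be illustrated with the following minimal Python sketch. The feedback polynomial Formula$P(X)=1+X^{3}+X^{10}$, the bias 0.3, the sequence length, and the function names are assumptions chosen so that the two cases separate clearly in a short run; they are not values from the paper.

```python
import random

def lfsr_bits(taps, state, count):
    """First `count` outputs of the LFSR s_t = XOR of s_{t-i} for i in taps."""
    window, L, out = list(state), max(taps), []
    for _ in range(count):
        out.append(window[0])
        new = 0
        for i in taps:
            new ^= window[L - i]
        window = window[1:] + [new]
    return out

def z_statistic(y, delays):
    """Z of Eq. (2) for Q(X) = 1 + sum_j X^{i_j}, delays = (i_1, ..., i_{d-1})."""
    total = 0
    for t in range(delays[-1], len(y)):
        z = y[t]
        for i in delays:
            z ^= y[t - i]
        total += 1 - 2 * z            # equals (-1)^z for z in {0, 1}
    return total

rng = random.Random(7)
N, eps = 20000, 0.3                   # illustrative values only
x = [1 if rng.random() < 0.5 - eps else 0 for _ in range(N)]
s = lfsr_bits((3, 10), [1] + [0] * 9, N)   # assumed P(X) = 1 + X^3 + X^10
y = [a ^ b for a, b in zip(x, s)]

z_multiple = z_statistic(y, (3, 10))  # Q(X) = P(X), a (trivial) multiple
z_other = z_statistic(y, (3, 9))      # degree 9 < 10, so not a multiple of P(X)
```

Under these assumptions, `z_multiple` should concentrate around the mean in (4), while `z_other` should stay close to 0, which is the separation that the threshold Formula$T$ exploits.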

Figure 2
Fig. 2. Distributions of Z.

Let Formula TeX Source $$a=\Phi^{-1}\left(1-{P_{f}\over 2}\right)={T\over\sqrt{N-i_{d-1}}}\eqno{\hbox{(6)}}$$ and Formula TeX Source $$b=-\Phi^{-1}(P_{n})={\vert\mu\vert-T\over\sigma},\eqno{\hbox{(7)}}$$ where Formula$\Phi$ denotes the standard normal cumulative distribution function. From (6) and (7), it can be derived that the threshold Formula$T$ is Formula TeX Source $$T={a(a+b\bar{\sigma}_{l})\over\left(2\vert\varepsilon\vert\right)^{d}},\eqno{\hbox{(8)}}$$ and the number of bits required for the reconstruction is Formula TeX Source $$N=i_{d-1}+{(a+b\bar{\sigma}_{l})^{2}\over(2\varepsilon)^{2d}},\eqno{\hbox{(9)}}$$ where Formula$\bar{\sigma}_{l}$ is the normalized upper bound of Formula$\sigma$, which is given by Formula TeX Source $$\bar{\sigma}_{l}=\sqrt{\left[1+d\left((2\varepsilon)^{2}-(2\varepsilon)^{2d}\right)\right]}.\eqno{\hbox{(10)}}$$ A more detailed description of Cluzeau's algorithm can be found in [1] and [5].
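For a numerical feel of (6) to (10), the following Python sketch evaluates the threshold Formula$T$ and the required number of bits Formula$N$ for given design probabilities. The function name `cluzeau_parameters` and the sample inputs are assumptions made for illustration; only the formulas come from the text above.

```python
from math import sqrt
from statistics import NormalDist

def cluzeau_parameters(eps, d, i_last, p_f, p_n):
    """Return (T, N) from Eqs. (6)-(10).

    eps    : source bias, Pr(x_t = 1) = 1/2 - eps
    d      : number of terms of the sparse multiple Q(X)
    i_last : largest delay i_{d-1}
    p_f    : false-alarm probability
    p_n    : nondetection probability
    """
    std_normal = NormalDist()
    a = std_normal.inv_cdf(1 - p_f / 2)                       # Eq. (6)
    b = -std_normal.inv_cdf(p_n)                              # Eq. (7)
    sigma_bar = sqrt(1 + d * ((2 * eps) ** 2 - (2 * eps) ** (2 * d)))  # Eq. (10)
    two_eps_d = (2 * abs(eps)) ** d
    T = a * (a + b * sigma_bar) / two_eps_d                   # Eq. (8)
    N = i_last + (a + b * sigma_bar) ** 2 / two_eps_d ** 2    # Eq. (9)
    return T, N

# Illustrative inputs (assumed): eps = 0.1, d = 3, i_{d-1} = 100.
T, N = cluzeau_parameters(eps=0.1, d=3, i_last=100, p_f=1e-7, p_n=1e-5)
```

Since Formula$N$ grows as Formula$(2\varepsilon)^{-2d}$, halving the bias with Formula$d=3$ multiplies the dominant term by roughly 64, which is why a small bias makes the reconstruction so costly.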

SECTION III

BIAS AFTER CHANNEL ENCODER

In many communication systems, error correcting codes are used to combat errors introduced by the communication channel. In this work, we consider the case in which the channel encoder is placed between the source and the scrambler, as shown in Fig. 3.

In the following, the bias existing in the encoded bit sequence after a channel encoder will be analyzed. Two commonly used error correcting codes are considered, i.e., linear block code and convolutional code.

A. Bias of a Bit Sequence After a Linear Block Encoder

Figure 3
Fig. 3. Chain of scrambler and channel encoder.

Generally, for a Formula$(n,k)$ binary linear block code Formula${\cal C}$, where Formula$k$ is the number of information bits and Formula$n$ is the number of coded bits, a Formula$k\times n$ generator matrix can be defined by the following Formula$k\times n$ array: Formula TeX Source $${\bf G}_{\bf b}=\left[\matrix{{\bf g}_{\bf 0}\cr{\bf g}_{\bf 1}\cr\vdots\cr{\bf g}_{{\bf k}-{\bf 1}}}\right]=\left[\matrix{g_{0,0}&g_{0,1}&\cdots&g_{0,n-1}\cr g_{1,0}&g_{1,1}&\cdots&g_{1,n-1}\cr\vdots&&&\cr g_{k-1,0}&g_{k-1,1}&\cdots&g_{k-1,n-1}}\right],\eqno{\hbox{(11)}}$$ where Formula$g_{i,j}=(0\ \hbox{or}\ 1)$ and Formula${\bf g}_{\bf 0},{\bf g}_{\bf 1}\ldots{\bf g}_{{\bf k}-{\bf 1}}$ are linearly independent Formula$n$-tuples that form a basis for Formula${\cal C}$. Considering a Formula$k$-tuple message, i.e., Formula TeX Source $${\bf x}=(x_{0},x_{1},\ldots,x_{k-1}),$$ the encoder transforms the message Formula${\bf x}$ independently into an Formula$n$-tuple codeword Formula${\bf c}=(c_{0},c_{1},\ldots,c_{n-1})$ by Formula TeX Source $${\bf c}={\bf x}\cdot{\bf G}_{\bf b}=(x_{0},x_{1},\ldots,x_{k-1})\left[\matrix{{\bf g}_{\bf 0}\cr{\bf g}_{\bf 1}\cr\vdots\cr{\bf g}_{{\bf k}-{\bf 1}}}\right].\eqno{\hbox{(12)}}$$ Any encoded bit Formula$c_{i}$ Formula$(i=0,1,\ldots,n-1)$ can be written as a linear binary summation of the message bits, i.e., Formula TeX Source $$c_{i}=g_{0,i}x_{0}\oplus g_{1,i}x_{1}\oplus\cdots\oplus g_{k-1,i}x_{k-1}.\eqno{\hbox{(13)}}$$

Suppose the source bit sequence is produced by a biased and memoryless source with bias Formula$\varepsilon$, and the number of nonzero terms (the weight) in the Formula$i$th column of Formula${\bf G}_{\bf b}$ is Formula$L_{i}$ Formula$(i=0,1,\ldots,n-1)$, then the probability that Formula$c_{i}=1$ is given by Formula TeX Source $$\eqalignno{{\rm Pr}(c_{i}=1)=&\,\sum_{l=1,3,\ldots}^{L_{i}}{L_{i}\choose l}\left({1\over 2}-\varepsilon\right)^{l}\left({1\over 2}+\varepsilon\right)^{L_{i}-l}\cr=&\,{1\over 2}\left[1-(2\varepsilon)^{L_{i}}\right].&\hbox{(14)}}$$ According to (14), the bias existing in the Formula$i$th encoded bit Formula$c_{i}$ is Formula$\varepsilon_{c_{i}}=(1/2)(2\varepsilon)^{L_{i}}$. As Formula$L_{i}\geq 1$ and Formula$\vert\varepsilon\vert\leq 0.5$, we have Formula$\vert\varepsilon_{c_{i}}\vert\leq\vert\varepsilon\vert$. For Formula$\varepsilon\geq 0$, the bias existing in the whole encoded bit sequence, Formula$\varepsilon_{bc}$, can be expressed by Formula TeX Source $$\varepsilon_{bc}={1\over n}\sum_{i=0}^{n-1}\varepsilon_{c_{i}}\leq{1\over n}\sum_{i=0}^{n-1}\varepsilon=\varepsilon.\eqno{\hbox{(15)}}$$ From the above equation, it can be observed that the bias existing in the encoded bit sequence is less than or equal to the bias existing in the bit sequence before the encoder. Consider a systematic encoder, for which Formula$L_{0}=L_{1}=\cdots=L_{k-1}=1$ and Formula$L_{k},L_{k+1},\cdots,L_{n-1}>1$. The bias existing in the encoded bit sequence can be roughly estimated by Formula TeX Source $$\varepsilon_{bc}={1\over n}\sum_{i=0}^{n-1}\varepsilon_{c_{i}}={k\over n}\varepsilon+{1\over 2n}\sum_{i=k}^{n-1}(2\varepsilon)^{L_{i}}\approx{k\over n}\varepsilon.\eqno{\hbox{(16)}}$$
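The closed form in (14) and the average in (16) can be checked numerically with the short Python sketch below. The column weights used for the systematic Hamming (7,4) example are an assumption (they correspond to one common choice of generator matrix), and the function names are illustrative.

```python
from itertools import product

def bit_bias_exact(eps, weight):
    """Bias of the XOR of `weight` i.i.d. bits with Pr(x = 1) = 1/2 - eps.

    Computed by brute-force enumeration of all input patterns, i.e. the
    sum over odd l in Eq. (14), without using the closed form.
    """
    p1 = 0.5 - eps
    prob_one = 0.0
    for bits in product((0, 1), repeat=weight):
        ones = sum(bits)
        if ones % 2 == 1:
            prob_one += p1 ** ones * (1 - p1) ** (weight - ones)
    return 0.5 - prob_one

def code_bias(eps, column_weights):
    """Average bias over the n coded bits, first equality of Eq. (16)."""
    return sum(0.5 * (2 * eps) ** L for L in column_weights) / len(column_weights)

# Assumed column weights of a systematic Hamming (7,4) generator matrix:
# four systematic columns of weight 1 and three parity columns of weight 3.
hamming_weights = [1, 1, 1, 1, 3, 3, 3]
eps_bc = code_bias(0.1, hamming_weights)
approx = 4 / 7 * 0.1                      # (k / n) * eps
```

With these assumed weights the parity columns contribute only Formula$3\times 0.004/7$ to the average, so the estimate Formula$(k/n)\varepsilon$ is already close to the exact value.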

To verify (16), the bias existing in the output bit sequences of several BCH encoders is obtained by computer simulation, and the results are shown in Table I. In each simulation, a bit sequence which contains Formula$10000\times k$ information bits is input into a BCH encoder (systematic encoder) and the simulation is repeated 100 times. The bias existing in the bit sequence before the encoder is set to 0.1. From Table I, it can be observed that the bias after the BCH encoder determined by the simulation results matches very well with that computed by (16).

Table 1
TABLE I BIAS AFTER SOME BCH ENCODERS

B. Bias of a Bit Sequence After a Convolutional Encoder

An Formula$(n,k,m)$ convolutional code, where Formula$k$ is the number of information bits, Formula$n$ is the number of coded bits and Formula$m$ is the constraint length, can be defined by a Formula$k\times n$ generator matrix Formula${\bf G}_{\bf c}$ which consists of Formula$k\times n$ binary “impulse responses” Formula${\bf g}_{\bf j}^{({\bf i})}$, where Formula$i$ denotes the Formula$i$th input Formula$(0\leq i<k)$ and Formula$j$ denotes the Formula$j$th output Formula$(0\leq j<n)$, i.e., Formula TeX Source $${\bf G}_{\bf c}\!=\!({\bf G}_{\bf 0},\ldots,{\bf G}_{{\bf n}-{\bf 1}})\!=\!\left[\matrix{{\bf g}_{\bf 0}^{({\bf 0})}&{\bf g}_{\bf 1}^{({\bf 0})}&\cdots&{\bf g}_{{\bf n}-{\bf 1}}^{({\bf 0})}\cr{\bf g}_{\bf 0}^{({\bf 1})}&{\bf g}_{\bf 1}^{({\bf 1})}&\cdots&{\bf g}_{{\bf n}-{\bf 1}}^{({\bf 1})}\cr\vdots&\vdots&\ddots&\vdots\cr{\bf g}_{\bf 0}^{({\bf k}-{\bf 1})}&{\bf g}_{\bf 1}^{({\bf k}-{\bf 1})}&\cdots&{\bf g}_{{\bf n}-{\bf 1}}^{({\bf k}-{\bf 1})}}\right],\eqno{\hbox{(17)}}$$ where Formula TeX Source $${\bf g}_{\bf j}^{({\bf i})}=\left(g_{j}^{(i)}(0),g_{j}^{(i)}(1),\ldots,g_{j}^{(i)}(m-1)\right).\eqno{\hbox{(18)}}$$ Supposing the bit sequence at the Formula$i$th input of the convolutional encoder is Formula${\bf x}_{\bf i}=(x_{i,0},x_{i,1},\ldots)$, the bit sequence at the Formula$j$th output is given by Formula TeX Source $${\bf c}_{\bf j}={\bf x}_{\bf 0}\ast{\bf g}_{\bf j}^{({\bf 0})}\oplus{\bf x}_{\bf 1}\ast{\bf g}_{\bf j}^{({\bf 1})}\oplus\cdots\oplus{\bf x}_{{\bf k}-{\bf 1}}\ast{\bf g}_{\bf j}^{({\bf k}-{\bf 1})}=\sum_{i=0}^{k-1}{\bf x}_{\bf i}\ast{\bf g}_{\bf j}^{({\bf i})},\eqno{\hbox{(19)}}$$

Figure 4
Fig. 4. Dot product of a dual word of a linear block code with the received bit sequence.

where ∗ is the convolution operation. Suppose the number of nonzero terms in Formula${\bf g}_{\bf j}^{({\bf i})}$ is Formula$\mathtilde{L}_{i,j}$, then the bias of the whole encoded bit sequence, Formula$\varepsilon_{cc}$, can be expressed as Formula TeX Source $$\varepsilon_{cc}={1\over kn}\sum_{i=0}^{k-1}\sum_{j=0}^{n-1}{1\over 2}(2\varepsilon)^{\mathtilde{L}_{i,j}}.\eqno{\hbox{(20)}}$$
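The estimate in (20) can be compared against a small Monte Carlo run, as in the following Python sketch for the case Formula$k=1$. The rate 1/2 generators Formula$(1+D^{2},\ 1+D+D^{2})$, the sample size, and the function names are assumptions chosen for illustration and are not the codes of Table II.

```python
import random

def conv_encode(x, generators):
    """Rate-1/n feedforward convolutional encoder.

    Each generator is a tuple of taps (g(0), ..., g(m-1)); the n output
    bits produced for each input bit are interleaved in the result.
    """
    m = len(generators[0])
    reg = [0] * m                     # reg[0] is the newest input bit
    out = []
    for bit in x:
        reg = [bit] + reg[:-1]
        for g in generators:
            c = 0
            for tap, r in zip(g, reg):
                c ^= tap & r
            out.append(c)
    return out

def predicted_bias(eps, generators):
    """Eq. (20) for k = 1: average of (1/2)(2 eps)^weight over the n outputs."""
    return sum(0.5 * (2 * eps) ** sum(g) for g in generators) / len(generators)

generators = [(1, 0, 1), (1, 1, 1)]   # assumed example code, weights 2 and 3
eps = 0.1
rng = random.Random(1)
x = [1 if rng.random() < 0.5 - eps else 0 for _ in range(200000)]
c = conv_encode(x, generators)
measured = 0.5 - sum(c) / len(c)      # empirical bias of the coded stream
predicted = predicted_bias(eps, generators)
```

For this assumed code the predicted bias is Formula$(0.02+0.004)/2=0.012$, roughly an order of magnitude below the source bias of 0.1, and the measured value should agree up to sampling noise.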

Table 2
TABLE II BIAS AFTER SOME RATE 1/2 CONVOLUTIONAL ENCODERS

To verify (20), the bias existing in the bit sequences after some optimum rate 1/2 convolutional encoders [17] is obtained by computer simulation, and the results are shown in Table II. In each simulation, a bit sequence which contains 1,000,000 information bits is input into a convolutional encoder and the simulation is repeated 1000 times. The bias existing in the bit sequence before the encoder is set to 0.1.

From Table II, it can again be observed that in general, the bias existing in the bit sequence after the sequence has passed through a convolutional encoder is very low as Formula$\mathtilde{L}_{i,j}$ is normally Formula$>2$.

SECTION IV

RECONSTRUCTION OF THE SCRAMBLER AFTER A CHANNEL CODE

In the last section, our analysis shows that after passing through a channel encoder, the bias existing in the bit sequence drops, especially when convolutional codes are used. In this section, a novel scheme for reconstruction of the feedback polynomial and initial state of the LFSR in a scrambler which is placed after a channel encoder is proposed. This scheme exploits the property of dual words instead of the bias existing in the encoded bit sequence. In the following, the reconstruction of the scrambler placed after a linear block code will be considered first and after that, the proposed scheme will be extended to the case of convolutional code.

A. Reconstruction of the Scrambler After Linear Block Code

1) Reconstruction of the Feedback Polynomial of the LFSR

Consider a Formula$(n,k)$ binary linear block code Formula${\cal C}$ with Formula$k\times n$ generator matrix Formula${\bf G}_{\bf b}$. Rows in Formula${\bf G}_{\bf b}$ form a basis for Formula${\cal C}$. The parity-check matrix for Formula${\cal C}$ is a Formula$(n-k)\times n$ matrix Formula${\bf H}_{\bf b}$ whose rows span the dual code Formula${\cal C}^{\perp}$, i.e., Formula TeX Source $${\bf H}_{\bf b}=\left[\matrix{h_{0,0}&h_{0,1}&\cdots&h_{0,n-1}\cr h_{1,0}&h_{1,1}&\cdots&h_{1,n-1}\cr\vdots&&&\cr h_{n-k-1,0}&h_{n-k-1,1}&\cdots&h_{n-k-1,n-1}}\right]\eqno{\hbox{(21)}}$$ and Formula${\bf G}_{\bf b}\cdot{\bf H}_{\bf b}^{T}=0$. Formula${\bf h}_{\bf 0}$, Formula${\bf h}_{\bf 1},\ldots,{\bf h}_{{\bf n}-{\bf k}-{\bf 1}}$ denote rows Formula$0,1,\ldots,n-k-1$ in Formula${\bf H}_{\bf b}$ and they are called dual words of Formula${\cal C}$.

To use the property of dual words to reconstruct the feedback polynomial of the LFSR, first, the received bit sequence Formula${\bf y}=(y_{0},y_{1},\ldots)$ is divided into blocks Formula${\bf y}_{\bf 0},{\bf y}_{\bf 1},\ldots$, with each block containing Formula$n$ bits, i.e., Formula${\bf y}_{\bf t}=(y_{nt},y_{nt+1},\ldots,y_{n(t+1)-1})$. Then, a new sequence Formula${\bf r}=(r_{0},r_{1},\ldots)$ can be generated, in which each bit Formula$r_{t}$ is the dot product of Formula${\bf y}_{\bf t}$ with a dual word, say Formula${\bf h}_{\bf 0}$, as shown in Fig. 4.

From Fig. 4, it can be seen that Formula TeX Source $$\eqalignno{r_{0}=&\,{\bf y}_{\bf 0}\cdot{\bf h}_{\bf 0}=\sum_{i=0}^{n-1}y_{i}\cdot h_{0,i}\cr=&\,y_{0}\cdot h_{0,0}\oplus y_{1}\cdot h_{0,1}\oplus\cdots\oplus y_{n-1}\cdot h_{0,n-1}\cr r_{1}=&\,{\bf y}_{\bf 1}\cdot{\bf h}_{\bf 0}=\sum_{i=0}^{n-1}y_{n+i}\cdot h_{0,i}\cr=&\,y_{n}\cdot h_{0,0}\oplus y_{n+1}\cdot h_{0,1}\oplus\cdots\oplus y_{2n-1}\cdot h_{0,n-1}\cr&{\hskip-15pt}\vdots.&\hbox{(22)}}$$

As Formula${\bf y}_{\bf t}={\bf c}_{\bf t}\oplus{\bf s}_{\bf t}$, Formula$(t=0,1,2\ldots)$, where Formula${\bf c}_{\bf t}$ is the Formula$n$-tuple codeword at time index Formula$t$ and Formula${\bf s}_{\bf t}=(s_{nt},s_{nt+1},\ldots,s_{n(t+1)-1})$ are the outputs of the scrambler, we have Formula TeX Source $$r_{t}={\bf y}_{\bf t}\cdot{\bf h}_{\bf 0}={\bf c}_{\bf t}\cdot{\bf h}_{\bf 0}\oplus{\bf s}_{\bf t}\cdot{\bf h}_{\bf 0}.\eqno{\hbox{(23)}}$$ According to the property of dual words, Formula${\bf c}_{\bf t}\cdot{\bf h}_{\bf 0}=0$; therefore, Formula$r_{t}$ can be written as Formula TeX Source $$r_{t}={\bf y}_{\bf t}\cdot{\bf h}_{\bf 0}={\bf s}_{\bf t}\cdot{\bf h}_{\bf 0},\eqno{\hbox{(24)}}$$ i.e., Formula TeX Source $$\eqalignno{r_{0}=&\,s_{0}\cdot h_{0,0}\oplus s_{1}\cdot h_{0,1}\oplus\cdots\oplus s_{n-1}\cdot h_{0,n-1}\cr r_{1}=&\,s_{n}\cdot h_{0,0}\oplus s_{n+1}\cdot h_{0,1}\oplus\cdots\oplus s_{2n-1}\cdot h_{0,n-1}\cr&{\hskip-15pt}\vdots.&\hbox{(25)}}$$
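The cancellation expressed by (23) and (24) can be observed directly in the following minimal Python sketch. The systematic Hamming (7,4) generator matrix `G`, the dual word `H0`, and the LFSR with Formula$P(X)=1+X+X^{4}$ are assumptions used only for this demonstration.

```python
import random

# Assumed systematic Hamming (7,4) generator rows and one parity-check row.
G = [(1, 0, 0, 0, 1, 1, 0),
     (0, 1, 0, 0, 1, 0, 1),
     (0, 0, 1, 0, 0, 1, 1),
     (0, 0, 0, 1, 1, 1, 1)]
H0 = (1, 1, 0, 1, 1, 0, 0)            # dual word h_0, orthogonal to every row of G

def dot(u, v):
    """Dot product over GF(2)."""
    return sum(a & b for a, b in zip(u, v)) % 2

def encode(msg):
    """Codeword c = x * G over GF(2), Eq. (12)."""
    return [dot(msg, col) for col in zip(*G)]

def lfsr_bits(taps, state, count):
    """First `count` outputs of the LFSR s_t = XOR of s_{t-i} for i in taps."""
    window, L, out = list(state), max(taps), []
    for _ in range(count):
        out.append(window[0])
        new = 0
        for i in taps:
            new ^= window[L - i]
        window = window[1:] + [new]
    return out

n = 7
rng = random.Random(3)
msgs = [[rng.randint(0, 1) for _ in range(4)] for _ in range(20)]
c = [bit for m in msgs for bit in encode(m)]          # coded stream
s = lfsr_bits((1, 4), [1, 0, 0, 0], len(c))           # scrambler stream
y = [a ^ b for a, b in zip(c, s)]                     # received stream

# r_t computed from the received blocks, and from the LFSR blocks alone.
r_from_y = [dot(y[n * t:n * (t + 1)], H0) for t in range(len(msgs))]
r_from_s = [dot(s[n * t:n * (t + 1)], H0) for t in range(len(msgs))]
```

The two lists coincide whatever the message bits are, which is the content of (24): the sequence Formula${\bf r}$ depends only on the LFSR output and the dual word.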

Proposition 1

For a set of Formula$d-1$ integers Formula$(0<i_{1}<i_{2}<\cdots<i_{d-1})$, if Formula$r_{t}\oplus r_{t-i_{1}}\oplus r_{t-i_{2}}\oplus\cdots\oplus r_{t-i_{d-1}}\equiv 0$ for any Formula$t\geq i_{d-1}$, then Formula$1+X^{ni_{1}}+X^{ni_{2}}+\cdots+X^{ni_{d-1}}$ is a multiple of the feedback polynomial Formula$P(X)$.

Proof

According to (24), Formula$r_{t}$ can be written as Formula TeX Source $$r_{t}=s_{nt}\cdot h_{0,0}\oplus s_{nt+1}\cdot h_{0,1}\oplus\cdots\oplus s_{n(t+1)-1}\cdot h_{0,n-1}.\eqno{\hbox{(26)}}$$ Similarly, Formula TeX Source $$\eqalignno{r_{t-i_{1}}=&\,s_{n(t-i_{1})}\cdot h_{0,0}\oplus s_{n(t-i_{1})+1}\cdot h_{0,1}\oplus\cr&\cdots\oplus s_{n(t-i_{1}+1)-1}\cdot h_{0,n-1}\cr&{\hskip-15pt}\vdots\cr r_{t-i_{d-1}}=&\,s_{n(t-i_{d-1})}\cdot h_{0,0}\oplus s_{n(t-i_{d-1})+1}\cdot h_{0,1}\oplus\cr&\cdots\oplus s_{n(t-i_{d-1}+1)-1}\cdot h_{0,n-1}.&\hbox{(27)}}$$ Therefore, Formula TeX Source $$\eqalignno{&r_{t}\!\oplus\!r_{t-i_{1}}\!\oplus\!r_{t-i_{2}}\!\oplus\!\cdots\!\oplus\!r_{t-i_{d-1}}\cr&\quad\!=\!\left(s_{nt}\!\oplus\!s_{nt-n{i_{1}}}\!\oplus\!\cdots\!\oplus\!s_{nt-ni_{d-1}}\right)\cdot h_{0,0}\cr&\qquad\oplus\left(s_{nt+1}\!\oplus\!s_{nt+1-n{i_{1}}}\!\oplus\!\cdots\!\oplus\!s_{nt+1-ni_{d-1}}\right)\cdot h_{0,1}\!\oplus\!\cdots\cr&\qquad\oplus\left(s_{n(t+1)-1}\!\oplus\!s_{n(t+1)-1-ni_{1}}\!\oplus\!\cdots\!\oplus\!s_{n(t+1)-1-ni_{d-1}}\right)\cr&\qquad\cdot h_{0,n-1}.&\hbox{(28)}}$$

As Formula${\bf h}_{\bf 0}$ is a dual word, Formula$h_{0,0},h_{0,1},\ldots,h_{0,n-1}$ cannot all be 0. Therefore, Formula$r_{t}\oplus r_{t-i_{1}}\oplus r_{t-i_{2}}\oplus\cdots\oplus r_{t-i_{d-1}}\equiv 0$ only holds when Formula$s_{k}\oplus s_{k-n{i_{1}}}\oplus\cdots\oplus s_{k-ni_{d-1}}\equiv 0$, where Formula$k$ here denotes a bit index of the LFSR output rather than the code dimension, i.e., Formula$s_{k}\equiv s_{k-n{i_{1}}}\oplus\cdots\oplus s_{k-ni_{d-1}}$. This means that Formula$1+X^{ni_{1}}+X^{ni_{2}}+\cdots+X^{ni_{d-1}}$ is a multiple of the feedback polynomial Formula$P(X)$. Formula$\hfill\square$

It is interesting to note that since the encoded bits are removed according to (24), the sequence Formula${\bf r}$ can be taken as a combination of some Formula$n$th decimated sequences of the original sequence produced by the LFSR. Some properties of such a decimated sequence have been found in [19]. In fact, Proposition 1 can also be proved by using the properties of the decimated sequence given in [19].

From Proposition 1, it can be observed that when the sequence Formula${\bf r}$ is obtained, Cluzeau's algorithm, with only minor changes, can be applied to Formula${\bf r}$ to find the feedback polynomial of the LFSR. In the following, the scheme to determine the feedback polynomial of the LFSR in a scrambler placed after a channel encoder is described:

  1. Divide the received bit sequence Formula${\bf y}=(y_{0},y_{1},\ldots)$ into blocks Formula${\bf y}_{\bf 0},{\bf y}_{\bf 1},\ldots$, with each block containing Formula$n$ bits.
  2. Generate a new bit sequence Formula${\bf r}$, in which each bit Formula$r_{t}$ is the dot product of the received block with a dual word.
  3. For Formula$(i_{1},\ldots,i_{d-1})$, Formula$0<i_{1}<\ldots<i_{d-1}\leq D$, compute the number of bits in Formula${\bf r}$, Formula$N_{r}$, required for the summation of Formula$\mathtilde{Z}$. How to compute Formula$N_{r}$ will be described later. Let Formula$N_{c}=i_{d-1}+N_{r}$.
  4. Initialize Formula$\mathtilde{Z}$ with Formula$\mathtilde{Z}=0$.
  5. For Formula$t$ varying from Formula$i_{d-1}+1$ to Formula$N_{c}$, compute Formula TeX Source $$\mathtilde{z}_{t}=r_{t}\oplus\bigoplus_{j=1}^{d-1}r_{t-i_{j}}\eqno{\hbox{(29)}}$$ and Formula TeX Source $$\mathtilde{Z}=\mathtilde{Z}+(-1)^{\mathtilde{z}_{t}}\eqno{\hbox{(30)}}$$
  6. If Formula$\mathtilde{Z}=N_{r}$, store Formula$Q(X)=1+\sum_{j=1}^{d-1}X^{i_{j}\cdot n}$ in a table.
  7. For Formula$Q^{\prime}(X)\neq Q(X)$ in the table, compute the nontrivial greatest common divisor (gcd) of Formula$(Q(X),Q^{\prime}(X))$.
Figure 5
Fig. 5. Distributions of Formula$\mathtilde{Z}$.

Steps 3 to 7 are repeated for each combination of Formula$(i_{1},\ldots,i_{d-1})$ until a Formula${\rm gcd}(Q(X),Q^{\prime}(X))=P(X)$ Formula$(P(X)\neq 1)$ is found or all combinations of Formula$(i_{1},\ldots,i_{d-1})$ are tested.
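The scheme listed above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Hamming (7,4) code, the dual word `H0`, the LFSR with Formula$P(X)=1+X+X^{4}$, the search bound Formula$D=10$ and all function names are hypothetical choices. Polynomials over GF(2) are stored as integer bit masks (bit Formula$e$ set means the term Formula$X^{e}$ is present), and the gcd of step 7 is folded over all stored multiples.

```python
import random
from itertools import combinations

G = [(1, 0, 0, 0, 1, 1, 0), (0, 1, 0, 0, 1, 0, 1),   # assumed Hamming (7,4) code
     (0, 0, 1, 0, 0, 1, 1), (0, 0, 0, 1, 1, 1, 1)]
H0 = (1, 1, 0, 1, 1, 0, 0)                            # assumed dual word

def dot(u, v):
    return sum(a & b for a, b in zip(u, v)) % 2

def lfsr_bits(taps, state, count):
    window, L, out = list(state), max(taps), []
    for _ in range(count):
        out.append(window[0])
        new = 0
        for i in taps:
            new ^= window[L - i]
        window = window[1:] + [new]
    return out

def poly_mod(a, b):
    """Remainder of a divided by b for GF(2) polynomials stored as ints."""
    while a and a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def poly_gcd(a, b):
    while b:
        a, b = b, poly_mod(a, b)
    return a

def find_multiples(y, n, h, d, D, n_r):
    blocks = [y[i:i + n] for i in range(0, len(y) - n + 1, n)]   # step 1
    r = [dot(blk, h) for blk in blocks]                          # step 2
    found = []
    for idx in combinations(range(1, D + 1), d - 1):             # step 3
        i_last = idx[-1]
        if i_last + n_r > len(r):
            continue                                             # not enough data
        Z = 0                                                    # step 4
        for t in range(i_last, i_last + n_r):                    # step 5 (0-based t)
            z = r[t]
            for i in idx:
                z ^= r[t - i]
            Z += 1 - 2 * z
        if Z == n_r:                                             # step 6
            found.append(1 | sum(1 << (i * n) for i in idx))
    return found

rng = random.Random(0)
msgs = [[1 if rng.random() < 0.4 else 0 for _ in range(4)] for _ in range(80)]
coded = [dot(m, col) for m in msgs for col in zip(*G)]
s = lfsr_bits((1, 4), [1, 0, 0, 0], len(coded))       # assumed P(X) = 1 + X + X^4
y = [c ^ k for c, k in zip(coded, s)]

multiples = find_multiples(y, n=7, h=H0, d=3, D=10, n_r=50)
common = 0
for q in multiples:                                   # step 7
    common = poly_gcd(common, q)
```

In this toy setting every stored Formula$Q(X)$ is divisible by Formula$P(X)$, so `common` is a multiple of Formula$P(X)$ that would still have to be factorized to isolate the feedback polynomial.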

The scheme proposed above is based on the fact that if Formula$Q(X)=1+\sum_{j=1}^{d-1}X^{i_{j}\cdot n}$ is a multiple of the feedback polynomial, Formula$\mathtilde{z}_{t}$ will always be 0 for Formula$t$ varying from Formula$i_{d-1}+1$ to Formula$N_{c}$, and therefore, the value of Formula$\mathtilde{Z}$ should be Formula$N_{c}-i_{d-1}=N_{r}$. If Formula$Q(X)=1+\sum_{j=1}^{d-1}X^{i_{j}\cdot n}$ is not a multiple of the feedback polynomial, Pr Formula$(\mathtilde{z}_{t}=1)=0.5$ and Formula$\mathtilde{Z}$ will be Gaussian distributed with the mean value 0 and the variance Formula$N_{r}$. The distribution of Formula$\mathtilde{Z}$ is shown in Fig. 5.

Similar to Cluzeau's algorithm, the number of bits in Formula${\bf r}$ used in the summation of Formula$\mathtilde{Z},N_{r}$, will affect the false-alarm probability Formula$P_{f}$ and nondetection probability Formula$P_{n}$. As shown in Fig. 5, the value of Formula$\mathtilde{Z}$ is always equal to Formula$N_{r}$ when Formula$Q(X)$ is a multiple of Formula$P(X)$. That means Formula$P_{n}=0$ when the proposed scheme is used. The false-alarm can happen only when Formula$\mathtilde{Z}=N_{r}$ but Formula$Q(X)$ is not a multiple of Formula$P(X)$, and the probability is given by Formula TeX Source $$\eqalignno{P_{f}\!=\!&\,{\rm Pr}\left(\mathtilde{Z}\!=\!N_{r}\vert{\rm when}\ Q(X)\ {\rm is}\ {\rm not}\ {\rm a}\ {\rm multiple}\ {\rm of}\ P(X)\right)\cr\!=\!&\,{1\over\sqrt{2\pi\sigma^{2}}}e^{-{(N_{r}-\mu)^{2}\over 2\sigma^{2}}}\vert_{\mu\!=\!0,\sigma^{2}\!=\!N_{r}}\cr\!=\!&\,{1\over\sqrt{2\pi N_{r}}}e^{-{N_{r}\over 2}}.&\hbox{(31)}}$$

Table 3
TABLE III SIMULATION RESULTS FOR RECONSTRUCTION OF SCRAMBLERS PLACED AFTER LINEAR BLOCK CODES

It can be observed that a small value of Formula$N_{r}$, say 50, can already make Formula$P_{f}<10^{-10}$. The total number of bits in Formula${\bf r}$ used in the reconstruction is Formula$i_{d-1}+N_{r}$. According to (22) and Fig. 4, each bit in Formula${\bf r}$ is a dot product of a dual word with a received block consisting of Formula$n$ bits. Therefore, the total number of bits required by the proposed scheme is Formula TeX Source $$N_{c}=(i_{d-1}+N_{r})n\approx(i_{d-1}+50)n.\eqno{\hbox{(32)}}$$
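The two quantities in (31) and (32) are easy to evaluate numerically, as in the short Python sketch below. The function names and the sample values Formula$i_{d-1}=38$ and Formula$n=7$ are assumptions chosen only to give a concrete figure.

```python
from math import exp, pi, sqrt

def false_alarm_probability(n_r):
    """Eq. (31): Gaussian density N(0, N_r) evaluated at Z = N_r."""
    return exp(-n_r / 2) / sqrt(2 * pi * n_r)

def bits_required(i_last, n_r, n):
    """Eq. (32): received bits needed, (i_{d-1} + N_r) blocks of n bits each."""
    return (i_last + n_r) * n

p_f = false_alarm_probability(50)
n_c = bits_required(i_last=38, n_r=50, n=7)   # illustrative values only
```

Neither function takes the source bias as an argument, which reflects the observation that the cost of the proposed scheme is independent of Formula$\varepsilon$.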

Comparing (32) with (9), it can be observed that the number of bits required for the reconstruction by the proposed algorithm no longer depends on the bias Formula$\varepsilon$. Hence, when Formula$\varepsilon$ is small, it is most probable that Formula$N_{c}<N$. To show this more clearly, the proposed algorithm is applied to reconstruct some feedback polynomials of LFSRs in synchronous scramblers placed after different linear block codes. The number of bits required by the proposed algorithm Formula$(N_{c})$ is shown in Table III. The number of bits required by Cluzeau's algorithm Formula$(N)$ is also shown in Table III for comparison. In the simulation, it is assumed that the bias existing in the bit sequence before the block encoder is 0.1 and Formula$d=3$. For Cluzeau's algorithm, it is assumed that Formula$P_{f}=10^{-7}$ and Formula$P_{n}=10^{-5}$. For the proposed algorithm, it is assumed that Formula$N_{r}=50$, which leads to Formula$P_{n}=0$ and Formula$P_{f}<10^{-10}$.

From Table III, it can be observed that the number of bits required by the proposed algorithm for the reconstruction is much lower than that required by Cluzeau's algorithm, especially when the Hamming (7,4) code is used. This is because the proposed algorithm exploits the property of the dual word instead of the bias in the encoded bit sequence. Since the code rate of the Hamming (7,4) code is the lowest among the 3 types of codes shown in Table III, the bias existing in the encoded bit sequence is also the lowest, and the number of bits required for the reconstruction is the largest when Cluzeau's algorithm is used.

It should be noted that in Table III, the gcd of the two detected multiples is normally not the feedback polynomial itself but a multiple of the feedback polynomial. Suppose the gcd of the two detected multiples is Formula$F(X)$. To find the correct feedback polynomial, Formula$F(X)$ is first factorized. The correct feedback polynomial can then be found by descrambling the bit sequence with each polynomial factor of Formula$F(X)$ in turn, and checking which one leads to a descrambled bit sequence in which the dot product of each codeword with the dual words Formula${\bf h}_{\bf i}$, Formula$i=0,1\ldots,n-k-1$, equals 0. For example, the first two detected multiples in Table III are Formula$x^{112}+x^{7}+1$ and Formula$x^{266}+x^{245}+1$. Their gcd is Formula$x^{56}+x^{42}+x^{35}+x^{21}+1$, which is the product of 3 polynomial factors Formula$x^{24}+x^{20}+\cdots+1$, Formula$x^{24}+x^{19}+\cdots+1$ and Formula$x^{8}+x^{4}+x^{3}+x^{2}+1$. After descrambling the bit sequence by each polynomial factor, it is found that only Formula$x^{8}+x^{4}+x^{3}+x^{2}+1$ leads to a sensible descrambled sequence. Hence, it is the correct feedback polynomial.

2) Reconstruction of the Initial State of the LFSR

After the feedback polynomial of the LFSR is determined, the initial state of the LFSR also needs to be recovered in order to descramble the received bit sequence. In the following, a scheme to determine the initial state of the LFSR is described. This scheme is similar to the scheme proposed in [4], which also uses the encoder redundancy to determine the initial state of the LFSR.

Suppose the feedback polynomial of the LFSR is denoted by Formula$P(X)=1+a_{1}X+a_{2}X^{2}+\cdots+a_{L}X^{L}$, where Formula$L$ is the degree of the feedback polynomial and Formula$a_{i}\in\{0,1\}$, then the output of the LFSR at time index Formula$t$ is Formula TeX Source $$s_{t}=\sum_{i=1}^{L}a_{i}s_{t-i}.\eqno{\hbox{(33)}}$$ Suppose the state of the LFSR at time index Formula$t$ is Formula TeX Source $${\bf S}_{t}=(s_{t}\quad s_{t+1}\quad s_{t+2}\quad\ldots\quad s_{t+L-1})^{T}\eqno{\hbox{(34)}}$$ and a transition matrix Formula$F$ is defined as Formula TeX Source $$F=\left(\matrix{0&1&0&\ldots&0&0\cr 0&0&1&\ldots&0&0\cr\vdots&\vdots&\vdots&\ddots&\vdots&\vdots\cr 0&0&0&\ldots&0&1\cr 1&a_{L-1}&a_{L-2}&\ldots&a_{2}&a_{1}}\right).\eqno{\hbox{(35)}}$$ According to (33) and the property of the LFSR, the LFSR state at time index Formula$t+i$, Formula$(i=0,1,2,\ldots)$ can be written as Formula TeX Source $${\bf S}_{t+i}=F^{i}\cdot{\bf S}_{t}.\eqno{\hbox{(36)}}$$ Let the Formula$1\times L$ array Formula$U$ be defined as Formula TeX Source $$U=(1\quad 0\quad 0\quad\cdots\quad 0),\eqno{\hbox{(37)}}$$Formula$s_{t}$ can then be calculated by Formula TeX Source $$s_{t}=U\cdot{\bf S}_{t}=U\cdot F^{t}\cdot{\bf S}_{0}.\eqno{\hbox{(38)}}$$ According to (26) and (38), Formula$r_{0}$ can be rewritten as Formula TeX Source $$\eqalignno{{\hskip-20pt}r_{0}\!=\!&\,U\!\cdot\!{\bf S}_{0}\!\cdot\!h_{0,0}\oplus U\!\cdot\!{\bf S}_{1}\!\cdot\!h_{0,1}\oplus\cdots\oplus U\!\cdot\!{\bf S}_{n-1}\!\cdot\!h_{0,n-1}\cr{\hskip-20pt}=&\,U\!\cdot\!(I_{L}\!\cdot\!h_{0,0}\oplus F\!\cdot\!h_{0,1}\oplus\cdots\oplus F^{n-1}\!\cdot\!h_{0,n-1})\!\cdot\!{\bf S}_{0}&\hbox{(39)}}$$ where Formula$I_{L}$ is a Formula$L\times L$ identity matrix. 
Similarly, Formula$r_{1},r_{2},\ldots,r_{t}$ can be rewritten as Formula TeX Source $$\eqalignno{r_{1}\!=\!&\,U\cdot(I_{L}\cdot h_{0,0}\!\oplus\!F\cdot h_{0,1}\!\oplus\!\cdots\!\oplus\!F^{n-1}\cdot h_{0,n-1})\cdot F^{n}\cdot{\bf S}_{0}\cr r_{2}\!=\!&\,U\cdot(I_{L}\cdot h_{0,0}\!\oplus\!F\cdot h_{0,1}\!\oplus\!\cdots\!\oplus\!F^{n-1}\cdot h_{0,n-1})\cdot F^{2n}\cdot{\bf S}_{0}\cr&{\hskip-15pt}\vdots\cr r_{t}\!=\!&\,U\cdot(I_{L}\cdot h_{0,0}\!\oplus\!F\cdot h_{0,1}\!\oplus\!\cdots\!\oplus\!F^{n-1}\cdot h_{0,n-1})\cr&\cdot F^{tn}\cdot{\bf S}_{0}.&\hbox{(40)}}$$ Suppose Formula$G$ is a Formula$L\times L$ matrix whose rows correspond to Formula$r_{0},r_{1},\ldots,r_{L-1}$ in (39) and (40), i.e., Formula TeX Source $$G\!=\!\!\left(\matrix{U\cdot(I_{L}\cdot h_{0,0}\oplus F\cdot h_{0,1}\oplus\cdots\oplus F^{n-1}\cdot h_{0,n-1})\cr U\cdot(I_{L}\cdot h_{0,0}\oplus F\cdot h_{0,1}\oplus\cdots\oplus F^{n-1}\cdot h_{0,n-1})\cdot F^{n}\cr\vdots\cr U\cdot(I_{L}\cdot h_{0,0}\oplus F\cdot h_{0,1}\oplus\cdots\oplus F^{n-1}\cdot h_{0,n-1})\cdot F^{(L-1)n}}\right)\!.\eqno{\hbox{(41)}}$$ Then, provided Formula$G$ is invertible, the initial state Formula${\bf S}_{0}$ can be calculated by Formula TeX Source $${\bf S}_{0}=G^{-1}\cdot(r_{0}\quad r_{1}\quad\cdots\quad r_{L-1})^{T}.\eqno{\hbox{(42)}}$$

In many cases, an error correcting code has more than one dual word. According to (41), for the same feedback polynomial, different dual words give different matrices Formula$G$. For each Formula$G$ and the corresponding vector Formula$(r_{0}\ r_{1}\ \cdots\ r_{L-1})$, an initial state Formula${\bf S}_{0}$ can be obtained by using (42). If the feedback polynomial used to build Formula$G$ is the true feedback polynomial of the LFSR, the values of Formula${\bf S}_{0}$ obtained from (42) are the same no matter which dual word is used. Otherwise, the values of Formula${\bf S}_{0}$ obtained from different dual words are most likely to differ. This property can be used to determine the correct feedback polynomial of the LFSR without descrambling the bit sequence.
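
As an illustration of (35) to (42), the following Python sketch recovers the initial state from the first Formula$L$ values of Formula$r_{t}$ in the noiseless case. The feedback polynomial Formula$1+X+X^{4}$, the block length Formula$n=7$, the word `h`, and all function names are assumptions chosen for the example and are not taken from the simulations in this paper. The rows of `G` are Formula$U\cdot H(F)\cdot F^{tn}$ for Formula$t=0,\ldots,L-1$, so that they pair with Formula$(r_{0}\ r_{1}\ \cdots\ r_{L-1})$.

```python
def mat_mul(A, B):
    """Multiply two binary matrices over GF(2)."""
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]

def mat_pow(A, e):
    """Raise a square binary matrix to the power e over GF(2)."""
    R = [[int(i == j) for j in range(len(A))] for i in range(len(A))]
    while e:
        if e & 1:
            R = mat_mul(R, A)
        A = mat_mul(A, A)
        e >>= 1
    return R

def transition_matrix(a):
    """Build F of (35) from the taps a = [a_1, ..., a_L]."""
    L = len(a)
    shift = [[int(j == i + 1) for j in range(L)] for i in range(L - 1)]
    return shift + [a[::-1]]  # last row: a_L, a_{L-1}, ..., a_1

def lfsr_bits(a, s0, count):
    """Generate s_0 .. s_{count-1} with s_t = sum_i a_i s_{t-i}, as in (33)."""
    s = list(s0)
    while len(s) < count:
        s.append(sum(ai & s[-i] for i, ai in enumerate(a, start=1)) % 2)
    return s

def build_G(a, h, n):
    """Rows U * (sum_j h_j F^j) * F^(t n) for t = 0 .. L-1."""
    L = len(a)
    F = transition_matrix(a)
    HF = [[0] * L for _ in range(L)]
    Fj = mat_pow(F, 0)  # identity matrix
    for hj in h:
        if hj:
            HF = [[x ^ y for x, y in zip(r1, r2)] for r1, r2 in zip(HF, Fj)]
        Fj = mat_mul(Fj, F)
    first_row = [HF[0]]  # multiplying by U selects the first row
    return [mat_mul(first_row, mat_pow(F, t * n))[0] for t in range(L)]

def solve_gf2(G, r):
    """Solve G * S0 = r over GF(2) by Gauss-Jordan elimination.
    Raises StopIteration if G is singular."""
    L = len(G)
    M = [row[:] + [b] for row, b in zip(G, r)]
    for col in range(L):
        pivot = next(i for i in range(col, L) if M[i][col])
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(L):
            if i != col and M[i][col]:
                M[i] = [x ^ y for x, y in zip(M[i], M[col])]
    return [row[L] for row in M]

a = [1, 0, 0, 1]            # assumed P(X) = 1 + X + X^4
h = [1, 0, 1, 1, 0, 0, 0]   # hypothetical dual word, n = 7
n, L = len(h), len(a)
S0 = [1, 0, 0, 1]           # assumed initial state (s_0 .. s_3)
s = lfsr_bits(a, S0, n * L)
# Noiseless case: r_t depends only on the scrambler bits of block t.
r = [sum(hj & s[n * t + j] for j, hj in enumerate(h)) % 2 for t in range(L)]
recovered = solve_gf2(build_G(a, h, n), r)
```

Repeating the last three lines with a second dual word and comparing the two recovered states gives the consistency check described above.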

B. Reconstruction of the Scrambler After a Convolutional Code

As with a linear block code, the generator matrix Formula${\bf G}_{\bf c}$ of an Formula$(n,k,m)$ convolutional code generates a vector space of dimension Formula$k$ over the finite field Formula$GF(2)$. This vector space has an orthogonal space of dimension Formula$n-k$, and any element Formula$({\bf h}_{{\bf c},{\bf 0}},{\bf h}_{{\bf c},{\bf 1}},\ldots,{\bf h}_{{\bf c},{\bf n}-{\bf 1}})$ in this space satisfies the property Formula$\sum_{j=0}^{n-1}{\bf g}_{j}^{(i)}\ast{\bf h}_{{\bf c},{\bf j}}=0$ Formula$\forall i\in[0,k-1]$. Formula$({\bf h}_{{\bf c},{\bf 0}},{\bf h}_{{\bf c},{\bf 1}},\ldots,{\bf h}_{{\bf c},{\bf n}-{\bf 1}})$ can therefore be “translated” into a “dual word”. Suppose Formula${\bf h}_{{\bf c},{\bf j}}=(h_{c,j}^{0},h_{c,j}^{1},\ldots,h_{c,j}^{\mathtilde{m}-1})$ where Formula$h_{c,j}^{i}\in\{0,1\}$. The binary vector Formula TeX Source $${\bf h}_{\bf c}=\left(h_{c,0}^{\mathtilde{m}-1},\ldots,h_{c,n-1}^{\mathtilde{m}-1},\ldots,h_{c,0}^{0},\ldots,h_{c,n-1}^{0}\right)$$ of length Formula$n\times\mathtilde{m}$ is then the corresponding dual word.

After the dual word is obtained, the remaining steps for reconstructing the feedback polynomial and the initial state of the LFSR are the same as those used for the linear block code. The only difference is that the received bit sequence is not divided into blocks. In fact, the dual word is orthogonal to any segment of Formula$n\times\mathtilde{m}$ bits in the coded sequence whose starting offset is a multiple of Formula$n$ (including zero). An example of the dot product of the dual word of a convolutional code with the received bit sequence is shown in Fig. 6.

Fig. 6. Dot product of a dual word of a convolutional code with the received bit sequence.

In Fig. 6, the convolutional code is a (2,1,5) convolutional code with generator matrix [11011 11001], for which the dual word is found to be 1101001111. As shown in Fig. 6, Formula$r_{t}$ is generated by taking the dot product of the dual word with 10 bits of the coded sequence at time index Formula$t$. For every increment of the time index Formula$t$, the starting offset of the 10 bits increases by Formula$n=2\ \hbox{bits}$. To show the effect of the proposed algorithm more clearly, it is used to reconstruct some feedback polynomials of LFSRs in synchronous scramblers placed after different convolutional codes with optimum distance spectrum [18]. The multiples detected and the number of bits required by the proposed algorithm are shown in Table IV. The number of bits required by Cluzeau's algorithm is also shown in Table IV for comparison. The parameter settings for the simulation are the same as before.
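
The orthogonality used in Fig. 6 can be checked with a short Python sketch. It encodes random bits with the generators 11011 and 11001 and takes the dot product of the dual word 1101001111 with every 10-bit window whose offset is a multiple of Formula$n=2$. The bit-ordering convention (the Formula$i$th generator tap multiplies the input delayed by Formula$i$, and the two outputs are interleaved), the zero initial encoder state, and the function names are assumptions of this sketch.

```python
import random

G0 = [1, 1, 0, 1, 1]  # generator 11011; G0[i] multiplies x[t - i] (assumed order)
G1 = [1, 1, 0, 0, 1]  # generator 11001
DUAL_WORD = [1, 1, 0, 1, 0, 0, 1, 1, 1, 1]  # 1101001111, as quoted in the text
N_OUT = 2  # n, the number of coded bits per input bit

def conv_encode(x):
    """Rate-1/2 encoder starting from the all-zero state; outputs interleaved."""
    coded = []
    for t in range(len(x)):
        c0 = c1 = 0
        for i in range(len(G0)):
            if t - i >= 0:
                c0 ^= G0[i] & x[t - i]
                c1 ^= G1[i] & x[t - i]
        coded += [c0, c1]
    return coded

def dual_products(bits, dual=DUAL_WORD, step=N_OUT):
    """Dot product (mod 2) of the dual word with each window at offsets 0, n, 2n, ..."""
    return [sum(b & h for b, h in zip(bits[k:k + len(dual)], dual)) % 2
            for k in range(0, len(bits) - len(dual) + 1, step)]

random.seed(1)
x = [random.randint(0, 1) for _ in range(200)]
r = dual_products(conv_encode(x))  # every entry is 0 for an unscrambled coded sequence
```

When the coded bits are XORed with a scrambler sequence before `dual_products` is applied, the same windows return the dot products of the dual word with the scrambler bits only, which is the sequence Formula${\bf r}$ used by the proposed scheme.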

From Table IV, it can be observed that the reduction in the number of bits required for the reconstruction is very significant. There are two reasons. First, as described previously, the bias remaining in the bit sequence after it has passed through a convolutional encoder is very low, and consequently Formula$N$ is very large according to (9). Second, for convolutional codes, the value of Formula$n$ is usually very small (<10), and consequently Formula$N_{c}$ is small according to (32). Therefore, the proposed scheme is particularly suitable for convolutional codes, as the number of bits it requires for the reconstruction is very small.

SECTION V

RECONSTRUCTION OF SCRAMBLER WHEN CHANNEL NOISE IS PRESENT

In the previous sections, it is assumed that the channel is noiseless, i.e., there is no error in the received bit sequence. In practical situations, there is usually noise in the channel and some of the received bits will be wrong, as shown in Fig. 7. When channel errors are present, the dual words are no longer completely orthogonal to the received encoded bit sequence and the scheme proposed in Section IV cannot be applied directly.

TABLE IV SIMULATION RESULTS FOR RECONSTRUCTION OF SCRAMBLER PLACED AFTER CONVOLUTIONAL CODES

Fig. 7. Chain of scrambler, channel encoder, and channel.

Suppose the channel is modelled as a binary symmetric channel (BSC). The probabilities that the channel error Formula$e$ is equal to 1 and 0 are Formula$\hbox{Pr}(e=1)=p=0.5-\delta$ and Formula$\hbox{Pr}(e=0)=1-p=0.5+\delta$ respectively. Let the Formula$n$-tuple of channel errors at time index Formula$t$ be denoted by Formula${\bf e}_{\bf t}=(e_{nt},e_{nt+1},\ldots,e_{n(t+1)-1})$; the Formula$n$-tuple received codeword with errors, Formula${\bf y}_{\bf t}^{\bf e}$, is given by Formula TeX Source $${\bf y}_{\bf t}^{\bf e}={\bf y}_{\bf t}\oplus{\bf e}_{\bf t}.\eqno{\hbox{(43)}}$$ Since Formula${\bf y}_{\bf t}={\bf c}_{\bf t}\oplus{\bf s}_{\bf t}$, the dot product of the dual word Formula${\bf h}_{\bf 0}$ with the received bit sequence is given by Formula TeX Source $$r_{t}^{e}={\bf y}_{\bf t}^{\bf e}\cdot{\bf h}_{\bf 0}^{T}={\bf c}_{\bf t}\cdot{\bf h}_{\bf 0}^{T}\oplus{\bf s}_{\bf t}\cdot{\bf h}_{\bf 0}^{T}\oplus{\bf e}_{\bf t}\cdot{\bf h}_{\bf 0}^{T}.\eqno{\hbox{(44)}}$$ According to the property of the dual word, we have Formula${\bf c}_{\bf t}\cdot{\bf h}_{\bf 0}^{T}=0$; therefore, Formula TeX Source $$r_{t}^{e}=({\bf s}_{\bf t}\oplus{\bf e}_{\bf t})\cdot{\bf h}_{\bf 0}^{T},\eqno{\hbox{(45)}}$$ i.e., Formula TeX Source $$\eqalignno{r_{0}^{e}=&\,(s_{0}\oplus e_{0})\cdot h_{0,0}\oplus(s_{1}\oplus e_{1})\cdot h_{0,1}\oplus\cr&\cdots\oplus(s_{n-1}\oplus e_{n-1})\cdot h_{0,n-1}\cr r_{1}^{e}=&\,(s_{n}\oplus e_{n})\cdot h_{0,0}\oplus(s_{n+1}\oplus e_{n+1})\cdot h_{0,1}\oplus\cr&\cdots\oplus(s_{2n-1}\oplus e_{2n-1})\cdot h_{0,n-1}\cr&{\hskip-15pt}\vdots&\hbox{(46)}}$$

Proposition 2

Suppose Formula$\mathtilde{z}_{t}^{e}=r_{t}^{e}\oplus r_{t-i_{1}}^{e}\oplus r_{t-i_{2}}^{e}\oplus\cdots\oplus r_{t-i_{d-1}}^{e}$ Formula$(t\geq i_{d-1})$. When Formula$1+X^{ni_{1}}+X^{ni_{2}}+\cdots+X^{ni_{d-1}}$ is not a multiple of the feedback polynomial Formula$P(X)$, Formula$\hbox{Pr}(\mathtilde{z}_{t}^{e}=1)=1/2$. When Formula$1+X^{ni_{1}}+X^{ni_{2}}+\cdots+X^{ni_{d-1}}$ is a multiple of Formula$P(X)$, Formula$\hbox{Pr}(\mathtilde{z}_{t}^{e}=1)\leq 1/2[1-(2\delta)^{wd}]$, where Formula$w$ is the weight of the dual word and Formula$\delta=0.5-p$ (Formula$p$ is the channel crossover probability).

Proof

For linear block codes, Formula$r_{t}^{e}$ can be written as Formula TeX Source $$\displaylines{r_{t}^{e}=(s_{nt}\oplus e_{nt})\cdot h_{0,0}\oplus(s_{nt+1}\oplus e_{nt+1})\cdot h_{0,1}\oplus\hfill\cr\hfill\cdots\oplus\left(s_{n(t+1)-1}\oplus e_{n(t+1)-1}\right)\cdot h_{0,n-1}.\quad\hbox{(47)}}$$ Similarly, Formula TeX Source $$\eqalignno{r_{t-i_{1}}^{e}=&\,\left(s_{n(t-i_{1})}\oplus e_{n(t-i_{1})}\right)\cr&\cdot h_{0,0}\oplus\left(s_{n(t-i_{1})+1}\oplus e_{n(t-i_{1})+1}\right)\cr&\cdot h_{0,1}\oplus\cdots\oplus\left(s_{n(t-i_{1}+1)-1}\oplus e_{n(t-i_{1}+1)-1}\right)\cr&\cdot h_{0,n-1},\cr&{\hskip-15pt}\vdots\cr r_{t-i_{d-1}}^{e}=&\,\left(s_{n(t-i_{d-1})}\oplus e_{n(t-i_{d-1})}\right)\cr&\cdot h_{0,0}\oplus\left(s_{n(t-i_{d-1})+1}\oplus e_{n(t-i_{d-1})+1}\right)\cr&\cdot h_{0,1}\oplus\cdots\oplus\left(s_{n(t-i_{d-1}+1)-1}\oplus e_{n(t-i_{d-1}+1)-1}\right)\cr&\cdot h_{0,n-1}.&\hbox{(48)}}$$ Therefore, Formula TeX Source $$\eqalignno{\mathtilde{z}_{t}^{e}=&\,r_{t}^{e}\oplus r_{t-i_{1}}^{e}\oplus r_{t-i_{2}}^{e}\oplus\cdots\oplus r_{t-i_{d-1}}^{e}\cr=&\,\left(s_{nt}\oplus s_{nt-ni_{1}}\oplus\cdots\oplus s_{nt-ni_{d-1}}\right)\cdot h_{0,0}\cr&\oplus\left(e_{nt}\oplus e_{nt-ni_{1}}\oplus\cdots\oplus e_{nt-ni_{d-1}}\right)\cdot h_{0,0}\cr&\oplus\left(s_{nt+1}\oplus s_{nt+1-ni_{1}}\oplus\cdots\oplus s_{nt+1-ni_{d-1}}\right)\cdot h_{0,1}\cr&\oplus\left(e_{nt+1}\oplus e_{nt+1-ni_{1}}\oplus\cdots\oplus e_{nt+1-ni_{d-1}}\right)\cdot h_{0,1}\cr&{\hskip-10pt}\vdots\cr&\oplus\left(s_{n(t+1)-1}\oplus\cdots\oplus s_{n(t+1)-1-ni_{d-1}}\right)\cdot h_{0,n-1}\cr&\oplus\left(e_{n(t+1)-1}\oplus\cdots\oplus e_{n(t+1)-1-ni_{d-1}}\right)\cdot h_{0,n-1}.&\hbox{(49)}}$$

According to the property of the LFSR, when Formula$1+X^{ni_{1}}+X^{ni_{2}}+\cdots+X^{ni_{d-1}}$ is not a multiple of Formula$P(X)$, and as Formula$\hbox{Pr}(s_{t}=1)=1/2$, it is apparent that Formula$\hbox{Pr}(\mathtilde{z}_{t}^{e}=1)=1/2$. When Formula$1+X^{ni_{1}}+X^{ni_{2}}+\cdots+X^{ni_{d-1}}$ is a multiple of Formula$P(X)$, Formula$s_{k}\oplus s_{k-n{i_{1}}}\oplus\cdots\oplus s_{k-ni_{d-1}}=0$ for any Formula$k\geq ni_{d-1}$ and we have Formula TeX Source $$\eqalignno{{\hskip-20pt}\mathtilde{z}_{t}^{e}=&\,\left(e_{nt}\oplus e_{nt-ni_{1}}\oplus\cdots\oplus e_{nt-ni_{d-1}}\right)\cdot h_{0,0}\cr{\hskip-20pt}&\oplus\left(e_{nt+1}\oplus\cdots\oplus e_{nt+1-ni_{d-1}}\right)\cdot h_{0,1}\oplus\cdots\cr{\hskip-20pt}&\oplus\left(e_{n(t+1)-1}\oplus\cdots\oplus e_{n(t+1)-1-ni_{d-1}}\right)\cdot h_{0,n-1}.&\hbox{(50)}}$$

In (50), Formula$\mathtilde{z}_{t}^{e}$ is a modulo 2 summation of Formula$wd$ channel errors Formula$e$, where Formula$w$ is the weight of the dual word. Similar to (14), it can be derived that Formula TeX Source $$\eqalignno{{\rm Pr}\left(\mathtilde{z}_{t}^{e}=1\right)=&\,\sum_{l=1,3,\ldots}^{wd}{wd\choose l}\left({1\over 2}-\delta\right)^{l}\left({1\over 2}+\delta\right)^{wd-l}\cr=&\,{1\over 2}\left[1-(2\delta)^{wd}\right].&\hbox{(51)}}$$

For convolutional codes, similarly, Formula$\mathtilde{z}_{t}^{e}$ is a modulo 2 summation of Formula$wd$ channel errors Formula$e$. However, according to Fig. 6, some of the channel errors might be overlapped; therefore, we have Formula TeX Source $${\rm Pr}\left(\mathtilde{z}_{t}^{e}=1\right)\leq{1\over 2}\left[1-(2\delta)^{wd}\right].\eqno{\hbox{(52)}}$$Formula$\hfill\square$
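
The closed form in (51) can be checked numerically. The Python sketch below compares the sum over odd numbers of errors with Formula${1\over 2}[1-(2\delta)^{wd}]$; the function names and the test values of Formula$\delta$, Formula$w$ and Formula$d$ are illustrative assumptions.

```python
from math import comb

def prob_odd_errors(delta, count):
    """Left-hand side of (51): probability that an odd number of `count`
    independent channel errors occur, with Pr(e = 1) = 1/2 - delta."""
    p = 0.5 - delta
    return sum(comb(count, l) * p**l * (1 - p)**(count - l)
               for l in range(1, count + 1, 2))

def closed_form(delta, count):
    """Right-hand side of (51): (1/2) * [1 - (2 delta)^count]."""
    return 0.5 * (1 - (2 * delta) ** count)

w, d = 4, 3  # assumed dual word weight and number of terms in the multiple
gaps = [abs(prob_odd_errors(dl, w * d) - closed_form(dl, w * d))
        for dl in (0.30, 0.45, 0.49)]
```

If overlapping windows leave only Formula$k<wd$ distinct channel errors in the sum, `closed_form(delta, k)` is no larger than `closed_form(delta, w * d)`, which is the inequality stated in (52).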

Suppose Formula$\mathtilde{Z}_{e}=\sum_{t=i_{d-1}+1}^{i_{d-1}+N_{r}^{e}}(-1)^{\mathtilde{z}^{e}_{t}}$, where Formula$N_{r}^{e}$ is the number of bits in Formula${\bf r}$ required for the reconstruction when noise is present. According to Proposition 2 and the scheme described in Section IV, when Formula$Q(X)=1+X^{ni_{1}}+X^{ni_{2}}+\cdots+X^{ni_{d-1}}$ is not a multiple of Formula$P(X)$, Formula$\mathtilde{Z}_{e}$ is Gaussian distributed with mean 0 and variance Formula$N_{r}^{e}$. Similar to the derivation of the distribution of Formula$Z$ [5], when Formula$Q(X)$ is a multiple of Formula$P(X)$, it can be derived that Formula$\mathtilde{Z}_{e}$ is Gaussian distributed with mean Formula$\mu_{e}=N_{r}^{e}(2\delta)^{wd}$ and variance Formula$\sigma_{e}^{2}\leq N_{r}^{e}[1+d((2\delta)^{2w}-(2\delta)^{2wd})]$. Therefore, the algorithm proposed in Section IV can still be used with a minor change in Step 4, i.e., a threshold Formula$T_{e}$ can be used to determine whether Formula$Q(X)$ is a multiple of the feedback polynomial.
Similar to Cluzeau's algorithm described in Section II, when the false-alarm probability Formula$P_{f}$ and the nondetection probability Formula$P_{n}$ are given, the threshold Formula$T_{e}$ can be determined by Formula TeX Source $$T_{e}={a_{e}^{2}+a_{e}b_{e}\sqrt{1+d\left((2\delta)^{2w}-(2\delta)^{2wd}\right)}\over(2\delta)^{wd}}\eqno{\hbox{(53)}}$$ where Formula TeX Source $$a_{e}=\Phi^{-1}(1-P_{f})={T_{e}\over\sqrt{N_{r}^{e}}}\eqno{\hbox{(54)}}$$ and Formula TeX Source $$-b_{e}=\Phi^{-1}(P_{n})={T_{e}-\mu_{e}\over\sigma_{e}}.\eqno{\hbox{(55)}}$$ From (54) and (55), it can be derived that the total number of bits Formula$N_{c}^{e}$ used in the reconstruction is given by Formula TeX Source $$\eqalignno{{\hskip-20pt}N_{c}^{e}\!\!=\!&\,n\left(i_{d-1}\!+\!N_{r}^{e}\right)\cr{\hskip-20pt}\!\!=\!&\,n\left(i_{d-1}\!+\!{\left(a_{e}\!+\!b_{e}\sqrt{1\!+\!d\left((2\delta)^{2w}\!-\!(2\delta)^{2wd}\right)}\right)^{2}\over(2\delta)^{2wd}}\right)\!.&\hbox{(56)}}$$
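
As a rough numerical illustration of (53) to (56), the following Python sketch evaluates the threshold and the number of bits for a given crossover probability. The function name, the argument `i_last` (standing for Formula$i_{d-1}$), and the example values of Formula$n$, Formula$w$ and Formula$i_{d-1}$ are assumptions; the variance bound is used with equality, so the result is an upper estimate.

```python
from math import sqrt
from statistics import NormalDist

def bits_required(p, w, d, n, i_last, P_f=1e-7, P_n=1e-5):
    """Return (T_e, N_r^e, N_c^e) following (53)-(56).

    p      : BSC crossover probability, so that 2*delta = 1 - 2p
    w, d   : dual word weight and number of terms of the sparse multiple
    n      : code length (block codes) or outputs per step (convolutional codes)
    i_last : largest shift i_{d-1} of the tested multiple (assumed known)
    """
    std = NormalDist()
    a_e = std.inv_cdf(1 - P_f)   # (54)
    b_e = -std.inv_cdf(P_n)      # (55)
    two_delta = 1 - 2 * p
    eps = two_delta ** (w * d)
    kappa = sqrt(1 + d * (two_delta ** (2 * w) - two_delta ** (2 * w * d)))
    T_e = (a_e ** 2 + a_e * b_e * kappa) / eps   # (53)
    N_r = ((a_e + b_e * kappa) / eps) ** 2       # from (54) and (55)
    N_c = n * (i_last + N_r)                     # (56)
    return T_e, N_r, N_c

# Hypothetical setting: n = 7, w = 4, d = 3, i_{d-1} = 10, p = 0.01.
T_e, N_r, N_c = bits_required(p=0.01, w=4, d=3, n=7, i_last=10)
```

Because Formula$(2\delta)^{2wd}$ appears in the denominator, the estimate grows quickly as Formula$p$ approaches 0.5 or as Formula$w$ increases.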

In Figs. 8 and 9, the numbers of bits required for reconstruction when channel noise is present are shown for different error correcting codes and channel error probabilities. It is assumed that Formula$d=3$, Formula$P_{f}=10^{-7}$ and Formula$P_{n}=10^{-5}$. The feedback polynomial is assumed to be Formula$x^{8}+x^{4}+x^{3}+x^{2}+1$.

Fig. 8. Number of bits required for reconstruction when linear block codes are used and channel noise is present.

Fig. 9. Number of bits required for reconstruction when convolutional codes are used and channel noise is present.

From Figs. 8 and 9, it can be observed that the number of bits required to do the reconstruction when channel noise is present is larger, as compared with that required in a noiseless condition. The larger the channel error probability, the larger the number of bits required to do the reconstruction. Another factor which affects the number of bits for the reconstruction is the dual word weight Formula$w$. Obviously, with the increase of Formula$w$, the number of bits required will increase accordingly, especially when the channel error probability is large. Therefore, for the same error correcting code, the dual word of minimum weight Formula$w$ is the best choice for the reconstruction.

In practical situations, the number of bits available for reconstruction is usually limited. In that case, the false-alarm probability or the nondetection probability will be affected. Suppose the number of bits in Formula${\bf r}$ available for reconstruction is Formula$\bar{N}_{r}^{e}$ and the false-alarm probability is determined in advance, i.e., Formula$a_{e}$ is determined in advance. The threshold Formula$\bar{T}_{e}$ is then given by Formula TeX Source $$\bar{T}_{e}=a_{e}\cdot\sqrt{\bar{N}_{r}^{e}}\eqno{\hbox{(57)}}$$ and the nondetection probability Formula$\bar{P}_{n}$ can then be calculated by Formula TeX Source $$\eqalignno{\bar{P}_{n}=&\,\Phi\left({\bar{T}_{e}-\mu_{e}\over\sigma_{e}}\right)\cr\approx&\,\Phi\left({a_{e}\cdot\sqrt{\bar{N}_{r}^{e}}-\bar{N}_{r}^{e}(2\delta)^{wd}\over\sqrt{\bar{N}_{r}^{e}\left[1+d\left((2\delta)^{2w}-(2\delta)^{2wd}\right)\right]}}\right).&\hbox{(58)}}$$
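
The trade-off in (57) and (58) can be sketched in Python as follows. The function name and the example values are assumptions, and the variance bound is again taken with equality, which is why the result is only approximate.

```python
from math import sqrt
from statistics import NormalDist

def nondetection_probability(N_avail, p, w, d, P_f=1e-7):
    """Approximate nondetection probability of (58) when only N_avail
    values of r are available and the false-alarm probability is fixed."""
    std = NormalDist()
    a_e = std.inv_cdf(1 - P_f)
    two_delta = 1 - 2 * p
    T_bar = a_e * sqrt(N_avail)                  # threshold of (57)
    mu = N_avail * two_delta ** (w * d)
    sigma = sqrt(N_avail * (1 + d * (two_delta ** (2 * w)
                                     - two_delta ** (2 * w * d))))
    return std.cdf((T_bar - mu) / sigma)

# Hypothetical setting: w = 4, d = 3, p = 0.01.
P_few = nondetection_probability(50, p=0.01, w=4, d=3)
P_many = nondetection_probability(200, p=0.01, w=4, d=3)
```

With these assumed values, `P_many` is much smaller than `P_few`, reflecting that more available bits lower the nondetection probability for a fixed false-alarm probability.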

In Fig. 10, the nondetection probabilities are plotted against the number of bits available for reconstruction. It is assumed that Formula$d=3$, Formula$P_{f}=10^{-7}$ and the feedback polynomial is Formula$x^{8}+x^{4}+x^{3}+x^{2}+1$.

For recovering the initial state of the LFSR when noise is present, some known techniques, such as those proposed in [20], [21], can be used.

SECTION VI

CONCLUSION

In this paper, the problem of reconstructing the LFSR in a linear scrambler placed after a channel encoder is studied. The existing algorithm, i.e., Cluzeau's algorithm, is very promising for reconstructing the feedback polynomial under the assumption that the source bits are biased.

Fig. 10. Nondetection probabilities versus the number of bits available for reconstruction.

However, after passing through a channel encoder, the bias (the imbalance between the numbers of 1s and 0s) in the bit sequence drops, especially when a convolutional code is used, and the number of bits required by Cluzeau's algorithm becomes exorbitantly large. In this paper, a new scheme is studied which, instead of relying on the bias in the bit sequence, uses the orthogonality between the dual words and the codewords generated by the channel encoder. Our analysis shows that with the proposed scheme the feedback polynomial can be reconstructed much faster, as the number of bits required for the reconstruction is greatly reduced, especially when convolutional codes are used as the error correcting codes. When channel noise is added, the scheme can still be used to perform the reconstruction, as long as the number of bits used is increased accordingly. It is noted that the larger the channel error probability, the larger the number of bits required for the reconstruction.

Based on the above results, it is clear that, all else being equal, scrambling the source bits before applying the FEC offers better protection against scrambler reconstruction.

Secondly, it has been shown that for a linear block code, the bias of the binary bit stream before scrambling can be approximated by the product of the bias of the source bits and the code rate; see (16). For a convolutional encoder, the resultant bias is much lower; see (20). However, when the dual words of the encoder are used, our results show that a convolutional code-linear scrambler pair is much weaker than a linear block code-linear scrambler pair. This is because any shift of a dual word by a multiple of Formula$n$ bits is orthogonal to the coded sequence, and for most practical convolutional codes, Formula$n$ is typically a small number.

The work presented in this paper is focused on determining the scrambler polynomial assuming that a dual word is known and word synchronization has been achieved a priori. A more challenging problem would be to reconstruct both the code and the scrambler at the same time. One possible solution is to incorporate a scheme that recovers the code length and achieves synchronization without considering the scrambler, such as the schemes proposed in [10], [11], into the scheme proposed in this paper. For example, for a short linear block code or a convolutional code, an exhaustive search can be used to test all possible dual words and generate all possible Formula${\bf r}$. After applying the scheme proposed in Section IV-A to each Formula${\bf r}$, in the noiseless case, only the Formula${\bf r}$ generated by the correct dual word will lead to two different distributions of Formula$Z$ as shown in Fig. 5. In a noisy condition, the situation is similar. For longer block codes, more sophisticated schemes are needed to recover both the code and the scrambler at the same time. Finally, the weight of the dual word plays a key part in the reconstruction, as low weight dual words are easier to find and, in a noisy condition, lead to fewer bits being required for the reconstruction. Therefore, one might consider using error correcting codes which do not have low weight dual words. How to find such codes is also an interesting topic for future work.

Footnotes

The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Y.-W. Peter Hong.

X.-B. Liu and S. N. Koh are with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798.

C.-C. Chui is with the Temasek Laboratories at Nanyang Technological University, Singapore 639798.

X.-W. Wu is with the School of Information and Communication Technology, Griffith University, Gold Coast, QLD 4222, Australia.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Authors

Xiao-Bei Liu

Xiao-Bei Liu received the B.S. degree in electrical and communication engineering from Fudan University, Shanghai, China, in 1998, and the Ph.D. degree from Nanyang Technological University (NTU), Singapore, in 2004.

From 1998 to 2000, she was an engineer with Datang Mobile Communications Equipment Co., Ltd., and from 2007 to 2010, she was a senior digital signal processing engineer in Wireless Sound Solutions Pte. Ltd. She is currently a research fellow in the Positioning and Wireless Technology Centre of NTU and her research interests include digital signal processing in wireless communications, modulation/coding techniques, and secured communications.

Soo Ngee Koh

Soo Ngee Koh received the B.Eng. degree from the University of Singapore and the B.Sc. degree from the University of London, both in 1979. He received the M.Sc. and Ph.D. degrees from Loughborough University, U.K., in 1981 and 1984, respectively.

Prior to his return to Singapore, he worked as a consultant at the British Telecom Research Laboratories in England. He joined Nanyang Technological University (NTU) of Singapore in 1985. He was the founding Head of the Communication Engineering Division of the School of Electrical and Electronic Engineering (EEE) of NTU from 1995 to 2005, founding Cochair of the International Conference on Information, Communications and Signal Processing, and Associate Chair (Academic) from 2005 to 2011. He is currently a Professor of the School. He has published more than 140 papers in international journals and conference proceedings, and holds two international patents on speech coder design. His research interests include speech processing, coding, enhancement and recognition, computer-aided language learning, blind source separation, and secured communication.

Chee-Cheon Chui

Chee-Cheon Chui received the B.Eng. degree from the National University of Singapore, Singapore, in 1994, and the M.Sc. and Ph.D. degrees from the University of Southern California, USA, in 2001 and 2005, respectively, all in electrical engineering.

He is currently with TL@NTU, Singapore as a research scientist, engaging in research and development and management of numerous projects in the field of wireless communications. He has also held various positions in the executive committee of the IEEE Singapore local Communications Chapter. His current research interests include receiver synchronization, time-synchronization of wireless systems, physical-layer security, wireless communication signal processing, and forward error correction coding.

Xin-Wen Wu

Xin-Wen Wu (M'00) received the B.S. and M.S. degrees in 1989 and 1992, respectively, from East China Normal University, Shanghai, and the Ph.D. degree in 1995 from the Institute of Systems Science, Chinese Academy of Sciences, Beijing.

From 1995 through 2003, he was affiliated with the Institute of Mathematics, Chinese Academy of Sciences. From January to October 1996, and from October 1997 to December 1998, he was a visiting research associate at the Center for Advanced Computer Studies at the University of Louisiana, Lafayette, LA, USA. From Jun. 1999 to May 2000, he was a postdoctoral researcher at the Department of Electrical and Computer Engineering, University of California at San Diego. During February 2003–October 2005, he worked at the Department of Electrical and Electronic Engineering, University of Melbourne, holding a research fellowship. From November 2005 through April 2010, he was a faculty member at the Graduate School of Mathematics and Information Technology, University of Ballarat. Since April 2010, he has been with the School of Information and Communication Technology, Griffith University, Gold Coast, Australia. His research interests are in the areas of coding theory, cryptology, information theory with applications to bioinformatics, and other areas. He has authored or coauthored over 40 research papers and one book in the above-mentioned areas.
