SECTION I

INTRODUCTION

In a digital communication system, the constituent elements used by the transmitter and the specifications of each element are normally known to the receiver. In this paper, we consider a scenario in which the specifications of the elements used by the transmitter are not completely known to the receiver. The capability of reconstructing transmitter elements when their specifications are not perfectly known is envisaged to be an enabling technology for digital communication systems with a flexible platform, such as software-defined radio (SDR), as it reduces the required overhead and makes the system design more flexible. A similar application is also envisaged in [1] for “multistandard adaptive receivers.”

The challenge of reconstructing transmitter elements when their specifications are unknown has attracted considerable research interest in the last few years. For example, results on the recovery of error-correcting codes have been published in [2], [3], [4], [5], [6], [7]. In this paper, we focus on the reconstruction of another element that is commonly used in digital communication systems, namely the linear scrambler. A linear scrambler is usually used in a communication system to convert a data bit sequence into a pseudorandom sequence that is free from long strings of 1s or 0s. There are generally two types of linear scrambler: the synchronous scrambler and the self-synchronized scrambler. Both types usually consist of a linear feedback shift register (LFSR) whose output sequence $(s_t)_{t\geq 0}$ is combined with the input sequence $(x_t)_{t\geq 0}$ to produce the scrambled sequence $(y_t)_{t\geq 0}$, i.e., $$y_t=x_t\oplus s_t, \quad t\geq 0 \eqno{\hbox{(1)}}$$ where $\oplus$ denotes modulo-2 summation. In this paper, for simplicity, only the synchronous scrambler is considered. However, our ideas on the reconstruction of synchronous scramblers in the presence of channel noise can also be extended to self-synchronized scramblers.
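As a concrete illustration of (1), the following minimal Python sketch implements a synchronous scrambler. The function names `lfsr` and `scramble`, the tap convention (the exponents of the nonconstant terms of the feedback polynomial), and the example polynomial $1+X+X^4$ are assumptions made for illustration only; they are not taken from the paper.

```python
from itertools import islice


def lfsr(taps, state):
    """Yield s_0, s_1, ... for the recurrence s_t = XOR of s_{t-i}, i in taps.

    `taps` holds the exponents of the nonconstant terms of the feedback
    polynomial (assumed convention); `state` holds s_0 .. s_{L-1}, L = max(taps).
    """
    reg = list(state)
    if len(reg) != max(taps):
        raise ValueError("state length must equal the LFSR length")
    yield from reg
    while True:
        new = 0
        for i in taps:
            new ^= reg[-i]          # reg[-i] is s_{t-i}
        reg = reg[1:] + [new]       # shift the register by one position
        yield new


def scramble(bits, taps, state):
    """Apply y_t = x_t XOR s_t; applying it twice restores the input."""
    return [x ^ s for x, s in zip(bits, lfsr(taps, state))]


# Example: 1 + X + X^4 is primitive, so the LFSR output has period 2^4 - 1 = 15.
sequence = list(islice(lfsr([1, 4], [1, 0, 0, 0]), 30))
scrambled = scramble([1, 0, 1, 1, 0, 0, 1, 0], [1, 4], [1, 0, 0, 0])
```

Because the scrambling sequence does not depend on the data, descrambling is the same operation with the same taps and initial state, which is why the receiver needs both the feedback polynomial and the initial state.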

In most communication systems, binary primitive polynomials are used as the feedback polynomials so that the sequences produced by the LFSRs achieve the maximum period. Reconstructing a linear scrambler consists of reconstructing the feedback polynomial of the LFSR and, in the case of a synchronous scrambler, its initial state. In this paper, we focus on reconstructing the feedback polynomial of the LFSR, as reconstructing the initial state of the LFSR is a well-known problem for stream ciphers and has been extensively studied in the literature [8], [9], [10], [11]. When some input and scrambled bits are known, the Berlekamp–Massey algorithm [12] can be used to reconstruct the feedback polynomial of the LFSR. Recently, an algorithm was proposed by Cluzeau for reconstructing the feedback polynomial of the LFSR by using only the scrambled bits [13]. In the following, this algorithm will be referred to as “Cluzeau's algorithm.”

Although Cluzeau's algorithm can reconstruct most feedback polynomials of the LFSR very efficiently, the simulation results in [13] show that in some cases the algorithm cannot correctly detect the feedback polynomial even when the false-alarm probability $P_f$ is set to a very small value. Furthermore, the algorithm proposed in [13] assumes that all the scrambled bits are correctly received. In practical situations, communication channels always exhibit some type of noise, which leads to errors in the received bits. In this paper, these two problems are investigated. In the first part of this paper, a scheme that improves the detection capability of Cluzeau's algorithm with only a marginal increase in complexity is proposed. Following that, an approach that reduces the number of operations required by Cluzeau's algorithm without affecting the detection capability is described. In the second part of this paper, the reconstruction of scramblers in the presence of noise is studied. Two kinds of channel errors are considered: bit flipping due to channel noise, and insertion of bits in the scrambled bit sequence.

The paper is organized as follows. In Section II, Cluzeau's algorithm is reviewed. In Section III, a scheme is proposed to improve the detection capability of Cluzeau's algorithm. In Section IV, the approach to reduce the number of operations required to detect the feedback polynomial is described. In Section V, reconstruction of the scrambler in the presence of channel noise is investigated. Conclusions are drawn in Section VI.

SECTION II

CLUZEAU'S ALGORITHM FOR RECONSTRUCTING A SYNCHRONOUS SCRAMBLER

Fig. 1. Structure of synchronous scrambler.

In this section, Cluzeau's algorithm, which recovers the feedback polynomial $P(X)$ of a synchronous scrambler by using only the scrambled bits, is reviewed. In a synchronous scrambler, $s_t$ is generated independently of $x_t$ and $y_t$, as shown in Fig. 1.

Instead of searching for the feedback polynomial $P(X)$ directly, Cluzeau's algorithm searches for sparse multiples of $P(X)$, with the degree of the sparse multiples varying from low to high. After two multiples of $P(X)$ are detected, it returns the nontrivial greatest common divisor (gcd) of the two detected multiples as the detected feedback polynomial. Whether a sparse polynomial is a multiple of $P(X)$ is decided by a statistical test on the absolute value of a variable $Z$, which is given by $$Z=\sum_{t=i_{d-1}}^{N-1}(-1)^{z_t} \eqno{\hbox{(2)}}$$ where $z_t$ is a modulo-2 summation of $d$ scrambled bits, i.e., $z_t=y_t\oplus\bigoplus_{j=1}^{d-1}y_{t-i_j}$, $(0 < i_1 < i_2 < \cdots < i_{d-1})$. Let $Q(X)=1+\sum_{j=1}^{d-1}X^{i_j}$. When $Q(X)$ is a multiple of $P(X)$, we have $$\eqalignno{z_t & =y_t\oplus\bigoplus_{j=1}^{d-1}y_{t-i_j}\cr & =x_t\oplus\bigoplus_{j=1}^{d-1}x_{t-i_j} &\hbox{(3)} }$$ since $s_t\oplus\bigoplus_{j=1}^{d-1}s_{t-i_j}=0$ and $y_t=x_t\oplus s_t$. According to the statistical analysis results given in [7], when $Q(X)$ is a multiple of $P(X)$ and the input bits are biased with ${\rm Pr}\, (x_t=1)= {(1/2)}-\varepsilon$, where $\varepsilon\neq 0$, $z_t$ is also biased, with ${\rm Pr}\, (z_t=1)= {(1/2)} [1-(2\varepsilon)^d]$.
Then, according to Theorem 1 given in [7], the value of $Z$, i.e., $\sum_{t=i_{d-1}}^{N-1}(-1)^{z_t}=(N-i_{d-1})-2\sum_{t=i_{d-1}}^{N-1}z_t$, is Gaussian distributed with mean value $\mu$ given by $$\mu=(N-i_{d-1})(2\varepsilon)^d \eqno{\hbox{(4)}}$$ and variance $\sigma^2$ given by $$\displaylines{\sigma^2=(N-i_{d-1}) (1-(2\varepsilon)^{2d})\hfill\cr \hfill +\sum_{u=1}^{d-1}N_u \left((2\varepsilon)^{2(d-u)}-(2\varepsilon)^{2d}\right) \quad\hbox{(5)} }$$

Fig. 2. Distributions of $Z$.

where $N_u$ denotes the number of pairs $(z_t,z_{t'})$, $(0 < \vert t'-t \vert\leq i_{d-1})$, which share exactly $u$ terms of $x_t$. For different $Q(X)$, the values of $N_u$ are different, and hence there is no fixed value of $\sigma^2$. However, according to [7], an upper bound of $\sigma^2$ can be derived, since in the worst case, $N_1=N_2= \cdots =N_{d-2}=0$ and $N_{d-1}\leq (N-i_{d-1})2d(d-1)$, which leads to $$\eqalignno{\sigma^2 & \leq (N-i_{d-1}) [1+2d(d-1)(2\varepsilon)^2\cr & \qquad\qquad\qquad\quad -(2\varepsilon)^{2d}(1+2d(d-1))]\cr & \leq (N-i_{d-1}) [1+2d(d-1)](1-(2\varepsilon)^{2d}). &\hbox{(6)} }$$ Therefore, the upper bound $\sigma_l$ of $\sigma$ is given by $$\sigma\leq \sigma_l=\sqrt{(N-i_{d-1})[1+2d(d-1)](1-(2\varepsilon)^{2d})} \eqno{\hbox{(7)}}$$ and the normalized upper bound $\bar\sigma_l$ is given by $$\bar\sigma_l = {\sigma_l\over \sqrt{N-i_{d-1}}} = \sqrt{[1+2d(d-1)](1-(2\varepsilon)^{2d})}. \eqno{\hbox{(8)}}$$

According to the above description, when $Q(X)$ is a multiple of $P(X)$, $Z$ has a Gaussian distribution with mean value $\mu$ and variance $\sigma^2$. When $Q(X)$ is not a multiple of $P(X)$, ${\rm Pr}\, (z_t=0)= {(1/2)}$, implying that $Z$ has a Gaussian distribution with mean value 0 and variance $N-i_{d-1}$. The two distributions are depicted in Fig. 2.
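The bias of $z_t$ quoted in this section can be checked numerically. The sketch below enumerates all $2^d$ outcomes of $d$ independent source bits with ${\rm Pr}(x_t=1)=1/2-\varepsilon$ and sums the probability of odd parity. The function name `prob_xor_is_one` is hypothetical, and the independence of the $d$ bits is an assumption of this check.

```python
from itertools import product


def prob_xor_is_one(eps, d):
    """Exact Pr(z = 1) for z = XOR of d independent bits with Pr(bit = 1) = 1/2 - eps."""
    p_one = 0.5 - eps
    total = 0.0
    for bits in product((0, 1), repeat=d):
        if sum(bits) % 2 == 1:              # odd parity means z = 1
            pr = 1.0
            for bit in bits:
                pr *= p_one if bit else 1.0 - p_one
            total += pr
    return total


eps, d = 0.1, 3
enumerated = prob_xor_is_one(eps, d)
closed_form = 0.5 * (1.0 - (2.0 * eps) ** d)    # expression quoted in the text
mean_of_sign = 1.0 - 2.0 * enumerated           # E[(-1)^z], equal to (2*eps)^d
```

The per-sample mean $E[(-1)^{z_t}]=(2\varepsilon)^d$ is what produces the mean value in (4) after summing over $N-i_{d-1}$ samples.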

From Fig. 2, it can be observed that when the two distributions of $Z$ have a small enough intersection, a threshold $T$ can be used to determine whether $Q(X)$ is a multiple of $P(X)$: when $\vert Z\vert < T$, $Q(X)$ is not a multiple of $P(X)$; otherwise, $Q(X)$ is a multiple of $P(X)$. $T$ depends on the false-alarm probability $P_f={\rm Pr}(\vert Z\vert \geq T \mid Q(X) \hbox{ is not a multiple of } P(X))$ and on the nondetection probability $P_n={\rm Pr}(\vert Z\vert < T \mid Q(X) \hbox{ is a multiple of } P(X))$. In the following, Cluzeau's algorithm is outlined for the reader's convenience.

  1. Compute the threshold $T$ as follows: $$T= {a(a+b\bar\sigma_l)\over (2\vert \varepsilon\vert)^{d}} \eqno{\hbox{(9)}}$$ where $$\eqalignno{a & =\Phi^{-1}\left(1- {P_f\over 2}\right) &\hbox{(10)}\cr \noalign{\hbox{and}} b & =-\Phi^{-1}(P_n). &\hbox{(11)} }$$ In the above equations, $\Phi$ denotes the normal distribution function, i.e., $$\Phi(x)= {1\over \sqrt{2\pi}}\int_{-\infty}^{x}{\rm exp}\left(-{t^2\over 2}\right)dt \eqno{\hbox{(12)}}$$ and $d$ denotes the weight of the sparse multiples of $P(X)$; typical values of $d$ are 3, 4, and 5.
  2. For $(i_1,\ldots,i_{d-1})$, $0 < i_1 < \cdots < i_{d-1}\leq D$ ($D$ is the maximum degree of the sparse multiple we want to search), compute the number of bits $N$ required to recover the feedback polynomial as follows: $$N=i_{d-1}+ {(a+b\bar\sigma_l)^2\over (2\varepsilon)^{2d}}. \eqno{\hbox{(13)}}$$
  3. Initialize $Z$ with $Z=0$.
    For $t$ from $i_{d-1}$ to $N-1$, compute $$\eqalignno{z_t& =y_t\oplus\bigoplus_{j=1}^{d-1}y_{t-i_j} &\hbox{(14)}\cr \noalign{\hbox{and}} Z& =Z+(-1)^{z_t}. &\hbox{(15)} }$$
  4. If $\vert Z\vert > T$, store $Q(X)=1+\sum_{j=1}^{d-1}X^{i_j}$ in a table.
  5. For $Q'(X)\neq Q(X)$ in the table, compute the nontrivial gcd of $(Q(X),Q'(X))$.

Steps 2–5 are repeated until a gcd $(Q(X),Q'(X))=P(X)$ with $P(X)\neq 1$ is found or all combinations of $(i_1,\ldots,i_{d-1})$ are tested.
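Steps 1 to 3 of the outline can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `cluzeau_parameters` and `z_statistic` are hypothetical, $\Phi^{-1}$ is taken from the standard library's `statistics.NormalDist().inv_cdf`, and the statistic follows the summation range of (2).

```python
from math import sqrt
from statistics import NormalDist


def cluzeau_parameters(eps, p_f, p_n, d, i_last):
    """Return (a, b, T, N) from (9)-(13) for a candidate of weight d.

    i_last is the largest exponent i_{d-1} of the candidate multiple.
    """
    phi_inv = NormalDist().inv_cdf
    a = phi_inv(1.0 - p_f / 2.0)                                    # (10)
    b = -phi_inv(p_n)                                               # (11)
    sigma_bar = sqrt((1 + 2 * d * (d - 1)) * (1 - (2 * eps) ** (2 * d)))  # (8)
    threshold = a * (a + b * sigma_bar) / (2 * abs(eps)) ** d       # (9)
    n_bits = i_last + (a + b * sigma_bar) ** 2 / (2 * eps) ** (2 * d)     # (13)
    return a, b, threshold, n_bits


def z_statistic(y, offsets):
    """Z of (2) for Q(X) = 1 + sum of X^i, i in offsets, over all bits in y."""
    i_last = max(offsets)
    total = 0
    for t in range(i_last, len(y)):
        z = y[t]
        for i in offsets:
            z ^= y[t - i]
        total += 1 - 2 * z          # (-1)^z
    return total


# Parameters used in the paper's simulations, for a trinomial of degree 21.
a, b, T, N = cluzeau_parameters(eps=0.1, p_f=2e-7, p_n=1e-5, d=3, i_last=21)
```

With these definitions $T=a\sqrt{N-i_{d-1}}$, so a candidate is stored in Step 4 when `abs(z_statistic(y[:int(N)], offsets)) > T`.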

According to [7], when the algorithm described above is used for a randomly chosen primitive polynomial $P(X)$ of degree $L$, the number of operations performed by the algorithm is $$W_p= d\cdot 2^{L} {(a+b\bar\sigma_l)^2\over (2\varepsilon)^{2d}}. \eqno{\hbox{(16)}}$$

To test the accuracy and runtime of Cluzeau's algorithm, it is applied to some feedback polynomials, and the results are shown in Table I. We will propose improved reconstruction procedures in the following sections. To make a fair comparison with Cluzeau's algorithm [13], we run Cluzeau's algorithm with the parameters $\varepsilon = 0.1$, false-alarm probability $P_f=2\cdot 10^{-7}$, nondetection probability $P_n=10^{-5}$, and weight of the sparse multiples of the feedback polynomial $d=3$, which are the same as those in [13].

TABLE I SIMULATION RESULTS OF CLUZEAU'S ALGORITHM

In Table I, the first, third, and last feedback polynomials are the same as those shown in Table III in [13]. Due to the difference in hardware (Intel dual core, 2.5 GHz), when testing Cluzeau's algorithm, the runtimes are a bit different from those presented by Cluzeau. In the following sections, we will test our improved reconstruction algorithm and compare the results, using the same parameters and hardware as those for Table I.

SECTION III

IMPROVE THE DETECTION CAPABILITY OF CLUZEAU'S ALGORITHM

From Table I, it can be observed that the third and fourth polynomials are not correctly recovered. To further understand why the original feedback polynomials are not detected, we look into the detected multiples of the feedback polynomials and results are shown in Table II.

TABLE II MULTIPLES OF THE FEEDBACK POLYNOMIALS

From Table II, it can be observed that for the third and fourth feedback polynomials, the second detected trinomial multiple is a multiple (square) of the first one. Therefore, their gcd is the first detected multiple instead of the feedback polynomial. For the feedback polynomials $x^{23}+x^{18}+1$ and $x^{29}+x^2+1$, the second detected trinomial multiple is also a multiple of the first one. However, as shown in Table I, they are correctly detected. This is because their first detected trinomial multiples, i.e., $x^{23}+x^{18}+1$ and $x^{29}+x^2+1$, are already the feedback polynomials themselves.

Based on the above observation, to avoid a wrong detection, the second detected multiple of the feedback polynomial should be ignored if it is a multiple of the first one. The only exception is when the first detected multiple of the feedback polynomial is irreducible, in which case the feedback polynomial is the detected multiple itself. Consequently, we can slightly modify Cluzeau's algorithm to improve its performance as follows:

  1. Follow the algorithm described in Section II until a sparse multiple $Q(X)$ is found.
  2. If $Q(X)$ is irreducible, then stop and return $Q(X)$ as the detected feedback polynomial; otherwise, store $Q(X)$ in a table and go to the next step.
  3. Search for another sparse multiple $Q'(X)$; if $Q'(X)$ is a multiple of $Q(X)$, ignore it and keep on searching until a $Q'(X)$ which is not a multiple of $Q(X)$ is found. Store $Q'(X)$ in the table.
  4. Compute gcd $(Q(X),Q'(X))=P(X)$. If $P(X) \neq 1$, return $P(X)$ as the detected feedback polynomial. Otherwise, go to Step 3.
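The polynomial checks in Steps 3 and 4 can be sketched with binary polynomials stored as Python integers, where bit $i$ holds the coefficient of $X^i$. The helper names `gf2_mul`, `gf2_mod`, and `gf2_gcd` and the small example polynomials are assumptions for illustration; the irreducibility test of Step 2 is not shown.

```python
def gf2_mul(a, b):
    """Carry-less product of two binary polynomials."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def gf2_mod(a, b):
    """Remainder of a divided by b over GF(2)."""
    if b == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    deg_b = b.bit_length()
    while a.bit_length() >= deg_b:
        a ^= b << (a.bit_length() - deg_b)   # cancel the leading term of a
    return a


def gf2_gcd(a, b):
    """Euclidean algorithm over GF(2)."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a


P = 0b1011                      # X^3 + X + 1, stands in for the feedback polynomial
Q = gf2_mul(P, 0b11)            # first detected multiple, P(X)(X + 1)
Q_square = gf2_mul(Q, Q)        # a second multiple that is a multiple of Q
Q_other = gf2_mul(P, 0b111)     # a second multiple that is not a multiple of Q

skip = gf2_mod(Q_square, Q) == 0        # Step 3: ignore this candidate
wrong = gf2_gcd(Q, Q_square)            # equals Q, the failure seen in Table II
right = gf2_gcd(Q, Q_other)             # equals P
```

The example reproduces the failure mode described above: taking the gcd with the square returns the first multiple, while a second multiple that passes the Step 3 filter yields $P(X)$.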

In the following, the algorithm described above will be called the “improved algorithm,” to differentiate it from Cluzeau's algorithm. Compared with Cluzeau's algorithm, the improved algorithm includes two extra steps. The first determines whether the detected multiple is irreducible. The second determines whether the second multiple is a multiple of the first one. The second step is rather simple because the number of operations (xors) it uses is bounded from above by $O(d\cdot D)$. For the first step, many algorithms have been proposed in the literature to test the reducibility of a binary polynomial [14], [15]. We use the algorithm proposed in [15], for which the number of operations is bounded from above by $O(d\cdot D^2)$. Compared with Cluzeau's algorithm, for which the number of operations is bounded from above by $O(D^{d-1}/(2\varepsilon)^{2d})$, the increase in complexity is negligible.

To evaluate the performance of the improved algorithm, the scrambler is simulated with every primitive polynomial of degree 8 to 16 used in turn as the feedback polynomial. The weight of the sparse multiple is chosen to be $d=3$, and the results obtained are plotted in Fig. 3.

Fig. 3. Comparison of Cluzeau's and improved algorithms.

From Fig. 3, it is apparent that the detection performance is significantly improved by using the improved algorithm. It should be noted that there are still feedback polynomials which cannot be correctly recovered by the improved algorithm. This situation occurs if the first detected multiple is $Q(X)=P(X)F(X)D(X)$, where $F(X)$ and $D(X)$ are any binary polynomials not equal to 1, and the second detected multiple is $Q'(X)=P(X)F(X)D'(X)$. If $D'(X)$ is not a multiple of $D(X)$, $Q'(X)$ is not a multiple of $Q(X)$ either. Therefore, $Q'(X)$ will not be ignored in the improved algorithm, and the gcd of $Q(X)$ and $Q'(X)$ is $P(X)F(X)$ instead of $P(X)$ itself. In any case, if the detected feedback polynomial is not primitive, say it is equal to $P(X)F(X)$, we can find the correct feedback polynomial by trying to recover the source bits using the polynomials $P(X)F(X)$, $P(X)$, and $F(X)$ as the feedback polynomials, respectively, and seeing which one leads to a sensible source sequence. This procedure includes the factorization of $P(X)F(X)$, which can be easily achieved with Berlekamp's algorithm [16].

SECTION IV

IMPROVE THE SPEED OF CLUZEAU'S ALGORITHM

According to (13) and (16), the number of bits and operations required to recover the feedback polynomial depends on the normalized upper bound $\bar\sigma_l$. The bigger the value of $\bar\sigma_l$, the larger the number of bits and operations required. In the following, we will show that the upper bound of $\sigma$ can be improved, and a new upper bound $\sigma'_l$, which approaches the actual value of $\sigma$ more closely, will be derived.

As described in Section II, $\sigma^2$ attains its maximum value in the worst case, i.e., $N_1=N_2=\cdots=N_{d-2}=0$ and $N_{d-1}\leq (N-i_{d-1})2d(d-1)$, where $N_u$ $(u=1,2,\ldots,d-1)$ denotes the number of pairs $(z_t,z_{t'})$ which share exactly $u$ terms of $x_t$. It is obvious that $N_u$ also denotes the number of pairs $(X^tQ(X),X^{t'}Q(X))$ which share exactly $u$ terms of $X^i$. Let $M_u$ denote the number of pairs $(Q(X), X^nQ(X))$, $(0 < n\leq i_{d-1})$, which share exactly $u$ terms of $X^i$ $(i=i_1, i_2,\ldots, i_{d-1})$. Since at each time instant $t$, $(X^tQ(X),X^{t'}Q(X))$, $(i_{d-1}\leq t < t'\leq N-1, 0 < t'-t\leq i_{d-1})$, can be taken as a time-shifted version of $(Q(X), X^nQ(X))$, $(0 < n\leq i_{d-1})$, we have $$N_u=2(N-i_{d-1})M_u. \eqno{\hbox{(17)}}$$ The factor 2 in (17) arises because $t$ and $t'$ are interchangeable.

In the following, the number of terms that the pair $(Q(X), X^nQ(X))$ have in common is studied. $Q(X)$ and $X^nQ(X)$ are given by $Q(X)=1+X^{i_1}+X^{i_2}+ \cdots +X^{i_{d-1}}$ and $X^nQ(X)=X^n+X^{n+i_1}+X^{n+i_2}+ \cdots +X^{n+i_{d-1}}$, respectively. Obviously, the constant term 1 is not shared by the pair $(Q(X), X^nQ(X))$. We first count how many times $X^{i_1}$ appears in $X^nQ(X)$. It appears only once, i.e., when $n=i_1$, $X^{i_1}$ appears in $X^{i_1}Q(X)$. Next, $X^{i_2}$ appears twice in $X^nQ(X)$, i.e., when $n=i_2$ or $n=i_2-i_1$, $X^{i_2}$ is shared by the pair $(Q(X), X^nQ(X))$. For the same reason, $X^{i_3}$, $X^{i_4},\ldots,X^{i_{d-1}}$ appear in $X^nQ(X)$ $3, 4, \ldots, d-1$ times, respectively. Table III summarizes the terms that the pair $(Q(X), X^nQ(X))$ have in common, the number of times each is shared, and the corresponding values of $n$.

Obviously, Table III includes all the terms that the pair $(Q(X), X^nQ(X))$ share for $0 < n\leq i_{d-1}$. It is also noted that for the same shared term, the possible values of $n$ are different since $0 < i_1 < i_2 < \cdots < i_{d-1}$. This means that there is no double counting of the shared terms. Therefore, the total number of shared terms for the pair $(Q(X), X^nQ(X))$ for $0 < n\leq i_{d-1}$ is $$M= {[1+(d-1)](d-1)\over 2}= {d(d-1)\over 2}. \eqno{\hbox{(18)}}$$

TABLE III SHARED TERMS OF THE PAIR $(Q(X), X^nQ(X))$ AND THE CORRESPONDING VALUE OF $n$

For different shared terms, the value of $n$ may be the same, e.g., when $i_1=i_2-i_1=i_3-i_2$, $Q(X)$ and $X^{i_1}Q(X)$ will share $X^{i_1}$, $X^{i_2}$, and $X^{i_3}$, i.e., three terms. Obviously, the maximum number of terms $Q(X)$ and $X^nQ(X)$ can share for the same value of $n$ is $d-1$. Since $M_u$ denotes the number of pairs $(Q(X), X^nQ(X))$, $(0 < n\leq i_{d-1})$, which share exactly $u$ terms of $X^i$, we have $$\sum_{u=1}^{d-1}u\cdot M_u=M= {d(d-1)\over 2}. \eqno{\hbox{(19)}}$$ Based on (17) and (19), we have $$\eqalignno{\sum_{u=1}^{d-1}u\cdot N_u & =2(N-i_{d-1}) \sum_{u=1}^{d-1} u\cdot M_u\cr & =(N-i_{d-1})\cdot d \cdot (d-1). &\hbox{(20)} }$$
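The counting argument behind (18) and (19) can be checked by brute force. The sketch below computes $M_u$ for a given $Q(X)$ by intersecting its exponent set with each shifted copy; the function name `shared_term_counts` and the two example polynomials are assumptions chosen for illustration.

```python
from collections import Counter


def shared_term_counts(exponents):
    """Return {u: M_u} for Q(X) with the given exponents (0 included).

    M_u is the number of shifts n, 0 < n <= i_{d-1}, for which Q(X) and
    X^n Q(X) have exactly u terms in common.
    """
    terms = set(exponents)
    counts = Counter()
    for n in range(1, max(terms) + 1):
        shared = len(terms & {e + n for e in terms})
        if shared:
            counts[shared] += 1
    return counts


def weighted_total(counts):
    """Left-hand side of (19): sum over u of u * M_u."""
    return sum(u * m for u, m in counts.items())


trinomial = shared_term_counts([0, 10, 21])        # Q(X) = 1 + X^10 + X^21, d = 3
equally_spaced = shared_term_counts([0, 5, 10, 15])  # i_1 = i_2 - i_1 = i_3 - i_2, d = 4
```

For the trinomial every shift shares at most one term, whereas the equally spaced example has one shift ($n=5$) that shares $d-1=3$ terms; in both cases the weighted total equals $d(d-1)/2$.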

From (5) and (20), it can be observed that the maximum value of $\sigma$ can be achieved only when $$N_u= \cases{0, & ${\rm when} \quad u < d-1$\cr (N-i_{d-1})d, & ${\rm when} \quad u=d-1$.} \eqno{\hbox{(21)}}$$ Substituting $u=d-1$ and $N_u=(N-i_{d-1})d$ into (5), we get $$\eqalignno{\sigma^2 & \leq(N-i_{d-1})(1-(2\varepsilon)^{2d})\cr & \quad +(N-i_{d-1})d((2\varepsilon)^2-(2\varepsilon)^{2d})\cr & \leq(N-i_{d-1})[1+d((2\varepsilon)^2-(2\varepsilon)^{2d})] .&\hbox{(22)} }$$

According to (22), the new upper bound $\sigma'_l$ of $\sigma$ is given by $$\sigma'_l=\sqrt{(N-i_{d-1})[1+d((2\varepsilon)^2-(2\varepsilon)^{2d})]} \eqno{\hbox{(23)}}$$ and the new normalized upper bound $\bar\sigma'_l$ is given by $$\bar\sigma'_l= {\sigma'_l\over \sqrt{N-i_{d-1}}}=\sqrt{1+d((2\varepsilon)^2-(2\varepsilon)^{2d})}. \eqno{\hbox{(24)}}$$

To see how closely the new upper bound of $\sigma$ derived above approaches the actual value, the Gaussian distributions with $\mu$ calculated by using (4) and standard deviation $\sigma$ equal to $\sigma_l$ and $\sigma'_l$, respectively, are plotted in Fig. 4. In Fig. 4, the actual distribution of $Z$, which is obtained by using $Q(X)=1+x^{10}+x^{21}$, $N=10\,000$, $\varepsilon=0.1$, and $d=3$, is also plotted. A total of 50 000 values of $Z$ are collected and a low-pass filter is used to smooth the curve. Other trinomials for $Q(X)$ are also tested in our simulations, and no significant difference is found in the distribution of $Z$ for different trinomials $Q(X)$. Therefore, the dashed curve in Fig. 4 can be taken as an approximation of the actual distribution of $Z$ for any $Q(X)$ with $d=3$.

Fig. 4. Distribution of $Z$ with different $\sigma$ ($N=10\,000$, $i_{d-1}=21$, $\varepsilon=0.1$, $d=3$).

From Fig. 4, it can be observed that the Gaussian distribution with $\sigma=\sigma'_l$ approaches the actual distribution of $Z$ very closely, whereas the Gaussian distribution with $\sigma=\sigma_l$ deviates from the actual distribution by quite a margin. It should be noted that when $d$ increases, $\sigma_l$ deviates even more from $\sigma$, as shown in Fig. 5, but $\sigma'_l$ still approaches $\sigma$ very closely.

Fig. 5. Distribution of $Z$ with different $\sigma$ ($N=10\,000$, $i_{d-1}=31$, $\varepsilon=0.1$, $d=4$).

When using the new upper bound, the threshold $T$, the number of bits $N$, and the number of operations $W_p$ required by the algorithm can all be reduced, and consequently the runtime of the algorithm can be reduced. The new threshold $\tilde{T}$ can be obtained by replacing $\bar\sigma_l$ in (9) by $\bar\sigma'_l$, and the result is given by $$\eqalignno{\tilde{T} & = {a(a+b\bar\sigma'_l)\over (2\vert \varepsilon\vert)^{d}}\cr & = {a^2+ab\sqrt{1+d((2\varepsilon)^2-(2\varepsilon)^{2d})}\over (2\vert \varepsilon\vert)^{d}}. &\hbox{(25)} }$$ Similarly, the revised number of bits $\tilde{N}$ and operations $\tilde{W}_p$ required by the algorithm are given by $$\eqalignno{\tilde{N} & =i_{d-1}+ {\left(a+b\sqrt{1+d((2\varepsilon)^2-(2\varepsilon)^{2d})}\right)^2\over (2\varepsilon)^{2d}} &\hbox{(26)}\cr \noalign{\hbox{and}} \tilde{W}_p & =d\cdot 2^{L} {\left(a+b\sqrt{1+d((2\varepsilon)^2-(2\varepsilon)^{2d})}\right)^2\over (2\varepsilon)^{2d}}. &\hbox{(27)} }$$

Comparing (16) and (27), it can be observed that the reduction factor $R$ for the number of operations required to recover the feedback polynomial is given by $$R= {W_p\over \tilde{W}_p}\approx {(\gamma+q)^2\over (1+q)^2} \eqno{\hbox{(28)}}$$ where $\gamma=\sqrt{1+2d(d-1)}$ and $q= {(a/b)}$.

In Fig. 6, the values of $R$ against $d$ are plotted. According to the simulation setup described in Section II, $q\approx 1.2$. It can be observed that $R$ increases with $d$. This is because $\gamma$ increases with $d$.
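The reduction factor can be evaluated numerically from (8), (16), (24), (27), and (28). The sketch below is an illustration only: the function name `reduction_factor` is hypothetical, and $\Phi^{-1}$ is taken from the standard library's `statistics.NormalDist().inv_cdf`.

```python
from math import sqrt
from statistics import NormalDist


def reduction_factor(eps, p_f, p_n, d):
    """Return (exact, approx): W_p / W~_p from (16) and (27), and the estimate (28)."""
    phi_inv = NormalDist().inv_cdf
    a = phi_inv(1.0 - p_f / 2.0)                                          # (10)
    b = -phi_inv(p_n)                                                     # (11)
    old_bound = sqrt((1 + 2 * d * (d - 1)) * (1 - (2 * eps) ** (2 * d)))  # (8)
    new_bound = sqrt(1 + d * ((2 * eps) ** 2 - (2 * eps) ** (2 * d)))     # (24)
    exact = ((a + b * old_bound) / (a + b * new_bound)) ** 2
    q = a / b
    gamma = sqrt(1 + 2 * d * (d - 1))
    approx = ((gamma + q) / (1 + q)) ** 2                                 # (28)
    return exact, approx


# Simulation parameters of Section II with d = 3 and d = 4.
exact_d3, approx_d3 = reduction_factor(eps=0.1, p_f=2e-7, p_n=1e-5, d=3)
exact_d4, approx_d4 = reduction_factor(eps=0.1, p_f=2e-7, p_n=1e-5, d=4)
```

The approximation (28) replaces $\bar\sigma_l$ by $\gamma$ and $\bar\sigma'_l$ by 1, so it slightly overestimates the ratio computed directly from (16) and (27) when $\varepsilon$ is not very small.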

In the following, the improved algorithm, which uses $\tilde{T}$ and $\tilde{N}$ instead of $T$ and $N$, is tested by using the same feedback polynomials as those in Table I, and the results are shown in Table IV. The second and third columns of Table IV show the detected feedback polynomials and the runtime required by the improved algorithm. The fourth column of Table IV shows the runtime required by the improved algorithm using $\tilde{T}$ and $\tilde{N}$ instead of $T$ and $N$.

TABLE IV SIMULATION RESULTS OF THE IMPROVED ALGORITHM USING $\tilde{T}$ AND $\tilde{N}$ INSTEAD OF $T$ AND $N$

Fig. 6. Reduction factor $(R)$ for the number of operations for different values of $d$ ($q\approx 1.2$).

First, we compare the first three columns of Table I with Table IV. The first and second feedback polynomials, which are correctly detected by Cluzeau's algorithm, are still correctly detected by the improved algorithm, and the runtime is not affected. The third and fourth feedback polynomials, which are wrongly detected by Cluzeau's algorithm, are correctly detected by the improved algorithm, although the runtime is increased. For the last two feedback polynomials, the runtime is even reduced by the improved algorithm, because their first detected multiples are irreducible and the algorithm stops after the first step. We then compare the third and fourth columns in Table IV. It can be observed that the time required for the detection is reduced by a factor of about 4.5 for all the primitive polynomials when using $\tilde{T}$ and $\tilde{N}$ instead of $T$ and $N$ in the detection process. This result matches very well with the reduction factor for the number of operations required to recover the feedback polynomial shown in Fig. 6. It should be noted that when using $\tilde{T}$ and $\tilde{N}$, the detected polynomials are still the same as those shown in the second column of Table IV. In general, comparing Tables I and IV, it is clear that with our proposed schemes, both the detection capability and the speed of Cluzeau's algorithm are improved significantly.

SECTION V

RECOVERY OF SCRAMBLER IN THE PRESENCE OF CHANNEL NOISE

A. Recover the Scrambler With Flipped Bits

In the algorithm described in Section II, the scrambled bit sequence $y_t$ $(t > 0)$ is assumed to be correctly received at the receiver. In this section, we consider the situation in which channel noise is present, as depicted in Fig. 7.

Fig. 7. Chain of scrambler and channel.

In Fig. 7, the channel is modeled as a binary symmetric channel (BSC). The input to the BSC is denoted by $y_t$ and the output of the channel is denoted by $y'_t$, which is given as $$y'_t=y_t\oplus e_t \eqno{\hbox{(29)}}$$ where $e_t$ is the channel error at time instant $t$. Suppose the channel error probability (or crossover probability) is $p$; then at each time instant $t$ we have $$\eqalignno{{\rm Pr}(e_t=1)& =p &\hbox{(30)}\cr \noalign{\hbox{and}} {\rm Pr}(e_t=0)& =1-p. &\hbox{(31)} }$$ Similar to (14), we have $$\eqalignno{z'_t& =y'_t\oplus\bigoplus_{j=1}^{d-1}y'_{t-i_j}\cr & =(y_t\oplus e_t)\oplus\bigoplus_{j=1}^{d-1}(y_{t-i_j}\oplus e_{t-i_j}) &\hbox{(32)} }$$ where $y_t=x_t\oplus s_t$. When $Q(X)$ is a multiple of $P(X)$, we have $s_t\oplus\bigoplus_{j=1}^{d-1}s_{t-i_j}=0$ and thus $$z'_t=(x_t\oplus e_t)\oplus\bigoplus_{j=1}^{d-1}(x_{t-i_j}\oplus e_{t-i_j}). \eqno{\hbox{(33)}}$$ Letting $x'_t=x_t\oplus e_t$, (33) becomes $$z'_t=x'_t\oplus\bigoplus_{j=1}^{d-1}x'_{t-i_j}. \eqno{\hbox{(34)}}$$ Comparing with the noiseless case, in which we have $$z_t=x_t\oplus\bigoplus_{j=1}^{d-1}x_{t-i_j} \eqno{\hbox{(35)}}$$ we can see that the only difference is that $x_t$ is replaced by $x'_t$, which can be regarded as the result of passing $x_t$ through a BSC. As stated previously, the reconstruction of the scrambler is based on the assumption that $x_t$ is biased, with ${\rm Pr}\,(x_t=0)= {(1/2)}+\varepsilon$. When $x_t$ is replaced by $x'_t$, we need to examine the distribution of $x'_t$ and check whether it is also biased.

Let us introduce another variable $\delta$ such that $\delta= {(1/2)}-p$. We then have $$\eqalignno{{\rm Pr}(e_t=1)& = {1\over 2}-\delta &\hbox{(36)}\cr \noalign{\hbox{and}} {\rm Pr}(e_t=0)& ={1\over 2}+\delta. &\hbox{(37)} }$$

According to the distributions of $x_t$ and $e_t$, we have $$\eqalignno{{\rm Pr}(x'_t=1)& = {1\over 2}-2\varepsilon\delta &\hbox{(38)}\cr \noalign{\hbox{and}} {\rm Pr}(x'_t=0)& = {1\over 2}+2\varepsilon\delta. &\hbox{(39)} }$$

From the above equations, we can see that $x'_t$ is also biased, with a new bias $\varepsilon'=2\varepsilon\delta$. This means that when channel noise is present, we can still recover the synchronous scrambler by using the method proposed for the noiseless condition. The only difference is that the source bias is changed from $\varepsilon$ to $2\varepsilon\delta$, where $\delta$ is determined by the channel error probability $p$.

As $p\leq 0.5$ and $\delta= {(1/2)}-p$, we have $0\leq 2\delta\leq 1$, and hence $2\varepsilon\delta\leq \varepsilon$. Equality holds only when $\delta=0.5$, i.e., $p=0$. Therefore, when $p > 0$, we have $\varepsilon' < \varepsilon$. According to (26) and (27), the smaller the bias, the larger the number of bits and operations required in the reconstruction. Therefore, when $\varepsilon$ is changed to $\varepsilon'$, the number of bits required in the reconstruction becomes larger and the time required for the detection becomes longer. When $\tilde{N}\gg i_{d-1}$, which is usually true for practical systems, the factor of increase $(I)$ in the number of bits can be derived as $$I= {1\over (2\delta)^{2d}}= {1\over (1-2p)^{2d}}. \eqno{\hbox{(40)}}$$
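The bias reduction in (38) and the factor of increase in (40) can be verified with a few lines of Python. The function names `noisy_bias` and `increase_factor` are hypothetical, and the check assumes that the channel errors are independent of the source bits.

```python
def noisy_bias(eps, p):
    """Return the bias of x'_t computed directly and from the closed form 2*eps*delta."""
    delta = 0.5 - p
    # x'_t = 1 when exactly one of x_t and e_t equals 1.
    pr_one = (0.5 - eps) * (1.0 - p) + (0.5 + eps) * p
    return 0.5 - pr_one, 2.0 * eps * delta


def increase_factor(p, d):
    """Factor of increase I in the number of bits, as in (40)."""
    return 1.0 / (1.0 - 2.0 * p) ** (2 * d)


direct, closed_form = noisy_bias(eps=0.1, p=0.05)
factor_no_noise = increase_factor(p=0.0, d=3)
factor_noisy = increase_factor(p=0.1, d=3)      # equals 1 / 0.8**6
```

Since the required number of bits scales as $1/(2\varepsilon)^{2d}$, replacing $\varepsilon$ by $2\varepsilon\delta$ multiplies it by $1/(2\delta)^{2d}$, which is the quantity returned by `increase_factor`.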

In Fig. 8, the values of Formula$I$ are plotted for different values of Formula$p$ and Formula$d$ when channel errors are present. It can be observed that the factor of increase in the number of bits used in the reconstruction grows with increases in Formula$p$ and Formula$d$. In practical situations, the channel error probability can vary from a very small value to about 0.2 when Formula$E_b/N_o = 0$ dB [17], and, therefore, Formula$I$ may vary from 1 to 100 depending on the values of Formula$p$ and Formula$d$. To make an appropriate choice of Formula$I$, an accurate estimation of the statistical properties of the channel is, therefore, needed.

Next, we investigate the impact on Formula$P_f$ and Formula$P_n$ when channel errors are present but the number of bits used is not increased correspondingly. According to (10) and the new upper bound of Formula$\sigma$ derived in Section IV, we have Formula TeX Source $$a=\Phi^{-1}\left(1- {P_f\over 2}\right)= {\mathtilde{T}\over \sqrt{\mathtilde{N}-i_{d-1}}}. \eqno{\hbox{(41)}}$$ As Formula$\mathtilde{T}$ and Formula$\mathtilde{N}$ are precalculated, they do not change when the source bias changes; therefore, Formula$P_f$ is not affected by the channel noise.

According to (11) and the distribution of Formula$Z$, we have Formula TeX Source $$-b=\Phi^{-1}(P_n)= {\mathtilde{T}-\mu\over \sigma} \eqno{\hbox{(42)}}$$ where Formula$\mathtilde{T}$ is given by (25), Formula$\mu$ by (4), and Formula$\sigma$ by (22). When channel noise is present, the source bias will change from Formula$\varepsilon$ to Formula$2\varepsilon\delta$. Therefore, the Formula$\varepsilon$ in (4) and (22) must be replaced by Formula$2\varepsilon\delta$, and the resulting new mean value Formula$\mu_{e}$ of Formula$Z$ is Formula TeX Source $$\mu_{e}=(\mathtilde{N}-i_{d-1})(4\varepsilon\delta)^d=(2\delta)^d\mu \eqno{\hbox{(43)}}$$ and the new upper bound Formula$\sigma_{e}$ of the standard deviation of Formula$Z$ is Formula TeX Source $$\eqalignno{\sigma_{e}& =\sqrt{(\mathtilde{N}-i_{d-1})[1+d((4\varepsilon\delta)^2-(4\varepsilon\delta)^{2d})]}\cr & \approx \sqrt{\mathtilde{N}-i_{d-1}}\cdot\bar\sigma'_l. &\hbox{(44)} }$$ From (43) and (44), it can be observed that when channel noise is present, the standard deviation of Formula$Z$ will not be significantly affected, but the mean value of Formula$Z$ will become smaller. This will result in a smaller value of Formula$b$, and thus Formula$P_n$ will increase. As Formula$2\delta\leq 1$, we have Formula TeX Source $$\eqalignno{& {\mathtilde{T}-\mu_{e}\over \sigma} ={\mathtilde{T}-(2\delta)^d\mu\over \sigma}\geq {(2\delta)^d (\mathtilde{T}-\mu)\over \sigma} =(2\delta)^d(-b).\cr &&\hbox{(45)} }$$ Therefore, the new value of Formula$P_n$ when noise is present, i.e., Formula$P^e_n$, can be roughly estimated by Formula TeX Source $$P^e_n=\Phi\left({\mathtilde{T}-\mu_{e}\over \sigma}\right)\geq\Phi ((2\delta)^d(-b)). \eqno{\hbox{(46)}}$$ In Fig. 9, the values of Formula$P^e_n$ are plotted for different values of Formula$p$ and Formula$d$. The number of bits Formula$\mathtilde{N}$ used in the recovery is set to satisfy the condition that when Formula$p=0$, the nondetection probability is equal to Formula$10^{-5}$.

Fig. 8. Factor of increase Formula$(I)$ of the number of bits for different Formula$p$ and Formula$d$ when channel errors are present Formula$(\mathtilde{N}\gg i_{d-1})$.
Fig. 9. Values of Formula$P^e_n$ for different values of Formula$p$ and Formula$d$ Formula$(\varepsilon=0.1, P_f=2\cdot 10^{-7})$.

From Fig. 9, it is clear that as the channel error probability increases, the nondetection probability Formula$P^e_n$ increases rapidly if the number of bits used in the recovery process is not increased correspondingly. When the channel error probability is 0.1, more than half of the sparse multiples of the feedback polynomial will not be detected. When the channel error probability is 0.15 or larger, almost all the sparse multiples will not be detected. Therefore, as stated before, in practical situations, channel estimation should be used to make sure that the number of bits used in the recovery process is properly chosen.
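The trend described above can be reproduced approximately from (41) to (46). The Python sketch below is a rough estimate under an explicit assumption: the standard deviation of the test statistic is taken to be the square root of the number of terms summed, both with and without noise, so that the threshold divided by the standard deviation equals a and the noiseless mean divided by the standard deviation equals a + b. Under that assumption the normalized distance in (46) becomes a - (2 delta)^d (a + b). The function name `nondetection_with_noise` is hypothetical, and the default values of `p_f` and `p_n` follow the setting quoted for Fig. 9.

```python
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def nondetection_with_noise(p: float, d: int, p_f: float = 2e-7, p_n: float = 1e-5):
    """Return (estimate, lower_bound) of the nondetection probability under noise.

    Assumption (not from the source): sigma is approximated by
    sqrt(N - i_{d-1}) in both the noiseless and the noisy case.
    """
    a = PHI.inv_cdf(1.0 - p_f / 2.0)   # threshold in units of sigma, as in (41)
    b = -PHI.inv_cdf(p_n)              # from (42), for the noiseless design point
    shrink = (1.0 - 2.0 * p) ** d      # (2 * delta)^d, the mean reduction in (43)
    estimate = PHI.cdf(a - shrink * (a + b))
    lower_bound = PHI.cdf(-shrink * b)  # right-hand side of (46)
    return estimate, lower_bound


if __name__ == "__main__":
    for p in (0.0, 0.05, 0.1, 0.15):
        est, low = nondetection_with_noise(p, d=3)
        print(f"p={p}: estimate={est:.3g}, lower bound={low:.3g}")
```

With d = 3 this sketch gives an estimate above one half at p = 0.1, which is in line with the behavior read off Fig. 9; the lower bound from (46) is considerably looser than the estimate.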

B. Recovery of Scrambler When Insertion of Bits Occurs

During data transmission, another kind of error can occur, namely the insertion/deletion of one or more bits into/from the scrambled bit sequence. In this section, for simplicity, only the insertion of one bit is considered. However, our ideas can easily be extended to the insertion of more than one bit and also to the deletion of one or more bits. Suppose that a bit Formula$c_x$ is inserted at time index Formula$t_x (t_x\geq 0)$. The received sequence then becomes Formula TeX Source $$y_0,y_1, y_2,\ldots, y_{t_x-1},c_x, y_{t_x}, y_{t_x+1},\ldots$$ Denoting the received sequence by Formula$y'_t$, we have Formula TeX Source $$y'_t= \cases{y_t, & $t\leq t_x-1$ \cr c_x, & $t=t_x$\cr y_{t-1}, & $t\geq t_x+1$.} \eqno{\hbox{(47)}}$$ For Formula$(i_1,\ldots,i_{d-1})$, when Formula$t\leq t_x-1$, we then have Formula TeX Source $$z'_t=y'_t\oplus\bigoplus_{j=1}^{d-1} y'_{t-i_j}=y_t\oplus\bigoplus_{j=1}^{d-1} y_{t-i_j}. \eqno{\hbox{(48)}}$$

When Formula$t\geq t_x+i_{d-1}+1$, we will have Formula TeX Source $$z'_t=y'_t\oplus\bigoplus_{j=1}^{d-1} y'_{t-i_j}=y_{t-1}\oplus\bigoplus_{j=1}^{d-1} y_{t-i_j-1}. \eqno{\hbox{(49)}}$$ Comparing with the noiseless case, in which we have Formula$z_t=y_t\oplus\bigoplus_{j=1}^{d-1}y_{t-i_j}$, we can see that Formula$z'_t=z_t$ when Formula$t\leq t_x-1$, and Formula$z'_t=z_{t-1}$ when Formula$t\geq t_x+i_{d-1}+1$. As the final decision is based on the value of Formula$Z$, we are more interested in the difference between the summations Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z_t}$ and Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z'_t}$. Assuming the number of bits of Formula$\{z_{i_{d-1}},\ldots,z_{\mathtilde{N}-1}\}$ that are different from Formula$\{z'_{i_{d-1}},\ldots,z'_{\mathtilde{N}-1}\}$ is Formula$d_z$, then the density, Formula$P_{(t_x,i_{d-1},\mathtilde{N})}= {(d_z)/(\mathtilde{N}-i_{d-1})}$, is dependent on the values of Formula$t_x$, Formula$i_{d-1}$, and Formula$\mathtilde{N}$. For different values of Formula$t_x$, Formula$i_{d-1}$, and Formula$\mathtilde{N}$, Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$ has the following four different expressions:
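The two index relations stated above (unchanged values before the insertion point and values shifted by one position well after it) can be checked on a toy sequence. In the Python sketch below, the tap set `taps`, the sequence length, the insertion index `t_x`, and the inserted bit `c_x` are arbitrary illustrative values, and the random bits merely stand in for a scrambled sequence.

```python
import random


def parity_checks(y, taps):
    """Return z with z[t] = y[t] XOR y[t - i_1] XOR ... XOR y[t - i_{d-1}].

    Entries with t < max(taps) are left as None because they are undefined.
    """
    top = max(taps)
    z = [None] * len(y)
    for t in range(top, len(y)):
        bit = y[t]
        for i in taps:
            bit ^= y[t - i]
        z[t] = bit
    return z


rng = random.Random(0)
y = [rng.getrandbits(1) for _ in range(200)]  # stand-in for the scrambled bits
taps = (3, 7)                                 # hypothetical (i_1, i_2), so d = 3
t_x, c_x = 50, 1                              # insertion index and inserted bit

y_ins = y[:t_x] + [c_x] + y[t_x:]             # received sequence, as in (47)
z = parity_checks(y, taps)
z_ins = parity_checks(y_ins, taps)
top = max(taps)

# (48): nothing changes before the insertion point.
unchanged_before = all(z_ins[t] == z[t] for t in range(top, t_x))
# (49): far enough after the insertion point, the values are shifted by one.
shifted_after = all(z_ins[t] == z[t - 1] for t in range(t_x + top + 1, len(y_ins)))
print(unchanged_before, shifted_after)
```

Only the window of indices from `t_x` to `t_x + top` mixes bits from both sides of the insertion, which is why the analysis that follows concentrates on that window.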

  1. Formula$0\leq t_x < i_{d-1}+1$ and Formula$t_x+i_{d-1}<\mathtilde{N}-1$. In this case, Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z'_t}$ can be written as Formula TeX Source $$\eqalignno{\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z'_t}& =\sum_{t=i_{d-1}}^{t_x+i_{d-1}}z'_t+\sum_{t=t_x+i_{d-1}+1}^{\mathtilde{N}-1}z'_t\cr & =\sum_{t=i_{d-1}}^{t_x+i_{d-1}}z'_t+\sum_{t=t_x+i_{d-1}}^{\mathtilde{N}-2}z_t. &\hbox{(50)} }$$ Considering the worst case, i.e., Formula$z'_t$ is independent of Formula$z_t ({\rm Pr}\, (z'_t=z_t)=0.5)$ when Formula$i_{d-1}\leq t\leq {t_x+i_{d-1}}$, it can be seen that the number of bits that are different in Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z_t}$ and Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z'_t}$ is approximately Formula$0.5(t_x+1)$ and Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$ is given by Formula TeX Source $$P_{(t_x,i_{d-1},\mathtilde{N})}= {0.5(t_x+1)\over \mathtilde{N}-i_{d-1}}. \eqno{\hbox{(51)}}$$
  2. Formula$0\leq t_x < i_{d-1}+1$ and Formula$t_x+i_{d-1}\geq\mathtilde{N}-1$. In this case, the number of bits that are different in Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z_t}$ and Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z'_t}$ is approximately Formula${(\mathtilde{N}-i_{d-1})/(2)}$ and Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$ is given by Formula TeX Source $$P_{(t_x,i_{d-1},\mathtilde{N})}=0.5. \eqno{\hbox{(52)}}$$
  3. Formula$i_{d-1}+1\leq t_x < \mathtilde{N}$ and Formula$t_x+i_{d-1} < \mathtilde{N}-1$. In this case, Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z'_t}$ can be written as Formula TeX Source $$\eqalignno{\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z'_t}& =\sum_{t=i_{d-1}}^{t_x-1}z'_t+\sum_{t=t_x}^{t_x+i_{d-1}}z'_t+\sum_{t=t_x+i_{d-1}+1}^{\mathtilde{N}-1}z'_t\cr & =\sum_{t=i_{d-1}}^{t_x-1}z_t+\sum_{t=t_x}^{t_x+i_{d-1}}z'_t+\sum_{t=t_x+i_{d-1}}^{\mathtilde{N}-2}z_t. &\hbox{(53)} }$$ The number of bits that are different in Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z_t}$ and Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z'_t}$ is approximately Formula$0.5(i_{d-1}+1)$ and Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$ is given by Formula TeX Source $$P_{(t_x,i_{d-1},\mathtilde{N})}= {0.5(i_{d-1}+1)\over \mathtilde{N}-i_{d-1}}. \eqno{\hbox{(54)}}$$
  4. Formula$i_{d-1}+1\leq t_x < \mathtilde{N}$ and Formula$t_x+i_{d-1}\geq\mathtilde{N}-1$. In this case, Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z'_t}$ can be written as Formula TeX Source $$\eqalignno{\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z'_t}& =\sum_{t=i_{d-1}}^{t_x-1}z'_t+\sum_{t=t_x}^{\mathtilde{N}-1}z'_t\cr & =\sum_{t=i_{d-1}}^{t_x-1}z_t+\sum_{t=t_x}^{\mathtilde{N}-1}z'_t. &\hbox{(55)} }$$ The number of bits that are different in Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z_t}$ and Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z'_t}$ is approximately Formula$0.5(\mathtilde{N}-t_x)$ and Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$ is given by Formula TeX Source $$P_{(t_x,i_{d-1},\mathtilde{N})}= {0.5(\mathtilde{N}-t_x)\over \mathtilde{N}-i_{d-1}}. \eqno{\hbox{(56)}}$$
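The four cases above can be collected into a single piecewise function. The following Python sketch does this under the same worst-case assumption used in the derivation (half of the affected values differ); the names `mismatch_density`, `i_top` (for the largest tap index) and `n` (for the number of bits used) are chosen here for illustration.

```python
def mismatch_density(t_x: int, i_top: int, n: int) -> float:
    """Worst-case density of differing z values for one inserted bit, per (51)-(56)."""
    span = n - i_top                      # number of z values that are summed
    if span <= 0 or not 0 <= t_x < n:
        raise ValueError("require 0 <= t_x < n and n > i_top")
    early = t_x < i_top + 1               # insertion inside the first i_top + 1 bits
    short_tail = t_x + i_top >= n - 1     # affected window runs past the end
    if early and not short_tail:          # case 1, (51)
        differing = 0.5 * (t_x + 1)
    elif early and short_tail:            # case 2, (52)
        differing = 0.5 * span
    elif not short_tail:                  # case 3, (54)
        differing = 0.5 * (i_top + 1)
    else:                                 # case 4, (56)
        differing = 0.5 * (n - t_x)
    return differing / span


if __name__ == "__main__":
    # Long observation window: the density stays small for every insertion point.
    print(max(mismatch_density(t, i_top=10, n=1000) for t in range(1000)))
    # Short observation window: the density reaches 0.5 near the middle.
    print(max(mismatch_density(t, i_top=10, n=15) for t in range(15)))
```

Evaluating the function over all insertion points shows a ramp up, a plateau, and a ramp down when the window is long, and a plateau at 0.5 when the window is short.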

In Figs. 10 and 11, the variations of Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$ versus Formula$t_x$ according to our analysis above are depicted. When Formula$\mathtilde{N}\geq 2(i_{d-1}+1)$, the variation of Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$ versus Formula$t_x$ will follow the curve shown in Fig. 10 and when Formula$\mathtilde{N} < 2(i_{d-1}+1)$, the variation of Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$ versus Formula$t_x$ will follow the curve shown in Fig. 11.

From both figures, it can be observed that the insertion of one bit has the least impact on the summation Formula$\sum_{t=i_{d-1}}^{\mathtilde{N}-1}{z_t}$ when the bit is inserted at either end of the bit sequence. As the insertion point moves from either end toward the middle of the bit sequence, the impact increases and finally reaches a maximum value, which depends on the values of Formula$i_{d-1}$ and Formula$\mathtilde{N}$. According to [7], for a randomly chosen primitive polynomial of degree Formula$L$, the minimum value of Formula$i_{d-1}$ is Formula TeX Source $$i_{d-1}=((d-1)!)^{\displaystyle{{1\over d-1}}}2^{\displaystyle{{L\over d-1}}}. \eqno{\hbox{(57)}}$$
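For reference, (57) is easy to evaluate numerically. The Python sketch below does so for a few degrees; the function name `top_tap_estimate` and the sample values of `d` and `L` are illustrative only.

```python
import math


def top_tap_estimate(d: int, L: int) -> float:
    """Evaluate (57): ((d - 1)!)^(1 / (d - 1)) * 2^(L / (d - 1))."""
    if d < 2:
        raise ValueError("d must be at least 2")
    return math.factorial(d - 1) ** (1.0 / (d - 1)) * 2.0 ** (L / (d - 1))


if __name__ == "__main__":
    for L in (10, 20, 30):
        print(L, [round(top_tap_estimate(d, L)) for d in (3, 4, 5)])
```

The exponential dependence on L is what eventually makes the largest tap index comparable to the number of bits used.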

Fig. 10. Variations of Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$ versus Formula$t_x (\mathtilde{N}\geq 2(i_{d-1}+1))$.
Fig. 11. Variations of Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$ versus Formula$t_x (\mathtilde{N} < 2(i_{d-1}+1))$.

For a small value of Formula$L$, normally Formula$\mathtilde{N}\geq 2(i_{d-1}+1)$ holds, and from Fig. 10 it can be observed that the maximum value of Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$ is Formula$(0.5(i_{d-1}+1))/(\mathtilde{N}-i_{d-1})$. As Formula$L$ increases, Formula$i_{d-1}$ increases exponentially with Formula$L$, so that eventually Formula$\mathtilde{N} < 2(i_{d-1}+1)$ is satisfied. In this case, when Formula$\mathtilde{N}-(i_{d-1}+1)\leq t_x < i_{d-1}+1$, Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$ will be as high as 0.5.

In the following, the impact of Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$ on the performance of the reconstruction is discussed. For simplicity, we consider the insertion of one bit in the scrambled bit sequence to be equivalent to passing Formula$y_t$ through a BSC with channel error probability Formula$p_{\rm eq}$. Let Formula$P_{(p_{\rm eq})}$ denote the density of the number of different bits in Formula$\{z_{i_{d-1}},\ldots,z_{\mathtilde{N}-1}\}$ and Formula$\{z'_{i_{d-1}},\ldots,z'_{\mathtilde{N}-1}\}$ when passing Formula$y_t$ through a BSC with channel error probability Formula$p_{\rm eq}$; from (32) and (35), we have Formula TeX Source $$\eqalignno{P_{(p_{\rm eq})}& ={\rm Pr}(z'_t\oplus z_t=1)\cr & ={\rm Pr}\left(e_t\oplus\bigoplus_{j=1}^{d-1}e_{t-i_j}=1\right)= {1\over 2}\left[1-(2\delta_{\rm eq})^d\right]\cr &&\hbox{(58)} }$$ where Formula$\delta_{\rm eq}= {1\over 2}-p_{\rm eq}$. Setting Formula$P_{(p_{\rm eq})}=P_{(t_x,i_{d-1},\mathtilde{N})}$, it can be derived that Formula TeX Source $$\delta_{\rm eq}= {1\over 2} \root{d}\of{{1-2P_{(t_x,i_{d-1},\mathtilde{N})}}} \eqno{\hbox{(59)}}$$ and consequently we have Formula TeX Source $$p_{\rm eq}= {1\over 2}-\delta_{\rm eq}= {1\over 2}\left(1-\root{d}\of{1-2P_{(t_x,i_{d-1},\mathtilde{N})}}\right). \eqno{\hbox{(60)}}$$
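The mapping in (60) and its inverse in (58) can be written as two short functions, which makes it easy to confirm that they undo each other. This Python sketch uses the hypothetical names `equivalent_error_probability` and `density_from_bsc`; the sample densities are arbitrary.

```python
def density_from_bsc(p_eq: float, d: int) -> float:
    """Evaluate (58): probability that the XOR of d error bits equals 1."""
    return 0.5 * (1.0 - (1.0 - 2.0 * p_eq) ** d)


def equivalent_error_probability(density: float, d: int) -> float:
    """Evaluate (60): the BSC error probability that gives the same density."""
    if not 0.0 <= density <= 0.5:
        raise ValueError("density must lie in [0, 0.5]")
    return 0.5 * (1.0 - (1.0 - 2.0 * density) ** (1.0 / d))


if __name__ == "__main__":
    d = 3
    for density in (0.0, 0.01, 0.1, 0.5):
        p_eq = equivalent_error_probability(density, d)
        # The second printed value should reproduce the input density.
        print(density, p_eq, density_from_bsc(p_eq, d))
```

A density of 0.5 maps to an equivalent error probability of 0.5, the point at which the effective bias vanishes.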

According to (60), for each value of Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$, there is a corresponding channel error probability Formula$p_{\rm eq}$. The bigger the value of Formula$P_{(t_x,i_{d-1},\mathtilde{N})}$, the bigger the value of Formula$p_{\rm eq}$. Details on the impact of the channel error probability on the performance of the reconstruction can be found in Section V-A. For most practical systems, we have Formula$\mathtilde{N}\gg i_{d-1}$ and Formula${(0.5(i_{d-1}+1))/(\mathtilde{N}-i_{d-1})}\approx 0$. According to Fig. 10, Formula$P_{(t_x,i_{d-1},\mathtilde{N})}\leq {(0.5(i_{d-1}+1))/(\mathtilde{N}-i_{d-1})}$; therefore, for any value of Formula$t_x$ we have Formula$P_{(t_x,i_{d-1},\mathtilde{N})}\approx 0$ and hence Formula$p_{\rm eq}\approx 0$. This means that when Formula$\mathtilde{N}\gg i_{d-1}$, the performance of the reconstruction is not affected by the insertion of one bit, no matter where the bit is inserted. However, when Formula$i_{d-1}$ becomes larger, especially when Formula$i_{d-1}\geq 0.5\mathtilde{N}$, the value of Formula$p_{\rm eq}$ varies from 0 to 0.5 depending on where the bit is inserted. In the worst case, i.e., Formula$p_{\rm eq}=0.5$, the reconstruction will never succeed no matter how big the bias Formula$\varepsilon$ is, because, as described in Section V-A, when Formula$p_{\rm eq}=0.5$ we have Formula$\delta_{\rm eq}=0$, and after passing through the BSC the new bias becomes Formula$2\varepsilon\delta_{\rm eq}=0$. In this case, the two distributions depicted in Fig. 2 overlap and, therefore, no multiple of the feedback polynomial can be detected.

SECTION VI

CONCLUSION

Cluzeau's algorithm is very promising in reconstructing the feedback polynomial of the LFSR used in a linear scrambler. In this paper, a scheme to improve the detection capability of Cluzeau's algorithm is proposed. Simulation results show that the detection capability is significantly improved by using the proposed scheme. A tighter upper bound for Formula$\sigma$ which approaches the actual Formula$\sigma$ more closely has also been derived. By using the new upper bound, the number of bits and operations required by Cluzeau's algorithm to reconstruct the feedback polynomial of the LFSR can be reduced significantly without affecting the detection capability. As the number of operations is reduced, the time required for reconstruction is also reduced. According to our analysis, the higher the weight of the sparse multiples to be searched, the higher the time reduction factor.

It should be noted that even with the improved algorithm, the value of Formula$\varepsilon$ significantly affects the number of bits and operations and, in turn, the running time of the algorithm. For example, when the source bias is reduced from 0.1 to 0.01, the number of operations required for the reconstruction increases by a factor of at least Formula$10^6$. In this case, even for a feedback polynomial of very small degree, the detection time becomes very long. For example, for the first feedback polynomial in Table IV, the detection time increases from 3 s to about 35 days! Fortunately, for natural sources in practical situations, the typical values of Formula$\varepsilon$ are 0.1 and 0.05 [7].

Another issue investigated in this paper is how to recover the scrambler in the presence of noise. Our analysis shows that when the scrambled bits are passed through a BSC, the feedback polynomial of the LFSR can still be recovered by using the same method as the one proposed for the noiseless case. The only difference is that the effective source bias changes, and the change depends on the channel error probability Formula$p$. As the effective source bias is smaller than the original source bias when bit errors are present, more bits are required in the reconstruction in order to maintain the detection capability. The larger the value of Formula$p$, the larger the number of bits required for the reconstruction. As the factor of increase in the number of bits varies considerably for different values of Formula$p$ and Formula$d$, we propose using channel estimation to obtain the statistical properties of the channel.

We have also investigated the problem of reconstruction of the scrambler when there is an insertion of one bit in the scrambled bit sequence. What we have found is that when the number of bits used in the reconstruction Formula$(\mathtilde{N})$ is much larger than the minimum degree of the multiple of the feedback polynomial Formula$(i_{d-1})$, the performance of the reconstruction will not be affected by the insertion of one bit no matter where the bit is inserted. However, with the increase in the degree of the feedback polynomial, the degradation in the performance of the reconstruction algorithm will vary a lot depending on where the bit is inserted. When the bit is inserted at the two ends of the bit sequence, the performance degradation is small. When the insertion point moves from the two ends to the middle of the bit sequence, the performance degradation increases until a maximum is reached, and the maximum performance degradation is dependent on Formula$\mathtilde{N}$ and Formula$i_{d-1}$. In the worst case, i.e., Formula$i_{d-1}\geq 0.5\mathtilde{N}$ and the insertion of the bit is at the mid point of the sequence, the reconstruction will never succeed even though only one bit is inserted.

Footnotes

The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Z. Jane Wang.

X.-B. Liu and S. N. Koh are with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798.

X.-W. Wu is with the School of Information and Communication Technology, Griffith University, Gold Coast, QLD 4222, Australia.

C.-C. Chui is with the Temasek Laboratories at Nanyang Technological University, Singapore 639798.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Authors

Xiao-Bei Liu

Xiao-Bei Liu received the B.S. degree in electrical and communication engineering from Fudan University, Shanghai, China, in 1998, and the Ph.D. degree from Nanyang Technological University (NTU), Singapore, in 2004.

From 1998 to 2000, she was an engineer with Datang Mobile Communications Equipment Co., Ltd. and from 2007 to 2010 she was a senior digital signal processing engineer in Wireless Sound Solutions Pte. Ltd. She is currently a research fellow in the Positioning and Wireless Technology Centre of NTU and her research interests include digital signal processing in wireless communications, modulation/coding techniques, and secured communications.

Soo Ngee Koh

Soo Ngee Koh received the B.Eng. degree from the University of Singapore and the B.Sc. degree from the University of London, both in 1979. He received the M.Sc. and Ph.D. degrees from Loughborough University, U.K., in 1981 and 1984, respectively.

Prior to his return to Singapore, he worked as a consultant at the British Telecom Research Laboratories in England. He joined Nanyang Technological University (NTU) of Singapore in 1985. He was the founding Head of the Communication Engineering Division of the School of Electrical and Electronic Engineering (EEE) of NTU from 1995 to 2005, founding Cochair of the International Conference on Information, Communications and Signal Processing, and Associate Chair (Academic) from 2005 to 2011. He is currently a Professor of the School and Director (Undergraduate Education) of the University. He has published more than 140 papers in international journals and conference proceedings, and holds two international patents on speech coder design. His research interests include speech processing, coding, enhancement and recognition, computer-aided language learning, blind source separation, and secured communication.

Xin-Wen Wu

Xin-Wen Wu (M'00) received the B.S. and M.S. degrees in 1989 and 1992, respectively, from East China Normal University, Shanghai, and the Ph.D. degree in 1995 from the Institute of Systems Science, Chinese Academy of Sciences, Beijing.

From 1995 through 2003, he was affiliated with the Institute of Mathematics, Chinese Academy of Sciences. From January to October 1996, and from October 1997 to December 1998, he was a visiting research associate at the Center for Advanced Computer Studies at the University of Louisiana, Lafayette. From June 1999 to May 2000, he was a postdoctoral researcher at the Department of Electrical and Computer Engineering, University of California at San Diego. During February 2003–October 2005, he worked at the Department of Electrical and Electronic Engineering, University of Melbourne, holding a research fellowship. From November 2005 through April 2010, he was a faculty member at the Graduate School of Mathematics and Information Technology, University of Ballarat. Since April 2010, he has been with the School of Information and Communication Technology, Griffith University, Gold Coast, Australia. His research interests are in the areas of coding theory, cryptology, information theory with applications to bioinformatics, and other areas. He has authored or coauthored over 40 research papers and one book in the above-mentioned areas.

Chee-Cheon Chui

Chee-Cheon Chui received the B.Eng. degree from the National University of Singapore, Singapore, in 1994, and the M.Sc. and Ph.D. degrees from the University of Southern California, in 2001 and 2005, respectively, all in electrical engineering.

He is currently with Temasek Laboratories at Nanyang Technological University, Singapore, as a research scientist, engaging in research and development and management of numerous projects in the field of wireless communications. He has also held various positions in the executive committee of the IEEE Singapore local Communications Chapter. His current research interests include receiver synchronization, time-synchronization of wireless systems, physical-layer security, wireless communication signal processing, and forward error correction coding.
