
## I. INTRODUCTION

In a digital communication system, the constituent elements used by the transmitter and the specifications of each element are normally known to the receiver. In this paper, we consider a scenario wherein the specifications of the elements used by the transmitter are not completely known to the receiver. The capability of reconstructing transmitter elements when their specifications are not perfectly known is envisaged to be an enabling technology in digital communication systems built on a flexible platform such as software defined radio (SDR), as it reduces the overheads needed and makes the design of the system more flexible. A similar application is also envisaged in [1] for “multistandard adaptive receivers.”

The challenge of reconstructing transmitter elements when their specifications are unknown has attracted considerable research interest in the last few years. For example, results and findings on the recovery of error-correcting codes have been published in [2], [3], [4], [5], [6], [7]. In this paper, we focus on the reconstruction of another element commonly used in digital communication systems, i.e., the linear scrambler. A linear scrambler is usually used in a communication system to convert a data bit sequence into a pseudorandom sequence that is free from long strings of 1s or 0s. There are generally two types of linear scrambler, namely the synchronous scrambler and the self-synchronized scrambler. Both types usually consist of a linear feedback shift register (LFSR) whose output sequence $(s_t)_{t\geq 0}$ is combined with the input sequence $(x_t)_{t\geq 0}$ to produce the scrambled sequence $(y_t)_{t\geq 0}$, i.e., $$y_t=x_t\oplus s_t, \quad t\geq 0 \eqno{\hbox{(1)}}$$ where $\oplus$ denotes modulo 2 summation. In this paper, for simplicity, only the synchronous scrambler is considered. However, our ideas on the reconstruction of synchronous scramblers in the presence of channel noise can also be extended to self-synchronized scramblers.
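As an illustration (not from the paper), the scrambler of (1) can be sketched in a few lines of Python. The feedback polynomial $P(X)=1+X^3+X^7$, the tap list, and the initial state below are assumed examples chosen for the sketch.

```python
def lfsr_stream(taps, init, n):
    """First n bits s_0, s_1, ... of the LFSR sequence defined by the
    recurrence s_t = XOR of s_{t-i} over all feedback taps i, seeded by init."""
    s = list(init)
    while len(s) < n:
        bit = 0
        for i in taps:
            bit ^= s[-i]          # s_{t-i} for the current index t = len(s)
        s.append(bit)
    return s[:n]

def scramble(x_bits, taps, init):
    """Synchronous scrambler, eq. (1): y_t = x_t XOR s_t."""
    s = lfsr_stream(taps, init, len(x_bits))
    return [x ^ st for x, st in zip(x_bits, s)]
```

Since $P(X)=1+X^3+X^7$ (taps `[3, 7]`) is primitive of degree 7, any nonzero initial state yields the maximum period $2^7-1=127$, and applying the scrambler twice with the same initial state recovers the input.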

In most communication systems, to achieve the maximum period for the sequences produced by the LFSRs, binary primitive polynomials are used as the feedback polynomials. Reconstructing a linear scrambler consists of reconstructing the feedback polynomial of the LFSR as well as, in the case of a synchronous scrambler, its initial state. In this paper, we focus on reconstructing the feedback polynomial of the LFSR, as reconstructing the initial state of the LFSR is a well-known problem in stream cipher analysis and has been extensively studied in the literature [8], [9], [10], [11]. When some input and scrambled bits are known, the Berlekamp–Massey algorithm [12] can be used to reconstruct the feedback polynomial of the LFSR. Recently, an algorithm was proposed by Cluzeau for reconstructing the feedback polynomial of the LFSR using only the scrambled bits [13]. In the following, this algorithm will be referred to as “Cluzeau's algorithm.”

Although Cluzeau's algorithm can reconstruct most feedback polynomials of the LFSR very efficiently, it can be observed from the simulation results in [13] that in some cases the algorithm cannot make a correct detection of the feedback polynomial even when the false-alarm probability $P_f$ is set to a very small value. Furthermore, the algorithm proposed in [13] assumes that all the scrambled bits are correctly received. In practical situations, the communication channels always exhibit some type of noise, which leads to errors in the received bits. In this paper, the above two problems are investigated. In the first part of this paper, a scheme to improve the detection capability of Cluzeau's algorithm, with only a marginal increase in complexity, is proposed. Following that, an approach to reduce the number of operations required by Cluzeau's algorithm to perform the recovery, without affecting the detection capability, is described. In the second part of this paper, the problem of reconstructing scramblers in the presence of noise is studied. Two kinds of channel errors are considered: bit flipping due to channel noise, and insertion of bits into the scrambled bit sequence.

The paper is organized as follows. In Section II, Cluzeau's algorithm is reviewed. In Section III, a scheme is proposed to improve the detection capability of Cluzeau's algorithm. In Section IV, an approach to reduce the number of operations required to detect the feedback polynomial is described. In Section V, reconstruction of the scrambler in the presence of channel noise is investigated. Conclusions are drawn in Section VI.

## II. CLUZEAU'S ALGORITHM FOR RECONSTRUCTING A SYNCHRONOUS SCRAMBLER

Fig. 1. Structure of synchronous scrambler.

In this section, Cluzeau's algorithm, which recovers the feedback polynomial $P(X)$ of a synchronous scrambler using only the scrambled bits, will be reviewed. In a synchronous scrambler, $s_t$ is generated independently of $x_t$ and $y_t$, as shown in Fig. 1.

Instead of searching for the feedback polynomial $P(X)$ directly, Cluzeau's algorithm searches for sparse multiples of $P(X)$, with the degree of the sparse multiples varying from low to high. Once two multiples of $P(X)$ are detected, it returns the nontrivial greatest common divisor (gcd) of the two detected multiples as the detected feedback polynomial. The determination of whether a sparse polynomial is a multiple of $P(X)$ or not is based on a statistical test on the absolute value of a variable $Z$, which is given by $$Z=\sum_{t=i_{d-1}}^{N-1}(-1)^{z_t} \eqno{\hbox{(2)}}$$ where $z_t$ is a modulo 2 summation of $d$ scrambled bits, i.e., $z_t=y_t\oplus\bigoplus_{j=1}^{d-1}y_{t-i_j}$, $(0 < i_1 < i_2 < \cdots < i_{d-1})$. Let $Q(X)=1+\sum_{j=1}^{d-1}X^{i_j}$; when $Q(X)$ is a multiple of $P(X)$, we have \eqalignno{z_t & =y_t\oplus\bigoplus_{j=1}^{d-1}y_{t-i_j}\cr & =x_t\oplus\bigoplus_{j=1}^{d-1}x_{t-i_j} &\hbox{(3)} } since $s_t\oplus\bigoplus_{j=1}^{d-1}s_{t-i_j}=0$ and $y_t=x_t\oplus s_t$. According to the statistical analysis given in [7], when $Q(X)$ is a multiple of $P(X)$ and the input bits are biasedly distributed with ${\rm Pr}\, (x_t=1)= {(1/2)}-\varepsilon$, where $\varepsilon\neq 0$, $z_t$ is also biasedly distributed with ${\rm Pr}\, (z_t=1)= {(1/2)} [1-(2\varepsilon)^d]$. Then, according to Theorem 1 in [7], the value of $Z$, i.e., $\sum_{t=i_{d-1}}^{N-1}(-1)^{z_t}=(N-i_{d-1})-2\sum_{t=i_{d-1}}^{N-1}z_t$, is Gaussian distributed with mean value $\mu$ given by $$\mu=(N-i_{d-1})(2\varepsilon)^d \eqno{\hbox{(4)}}$$ and variance $\sigma^2$ given by $$\displaylines{\sigma^2=(N-i_{d-1}) (1-(2\varepsilon)^{2d})\hfill\cr \hfill +\sum_{u=1}^{d-1}N_u \left((2\varepsilon)^{2(d-u)}-(2\varepsilon)^{2d}\right) \quad\hbox{(5)} }$$
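The bias ${\rm Pr}(z_t=1)={(1/2)}[1-(2\varepsilon)^d]$ quoted above is the piling-up lemma for the XOR of $d$ independent biased bits. A quick Monte Carlo check (our illustration, not part of the paper):

```python
import random

def xor_bias(d, eps, trials=200_000, seed=1):
    """Empirical Pr(z = 1) for z the XOR of d independent bits,
    each with Pr(bit = 1) = 1/2 - eps."""
    rng = random.Random(seed)
    ones = 0
    for _ in range(trials):
        z = 0
        for _ in range(d):
            z ^= rng.random() < 0.5 - eps   # one biased bit
        ones += z
    return ones / trials
```

For $d=3$ and $\varepsilon=0.1$, the theoretical value is $(1-0.2^3)/2=0.496$, and the empirical frequency should agree to within sampling error.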

Fig. 2. Distributions of $Z$.

where $N_u$ denotes the number of pairs $(z_t,z_{t'})$, $(0 < \vert t'-t \vert\leq i_{d-1})$, which share exactly $u$ terms of $x_t$. For different $Q(X)$, the values of $N_u$ are different, and hence there is no fixed value of $\sigma^2$. However, according to [7], an upper bound on $\sigma^2$ can be derived, since in the worst case $N_1=N_2= \cdots =N_{d-2}=0$ and $N_{d-1}\leq (N-i_{d-1})2d(d-1)$, which leads to \eqalignno{\sigma^2 & \leq (N-i_{d-1}) [1+2d(d-1)(2\varepsilon)^2\cr & \qquad\qquad\qquad\quad -(2\varepsilon)^{2d}(1+2d(d-1))]\cr & \leq (N-i_{d-1}) [1+2d(d-1)](1-(2\varepsilon)^{2d}). &\hbox{(6)} } Therefore, the upper bound $\sigma_l$ of $\sigma$ is given by $$\sigma\leq \sigma_l=\sqrt{(N-i_{d-1})[1+2d(d-1)](1-(2\varepsilon)^{2d})} \eqno{\hbox{(7)}}$$ and the normalized upper bound $\bar\sigma_l$ is given by $$\bar\sigma_l = {\sigma_l\over \sqrt{N-i_{d-1}}} = \sqrt{[1+2d(d-1)](1-(2\varepsilon)^{2d})}. \eqno{\hbox{(8)}}$$

According to the above description, when $Q(X)$ is a multiple of $P(X)$, $Z$ has a Gaussian distribution with mean value $\mu$ and variance $\sigma^2$; it is also obvious that when $Q(X)$ is not a multiple of $P(X)$, ${\rm Pr}\, (z_t=0)= {(1/2)}$, implying that $Z$ has a Gaussian distribution with mean value 0 and variance $N-i_{d-1}$. The two distributions are depicted in Fig. 2.

From Fig. 2, it can be observed that when the two distributions of $Z$ have a small enough intersection, a threshold $T$ can be used to determine whether $Q(X)$ is a multiple of $P(X)$: when $\vert Z\vert < T$, $Q(X)$ is not a multiple of $P(X)$; otherwise, $Q(X)$ is a multiple of $P(X)$. $T$ depends on the false-alarm probability $P_f={\rm Pr}\,(\vert Z\vert \geq T \mid Q(X)$ is not a multiple of $P(X))$ and on the nondetection probability $P_n={\rm Pr}\,(\vert Z\vert < T \mid Q(X)$ is a multiple of $P(X))$. In the following, Cluzeau's algorithm is outlined for the reader's convenience.

1. Compute the threshold $T$ as follows: $$T= {a(a+b\bar\sigma_l)\over (2\vert \varepsilon\vert)^{d}} \eqno{\hbox{(9)}}$$ where \eqalignno{a & =\Phi^{-1}\left(1- {P_f\over 2}\right) &\hbox{(10)}\cr \noalign{\hbox{and}} b & =-\Phi^{-1}(P_n). &\hbox{(11)} } In the above equations, $\Phi$ denotes the standard normal distribution function, i.e., $$\Phi(x)= {1\over \sqrt{2\pi}}\int_{-\infty}^{x}{\rm exp}\left(-{t^2\over 2}\right)dt \eqno{\hbox{(12)}}$$ and $d$ denotes the weight of the sparse multiples of $P(X)$; typical values of $d$ are 3, 4, and 5.
2. For $(i_1,\ldots,i_{d-1})$, $0 < i_1 < \cdots < i_{d-1}\leq D$ ($D$ is the maximum degree of the sparse multiple we want to search for), compute the number of bits $N$ required to recover the feedback polynomial as follows: $$N=i_{d-1}+ {(a+b\bar\sigma_l)^2\over (2\varepsilon)^{2d}}. \eqno{\hbox{(13)}}$$
3. Initialize $Z$ with $Z=0$. For $t$ from $i_{d-1}$ to $N-1$, compute \eqalignno{z_t& =y_t\oplus\bigoplus_{j=1}^{d-1}y_{t-i_j} &\hbox{(14)}\cr \noalign{\hbox{and}} Z& =Z+(-1)^{z_t}. &\hbox{(15)} }
4. If $\vert Z\vert > T$, store $Q(X)=1+\sum_{j=1}^{d-1}X^{i_j}$ in a table.
5. For $Q'(X)\neq Q(X)$ in the table, compute the nontrivial gcd of $(Q(X),Q'(X))$.

Steps 2–5 are repeated until a gcd $(Q(X),Q'(X))=P(X)$ with $P(X)\neq 1$ is found or all combinations of $(i_1,\ldots,i_{d-1})$ have been tested.

According to [7], when using the algorithm described above for a randomly chosen primitive polynomial $P(X)$ of degree $L$, the number of operations performed by the algorithm is $$W_p= d\cdot 2^{L} {(a+b\bar\sigma_l)^2\over (2\varepsilon)^{2d}}. \eqno{\hbox{(16)}}$$

To test the accuracy and runtime of Cluzeau's algorithm, we apply it to several feedback polynomials; the results are shown in Table I. We will propose improved reconstruction procedures in the following sections. To make a fair comparison with Cluzeau's algorithm [13], we run it with bias $\varepsilon = 0.1$, false-alarm probability $P_f=2\cdot 10^{-7}$, nondetection probability $P_n=10^{-5}$, and weight of the sparse multiples of the feedback polynomial $d=3$, the same parameters as in [13].

TABLE I SIMULATION RESULTS OF CLUZEAU'S ALGORITHM

In Table I, the first, third, and last feedback polynomials are the same as those shown in Table III in [13]. Due to the difference in hardware (Intel dual core, 2.5 GHz), when testing Cluzeau's algorithm, the runtimes are a bit different from those presented by Cluzeau. In the following sections, we will test our improved reconstruction algorithm and compare the results, using the same parameters and hardware as those for Table I.

## III. IMPROVING THE DETECTION CAPABILITY OF CLUZEAU'S ALGORITHM

From Table I, it can be observed that the third and fourth polynomials are not correctly recovered. To understand why the original feedback polynomials are not detected, we examine the detected multiples of the feedback polynomials; the results are shown in Table II.

TABLE II MULTIPLES OF THE FEEDBACK POLYNOMIALS

From Table II, it can be observed that for the third and fourth feedback polynomials, their second detected trinomial multiple is a multiple (square) of the first one. Therefore, their gcd is the first detected multiple instead of the feedback polynomial. For the feedback polynomials $x^{23}+x^{18}+1$ and $x^{29}+x^2+1$, their second detected trinomial multiple is also a multiple of the first one. However, as shown in Table I, they are correctly detected. This is because their first detected trinomial multiples, i.e., $x^{23}+x^{18}+1$ and $x^{29}+x^2+1$, are exactly the feedback polynomials already.

Based on the above observation, to avoid making a wrong detection, the second detected multiple of the feedback polynomial should be ignored if it is a multiple of the first one. The only exception is when the first detected multiple is irreducible; in that case, the feedback polynomial is nothing other than the detected multiple itself. Consequently, we can slightly modify Cluzeau's algorithm for improved performance as follows:

1. Follow the algorithm described in Section II until a sparse multiple $Q(X)$ is found.
2. If $Q(X)$ is irreducible, then stop and return $Q(X)$ as the detected feedback polynomial; otherwise, store $Q(X)$ in a table and go to the next step.
3. Search for another sparse multiple $Q'(X)$; if $Q'(X)$ is a multiple of $Q(X)$, ignore it and keep on searching until a $Q'(X)$ which is not a multiple of $Q(X)$ is found. Store $Q'(X)$ in the table.
4. Compute gcd $(Q(X),Q'(X))=P(X)$. If $P(X) \neq 1$, return $P(X)$ as the detected feedback polynomial. Otherwise, go to Step 3.

In the following, the algorithm described above will be called the “improved algorithm,” to differentiate it from Cluzeau's algorithm. The improved algorithm includes two extra steps compared with Cluzeau's algorithm. The first is to determine whether the detected multiple is irreducible. The second is to determine whether the second multiple is a multiple of the first one. The second step is rather simple because the number of operations (XORs) it uses is bounded from above by $O(d\cdot D)$. For the first step, many algorithms have been proposed in the literature for testing the reducibility of a binary polynomial [14], [15]. We use the algorithm proposed in [15], whose number of operations is bounded from above by $O(d\cdot D^2)$. Compared with Cluzeau's algorithm, for which the number of operations is bounded from above by $O(D^{d-1}/(2\varepsilon)^{2d})$, the increase in complexity is trivial.
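The two extra steps need only elementary GF(2) polynomial arithmetic. A minimal sketch (our illustration, with polynomials stored as integer bitmasks, bit $i$ holding the coefficient of $X^i$; the irreducibility check is plain trial division, adequate for the degrees considered here, rather than the algorithm of [15]):

```python
def deg(f):
    """Degree of a GF(2) polynomial stored as an int bitmask."""
    return f.bit_length() - 1

def gf2_mod(a, f):
    """Remainder of a divided by f in GF(2)[X]."""
    while a and deg(a) >= deg(f):
        a ^= f << (deg(a) - deg(f))
    return a

def gf2_gcd(a, b):
    """Euclidean gcd in GF(2)[X] (step 4 of the improved algorithm)."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a

def gf2_mul(a, b):
    """Carry-less (GF(2)) polynomial product."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def is_multiple(q2, q1):
    """Step 3: should Q'(X) = q2 be ignored as a multiple of Q(X) = q1?"""
    return gf2_mod(q2, q1) == 0

def is_irreducible(f):
    """Step 2: trial division by every polynomial of degree 1..deg(f)//2."""
    return all(gf2_mod(f, g) != 0 for g in range(2, 1 << (deg(f) // 2 + 1)))
```

For example, with $P(X)=1+X+X^3$ (bitmask `0b1011`), the gcd of $P(X)(X+1)$ and $P(X)(X^2+X+1)$ recovers $P(X)$, exactly as in step 4.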

To evaluate the performance of the improved algorithm, scramblers using every primitive polynomial of degree 8 to 16 as the feedback polynomial are simulated. The weight of the sparse multiple is chosen as $d=3$ and the results obtained are plotted in Fig. 3.

Fig. 3. Comparison of Cluzeau's and improved algorithms.

From Fig. 3, it is apparent that the detection performance is significantly improved by using the improved algorithm. It should be noted that there are still feedback polynomials which cannot be correctly recovered by using the improved algorithm. This situation occurs if the first detected multiple is $Q(X)=P(X)F(X)D(X)$, where $F(X)$ and $D(X)$ are any binary polynomials not equal to 1, and the second detected multiple is $Q'(X)=P(X)F(X)D'(X)$. If $D'(X)$ is not a multiple of $D(X)$, $Q'(X)$ is not a multiple of $Q(X)$ either. Therefore, $Q'(X)$ will not be ignored in the improved algorithm, and the gcd of $Q(X)$ and $Q'(X)$ is $P(X)F(X)$ instead of $P(X)$ itself. In any case, if the detected feedback polynomial is not primitive, say it is equal to $P(X)F(X)$, we can find the correct feedback polynomial by trying to recover the source bits using the polynomials $P(X)F(X),$ $P(X)$ and $F(X)$ as the feedback polynomials, respectively, and see which one would lead to a sensible source sequence. The procedure above includes the factorization of $P(X)F(X)$ which can be easily achieved with Berlekamp's algorithm [16].

## IV. IMPROVING THE SPEED OF CLUZEAU'S ALGORITHM

According to (13) and (16), the number of bits and operations required to recover the feedback polynomial depends on the normalized upper bound $\bar\sigma_l$. The bigger the value of $\bar\sigma_l$, the larger the number of bits and operations required. In the following, we will show that the upper bound of $\sigma$ can be improved, and a new upper bound $\sigma'_l$, which approaches the actual value of $\sigma$ more closely, will be derived.

As described in Section II, $\sigma^2$ attains its maximum in the worst case, i.e., $N_1=N_2=\cdots=N_{d-2}=0$ and $N_{d-1}\leq (N-i_{d-1})2d(d-1)$, where $N_u (u=1,2,\ldots,d-1)$ denotes the number of pairs $(z_t,z_{t'})$ which share exactly $u$ terms of $x_t$. It is obvious that $N_u$ also denotes the number of pairs $(X^tQ(X),X^{t'}Q(X))$ which share exactly $u$ terms of $X^i$. Let $M_u$ denote the number of pairs $(Q(X), X^nQ(X)),(0 < n\leq i_{d-1})$, which share exactly $u$ terms of $X^i (i=i_1, i_2,\ldots, i_{d-1})$. Since at each time instant $t$, $(X^tQ(X),X^{t'}Q(X))$, $(i_{d-1}\leq t < t'\leq N-1, 0 < t'-t\leq i_{d-1})$, can be taken as a time-shifted version of $(Q(X), X^nQ(X)), (0 < n\leq i_{d-1})$, we have $$N_u=2(N-i_{d-1})M_u. \eqno{\hbox{(17)}}$$ The factor 2 in (17) arises because $t$ and $t'$ are interchangeable.

In the following, the number of terms that the pair $(Q(X), X^nQ(X))$ have in common is studied. $Q(X)$ and $X^nQ(X)$ are written as $Q(X)=1+X^{i_1}+X^{i_2}+ \cdots +X^{i_{d-1}}$ and $X^nQ(X)=X^n+X^{n+i_1}+X^{n+i_2}+ \cdots +X^{n+i_{d-1}}$, respectively. Obviously, the constant term 1 cannot be shared by the pair $(Q(X), X^nQ(X))$. First, consider how many times $X^{i_1}$ appears in $X^nQ(X)$: it appears only once, namely when $n=i_1$. The term $X^{i_2}$ appears twice in $X^nQ(X)$, i.e., when $n=i_2$ or $n=i_2-i_1$. Similarly, the terms $X^{i_3}$, $X^{i_4},\ldots,X^{i_{d-1}}$ appear in $X^nQ(X)$ $3, 4, \ldots, d-1$ times, respectively. Table III summarizes the terms that the pair $(Q(X), X^nQ(X))$ have in common, together with their multiplicities and the corresponding values of $n$.

Obviously, Table III includes all the terms that the pair $(Q(X), X^nQ(X))$ share for $0 < n\leq i_{d-1}$. It is also noted that, for a given shared term, the corresponding values of $n$ are all distinct, since $0 < i_1 < i_2 < \cdots < i_{d-1}$; hence there is no double counting of the shared terms. Therefore, the total number of shared terms for the pair $(Q(X), X^nQ(X))$ over $0 < n\leq i_{d-1}$ is $$M= {[1+(d-1)](d-1)\over 2}= {d(d-1)\over 2}. \eqno{\hbox{(18)}}$$

TABLE III SHARED TERMS OF THE PAIR $(Q(X), X^nQ(X))$ AND THE CORRESPONDING VALUE OF $n$

For different shared terms, the value of $n$ may be the same; e.g., when $i_1=i_2-i_1=i_3-i_2$, $Q(X)$ and $X^{i_1}Q(X)$ will share $X^{i_1}$, $X^{i_2}$, and $X^{i_3}$, i.e., three terms. Obviously, the maximum number of terms $Q(X)$ and $X^nQ(X)$ can share for the same value of $n$ is $d-1$. Since $M_u$ denotes the number of pairs $(Q(X), X^nQ(X)),(0 < n\leq i_{d-1})$ which share exactly $u$ terms of $X^i$, we have $$\sum_{u=1}^{d-1}u\cdot M_u=M= {d(d-1)\over 2}. \eqno{\hbox{(19)}}$$ Based on (17) and (19), we have \eqalignno{\sum_{u=1}^{d-1}u\cdot N_u & =2(N-i_{d-1}) \sum_{u=1}^{d-1} u\cdot M_u\cr & =(N-i_{d-1})\cdot d \cdot (d-1). &\hbox{(20)} }
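Equation (19) can be checked by brute force for any particular $Q(X)$. The helper below (our illustration) counts the shared terms of $(Q(X), X^nQ(X))$, with multiplicity, over all shifts $0 < n\leq i_{d-1}$:

```python
def shared_term_total(exponents):
    """Total number of terms shared by Q(X) and X^n Q(X), summed over
    all shifts 0 < n <= i_{d-1}.  exponents: [0, i_1, ..., i_{d-1}].
    By eq. (19) this total equals d(d-1)/2 regardless of the exponents."""
    terms = set(exponents)
    return sum(len(terms & {e + n for e in exponents})
               for n in range(1, max(exponents) + 1))
```

For the trinomial $Q(X)=1+X^{10}+X^{21}$ used later in Fig. 4 ($d=3$), the total is $3\cdot 2/2=3$; for equally spaced exponents the shifts coincide (several terms shared at the same $n$), but the total with multiplicity is unchanged.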

From (5) and (20), it can be observed that the maximum value of $\sigma$ can be achieved only when $$N_u= \cases{0, & {\rm when} \quad u < d-1\cr (N-i_{d-1})d, & {\rm when} \quad u=d-1.} \eqno{\hbox{(21)}}$$ Substituting $u=d-1$ and $N_{d-1}=(N-i_{d-1})d$ into (5), we get \eqalignno{\sigma^2 & \leq(N-i_{d-1})(1-(2\varepsilon)^{2d})\cr & \quad +(N-i_{d-1})d((2\varepsilon)^2-(2\varepsilon)^{2d})\cr & \leq(N-i_{d-1})[1+d((2\varepsilon)^2-(2\varepsilon)^{2d})]. &\hbox{(22)} }

According to (22), the new upper bound $\sigma'_l$ of $\sigma$ is given by $$\sigma'_l=\sqrt{(N-i_{d-1})[1+d((2\varepsilon)^2-(2\varepsilon)^{2d})]} \eqno{\hbox{(23)}}$$ and the new normalized upper bound $\bar\sigma'_l$ is given by $$\bar\sigma'_l= {\sigma'_l\over \sqrt{N-i_{d-1}}}=\sqrt{1+d((2\varepsilon)^2-(2\varepsilon)^{2d})}. \eqno{\hbox{(24)}}$$
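Numerically, the new bound is much tighter. A small comparison of (8) and (24) (our illustration, using the paper's $\varepsilon=0.1$):

```python
import math

def sigma_bar_old(d, eps):
    """Normalized upper bound of eq. (8)."""
    return math.sqrt((1 + 2*d*(d - 1)) * (1 - (2*eps)**(2*d)))

def sigma_bar_new(d, eps):
    """Tighter normalized upper bound of eq. (24)."""
    return math.sqrt(1 + d*((2*eps)**2 - (2*eps)**(2*d)))

for d in (3, 4, 5):
    print(d, round(sigma_bar_old(d, 0.1), 3), round(sigma_bar_new(d, 0.1), 3))
```

For $d=3$ and $\varepsilon=0.1$, (8) gives about 3.61 while (24) gives about 1.06, consistent with Fig. 4, where the curve for $\sigma'_l$ hugs the empirical distribution of $Z$.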

To see how closely the new upper bound of $\sigma$ derived above approaches the actual value, the Gaussian distributions with $\mu$ calculated using (4) and standard deviation $\sigma$ equal to $\sigma_l$ and $\sigma'_l$, respectively, are plotted in Fig. 4. The actual distribution of $Z$, obtained using $Q(X)=1+x^{10}+x^{21}$, $N=10\, 000$, $\varepsilon=0.1$, and $d=3$, is also plotted in Fig. 4. A total of 50 000 values of $Z$ were collected, and a low-pass filter was used to smooth the curve. Other trinomials for $Q(X)$ were also tested in our simulations, and no significant difference in the distribution of $Z$ was found among different trinomials $Q(X)$. Therefore, the dashed curve in Fig. 4 can be taken as an approximation of the actual distribution of $Z$ for any $Q(X)$ with $d=3$.

Fig. 4. Distribution of $Z$ with different $\sigma$ $(N=10\, 000$, $i_{d-1}=21$, $\varepsilon=0.1$, $d=3)$.

From Fig. 4, it can be observed that the Gaussian distribution with $\sigma=\sigma'_l$ approaches the actual distribution of $Z$ very closely. The Gaussian distribution with $\sigma=\sigma_l$ deviates from the actual distribution by quite a margin, as shown in Fig. 4. It should be noted that when $d$ increases, $\sigma_l$ deviates even more from $\sigma$, as shown in Fig. 5, but $\sigma'_l$ still approaches $\sigma$ very closely.

Fig. 5. Distribution of $Z$ with different $\sigma$ $(N=10\, 000$, $i_{d-1}=31$, $\varepsilon=0.1$, $d=4)$.

When the new upper bound is used, the threshold $T$, the number of bits $N$, and the number of operations $W_p$ required by the algorithm can all be reduced, and consequently the runtime of the algorithm is reduced. The new threshold $\tilde{T}$ can be obtained by replacing $\bar\sigma_l$ in (9) with $\bar\sigma'_l$, which gives \eqalignno{\tilde{T} & = {a(a+b\bar\sigma'_l)\over (2\vert \varepsilon\vert)^{d}}\cr & = {a^2+ab\sqrt{1+d((2\varepsilon)^2-(2\varepsilon)^{2d})}\over (2\vert \varepsilon\vert)^{d}}. &\hbox{(25)} } Similarly, the revised number of bits $\tilde{N}$ and operations $\tilde{W}_p$ required by the algorithm are given by \eqalignno{\tilde{N} & =i_{d-1}+ {\left(a+b\sqrt{1+d((2\varepsilon)^2-(2\varepsilon)^{2d})}\right)^2\over (2\varepsilon)^{2d}} &\hbox{(26)}\cr \noalign{\hbox{and}} \tilde{W}_p & =d\cdot 2^{L} {\left(a+b\sqrt{1+d((2\varepsilon)^2-(2\varepsilon)^{2d})}\right)^2\over (2\varepsilon)^{2d}}. &\hbox{(27)} }

Comparing (16) and (27), it can be observed that the reduction factor $R$ for the number of operations required to recover the feedback polynomial is given by $$R= {W_p\over \tilde{W}_p}\approx {(\gamma+q)^2\over (1+q)^2} \eqno{\hbox{(28)}}$$ where $\gamma=\sqrt{1+2d(d-1)}$ and $q= {(a/b)}$.

In Fig. 6, the values of $R$ are plotted against $d$. For the simulation setup described in Section II, $q\approx 1.2$. It can be observed that $R$ increases with $d$, because $\gamma$ increases with $d$.
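Equation (28) is easy to evaluate directly. The sketch below (our illustration, with $q\approx 1.2$ as in the simulation setup) reproduces the behavior plotted in Fig. 6:

```python
import math

def reduction_factor(d, q=1.2):
    """Approximate reduction factor R = W_p / W~_p of eq. (28),
    with gamma = sqrt(1 + 2d(d-1)) and q = a/b."""
    gamma = math.sqrt(1 + 2*d*(d - 1))
    return (gamma + q)**2 / (1 + q)**2
```

For $d=3$ this gives $R\approx 4.8$, of the same order as the roughly 4.5-fold runtime reduction reported in Table IV, and $R$ grows monotonically with $d$.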

In the following, the improved algorithm using $\tilde{T}$ and $\tilde{N}$ instead of $T$ and $N$ is tested on the same feedback polynomials as in Table I; the results are shown in Table IV. The second and third columns of Table IV show the feedback polynomials detected by the improved algorithm and the runtime it requires. The fourth column of Table IV shows the runtime required by the improved algorithm when $\tilde{T}$ and $\tilde{N}$ are used instead of $T$ and $N$.

TABLE IV SIMULATION RESULTS OF THE IMPROVED ALGORITHM USING $\tilde{T}$ AND $\tilde{N}$ INSTEAD OF $T$ AND $N$
Fig. 6. Reduction factor $(R)$ for the number of operations for different values of $d (q\approx 1.2)$.

First, we compare the first three columns of Table I with those of Table IV. The first and second feedback polynomials, which are correctly detected by Cluzeau's algorithm, are still correctly detected by the improved algorithm, and the runtime is not affected. The third and fourth feedback polynomials, which are wrongly detected by Cluzeau's algorithm, are correctly detected by the improved algorithm, although the runtime is increased. For the last two feedback polynomials, the runtime is even reduced by the improved algorithm, because they are irreducible and the algorithm stops after the first step. We then compare the third and fourth columns of Table IV. It can be observed that the time required for detection is reduced by a factor of about 4.5 for all the primitive polynomials when $\tilde{T}$ and $\tilde{N}$ are used instead of $T$ and $N$ in the detection process. This result matches very well with the reduction factor for the number of operations shown in Fig. 6. It should be noted that when $\tilde{T}$ and $\tilde{N}$ are used, the detected polynomials are still the same as those shown in the second column of Table IV. In general, comparing Tables I and IV, it is clear that with our proposed schemes, both the detection capability and the speed of Cluzeau's algorithm are improved significantly.

## V. RECOVERY OF THE SCRAMBLER IN THE PRESENCE OF CHANNEL NOISE

### A. Recovery of the Scrambler With Flipped Bits

In the algorithm described in Section II, the scrambled bit sequence $y_t (t \geq 0)$ is assumed to be correctly received at the receiver. In this section, we consider the situation in which channel noise is present, as depicted in Fig. 7.

Fig. 7. Chain of scrambler and channel.

In Fig. 7, the channel is modeled as a binary symmetric channel (BSC). The input to the BSC is denoted by $y_t$ and the output of the channel is denoted by $y'_t$, which is given by $$y'_t=y_t\oplus e_t \eqno{\hbox{(29)}}$$ where $e_t$ is the channel error at time instant $t$. Suppose the channel error probability (or crossover probability) is $p$; then at each time instant $t$ we have \eqalignno{{\rm Pr}(e_t=1)& =p &\hbox{(30)}\cr \noalign{\hbox{and}} {\rm Pr}(e_t=0)& =1-p. &\hbox{(31)} } Similar to (14), we have \eqalignno{z'_t& =y'_t\oplus\bigoplus_{j=1}^{d-1}y'_{t-i_j}\cr & =y_t\oplus e_t\oplus\bigoplus_{j=1}^{d-1}\left(y_{t-i_j}\oplus e_{t-i_j}\right) &\hbox{(32)} } where $y_t=x_t\oplus s_t$. When $Q(X)$ is a multiple of $P(X)$, we have $s_t\oplus\bigoplus_{j=1}^{d-1}s_{t-i_j}=0$ and thus $$z'_t=x_t\oplus e_t\oplus\bigoplus_{j=1}^{d-1}\left(x_{t-i_j}\oplus e_{t-i_j}\right). \eqno{\hbox{(33)}}$$ Letting $x'_t=x_t\oplus e_t$, (33) becomes $$z'_t=x'_t\oplus\bigoplus_{j=1}^{d-1}x'_{t-i_j}. \eqno{\hbox{(34)}}$$ Comparing with the noiseless case, in which we have $$z_t=x_t\oplus\bigoplus_{j=1}^{d-1}x_{t-i_j} \eqno{\hbox{(35)}}$$ we can see that the only difference is that $x_t$ is replaced by $x'_t$, which can be viewed as the result of passing $x_t$ through a BSC. As stated previously, the reconstruction of the scrambler is based on the assumption that $x_t$ is biasedly distributed with ${\rm Pr}\,(x_t=0)= {(1/2)}+\varepsilon$. When $x_t$ is replaced by $x'_t$, we need to examine the distribution of $x'_t$ and determine whether it is also biasedly distributed.

Let us introduce another variable $\delta$ such that $\delta= {(1/2)}-p$. We then have \eqalignno{{\rm Pr}(e_t=1)& = {1\over 2}-\delta &\hbox{(36)}\cr \noalign{\hbox{and}} {\rm Pr}(e_t=0)& ={1\over 2}+\delta. &\hbox{(37)} }

According to the distributions of $x_t$ and $e_t$, and noting that $x'_t=1$ occurs when exactly one of $x_t$ and $e_t$ equals 1, we have \eqalignno{{\rm Pr}(x'_t=1)& = \left({1\over 2}-\varepsilon\right)\left({1\over 2}+\delta\right)+\left({1\over 2}+\varepsilon\right)\left({1\over 2}-\delta\right)= {1\over 2}-2\varepsilon\delta &\hbox{(38)}\cr \noalign{\hbox{and}} {\rm Pr}(x'_t=0)& = {1\over 2}+2\varepsilon\delta. &\hbox{(39)} }

From the above equations, we can see that $x'_t$ is also biasedly distributed, with a new bias $\varepsilon'=2\varepsilon\delta$. This means that when channel noise is present, we can still recover the synchronous scrambler using the method proposed for the noiseless condition. The only difference is that the source bias changes from $\varepsilon$ to $2\varepsilon\delta$, where $\delta$ is determined by the channel error probability $p$.

As $0\leq p\leq 0.5$ and $\delta= {(1/2)}-p$, we have $0\leq 2\delta\leq 1$ and hence $2\varepsilon\delta\leq \varepsilon$, with equality only when $p=0$ (i.e., $\delta=1/2$). Therefore, when $p > 0$, we have $\varepsilon' < \varepsilon$. According to (26) and (27), the smaller the bias, the larger the number of bits and operations required in the reconstruction. Therefore, when $\varepsilon$ changes to $\varepsilon'$, the number of bits required in the reconstruction becomes larger and the time required for detection becomes longer. When $\tilde{N}\gg i_{d-1}$, which is usually true for practical systems, the factor of increase $(I)$ in the number of bits can be derived as $$I= {1\over (2\delta)^{2d}}= {1\over (1-2p)^{2d}}. \eqno{\hbox{(40)}}$$
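The effective bias $\varepsilon'=2\varepsilon\delta$ and the increase factor of (40) can be computed directly (our illustration):

```python
def bias_after_bsc(eps, p):
    """Effective source bias eps' = 2*eps*delta after a BSC with
    crossover probability p, where delta = 1/2 - p (eqs. (36)-(39))."""
    return 2 * eps * (0.5 - p)

def increase_factor(p, d):
    """Factor of increase I in the number of bits, eq. (40)."""
    return 1.0 / (1.0 - 2.0*p)**(2*d)
```

For example, $p=0.1$ and $d=3$ give $I=1/0.8^6\approx 3.8$, i.e., nearly four times as many scrambled bits are needed as in the noiseless case.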

In Fig. 8, the values of $I$ are plotted for different values of $p$ and $d$ when channel errors are present. It can be observed that the factor of increase in the number of bits used in the reconstruction grows with increasing $p$ and $d$. In practical situations, the channel error probability can vary from a very small value up to about 0.2 when $E_b/N_0 = 0$ dB [17]; therefore, $I$ may vary from 1 to about 100 depending on the values of $p$ and $d$. To make an appropriate choice of $I$, an accurate estimate of the statistical properties of the channel is therefore needed.

Next, we investigate the impact on $P_f$ and $P_n$ when channel errors are present but the number of bits used is not increased correspondingly. According to (10) and the new upper bound of $\sigma$ derived in Section IV, we have $$a=\Phi^{-1}\left(1- {P_f\over 2}\right)= {\tilde{T}\over \sqrt{\tilde{N}-i_{d-1}}}. \eqno{\hbox{(41)}}$$ As $\tilde{T}$ and $\tilde{N}$ are precalculated, they do not change with the source bias; therefore, $P_f$ is not affected by the channel noise.

According to (11) and the distribution of $Z$, we have $$-b=\Phi^{-1}(P_n)= {\tilde{T}-\mu\over \sigma} \eqno{\hbox{(42)}}$$ where $\tilde{T}$ is given by (25), $\mu$ by (4), and $\sigma$ by (22). When channel noise is present, the source bias changes from $\varepsilon$ to $2\varepsilon\delta$. Therefore, the $\varepsilon$ in (4) and (22) must be replaced by $2\varepsilon\delta$, and the resulting new mean value $\mu_{e}$ of $Z$ is $$\mu_{e}=(\tilde{N}-i_{d-1})(4\varepsilon\delta)^d=(2\delta)^d\mu \eqno{\hbox{(43)}}$$ and the new upper bound $\sigma_{e}$ of the standard deviation of $Z$ is \eqalignno{\sigma_{e}& =\sqrt{(\tilde{N}-i_{d-1})[1+d((4\varepsilon\delta)^2-(4\varepsilon\delta)^{2d})]}\cr & \approx \sqrt{\tilde{N}-i_{d-1}}\cdot\bar\sigma'_l. &\hbox{(44)} } From (43) and (44), it can be observed that when channel noise is present, the standard deviation of $Z$ is not significantly affected, but the mean value of $Z$ becomes smaller. This results in a smaller effective value of $b$, and thus $P_n$ increases. As $2\delta\leq 1$, we have \eqalignno{& {\tilde{T}-\mu_{e}\over \sigma} ={\tilde{T}-(2\delta)^d\mu\over \sigma}\geq {(2\delta)^d (\tilde{T}-\mu)\over \sigma} =(2\delta)^d(-b).\cr &&\hbox{(45)} } Therefore, the new value of $P_n$ when noise is present, i.e., $P^e_n$, can be roughly estimated by $$P^e_n=\Phi\left({\tilde{T}-\mu_{e}\over \sigma}\right)\geq\Phi ((2\delta)^d(-b)). \eqno{\hbox{(46)}}$$ In Fig. 9, the values of $P^e_n$ are plotted for different values of $p$ and $d$. The number of bits $\tilde{N}$ used in the recovery is set to satisfy the condition that when $p=0$, the nondetection probability equals $10^{-5}$.
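The estimate $P^e_n=\Phi((\tilde{T}-\mu_e)/\sigma_e)$ can be evaluated from (24)-(26), (43), and (44). The sketch below (our illustration, with the paper's parameters as defaults) recovers the design value $P_n$ at $p=0$:

```python
import math
from statistics import NormalDist

def nondetection_prob_noisy(p, d, eps=0.1, Pf=2e-7, Pn=1e-5):
    """Estimated nondetection probability when the bit budget is kept at
    its noiseless design value despite a BSC with crossover p."""
    nd = NormalDist()
    a = nd.inv_cdf(1 - Pf / 2)                            # eq. (10)
    b = -nd.inv_cdf(Pn)                                   # eq. (11)
    sb = math.sqrt(1 + d*((2*eps)**2 - (2*eps)**(2*d)))   # eq. (24)
    T = a * (a + b*sb) / (2 * abs(eps))**d                # eq. (25)
    n_eff = (a + b*sb)**2 / (2*eps)**(2*d)                # N~ - i_{d-1}, eq. (26)
    delta = 0.5 - p
    mu_e = n_eff * (4*eps*delta)**d                       # eq. (43)
    sigma_e = math.sqrt(n_eff) * sb                       # eq. (44)
    return nd.cdf((T - mu_e) / sigma_e)
```

At $p=0$ the ratio $(\tilde{T}-\mu)/\sigma_e$ reduces to $-b$, so the function returns $P_n=10^{-5}$ exactly; at $p=0.1$ with $d=3$ it already exceeds one half, consistent with the observation below that more than half of the sparse multiples go undetected.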

Fig. 8. Factor of increase $(I)$ of the number of bits for different $p$ and $d$ when channel errors are present $(\tilde{N}\gg i_{d-1})$.
Fig. 9. Values of $P^e_n$ for different values of $p$ and $d$ $(\varepsilon=0.1$, $P_f=2\cdot 10^{-7})$.

From the figure, it is clear that as the channel error probability increases, the nondetection probability $P^e_n$ increases rapidly if the number of bits used in the recovery process is not increased correspondingly. When the channel error probability is 0.1, more than half of the sparse multiples of the feedback polynomial will not be detected. When the channel error probability is equal to or larger than 0.15, almost all the sparse multiples will go undetected. Therefore, as stated before, in practical situations, channel estimation should be used to ensure that the number of bits used in the recovery process is properly chosen.

### B. Recovery of Scrambler When Insertion of Bits Occurs

During data transmission, another kind of error can occur, namely the insertion/deletion of one or more bits into/from the scrambled bit sequence. In this section, for simplicity, only the insertion of one bit is considered. However, our ideas can easily be extended to the insertion of more than one bit and also to the deletion of one or more bits. Suppose a bit $c_x$ is inserted at time index $t_x$ $(t_x\geq 0)$; the received sequence becomes $$y_0,y_1, y_2,\ldots y_{t_x-1},c_x, y_{t_x}, y_{t_x+1},\ldots$$ Denoting the received sequence by $y'_t$, we have $$y'_t= \cases{y_t, & t\leq t_x-1 \cr c_x, & t=t_x\cr y_{t-1}, & t\geq t_x+1.} \eqno{\hbox{(47)}}$$ For $(i_1,\ldots,i_{d-1})$, when $t\leq t_x-1$, we then have $$z'_t=y'_t\oplus\bigoplus_{j=1}^{d-1} y'_{t-i_j}=y_t\oplus\bigoplus_{j=1}^{d-1} y_{t-i_j}. \eqno{\hbox{(48)}}$$

When $t\geq t_x+i_{d-1}+1$, we have $$z'_t=y'_t\oplus\bigoplus_{j=1}^{d-1} y'_{t-i_j}=y_{t-1}\oplus\bigoplus_{j=1}^{d-1} y_{t-i_j-1}. \eqno{\hbox{(49)}}$$ Comparing with the noiseless case, in which $z_t=y_t\oplus\bigoplus_{j=1}^{d-1}y_{t-i_j}$, we can see that $z'_t=z_t$ when $t\leq t_x-1$, and $z'_t=z_{t-1}$ when $t\geq t_x+i_{d-1}+1$. As the final decision is based on the value of $Z$, we are more interested in the difference between the summations $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z_t}$ and $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z'_t}$. Let $d_z$ denote the number of bits of $\{z_{i_{d-1}},\ldots,z_{\tilde{N}-1}\}$ that differ from $\{z'_{i_{d-1}},\ldots,z'_{\tilde{N}-1}\}$; the density $P_{(t_x,i_{d-1},\tilde{N})}= d_z/(\tilde{N}-i_{d-1})$ then depends on the values of $t_x$, $i_{d-1}$, and $\tilde{N}$, and has the following four expressions:
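Relations (48) and (49) can be checked directly by simulation. The following sketch (helper names are ours) builds the parity-check sequence $z_t$ for a candidate tap set $(i_1,\ldots,i_{d-1})$, inserts one bit, and measures the mismatch density:

```python
import random

def parity_seq(y, taps, n):
    """z_t = y_t XOR y_{t-i_1} XOR ... XOR y_{t-i_{d-1}}, for i_{d-1} <= t < n."""
    imax = max(taps)
    z = []
    for t in range(imax, n):
        v = y[t]
        for i in taps:
            v ^= y[t - i]
        z.append(v)
    return z

random.seed(1)
y = [random.randint(0, 1) for _ in range(60)]
taps, n, t_x, c_x = [2, 5], 40, 20, 1    # i_1 = 2, i_{d-1} = 5, d = 3
y_ins = y[:t_x] + [c_x] + y[t_x:]        # one bit inserted at index t_x

z = parity_seq(y, taps, n)       # noiseless checks z_t
zp = parity_seq(y_ins, taps, n)  # checks z'_t after the insertion
imax = max(taps)
d_z = sum(a != b for a, b in zip(z, zp))
density = d_z / (n - imax)       # the density P_{(t_x, i_{d-1}, N~)}
```

Per (48), $z'_t=z_t$ for $t\le t_x-1$; per (49), $z'_t=z_{t-1}$ for $t\ge t_x+i_{d-1}+1$; only the $i_{d-1}+1$ checks in between behave unpredictably.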

1. $0\leq t_x < i_{d-1}+1$ and $t_x+i_{d-1}<\tilde{N}-1$. In this case, $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z'_t}$ can be written as $$\eqalignno{\sum_{t=i_{d-1}}^{\tilde{N}-1}{z'_t}& =\sum_{t=i_{d-1}}^{t_x+i_{d-1}}z'_t+\sum_{t=t_x+i_{d-1}+1}^{\tilde{N}-1}z'_t\cr & =\sum_{t=i_{d-1}}^{t_x+i_{d-1}}z'_t+\sum_{t=t_x+i_{d-1}}^{\tilde{N}-2}z_t. &\hbox{(50)}}$$ Considering the worst case, i.e., $z'_t$ is independent of $z_t$ $({\rm Pr}\,(z'_t=z_t)=0.5)$ when $i_{d-1}\leq t\leq {t_x+i_{d-1}}$, it can be seen that the number of bits that are different in $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z_t}$ and $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z'_t}$ is approximately $0.5(t_x+1)$ and $P_{(t_x,i_{d-1},\tilde{N})}$ is given by $$P_{(t_x,i_{d-1},\tilde{N})}= {0.5(t_x+1)\over \tilde{N}-i_{d-1}}. \eqno{\hbox{(51)}}$$
2. $0\leq t_x < i_{d-1}+1$ and $t_x+i_{d-1}\geq\tilde{N}-1$. In this case, the number of bits that are different in $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z_t}$ and $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z'_t}$ is approximately $(\tilde{N}-i_{d-1})/2$ and $P_{(t_x,i_{d-1},\tilde{N})}$ is given by $$P_{(t_x,i_{d-1},\tilde{N})}=0.5. \eqno{\hbox{(52)}}$$
3. $i_{d-1}+1\leq t_x < \tilde{N}$ and $t_x+i_{d-1} < \tilde{N}-1$. In this case, $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z'_t}$ can be written as $$\eqalignno{\sum_{t=i_{d-1}}^{\tilde{N}-1}{z'_t}& =\sum_{t=i_{d-1}}^{t_x-1}z'_t+\sum_{t=t_x}^{t_x+i_{d-1}}z'_t+\sum_{t=t_x+i_{d-1}+1}^{\tilde{N}-1}z'_t\cr & =\sum_{t=i_{d-1}}^{t_x-1}z_t+\sum_{t=t_x}^{t_x+i_{d-1}}z'_t+\sum_{t=t_x+i_{d-1}}^{\tilde{N}-2}z_t. &\hbox{(53)}}$$ The number of bits that are different in $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z_t}$ and $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z'_t}$ is approximately $0.5(i_{d-1}+1)$ and $P_{(t_x,i_{d-1},\tilde{N})}$ is given by $$P_{(t_x,i_{d-1},\tilde{N})}= {0.5(i_{d-1}+1)\over \tilde{N}-i_{d-1}}. \eqno{\hbox{(54)}}$$
4. $i_{d-1}+1\leq t_x < \tilde{N}$ and $t_x+i_{d-1}\geq\tilde{N}-1$. In this case, $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z'_t}$ can be written as $$\eqalignno{\sum_{t=i_{d-1}}^{\tilde{N}-1}{z'_t}& =\sum_{t=i_{d-1}}^{t_x-1}z'_t+\sum_{t=t_x}^{\tilde{N}-1}z'_t\cr & =\sum_{t=i_{d-1}}^{t_x-1}z_t+\sum_{t=t_x}^{\tilde{N}-1}z'_t. &\hbox{(55)}}$$ The number of bits that are different in $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z_t}$ and $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z'_t}$ is approximately $0.5(\tilde{N}-t_x)$ and $P_{(t_x,i_{d-1},\tilde{N})}$ is given by $$P_{(t_x,i_{d-1},\tilde{N})}= {0.5(\tilde{N}-t_x)\over \tilde{N}-i_{d-1}}. \eqno{\hbox{(56)}}$$
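The four cases (51), (52), (54), and (56) combine into a single piecewise function. The sketch below (names are ours; `i_max` stands for $i_{d-1}$ and `n` for $\tilde{N}$) evaluates the approximate density:

```python
def insertion_mismatch_density(t_x, i_max, n):
    """Approximate density P_{(t_x, i_{d-1}, N~)} of parity checks flipped
    by one bit inserted at index t_x, following the four cases above."""
    if t_x < i_max + 1:                              # insertion near the start
        if t_x + i_max < n - 1:
            return 0.5 * (t_x + 1) / (n - i_max)     # (51)
        return 0.5                                   # (52)
    if t_x + i_max < n - 1:                          # insertion in the middle
        return 0.5 * (i_max + 1) / (n - i_max)       # (54)
    return 0.5 * (n - t_x) / (n - i_max)             # (56), near the end
```

The density rises linearly from either end of the sequence and saturates at $0.5(i_{d-1}+1)/(\tilde{N}-i_{d-1})$, or at 0.5 when $\tilde{N}$ is small relative to $i_{d-1}$.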

In Figs. 10 and 11, the variation of $P_{(t_x,i_{d-1},\tilde{N})}$ versus $t_x$ according to the above analysis is depicted. When $\tilde{N}\geq 2(i_{d-1}+1)$, the variation of $P_{(t_x,i_{d-1},\tilde{N})}$ versus $t_x$ follows the curve shown in Fig. 10, and when $\tilde{N} < 2(i_{d-1}+1)$, it follows the curve shown in Fig. 11.

From both figures, it can be observed that the insertion of one bit has the least impact on the summation $\sum_{t=i_{d-1}}^{\tilde{N}-1}{z_t}$ when the bit is inserted at either end of the bit sequence. When the insertion point moves from the two ends toward the middle of the bit sequence, the impact increases and finally reaches a maximum value, which depends on the values of $i_{d-1}$ and $\tilde{N}$. According to [7], for a randomly chosen primitive polynomial of degree $L$, the minimum value of $i_{d-1}$ is $$i_{d-1}=((d-1)!)^{{1\over d-1}}\,2^{{L\over d-1}}. \eqno{\hbox{(57)}}$$
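Estimate (57) is straightforward to compute; the sketch below (function name is ours) shows how quickly the expected minimal degree grows with $L$:

```python
from math import factorial

def min_multiple_degree(L, d):
    """Estimate (57): minimal degree i_{d-1} of a weight-d multiple
    of a randomly chosen primitive polynomial of degree L."""
    return factorial(d - 1) ** (1.0 / (d - 1)) * 2.0 ** (L / (d - 1))
```

For $d=2$ the estimate is simply $2^L$, while a larger $d$ trades a denser multiple for a much smaller degree.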

Fig. 10. Variations of $P_{(t_x,i_{d-1},\tilde{N})}$ versus $t_x$ $(\tilde{N}\geq 2(i_{d-1}+1))$.
Fig. 11. Variations of $P_{(t_x,i_{d-1},\tilde{N})}$ versus $t_x$ $(\tilde{N} < 2(i_{d-1}+1))$.

For a small value of $L$, $\tilde{N}\geq 2(i_{d-1}+1)$ will normally hold, and from Fig. 10, it can be observed that the maximum value of $P_{(t_x,i_{d-1},\tilde{N})}$ is $0.5(i_{d-1}+1)/(\tilde{N}-i_{d-1})$. As $L$ increases, $i_{d-1}$ grows exponentially with $L$, and eventually $\tilde{N} < 2(i_{d-1}+1)$ will be satisfied. In this case, when $\tilde{N}-(i_{d-1}+1)\leq t_x < i_{d-1}+1$, $P_{(t_x,i_{d-1},\tilde{N})}$ will be as high as 0.5.

In the following, the impact of $P_{(t_x,i_{d-1},\tilde{N})}$ on the performance of the reconstruction is discussed. For simplicity, we consider the insertion of one bit in the scrambled bit sequence to be equivalent to passing $y_t$ through a BSC with channel error probability $p_{\rm eq}$. Let $P_{(p_{\rm eq})}$ denote the density of the number of different bits in $\{z_{i_{d-1}},\ldots,z_{\tilde{N}-1}\}$ and $\{z'_{i_{d-1}},\ldots,z'_{\tilde{N}-1}\}$ when $y_t$ is passed through a BSC with channel error probability $p_{\rm eq}$; from (32) and (35), we have $$\eqalignno{P_{(p_{\rm eq})}& ={\rm Pr}(z'_t\oplus z_t=1)\cr & ={\rm Pr}\left(e_t\oplus\bigoplus_{j=1}^{d-1}e_{t-i_j}=1\right)= {1\over 2}\left[1-(2\delta_{\rm eq})^d\right]\cr &&\hbox{(58)}}$$ where $\delta_{\rm eq}= {1\over 2}-p_{\rm eq}$. Setting $P_{(p_{\rm eq})}=P_{(t_x,i_{d-1},\tilde{N})}$, it can be derived that $$\delta_{\rm eq}= {1\over 2} \sqrt[d]{1-2P_{(t_x,i_{d-1},\tilde{N})}} \eqno{\hbox{(59)}}$$ and consequently $$p_{\rm eq}= {1\over 2}-\delta_{\rm eq}= {1\over 2}\left(1-\sqrt[d]{1-2P_{(t_x,i_{d-1},\tilde{N})}}\right). \eqno{\hbox{(60)}}$$
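The conversions (58)–(60) between the mismatch density and the equivalent BSC error probability can be sketched as follows (function names are ours):

```python
def density_from_bsc(p_eq, d):
    """Flip density of a weight-d parity check over a BSC, per (58)."""
    delta_eq = 0.5 - p_eq
    return 0.5 * (1.0 - (2.0 * delta_eq) ** d)

def bsc_from_density(P, d):
    """Equivalent BSC error probability p_eq for a given density P, per (60)."""
    return 0.5 * (1.0 - (1.0 - 2.0 * P) ** (1.0 / d))
```

The two maps are inverses of each other; $P=0$ gives $p_{\rm eq}=0$, and $P=0.5$ gives the hopeless case $p_{\rm eq}=0.5$.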

According to (60), for each value of $P_{(t_x,i_{d-1},\tilde{N})}$, there is a corresponding channel error probability $p_{\rm eq}$; the larger the value of $P_{(t_x,i_{d-1},\tilde{N})}$, the larger the value of $p_{\rm eq}$. Details on the impact of the channel error probability on the performance of the reconstruction can be found in Section V-A. For most practical systems, we have $\tilde{N}\gg i_{d-1}$ and $0.5(i_{d-1}+1)/(\tilde{N}-i_{d-1})\approx 0$. According to Fig. 10, $P_{(t_x,i_{d-1},\tilde{N})}\leq 0.5(i_{d-1}+1)/(\tilde{N}-i_{d-1})$; therefore, for any value of $t_x$, we have $P_{(t_x,i_{d-1},\tilde{N})}\approx 0$ and hence $p_{\rm eq}\approx 0$. This means that when $\tilde{N}\gg i_{d-1}$, the performance of the reconstruction is not affected by the insertion of one bit, no matter where the bit is inserted. However, when $i_{d-1}$ becomes larger, especially when $i_{d-1}\geq 0.5\tilde{N}$, the value of $p_{\rm eq}$ will vary from 0 to 0.5 depending on where the bit is inserted. In the worst case, i.e., $p_{\rm eq}=0.5$, the reconstruction will never succeed no matter how large the bias $\varepsilon$ is, because, as described in Section V-A, when $p_{\rm eq}=0.5$, we have $\delta_{\rm eq}=0$, and after passing through the BSC, the new bias becomes $2\varepsilon\delta_{\rm eq}=0$. In this case, the two distributions depicted in Fig. 2 overlap and, therefore, no multiple of the feedback polynomial can be detected.

SECTION VI

## CONCLUSION

Cluzeau's algorithm is very promising for reconstructing the feedback polynomial of the LFSR used in a linear scrambler. In this paper, a scheme to improve the detection capability of Cluzeau's algorithm has been proposed. Simulation results show that the detection capability is significantly improved by the proposed scheme. A tighter upper bound for $\sigma$, which approaches the actual $\sigma$ more closely, has also been derived. By using the new upper bound, the number of bits and operations required by Cluzeau's algorithm to reconstruct the feedback polynomial of the LFSR can be reduced significantly without affecting the detection capability. As the number of operations is reduced, the time required for reconstruction is also reduced. According to our analysis, the higher the weight of the sparse multiples to be searched, the higher the time reduction factor.

It should be noted that even with the improved algorithm, the value of $\varepsilon$ significantly affects the number of bits and operations and, in turn, the running time of the algorithm. For example, when the source bias is reduced from 0.1 to 0.01, the number of operations required for the reconstruction increases by a factor of at least $10^6$. In this case, even for a feedback polynomial of very small degree, the detection time becomes very long. For example, for the first feedback polynomial in Table IV, the detection time would increase from 3 s to about 35 days! Fortunately, for natural sources in practical situations, the typical values of $\varepsilon$ are 0.1 and 0.05 [7].

Another issue investigated in this paper is how to recover the scrambler in the presence of noise. Our analysis shows that when the scrambled bits are passed through a BSC, the feedback polynomial of the LFSR can still be recovered by the same method as the one proposed for recovery of the LFSR in the noiseless condition. The only difference is that the effective source bias changes, depending on the channel error probability $p$. As the effective source bias is smaller than the original source bias when bit errors are present, the number of bits required in the reconstruction becomes larger in order to maintain the detection capability. The larger the value of $p$, the larger the number of bits required for the reconstruction. As the factor of increase in the number of bits varies considerably for different values of $p$ and $d$, channel estimation is proposed to obtain the statistical properties of the channel.

We have also investigated the problem of reconstructing the scrambler when one bit is inserted into the scrambled bit sequence. We have found that when the number of bits used in the reconstruction $(\tilde{N})$ is much larger than the minimum degree of the multiple of the feedback polynomial $(i_{d-1})$, the performance of the reconstruction is not affected by the insertion of one bit, no matter where the bit is inserted. However, with the increase in the degree of the feedback polynomial, the degradation in the performance of the reconstruction algorithm varies considerably depending on where the bit is inserted. When the bit is inserted at either end of the bit sequence, the performance degradation is small. When the insertion point moves from the two ends toward the middle of the bit sequence, the performance degradation increases until a maximum is reached; this maximum depends on $\tilde{N}$ and $i_{d-1}$. In the worst case, i.e., $i_{d-1}\geq 0.5\tilde{N}$ with the bit inserted at the midpoint of the sequence, the reconstruction will never succeed even though only one bit is inserted.

## Footnotes

The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Z. Jane Wang.

X.-B. Liu and S. N. Koh are with the School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore 639798.

X.-W. Wu is with the School of Information and Communication Technology, Griffith University, Gold Coast, QLD 4222, Australia.

C.-C. Chui is with the Temasek Laboratories at Nanyang Technological University, Singapore 639798.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
