A Scalable and Accurate Chessboard-Based AMC Algorithm With Low Computing Demands

Automatic Modulation Classification (AMC) is a technique used to identify signal modulations in applications such as IoT devices, cognitive radar, software-defined radio, and electronic warfare. With the expected wide deployment of IoT devices, AMC algorithms need to be compact enough for embedded devices with limited resources while maintaining acceptable accuracy. Although current AMC algorithms deliver high accuracy, they require substantial computing power, making them unsuitable for IoT devices. This paper introduces the novel Chessboard-based Automatic Modulation Classification (CAMC) algorithm. Test results reveal that CAMC achieves 99%* accuracy under a 3dB SNR condition and 100% above 5dB SNR. Meanwhile, the algorithm is scalable and demands less computing power. It offers better accuracy than state-of-the-art AMC algorithms when classifying the mainstream modulations used in IoT devices (BPSK, QPSK, 8PSK, and 16QAM), while requiring less computing power than existing algorithms. Additionally, CAMC is hardware-friendly due to its inherent parallelism and scalability. The novelty of this paper is to classify four different modulations in a hardware-friendly way with a low computation load while achieving an accuracy of over 99%* at SNRs of 3dB and above. (* Accuracy reached most of the time.)


I. INTRODUCTION
Automatic modulation classification (AMC) serves as an intermediary process between signal detection and demodulation, enabling the identification of signals with unknown modulations [1]. AMC finds applications in radio frequency (RF) signal processing fields, including cognitive radar and electronic warfare, as well as mobile communication systems such as software-defined radio (SDR) [2], [3]. By addressing high-density spectrum issues in communication environments, AMC enhances the efficiency of demodulating signals with unknown modulations, thereby alleviating spectrum congestion and improving overall system performance [4], [5]. Given these advantages, AMC-related technologies are increasingly being applied to IoT devices. Due to the large number of IoT devices in the world, the problem of limited spectrum cannot be ignored. Therefore, more communication-based technologies, such as cognitive radar and SDR, are now being applied to IoT devices [6], [7].
The associate editor coordinating the review of this manuscript and approving it for publication was Julien Le Kernec.
AMC methodologies can be broadly categorized into two main approaches: the feature-based approach and the decision-theoretic approach. In the feature-based approach, features are first extracted from the signals, and then the modulation is classified based on these features. This method is straightforward and does not necessitate prior knowledge of communication or modulation. On the other hand, the decision-theoretic approach demands comprehensive communication knowledge and detailed information, such as phase jitter, erroneous channel state information, and frequency offset. This method involves maximizing the likelihood information before using it to classify the modulation [2]. Recently, traditional AMC methods have been integrated with deep learning applications to achieve higher accuracy, as demonstrated in [4] and [8]. In [4], constellation and spider images serve as input for a Convolutional Neural Network (CNN) to classify the signal via the constellation image.
In recent years, Software-Defined Radio (SDR) has gained popularity, with ARM (Advanced RISC Machine)-based, DSP (Digital Signal Processor)-based, and FPGA (Field Programmable Gate Array)-based SDRs available. However, ARM-based and DSP-based SDRs may not meet the stringent timing requirements of future communication systems due to shrinking symbol time windows. Consequently, FPGAs, with their high sampling frequency capabilities [9], [10], could become the only viable option. Thus, it is essential to consider hardware implementation aspects of AMC algorithms during the design stage. Although the algorithm in [11] achieves nearly 100% accuracy even at −2dB of SNR, its complexity poses challenges for FPGA implementation due to the logarithmic, root, and division operations involved.
Performing operations like logarithms on FPGAs is resource-intensive and power-consuming [12]. Additionally, limited reconfigurable hardware capability for floating-point calculations may decrease accuracy during implementation. Utilizing lookup tables can attain high accuracy but significantly increases hardware resource usage and necessitates hardware expertise [13], [14].
Deep-learning-aided methodologies can enhance accuracy but exhibit drawbacks, such as being computationally intensive, energy-consuming, and less practical in real-life scenarios [11]. Furthermore, generated features are not easily explainable, hindering optimization based on domain-specific knowledge. Existing AMC algorithms exhibit an imbalance between complexity and accuracy.
To sum up, based on the related work mentioned above, the problems that remain in state-of-the-art designs are listed below:
1) Complicated calculations (decision-theoretic method) or a large number of calculations (deep-learning-aided method) are required to achieve high accuracy, which leads to a heavy computation load.
2) The algorithms mentioned are not hardware-friendly, which could lead to accuracy degradation when applied to hardware.
3) Existing work requires a long processing time.
With more stringent speed and accuracy requirements in future communication systems, we propose a novel, low-complexity AMC algorithm capable of accurately detecting four mainstream modulations: BPSK, QPSK, 8PSK, and 16QAM. These four modulations are chosen since they are still widely used in IoT communication standards, including WEIGHTLESS-P and WEIGHTLESS-W [15]. By combining the feature-based method with a hardware-friendly algorithm, the calculation load can be reduced, and performance is better preserved when the algorithm is applied to a hardware platform. Our inherently parallel, scalable, and hardware-friendly algorithm achieves accuracy comparable to existing methods while significantly reducing complexity. Detailed contributions are as follows:
1. We propose a Chessboard-based AMC (CAMC) algorithm whose accuracy is competitive with existing algorithms. The accuracy reaches 99% when the SNR is over 3dB and 100% when the SNR is over 5dB.
2. The CAMC algorithm reaches 97% accuracy with 2000 test symbols and shows good noise tolerance. As the number of test symbols increases, the accuracy also increases.
3. The algorithm has lower computational complexity than other published works.
The rest of the paper is organized as follows: In Section II, we provide a detailed description of the chessboard algorithm. In Section III, simulation results from MATLAB and real-world data are provided, together with comparisons with other papers. In Section IV, the conclusion and future perspectives are given.

II. ALGORITHM AND ANALYSIS
An AMC algorithm typically starts from the constellation graph, which provides coordinates containing both the real and imaginary parts of a signal, as exemplified by the Constellation Image (CI) method in [4]. As illustrated in Figure 1, after phase lock, high-SNR signals can be manually distinguished by observing the distinct characteristics of various modulations, such as the locations where most points are concentrated. Signals are collected assuming the receiver is in symbol phase-lock, or that the frequency error is sufficiently low so that the constellation rotation over the analysis period is less than a symbol boundary.
Because of noise, the received points are scattered around the ideal constellation points according to the noise level. To facilitate a more straightforward yet effective distinction between different constellation points, we introduce the chessboard concept. As depicted in Figure 2, we first use small grids to divide the constellation plane, giving it a chessboard-like appearance. Grid_scale is the side length of the squares used to divide the constellation graph, as shown in Figure 2. Subsequently, CAMC counts the number of points located within each square, providing insight into the distribution of the points. This information is then transformed into a matrix. Finally, by applying an inner product between the chessboard matrices of the reference symbols and the testing signal, we can measure the similarity between the reference modulations and the testing input; larger results indicate a higher likelihood of the same modulation.
The detailed process of the CAMC algorithm is represented in Figure 3, and its steps are analyzed below.
Chessboard Matrix Generation Process: First, the initially set XY_scale (XYS) and grid_scale (GS) are used to generate a square matrix (M), whose column/row size (S) is determined by these two values. Then, I/Q pairs are taken as input data, as I and Q respectively. Before passing to the next procedure, I and Q must satisfy a conditional check (an if statement). After this check, I and Q are transformed into coordinates of the square matrix, C_I and C_Q, where C_I is the row and C_Q is the column. R() refers to the round function.
Finally, the square matrix M is updated by the input. The above process is for a single I/Q pair. The total number of I/Q pairs can be set by changing the number of symbols in the code.
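The matrix generation steps described above can be sketched in Python as follows. The exact size formula, the form of the if statement, and the coordinate mapping are not spelled out in the text, so the versions used here (S = 2·XYS/GS + 1, a range check |I|, |Q| ≤ XYS, and C = R((x + XYS)/GS)) are assumptions chosen to be consistent with the description. The function name `chessboard` and its default parameters are hypothetical.

```python
def chessboard(symbols, xy_scale=2.0, grid_scale=0.25):
    """Count how many I/Q points fall into each grid square (the depth)."""
    # ASSUMPTION: the plane [-xy_scale, xy_scale] is covered by squares of
    # side grid_scale, giving a square matrix of this size.
    size = int(round(2 * xy_scale / grid_scale)) + 1
    m = [[0] * size for _ in range(size)]
    for s in symbols:
        i, q = s.real, s.imag
        # ASSUMPTION: the "if statement" is taken to be a range check that
        # discards points outside the chessboard.
        if abs(i) > xy_scale or abs(q) > xy_scale:
            continue
        # ASSUMPTION: shift to non-negative coordinates, scale, then round.
        c_i = int(round((i + xy_scale) / grid_scale))  # row
        c_q = int(round((q + xy_scale) / grid_scale))  # column
        m[c_i][c_q] += 1
    return m


# Two points near (1, 1), one at (-1, -1), and one outside the plane.
points = [1 + 1j, 1.05 + 0.95j, -1 - 1j, 5 + 0j]
board = chessboard(points)
total = sum(sum(row) for row in board)
```

In this sketch, the two points near (1, 1) land in the same square and the out-of-range point is dropped, so the matrix holds three counts in total.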

A. MATRIX INNER PRODUCT PROCESS
After the matrix generation process, 8 matrices will have been generated: 4 from training data for BPSK, QPSK, 8PSK, and 16QAM, respectively, and 4 input matrices with different angles of rotation. The inner product (IP) is computed between each input matrix and each training matrix, and the inner product results (R) are gathered. Finally, the largest of the 16 results is picked as the final result (FR), and the modulation corresponding to FR is the classification result:
FR = max(among all R)    (7)
To further improve the accuracy of the CAMC algorithm, a 2-pass classification method is applied. When the first classification is incorrect, the algorithm picks the second-largest result whose modulation type differs from that of the first classification.
The macro view of the CAMC process is as follows. The chessboard process can be divided into 2 steps: "training" and "inference". It begins with the generation of the training symbol matrices and then proceeds to find the inner products between the 4 input matrices (including rotation data) and the 4 modulation matrices. The modulation with the largest result is the prediction.
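A minimal sketch of the inner product and selection step is given below, using tiny 2x2 matrices and placeholder modulation names "A" and "B" purely for illustration. The helper names `inner_product` and `classify` are hypothetical. Returning both the first and the second choice is one possible way to support the 2-pass method; how an incorrect first classification is detected is not specified in the text and is left out here.

```python
def inner_product(a, b):
    """Multiply two equally sized matrices element by element and sum."""
    return sum(x * y
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


def classify(input_boards, training_boards):
    """Return (first_choice, second_choice) modulation names."""
    # Best score per modulation over all rotated input matrices. With 4
    # inputs and 4 training matrices this covers all 16 results.
    best = {name: max(inner_product(board, ref) for board in input_boards)
            for name, ref in training_boards.items()}
    ranked = sorted(best, key=best.get, reverse=True)
    # ranked[0] owns the largest result (FR); ranked[1] is the largest
    # result belonging to a different modulation (used by the 2nd pass).
    return ranked[0], ranked[1]


# Toy data: placeholder modulations "A" and "B", two input matrices.
training = {"A": [[5, 0], [0, 5]], "B": [[0, 5], [5, 0]]}
inputs = [[[3, 1], [0, 2]], [[1, 0], [0, 1]]]
first, second = classify(inputs, training)
```

Taking the maximum per modulation and then ranking the modulations is equivalent to picking the largest of all results and, for the second pass, the next largest result with a different modulation type.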

B. TRAINING PROCESS
Before introducing the training process, it is essential to define the concept of depth. Depth refers to the number of points located in a single square on the chessboard, which will later be converted into an element in the chessboard matrix. First, users set a customized grid scale, which influences the accuracy and computational load of the results. The grid is then used to divide the constellation plane into squares. Subsequently, the algorithm counts the number of points located in the squares, representing the depths. The grid scale can be defined by users according to their needs; however, recommended numbers for hardware-friendly purposes include 0.1, 1/4, 1/8, and 1/16, among others. These numbers can be transferred into hardware using standard multipliers and shifters without imposing excessive precision requirements on the number system.
After setting the grid scale, we employ MATLAB to generate modulation data and obtain 4 classification matrices for BPSK, QPSK, 8PSK, and 16QAM, respectively. For each matrix, we generate a predetermined number of constellation points (defined by the users) with a fixed SNR. The process of creating these matrices serves as the training process for the algorithm, establishing the values in the classification matrices for the subsequent stage, referred to as the "inference" stage.
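The training stage can be illustrated with a short Python sketch (the paper itself uses MATLAB). The unit-power constellations, the additive white Gaussian noise model, the seed, and the `chessboard` helper with its size and rounding rules are all assumptions made for this example. The 1000 symbols and 20dB SNR are the values quoted in the results section.

```python
import cmath
import math
import random

# ASSUMPTION: unit-average-power constellations; QPSK sits on the diagonals.
CONSTELLATIONS = {
    "BPSK": [1 + 0j, -1 + 0j],
    "QPSK": [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)],
    "8PSK": [cmath.exp(1j * k * math.pi / 4) for k in range(8)],
    "16QAM": [complex(i, q) / math.sqrt(10)
              for i in (-3, -1, 1, 3) for q in (-3, -1, 1, 3)],
}


def noisy_symbols(name, count, snr_db, rng):
    """Draw random symbols and add complex Gaussian noise at the given SNR."""
    sigma = math.sqrt(1 / (2 * 10 ** (snr_db / 10)))  # per I/Q component
    return [rng.choice(CONSTELLATIONS[name])
            + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
            for _ in range(count)]


def chessboard(symbols, xy_scale=2.0, grid_scale=0.25):
    """Hypothetical depth-counting helper (size and rounding are assumed)."""
    size = int(round(2 * xy_scale / grid_scale)) + 1
    m = [[0] * size for _ in range(size)]
    for s in symbols:
        if abs(s.real) <= xy_scale and abs(s.imag) <= xy_scale:
            row = int(round((s.real + xy_scale) / grid_scale))
            col = int(round((s.imag + xy_scale) / grid_scale))
            m[row][col] += 1
    return m


rng = random.Random(0)  # fixed seed so the sketch is repeatable
training = {name: chessboard(noisy_symbols(name, 1000, 20, rng))
            for name in CONSTELLATIONS}
depth_totals = {name: sum(map(sum, board)) for name, board in training.items()}
```

Each entry of `training` is one classification matrix; the depths in each matrix add up to the number of generated points that fell inside the chessboard.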

C. INFERENCE PROCESS
After completing the training process, we can utilize the generated matrices to classify a signal with unknown modulation.
After the input symbols are received, they are rotated by 45°, −45°, and 90°, respectively. These 4 sets of data (including the non-rotated data) are then used to generate 4 input matrices with the same depth concept as in the training process.
Then, the inner products between the input matrices and the training matrices are calculated to maximize the difference. Each input matrix is multiplied with the training matrices of BPSK, QPSK, 8PSK, and 16QAM, respectively, so 16 results are gathered in total. The modulation with the largest output is identified as the classification result. Furthermore, to improve the accuracy of CAMC, a 2-pass method is applied. If the first classification is wrong, the algorithm picks the second-largest result whose modulation type is different from that of the first classification.
During this process, the number of input constellation points is also scalable to accommodate different demands. For higher accuracy, users can opt for a larger number of input data; for higher speed and lower power and resource use per classification, a smaller amount of data can be used.
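The rotation step of the inference stage amounts to multiplying each complex symbol by a unit phasor. The sketch below shows this for the four angles named above; the function name `rotated_sets` and the single example symbol are hypothetical.

```python
import cmath
import math

ROTATIONS_DEG = (0, 45, -45, 90)  # non-rotated data plus the three rotations


def rotated_sets(symbols):
    """Return one rotated copy of the symbol list per rotation angle."""
    return [[s * cmath.exp(1j * math.radians(deg)) for s in symbols]
            for deg in ROTATIONS_DEG]


# One symbol on the 45-degree diagonal of the unit circle.
example = [cmath.exp(1j * math.pi / 4)]
sets = rotated_sets(example)
```

Each of the four lists in `sets` would then be turned into its own input matrix before the inner products are computed.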

D. PARALLELISM
This algorithm is well-suited for parallel implementation because there is no data dependency when performing all the inner products. Figure 4 and Figure 5 illustrate the parallel approach in chessboard generation and matrix inner product computation, respectively. Figure 4 shows the generation of N chessboards of the same size, each accumulating depths from part of the input; the depths of all chessboards are added together after the process. In this way, the processing time can be shortened to 1/N, depending on the chosen degree of parallelism.
Figure 5 shows an example in which the inner product is computed from whole matrices. However, as also shown in Figure 5, the complete matrices can be decomposed into smaller ones to make the process parallel. After the partial inner products, all results are added together to obtain the classification. For example, if a larger matrix is decomposed into N smaller partial matrices, the total execution time can be reduced to 1/N of the original.
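The two decompositions described above rely on simple additivity, which the following sketch checks numerically. It runs sequentially; the point is only that the N partial results are independent and could be computed by separate hardware units. The helper names, the grid parameters, and the synthetic unit-circle input are assumptions for illustration.

```python
import math


def chessboard(symbols, xy_scale=2.0, grid_scale=0.25):
    """Hypothetical depth-counting helper (size and rounding are assumed)."""
    size = int(round(2 * xy_scale / grid_scale)) + 1
    m = [[0] * size for _ in range(size)]
    for s in symbols:
        if abs(s.real) <= xy_scale and abs(s.imag) <= xy_scale:
            row = int(round((s.real + xy_scale) / grid_scale))
            col = int(round((s.imag + xy_scale) / grid_scale))
            m[row][col] += 1
    return m


def add_boards(boards):
    """Add several equally sized chessboards element by element."""
    return [[sum(cells) for cells in zip(*rows)] for rows in zip(*boards)]


def inner_product(a, b):
    return sum(x * y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


symbols = [complex(math.cos(k), math.sin(k)) for k in range(600)]
n = 4

# Chessboard generation: N boards built from disjoint parts of the input.
chunks = [symbols[k::n] for k in range(n)]
merged = add_boards([chessboard(chunk) for chunk in chunks])
full = chessboard(symbols)

# Inner product: split the matrices into row blocks and add partial sums.
reference = chessboard(symbols[:300])
block = 5
partials = [inner_product(full[r:r + block], reference[r:r + block])
            for r in range(0, len(full), block)]
```

Here `merged` equals `full`, and the sum of `partials` equals the inner product of the whole matrices, so neither decomposition changes the classification result.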

III. RESULTS AND COMPARISON
A. ACCURACY
MATLAB was employed to generate signals with BPSK, QPSK, 8PSK, and 16QAM modulation, respectively, using 1000 symbols under 20dB SNR, producing one classification matrix per modulation (four in total). The accuracy was evaluated under various SNRs (from 0dB to 20dB), different grid scales, and varying numbers of input data. The outcomes were compared with those from other published works.
Table 1 presents the detection accuracy under different grid scales. Under the condition of 6000 inputs, grid scales of 0.1, 1/4, 1/8, and 1/16 were tested. The results indicate that, for QPSK classification under low SNR conditions, accuracy increases as the grid scale becomes smaller.
Table 2 shows the classification accuracy under varying numbers of input signals. As the number of input data increases, the accuracy improves and the results become more stable. Meanwhile, with a large number of inputs, only QPSK needs the 2-pass method to be classified, whereas with a small number of inputs, QPSK, 8PSK, and 16QAM all need it. A larger number of input data provides the algorithm with more elements in the matrix, which amplifies the differences in the results and makes the modulation more distinguishable.
Figure 6 and Table 3 present comparisons between this work and other published works [2], [4], and [9]. In Figure 6, the worst cases are highlighted, while Table 3 provides the accuracy rates of the different approaches in those papers (NA denotes data not mentioned). The results of 6000 input symbols and a grid scale of 0.1 were used for comparison.
Under high SNR conditions, CAMC performs exceptionally well, with an accuracy of 100% under SNR conditions above 5dB.
At 0dB, only QPSK presents a challenge, while the other modulations are detected with 100% accuracy. QPSK classification under a low SNR is also difficult in other papers because of the similar characteristics of QPSK and 8PSK. For example, the accuracy of QPSK classification does not exceed 60% in [4], with results of 54% (CI [4]) and 57% (GRF [4]), respectively, and is lower than 70% in [2] (69.3% using the fixed threshold method).
This issue becomes worse in the CAMC algorithm because the constellation graphs of QPSK and 8PSK are similar, as shown in Figure 7. The CAMC algorithm computes an inner product. In an inner product, whenever an element of either matrix is 0, that pair of elements can be ignored in the calculation since its product is 0. Therefore, the inner product in CAMC can be simplified to a "same-position product with the training matrices", as in Figure 8. In Figure 9, the red centers are from the QPSK training matrix, while the red and blue centers together are from the 8PSK matrix; four centers are shared by the two matrices. Yellow points indicate the constellation graph of a 0dB QPSK input. Because 6000 symbols are inserted, all 8 centers will have a similar number of symbols. Besides, when the QPSK input is rotated by 45°, the rotated input matrix has no centers in common with the QPSK training matrix but does share centers with the 8PSK training matrix, as shown in Figure 10. Therefore, after the inner product, the 8PSK result is much larger than the QPSK result, and all QPSK signals at 0dB are classified as 8PSK. This could also be optimized: since every QPSK signal at 0dB is classified as 8PSK, the remaining decision is only between 8PSK and QPSK. When the SNR is 3dB, however, the signal-to-noise power ratio is about 2. Under this condition, the difference between QPSK and 8PSK is much more pronounced, which explains the large gap between the accuracies at 0dB and 3dB.
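The geometric argument above can be checked with a few lines of Python. The sketch assumes ideal, noise-free constellation centers, with QPSK on the diagonals and 8PSK at multiples of 45°; this placement is an assumption, chosen because it matches the shared-center description. The helper `key` is hypothetical and only rounds coordinates so that centers can be compared.

```python
import cmath
import math


def key(z, digits=6):
    """Round a complex point so that equal centers compare equal."""
    return (round(z.real, digits), round(z.imag, digits))


qpsk = [cmath.exp(1j * (math.pi / 4 + k * math.pi / 2)) for k in range(4)]
psk8 = [cmath.exp(1j * k * math.pi / 4) for k in range(8)]

qpsk_centers = {key(p) for p in qpsk}
psk8_centers = {key(p) for p in psk8}
rotated_qpsk = {key(p * cmath.exp(1j * math.pi / 4)) for p in qpsk}

shared = qpsk_centers & psk8_centers          # centers common to both
rotated_hits_qpsk = rotated_qpsk & qpsk_centers
rotated_hits_8psk = rotated_qpsk & psk8_centers

# SNR in dB converted to a linear power ratio.
ratio_0db = 10 ** (0 / 10)
ratio_3db = 10 ** (3 / 10)
```

Under these assumptions, QPSK shares four centers with 8PSK, the 45°-rotated QPSK input shares none with QPSK and four with 8PSK, and moving from 0dB to 3dB roughly doubles the signal-to-noise power ratio (from 1 to about 2).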
The confusion matrix of this CAMC design (6000 input symbols, grid scale = 0.1) is shown below with SNR of 3dB, 5dB, 10dB, and 20 dB respectively.
Apart from errors caused by noise, synchronization error is another factor that can influence the signals. It is caused by delays in the circuits [16]. However, other recent AMC publications did not mention or test for synchronization errors; therefore, for a fair comparison, this error test is not shown in this paper.
Moreover, some lab-gathered (real-world) data were classified with the CAMC algorithm. The signals were generated at mmWave frequencies using a Rohde & Schwarz SMM100A. They were then transmitted over a cable and received by a Keysight PXA N9030B signal analyzer. Below are the confusion matrices. These lab-gathered data contain synchronization errors. There are 20000 symbols for each modulation, divided into 10 groups of 2000 symbols each. These groups are fed into the CAMC algorithm. The results show that the CAMC algorithm classifies the lab-gathered data with 100% accuracy.

B. COMPUTATION DEMANDS AND HARDWARE FRIENDLINESS
When implementing an algorithm in hardware under time and power constraints, it is sometimes necessary to adjust the algorithm, such as modifying precisions or divisions, which may sacrifice precision. However, our algorithm is hardware-friendly and does not require any such modifications. Additionally, our algorithm is computationally simple and can meet the future challenge of short time windows for classification.
The number of operations is counted in Table 4, Table 5, and Table 6.
In (1), SN is the number of input symbols, and XY_scale represents the larger absolute value of the real and imaginary parts of the data. Grid_scale refers to the user-determined grid scale, while symbol_no indicates the number of input signal data chosen by the user. The second row demonstrates a choice between shift or multiplication. If users select a grid scale of 0.1, then multiplication is used; otherwise, bit-shifting is employed.
VOLUME 11, 2023
Compared to other related work that utilizes deep learning or AI-assisted methods, this approach demands less computing power and holds the potential for implementations with high performance-power efficiency. Furthermore, this methodology is well-suited for FPGA applications, as it requires far fewer and simpler operations. Additionally, users can choose grid scales of 0.1, 1/4, 1/8, and 1/16, which can be converted to bit-shifting and multiplication instead of performing divisions.
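To illustrate why power-of-two grid scales avoid division, the sketch below computes a chessboard coordinate in fixed-point arithmetic using only an addition and shifts. The 10-bit fractional format, the function names, and the coordinate mapping R((x + XY_scale)/grid_scale) are assumptions for this example rather than the paper's hardware design.

```python
FRAC_BITS = 10  # ASSUMPTION: hypothetical fixed-point format (Q format)


def to_fixed(x):
    """Convert a real value to a fixed-point integer."""
    return int(round(x * (1 << FRAC_BITS)))


def cell_index_shift(value_fixed, xy_scale_fixed, log2_inv_grid):
    """Cell index for grid_scale = 1 / 2**log2_inv_grid, without division."""
    # Dividing by the grid scale equals multiplying by 2**log2_inv_grid,
    # which is a left shift.
    shifted = (value_fixed + xy_scale_fixed) << log2_inv_grid
    # Add half an LSB of the integer part, then drop the fractional bits
    # (round half up).
    return (shifted + (1 << (FRAC_BITS - 1))) >> FRAC_BITS


xy_fixed = to_fixed(2.0)
idx_eighth = cell_index_shift(to_fixed(0.3), xy_fixed, 3)   # grid scale 1/8
idx_quarter = cell_index_shift(to_fixed(0.7), xy_fixed, 2)  # grid scale 1/4

# Floating-point reference using an explicit division.
ref_eighth = int(round((0.3 + 2.0) / 0.125))
ref_quarter = int(round((0.7 + 2.0) / 0.25))
```

For a grid scale of 0.1, the same idea applies with a multiplication by 10 in place of the shift, which is consistent with the shift-or-multiply choice described above.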

IV. CONCLUSION AND FUTURE WORK
In this paper, we propose a Chessboard-based Automatic Modulation Classification (CAMC) algorithm that employs a depth concept to automatically classify four modulations: BPSK, QPSK, 8PSK, and 16QAM.This algorithm offers scalable variables to accommodate different user requirements.The CAMC algorithm is hardware-friendly and can be easily implemented on FPGA for FPGA-based SDR applications.
From the test results, this algorithm demonstrates higher performance compared to other works, particularly in high SNR conditions.
Meanwhile, for real-world lab-gathered data, this CAMC algorithm still provides a 100% accurate result.
The algorithm is also scalable, allowing users to adjust the accuracy by varying the number of input symbols and using different grid scales for improved classification under low SNRs.
Compared to other published papers, especially those employing AI or deep-learning-assisted methods, the proposed algorithm demands significantly less computing power and holds the potential for low-power implementations. Additionally, the CAMC algorithm's simplicity substantially shortens the design period when transferring it to hardware, such as FPGAs.
However, some problems remain to be tackled. QPSK at 0dB gives 0% accuracy, which is even below the 25% of a random guess. In future work, a solution for this scenario will be developed, and the algorithm will be implemented on an FPGA and tested using live data. Subsequently, it will be integrated into an SDR to evaluate its functionality and potential in a real-world scenario.

FIGURE 5. Parallel calculations in the matrix inner-product.


FIGURE 6. Comparisons between this work and other works.

TABLE 3. Comparisons between this work and other works.

FIGURE 9. Inner production simplification in the CAMC.

TABLE 1. Accuracies under different grid scales.

TABLE 2. Accuracy under different numbers of input symbols.

TABLE 4. Number of operations in input data rotation.

TABLE 5. Number of operations in one matrix generation.