Performance Characterization of Deep-Phase-Retrieval Shack-Hartmann Wavefront Sensors

The Shack-Hartmann wavefront sensor (SHWFS) is the most widely used wavefront sensor in adaptive optics systems. The Deep-Phase-Retrieval Wavefront Reconstruction (DPRWR) method, previously proposed by our group, is a deep-learning-based wavefront reconstruction method. It extracts more information from SHWFS images and thereby accurately recovers more Zernike mode coefficients. However, its application limits, performance upper bound, and noise immunity have not been investigated in detail in previous reports. In this paper, the sub-aperture spot sampling, bit depth, number of reconstructed mode coefficients, and noise intensity are analyzed through simulations and experiments to determine how changes in these parameters influence the performance of DPRWR. This work aims to optimize the configuration of DPRWR for better measurement accuracy, spatial resolution, and robustness.


I. INTRODUCTION
Adaptive optics (AO) is used to correct the image degradation caused by atmospheric turbulence. At present, almost all ground-based telescopes in the world are equipped with AO systems [1], [2]. An AO system mainly consists of two modules: one measures the aberrated wavefront phase, and the other compensates the measured wavefront in reverse by employing a deformable mirror (DM) or a spatial light modulator (SLM) for aberration correction. Therefore, the accuracy of the wavefront measurement determines the upper bound of the phase correction accuracy.
The major wavefront measurement methods can be divided into two classes. The first class comprises wavefront-sensorless methods, such as phase diversity, which iteratively estimates the wavefront from two degraded images acquired by focal-plane cameras without an additional wavefront sensor [3], [4]. This approach is capable of precisely measuring low-order aberrations through iterative solutions [5]. However, because its calculation process is quite time-consuming, it is difficult to meet the real-time correction requirements of AO systems. Although subsequent work proposed a real-time, non-iterative phase-diversity wavefront sensing method based on a phase-diversity convolutional neural network, it still focuses on the measurement of low-order aberrations [6].
The other class of methods uses a wavefront sensor to measure the wavefront. In real-time AO systems, the most widely applied wavefront sensor is the Shack-Hartmann wavefront sensor (SHWFS), which has a simple structure consisting of a microlens array for wavefront segmentation and a CCD array for spot position measurement. Each sub-lens corresponds to a sub-aperture spot on the sensor. The traditional wavefront reconstruction algorithm, called the slope-based method, calculates the wavefront mode coefficients by multiplying the vector of sub-aperture spot displacements by a reconstruction matrix that is generally obtained by singular value decomposition [7], [8], [9]. Nevertheless, as the number of modes continues to increase, the condition number of the reconstruction matrix increases quickly. Thus the number of reconstructed mode coefficients in this approach is typically about 0.7 to 0.8 times the number of sub-apertures [10], [11], [12]. In addition, due to the similarity of the slope vectors, modal coupling and modal confusion errors are likely to occur, leading to a decrease in the accuracy of wavefront measurements [13].
To further improve the spatial resolution of SHWFS wavefront measurement, phase retrieval by SHWFS has been proposed. Li et al. proposed placing the detector in the defocused plane of the microlens and reconstructing the wavefront aberration with a stochastic parallel gradient descent algorithm. This method improves the measurement accuracy of SHWFS for higher-order aberrations [14]. Viegers et al. proposed the SABRE-M method, which reconstructs the wavefront aberration using the first- and second-order moments of the sub-aperture image. The simulation results showed that the measurement accuracy of the first-order moment method could be achieved with half of the sub-aperture sampling rate [15]. However, such conventional phase retrieval by SHWFS requires iterative computation, which is time-consuming and tends to fall into local optima [16].
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
In recent years, neural networks have opened new solution paths, shifting from "solving ill-posed inverse problems" to "learning pseudo-inverse mapping" [17]. Researchers have proposed establishing a nonlinear mapping between slope and aberration by using simple artificial neural networks [18], [19], [20]. These methods can reduce the wavefront measurement error compared to the slope-based method, yet the network is difficult to converge because of large slope measurement errors under strong turbulence. To cope with strong turbulence or strong noise, some researchers have proposed methods that compute the spot center-of-mass position, or classify and identify sub-aperture spots, with the aid of neural networks [21], [22], [23].
Furthermore, Zernike mode coefficients can be reconstructed directly from the SHWFS image by convolutional neural networks (CNN). For instance, Hu et al. proposed the learning-based Shack-Hartmann wavefront sensor and measured the 2nd to 120th Zernike mode coefficients [24]. In our previous work, the Deep-Phase-Retrieval Wavefront Reconstruction (DPRWR) method was proposed, which achieves high-speed, high-spatial-resolution wavefront measurement and predicts the 2nd to 299th Zernike mode coefficients in simulation [25].
The slope-based wavefront reconstruction method recovers the wavefront by calculating the tilt from the center of mass of each SHWFS sub-aperture spot. When the spatial frequency of the aberration is less than or equal to the SHWFS sub-aperture spatial sampling rate, the wavefront within the sub-aperture can be approximated as a plane wave. In this case, the sub-aperture contains almost only tilt aberration, so the reconstruction error of the conventional slope-based method is small. When the aberration spatial frequency is much larger than the SHWFS sub-aperture spatial sampling rate, the sub-aperture wavefront contains defocus and higher-order aberrations. The shape of the sub-aperture spot's point spread function changes, and the traditional slope-based method can neither accurately extract the wavefront tilt nor extract the higher-order aberration information within the sub-aperture. DPRWR, however, can reconstruct the higher-order aberrations from the sub-aperture spot position and morphological features, and thus achieves a more accurate wavefront reconstruction. Compared with conventional phase retrieval by SHWFS, although the DPRWR method takes a long time to obtain the dataset and train the network, the trained network is estimated to take only about 0.8 ms to reconstruct the wavefront from a single SHWFS frame [25]. DPRWR can therefore meet the needs of a real-time AO system. However, the effects of the SHWFS configuration parameters on the DPRWR performance are not yet clear enough to guide configuration optimization. The configuration constraints, spatial resolution limits, and robustness of DPRWR have not been investigated.
In this paper, we evaluate the influence of spot sampling rate, bit depth, number of reconstructed modes, and noise on the performance of DPRWR. An SHWFS with a higher spot sampling rate has more pixels in each sub-aperture spot, so the spot shape is clearer. Likewise, the higher the bit depth, the better the image reflects the grayscale details of the spot. How these input features affect the output after passing through the network layers needs to be analyzed and verified by specific experiments, which is done in Sections III-A and III-B of this paper. An experimental analysis of the performance upper bound related to the number of reconstructed modes is presented in Section III-C. In addition, SHWFS images with a low signal-to-noise ratio due to strong background interference can severely limit the working capability and the range of use of AO systems [26]. How to improve the anti-interference ability of wavefront sensing methods has been continuously investigated [27]. Therefore, in Section III-D we compare the robustness of DPRWR under four noise processing schemes designed around the characteristics of neural networks. Finally, an AO system using an SLM was built to verify the above analysis, as described in detail in Section IV.

II. METHODS
The simulation of SHWFS images is carried out in MATLAB, and its principle is described in detail in Section II-A. The SHWFS configuration parameters are consistent between simulations and experiments. The DPRWR network is implemented with the Keras framework and is described in detail in Section II-B. To reduce the network training time, we use an NVIDIA GTX 1080Ti GPU to speed up the gradient computation.

A. SHWFS Simulation Principle
Based on optical diffraction theory, we simulate the imaging process of the SHWFS, in which the optical field modulated by atmospheric turbulence is focused onto the focal plane of the microlens array. The SHWFS used in this simulation is composed of 91 microlenses, as shown in Fig. 1(a). Each sub-aperture spot image can be obtained from the Fraunhofer diffraction integral [28].
The complex amplitude of the sub-aperture source optical field is determined by A(x_1, y_1) and ϕ(x_1, y_1). Here A(x_1, y_1) is the sub-lens pupil function, which is generally a fixed amplitude distribution conforming to the actual conditions and is uniform in this simulation, and ϕ(x_1, y_1) is the phase function conforming to Kolmogorov turbulence [29]. The phase is expanded in Zernike modes, where M_k(x_1, y_1) is the k-th order Zernike polynomial, weighted by the corresponding Zernike mode coefficient, and K is the index of the highest Zernike mode.
Here (x_1, y_1) are the coordinates of the source optical field in front of the microlens, (x_2, y_2) are the coordinates in the microlens focal plane, f is the microlens focal length, and λ is the wavelength. To facilitate the numerical solution, the source plane and the focal plane are discretized and sampled, and the integral is evaluated with the Fourier transform. The transformed equation is as follows.
Here (m_1, n_1) and (m_2, n_2) are the coordinates of the sampled source optical field plane and microlens focal plane, respectively, and δ_1 and δ_2 are the corresponding sampling intervals of the input plane and output plane. They satisfy the following condition.
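The displayed equations that this subsection refers to as (1) to (5) do not survive in this copy. As a hedged reconstruction, assuming standard scalar Fourier optics for a lens focal plane, they would take roughly the following form. The numbering is inferred from the later references to (3) and (5); the field symbols U_1 and U_2, the coefficient symbol a_k, the lower summation limit, and the grid size N (samples per side) are assumptions rather than the authors' notation.

```latex
% (1) Fraunhofer diffraction integral at the microlens focal plane
U_2(x_2, y_2) = \frac{e^{j 2\pi f/\lambda}}{j \lambda f}
  \exp\!\left[ \frac{j\pi}{\lambda f}\left(x_2^2 + y_2^2\right) \right]
  \iint U_1(x_1, y_1)
  \exp\!\left[ -\frac{j 2\pi}{\lambda f}\left(x_1 x_2 + y_1 y_2\right) \right] dx_1\, dy_1

% (2) complex amplitude of the sub-aperture source field
U_1(x_1, y_1) = A(x_1, y_1) \exp\!\left[ j \phi(x_1, y_1) \right]

% (3) Zernike expansion of the turbulent phase
\phi(x_1, y_1) = \sum_{k=1}^{K} a_k M_k(x_1, y_1)

% (4) discretized form, evaluated with a 2-D discrete Fourier transform
F(m_2, n_2) \propto \left| \sum_{m_1} \sum_{n_1} U_1(m_1 \delta_1, n_1 \delta_1)
  \exp\!\left[ -\frac{j 2\pi}{N}\left(m_1 m_2 + n_1 n_2\right) \right] \right|^2

% (5) sampling condition linking the two planes
\delta_1 \delta_2 = \frac{\lambda f}{N}
```

Substituting x_1 = m_1 δ_1 and x_2 = m_2 δ_2 into the kernel of (1) and applying (5) gives the DFT kernel in (4). This is why changing N or δ_1 changes the focal-plane sampling interval δ_2 and hence the spot sampling rate.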
In the following simulations, we need to generate spot images under different sensor parameters. The spot sampling rate can be adjusted by altering N or δ_1 according to (5). Images with different bit depths can be simulated by multiplying the normalized focal-plane spot intensity F(m_2, n_2) by different numbers of gray levels. Fig. 2 shows the simulation process of the training datasets.

B. The Structure of the DPRWR
Fig. 1(a) is the schematic diagram of the SHWFS microlens array. Fig. 1(b) shows the network structure of DPRWR. The SHWFS images are the input of the DPRWR network, followed by three sub-networks with the same structure, each consisting of a convolution layer, a ReLU activation, and a pooling layer in that order, which extract highly abstracted features. These features are then flattened and mapped to the sample label space through three fully connected layers. The number of neurons in the last fully connected layer is determined by the number of reconstructed mode coefficients. The reconstructed wavefront (Fig. 1(c)) can be built from the mode coefficients and Zernike polynomials, based on (3). The error between the reconstructed wavefront and the true wavefront indicates the performance of the DPRWR method. Due to the orthogonality of the Zernike polynomials, the root mean square error (RMSE) of the wavefront is equal to the RMSE of the Zernike mode coefficients, which can be computed from â_j(k), the output label of the DPRWR network, and a_j(k), the true Zernike mode coefficient.
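The equality between the wavefront RMSE and the coefficient error follows from orthogonality. A short derivation is sketched below, assuming Zernike polynomials normalized to unit variance over the unit pupil (the Noll convention; the source does not state its normalization). The hat on the network output, the error symbol Δa_j(k), and the reading of j as the test-sample index are notation assumed here.

```latex
\Delta a_j(k) = \hat{a}_j(k) - a_j(k), \qquad
\frac{1}{\pi} \iint_{\text{pupil}} M_k M_{k'} \, dx\, dy = \delta_{k k'}

\mathrm{RMSE}_j
  = \sqrt{ \frac{1}{\pi} \iint_{\text{pupil}}
      \Big[ \sum_{k} \Delta a_j(k)\, M_k \Big]^2 dx\, dy }
  = \sqrt{ \sum_{k} \Delta a_j(k)^2 }
```

The cross terms vanish under the orthonormality relation, so the wavefront error can be evaluated from the coefficients alone without rebuilding the phase map.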

III. SIMULATION RESULTS
To investigate the reconstruction performance of DPRWR with respect to four factors, namely spot sampling rate (Fig. 1(d)), bit depth (Fig. 1(d)), number of reconstructed modes (Fig. 1(e)), and noise, we generated a different dataset for each simulation. Each dataset contains 10000 SHWFS images, which are divided into a training set, a validation set, and a test set in the ratio 8:1:1. The labels of the training sets are random Zernike mode coefficients obeying a turbulence intensity of D/r_0 = 10. The network converges well after 50 epochs of training. Details are described in the following subsections. Each RMSE in the graphs and tables is the average of 100 measurement errors rather than a single measurement error.

A. Influence of the Spot Sampling Rate
The spot sampling rate is represented by the full width at half maximum (FWHM) of the SHWFS central sub-aperture spot under parallel light. According to diffraction theory, the radius of the Airy disk is 1.22λf/D, so the wavelength, the focal length, and the diameter of the microlens determine the size of the Airy spot. The number of pixels occupied by the Airy spot on the sensor is determined by both the Airy spot size and the sensor sampling interval, and can be described by the FWHM of the spot in pixels. Therefore, variations in wavelength, lens focal length, lens diameter, and sensor sampling interval are all reflected in variations of the spot FWHM.
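As a rough illustration of how these four quantities combine into a single FWHM value, the sketch below finds the half-maximum point of the ideal Airy pattern numerically and converts it to pixels. It assumes an unaberrated circular microlens; the function names and the numeric parameters in the usage line are hypothetical and are not taken from Table I.

```python
import math

def bessel_j1(x, terms=30):
    """Bessel function J1(x) from its power series (adequate for small |x|)."""
    total = 0.0
    for m in range(terms):
        total += ((-1) ** m / (math.factorial(m) * math.factorial(m + 1))
                  * (x / 2.0) ** (2 * m + 1))
    return total

def airy_intensity(x):
    """Normalized Airy intensity [2 J1(x) / x]^2, x = pi * D * r / (lambda * f)."""
    if x == 0.0:
        return 1.0
    return (2.0 * bessel_j1(x) / x) ** 2

def airy_half_max_x(lo=1.0, hi=2.0, iters=60):
    """Bisection for the x at which the Airy intensity drops to one half."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if airy_intensity(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def airy_fwhm_factor():
    """FWHM of the Airy pattern expressed in units of lambda * f / D."""
    return 2.0 * airy_half_max_x() / math.pi

def fwhm_in_pixels(wavelength, focal_length, lens_diameter, pixel_size):
    """Spot FWHM in pixels for an ideal circular lens (all lengths in metres)."""
    fwhm_metres = airy_fwhm_factor() * wavelength * focal_length / lens_diameter
    return fwhm_metres / pixel_size

# Hypothetical values: 632.8 nm light, 5 mm focal length, 300 um lens, 5 um pixels.
print(round(fwhm_in_pixels(632.8e-9, 5e-3, 300e-6, 5e-6), 2))
```

The factor returned by `airy_fwhm_factor` is close to 1.03, slightly smaller than the 1.22 that gives the first dark ring. Halving the pixel size doubles the FWHM in pixels, which corresponds to the way Table I trades pixel size against sampling rate.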
In this simulation, eight noiseless datasets with different FWHMs are generated. The bit depth of all datasets is 8-bit and the number of reconstructed coefficients is 299. The sensor pixel sizes and the horizontal pixel counts of the sub-aperture spot images corresponding to the eight FWHMs are shown in Table I. A larger FWHM means a smaller pixel size, a larger sub-aperture image, and a higher spot sampling rate. However, to limit the memory size of the training datasets in this simulation, all sub-aperture spot images are cropped or zero-padded to 21×21 pixels, so the SHWFS image in every dataset is 241×241 pixels. Therefore, the networks corresponding to the eight datasets all have the same topology. In addition, for comparison with the slope-based method, eight reconstruction matrices for the different spot sampling rates have been simulated for the test.
The sub-aperture spot images at the eight spot sampling rates are compared in Fig. 3. When the FWHM is less than 0.94 pixels, the center pixel gathers more than about 90% of the spot intensity, and as the FWHM decreases further, even more energy is concentrated in that single pixel. Conversely, when the FWHM increases, the spot energy is spread over multiple pixels, and the spot morphology is more detailed.
The measurement errors of DPRWR and the slope-based method for dynamic aberrations with different D/r_0 at the eight spot sampling rates are compared in Fig. 4 and Table II. From the table, when the FWHM increases from 0.94 to 0.98 pixels, the measurement errors of the DPRWR method and the slope-based method decrease by 0.1524 λ and 0.5278 λ, respectively. When the spot sampling rate rises by a factor of about 3.2, from 1.46 to 4.7 pixels, the error of DPRWR decreases by only 0.0096 λ, while the error of the slope-based method is almost unchanged.
These changes are consistent with Fig. 4. As the FWHM gradually increases, the reconstruction errors of both methods first decrease sharply while the FWHM is less than 1 pixel, and then flatten out once the FWHM is larger than 1.46 pixels, for both strong and weak turbulence.
The two methods differ, however, in that the error of the slope-based method is roughly four times or more that of DPRWR. When the FWHM continues to increase from 1.46 to 4.7 pixels, the measurement error of DPRWR decreases steadily, while the RMSE of the slope-based method generally remains stable. A possible reason is that, as the number of pixels in the spot increases further, the CNN can still extract useful information for wavefront measurement, and the slope features become more accurate. Yet the linear mapping of the slope-based method cannot turn this additional information into a more precise reconstruction.

B. Influence of the Bit Depth
We evaluated the reconstruction performance for nine different bit depths of 1, 2, 4, 6, 8, 10, 12, 14, and 16 bits, corresponding to nine noise-free datasets with an FWHM of 1.9 pixels and 299 mode coefficients. The same network topology and training parameters were used in all nine cases.
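A minimal sketch of how spot images of different bit depths might be produced from a normalized intensity map is given below. It assumes that the normalized intensity is scaled to the largest gray value 2^b - 1 and rounded to the nearest integer; the source only states that the normalized intensity is multiplied by different gray levels, so the exact scaling, the function name, and the toy spot are assumptions.

```python
import math

def quantize_spot(intensity, bit_depth):
    """Map normalized intensities in [0, 1] to integer gray values.

    Assumes scaling by the largest gray value 2**bit_depth - 1 and
    round-half-up quantization; values are clipped to the valid range.
    """
    max_gray = 2 ** bit_depth - 1
    quantized = []
    for row in intensity:
        quantized.append(
            [min(max_gray, max(0, math.floor(v * max_gray + 0.5))) for v in row]
        )
    return quantized

# Toy 3x3 spot with a bright centre and an asymmetric wing (hypothetical data).
spot = [[0.0, 0.1, 0.0],
        [0.1, 1.0, 0.4],
        [0.0, 0.4, 0.0]]

for bits in (1, 2, 8):
    print(bits, quantize_spot(spot, bits))
```

In this toy case the 1-bit image keeps only the central pixel, so the asymmetric wing that encodes the spot shape disappears, while the 2-bit image already retains part of it.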
The average errors of the two methods at different bit depths are shown in Table III. The influence of aberrations with different D/r_0 on the relationship between bit depth and reconstruction performance is then analyzed, and the results are shown in Fig. 5. As seen in Fig. 5, at D/r_0 = 10, the RMSE of the slope-based method declines steadily with increasing bit depth, falling by 20% (0.0606 λ) from 1-bit to 16-bit. In contrast, the RMSE of DPRWR drops sharply by 80% (0.3220 λ) as the bit depth increases from 1-bit to 2-bit, and then decreases slowly by 0.0105 λ as the bit depth increases from 2-bit to 10-bit. In general, increasing the bit depth improves the measurement accuracy to some extent for both methods. In particular, the information lost in a binary image has a significant impact on the DPRWR performance. When the bit depth changes from 10-bit to 16-bit, the RMSE of DPRWR remains stable. This behavior is similar to the dependence on FWHM.
Comparing Tables II and III, the SHWFS configurations are identical when the FWHM is 1.9 pixels in Table II and the bit depth is 8-bit in Table III, and the RMSEs of DPRWR and the slope-based method are 0.0596 λ and 0.2570 λ, respectively. For DPRWR, when the FWHM increases from 1.9 to 6.5 pixels in Table II, the RMSE decreases from 0.0596 λ to 0.0535 λ, whereas it only reaches 0.0587 λ in Table III when the bit depth increases from 8-bit to 16-bit. Thus, for DPRWR, further improving the spot sampling is more effective than further increasing the bit depth. For the slope-based method, conversely, increasing the bit depth is more effective than improving the spot sampling: the RMSE at 6.5 pixels is 0.2623 λ in Table II, while the RMSE at 16-bit is 0.2396 λ in Table III. A sensor with a high bit depth characterizes the light energy arriving at the sensor with more finely sampled grayscale values. This helps the neural network extract more features and improves its fitting ability, although this benefit vanishes when the bit depth is higher than 10-bit. Nevertheless, a single feature such as the spot slope is more strongly correlated with the bit depth than with the spot sampling rate.

C. Influence of the Number of Reconstructed Modes
Zernike polynomials are orthogonal functions on the circular domain, and aberrations can be represented by linear combinations of different modes. High-order modes correspond to high spatial frequencies of the aberration. In this simulation, all datasets were generated from 780 random Zernike mode coefficients, but the labels used for training contain different numbers of mode coefficients, namely 45, 55, 66, 91, 135, 209, 377, 464, 561, 666, and 780. Note that the RMSE between the true and reconstructed coefficients is computed slightly differently here than in the previous subsections. For instance, when the number of reconstructed mode coefficients is 45, the first 45 of the 780 reconstructed coefficients are the network outputs, and the last 735 coefficients are set to 0. The reconstruction error therefore contains both the prediction error of the reconstructed modes and the error caused by modes that are present but not reconstructed. Furthermore, the networks used for the eleven datasets differ in the number of neurons in the fully connected layers, as shown in Table IV. The other parameters are an FWHM of 1.9 pixels, a bit depth of 8 bits, D/r_0 = 10, and an image size of 241×241 pixels. Fig. 6 shows the relationship between the number of reconstructed modes and the DPRWR performance at different spot sampling rates. We find that the RMSE first decreases and then increases as the number of reconstructed modes rises. This characteristic is the same at different spot sampling rates. It is most evident when the FWHM is 0.98 pixels (the blue curve in Fig. 6). In this case the best number of reconstructed modes is at or near 299. The lower-order and higher-order mode errors when the number of reconstructed modes is 135, 299, and 464 are shown in Table V.
Table V shows that both the measured mode error of the network (3 ∼ 135, 3 ∼ 299) and the unmeasured mode error (136 ∼ 299, 300 ∼ 464) are lower when the number of reconstructed modes is 299 than when it is 135 or 464. In summary, with the other configuration parameters unchanged, there is an optimal number of reconstructed modes for DPRWR. At this optimum, the error is lowest both for the measured low-order aberrations and for the unmeasured high-order aberrations. In other words, there is an upper limit on the spatial resolution of DPRWR if the minimum wavefront error is to be guaranteed. When the number of reconstructed modes is relatively small, the network is prone to confusing the high-order modes with the low-order modes, thus increasing the measurement error of the low-order modes. In this case, increasing the number of reconstructed modes improves the measurement accuracy of the high-order modes and at the same time reduces the misclassification of the low-order modes. When the number of reconstructed modes is relatively large, it is difficult for DPRWR to discriminate accurately between the high-order and low-order modes from the available information, so the performance degrades for both.
Another point worth noting in Fig. 6 is that, when the number of reconstructed modes exceeds 299, the RMSE grows increasingly slowly with mode number as the FWHM goes from 0.98 to 4.7 pixels. When the number is below 299, the change in RMSE is greater. This means that reconstructing too many modes carries much less risk than reconstructing too few. We therefore recommend selecting the number of reconstructed modes in the range 299 to 561, which is 3.29 to 6.16 times the current number of sub-apertures, 91.
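The zero-padded error bookkeeping used in this subsection can be sketched as follows. The sketch assumes that the wavefront RMSE equals the root of the summed squared coefficient errors (orthonormal Zernike modes) and splits it into the measured and unmeasured contributions in the manner of Table V; the function names and the four-mode toy vectors are hypothetical.

```python
import math

def split_wavefront_error(true_coeffs, predicted_coeffs):
    """Return (total, measured, unmeasured) RMS errors in the same units.

    `predicted_coeffs` holds only the leading reconstructed modes; all
    remaining modes are treated as reconstructed with a value of zero.
    """
    n_pred = len(predicted_coeffs)
    if n_pred > len(true_coeffs):
        raise ValueError("more predicted modes than true modes")
    measured_sq = sum((p - t) ** 2
                      for p, t in zip(predicted_coeffs, true_coeffs[:n_pred]))
    # Unreconstructed modes are set to 0, so their error is the true value.
    unmeasured_sq = sum(t ** 2 for t in true_coeffs[n_pred:])
    total = math.sqrt(measured_sq + unmeasured_sq)
    return total, math.sqrt(measured_sq), math.sqrt(unmeasured_sq)

# Toy example: 4 true modes, network reconstructs only the first 2.
true_modes = [0.30, -0.20, 0.10, 0.05]
network_output = [0.28, -0.21]
total, measured, unmeasured = split_wavefront_error(true_modes, network_output)
print(total, measured, unmeasured)
```

Because the two contributions add in quadrature, reconstructing more modes lowers the unmeasured part but only pays off if the measured part does not grow faster, which is the trade-off behind the optimum seen in Fig. 6.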

D. Influence of Noise
The exposure time of the sensor and target brightness indirectly affect the performance of the DPRWR by affecting the image signal-to-noise ratio. Therefore, in this section, we investigate the robustness of the DPRWR and a suitable scheme to cope with the variation of image signal-to-noise ratio caused by the sensor exposure time, target brightness, and so on.
We design four DPRWR schemes for noise suppression (Table VI). The main differences between these four schemes lie in the preprocessing of the training data: with or without added noise, and with or without noise reduction processing. The noise reduction processing consists of subtracting a fixed gray value from the SHWFS image; the value is related to the actual noise intensity and may be the median or the first quartile of the grayscale range of the pure noise image. The other configuration parameters are the same: an FWHM of 1.9 pixels, a bit depth of 8 bits, and an image size of 241×241 pixels. In this simulation, we compare the effects of the four noise processing schemes on the wavefront measurements under Gaussian noise with different peak signal-to-noise ratios (PSNR). The details of the four schemes are introduced in II(E). The grayscale dynamic range of the added noise for 256 gray levels is given in Table VII, and Fig. 7 shows the RMSE of DPRWR when noise of different PSNR is added to the training sets.
As seen from the figure, when the PSNR is greater than 30, the noise is weak and its overall effect on the wavefront measurement is not substantial, regardless of whether additional noise suppression preprocessing is applied. When the PSNR is less than 30, schemes 2, 3, and 4 all significantly improve the performance of DPRWR against noise interference. Among them, the RMSE of scheme 3 is the smallest. This means that either adding noise to the training set or subtracting a threshold from the dataset can suppress the noise to some extent, but the most effective of the three is scheme 3, in which noise is added to the training set and no image denoising is applied. The neural network is able to learn strong reconstruction performance directly from the noisy images.
To find the optimal intensity of the noise added to the training set in scheme 3, the RMSEs measured by the DPRWR method on test data with PSNRs of 5, 15, 25, 30, and 50 are shown in Table VIII. Under the current parameter conditions, when the PSNR of the training set is about 50, the DPRWR method has the smallest RMSE under both strong and weak noise compared to the other four cases. In this case, the RMSE is 0.0618 λ when the test data PSNR is 5. Compared with the RMSE of 0.0597 λ when both the training set and the test set are free of noise, the RMSE rises by only 3.5%, indicating that scheme 3 can effectively help DPRWR suppress noise at PSNR values of 5 and above.
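The two preprocessing operations that distinguish the schemes, adding Gaussian noise at a target PSNR and subtracting a fixed gray threshold, can be sketched as follows. The sketch assumes the common definition PSNR = 20 log10(MAX/σ) in decibels for zero-mean noise of standard deviation σ; the source does not state its PSNR convention, and the function names and toy image are hypothetical.

```python
import random

def noise_sigma_from_psnr(psnr_db, max_gray=255):
    """Noise standard deviation that gives the requested PSNR (assumed in dB)."""
    return max_gray / (10.0 ** (psnr_db / 20.0))

def add_gaussian_noise(image, psnr_db, max_gray=255, rng=None):
    """Add zero-mean Gaussian noise, then round and clip to the gray range.

    Clipping slightly changes the realized PSNR for very strong noise.
    """
    rng = rng or random.Random(0)
    sigma = noise_sigma_from_psnr(psnr_db, max_gray)
    noisy = []
    for row in image:
        noisy.append(
            [min(max_gray, max(0, round(v + rng.gauss(0.0, sigma)))) for v in row]
        )
    return noisy

def subtract_threshold(image, threshold):
    """Noise reduction step: subtract a fixed gray value and clip at zero."""
    return [[max(0, v - threshold) for v in row] for row in image]

# Toy 8-bit spot image (hypothetical data).
clean = [[0, 10, 0],
         [10, 255, 90],
         [0, 90, 0]]
noisy = add_gaussian_noise(clean, psnr_db=30)
print(noisy)
print(subtract_threshold(noisy, threshold=4))
```

Under this definition a PSNR of 30 corresponds to a noise standard deviation of about 8 gray levels on an 8-bit image. In scheme 3 only the noise-adding step would be applied to the training images, with no threshold subtraction.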

IV. EXPERIMENTS

A. Experimental Optical Bench
After completing the above simulation analysis, we conducted an experiment on DPRWR to further verify its performance characteristics. The experimental system is shown in Fig. 8, and the relevant parameters are listed in Table IX. The incident laser is set to a fixed polarization state by a polarizer and then passes through a collimating lens L1 and an aperture diaphragm to form a parallel beam, which is directed to the SLM by a beam splitter. To eliminate the multi-order diffracted light (including the zero-order light) caused by the SLM pixel structure and the high-order spots, a 4f system (consisting of L2 and L3) and a diaphragm are used to retain only a secondary-order spot in this experiment. The light emerging from lens L3 is divided into two beams by a beam splitter. One beam arrives at the SHWFS and forms a spot array image, which is subsequently acquired by the computer as a training dataset. The other beam passes through lens L4 and forms a focal PSF image on the CCD, which is used to observe the restored spot modulated by the SLM and to analyze the wavefront reconstruction performance of DPRWR.
In this environment, the noise grayscale range of the acquired sub-aperture spot images is about 0 to 2. Based on the analysis in Section III-D, we adopt scheme 3 to suppress the effect of noise. Each sub-aperture of the training and test data is normalized to remove the effect of laser power variation.
In addition, because the variation of the DPRWR performance is basically the same at different turbulence intensities in the simulation analysis, we selected only one turbulence intensity for verification in this experiment. The D/r_0 of all the training sets is 10.

B. Experimental Results
To verify the relationship between spot sampling rate and reconstruction performance, the 10000 acquired sub-aperture spot images were downsampled to four lower sampling rates. The FWHMs of the original and downsampled modulation-free spots are 3.7, 1.44, 1.12, 1.08, and 1.06 pixels, in that order. The RMSEs in these five cases are shown in Fig. 9(a). The performance improves significantly with increasing spot sampling rate when the spot FWHM is between 1.06 and 1.12 pixels, while there is no further improvement for an FWHM of 1.44 to 3.7 pixels. This characteristic is consistent with the simulation results in Fig. 4(a). The RMSE of the experimental measurement at different bit depths is shown in Fig. 9(b). As the bit depth increases, the RMSE first decreases sharply and then remains almost constant, as in Fig. 5(a).
However, because of the noise interference in the experimental data, the minimum configuration required for precise wavefront measurement in the experiment (FWHM > 1.44 pixels, bit depth > 4-bit) is slightly higher than that in the simulation (FWHM > 0.98 pixels, bit depth > 2-bit).
To verify the effect of the number of reconstructed modes on detection performance, each of the 10000 aberrated phase screens was generated from 665 Zernike modes. The FWHM of the modulation-free spots is 3.8 pixels. Seven networks were used that differ in the number of neurons in the last fully connected layer: 65, 135, 209, 377, 464, 561, and 665, respectively. As shown in Fig. 9(c), an upper limit on the spatial resolution of DPRWR does exist. Under these experimental conditions, the optimal number of reconstructed modes is near 377. This conclusion is basically consistent with the simulation results.
Based on the above experimental conclusions, and taking economy and convenience into account, we selected a suitable SHWFS parameter configuration with an FWHM of 3.8 pixels, a bit depth of 8 bits, and 377 reconstructed modes, and then conducted two closed-loop iterative correction experiments, under D/r_0 = 6 and D/r_0 = 18, respectively.
The corrected focal spots and reconstructed wavefronts are shown in Fig. 10. When the D/r_0 of the aberration is 6, the DPRWR method obtains a well-recovered spot after a single correction. After the second correction, the Strehl ratio (SR) of the spot recovered by DPRWR is 0.73 and the RMSE of the reconstructed wavefront is 0.029 λ, while for the slope-based method the SR is 0.63 and the RMSE is 0.123 λ.
When the D/r_0 of the aberration is 18, the original spot is severely spread out. For this strong aberration, the DPRWR method needs only two iterations to obtain a well-recovered spot, whereas the quality of the spot recovered by the slope-based method is much worse. After the second correction, the SRs of the spots for DPRWR and the slope-based method are 0.72 and 0.47, respectively, and the RMSEs are 0.041 λ and 0.269 λ, respectively. Compared to the slope-based method, the intensity of the central pixel in the spot recovered by DPRWR is much higher.
We find that, for both strong and weak aberrations, only two rounds of iterative correction with DPRWR are needed to achieve a focal spot with a high SR, whereas the slope-based method needs more rounds of iterative correction to obtain a better-quality spot image.

V. CONCLUSION AND DISCUSSION
In this paper, we explore the relationship between SHWFS configuration parameters, network structure parameters (spot sampling rate, bit depth, number of reconstructed modes, noise) and reconstruction performance of the CNN-based DPRWR method. We gain a clear understanding of the application limits and resolution upper bound of DPRWR. To further optimize the DPRWR, we find a suitable SHWFS configuration and noise suppression scheme adapted to the neural network. This study provides a theoretical and experimental basis for the parameter selection of DPRWR to advance its future application to the telescope system.
Based on the previous analysis in this paper, we draw several important conclusions:
1) The performance of the CNN-based wavefront reconstruction method is more sensitive to the spot sampling rate than to the bit depth, while the conventional slope-based method is exactly the opposite. Increasing the spot FWHM can further improve the DPRWR reconstruction accuracy, but for the slope-based method increasing the bit depth is more effective for performance improvement.
2) For both methods, the Hartmann sensor must ensure that no single pixel of the sub-aperture spot collects more than about 90% of the spot intensity; otherwise, the detection performance degrades drastically.
3) The distortion introduced by binary images has an enormous impact on the prediction ability of neural networks. For DPRWR, the bit depth must be at least 2-bit. For the slope-based method, this constraint does not exist.
4) The number of modes that DPRWR reconstructs with high accuracy is about 3 to 6 times the number of sub-apertures, which is much higher than the spatial resolution of the slope-based method. Moving the number of reconstructed modes away from this range in either direction reduces the reconstruction performance of DPRWR to some extent. However, for neural networks, the risk of choosing too many reconstructed modes is much lower than that of choosing too few, especially at larger spot sampling rates.
5) To achieve the optimal performance of DPRWR, the suitable range of FWHM is 1.46 to 4.7 pixels, the suitable range of bit depth is 4-bit to 16-bit, and the suitable number of reconstructed modes is 3 to 6 times the number of sub-apertures.
6) Training set images with weak noise and without subtraction of a global threshold can effectively improve the generalization ability of the neural network under different levels of noise. In this study, the wavefront reconstruction performance of DPRWR based on this scheme degrades by only 3.5% when dealing with strong noise at a PSNR of 5, compared to the case without noise interference.
In the future, we will further study solutions adapted to the neural network structure in more extreme cases, such as missing spots in some sub-apertures, broad-spectrum light source illumination, and system latency under high-temporal-frequency distortions [30]. We plan to apply DPRWR to a telescope system to obtain better observation results.