Machine Learning Approach for Frozen Tuna Freshness Inspection Using Low-Frequency A-Mode Ultrasound

Despite the ubiquity of ultrasonography in nondestructive inspection, its application to high-attenuation materials is challenging. At frequencies less than 1 MHz, ultrasound can inspect high-attenuation materials owing to its high penetration ability. Such ultrasound data are acquired using a single-element transducer that generates single-channel signals (A-mode). However, low-frequency A-mode ultrasound signals have low resolution caused by long wavelengths, and carry less information than B-mode images generated by multi-channel transducers. Discriminating low-resolution data is made possible by recent advances in machine learning technology. This study employs machine learning to develop an inspection method for high-attenuation frozen materials. This study focuses on the inspection of the freshness of frozen tuna, which has a large market but uses a destructive inspection method. We applied eight typical machine learning algorithms to A-mode signal data (43 samples, 3168 signals) of frozen tuna to calculate freshness scores; we used fast Fourier transform in the feature extraction process. Our experiments show that all algorithms could classify the freshness of frozen tuna with statistical significance (p < 0.05, one-tailed t-test). Furthermore, we investigated the performance improvement in the mean (standard deviation) of the area under the receiver operating characteristic curves by taking the mean of the freshness scores on 24 signals. We observed that the best performance (quadratic discriminant analysis) increased from 0.619 (0.041) using a single signal to 0.724 (0.080) using 24 signals with statistical significance (p < 0.05, paired one-tailed t-test).
This is the first study that inspects frozen tuna using ultrasound and machine learning technology.

and piping [15]. In ultrasonography, the ultrasound frequency is selected according to the target materials. The ultrasound frequency determines its penetration ability; the lower the frequency, the greater the penetration ability [16]. However, the resolution of low-frequency ultrasound is low owing to its long wavelength.
Low resolution limits the development of ultrasound inspection technology for high-attenuation materials that require frequencies lower than 1 MHz [17]. This drawback has limited research on the inspection of high-attenuation materials using low-frequency ultrasound to only a few fields, such as bone inspection [18], [19], [20], [21] and concrete inspection [22], [23], [24]. Moreover, the data acquired using ultrasound are noisy and require a high level of skill to read. However, recent rapid advances in machine learning technology [25] make it possible to overcome these challenges; therefore, the application of machine learning technology has been actively researched.
Machine learning has been widely developed for fetal ultrasound [26], [27], [28], [29], [30], [31] such as B-mode ultrasound, which is the most common modality used in ultrasound imaging. Examples include segmentation using time series information from ultrasound videos [32] and detection of acoustic shadows [33]. Moreover, a convolutional neural network was used to inspect concrete using 50 kHz low-frequency B-mode ultrasound [34]. In addition, single-element transducers are commonly used for low-frequency ultrasound because the transducer diameter must be large to maintain the directivity of the acoustic wave; the sine of the beam spread angle in the far-field is approximately inversely proportional to the frequency and transducer diameter [35]. Ultrasound data acquired using single-element transducers are single-channel signals called A-mode and are less informative than B-mode, which comprises multiple-channel signals. In this study, we focus on the machine learning analysis of low-frequency A-mode ultrasound signals. The limited information in A-mode signals has prompted several studies to attempt the application of machine learning technology to A-mode ultrasound data. Some pioneering studies have developed technology to recognize gestures from A-mode ultrasound using typical machine learning algorithms [36], [37], [38], [39], [40], [41]. However, no studies have reported the use of a machine learning approach for low-frequency A-mode ultrasound to inspect high-attenuation materials.
The attenuation coefficient of ultrasound waves in ice increases with frequency owing to the scattering from trapped air bubbles [42]. Therefore, frozen materials are often highly attenuating at high frequencies and require low-frequency ultrasound for inspection. In particular, we study the ultrasound inspection of frozen tuna, which has a large worldwide market but whose inspection involves the destructive tail-cutting method [43], [44], [45]. Previous studies have shown that ultrasonography, including air-coupled techniques, can be used to inspect non-frozen meat such as pork [46] and cured meat [47]. Low-frequency A-mode ultrasound has been reported to inspect various characteristics of pork loin [48] and dry-cured loins [49]. Ultrasound inspection is also feasible for non-frozen fish using machine learning technology [50]. However, no previous studies using ultrasound to inspect frozen fish or meat have been reported.
Therefore, this study aims to demonstrate an inspection method for frozen tuna that overcomes the low resolution of low-frequency ultrasound using machine learning. We explored the effective use of eight machine learning algorithms by investigating the effects of multiple A-mode signals, probe positions, and pressure. The main contributions of our study are as follows:
• This is the first demonstration that nondestructively inspects the freshness of frozen tuna using ultrasound and machine learning algorithms with statistical significance (p < 0.05, one-tailed t-test).
• This study shows that taking the mean of the freshness scores on the multiple A-mode signals acquired from the same sample improved the classification performance (p < 0.05, paired one-tailed t-test).
• We investigated the influence of probe positions and probe pressure on inspection performance; our results imply that smaller variations in position and pressure may contribute to better performance.

II. METHOD
Fig. 1 shows the overall flowchart of this study, with each panel corresponding to each subsection, i.e., Panel (A) corresponds to Section II-A, Panel (B) corresponds to Section II-B, and Panel (C) corresponds to Section II-C, respectively.
A. ULTRASOUND DATA ACQUISITION
1) TUNA SAMPLES
We used the caudal side of the big-eye tuna (Thunnus obesus), as shown in Fig. 2. We obtained all samples from a major Japanese seafood supplier. The supplier graded all samples as fresh or not using tail-cutting, the current mainstream method [44]. The tail-cutting method involves the visual assessment of the cross-section of the tail part by a tuna master. Freshness is the degree of progress of rigor mortis. Rigor mortis involves a wide variety of physical and chemical phenomena. In the current industry, the grading of tail-cutting by a tuna master is widely used and accepted. Therefore, we regard tail-cutting results as correct in the scope of this study. "Fresh" indicates fish freshness used in luxury sushi restaurants. "Non-fresh" indicates fish freshness used in supermarkets and conveyor-belt sushi restaurants. This study used 20 fresh and 23 non-fresh samples. Samples with skin were ordered to be cut with identical geometry regardless of freshness.

2) SAMPLE MEASUREMENT AND PROBE SPOTS ASSIGNMENT
Fig. 3 illustrates the measured parts that determine the shape of a tuna sample. For all samples, we measured the lengths of the major and minor axes and the circumferences of the ellipse in the caudal and abdominal side cross sections. We also measured the lengths of the four straight lines on the left, right, dorsal, and ventral sides, each connecting the ends of the major and minor axes. The measurement results for the samples in Table 1 show almost no differences between the fresh and non-fresh samples.

FIGURE 1. Entire flow of this study. (A) Ultrasound data acquisition process. We cut a sample from the tuna and assigned the coordinate and probe spots to its surface. Subsequently, we acquired the A-mode signals using two single-element transducers with a center frequency of 500 kHz, one for transmitting and the other for receiving. (B) Machine learning approach employed in this study. (1) Single A-mode signal method. We performed a fast Fourier transform (FFT) on the analysis area that includes reflections from the spine and used the amplitude spectrum as input to the eight machine learning algorithms. The output of the machine learning model was then scaled from 0 to 1 to provide the freshness score. We call this the single A-mode signal method. (2) Multiple A-mode signals method. The freshness score for the multiple A-mode signals was calculated using the mean of the outputs of the single A-mode signal method on multiple signals (24 signals were used). The freshness score was also scaled from 0 to 1. We call this the multiple A-mode signals method. (C) Model evaluation and analysis. We investigated the following three questions by conducting numerical experiments: (1) whether the freshness score can inspect freshness, (2) whether the multiple A-mode signals method can achieve higher performance than the single A-mode signal method, and (3) how the probe position and pressure influence the performance. We employed the receiver operating characteristic (ROC) curve and its area under the curve (AUC) for the performance metric. We used a t-test for statistical analysis.

We also assigned the coordinate and set probe spots on the surface of a tuna sample, as shown in Fig. 4. Three letters and a single number indicate the probe spot of the ultrasonic transducers, as follows:
• The first letter indicates either the dorsal half (U) or ventral half (D) of the tuna.
• The second letter indicates whether the right half (R) or left half (L) is viewed from the caudal side.
• The third letter indicates whether the portion is near the major axis of an elliptical cross-section (vertical "v"), near the minor axis (horizontal "h"), or between them (medium "m").
• The last number is the serial number of the cross-sectional ellipse; 0 indicates the caudal cross-section, 2 indicates the abdominal cross-section, and 1 indicates the middle of the two cross-sections.
For example, "URv0" indicates the position near the upper right portion and the major axis in the ellipse of the caudal cross-section. Although each sample had 36 probe spots, the probe spots at the vertical (v) position were sharply angled, which made manual operation difficult; therefore, we limited the number of probe spots to 24 in this study.
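The naming scheme can be checked with a short Python sketch that enumerates every code and drops the vertical spots. The function names are hypothetical and are not part of the study's software; only the letters, numbers, and counts come from the description above.

```python
from itertools import product

# Choices taken from the naming scheme described in the text.
HALVES = ("U", "D")      # dorsal (U) or ventral (D) half
SIDES = ("R", "L")       # right or left half, viewed from the caudal side
AXES = ("v", "h", "m")   # near major axis, near minor axis, between them
SECTIONS = (0, 1, 2)     # caudal, middle, abdominal cross-section


def all_probe_spots():
    """Enumerate every probe spot code, e.g. 'URv0'."""
    return [f"{u}{s}{a}{n}" for u, s, a, n in product(HALVES, SIDES, AXES, SECTIONS)]


def usable_probe_spots():
    """Drop the sharply angled vertical ('v') spots, as done in the study."""
    return [code for code in all_probe_spots() if code[2] != "v"]


def parse_probe_spot(code):
    """Split a code such as 'URv0' into its four fields (hypothetical helper)."""
    if len(code) != 4 or not code[3].isdigit():
        raise ValueError(f"invalid probe spot code: {code!r}")
    half, side, axis, section = code[0], code[1], code[2], int(code[3])
    if half not in HALVES or side not in SIDES or axis not in AXES or section not in SECTIONS:
        raise ValueError(f"invalid probe spot code: {code!r}")
    return {"half": half, "side": side, "axis": axis, "section": section}
```

The enumeration gives 2 × 2 × 3 × 3 = 36 codes in total and 2 × 2 × 2 × 3 = 24 codes once the "v" spots are removed, which matches the counts stated in the text.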

3) ULTRASONOGRAPHY EQUIPMENT AND CONFIGURATION
Frozen materials, including frozen tuna, have a high attenuation coefficient owing to the air bubbles in ice. Therefore, low-frequency ultrasound below 1 MHz is necessary for ultrasound inspection. This study used single-element composite ultrasonic transducers, custom-made by Japan Probe Company Ltd. (Yokohama City, Japan), with a center frequency of 500 kHz. The model was B0.5K20N; the diameters of the element and case are 20 mm and 25 mm, respectively. A-mode signals were acquired using the two-probe reflection (V-reflection) method; this method uses one transducer for transmission and another for reception. The distance between the two single-element ultrasonic transducers was 3 mm. We placed a 3 mm thick silicon sheet with a hardness of 30 A between the tuna surface and the single-element ultrasonic transducers. We used glycerol as an ultrasonic couplant. We used a Japan Probe JPR-600C ultrasonic pulser-receiver. For the pulser configuration, we set the pulser voltage to 100 V, signal shape to square pulse, pulse frequency to 500 kHz, damping resistance to 100 Ω, and the number of waves to 1. For the receiver configuration, we set the gain of the reception preamplifier to 40 dB, the analog filter to 100 kHz high-pass, and the sampling frequency for conversion to digital data to 50 MHz; the JPR-600C has an internal anti-aliasing filter that blocks signals above the Nyquist frequency. The analog high-pass filter was set to the lowest setting of our equipment in order to obtain signals with the widest possible frequency bandwidth. The model number of the analog-to-digital converter is AD9214-105 (pipeline type) made by Analog Devices Inc. (Wilmington, Massachusetts); the resolution was 10 bits, and the record length was set to 159.6 µs (7980 points).
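As a consistency check on the acquisition settings, the following Python sketch recomputes the record length and related quantities from the stated sampling frequency and number of points. The variable names are illustrative only.

```python
# Values stated in the receiver configuration.
SAMPLING_RATE_HZ = 50e6      # sampling frequency
RECORD_POINTS = 7980         # points per A-mode signal
CENTER_FREQUENCY_HZ = 500e3  # transducer center frequency

# Record length in microseconds: number of points divided by the sampling rate.
record_length_us = RECORD_POINTS / SAMPLING_RATE_HZ * 1e6

# Nyquist frequency: half of the sampling rate.
nyquist_hz = SAMPLING_RATE_HZ / 2

# Number of samples per cycle at the center frequency.
samples_per_cycle = SAMPLING_RATE_HZ / CENTER_FREQUENCY_HZ
```

The record length evaluates to 159.6 µs, in agreement with the text. The Nyquist frequency is 25 MHz, far above the 500 kHz center frequency, so each cycle at the center frequency is represented by 100 samples.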

4) ULTRASOUND DATA ACQUISITION
Seven examiners acquired signals from the 24 probe spots for each sample. Signals were acquired multiple times from the same spot. We alternately acquired signals from fresh and non-fresh samples and randomly assigned examiners to avoid signal bias. To verify the effect of the probe contact condition, we acquired data for two probe pressures: normal and high. The mean (standard deviation) of the normal pressure was 84.9 N (7.5 N) per two transducers. That of the high pressure was 174.8 N (34.0 N) per two transducers.

5) DATASET ORGANIZATION
The A-mode signal data were organized according to the probe position and pressure. We denote the dataset in the form of P i L j , where i = {normal, high, mix}, j = {h, m, U, D, mix}. For example, we denote P normal L U for inspections at normal pressure and probe spot at {Uyzw | y = {R, L}, z = {h, m}, w = {0, 1, 2}}; the first letter is fixed to "U." In particular, we use P mix to indicate all pressures and L mix to indicate all probe spots employed in this study. Each dataset was organized into the following groups.
• Standard dataset: This is the standard dataset used in all numerical experiments and analyses. We employ P normal L mix because this study considers performing inspections in all positions at constant pressure to be the standard condition.
• Specialized datasets: The datasets in this group were used for investigating the influence of position and pressure. P normal L h , P normal L m , P normal L U , and P normal L D are datasets to investigate the influence of position on performance. P high L mix is the dataset used to investigate the influence of pressure on performance. P mix L mix is the dataset containing all data acquired in this study.
Table 2 presents details of all datasets used in this study. We split each dataset into three folds for training and testing. We describe details of the usage of datasets in numerical experiments in Section II-C.

B. MACHINE LEARNING APPROACH
1) FEATURE EXTRACTION PROCESS
In the feature extraction process, we cut out the analysis area from the A-mode signal, applied a fast Fourier transform (FFT) to transform the signal to the frequency domain, and used the resulting amplitude spectrum (i.e., bins of frequencies) as the input features of the machine learning algorithms. The analysis area was set from 10 µs to 80 µs and included reflected waves from the spine. The exclusion of up to 10 µs removed signals from the surface waves and skin areas, and that later than 80 µs removed multiple reflections. We applied FFT to this signal and calculated the amplitude spectrum. We used the amplitude spectrum for the input features because theoretical and experimental studies have shown that the behavior of ultrasound depends on frequency [51], [52], [53]. For the amplitude spectrum, the part from 350 kHz to 650 kHz was used as the input features considering the ultrasound bandwidth emitted from the single-element ultrasonic transducer and the noise amplitude. The resulting input feature dimension (number of the frequency bins) was 10.

107382 VOLUME 11, 2023 Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
Regarding the FFT software, we employed the NumPy FFT function to compute the amplitude spectrum. The NumPy FFT function internally uses PocketFFT, which is a library that takes a signal of any input length and automatically selects the best FFT algorithm. The worst-case time complexity is O(N log N); Bluestein's algorithm is used for inputs whose length has large prime factors.
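The feature extraction steps can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions rather than the study's implementation: the function name, the use of `numpy.fft.rfft`, and the absence of any windowing or bin aggregation are our assumptions. In particular, the number of bins returned depends on the window length, and this sketch does not claim to reproduce the reported feature dimension of 10.

```python
import numpy as np

SAMPLING_RATE_HZ = 50e6    # sampling frequency stated in the receiver configuration
WINDOW_US = (10.0, 80.0)   # analysis area, in microseconds
BAND_HZ = (350e3, 650e3)   # frequency band kept as input features


def amplitude_features(signal, fs=SAMPLING_RATE_HZ, window_us=WINDOW_US, band_hz=BAND_HZ):
    """Return (frequencies, amplitudes) of the in-band amplitude spectrum.

    Hypothetical helper: cuts the analysis area, applies a real-input FFT,
    and keeps the bins whose frequency lies inside band_hz.
    """
    signal = np.asarray(signal, dtype=float)
    # Convert the analysis window from microseconds to sample indices.
    start = int(round(window_us[0] * 1e-6 * fs))
    stop = int(round(window_us[1] * 1e-6 * fs))
    segment = signal[start:stop]
    # Amplitude spectrum of the real-valued segment.
    spectrum = np.abs(np.fft.rfft(segment))
    freqs = np.fft.rfftfreq(segment.size, d=1.0 / fs)
    # Keep only the bins inside the transducer bandwidth.
    mask = (freqs >= band_hz[0]) & (freqs <= band_hz[1])
    return freqs[mask], spectrum[mask]
```

For a synthetic 500 kHz sine wave sampled at 50 MHz over 7980 points, the largest in-band amplitude falls on the 500 kHz bin, which is a quick sanity check of the windowing and band selection.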

2) EIGHT MACHINE LEARNING ALGORITHMS
We employed the following classic classifiers as our machine learning algorithms. We selected eight algorithms from the Scikit-learn library. We employed support vector machines (SVMs) as representative standard classifiers. The one without a kernel is called the linear SVM [54], and the one with the radial basis function kernel is called the RBF SVM [55]. We employed linear discriminant analysis (LDA) [56] and quadratic discriminant analysis (QDA) [57] as parametric models assuming Gaussian distributions. We employed random forest [58] and adaptive boosting (AdaBoost) [59] as ensemble learning algorithms using decision trees as weak learners. Finally, we employed a naive three-layer fully connected multi-layer perceptron (MLP) [60] and k-nearest neighbors [61] as representative non-parametric models.
We trained and tested all machine learning algorithms with a positive label (value is 1) as "fresh" and a negative label (value is 0) as "non-fresh." The default hyperparameters in the Scikit-learn library were used. The details are described in Appendix A. We employed class probability as the output score for each machine learning algorithm. The class probability is defined as the "decision function" in the Scikit-learn library for each machine learning algorithm.
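A minimal sketch of how the eight classifiers could be instantiated with Scikit-learn defaults is shown below. The specific estimator classes (for example, `LinearSVC` for the linear SVM) and the helper names are assumptions on our part, because the exact configuration is given in Appendix A rather than here. The fallback from `decision_function` to `predict_proba` is likewise an assumption for estimators that do not expose a decision function.

```python
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC, LinearSVC


def make_classifiers():
    """Eight classifiers with library-default hyperparameters (assumed classes)."""
    return {
        "Linear SVM": LinearSVC(),
        "RBF SVM": SVC(kernel="rbf"),
        "LDA": LinearDiscriminantAnalysis(),
        "QDA": QuadraticDiscriminantAnalysis(),
        "Random forest": RandomForestClassifier(),
        "AdaBoost": AdaBoostClassifier(),
        "MLP": MLPClassifier(),
        "k-nearest neighbors": KNeighborsClassifier(),
    }


def raw_score(model, X):
    """Continuous score for the positive ("fresh", label 1) class.

    Uses decision_function when the fitted estimator provides it and
    otherwise falls back to the predicted probability of class 1.
    """
    if hasattr(model, "decision_function"):
        return model.decision_function(X)
    return model.predict_proba(X)[:, 1]
```

Because the freshness score is subsequently min-max normalized, either kind of raw score ends up on the same 0 to 1 scale, although the two are not interchangeable in general.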

3) SINGLE A-MODE SIGNAL METHOD
We call the method of calculating the freshness score for a single A-mode signal the single A-mode signal method, and it is expressed as

$$ s_{a,b} = \mathrm{NORM}_{\mathrm{single}}\left(\mathrm{ML}\left(\mathrm{FFT}\left(d_{a,b}\right)\right)\right), $$

where s is the freshness score, d is the A-mode ultrasound signal, a is the sample index, b is the signal index, FFT(•) is the fast Fourier transform with the amplitude spectrum as output, ML(•) is the machine learning algorithm with the class probability as output, and NORM single (•) is the min-max normalization function, which makes the maximum value of s equal to 1 and the minimum value equal to 0.
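The composition of feature extraction, scoring, and min-max normalization can be sketched as follows. The callables `extract_features` and `score_model` are hypothetical stand-ins for FFT(•) and ML(•). The set of scores over which the minimum and maximum are taken is not specified in this section, so the sketch assumes that it is the batch passed to the function.

```python
import numpy as np


def min_max_normalize(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    values = np.asarray(values, dtype=float)
    lowest, highest = values.min(), values.max()
    if highest == lowest:
        # Degenerate case (all scores equal): assumed convention, return zeros.
        return np.zeros_like(values)
    return (values - lowest) / (highest - lowest)


def single_signal_scores(signals, extract_features, score_model):
    """Freshness score of each signal: NORM_single(ML(FFT(d))).

    extract_features maps one signal to a feature vector (stands in for FFT).
    score_model maps a 2-D feature array to one raw score per row (stands in for ML).
    """
    features = np.array([extract_features(d) for d in signals])
    raw = np.asarray(score_model(features), dtype=float)
    return min_max_normalize(raw)
```

For instance, raw scores of -2, 0, and 2 are mapped to 0, 0.5, and 1, respectively.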

4) MULTIPLE A-MODE SIGNALS METHOD
The classification performance is expected to be higher with multiple A-mode signals than with a single signal because the noise effects are averaged out. Therefore, we calculated a freshness score for multiple signals by calculating the mean of the freshness scores of the single A-mode signal method. We call this method the multiple A-mode signals method, and its freshness score $\bar{s}$ is calculated by

$$ \bar{s}_{a,b} = \mathrm{NORM}_{\mathrm{multiple}}\left(\frac{1}{k}\sum_{b' \in B_k(b)} s_{a,b'}\right), $$

where s is the freshness score calculated by the single A-mode signal method, a is the sample index, b and b′ are the signal indices, k is the number of signals used, NORM multiple (•) is the min-max normalization function that makes the maximum value of $\bar{s}$ equal to 1 and the minimum value equal to 0, and B k (b) is the set of indices from b to b+k−1; we note that if b+k−1 exceeds the last index, we collect the remaining indices sequentially from the first index. In this study, we set the default value of the number of signals k to 24, except for the experiments to evaluate the performance dependency on k.
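The wrap-around index set B k (b) and the averaging step can be sketched with NumPy as follows. The function names are hypothetical, and pooling all samples before the min-max normalization is an assumption, since the set over which the normalization is computed is not stated in this section.

```python
import numpy as np


def circular_window_means(single_scores, k=24):
    """Mean of k consecutive single-signal scores for every start index b.

    single_scores holds the scores of ONE sample in signal-index order.
    Indices past the end wrap around to the first index, as in B_k(b).
    """
    s = np.asarray(single_scores, dtype=float)
    n = s.size
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= number of signals")
    # Row b holds the indices b, b+1, ..., b+k-1 taken modulo n.
    idx = (np.arange(n)[:, None] + np.arange(k)[None, :]) % n
    return s[idx].mean(axis=1)


def multiple_signal_scores(scores_by_sample, k=24):
    """Apply the windowed mean per sample, then min-max normalize the pooled means."""
    means = {a: circular_window_means(s, k) for a, s in scores_by_sample.items()}
    pooled = np.concatenate(list(means.values()))
    lowest, highest = pooled.min(), pooled.max()
    # Assumed convention: if all means are equal, avoid dividing by zero.
    span = highest - lowest if highest > lowest else 1.0
    return {a: (m - lowest) / span for a, m in means.items()}
```

As a small example, for the scores 0, 1, 2, 3 and k = 2, the windowed means are 0.5, 1.5, 2.5, and 1.5; the last value averages the final score with the first one because of the wrap-around.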

C. MODEL EVALUATION AND ANALYSIS
In this study, we conducted numerical experiments to answer the following three research questions.
• Whether the freshness score obtained by applying machine learning to A-mode signals can be used to inspect freshness.
• Whether multiple A-mode signals can improve the performance of freshness inspection.
• Whether the position and pressure at which the A-mode signals are acquired have any influence on the performance of freshness inspection.
We used the following software and hardware for all numerical experiments and analyses. The software used included Python 3.10.9, NumPy 1.25.0 [62], SciPy 1.11.1 [63], Pandas 2.0.3 [64], and Scikit-learn 1.3.0 [65]; the hardware used was an Intel Xeon CPU E3-1245 v5 with 32 GB RAM.

1) STATISTICAL ANALYSIS OF FRESHNESS SCORES (SETTING DETAILS)
We calculated the freshness scores for eight machine learning algorithms using the single and multiple A-mode signals methods to test whether the freshness scores can be used to inspect the freshness of tuna. We created two groups of freshness scores: freshness scores from fresh samples and freshness scores from non-fresh samples. We tested whether the difference between the means of the two groups was statistically significant. We used a one-tailed t-test to determine the p-value. We set the threshold for statistical significance at p = 0.05. We employed the standard dataset P normal L mix as the dataset and used fold-1 and fold-2 for training and fold-3 for the evaluation and statistical analysis. We performed this statistical analysis for each of the eight machine learning algorithms.
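The group comparison described above can be sketched with SciPy as follows. The function name is hypothetical, and the choice of an independent two-sample test with the library default of equal variances is an assumption, because the variant of the t-test is not specified in this section.

```python
from scipy import stats


def one_tailed_p_value(fresh_scores, non_fresh_scores):
    """One-tailed independent two-sample t-test.

    Alternative hypothesis: the mean freshness score of the fresh group
    is greater than that of the non-fresh group.
    """
    result = stats.ttest_ind(fresh_scores, non_fresh_scores, alternative="greater")
    return result.pvalue
```

A p-value below 0.05 from this function would indicate, under the stated assumptions, that the fresh group has a significantly higher mean freshness score than the non-fresh group.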

2) PERFORMANCE ANALYSIS OF THE SINGLE AND MULTIPLE A-MODE SIGNALS METHODS (SETTING DETAILS)
We conducted numerical experiments to compare the performance of the single and multiple A-mode signals methods. We employed the receiver operating characteristic (ROC) curve, calculated based on freshness scores, and its area under the curve (AUC) to evaluate the performance of the machine learning algorithm. We employed the standard dataset P normal L mix as the dataset. For performance evaluation, one ROC curve is calculated for one set of training and testing. Therefore, multiple sets of evaluations are needed to assess ROC curve variation. This study employed three-fold cross-validation; two-thirds were used for training and the remainder for testing, and we repeated three patterns of splits. We compared the mean (standard deviation) of ROC-AUCs obtained using the single and multiple signals methods, and tested whether the improvements were statistically significant. We created two groups of ROC-AUCs: one for the single A-mode signal method and the other for the multiple A-mode signals method. We considered ROC-AUCs calculated from the same split to be a pair, and we performed statistical analysis on the difference in the means of the ROC-AUCs between the two groups. We used a paired one-tailed t-test to determine the p-value. We set the threshold for statistical significance at p = 0.05. We performed this statistical analysis for each of the eight machine learning algorithms. Furthermore, we evaluated the performance dependency of the multiple A-mode signals method using ROC-AUCs on the number of signals by varying the number of signals from 1 to 24 on each of the eight machine learning algorithms.
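The evaluation steps can be sketched with Scikit-learn and SciPy as follows. The function names are hypothetical. Whether the reported standard deviation is the population or the sample standard deviation is not stated here, so the sketch assumes the NumPy default (population).

```python
import numpy as np
from scipy import stats
from sklearn.metrics import roc_auc_score


def fold_auc(labels, freshness_scores):
    """ROC-AUC of one train/test split (label 1 = fresh, label 0 = non-fresh)."""
    return roc_auc_score(labels, freshness_scores)


def summarize_aucs(aucs):
    """Mean and standard deviation of ROC-AUCs over the cross-validation splits."""
    aucs = np.asarray(aucs, dtype=float)
    # Assumption: population standard deviation (NumPy default, ddof=0).
    return aucs.mean(), aucs.std()


def paired_one_tailed_p_value(aucs_multiple, aucs_single):
    """Paired one-tailed t-test over splits.

    Alternative hypothesis: the multiple A-mode signals method has a
    greater mean ROC-AUC than the single A-mode signal method.
    Entries at the same position must come from the same split.
    """
    return stats.ttest_rel(aucs_multiple, aucs_single, alternative="greater").pvalue
```

With three-fold cross-validation each method contributes three ROC-AUC values, so the paired test has only two degrees of freedom.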

3) ANALYSIS OF PERFORMANCE DEPENDENCY ON THE PROBE POSITION AND PRESSURE (SETTING DETAILS)
We further investigated the influence of the probe position and pressure on the performance.We investigated this topic using specialized datasets in which the probe spot position and pressure are different from the standard dataset.Specifically, we evaluated the mean and standard deviation of the ROC-AUCs of the multiple A-mode signals method  in three-fold cross-validation for eight machine learning algorithms on P normal L mix , P normal L h , P normal L m , P normal L U , P normal L D , P high L mix , and P mix L mix .We note that only the multiple A-mode signals method is included in the analysis because of the significant performance improvement achieved by the multiple A-mode signals method.

A. EXAMPLES OF A-MODE SIGNALS
We present representative examples of the A-mode signals from the P normal L mix dataset. Figs. 5 and 6 show the signals acquired from fresh and non-fresh samples, respectively. The amplitude spectra of these signals are provided in Supplementary Materials. In the two-probe reflection method, the signal rise time was approximately 10 µs because the transmitting and receiving probes were different, and a silicon sheet was placed between the transducers and tuna surface. Because ultrasound waves were intermittently reflected inside the tuna, reflected waves could be observed in all time areas up to the measurement limit after the rise time.
Because the spine has a larger acoustic impedance than ice, a large reflection from the spine was observed in most signals between 40 µs and 80 µs. Reflection waves were also observed from a later spine area. These waves included reflections from the opposite side of the spine and multiple reflections. Although reflection waves from this time area could contain some information, we excluded them from the following sections for ease of analysis.
Finally, we visually compared the signals obtained from fresh and non-fresh samples. The amplitudes of the reflection signal from the spine tended to be larger in the non-fresh sample. However, the signals from the fresh sample were larger in some cases owing to their complex internal anatomy.

B. STATISTICAL ANALYSIS OF FRESHNESS SCORES
Fig. 7 shows the freshness scores calculated using the single and multiple A-mode signals methods on the P normal L mix dataset. The graph also shows the p-values calculated using a one-tailed t-test. The p-values were less than 0.05 in all cases for all machine learning algorithms. Moreover, the p-values were smaller for the multiple A-mode signals method than for the single A-mode signal method for all machine learning algorithms. These results suggest that tuna freshness can be classified using A-mode ultrasound signals and machine learning, as proposed in this study.

C. PERFORMANCE ANALYSIS OF THE SINGLE AND MULTIPLE A-MODE SIGNALS METHODS
1) RECEIVER OPERATING CHARACTERISTIC CURVES
Table 3 shows the performance of eight machine learning algorithms on the standard dataset P normal L mix . For all algorithms, the performances of the multiple A-mode signals method were higher than those of the single A-mode signal method with statistical significance (p < 0.05, paired one-tailed t-test). The performance of QDA was the best for both single and multiple A-mode signals methods. The means of the ROC-AUCs of QDA in the single and multiple signals methods were 0.619 and 0.724, respectively. RBF SVM and random forest were close to QDA in the mean of ROC-AUCs at 0.721 and 0.722 using the multiple A-mode signals method, respectively. The random forest had the lowest standard deviation of 0.031, despite its high performance. Fig. 8 shows the ROC curves of eight machine learning algorithms using single and multiple signals methods. We compare QDA, RBF SVM, and random forest, which performed relatively better in the multiple A-mode signals method. QDA and RBF SVM tended to have a similar shape of the ROC curve. The ROC curves for QDA and RBF SVM are slightly convex in the middle of the false positive rates. The random forest showed relatively high true positive rates at low and high false positive rates, whereas its true positive rates were slightly lower in the middle of the false positive rates.

2) PERFORMANCE DEPENDENCY ON THE NUMBER OF SIGNALS IN THE MULTIPLE A-MODE SIGNALS METHOD
Fig. 9 shows the dependency of the performance on the number of signals in the multiple A-mode signals method; the standard dataset P normal L mix was employed. The ROC-AUC increased monotonically with the number of signals for all eight algorithms. The best-performing algorithm across all numbers of signals was QDA, increasing from 0.619 (single signal) to 0.724 (24 signals). Moreover, the trends in performance increase can be divided into two groups. For most algorithms, the performance improvement slowed down after the number of signals reached 10. In contrast, random forest and AdaBoost improved their performance nearly linearly after 10 signals.

FIGURE 10. Performance dependency of the multiple A-mode signals method on the probe position and pressure. Each bar shows the mean of the three-fold cross-validation, and each error bar shows one standard deviation. For the dataset, "P" indicates the pressure to the transducers: "normal," "high," or "mix" of both. "L" indicates the position of the probe spots; "v" indicates near the major axis; "h" indicates near the minor axis; "m" indicates between the "v" and "h" probe spots; "U" indicates the upper part; "D" indicates the lower part; "mix" indicates all probe spots acquired in this study. SVM, support vector machine; RBF, radial basis function (kernel); LDA, linear discriminant analysis; QDA, quadratic discriminant analysis; AdaBoost, adaptive boosting; MLP, multi-layer perceptron.

TABLE 4. Performance dependency of the multiple A-mode signals method on the probe position and pressure.

D. PERFORMANCE DEPENDENCY ON THE PROBE POSITION AND PRESSURE
Fig. 10 and Table 4 show the dependency of the performance (ROC-AUC) on the probe position and pressure. The performances were lower for all algorithms on the P normal L h dataset than on the standard dataset P normal L mix . All A-mode signals in this dataset were acquired from the probe spots at the horizontal (h) position. Excluding the P normal L h dataset, the performances of the best-performing algorithm in the P normal L m , P normal L U , and P normal L D datasets were higher than that in the P normal L mix dataset. Therefore, probe spots at similar positions are preferable to those at various positions for better performance. For probe pressure, the best performance was 0.779 for AdaBoost in the P high L mix dataset; however, the mean of the eight algorithms was slightly lower than that in the standard dataset P normal L mix . The high pressure may benefit certain algorithms. The best and mean performances in the P mix L mix dataset were lower than those in the P normal L mix and P high L mix datasets. Therefore, smaller variations in probe pressure improve the performance compared to larger variations.

IV. DISCUSSION
First, we discuss the possibility of performing ultrasonography of frozen materials, especially frozen fish, within the context of existing research. The single A-mode signal method is a naive combination of FFT and typical machine learning algorithms. All algorithms could classify the freshness of frozen tuna with sufficient statistical significance, as shown in Fig. 7. Although there are no publications on the ultrasonography of frozen fish, Tokunaga et al. have shown that the freshness and fat content of raw fish can be estimated using ultrasound and machine learning [50]. The Young's modulus and viscosity coefficient of fish change with rigor mortis [66], and these changes are thought to affect the signals. Although it is natural to assume that the Young's modulus and viscosity coefficient are also affected by the freshness of frozen meat, this has not been reported. This is because it is difficult to measure the Young's modulus and viscosity coefficient while maintaining the frozen state without peeling; this remains a future challenge. In addition to freshness, other factors that change the physical coefficients in raw meat, such as fat content, can also change ultrasound signals in frozen meat, which would allow a wide range of applications.
We compare the advantages of ultrasonography for the inspection of frozen fish to those of other methods.Recently, optical methods such as hyperspectral imaging, near-infrared spectroscopy, and fluorescence spectroscopy have been studied to inspect the freshness of frozen fish [67].Although these methods can detect chemical compounds, the tissue penetration of light with wavelengths ranging from ultraviolet to near-infrared is not as deep as that of low-frequency ultrasound [68], [69], [70].We assume that this reason has caused optical methods to focus mainly on fillets and loins [71], [72], [73], [74], [75], [76], [77], [78], [79], [80].Few studies reported the inspection of intact frozen fish, and the target was horse mackerel fish [81], [82].To the best of our knowledge, no study to inspect large frozen fish with thick skin has been reported.The ultrasound-based freshness inspection method proposed in this study is assumed to be especially beneficial for the inspection of large frozen fish such as tuna.
In addition, we discuss the performance improvement of the machine learning algorithms achieved by the multiple A-mode signals method. Low-frequency ultrasound is necessary to conduct ultrasonography of high-attenuation materials. Low-frequency ultrasound requires a larger aperture owing to its high diffraction, and single-element ultrasonic transducers that generate A-mode signal data are commonly used. Fig. 7 shows that the single A-mode signal method can classify the data with statistical significance. However, the ROC-AUC was in the low range of roughly 0.5 to 0.6 for the eight typical machine learning algorithms. The multiple A-mode signals method improved performance by taking the mean of the freshness scores. Fig. 9 shows that the performance increased monotonically with the number of signals. Therefore, a substantial part of the performance degradation is due to noise that can be eliminated by averaging. This result agrees with a previous study that reported a substantial increase in performance by utilizing multiple A-mode signals in gesture recognition [36]. This implies that the noise is large but follows a relatively simple distribution. This is consistent with the fact that QDA, which assumes a Gaussian distribution, is the best-performing algorithm. A previous study on gesture analysis using A-mode signals also reported the highest performance for QDA [38], which supports this insight for low-frequency A-mode ultrasound signals. Interestingly, the performance of random forest and AdaBoost increased linearly with the number of signals. This may be because the classification planes drawn by the two methods captured the intrinsic features of the data better than those of the other methods.
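The effect of averaging freshness scores can be illustrated with a minimal sketch. The data below are synthetic and purely hypothetical: each per-signal score is Gaussian noise plus a small class-dependent shift, and the sample count, shift, and function names (`roc_auc`, `simulate_scores`) are assumptions made for illustration, not values or code from this study. The sketch only shows why taking the mean over 24 signals can raise the ROC-AUC when the noise is large and roughly Gaussian.

```python
import random
from statistics import mean


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def simulate_scores(n_samples, n_signals, shift, rng):
    """Hypothetical per-signal scores: unit Gaussian noise plus a class shift."""
    labels, per_signal = [], []
    for i in range(n_samples):
        y = i % 2  # alternate the two classes (assumed balanced)
        labels.append(y)
        per_signal.append([rng.gauss(shift * y, 1.0) for _ in range(n_signals)])
    return labels, per_signal


rng = random.Random(0)
labels, per_signal = simulate_scores(n_samples=400, n_signals=24, shift=0.3, rng=rng)

# Single A-mode signal: use one score per sample.
auc_single = roc_auc(labels, [s[0] for s in per_signal])
# Multiple A-mode signals: use the mean of all 24 scores per sample.
auc_multi = roc_auc(labels, [mean(s) for s in per_signal])

print(f"single signal: {auc_single:.3f}, mean of 24 signals: {auc_multi:.3f}")
```

In this toy setting, averaging 24 independent scores shrinks the noise standard deviation by a factor of about the square root of 24 while leaving the class shift unchanged, which is the mechanism the paragraph above attributes the improvement to.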
We discuss which of the eight machine learning algorithms is the best option for the multiple A-mode signals method. Table 3 shows that QDA has the best performance, with a mean (standard deviation) ROC-AUC of 0.724 (0.080), whereas RBF SVM and random forest achieved 0.721 (0.066) and 0.722 (0.031), respectively. The performances of RBF SVM and random forest were thus very close to the best. In addition, the random forest had the smallest standard deviation despite its high performance. The ROC curves for QDA and RBF SVM shown in Fig. 8 are slightly convex in the middle range of false positive rates. The ROC curve of the random forest has relatively high true positive rates at both low and high false positive rates, which indicates that the random forest is relatively robust over a wide range of threshold values. As mentioned in the previous paragraph, the random forest also showed a favorable improvement as the number of signals increased. Therefore, although QDA is the best choice in terms of performance, the random forest is also an option that cannot be completely ruled out.
We discuss the influence of the probe position and pressure on performance. In general, smaller variations in positions and pressures contributed to better performance than larger variations; P mix L mix is the worst dataset in terms of the best-performing machine learning algorithm. A closer look reveals slight differences for each position and pressure. The ROC-AUC of the best-performing machine learning algorithm was the second worst when using the horizontal (h) positions as the probe spots, as shown in Fig. 10. This is because the ultrasound waves pass through the dark muscle of the tuna at the horizontal (h) positions, and the ultrasound signals reflect this effect. P normal L m was the best dataset in terms of the best ROC-AUC among P normal L m, P normal L U, and P normal L D; however, the differences from the other two datasets were too small to be insightful. Regarding the effect of probe pressure, different machine learning algorithms performed better in the P high L mix dataset with high pressure than in the normal case; AdaBoost achieved an ROC-AUC of 0.779, the best performance in this study. High probe pressures may cause intrinsic features to stand out compared with noise and contribute to the performance improvement of the algorithms with simpler classification planes.
Future studies will focus on improving performance. QDA generally performed better than the other algorithms. However, the most influential factor on performance was not the algorithm type but the number of signals used. Our results on the multiple A-mode signals method show substantial performance improvements when utilizing multiple signals.
Further performance improvement can be expected by employing more advanced methods. This problem setting falls within the framework of multiple instance learning [83]. A more sophisticated method may achieve better performance. Moreover, performance depends on the variations in the probe position and probe pressure. Manual acquisition of ultrasound data was employed in this study; however, it would be desirable to improve performance by acquiring data with more accurate positioning and stable pressure. This suggests the need to mechanize ultrasound data acquisition.
Finally, we discuss the generalizability of our findings to high-attenuation materials other than frozen tuna. The most important finding of this study is that performance can be substantially improved by using multiple signals.
The performance improvement from using multiple A-mode signals has also been reported in a gesture detection study [36] and is expected to have some generality. The freshness of frozen tuna corresponds to a change in the physical properties of the entire material. Our findings are therefore considered useful for detecting changes in physical properties across an entire material. Examples in other high-attenuation materials include the degradation of wood, foods, and composite materials. Conversely, the findings of this study are not considered applicable to the task of detecting localized defects such as cavities in a material. For such tasks, B-mode ultrasound data acquired using multi-channel transducers are needed.

V. LIMITATIONS
This study has some limitations. First, this study depends on tail-cutting by the supplier for freshness labels. Although this is a standard process, it has not been investigated scientifically, which remains a challenge for the future. Second, because the probe operation was manual, our data have large variations in the probe positions and pressures. Data acquisition using mechanized equipment can yield data with smaller variations. Finally, we employed the amplitude spectrum as the input features of the machine learning algorithms; other feature extraction methods have not been tested. We also did not discuss deep learning algorithms [84], [85], [86], [87], [88], which have achieved state-of-the-art results in many fields, because of the large number of variations and the difficulty of analysis.
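For reference, the amplitude-spectrum features mentioned above can be sketched as follows. This is a minimal illustration using NumPy's real-input FFT; the function name `amplitude_spectrum`, the signal length of 256 points, and the synthetic sinusoid are assumptions for the example, and the actual preprocessing in this study (windowing, normalization, frequency range) may differ.

```python
import numpy as np


def amplitude_spectrum(signal):
    """Return the one-sided FFT amplitude spectrum of a real-valued signal."""
    signal = np.asarray(signal, dtype=float)
    # rfft keeps only the non-negative frequency bins of a real signal.
    return np.abs(np.fft.rfft(signal))


# Synthetic stand-in for an A-mode signal: a sinusoid with exactly
# 10 cycles over 256 samples (hypothetical values, not measured data).
n_points = 256
cycles = 10
t = np.arange(n_points)
signal = np.sin(2 * np.pi * cycles * t / n_points)

features = amplitude_spectrum(signal)
peak_bin = int(np.argmax(features))
print(features.shape, peak_bin)
```

The resulting vector has one amplitude per frequency bin and can be passed directly to a classifier as the feature vector of one signal.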

VI. CONCLUSION
In this study, we investigated whether low-frequency A-mode ultrasound in combination with machine learning can be used to inspect frozen materials, specifically frozen tuna. We also evaluated the factors that influence the performance of this method. This is the first study in which the freshness of tuna was inspected using low-frequency ultrasound. Furthermore, this study shows that utilizing multiple signals can improve the performance of the A-mode ultrasound analysis. The findings of this study have the potential to contribute to the analysis of A-mode ultrasound data obtained from other high-attenuation materials. Examples of such materials include wood, food, and composite materials.

APPENDIX A DETAILS OF HYPERPARAMETERS
As described in the main text, this study mostly employed the default values provided by the Scikit-learn library for the hyperparameters. The key hyperparameters for each algorithm are explained in this section. Support vector machine (SVM) internally calls LIBSVM [89] and employs hinge loss and L2 penalty. The cost parameter was set to 1.0. Linear SVM does not use a kernel function, whereas RBF SVM uses the radial basis function kernel. The gamma parameter for the RBF kernel is calculated as the inverse of the product of the number of input features and the variance of the training data. Linear and quadratic discriminant analyses fit the model to the data with a Gaussian distribution assumption using Bayes' rule. Both employ singular value decomposition as solvers and require no hyperparameters except a small tolerance value. The random forest is an ensemble method that uses decision trees as weak learners. The number of estimators was 100, the maximum depth of the decision tree was unlimited, and the maximum number of features of the decision tree was the square root of the number of input features. The Gini impurity was used as the criterion for the decision trees. AdaBoost is another ensemble method that uses decision trees as weak learners. The maximum number of estimators was 50, and the learning rate was 1.0; the maximum depth of the decision tree was 1, and the maximum number of features of the decision tree was the number of input features. For the multi-layer perceptron, we used the default values; however, the maximum number of iterations was changed to 5000 because the calculation did not converge at the default value of 200. The neural network comprised a single intermediate layer of 100 units with a ReLU activation function. We used the Adam optimizer with an initial learning rate of 0.001 and a batch size of 200. Regarding k-nearest neighbors, the number of neighbors for each sample point was 5, and the metric was Euclidean distance.
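The hyperparameters described above can be written out as a Scikit-learn configuration. The following is a sketch under stated assumptions rather than the code used in this study: the dictionary name `classifiers` is hypothetical, the use of `SVC` for both SVM variants is inferred from the reference to LIBSVM, and most arguments simply spell out Scikit-learn defaults. AdaBoost is left with its default weak learner, which in Scikit-learn is a decision tree of depth 1.

```python
from sklearn.discriminant_analysis import (
    LinearDiscriminantAnalysis,
    QuadraticDiscriminantAnalysis,
)
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Eight classifiers with the hyperparameters listed in this appendix.
classifiers = {
    "Linear SVM": SVC(kernel="linear", C=1.0),
    # gamma="scale" means 1 / (n_features * variance of the training data).
    "RBF SVM": SVC(kernel="rbf", C=1.0, gamma="scale"),
    "LDA": LinearDiscriminantAnalysis(solver="svd"),
    "QDA": QuadraticDiscriminantAnalysis(),
    "Random forest": RandomForestClassifier(
        n_estimators=100, max_depth=None, max_features="sqrt", criterion="gini"
    ),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, learning_rate=1.0),
    "MLP": MLPClassifier(
        hidden_layer_sizes=(100,),
        activation="relu",
        solver="adam",
        learning_rate_init=0.001,
        batch_size=200,
        max_iter=5000,  # raised from the default of 200 to reach convergence
    ),
    "k-NN": KNeighborsClassifier(n_neighbors=5, metric="euclidean"),
}

for name, clf in classifiers.items():
    print(name, type(clf).__name__)
```

Each estimator exposes the usual `fit` and `predict` interface, so the same feature matrix can be passed to all eight in a loop.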

FIGURE 2. Sample of a tuna; it has a skin and white spine in the center.

FIGURE 3. Truncated elliptic cone approximation of the tuna.

FIGURE 4. Illustration of the assignment of the coordinate and probe spots. The first letter indicates the dorsal half (U) or ventral half (D), and the second letter indicates the right (R) or left (L) side as viewed from the caudal side. The third letter indicates the part closer to the minor axis (v), the major axis (h), or the intermediate position (m). The last number is the serial number from the caudal side.

FIGURE 5. Signals from a fresh sample on left side probe spots; the vertical axis shows voltage (V), and the horizontal axis shows elapsed time (µs).

FIGURE 6. Signals from a non-fresh sample on left side probe spots; the vertical axis shows voltage (V), and the horizontal axis shows elapsed time (µs).

FIGURE 7. Freshness scores and p-values (one-tailed t-test) of eight typical machine learning algorithms calculated using (a) single and (b) multiple A-mode signals methods. SVM, support vector machine; RBF, radial basis function (kernel); LDA, linear discriminant analysis; QDA, quadratic discriminant analysis; AdaBoost, adaptive boost; MLP, multi-layer perceptron.

FIGURE 8. Receiver operating characteristic (ROC) curves of eight machine learning algorithms in the single and multiple A-mode signals methods. The solid line shows the mean of the ROC curves calculated by three-fold cross-validation, and the shaded area is one standard deviation. Each p-value was calculated using a paired one-tailed t-test. SVM, support vector machine; RBF, radial basis function (kernel); LDA, linear discriminant analysis; QDA, quadratic discriminant analysis; AdaBoost, adaptive boost; MLP, multi-layer perceptron.

FIGURE 9. Performance dependency of the multiple A-mode signals method on the number of signals. Solid lines indicate the mean of three-fold cross-validation, and the shaded area is one standard deviation. ROC-AUC, the area under the receiver operating characteristic curve; SVM, support vector machine; RBF, radial basis function (kernel); LDA, linear discriminant analysis; QDA, quadratic discriminant analysis; AdaBoost, adaptive boost; MLP, multi-layer perceptron.

TABLE 2. Number of A-mode signals in each dataset.

TABLE 3. Performances of the single and multiple A-mode signals methods.