Phase-Based Motion Magnification for Structural Vibration Monitoring at a Video Streaming Rate

Here we propose a novel approach to image magnification that enables frame-by-frame motion magnification at a video streaming rate of around 30 fps. This approach can be used instead of the batch processing of an image file in conventional phase-based magnification (PBM). The new PBM method can show a magnified video stream on a display at 30 fps as it is captured, which is helpful for vibration measurement and for monitoring tasks in which magnified images need to be viewed during acquisition. To accomplish streaming-rate magnification, the proposed PBM performs temporal bandpass filtering by time-domain convolution, frame by frame, whereas conventional PBM employs a frequency-domain filter that is applied to the entire image file at once. An experiment was conducted to monitor the vibration of a cantilever using a webcam streaming at the same frame rate, and data were collected simultaneously using a laser Doppler vibrometer for comparison. The experiment confirmed that the proposed PBM approach is more effective than the conventional magnification method, and it was also used to analyze the system performance for vibration measurement. Additionally, the proper orthogonal modes could be found through singular value decomposition of the cantilever vibration displacement data collected from the magnified image frames. Furthermore, the dominant mode could be effectively extracted under excitation at the resonance frequency. Because the displacements are scaled by the magnification factor, the vibration displacements from the proposed method were estimated using linear regression, and the accuracy of the estimated displacements was within the permissible error bound. Since the proposed PBM is a frame-by-frame operation, the magnification parameters can be adjusted instantaneously, even while the magnification is being processed. In addition, the temporal FIR filter order in the proposed PBM can be chosen independently of the number of image frames.


I. INTRODUCTION
To circumvent the limitations of commonly used vibration sensors, such as accelerometers or laser vibrometers, computer vision-based techniques utilizing high-speed or conventional digital cameras have been investigated [1], [2], [3]. The camera-based measurement method is unaffected by loading and temperature errors and has the advantage of obtaining a full-field image [4], [5], [6], [7]. Nevertheless, data acquisition from video frames is challenging when the vibration motion tracked in video images is at the subpixel level; however, small vibrations can be amplified using motion magnification algorithms [8], [9], [10], [11], [12], [13], which can then be applied to vibration measurements [14], [15], [16].
The associate editor coordinating the review of this manuscript and approving it for publication was Joewono Widjaja.
Liu et al. [8] developed an approach to visually show tiny motions in an image by employing the Lagrangian description used in fluid mechanics. The Lagrangian approach is a method for tracking individual particles with high precision, and it has a high computational cost. On the other hand, Wu et al. [9] demonstrated an effective technique called Eulerian video magnification (EVM) that is based on the Eulerian description. EVM works similarly to optical flow in that it amplifies the pixel intensities that change at a fixed pixel position. While this Eulerian magnification considerably reduces the amount of computing required, a high magnification factor cannot be applied to high-frequency motion and noise increases linearly as the magnification factor increases [10]. To address this issue, Wadhwa et al. [10] suggested phase-based magnification (PBM) using a complex steerable pyramid (CSP) in the Eulerian framework [17]. Magnification factors greater than those used in conventional Eulerian magnification can be used, providing a magnified image that is more robust to artifacts and noise.
VOLUME 10, 2022 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Although the CSP has a long computation time because of its overcompleteness [11], [17], it has substantial advantages in obtaining accurately magnified motion and in detecting motion in many directions. To reduce the computing time, a method employing the Riesz pyramid instead of the CSP has been proposed [11], and a local Riesz pyramid approach has been proposed for faster calculation [13]. However, compared to the CSP, the Riesz pyramid has a disadvantage in detecting motion in various directions, and the local Riesz pyramid has a limitation in that motion is weakly magnified. Oh et al. [12] proposed a learning-based motion magnification model based on a convolutional neural network architecture. Their learning-based method demonstrated superior performance in terms of noise and accuracy without a temporal filter. This strategy, however, has the disadvantage of requiring a large amount of training data and time to construct a trained model. Despite these alternatives, many vision-based vibration analysis studies still use steerable pyramid-based motion magnification [14], [15], [16].
However, the conventional phase-based motion magnification approach can examine the outputs only after all frames of a finite-length video clip have been processed, whereas many applications require video streaming-rate processing. In conventional PBM, all video frames are processed simultaneously, and all video data are allocated and processed in memory. The processing time is proportional to the image resolution and the number of frames. PBM applications such as structural health monitoring, however, require instant results, and real-time-level performance in applying motion magnification would be beneficial. This paper introduces a novel approach that allows frame-by-frame processing instead of the batch processing of image files in conventional PBM, allowing motion magnification to be applied at the video streaming rate. In addition, vibration measurement for a simple cantilever example was conducted by applying magnification to real-time streaming video. We propose an algorithm for frame-by-frame evaluation of the magnification output that replaces the bandpass filter in the frequency domain with a convolution in the time domain for video streaming processing. This streaming-rate magnification is implemented on a small portable laptop equipped with a GPU at a processing speed of 30 fps, matching the webcam's frame rate. An experiment was conducted to monitor the vibration of a cantilever using a webcam streaming at the same frame rate, and data were collected simultaneously using a laser Doppler vibrometer (LDV) for reference. The experiment confirmed that the proposed approach is as effective as the conventional magnification method in many circumstances, and it was also used to evaluate the system validity for vibration measurement in terms of displacement, frequency, and vibration modes.

II. PHASE-BASED MOTION MAGNIFICATION FOR VIBRATION MEASUREMENT
A. PHASE-BASED MOTION MAGNIFICATION
The core idea of phase-based motion magnification is a CSP that decomposes an image into complex-valued data at multiple scales. As a result, phase-based motion magnification can handle image motion data expressed as a phase angle difference. The CSP [10] consists of multi-level complex steerable filters arranged according to multiple orientations and spatial frequency bands. As illustrated in Fig. 1, the image $I(x, y)$ is first Fourier-transformed into $\tilde{I}(u, v)$, where $(x, y)$ are the geometric coordinates and $(u, v)$ are the image frequency coordinates. Then, steerable filters $\Psi_{\omega,\theta}(u, v)$ are applied. $\Psi_{\omega,\theta}(u, v)$ is a filter component of the CSP that corresponds to the spatial frequency band with scale level $\omega$ and orientation $\theta$ in the frequency domain. This description is also a polar coordinate scheme of the spatial frequency domain $(u, v)$ because $\omega$ and $\theta$ represent the distance and the direction from the origin, respectively, as shown in Fig. 1.
After the CSP is applied, the results are shown in Fig. 1 as the magnitude of the inverse Fourier-transformed images (in the image geometry domain), which display the multi-orientation and multi-scale frequency components $S_{\omega,\theta}(x, y)$. The outer region corresponds to the high frequency in the spatial frequency domain, which correlates with a sharp change in pixel intensity value or patterns with a short wavelength. On the other hand, the inner region correlates with a low frequency, implying smooth variations in pixel intensity or long-wavelength patterns in the image. The direction from the origin to a specific steerable filter region in frequency coordinates is the orientation that the filter extracts, as shown in Fig. 1. For example, the center point of filter $\Psi_{\omega_3,\theta_1}(u, v)$ at Scale 3 in Fig. 1 is horizontally aligned with the origin. Orientation $\theta_1$, therefore, represents the horizontal features of the image, so that the shape of the cantilever is well shown.
Image reconstruction is a summation over the spatial frequency components, which demonstrates mathematically how magnification proceeds. The complex steerable filter is comprised of windowing and sinusoids in the spatial domain. With the spatial domain $(x, y)$ defined as the vector $\mathbf{x}$, the spatial frequency component of a decomposed image, $S_\omega(\mathbf{x})$, obtained with a filter of window $W(\mathbf{x})$ and spatial frequency $\omega$, can be described as follows:
$$S_\omega(\mathbf{x}) = I(\mathbf{x}) \otimes W(\mathbf{x})\, e^{j\omega \mathbf{x}} \tag{1}$$
where $\otimes$ (circled times) represents the spatial-domain convolution and $W(\mathbf{x})\, e^{j\omega \mathbf{x}}$ is the inverse Fourier transform of the steerable filter $\Psi_\omega(\mathbf{u})$, where $\mathbf{u}$ is a vector expression of $(u, v)$. As the purpose of reconstructing an image is to collect all frequency image components $S_\omega(\mathbf{x})$, the image can be represented as an infinite series
$$I(\mathbf{x}) = \sum_{\omega=-\infty}^{\infty} S_\omega(\mathbf{x}) \tag{2}$$
If $I(\mathbf{x})$ is an initial state and it moves to some position over time $t$, the value yields
$$I(\mathbf{x} + \delta_x(t)) = \sum_{\omega=-\infty}^{\infty} S_\omega(\mathbf{x})\, e^{j\omega \delta_x(t)} \tag{3}$$
where $\delta_x(t)$ is the subtle motion of the image. The phase difference between the first and current image frames is calculated using the phase angles in (2) and (3):
$$\Delta\phi = \angle\!\left[ S_\omega(\mathbf{x})\, e^{j\omega \delta_x(t)} \right] - \angle\!\left[ S_\omega(\mathbf{x}) \right] = \omega\, \delta_x(t) \tag{4}$$
where $\Delta\phi$ denotes the phase difference between the two images. As can be seen in the equation, the phase difference encompasses the motion $\delta_x(t)$. To amplify the phase difference, we apply a magnification factor $\alpha$ to $\Delta\phi$, and $\exp(j\alpha\Delta\phi)$ is then multiplied by each term of (3) to yield the magnified image as
$$\hat{I}(\mathbf{x}, t) = \sum_{\omega=-\infty}^{\infty} S_\omega(\mathbf{x})\, e^{j\omega \delta_x(t)}\, e^{j\alpha\Delta\phi} = \sum_{\omega=-\infty}^{\infty} S_\omega(\mathbf{x})\, e^{j\omega (1+\alpha)\delta_x(t)} = I(\mathbf{x} + (1+\alpha)\,\delta_x(t)) \tag{5}$$
Therefore, motion-magnified images can be acquired from (5). Practical phase-based video motion processing procedures using PBM are illustrated in Fig. 2. Although not explicitly stated above, $\Delta\phi$ must be subjected to temporal bandpass filtering before being inserted into (5) in order to focus on the motion and eliminate the DC component [10]. Consequently, Fig. 2 depicts the filtered phase difference, $\Delta\tilde{\phi}$, following temporal filtering. Additionally, while the orientation of the steerable filter is not depicted individually in Fig. 2 for clarity of explanation, the actual process of image decomposition must take the orientation into account.
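The magnification step can be checked numerically on a single complex sub-band. The following is a minimal sketch, assuming a one-dimensional band made of one complex sinusoid; the function name `magnify_band` and all numerical values are illustrative assumptions, not part of the original implementation.

```python
import numpy as np

def magnify_band(band_ref, band_cur, alpha):
    """Magnify the motion of one complex sub-band relative to a reference frame."""
    # Phase difference between the current and reference frames, wrapped to (-pi, pi].
    dphi = np.angle(band_cur * np.conj(band_ref))
    # Multiply the current band by exp(j * alpha * dphi) to amplify the phase shift.
    return band_cur * np.exp(1j * alpha * dphi)

# Hypothetical test signal: one complex sinusoid of spatial frequency omega.
omega = 2 * np.pi / 64.0   # rad/pixel (assumed value)
delta = 0.2                # true subpixel shift in pixels (assumed value)
alpha = 10.0               # magnification factor (assumed value)
x = np.arange(256)

band_ref = np.exp(1j * omega * x)             # reference band
band_cur = np.exp(1j * omega * (x + delta))   # band after a shift of delta
band_mag = magnify_band(band_ref, band_cur, alpha)

# The magnified band should coincide with the band shifted by (1 + alpha) * delta.
band_expected = np.exp(1j * omega * (x + (1 + alpha) * delta))
max_err = np.max(np.abs(band_mag - band_expected))
```

In this idealized case the phase difference is constant over the band, so the magnified band matches a shift of (1 + α)δ to within floating-point error. With real images the window W(x) makes the relation approximate, and the temporal bandpass filtering described in the text is applied to the phase difference before magnification.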

B. TEMPORAL FILTERING AND CONVOLUTION
When contemplating frequency band filters to magnify the motion of a particular frequency component, either a finite impulse response (FIR) or an infinite impulse response (IIR) filter might be considered. As shown in Fig. 2 as "Temporal FIR Filtering," located between the phase difference frame stack and the magnification factor, the FIR filter is used due to its implementation simplicity and frame-by-frame processing capability in video processing [18]. The response of the FIR filter has a linear phase and is always stable, making it excellent for video processing, even during online streaming. However, IIR filters have a non-linear phase response and they cannot be guaranteed to operate in a stable manner [18]. In addition, because FIR filters may be designed to behave similarly to IIR filters, even with lower filter orders, this high-speed attribute contributes to the processing time reduction.
FIGURE 2. The general concept of phase-based motion magnification (PBM). 1) In the frequency domain, a CSP with scales and orientations is applied to the image. After decomposition and filtering, the images contain complex values. 2) The phase angle of the decomposed image and the phase angle difference reflecting the motion is calculated. 3) Temporal filtering is applied to the phase difference in order to extract the motion of the specified frequency component. 4) The filtered phase difference is amplified by the magnification factor and then combined with the original frame. 5) After completing the preceding procedures on each CSP-decomposed local image, the reconstructed image exhibits a magnified motion.
Conventional PBM [10] employs a Fourier transform to perform bandpass filtering of an entire set of phase frames in batch processing at once, as shown in Fig. 3. Let b(t) be the FIR filter coefficients in the time domain prepared using the windowing method [19].
Then, temporal filtering of the phase difference under the conventional method is represented as
$$\Delta\tilde{\phi}(t) = \mathcal{F}^{-1}\!\left[\, b(f)\, \Delta\phi(f) \,\right] \tag{6}$$
where $\Delta\tilde{\phi}(t)$ is the filtered phase difference in the time domain. Moreover, $b(f)$ and $\Delta\phi(f)$ are the Fourier transforms of the filter coefficients $b(t)$ and the phase difference $\Delta\phi(t)$, respectively. Since the discrete Fourier transform returns a spectral array with the same length as the input signal array, the transformed FIR filter coefficients have the same size as the filter length. This temporal filtering is a multiplication operation in the frequency domain, as shown in Fig. 3. Since the two arrays must have the same size, the length of the FIR filter must be the same as the length (number of frames) of the image sequence. This frequency-domain multiplication therefore requires a higher-order filter, which increases the computation without improving the filtering performance as the filter order increases. Also, it is impossible to ascertain whether the filtering band frequency and magnification factor are set correctly until all frames have been processed.
The proposed PBM conducts temporal bandpass filtering using a convolution directly in the time domain, rather than via a frequency-domain operation, as denoted by
$$\Delta\tilde{\phi}(t) = b(t) * \Delta\phi(t) \tag{7}$$
where the $*$ (asterisk) represents the time-domain convolution. As illustrated in Fig. 4, in the actual time-domain convolution, the digital filter operation is carried out on the currently buffered phase image frames, whose number matches the filter length of the discrete FIR coefficients. Considering $N = 2n$ as an even-numbered discrete filter order, the FIR filter coefficients of order $N$ are $b_N$, which is also the impulse response of the filter. Then, the filter length of $b_N$ is $N + 1$ and the center frame is at $n$. From (7), the discrete version of the filtered phase difference and the time-domain convolution is described as
$$\Delta\tilde{\phi}(t_{i-n}) = \sum_{k=0}^{N} b_N[k]\, \Delta\phi(t_{i-k}) \tag{8}$$
where $t_i$ represents the $i$th discrete instance of the temporal frame.
As the filter has a group delay of n image frames, which is half of the filter order [19], the result of the convolution corresponds to the phase difference at t i−n , as shown in (8). The discretized time convolution (8) is employed instead of (7) in the actual code implementation of temporal filtering. This filtering approach produces one result per frame at each processing iteration, allowing instantaneous adjustment of the frequency bands and magnification factors. Additionally, the magnified image can be monitored in real time, which is advantageous for online video streaming applications. Unlike in conventional PBM, the filter order in the proposed approach does not depend on the number of frames, so the computation is reduced. As a result, the user can control the order of the filter to balance processing time and image quality.
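The frame-by-frame temporal filtering can be sketched for the phase-difference history of a single pixel. This is a minimal illustration, assuming a 1-2 Hz passband, a synthetic phase signal, and a `deque` as the frame buffer; these choices are assumptions made for the example and are not taken from the original code.

```python
from collections import deque

import numpy as np
from scipy.signal import firwin, lfilter

fs = 30.0            # frame rate in fps
N = 30               # even filter order, so the filter length is N + 1
n = N // 2           # group delay in frames
band = (1.0, 2.0)    # passband in Hz (assumed value)

# Linear-phase bandpass FIR coefficients designed with the window method.
b = firwin(N + 1, band, pass_zero=False, fs=fs)

# Hypothetical phase-difference signal of one pixel: in-band motion plus a DC offset.
t = np.arange(300) / fs
dphi = 0.05 * np.sin(2 * np.pi * 1.5 * t) + 0.3

buffer = deque(maxlen=N + 1)   # keeps only the latest N + 1 samples
streamed = []
for sample in dphi:            # one iteration per incoming frame
    buffer.append(sample)
    if len(buffer) < N + 1:
        continue               # initial transient: the window is not yet full
    window = np.array(list(buffer))             # ordered oldest to newest
    # Discrete convolution sum_k b[k] * dphi[i - k]; the value belongs to frame i - n.
    streamed.append(np.dot(b, window[::-1]))
streamed = np.array(streamed)

# Batch reference: the same FIR filter applied to the whole signal at once.
batch = lfilter(b, [1.0], dphi)[N:]
```

Each iteration touches only N + 1 buffered samples, so the cost per frame is fixed by the filter order rather than by the length of the video, and the streamed values agree with those of the same FIR filter applied in batch once the buffer is full.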

III. IMPLEMENTATION AND EXPERIMENTS
A. IMAGE MAGNIFICATION SOFTWARE
The proposed PBM technique, which enables frame-by-frame magnification processing, was implemented in Python and executed on a laptop PC equipped with a GPU, where it attained a video streaming rate of 30 fps. Motion magnification involves intensive operations on multi-dimensional arrays, so an ordinary CPU-only PC is insufficient. A GPU is required for faster processing, even without optimized Python code. The proposed PBM code was tested on a small laptop equipped with an Intel Core i7 mobile CPU and an NVIDIA GTX1650 GPU to process webcam streaming (640 × 256-pixel resolution at 30 fps) at a matching processing rate (at most 1/30 s per frame). The PBM software was developed in Python (version 3.7) with image processing and temporal FIR filtering implemented using the OpenCV (version 4.5.3) and SciPy (version 1.7) packages. While the NumPy (version 1.21.1) module is the de facto standard for array operations in Python programs, the CuPy (version 9.2 with CUDA 10.1) module was primarily used here for working with arrays on the GPU. NumPy was only used to convert the processed data array back to an image at the end. Fig. 5 shows the pseudo-code of the core routine used to implement the proposed PBM algorithm with the temporal convolution illustrated in Fig. 4. Within the main loop, the program reads a frame from the camera and returns a result when processing is complete; it then obtains the next frame and repeats the calculation, providing real-time processing via frame-by-frame computation.
The following algorithm was used. (1) After filtering the image with the steerable pyramid, the phase angle difference is extracted by subtraction. (2) Direct temporal filtering with convolution is applied to the subtracted phase, which represents the object motion, whereas the conventional method employs a Fourier transform and an inverse Fourier transform in time before and after bandpass filtering. (3) After the filtered motion is multiplied by the magnification factor, it is merged into the existing frame. (4) The images from steps (1) to (3) are re-merged for each level, along with the residual image from the low-pass filter. (5) The magnified motion image is saved. If checking is required, a display of the magnified images, as illustrated in Fig. 6, can be added. While the image is being streamed, steps (1) to (5) can be repeated indefinitely.
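Steps (1) to (5) can be sketched as a loop over one-dimensional synthetic frames. This is a simplified illustration under stated assumptions: a single complex band obtained with `scipy.signal.hilbert` stands in for the full CSP, the frame mean stands in for the low-pass residual, NumPy is used instead of CuPy, and `synthetic_frame` is a hypothetical replacement for the camera read.

```python
from collections import deque

import numpy as np
from scipy.signal import firwin, hilbert

fs, N, alpha = 30.0, 30, 10.0     # frame rate, FIR order, magnification factor
n = N // 2                        # group delay in frames
b = firwin(N + 1, (1.0, 2.0), pass_zero=False, fs=fs)   # assumed 1-2 Hz passband

x = np.arange(256)
omega0 = 2 * np.pi * 8 / 256      # texture with 8 full periods per frame (assumed)

def synthetic_frame(i):
    """Hypothetical 1-D frame: a texture shifted by a 1.5 Hz subpixel motion."""
    d = 0.1 * np.sin(2 * np.pi * 1.5 * i / fs)
    return 0.5 + 0.5 * np.cos(omega0 * (x - d)), d

frames = deque(maxlen=N + 1)      # (complex band, residual) of the latest frames
dphis = deque(maxlen=N + 1)       # phase differences with respect to the first frame
band_ref = None
magnified, true_shift = [], []

for i in range(200):              # stands in for the camera read loop
    frame, d = synthetic_frame(i)
    residual = frame.mean()                    # low-pass residual, not magnified
    band = hilbert(frame - residual)           # (1) one complex band, not a full CSP
    if band_ref is None:
        band_ref = band
    dphi = np.angle(band * np.conj(band_ref))  # (1) phase difference by subtraction
    frames.append((band, residual))
    dphis.append(dphi)
    true_shift.append(d)
    if len(dphis) < N + 1:
        continue                               # wait until the FIR window is full
    stack = np.stack(list(dphis))              # shape (N + 1, width), oldest first
    dphi_f = np.tensordot(b[::-1], stack, axes=1)   # (2) temporal FIR convolution
    center_band, center_res = frames[N - n]    # frame i - n, matching the group delay
    out = center_res + np.real(center_band * np.exp(1j * alpha * dphi_f))  # (3), (4)
    magnified.append(out)                      # (5) save or display the frame
```

Because the loop only reads `alpha` and `b` at each iteration, either could be replaced between frames, which is the property that permits on-the-fly parameter adjustment. In this idealized setting the motion frequency sits at the center of the passband, so the magnified frames reproduce the texture shifted by (1 + α) times the true displacement, delayed by n frames.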

B. EXPERIMENT
To verify the performance of the proposed PBM method, which operates at a video rate, an experiment measuring cantilever beam vibration was designed, and the results were compared with those obtained by batch processing of all frames in the conventional PBM method. The key parameters affecting the processing speed of the proposed real-time method are the size (resolution) of the image and the filter order chosen for temporal convolution, which determines the number of image frames incorporated. For an image size of 640 × 256, the proposed PBM method achieves a processing time of 1/30 s per frame when the FIR filter order is 30. Therefore, we set the temporal FIR filter order to 30 to achieve a video streaming rate of 30 fps. In addition, the total length of the video file was limited to 500 frames for convenience, although the proposed method can be applied to a continuous online video sequence. The conventional PBM was also applied, with a length equal to that of the recorded video. As mentioned in Section II.B, the filter length of the conventional PBM method must equal the number of frames; therefore, the filter order was set to 499 because the filter length is 500. Moreover, the highest measurable frequency for 30-fps video is 15 Hz according to the Nyquist sampling theorem.
As shown in Fig. 7, a thin metal cantilever was mounted on an electromagnetic exciter and a function generator was used to generate an excitation signal at a specific frequency. An experiment was conducted to magnify images captured through a webcam connected to a laptop PC. The actual amplitude and frequency were determined using a reference LDV. The experimental cantilever was made of stainless steel, had a mass of 20.8 g, and was 0.5 mm thick, 15 mm wide, and 400 mm long with a uniform cross-section. The independently measured natural frequencies of the cantilever were 1.53 and 12.2 Hz in the first and second resonance modes, respectively. The frequencies of the first two cantilever modes were both less than 15 Hz, which is the Nyquist limit in the experiment. While the base of the cantilever was excited with a tiny excitation force at the natural frequency of each mode, 640 × 256-pixel images at 30 fps were captured and processed using the conventional and the proposed methods. The benchmark time was recorded for each program step in each method. The vibration displacement was extracted from the image using the centroid tracking method [20] and compared with reference displacement data acquired simultaneously with the LDV.

C. RESULT
We obtained the free-tip displacements of the cantilever from the magnified video using the centroid tracking method [20]. Fig. 8(a) shows the frame stack process for extracting the tip displacement of the cantilever from the magnified image. First, the pixel displacement data were extracted from the original (unmagnified), the conventional PBM, and the proposed PBM videos, respectively, for the case of α = 15, as shown in Fig. 8(b) and (c). Then, the pixel values were calibrated to displacement values. Although the conventional and proposed methods exhibit identical magnified displacement waveforms, as shown in Fig. 8(b) and (c), the proposed method provides the magnification only after a certain number of frames of transient response because of the group delay of the FIR filter design [19]. Indeed, steady-state magnified displacement can be obtained at the video rate after the initial transient delay, even with the proposed method. After an initial delay of as many frames as half the order of the temporal filter, magnification processing is carried out at the frame rate.
In theory, since both conventional and proposed PBMs use a linear magnification mechanism, the displacement is linearly magnified by the magnification factor α. The linear relationship between the magnified root-mean-squared (RMS) displacement amplitudes and magnification factors ranging from α = 1 to 15 is evident in Fig. 9, as is the linear regression with excellent correlation. The value at the amplitude-axis intercept of the regression line in Fig. 9 (where α = 0) can be regarded as the de-scaled value of the magnified amplitude, and it approximates the actual unmagnified amplitude of the vibration. The estimated displacement amplitude differs slightly from the displacement amplitude extracted from the unmagnified image; the amplitude estimated from the magnified image is much closer to the LDV reference amplitude. The calibrated and estimated displacement RMS amplitudes for each case and for the LDV measurement data are listed in Table 1.
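The de-scaling by linear regression can be illustrated with synthetic numbers. The sketch below assumes that the magnified RMS amplitude grows as (1 + α) times the true amplitude, plus measurement noise; the amplitude values, the noise level, and the variable names are hypothetical and are not data from the experiment.

```python
import numpy as np

# Hypothetical RMS amplitudes (in pixels) measured from magnified video.
alphas = np.array([1, 3, 5, 7, 9, 11, 13, 15], dtype=float)
true_amp = 0.12                          # assumed unmagnified RMS amplitude
rng = np.random.default_rng(0)
rms = true_amp * (1 + alphas) + rng.normal(0.0, 0.01, alphas.size)

# Least-squares line: rms = slope * alpha + intercept.
slope, intercept = np.polyfit(alphas, rms, 1)

# Correlation coefficient of the linear relationship.
r = np.corrcoef(alphas, rms)[0, 1]

# The intercept at alpha = 0 serves as the estimate of the unmagnified amplitude.
estimated_amp = intercept
```

Fitting across several magnification factors averages out the error of any single measurement, which is one plausible reason why the intercept can be a better estimate than a direct subpixel measurement from the unmagnified image.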
As shown in Table 1 and Fig. 10, the data obtained from the original unmagnified image, which involves subpixel displacement, are overestimated in comparison with the actual LDV reference data, even after calibration. The calibrated data obtained by rescaling a magnified image yield a more precise result. In Fig. 10, each frequency can also be estimated from the power spectrum of the temporal displacements. The estimated frequency coincides well with the excitation frequency of the corresponding mode; the excitation frequencies are 1.53 and 12.2 Hz.
We set regions of interest at six points along the cantilever, including the base, as shown in the screenshot in Fig. 6, and simultaneously extracted dynamic displacement data at the six points from the original and magnified images using the centroid tracking method. By applying singular value decomposition (SVD) to the extracted data as a proper orthogonal decomposition, the operational deflection shapes, or proper orthogonal modes (POMs), representing the dynamic signature of the structure were obtained. These POMs represent the mode in forced vibration [22]. The forced vibration modes obtained in each case are shown in Fig. 11. The singular mode (i.e., POM) extracted from the magnified image is closer to the normal mode at the resonance frequency than that extracted from the original unmagnified image. Table 2 shows the relative proportion of the singular values of the vibrating cantilever with base excitation obtained using SVD. The first POM has the largest singular value, and the first singular value from the magnified image at each resonance frequency is larger than that from the original (unmagnified) image. This supports using image magnification to extract the vibration mode associated with the corresponding frequency from the vibrating structure. Fig. 12 shows the modal assurance criterion (MAC) [23], which indicates how similar each POM, according to the magnification for each excitation frequency, is to the analytical resonance mode of the corresponding frequency. The first POM under 1.53-Hz excitation (1st resonance) has the largest MAC value with the first eigenmode, whereas the first POM under 12.2-Hz excitation (2nd resonance) has the largest MAC value with the second eigenmode. In particular, vibration mode extraction through SVD in the proposed PBM is as reliable as that in the conventional PBM.
In this case, if the important modes are selected among the measured POMs by calculating the optimal hard threshold [24] for the singular values, the selected modes are Mode 1 or Mode 2, which lie above the dotted line in Table 2. For Mode 3 and higher in Fig. 11, not only is the singular value very small in relative terms, but the POM also differs in shape from the normal mode because of the influence of noise.
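The POM extraction and MAC comparison can be sketched as follows. This is a minimal example on synthetic data, assuming six measurement points and a quadratic stand-in for the mode shape; the helper names `proper_orthogonal_modes` and `mac`, the assumed shape, and the noise level are illustrative and do not come from the original study.

```python
import numpy as np

def proper_orthogonal_modes(displacements):
    """Return POMs (columns) and singular-value proportions of a (points x samples) array."""
    centered = displacements - displacements.mean(axis=1, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u, s / s.sum()

def mac(mode_a, mode_b):
    """Modal assurance criterion between two real mode-shape vectors (0 to 1)."""
    return np.dot(mode_a, mode_b) ** 2 / (np.dot(mode_a, mode_a) * np.dot(mode_b, mode_b))

# Hypothetical data: six points along a beam vibrating mainly in one assumed shape.
rng = np.random.default_rng(1)
positions = np.linspace(0.0, 1.0, 6)
shape = positions ** 2                      # assumed stand-in for a mode shape
t = np.arange(500) / 30.0                   # 500 frames at 30 fps
data = np.outer(shape, np.sin(2 * np.pi * 1.53 * t))
data = data + 0.01 * rng.normal(size=(6, 500))   # additive measurement noise

poms, proportions = proper_orthogonal_modes(data)
mac_first = mac(poms[:, 0], shape)
```

Because the MAC squares the inner product, it is insensitive to the arbitrary sign of a singular vector. In this synthetic case the first POM carries most of the singular-value proportion and has a MAC close to 1 with the assumed shape, while the remaining POMs are dominated by noise.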

IV. DISCUSSION
A. PROCESSING TIME
The processing time of each algorithm routine was measured to evaluate whether the proposed PBM approach can sustain a processing speed of at least 30 fps. Since the conventional method loads the entire set of image frames from a previously saved video file into memory, the processing time was measured at five different process steps (Table 3), ranging from applying the complex steerable pyramid to saving the image frames and excluding the image loading stage. The processing time depends on the hardware, and the results reported in Table 3 are based on the GPU-equipped laptop PC used in the experiment. The proposed method can check the result of every single frame in real time, so the processing time for each iteration was measured and averaged over 500 frames as seconds per frame. Since the processing time of the conventional method can be measured only for the entire set of frames, the total processing time was measured and then divided by the number of frames (500 in this study) to obtain the seconds per frame. Finally, the processing times of both methods were converted into equivalent frames per second (fps) and compared. According to Table 3, the conventional PBM method processed each frame in 39.108 ms, whereas the proposed method processed each frame in 32.697 ms. If the frames-per-second rate reaches 30 fps, the images streamed from the webcam can be processed at a real-time video streaming rate. Therefore, the proposed PBM could achieve real-time performance.
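The conversion from per-frame time to frame rate is simple arithmetic and can be reproduced with the two per-frame times quoted from Table 3. The helper name `fps_from_ms` is an illustrative assumption.

```python
def fps_from_ms(ms_per_frame):
    """Convert an average per-frame processing time in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame

conventional_fps = fps_from_ms(39.108)   # per-frame time of the conventional PBM
proposed_fps = fps_from_ms(32.697)       # per-frame time of the proposed PBM

# Relative speed gain of the proposed method over the conventional one, in percent.
speedup_percent = (proposed_fps / conventional_fps - 1.0) * 100.0

# Streaming-rate criterion for a 30-fps webcam.
meets_streaming_rate = proposed_fps >= 30.0
```

The two rates come out at roughly 25.6 and 30.6 fps, and their ratio corresponds to a gain of a little under 20%.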
The proposed method is about 20% faster than the conventional method because, from temporal filtering through to magnification, it needs only direct temporal convolutions over the number of frames set by the filter order, which takes advantage of frame-by-frame processing. In contrast, the conventional method requires forward and inverse Fourier transforms and performs temporal bandpass filtering in the frequency domain via filter multiplication. The (tensor) multiplication is a computationally intensive operation for data arrays with dimensions of 640 × 256 × 500, which correspond to the image size and total frame count in this experiment. While all transforming and filtering operations are accelerated by the GPU, managing such large amounts of data concurrently may be slower due to computational and hardware constraints.
Although the overall processing time is not significantly different between the two methods, the proposed method provides the essential advantage of frame-by-frame processing and enables real-time magnification of online streaming video at around 30 fps. Additionally, the proposed method allows instantaneous adjustment of magnification parameters, such as the bandpass frequencies and the magnification factor. Moreover, users can specify an FIR filter of lower order, which is less vulnerable to system restrictions. Because the proposed method processes only the fixed number of image frames required by the FIR filter order for convolution, regardless of the total number of frames, it is efficient in terms of image memory access in the program. In the conventional method, on the other hand, the FIR filter order must be specified in terms of the total number of frames. Therefore, the proposed magnification approach uses resources efficiently and processes frames faster with acceptable image quality and data extraction accuracy. Also, as shown in Fig. 8, the proposed method exhibits an initial delay of as many image frames as the order of the applied FIR filter. Therefore, complete magnification begins only after this initial period of incomplete magnification (delay).

B. IMAGE QUALITY ASSESSMENT
The mean squared error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM) were used to determine the degree of distortion relative to the original image [21]. MSE is a fundamental metric that calculates the average squared difference between all pixel intensity values. PSNR is the decibel-scale ratio of the maximum possible power of an image to the noise power, calculated using MSE. When the MSE equals zero, PSNR approaches infinity and, as MSE increases, PSNR decreases in value. SSIM is a metric that quantifies the structural similarity of an object rather than the pixel-wise error and evaluates the quality on the basis of image similarity [25]. Both conventional and proposed PBMs exhibit a distorted object shape when α is increased excessively, which affects subsequent operations. Visual noise becomes discernible only when the cantilever image is magnified at around α = 10. Therefore, the results of both PBM methods at α = 10 were compared with the original video.
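The MSE and PSNR definitions can be written out directly. The sketch below assumes 8-bit grayscale frames, so the maximum pixel value is 255; the random test images are hypothetical and only mimic the 640 × 256 frame size of the experiment.

```python
import numpy as np

def mse(reference, test):
    """Mean squared error between two images of equal shape."""
    diff = reference.astype(np.float64) - test.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(reference, test)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / err))

# Hypothetical 8-bit frames: a random image and a copy with small integer noise.
rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(256, 640), dtype=np.uint8)
noise = rng.integers(-3, 4, size=original.shape)
noisy = np.clip(original.astype(np.int16) + noise, 0, 255).astype(np.uint8)

mse_value = mse(original, noisy)
psnr_value = psnr(original, noisy)
```

Casting to floating point before subtraction avoids unsigned-integer wraparound, which would otherwise inflate the error. SSIM is more involved and is left out of this sketch; an existing implementation such as `skimage.metrics.structural_similarity` from scikit-image could be used instead.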
As can be seen, there is no noticeable difference in image quality between the two methods, and the images obtained using the proposed method are as acceptable as those produced using the conventional magnification method. Nevertheless, as shown in Fig. 13, the proposed magnification method is slightly superior at the first-mode frequency, while the conventional method is slightly superior in the MSE and PSNR values at the second-mode frequency. As a result, it is difficult to assign consistent superiority in image quality metrics to either magnification technique. Fig. 14 shows the PSNR score for each method at the second-mode frequency as a function of α. The proposed method tends to be better at lower α values (i.e., α < 6), but as α increases, the quality score of the conventional magnification method becomes higher. Therefore, the appropriate α value for better image quality varies from case to case. In particular, when both PSNRs are greater than 30 dB, the image quality of the two methods is comparable at the human-perceived level [26].

C. DATA ACCURACY AND APPLICATION
Both magnification approaches estimated a displacement amplitude closer to the reference LDV value than the amplitude extracted without magnification. In Table 1, the proposed method shows slightly smaller displacements than conventional magnification. Fig. 9 shows the same tendency as Table 1: for the same value of α, the conventional method tends to magnify more, whereas the proposed method tends to magnify less. In particular, for the low-frequency first mode (Fig. 10(a)), the displacement without magnification (green) is considerably overestimated and more noticeably undersampled than the scaled displacement data extracted from the magnified images (blue and red) and the reference LDV displacement (black dotted). This overestimation arises from geometric discretization at the pixel resolution when the displacement in the vibration image is converted into pixel intensity values. At the second-mode frequency (Fig. 10(b)), the amplitude difference of displacements extracted from magnified and unmagnified images is insignificant, except for the time sampling effect, as they are at the first-mode frequency. As illustrated in Fig. 9, the RMS response in the second mode was set to be less than that in the first mode in the experiment, but by chance the actual amplitude matches the unit pixel level of the image, which limits the kind of overestimation seen in the first mode. Otherwise, the displacement might not have been extractable at the pixel resolution of the image.
By integrating the proposed magnification technique with the current system specifications in this study, real-time processing at a 30-fps rate is conceivable. As a result, real-time monitoring of structures and machinery will be possible by acquiring monitoring video streams and magnifying subtle motions and vibrations in real-time. The primary issue with vibration measurement via images is the difficulty in accurately quantifying tiny vibrations at the subpixel level. As a result, magnification methods, particularly PBM, are beneficial. The vibration can be observed and measured using a conventional magnification algorithm. Nonetheless, it is challenging to apply motion magnification to vibration monitoring by post-processing image files. However, qualitative real-time and online vibration monitoring might be easily achieved by applying the proposed frame-by-frame magnification technique.

V. CONCLUSION
The purpose of this study was to offer a novel approach for effectively implementing a PBM algorithm that leverages direct time-domain convolution at the temporal filtering stage. The developed real-time PBM algorithm achieved a frame rate of up to 30 fps in dynamic displacement measurements for online vibration monitoring using a nominal webcam. To verify the performance of the proposed PBM, the magnification processing time, the magnified image quality, and the vibration displacement extraction for a simple cantilever structure vibrating at less than 15 Hz were compared with those of the conventional PBM. Because the displacements are scaled by the magnification factor, the vibration displacements from the proposed method were estimated using linear regression, and the accuracy of the estimated displacements was within the permissible error bound. Finally, the vibration characteristics were compared with the frequency spectrum of the extracted displacement.
The following advantages of the proposed video streaming-rate PBM can be summarized on the basis of the discussion:
1) While conventional batch processing works at a speed of 25.6 fps in a typical laptop PC environment, the proposed frame-by-frame processing technique achieves a faster speed of 30.6 fps, which is a 19.5% improvement.
2) The proposed PBM has the advantage that the temporal FIR filter order is independent of the number of image frames. A low-order filter allows fast processing without significantly affecting the overall image quality or the accuracy of the estimated vibration displacement.
3) Since the proposed PBM uses frame-by-frame processing, instant adjustment of magnification parameters is always available, even while the magnification is being processed. This enables implementation of online vibration monitoring.