Direct Imaging of Stabilized Optical Flow and Possible Anomalies From a Moving Vehicle

Machine perception of dynamic scenes is becoming increasingly important for autonomous vehicles and vision-based driver-assistance systems. Even with other 3D ranging devices, dense, detailed and instantaneous detection of optical flow is essential for distinguishing small moving objects in the 3D environment from the moving vehicle at an early stage. To overcome the limited immediacy, resolution, accuracy and acuity of existing methods, we provide an optical flow detection scheme based on a three-phase correlation image sensor (3PCIS), which is capable of Fourier-coefficient imaging, combined with an exact and direct algorithm derived from the weighted integral method of identifying a differential equation model from a short-duration observation. To exploit the inherent performance of this detection scheme by removing the large and rapid disturbances induced by the rotational fluctuations of the platform, we introduce a software gaze operation in which the image coordinates are fixed on, and smoothly pursue, a stable forward object so that the optical flow field becomes relative to the moving coordinate system. In it, the gaze subsystem continuously provides the angular velocity and pose between the camera and the gaze target, while the imaging subsystem instantaneously obtains two optical flow distributions by cancelling the ego-rotation components and then removing the outwardly diverging components derived mainly from the stationary 3D environment. Possible anomalies captured in each frame instantaneously provide candidates for hazardous objects that should be tracked and investigated further. We examine the performance of optical flow stabilization and anomaly detection using image sequences from a monocular 3PCIS mounted on a vehicle moving on town roads and a highway.


I. INTRODUCTION
THE early detection of harmful traffic situations is an important subject for autonomous vehicles and vision-based driver-assistance systems navigating around other moving vehicles and humans. Future vehicles must be able to perceive their environment reliably from their own visual input, just like a careful human driver with keen senses. In the DARPA Grand/Urban Challenges for practical autonomous driving technologies [1], [2], LiDAR was the main device for capturing 3D traffic environments [3]. However, its resolution is too low to detect early signs of hazards for planning an emergency reaction to avoid them [4], [5]. The role of vision sensors remains important in the dense and rapid detection of anomalous conditions such as a ball and children jumping into a road, a motorcycle approaching an intersection without slowing down, and a vehicle changing lanes without considering its surroundings. Toward this aim, dense optical flow (OF) is particularly advantageous since it not only instantaneously and widely captures the environment and objects, but also provides insights into their geometric layout that can easily be fused with LiDAR or millimeter-wave radar. However, under the unknown ego-motion of the advancing platform, detecting small moving objects in cluttered backgrounds becomes considerably difficult [6].
The methods for OF detection [7], [8] comprise 1) a description of brightness constancy during motion, 2) local or global modeling of the velocity field, and 3) an optimized solution with regularization; all three have long been studied. The techniques for 2) and 3) are mostly general frameworks, but those for 1) are highly specific to optical images. The most established one is the optical flow partial differential equation (OFPDE), which strictly describes the spatiotemporal variation of image intensity over the velocity field. However, in the differential methods (DMs) [9], [10], the OFPDE is followed immediately by a crude approximation of the temporal derivative using the consecutive-frame difference and thus suffers from a fundamental limitation in velocity range and accuracy. Also, an OFPDE can constrain only the normal component of the flow. Thus, many studies have aimed at regularizing these problems, for example with locally smooth [10] or globally coherent flows [11], multiscale analysis [12], and extended constancy assumptions [13]. Another category of 1) is the multiframe correspondence of sparse features in an image sequence [14], [15]. Although it provides reliable long-term results, an obvious disadvantage is that small components without distinctive features can be ignored. To implement an on-vehicle early-warning vision under ego-motion, two different roles of the OF must be coupled and enhanced together. Tracking determines both the velocities and loci of featured points, which enable accurate recovery of the ego-motion and the related OF. Then, the OFPDE, with proper mathematical treatment, provides a stable, detailed and instantaneous description of the flow field.
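For concreteness, the contrast can be written out; the following is a standard-form sketch (the paper's own notation is introduced in Section III):

```latex
% Brightness constancy along the flow v(r) = (v_x, v_y): the OFPDE.
\nabla f(\mathbf{r},t)\cdot\mathbf{v}(\mathbf{r})
  + \frac{\partial f(\mathbf{r},t)}{\partial t} = 0 .
% The DMs immediately approximate the temporal derivative by a
% consecutive-frame difference over the frame interval T:
\frac{\partial f}{\partial t}
  \;\approx\; \frac{f(\mathbf{r},t+T)-f(\mathbf{r},t)}{T},
% which holds only while the displacement |v|T stays small relative to
% the spatial scale of f -- the velocity-range limitation noted above.
```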
Recently, a novel framework for these studies has been obtained as a combination of a mathematical technique, the weighted integral method (WIM), and an imaging device, the three-phase correlation image sensor (3PCIS) [16]. The WIM provides an exact algebraic equation (AE) for determining the coefficients of a differential equation from weighted integral measurements of its variables [17], [18], [19], [20], [21]. For the OFPDE, an exact closed-form solution of the velocity is obtained from short-time Fourier coefficients (FCs) of the time-varying intensity [22], [23] captured directly by the 3PCIS. Various types of 3PCIS have been fabricated [24], [25] and improved for an accurate algebraic solution of the OF [26].
The purpose and main contribution of this paper are to present an on-vehicle OF detection scheme with extremely low latency and extended spatiotemporal resolution, free of the accuracy problems and ill-conditioning of traditional methods. The most severe impediments to practicable performance are the rapidly time-varying OF components induced by the rotational fluctuations of the platform. To remove them, we introduce a software gaze operation driven by the instantaneous image motion, in which the image coordinate system is fixed on a stable stationary object and smoothly pursues it, so that the OF field is relative to the moving coordinate system. Owing to the extended performance of the WIM and 3PCIS, both the image coordinates and the relative OF field are stabilized up to their time-differential order, such that only the translational ego-motion components and the flows due to moving objects (if they exist) remain.
The method we propose is for real-time, direct imaging of anomalous objects as inputs to succeeding high-level analysis; thus, it is restricted to an early, pre-attentive (near-sensor) level. The procedure is based solely on mathematical models, dominant statistics and algebraic operations. Also, it is designed to permit feedback from the cognitive level in the integration. In Section II, related studies are reviewed. In Section III, the WIM for ego-motion OF is described. Then, in Section IV, the stable visual coordinate (SVC) and gazeflow are introduced. In Section V, we describe the algorithms in detail, while in Section VI, we examine them using real 3PCIS image data captured from a vehicle driven on busy town roads and on a highway.

II. RELATED WORK

A. Ego-Motion and Moving-Object Detection in 3D Scenes
The ego-motion-induced OF in a static 3D scene [27] was implemented with the DM for detecting rotation-only or translation-only ego-motion [28], for the focus of expansion (FOE) on a rigid object [11], and for robust 3D reconstruction [29]. The problems encountered when moving objects are present have been studied as structure from motion in a multibody environment. For separating static scenes and rigidly moving objects, most methods are based on the random sample consensus (RANSAC) framework and require a significant number of iterations [30]. Further considerations include the use of the factorization method [14], a piecewise planar model of a 3D scene [15], and probabilistic reasoning including occlusion [31]. Visual odometry and simultaneous localization and mapping (SLAM) [33] provide a reliable framework for visual navigation. Applications to vehicles include the estimation of ego-motion using the road surface [34], the detection of the position/velocity of other vehicles [35], and the inference of road layout for autonomous driving [36]. Additional considerations include tight coupling with multiperson detection [37], the use of a vehicle kinematics constraint [38], robust circular matching in tracking and stereo [39], and a context-aware motion descriptor using oriented histograms of OF [40]. However, in these methods, the instantaneous detection of small moving objects for traffic safety is not a primary concern.

B. Motion Anomaly Detection From a Moving Vehicle
For vehicle safety, a child, bicycle, or ball approaching the road represents a target of interest that requires an emergency reaction. The difference between ego-motion-compensated frames of stereo cameras [41], [42], [43] or of a monocular camera [44], [45] provides an instantaneous distribution of possible anomalies. The frame difference, however, includes various noises, such as occluding and/or occluded edges and misaligned textured regions. Reliable operations for rejecting them include the continuous tracking and detection described in the previous section, as well as the semantic inference of moving targets [46], detection and tracking via a composite object description [47], fusing OF orientation and magnitude for robust obstacle detection [48], the use of a learning framework for robustness [49], and maintaining occupancy probabilities in voxels using a 3D point cloud and odometry [50]; these, however, often reduce the rapidity of anomaly detection and the sensitivity to small objects. Real-time approaches with binocular stereo and other sensors include pedestrian detection from stereo depth histograms [51], moving-object detection using FOE clustering and inertial sensors [52], and obstacle detection by fusing stereo and radar [53]. Above all, the direct, single-frame detection of ego-motion-compensated OF with extended reliability is desirable for removing erroneous responses and solving the spatiotemporal resolution problems simultaneously.

C. Studies Inspired by Human Vision
The roles of gaze or fixation in human and machine vision have been indicated in various studies on the following topics: recovering ego-motion [54], geometrical reasoning of object shape [55], and facilitating obstacle avoidance and target following [56]. The ultrashort latencies of the conjugate eye responses to ego-motion and of pursuing a moving object are shown to be enabled by feedforward and feedback mechanisms based on OF [57]. An active vision system using a pan/tilt platform [58] is a hardware realization of gaze. In contrast, a software realization is less costly, using a coordinate transform and geometrical correction on the image whenever the gaze target is present in the image area. Other studies include foveated vision with space-variant resolution [59], an event camera with time-resolved sensitivity [60], and a small-moving-object detector inspired by the fly visual system [61].

III. WEIGHTED INTEGRAL METHOD FOR EGO-MOTION OPTICAL FLOW

A. Optical Flow Differential Equation for Ego-Motion
Let r = (x, y) and (X, Y, Z) be the image and camera coordinates fixed on the vehicle (see Fig. 1). Let the rotation and translation motions of the camera be Ω = (Ω_x, Ω_y, Ω_z) and T = (T_x, T_y, T_z), respectively. Then, for stationary object surfaces free from occlusions, the OF field from the moving camera is expressed as [27], [29]

\[ \mathbf{v}(\mathbf{r}) = B(\mathbf{r})\,\boldsymbol{\Omega} + \frac{A(\mathbf{r})\,\mathbf{T}}{Z(\mathbf{r})}, \tag{1} \]

where Z(r) is the distance to the surface being imaged at r.

The 2 × 3 matrices B(r) and A(r) are expressed as

\[ B(\mathbf{r}) = \begin{pmatrix} xy/f_c & -(f_c + x^2/f_c) & y \\ f_c + y^2/f_c & -xy/f_c & -x \end{pmatrix}, \tag{2} \]

\[ A(\mathbf{r}) = \begin{pmatrix} -f_c & 0 & x \\ 0 & -f_c & y \end{pmatrix}, \tag{3} \]

where f_c is the focal length of the camera. By using Eq. (1), the OFPDE induced by ego-motion is expressed as

\[ \nabla f(\mathbf{r}, t)\cdot\left( B(\mathbf{r})\,\boldsymbol{\Omega} + \frac{A(\mathbf{r})\,\mathbf{T}}{Z(\mathbf{r})} \right) + \frac{\partial f(\mathbf{r}, t)}{\partial t} = 0, \tag{4} \]

where f(r, t) is the light intensity on the image plane.

B. Exact Short-Time Integral Form of OFPDE
Assume that Eq. (4) is satisfied in the frame interval [0, T] of the camera and that the temporal changes of v(r) within the frame are negligible in comparison with those of f(r, t). Then, the OFPDE in [0, T] is identically expressed by the optical flow algebraic equations (OFAEs; see [26] for the derivation)

\[ \nabla g_n(\mathbf{r})\cdot\mathbf{v}(\mathbf{r}) + \frac{f(\mathbf{r},T) - f(\mathbf{r},0)}{T} + jn\omega\, g_n(\mathbf{r}) = 0, \tag{5} \]

where ω ≡ 2π/T is the unit frequency, and

\[ g_n(\mathbf{r}) \equiv \frac{1}{T}\int_0^T f(\mathbf{r},t)\, e^{-jn\omega t}\, dt \tag{6} \]

is the nth-order Fourier coefficient in [0, T], which can be directly captured by the 3PCIS (n = 0 is the intensity image). Using n = 0 and n = 1 to eliminate the boundary term and substituting Eq. (1), the OFAE is expressed as

\[ \nabla\left( g_0(\mathbf{r}) - g_1(\mathbf{r}) \right)\cdot\left( B(\mathbf{r})\,\boldsymbol{\Omega} + \frac{A(\mathbf{r})\,\mathbf{T}}{Z(\mathbf{r})} \right) = j\omega\, g_1(\mathbf{r}). \tag{7} \]

For the whole image, the system of Eq. (7) involves the six unknowns of Ω and T plus the pixelwise unknowns Z(r).
At the same time, Eq. (7) is complex, providing two equations pixelwise. Therefore, in the WIM, all unknowns are essentially solvable except for a common scale of T and Z(r) if the 3D environment is stationary (no moving objects). This is not the case for the DMs using Eq. (4), which provides only one real equation pixelwise.
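A sketch of how the OFAE follows from Eq. (4) (the full derivation is given in [26]; this outline assumes only the definitions of Eqs. (5) and (6)): multiply Eq. (4) by the weight e^{−jnωt}, integrate over [0, T], and integrate the temporal term by parts.

```latex
\frac{1}{T}\int_0^T\!\left(\nabla f\cdot\mathbf{v}+f_t\right)e^{-jn\omega t}\,dt
 = \nabla g_n\cdot\mathbf{v}
 + \frac{f(\mathbf{r},T)-f(\mathbf{r},0)}{T}
 + jn\omega\,g_n = 0 .
% (e^{-jn\omega T} = 1 for integer n.)  The boundary term is identical
% for every n, so subtracting the n = 0 equation from the n = 1 equation
% eliminates it:
\nabla\!\left(g_0-g_1\right)\cdot\mathbf{v} = j\omega\,g_1 .
% Substituting Eq. (1) for v then yields the ego-motion OFAE of Eq. (7).
```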
The differences in mathematical background and OF-detection performance between the WIM using the OFAE and a 3PCIS, on the one hand, and the DM using the numerical approximation of the OFPDE and a conventional image sensor, on the other, are summarized in Table I. For experimental comparisons between them and for details of the 3PCIS, see Section VI-A and [26].

IV. STABLE VISUAL COORDINATE AND GAZEFLOW

A. Definitions
Gaze, as defined in this paper, is the operation of keeping the view center and horizontal axis of the image plane at the image of a steady object ahead (the gaze target) and at the horizontal line in the environment, respectively. For the geometry, see Figs. 1 and 5 and their captions. The view center and horizontal axis make up a moving coordinate system in the image plane. Stability of the OF is therefore achieved in both the egocentric and object-centric senses, because the origin is at the moving vehicle while the angles of the axes are determined in relation to the gaze target and the horizon fixed in the world coordinates. A stable visual coordinate (SVC) is the frame determined by the line of sight connecting the camera origin and the point on the gaze target (the gaze center) as the Z axis, and the moving coordinates in the image plane as the x and y axes. Gazeflow is the OF defined in the SVC, and gaze is the operation that acquires the SVC. At the gaze center, the gazeflow must always be zero, since the SVC continuously and smoothly pursues the gaze target. The gazeflow describes a detailed motion distribution relative to the gaze center. This is an important difference between the gazeflow and an OF stabilized by a gyro sensor (see Fig. 8 in Section V for the difference between the gaze parameters and the gyro-sensor outputs).

B. Gazeflow: Optical Flow Under Gaze Operation
Let the position of the gaze center be r_0 = (x_0, y_0), and let the distance to the gaze target be Z(r_0) ≡ Z_G. The OF at r_0 is expressed as

\[ \mathbf{v}_0 \equiv \mathbf{v}(\mathbf{r}_0) = B(\mathbf{r}_0)\,\boldsymbol{\Omega} + \frac{A(\mathbf{r}_0)\,\mathbf{T}}{Z_G}. \tag{8} \]

We also express r_0 and v_0 as smooth temporal functions with v_0(t) = ṙ_0(t) for the inter-frame Kalman filtering. The method of obtaining r_0, v_0 and Ω_z is described in Section V. Now, let us consider a velocity distribution relative to the gaze center,

\[ \mathbf{v}_G(\mathbf{r}) \equiv \mathbf{v}(\mathbf{r}) - \mathbf{v}_0 - \Omega_z\,(\mathbf{r} - \mathbf{r}_0)^{\perp}, \tag{9} \]

where ⊥ indicates the π/2 rotation of the vector, and in which the rotational motion term is expressed as

\[ B(\mathbf{r})\,\boldsymbol{\Omega} = B(\mathbf{r}_0)\,\boldsymbol{\Omega} + \Omega_z\,(\mathbf{r} - \mathbf{r}_0)^{\perp} + (\text{second-order small term}). \tag{10} \]

For the second-order small term on the right-hand side, let us introduce the approximations that the gaze center is sufficiently near the optical center, i.e., |r_0| ≪ f_c, and that the gaze target is sufficiently distant, so that |(T_x, T_y)|/Z_G ≪ |(Ω_x, Ω_y)|. Then, the small term is expressed as

\[ \frac{\mathbf{r}^{\mathsf{T}}\mathbf{v}_0}{f_c^2}\,\mathbf{r} \tag{11} \]

using the gaze-center velocity v_0. Therefore, by subtracting from v(r) the velocity distribution determined by r_0, v_0 and Ω_z, the gazeflow distribution is expressed as

\[ \mathbf{v}_G(\mathbf{r}) = f_c\left( \frac{1}{Z_G} - \frac{1}{Z(\mathbf{r})} \right) \begin{pmatrix} T_x - x_0 T_z/f_c \\ T_y - y_0 T_z/f_c \end{pmatrix} + \frac{T_z}{Z(\mathbf{r})}\,(\mathbf{r} - \mathbf{r}_0). \tag{12} \]

The gazeflow v_G(r) is the sum of a unidirectional vector field (the first term) and a diverging vector field (the second term). The first term vanishes where the distance is near that of the gaze target. In this respect, gaze is a procedure that simplifies the description of the OF field by excluding the effects of Ω everywhere and those of T_x and T_y near the distance of the gaze target.
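The two-term structure of Eq. (12) can be traced in one line; a sketch under the stated approximations (the roll term of Eq. (9) cancels the first-order rotational part, and A(r)T = A(r_0)T + (r − r_0)T_z):

```latex
\mathbf{v}_G(\mathbf{r})
 \approx \frac{A(\mathbf{r})\,\mathbf{T}}{Z(\mathbf{r})}
        -\frac{A(\mathbf{r}_0)\,\mathbf{T}}{Z_G}
 = A(\mathbf{r}_0)\,\mathbf{T}\left(\frac{1}{Z(\mathbf{r})}-\frac{1}{Z_G}\right)
  +\frac{T_z}{Z(\mathbf{r})}\,(\mathbf{r}-\mathbf{r}_0) .
% With A(r_0)T = -f_c (T_x - x_0 T_z/f_c, T_y - y_0 T_z/f_c)^T from
% Eq. (3), the first term becomes the unidirectional field of Eq. (12),
% vanishing at Z(r) = Z_G; the second term is the diverging field.
```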
To understand the terms (x_0, y_0)T_z/f_c subtracted from (T_x, T_y) in Eq. (12), see Fig. 2. Let the direction of the line of sight (the Z axis) be (θ_x, θ_y) and the velocity toward it be T_z. Then, if |r_0| ≪ f_c, the velocity components induced by T_z along the x and y axes are

\[ T_x = \theta_x T_z \approx \frac{x_0 T_z}{f_c}, \qquad T_y = \theta_y T_z \approx \frac{y_0 T_z}{f_c}, \tag{13} \]

which are equal to the subtracted terms. This means that the components of T_x and T_y induced by the tilt of the optical axis from the line of sight have been removed in the gazeflow.

Fig. 2. Translational ego-motion components T_x and T_y induced by the forward motion T_z and the pose rotation (the side view for T_y is shown). The gaze operation is also capable of cancelling these components from the OF.

Fig. 3. Residual influence of T_x and T_y in the gazeflow as a function of the distance Z. The function value is proportional to the influence of T_x and T_y in the gazeflow, and it is zero when Z = Z_G. For finite Z_G, the influence is smaller for Z < 2Z_G than when Z_G = ∞. It is bounded within the widest range of Z (e.g., the red zone) when Z_G is the harmonic mean of the terminal values.

C. Desired Conditions of the Gaze Target
The gaze target should be, for reliable OF detection, richly patterned, occlusion-free, and stationary during the vehicle's advance; thus, it lies in the forward direction. Regarding its distance, Fig. 3 shows graphs of the residual influence of T_x and T_y in the gazeflow. By choosing a finite gaze-target distance Z_G, the influence of T_x and T_y becomes far smaller than in the case of Z_G = ∞. Generally, the influence is bounded equally at the extremes of a distance range Z_min < Z < Z_max when Z_G = 2/(1/Z_min + 1/Z_max), the harmonic mean; for example, if Z_min = 10 m and Z_max = ∞, the choice Z_G = 20 m bounds the residual equally at both extremes. If Z_max = ∞ on the far side, then Z_min = Z_G/2 on the near side. In practice, the distance is unknown; hence, an appropriate algorithm is designed that chooses the gaze target as a distant object with a large area (see Section V).
On the relationship to the ground plane, let the heights of the target and camera be H_G and H_c, respectively. Then, the image height in the SVC of the ground plane at distance Z(r) is expressed as

\[ y(Z) = -\frac{f_c H_c}{Z} + \frac{f_c (H_c - H_G)}{Z_G}, \tag{14} \]

and if H_c = H_G, the second term, which includes Z_G (decreasing with locomotion), vanishes. Therefore, by adopting a gaze target at the same height as the camera, the ground-plane image has a simple and stationary distance distribution. This is the case even after changing the gaze target.

V. PRACTICAL MATTERS AND ALGORITHMS
The algorithms for determining the gazeflow comprise 1) the pursuing operation of the gaze target to determine the origin and velocity (r_0, v_0, Ω_z) of the SVC, and 2) the detection of the gazeflow as an instantaneous OF field free from the above-determined ego-motion components. Note that the strategies of 1) and 2) should be contrasting: 1) requires reliable steady-object selection, smoothness, accuracy and long-time consistency, since it represents the ego-motion of the platform, while 2) requires immediacy and spatiotemporal resolution as an early-warning mechanism in a possibly harmful environment. The algorithms are integrated as shown in Fig. 4.

A. Placement of Gaze Area and Velocity Estimate
The gaze rectangle is, as shown in Fig. 5, an extended region (a 128 × 72-pixel area in the 704 × 512-pixel image) attached to the gaze center for velocity estimation and image matching. The determination of the gaze center is activated initially, when the offset of the gaze center from the optical axis exceeds a threshold, or when tracking fails. The center is placed at a local maximum of the image variance near the optical center. Across a saccadic change of the gaze center, the gaze parameters v_0, Ω_z and θ_z are transferred, and r_0 is shifted accordingly.
The gaze area is a subset of the gaze rectangle consisting solely of the target image and is determined based on the clustering of the OF. First, the velocity distribution is obtained using the local least-squares method expressed as

\[ \mathbf{v}(\mathbf{r}) = \arg\min_{\mathbf{v}} \sum_{\mathbf{r}' \in \mathcal{N}(\mathbf{r})} \left| \nabla\left( g_0(\mathbf{r}') - g_1(\mathbf{r}') \right)\cdot\mathbf{v} - j\omega\, g_1(\mathbf{r}') \right|^2, \tag{15} \]

where 𝒩(r) is a small area (e.g., 3 × 3 pixels) around r. The corresponding normal equation is

\[ \left( \sum_{\mathbf{r}'} \Re\{ \nabla\hat{g}^{*}\, \nabla\hat{g}^{\mathsf{T}} \} \right) \mathbf{v}(\mathbf{r}) = \omega \sum_{\mathbf{r}'} \Im\{ g_1^{*}\, \nabla\hat{g} \}, \qquad \hat{g} \equiv g_0 - g_1, \tag{16} \]

where ℜ{·} and ℑ{·} are the real and imaginary parts, respectively. From it, a 2D histogram of v(r) is generated in the gaze rectangle. While weighting the similarity with the previous velocity v_0(t − T), the peak is selected as v_0(t). The variance of the scatter around the peak is used to extract the gaze area via the Mahalanobis distance of v(r) from v_0(t). By using the peak velocity as v_0(t), the largest object with a uniform OF is chosen as the gaze target; objects with spatially varying OF, such as the road surface, are excluded. The selection of the rear face of a preceding vehicle is usually not excluded, since it has little effect on visual stability. Also, a large crossing object near the path can be gazed at; for a moment, the gazeflow becomes relative to that object and describes its detailed motion (see Fig. 14(d) for an example). Such gaze targets are reset whenever the conditions become unacceptable for stable tracking.
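A minimal sketch of this step in Python/NumPy, assuming per-pixel Fourier-coefficient images g0 (real) and g1 (complex) from the sensor; the function names, window size, and histogram binning are illustrative, not the authors' implementation:

```python
import numpy as np
from scipy.signal import convolve2d

def local_ls_flow(g0, g1, omega, win=3):
    """Pixelwise flow from the local least squares of Eqs. (15)-(16):
    minimize sum |grad(g0 - g1) . v - j*omega*g1|^2 over a small window."""
    gh = g0 - g1                          # complex image "g-hat" = g0 - g1
    gy, gx = np.gradient(gh)              # spatial gradients (axis 0 = y)
    b = 1j * omega * g1                   # right-hand side of the OFAE
    box = np.ones((win, win))
    s = lambda a: convolve2d(a, box, mode='same')   # window sums
    # Entries of the 2x2 normal equation, summed over the window:
    axx = s((np.conj(gx) * gx).real)
    axy = s((np.conj(gx) * gy).real)
    ayy = s((np.conj(gy) * gy).real)
    bx = s((np.conj(gx) * b).real)
    by = s((np.conj(gy) * b).real)
    det = axx * ayy - axy ** 2 + 1e-12    # guard against singular windows
    return (ayy * bx - axy * by) / det, (axx * by - axy * bx) / det

def histogram_peak(vx, vy, bins=64, vmax=8.0):
    """Dominant velocity in a region as the 2D-histogram peak
    (used for the gaze-center velocity v0 in this subsection)."""
    h, xe, ye = np.histogram2d(vx.ravel(), vy.ravel(),
                               bins=bins, range=[[-vmax, vmax]] * 2)
    i, j = np.unravel_index(np.argmax(h), h.shape)
    return 0.5 * (xe[i] + xe[i + 1]), 0.5 * (ye[j] + ye[j + 1])
```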

B. Tracking of the Gaze Area With Enhanced Resolution
Images from a moving vehicle suffer from motion blur, which reduces the matching accuracy. With the 3PCIS, the blur can be reduced by combining the intensity and correlation images. A truncated Fourier-series expansion of the time function f(r, t) of the incident light with g_{−1}(r) = g_1^*(r), g_0(r) and g_1(r) is expressed as

\[ f(\mathbf{r}, t) \approx g_0(\mathbf{r}) + 2\,\Re\{ g_1(\mathbf{r})\, e^{j\omega t} \} \quad (0 < t < T), \tag{17} \]

which leads to the central-time image (CTI)

\[ f^{m}(\mathbf{r}, T/2) = g_0^{m}(\mathbf{r}) - 2\,\Re\{ g_1^{m}(\mathbf{r}) \}, \tag{18} \]

where the superscript m is the frame count. The first spectral zero due to the motion blur is removed [22], [23], and the spatial bandwidth of the CTI is increased to about three times that of g_0(r). Fig. 6 shows an example of motion-blur reduction. Resolution enhancement in the motion direction (mostly due to pitch and yaw) is achieved without false effects such as ringing at edges. In the following matching operation, the CTIs are used to increase the selectivity and accuracy.
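Since Eq. (17) approximates the time function within the frame, evaluating it at the frame-central time t = T/2 (where e^{jωT/2} = −1) gives the CTI directly; a one-line NumPy sketch:

```python
import numpy as np

def central_time_image(g0, g1):
    """CTI of Eq. (18): evaluate the truncated series of Eq. (17) at the
    frame-central time t = T/2, where e^{j*omega*T/2} = -1, giving
    f(r, T/2) = g0(r) - 2*Re{g1(r)}; sharper than the frame-integrated g0."""
    return g0 - 2.0 * np.real(g1)
```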
Even in the gaze area, different appearances of the gaze target can occur between frames. Thus, the matching criterion is the maximization of a sum of similarities that is tolerant of the inclusion of unmatched pixels, defined in Eq. (19), where σ specifies an acceptable deviation for the similarities.
Since the gaze area is small, the expansion and rotation of the previous image in it are ignorable. The initial position of the search is predicted from the gaze position and velocity of the previous frame; the subpixel offset is then determined by the differential method.
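The body of Eq. (19) did not survive extraction here; the following sketch shows one plausible instantiation of such a criterion, a Gaussian similarity sum in which pixels deviating by much more than σ contribute almost nothing instead of a large squared penalty (the names and the exact kernel are assumptions, not the paper's definition):

```python
import numpy as np

def similarity_sum(patch, template, sigma):
    """Sum of per-pixel similarities in (0, 1]: pixels deviating by much
    more than sigma contribute ~0 rather than a large squared penalty,
    so occluded or specular (unmatched) pixels are tolerated."""
    d = patch.astype(float) - template.astype(float)
    return np.exp(-d * d / (2.0 * sigma * sigma)).sum()

def match_gaze_area(cti, template, center, radius, sigma):
    """Integer-pixel search around 'center' (y, x) for the offset that
    maximizes the similarity sum; assumes the search stays in bounds."""
    h, w = template.shape
    cy, cx = center
    best, arg = -np.inf, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y0, x0 = cy + dy - h // 2, cx + dx - w // 2
            s = similarity_sum(cti[y0:y0 + h, x0:x0 + w], template, sigma)
            if s > best:
                best, arg = s, (dy, dx)
    return arg, best
```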

C. Estimation of Ω_z
The roll velocity Ω_z is determined using (v_0, r_0) after Kalman filtering. The peripheral regions (see Fig. 5) are more informative for this estimate. To exclude small moving objects and estimate Ω_z in stationary regions, we again use the histogram method. Let v_ρ(r) ≡ T_z/Z(r) be the radial velocity coefficient. Then, the local least-squares method to determine the rotational and radial velocity coefficients is expressed as

\[ (\Omega_z(\mathbf{r}), v_\rho(\mathbf{r})) = \arg\min_{\Omega_z,\, v_\rho} \sum_{\mathbf{r}' \in \mathcal{N}(\mathbf{r})} \left| \nabla\hat{g}(\mathbf{r}')\cdot\left( \mathbf{v}_0 + \Omega_z\,(\mathbf{r}' - \mathbf{r}_0)^{\perp} + v_\rho\,(\mathbf{r}' - \mathbf{r}_0) \right) - j\omega\, g_1(\mathbf{r}') \right|^2, \tag{20} \]

which is solved by inverting the 2 × 2 normal equation

\[ \begin{pmatrix} \sum \Re\{a^{*} a\} & \sum \Re\{a^{*} b\} \\ \sum \Re\{b^{*} a\} & \sum \Re\{b^{*} b\} \end{pmatrix} \begin{pmatrix} \Omega_z \\ v_\rho \end{pmatrix} = \begin{pmatrix} \sum \Re\{a^{*} c\} \\ \sum \Re\{b^{*} c\} \end{pmatrix}, \tag{21} \]

where ĝ ≡ g_0 − g_1, a ≡ ∇ĝ·(r′ − r_0)^⊥, b ≡ ∇ĝ·(r′ − r_0), and c ≡ jω g_1 − ∇ĝ·v_0. The results in the peripheral regions form a 2D histogram. Even for static objects, v_ρ = T_z/Z varies in accordance with the distance. A marginal distribution over positive v_ρ is obtained and then weighted by the similarity with the previous frame; its peak is used as the estimate of Ω_z.

Fig. 7 shows examples of the 2D velocity histograms for estimating v_0 and Ω_z. In the top-row histograms, single clear peaks moving up and down are evident. The peaks usually correspond to a stationary distant scene or the back face of a large vehicle in front. The highest peak is chosen as v_0 unless otherwise specified, and the region contributing to it is extracted as the gaze area. The middle row shows the 2D histograms of Ω_z (horizontal axis) and v_ρ (vertical axis) in the peripheral image region. In them, the populations on the upper side are from proximate objects, and those near the center are from the distant scene. The graphs in the bottom row show the vertical projection of the middle-row histograms; their peaks are used as Ω_z.

Fig. 8 shows an example of velocity detection and gaze-target tracking when the vehicle is passing through an intersection under repair (the corresponding images are shown in Fig. 12). The top and bottom graphs show the traces of the gaze position x_0, y_0, θ_z and the gaze velocity v_{0x}, v_{0y}, Ω_z, respectively. The gray lines near each trace show the gyro-sensor outputs as ground-truth values. In the graphs, the repetitive changes in pitch angle and motion owing to the waviness of the road surface are clearly captured in the traces of y_0 and v_{0y}. The velocity traces are very accurate and delay-free: about 0.2 pixel/frame for v_{0x}, v_{0y} (0.0086 deg/frame as Ω_x and Ω_y) and 0.1 deg/frame for Ω_z. The increasing offset of y_0 is due to the continuing slope of the road (slanted ground plane).
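A sketch of the (Ω_z, v_ρ) fit of Eqs. (20) and (21) at one window in Python/NumPy (variable names are illustrative; gh_x and gh_y denote the components of ∇ĝ over the window samples):

```python
import numpy as np

def roll_radial_ls(gh_x, gh_y, g1, v0, x, y, r0, omega):
    """Fit (Omega_z, v_rho) over one window by Eqs. (20)-(21): model the
    flow as v0 + Omega_z*(r - r0)^perp + v_rho*(r - r0), with the OFAE
    grad(g0 - g1) . v = j*omega*g1 as the per-pixel constraint."""
    dx, dy = x - r0[0], y - r0[1]
    a = gh_x * dy - gh_y * dx             # Omega_z coefficient, perp = (y, -x)
    b = gh_x * dx + gh_y * dy             # v_rho coefficient
    c = 1j * omega * g1 - (gh_x * v0[0] + gh_y * v0[1])
    A = np.array([[(np.conj(a) * a).real.sum(), (np.conj(a) * b).real.sum()],
                  [(np.conj(b) * a).real.sum(), (np.conj(b) * b).real.sum()]])
    rhs = np.array([(np.conj(a) * c).real.sum(), (np.conj(b) * c).real.sum()])
    omega_z, v_rho = np.linalg.solve(A, rhs)
    return omega_z, v_rho
```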
The tracking operation continues at all times, while keeping or, when inevitable, changing the gaze target. Fig. 9 shows a long-term trace of (x_0, y_0) and θ_z after the scenes in Fig. 12. The repeated jumps of x_0 indicate saccadic changes of the gaze target along a curving road, which keep the gaze rectangle near the image center. The slight discontinuities of y_0 at the jumps are caused by the shift of the gaze center to the most textured position.

D. Estimation of Gazeflow
As the final step in each frame, the gazeflow is estimated directly from the image data g 0 (r) and g 1 (r) using the above determined gaze parameters v 0 , r 0 and z . The Horn-Schunck-type global optimization method [10] is used for the calculation. A particular emphasis is on the regularization of ill conditions using an a priori distribution given by a previous frame and the knowledge of stationary objects such as the ground plane and sky. Owing to the enhanced stability and continuity in the SVC, the regularization performs well along both the temporal and spatial axes.
Let the a priori gazeflow distribution be ṽ_G(r). Then, the estimation problem is stated as the minimization of the sum of the squared deviation of the gazeflow from the default gazeflow, the squared deviation from the local average, and the squared error of the OFAE:

\[ J[\{\mathbf{v}_G(\mathbf{r})\}] = \sum_{\mathbf{r}} \left( \mu^2 \left| \mathbf{v}_G - \tilde{\mathbf{v}}_G \right|^2 + \lambda^2 \left| \mathbf{v}_G - \langle \mathbf{v}_G \rangle \right|^2 + \left| \nabla(g_0 - g_1)\cdot\mathbf{v} - j\omega\, g_1 \right|^2 \right), \tag{24} \]

where {v_G(r)} indicates the set of gazeflows in the whole image, ⟨·⟩ indicates a spatial integral for smoothing, μ² and λ² are the regularization parameters, and v is the total flow recovered from v_G and the gaze parameters r_0, v_0 and Ω_z. With the deviation variables δv_x and δv_y introduced for brevity, the functional is expressed as Eq. (25). By differentiating J with respect to δv_x and δv_y for all r and equating to zero, we obtain the iterative scheme of Eq. (26), a pixelwise 2 × 2 linear update from the iterate k to k + 1, where the superscripts k and k + 1 are the iteration counts. For the regularization parameters, our choice in the experiments was μ² ∼ 20, with λ² set adaptively using small positive constants ε₁² ≪ 1 and ε₂² ≪ 1. The computation time for each frame (3PCIS at 30 fps), from the image input through the gaze and imaging operations to the display of the results, was about 0.62 s using a Core™ i7-4790 CPU at 3.6 GHz (Visual C++ 14.1 code, single thread). A suitable GPU will enable a real-time implementation.

Fig. 10. Optical flow detection of an approaching wall. All images were taken by a 3PCIS [26]. (a) Intensity image (first frame for the DM). (b) OF result of the DM (two frames; 5×5-pixel least squares; cascaded estimates using low-resolution images and pixel-shifted high-resolution images). (c) OF result of the WIM (single frame; 2×2-pixel least squares using Eq. (15)). The velocity-color correspondence is indicated by the color scale to the right of (c).

Fig. 11. Tangential (black) and radial (red) velocity components (left-axis scale) and rms estimation errors (right-axis scale) of the OF results in Figs. 10(b) and (c). The horizontal axis is the distance from the focus of expansion. Vertical bars indicate the rms error. Note that the right-axis scale (rms error) in (a) is magnified 5-fold relative to that in (b).
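A minimal sketch of one iteration of this regularized estimate, assuming a simplified Jacobi-style update of the functional above (the paper's Eqs. (25) and (26) include details omitted here, e.g., the adaptive λ²):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def gazeflow_update(vx, vy, vx_prior, vy_prior, gh_x, gh_y, rhs,
                    mu2=20.0, lam2=100.0):
    """One Jacobi-style iteration: per pixel, balance the OFAE data term
    |gh . v - rhs|^2 against deviations from the prior (previous-frame)
    flow, weight mu2, and from the local average, weight lam2."""
    vbx, vby = uniform_filter(vx), uniform_filter(vy)   # local averages
    axx = (np.conj(gh_x) * gh_x).real + mu2 + lam2
    axy = (np.conj(gh_x) * gh_y).real
    ayy = (np.conj(gh_y) * gh_y).real + mu2 + lam2
    bx = (np.conj(gh_x) * rhs).real + mu2 * vx_prior + lam2 * vbx
    by = (np.conj(gh_y) * rhs).real + mu2 * vy_prior + lam2 * vby
    det = axx * ayy - axy ** 2            # always > 0 for mu2 + lam2 > 0
    return (ayy * bx - axy * by) / det, (axx * by - axy * bx) / det
```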

VI. EXPERIMENTS

A. Performance Evaluation of OF Detection by 3PCIS
The proposed method is based on the extended sensing capability of the 3PCIS and the exact algebraic solution based on the WIM. We first confirm the performance of this device and the algorithms with a simple ego-motion setup. Fig. 10 shows the OF results for a flat board (cork) approaching the camera. It simulates all directions and a wide range of magnitudes of the OF from a vehicle (see [26] for results obtained using a rotating object). The intensity image (a), captured at 30.3 frames/s, shows g_0(r). It is also used as the first frame of a simple dual-resolution DM [8]. The spatial gradients (2-level smoothing for the DM) were computed in the frequency domain for accuracy reasons, while 5 × 5 and 2 × 2-pixel local least-squares methods were applied for the DM and WIM, respectively. In the DM, a pixel shift was introduced to reduce the relative displacement using the first velocity estimate. However, the DM result in Fig. 10(b) shows large disorders except for the central region with low velocities (∼3 pixels/frame). In an instantaneous OF result, velocity errors such as spots are difficult to distinguish from small moving objects. In contrast, the stability is maintained even at the periphery (∼6 pixels/frame) in the WIM result in Fig. 10(c). Moreover, the temporal and spatial resolutions are 2 and 2.5 times larger than those in Fig. 10(b). The small decreases in velocity in Fig. 10(c) are caused by the aperture problem: the decrease in ∇(g_0(r) − g_1(r)), particularly under high-velocity conditions [22], [23]. Figs. 11(a) and (b) show the velocity dependences of the accuracy of the DM and WIM. In both results, the tangential and radial velocity components are plotted together with the rms errors. The DM has an unavoidable velocity limit even when the images are smoothed. In contrast, the accuracy of the WIM is maintained over a wide velocity range. Even with a small estimation area (2 × 2 pixels), the rms error at low velocity is about 0.25 pixels/frame and the relative error at high velocity is about 4%. This shows that a high-resolution, wide-range velocity field can be obtained directly and more reliably by using the 3PCIS combined with the WIM.

B. Gazeflow Detection From a Moving Vehicle
Figs. 12(a) to (d) show the conventional egocentric OF and the gazeflow, both obtained by the 3PCIS with the WIM. The scenes were acquired while the vehicle passed through an intersection under repair; a few workers are standing on the road near construction equipment. In the second row, the egocentric OFs are shown. The velocities and directions indicated by color change severely and rapidly owing to the pitching and rolling of the vehicle. In contrast, the gazeflows shown in the third row are very stable and present a mostly gradual expansion of a stationary distribution. This is an inverse-distance-related flow field caused by the locomotion of the vehicle. In it, the workers and construction equipment can be clearly recognized as proximate objects in the advancing direction. The velocities on the road are further stabilized in the gazeflow because the regularization effects in the iteration are enhanced by the temporal continuity of the velocity field in this area. The regularization also clarifies the small OF in distant scenes, as shown in the fourth row (maximum brightness: 1.0 pixel/frame); the increasing rightward velocities of the power poles range from 0.2 to 0.8 pixel/frame.

Fig. 12. Egocentric OF and gazeflow (both obtained by the 3PCIS with the WIM) while the test vehicle passes through an intersection under repair. First row: intensity images; second row: camera-centered OFs; third row: gazeflows (maximum brightness: 5.0 pixel/frame); fourth row: magnified brightness of the third row (maximum brightness: 1.0 pixel/frame). The egocentric OF changes severely and rapidly owing to the ego-motion; in the gazeflow, most of the ego-motion components are removed, and the distance-related flow field caused by the vehicle advance and its details become evident.
Changing colors due to specular reflections are seen on the vehicle in front. The other bright spots are mostly due to dust on the sensor.
The accuracy of the OF and gazeflow was mostly maintained over the entire velocity range of this experimental condition. Even without interframe regularization, the rms error at low velocity is about 0.25 pixel/frame and the relative error at high velocity is about 4%. In Fig. 13, the leftward velocities of the vehicle, higher than those of the surroundings, are captured in the gazeflow (h), and their tangential components on its lower side are captured as the anomalous flow (i). One of the bicycles is moving slightly rightward along the road; therefore, it is nonzero in the gazeflow and is also detected as an anomalous flow. In Figs. 13(j), (k) and (l), a pedestrian crossing the road in front is captured in the gazeflow and the anomalous flow. The leftward velocity on the right side of the road is improbable as a diverging flow from a moving vehicle. The tangential components of the walking motion of a pedestrian on the left side are also captured as anomalous flows, whereas two slow walkers on the right side of the road are not. The other nonzero components in (k) are caused by insufficient cancellation of ego-motion OFs and/or aperture problems in textureless regions or oriented patterns. Fig. 14 shows a passive transition of the gaze target when a vehicle crossed the path in front. The gaze changed from a roadside object to the side face of the moving vehicle, which is dominant in the gaze rectangle. This causes a change of the gazeflows, since they are relative to the target motion. In Fig. 14(c), the gazeflow clearly captures the leftward motion of the crossing vehicle; the ego-motion components are mostly suppressed, so that the sensitivity to other moving objects is maintained. In contrast, Fig. 14(d) captures detailed motions of the vehicle parts, including their directions, e.g., a rotating wheel. However, a uniform bias is present in the OFs of the environment, and the removal of the ego-rotation components is degraded. The suppression or utilization of this type of gaze transition is a subject for future study.

C. Anomaly Detection From a Moving Vehicle
TABLE II summarizes the features of the gazeflow for acquiring the 3D environment and the moving objects in it. The advantages of the gazeflow over the conventional egocentric OF are evident in the rapid detection of moving objects as anomalous or suspicious events. As early warnings of danger, these events can be used to trigger suitable high-level image analysis. For the early-warning conditions in notes 1) to 5) of TABLE II, examples of the gazeflow and anomalous-flow responses are as follows: 1) tangential flow due to object motion: Fig. 13

VII. SUMMARY
An on-vehicle OF detection scheme with an extended spatiotemporal resolution of the flow field and stability against the rotational fluctuations of the platform was proposed, based on an exact algorithm and the solid-state 3PCIS with Fourier-coefficient imaging capability. A gaze operation was introduced so that the OF field relative to the object-centric coordinate system, called the gazeflow, is stabilized and the detailed motion of small objects becomes detectable. The overall performance, including anomalous-object extraction, was examined using real 3PCIS data sequences acquired from a vehicle moving on busy town roads and a highway. The proposed method is expected to help enhance the performance of succeeding high-level operations and to realize advanced safety vehicles with keen senses.