A Self-Powered Asynchronous Image Sensor With TFS Operation

This article presents a self-powered image sensor based on a novel pixel architecture with energy harvesting capabilities. Pixels are autonomous entities that can harvest energy or sense illumination independently. They operate fully asynchronously and do not need to be scanned to read out their output. The sensor was manufactured in UMC 180-nm technology and tested. Its specifications are competitive with the state of the art, offering fast operation, a good balance between the energy consumed and harvested, and high-dynamic-range operation. The article describes the sensor and pixel architectures in detail, provides experimental results, and benchmarks the sensor specifications against the state of the art.


I. INTRODUCTION
ENERGY-EFFICIENT sensors are the fuel to implement the Internet-of-Things (IoT) paradigm and, ultimately, critical enablers of the cyber-universe of fully interconnected analog sensors and digital processors [1]. The International Roadmap for Devices and Systems [2] identifies the reduction of the size, weight, and power of sensor nodes as major challenges for deploying distributed sensor networks in an ever-increasing number of applications involving interaction with the environment and hence analog signal sensing. Because optical images and the visual sense account for a large share of the different sensing modalities, it is arguable that energy-efficient vision sensors will play a relevant role in future IoT systems and in intelligent electronic systems in general [1], [2]. Indeed, different strategic agendas identify the deployment of visual intelligence as one primary driver for disruption in information technologies and microsystem design [3]. The quest for energy-efficient vision sensors encompasses diverse challenges. On the one hand, vision chip concepts and architectures capable of capturing and analyzing images with a minimum power budget are required. On the other hand, opportunities to harvest energy from the luminous stimuli must be explored. Reducing the power budget guarantees long working cycles under battery supply for low-latency applications. A reduced power budget combined with energy harvesting might allow battery-less operation or very long battery-replacement cycles for high-latency applications.
The power reduction challenge calls for either the incorporation, at the focal plane, of embedded feature extraction and parallel processing resources [4] or for the use of asynchronous event-driven vision sensor concepts [5], [6], [7], among other strategies. The harvesting challenge calls for reconfiguring photodiodes to acquire either information or energy. Different image sensors combining both acquisition modes have been reported during the last few years (see [8], [9], [10], [11], [12], [13], [14], [15], [16] as representative examples). The vision sensor chip reported in this article employs event-driven pixels operating in the time-to-first-spike (TFS) mode [17], [18].
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
Fig. 1. Sensor architecture. The main constitutive blocks are the pixel array, external buffers and control circuitry, external asynchronous arbitration logic to implement the AER communication protocol, and a Dickson dc-dc converter.
Asynchronous sensors are particularly well suited for power-budget reduction. Unlike conventional frame-based image sensors, asynchronous ones do not aim at scanning and encoding the whole pixel array regardless of the relevance of the local pixel information, but at locating and encoding only pixel data that convey information encoded in the form of events, for example, temporal or spatial contrast. Hence, no energy is spent encoding and reading pixels that do not contain information from an application point of view, which makes a significant difference from conventional imagers. Avoiding such useless data encoding saves significant energy because A/D conversion and readout account for the largest part of the imager's power budget [4]. Since the pixel operates in the TFS mode in this implementation, the information corresponds to the local pixel illumination, and all pixels are read out. However, the asynchronous nature of the pixel prevents dark pixels from occupying column ADCs for extended time intervals, leading to increased speed and power efficiency as pixels can be turned off immediately after readout.
1) Pixels are autonomous entities, and hence they can start harvesting energy individually once they have sensed their local illumination.
2) Their intrinsic asynchronous operation mode fits very well with application scenarios where snapshots are taken when an event occurs. For instance, an auxiliary dynamic vision sensor could trigger the proposed sensor after the recognition of an intruder in surveillance systems [22], motion detection at home [23], traffic monitoring [24], or external operator-triggered scene luminance measurements [25].
Although several authors have already anticipated the advantages of using pulsewidth modulation to encode information in self-powered pixels with very low power consumption [11], [13], [26], very few self-powered asynchronous sensors have been reported so far. Shi et al. [12] advanced a concept of an isolated asynchronous self-powered pixel. However, to our knowledge, no contributions report self-powered asynchronous image sensors. In this article, we propose a novel, fully asynchronous sensor implementation. Its pixels are autonomous entities that start harvesting energy as soon as their local illumination level has been read out. The article provides experimental results and insights into the pixel and sensor architecture. The proposed pixel architecture is the first that allows different pixels to harvest energy or sense illumination independently. The image sensor is benchmarked against the state of the art, showing the advantages of the proposed architecture.

II. SYSTEM ARCHITECTURE
The proposed system is composed of two different blocks, as shown in Fig. 1. The first is a low-power asynchronous vision sensor that combines the tasks of sensing and energy harvesting. The harvested energy is stored in a 7.5-mF low-leakage off-chip supercapacitor, C storage, designed for energy harvesting applications [27]. During the energy harvesting phase, the photodiodes are connected in parallel and charge the supercapacitor, whose voltage, V EH, is limited by the illumination conditions. This supercapacitor acts as a charge reservoir with a reduced footprint of 3.2 × 2.5 mm. Although the pixels are designed to operate with a supply voltage as low as 250 mV, it is advisable to use larger supply voltages to reduce the image artifacts caused by delays in the readout chain. These artifacts occur because the pixels encode the illumination level into the time it takes for the pixel to spike. Consequently, any random delay (e.g., collisions in the readout channel) appears as random noise in the signal. Additionally, a lower supply voltage decreases the integration time, leading to a lower signal and thus a lower signal-to-noise ratio. This will be further addressed in Section III. Therefore, a dedicated dc-dc converter, the second block, is included in the same die to step up the voltage at V EH. Fig. 2 shows a microphotograph of the die containing both blocks.
The operation of the system is divided into two phases. During the harvesting phase, the sensor is in an idle state, and all photodiodes operate in the photovoltaic mode. In this mode, a current I pv flows from the cathode to the anode, and a voltage V pv builds up at the junction, as depicted in Fig. 3. The collected charge is stored in the supercapacitor. Since power consumption is not continuous and the system does not have an external battery beyond the supercapacitor, no maximum power point tracking technique [28] was implemented. Therefore, a high capacitance was chosen so that a large amount of energy can be stored when the sensor is exposed to high illumination. When the system demands information from the scene, the sensing phase starts. During this phase, the photodiodes of the pixels are reverse-biased and operate in the TFS mode to measure the photocurrent. One major novelty of this sensor lies in how the pixels work after readout: they stop consuming energy and start harvesting energy immediately after they send the information to the external receiver. In this way, the most illuminated pixels support the sensing operation. However, the supply voltage drops during the sensing operation, so the frame rate of the sensor is limited by the time the dc-dc converter requires to recover the supply voltage. This voltage drop depends on the time the pixels are consuming energy, and therefore on the scene.
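For a rough sense of scale of the charge reservoir, the energy held by the supercapacitor can be estimated with E = ½CV². The sketch below evaluates this for the 7.5-mF capacitance quoted above. It assumes an ideal, leakage-free capacitor, and the function name and the V EH values are illustrative choices rather than measured data.

```python
def stored_energy_joules(capacitance_f: float, voltage_v: float) -> float:
    """Energy in an ideal capacitor, E = 1/2 * C * V^2 (leakage ignored)."""
    return 0.5 * capacitance_f * voltage_v ** 2


C_STORAGE_F = 7.5e-3  # off-chip supercapacitor value quoted in the text

# Illustrative V_EH values only; they are not measured data.
for v_eh in (0.25, 0.35, 0.45):
    energy_uj = stored_energy_joules(C_STORAGE_F, v_eh) * 1e6
    print(f"V_EH = {v_eh:.2f} V -> {energy_uj:.0f} uJ stored")
```

Because the stored energy grows with the square of V EH, the illumination-dependent voltage on the storage node matters more than a linear estimate would suggest.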

A. Harvesting Subsystem
The core components of the harvesting subsystem are the photodiodes of the pixels. The cathode of the photodiode is connected to ground when the pixel operates in the harvesting mode. Thus, the photocurrent leaving the anode (see Fig. 3) is used to charge the reservoir. Fig. 3 highlights four parameters. First, I sc is the short-circuit current, that is, the current at 0 V. Second, the voltage when no current flows through the junction is called the open-circuit voltage, V oc, and it is the maximum voltage that can be generated while keeping the diode reverse-biased. If the junction voltage is greater than V oc, the junction is forward-biased and consumes energy. Finally, there is a maximum power point, defined by I mpp and V mpp.
Photodiodes are usually connected in series to increase the effective value of V oc. These branches of photodiodes are then connected in parallel to increase the available power; however, standard CMOS processes do not allow series association of photodiodes without penalty [20], [29]. Although other authors proposed a series connection of two diodes with different active areas to reduce parasitic current leakage [30], this approach is process-dependent and difficult to predict analytically. Thus, a parallel-only photodiode array is usually preferred, at the cost of a maximum V oc close to 500 mV [19]. Photodiode stacking can also be employed to slightly increase V oc [19], but dc-dc conversion might still be necessary to power a real system [21].
Fig. 4(a) shows the schematic of the dc-dc converter core. The dc-dc converter steps up the V EH voltage to 700 mV, since the open-circuit voltage of a silicon photodiode is measured to be in the range of 250-450 mV at an illuminance of 10-10 000 lx [19], [20], [31], [32]. The core is a classic three-stage Dickson dc-dc converter. In this implementation, the well of the pMOS transistors is connected to the drain to reduce the reverse current [33]. Furthermore, connecting the bulk to the drain reduces the transistors' threshold voltage, allowing for operation at lower voltages. Indeed, the operation benefits from the current flowing through the source-to-bulk diode. The circuit in Fig. 4(a) operates as a conventional Dickson dc-dc converter but is designed to precharge the output voltage before operation (i.e., with no load) and then halt during operation. Initially, when the clock is low, the capacitor in x 1 is charged to V EH − V drop, where V drop is the voltage drop across M 1.
Since the output power when the converter is enabled is nearly zero, V drop will tend to a value lower than the threshold voltage, as the subthreshold current of M 1 charges x 1; however, this implies that the dc-dc converter will have a slow startup. In this particular implementation, V drop is close to 70 mV. On the other hand, when the clock is high, the voltage at x 1 increases by the voltage swing of the clock, V clk. Thus, considering that the converter has three stages, the maximum output voltage of the dc-dc converter, V DD, is determined as
Since the input voltage can be as low as 250 mV, the converter also includes a clock voltage doubler to ensure that the converter works under low-illumination conditions. Fig. 4(b) shows the schematic of the voltage doubler, which is driven by the clock and inverted clock phases. This circuit functions as follows: suppose that V ⋆ is close to V EH. When the clock goes from 0 to V EH, V ⋆ increases to 2V EH, causing the output to go from 0 to 2V EH.
The output of the dc-dc converter is connected to an off-chip 100-µF capacitor to reduce the voltage drop during operation. The clock frequency is controlled off-chip, since voltage regulation for this converter was not included in this implementation. Thus, energy management still needs to be optimized in future implementations by including more complex power management circuits [21], [34].

B. Vision Sensor
The vision sensor employs a standard address event representation (AER) protocol [35], [36], [37], [38] to support asynchronous readout. Fig. 1 shows the block diagram of the vision sensor. In addition to the pixel array, the sensor is composed of column and row buffers followed by 0.3-1.8-V level shifters (in case the peripheral circuitry is biased at a voltage higher than the pixel array), AER communication logic blocks, 7-bit encoders, and row and column arbiter trees [38]. Fig. 5 summarizes the waveforms of the AER communication protocol. When a pixel generates an event, the X and Y arbiter trees decide which pixel accesses the address bus, while the AER communication logic blocks generate two global signals, req_y and req_x. The AND operation of both signals, bus_req, indicates to the receiver that the pixel corresponding to the address bus coordinates has generated an event. This bus is composed of the outputs of the encoders, addr_x and addr_y. Finally, when the external receiver has read and stored the pixel address, it activates bus_ack, and the AER communication logic blocks generate the signals that reset the pixel.
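The effect of sharing one address bus among all pixels can be illustrated with a short behavioural sketch. It is a simplification and not the arbiter-tree logic of the chip: events are granted strictly in spike-time order, and each bus_req/bus_ack cycle is assumed to take a fixed, hypothetical time (`handshake_ns`). The function name and the event values are also illustrative.

```python
def aer_readout(spike_times_ns, handshake_ns):
    """Serialise pixel events over one shared address bus.

    spike_times_ns: dict mapping (addr_x, addr_y) to the spike time in ns.
    handshake_ns:   assumed duration of one bus_req/bus_ack cycle in ns.
    Returns a list of (address, spike_time_ns, grant_time_ns) in bus order.
    """
    bus_free_at = 0
    log = []
    # Simplification: the earliest spike wins; ties keep insertion order.
    for addr, t_spike in sorted(spike_times_ns.items(), key=lambda kv: kv[1]):
        t_grant = max(t_spike, bus_free_at)    # wait while the bus is busy
        log.append((addr, t_spike, t_grant))
        bus_free_at = t_grant + handshake_ns   # bus released after bus_ack
    return log


# Two pixels spike at the same instant, so the second one is delayed.
events = {(3, 5): 1_000_000, (4, 5): 1_000_000, (10, 2): 5_000_000}
for addr, t_spike, t_grant in aer_readout(events, handshake_ns=100):
    print(addr, "spike at", t_spike, "ns, granted at", t_grant, "ns")
```

The gap between grant time and spike time is the collision delay that the receiver cannot distinguish from a longer integration time.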
Furthermore, two 128-bit shift registers store the enable sequence so that the pix_on and reset signals are only enabled in selected rows and columns. This allows a region of interest (ROI) to sense the lighting level while the rest of the pixels keep harvesting energy.
Fig. 6 shows the schematic of the pixel, where V EH is the global storage node shared among pixels. The pixel has been designed at the electrical level to operate with a 250-mV supply voltage so that its operation is directly compatible with the voltage levels generated by the pixel array. Pixels are autonomous entities that can harvest energy from the environment while not sensing the illumination level. This is controlled by the signal stored in the per-pixel SR latch. Depending on the output of the latch, transistors M p1−2 and M n1−2 swap the terminals of the photodiode [39]. When the photodiode is reverse-biased, a comparator and pull-down transistors implement the TFS operation and the AER communication.

A. Photodiode Layout and Configurations
A description of the diode implementation was advanced in a preliminary conference contribution [39]. As already mentioned, diodes toggle between two operation regimes: the photovoltaic regime and the reverse-biased regime. The top view and the cross section of the photodiode are shown in Fig. 7, where three different diodes are connected in parallel, namely the n+/p-well junction, D 1, the p-well/deep n-well junction, D 2, and the deep n-well/p-sub junction, D 3. The three diodes share the same cathode. The anodes of D 1 and D 2 are also shared, and the anode of D 3 is grounded; therefore, D 3 cannot contribute to energy harvesting. It has previously been demonstrated that the architecture including D 1 and D 2 in parallel presents a higher open-circuit voltage than nonstacked diodes [19]. An interesting aspect introduced in this implementation is that the cathodes of D 1 and D 2 are connected by extending the n+ diffusion of D 1. This increases the active area of D 1 and the pixel fill factor. While offering certain benefits, these diodes and the integration scheme present some drawbacks in terms of noise performance. Compared to pinned photodiodes, where the charge-collecting layer is buried within the silicon crystal, surface traps result in a higher noise level in these devices. Furthermore, the integration scheme precludes the implementation of a floating diffusion, resulting in high kTC noise due to the reset operation, which cannot be mitigated. Apart from these general considerations, we did not observe a significant difference in noise performance between stacked and conventional junction configurations.
The configuration of the photodiode depends on the state of the pixel, which is defined by the local lock signal. When lock is deactivated (and, therefore, its complement is activated), the photodiode is reverse-biased and a current can flow from the power supply node, V DD, to ground through the photodiode, as depicted in Fig. 8(a). In contrast, when lock is logic high, the cathode is grounded, and the photodiode enters the photovoltaic regime to harvest energy immediately after readout. In this configuration, the photocurrent flows to the V EH node, thus contributing to the power supply, as shown in Fig. 8(b).
The proper electrical design of M n3 and M p4 is crucial. To operate at supply voltages close to 250 mV, thin-gate-oxide transistors were selected. These two transistors must be strong enough to drive a current larger than the photocurrent, I ph, when the pixel is under high-illumination conditions; however, the leakage current, I leak, increases with the strength of the switch, leading to a random offset and, therefore, to pixel-to-pixel nonuniformities. Although I leak can reduce the dynamic range, its effect can be alleviated by an off-chip calibration process.

B. Pixel Operation
The pixels rely on the TFS concept to sense the illumination level of the scene. TFS operation is particularly interesting as the brightest pixels are read out first. Therefore, they instantly contribute to energy harvesting, while the less-illuminated ones consume energy. The operation is governed by pix_on, which is defined as the AND operation of two signals pix_on_h and pix_on_v. These signals are shared per row and per column, respectively, and are the output of the AND operation of the global pix_on signal and the sequence stored in the shift registers from Fig. 1. This allows the system to enable only an ROI, while the rest of the pixels are harvesting energy.
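The ROI gating described above amounts to two levels of AND logic, which can be sketched as follows. The helper name and the 4 × 4 example are illustrative assumptions (the sensor itself uses 128-bit shift registers); the sketch only models the Boolean relation between the global pix_on signal, the stored row and column bits, and the per-pixel enable.

```python
def pixel_enable_map(pix_on, row_enable_bits, col_enable_bits):
    """Return enable[row][col], True where the pixel senses illumination."""
    # Per-row and per-column lines: global pix_on gated by the stored bits.
    pix_on_h = [bool(pix_on) and bool(bit) for bit in row_enable_bits]
    pix_on_v = [bool(pix_on) and bool(bit) for bit in col_enable_bits]
    # In-pixel AND of the row line and the column line.
    return [[h and v for v in pix_on_v] for h in pix_on_h]


# Hypothetical 4 x 4 array with a 2 x 2 region of interest.
rows = [0, 1, 1, 0]
cols = [0, 0, 1, 1]
for line in pixel_enable_map(True, rows, cols):
    # S = sensing pixel, H = pixel left in harvesting mode
    print("".join("S" if enabled else "H" for enabled in line))
```

With the global pix_on signal low, every entry is False and the whole array stays in the harvesting mode.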
The different signals involved in the operation of the pixels are represented in Fig. 9. Before starting the sensing operation, pix_on is logic-low and rst_pix is logic-high. Thus, lock is activated at the beginning of the operation. At this point, the operation can be divided into three phases.
1) Reset: To sense the pixel illuminance, the charge is integrated into the photodiode capacitance, C ph. Therefore, the photodiode must be reset at the beginning of the operation. This is accomplished by the active-low rst_pix signal and transistor M p4. Note that this signal also unlocks the pixel. To prevent M p4 from driving a current in pixels outside the ROI, rst_pix is implemented as the NAND function of pix_on and a global reset signal:
$$\text{rst\_pix} = \overline{\text{pix\_on} \cdot \text{reset}} \qquad (2)$$
In this way, when the pix_on signal is logic high, rst_pix is the negated value of reset. The reset time, t rst, must be minimized to reduce power consumption, though it must be long enough to reset highly illuminated photodiodes.
2) Integration: When reset is released, the voltage at node V n decreases linearly at a rate set by the photocurrent. Defining T int as the time it takes the photodiode to reach the voltage threshold of the comparator, V th, the photocurrent can be computed as
$$I_{ph} = \frac{C_{ph}\,(V_{DD} - V_{th})}{T_{int}} \qquad (3)$$
It is important to note that the measurement depends on the value of V DD. This value is the same for all pixels, but may vary from frame to frame. Thus, if the application requires computing information among different measurements, this voltage must be properly regulated. Furthermore, as shown in (3), IR drops in the V DD line inside the pixel array during the reset phase can cause a systematic error in the signal, that is, fixed patterns. However, the power consumption of the pixel array is low enough that the IR drop is insignificant. In contrast, IR drops during the integration phase can lead to jitter at the comparator, resulting in random noise in the signal.
3) Readout: When V n reaches V th, the output of the comparator, spike, indicates that the pixel must be read out. To do so, the pixel uses transistors M n4−7. First, the series combination of M n4 and M n6 pulls down req_row. This signal is shared by the entire row, implementing a wired OR gate. An arbiter circuit decides which row must be read and activates the ack_row signal, so pixels in that row can pull down the req_col signal and repeat the process at the column level. After granting access to the pixel, the AER communication logic block activates bus_req to indicate that valid data is set on the output address bus. Finally, the receiver stores the coordinates of the pixel and activates bus_ack. This signal triggers rst_col and rst_row, so that the local ack_pix signal locks the pixel.
Fig. 10 shows the schematic of the in-pixel comparator. It is the only analog block and the dominant source of power consumption. Thus, a careful design of this component is crucial for proper performance. The comparator is composed of a 5T-OTA followed by a CMOS inverter, as depicted in Fig. 10. The comparator is designed to work at supply voltages as low as 250 mV, and all transistors work in weak inversion to reduce the power consumption of the entire array, at the expense of a reduced speed. When the pixel is harvesting energy, M n,en switches off the bias current of the comparator using the lock control signal.
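The integration phase can be turned into a small numerical sketch. It assumes the linear discharge described above, so that the photocurrent equals C ph (V DD − V th) / T int, and it uses hypothetical values for the photodiode capacitance, the threshold voltage, and the collision delay; none of these numbers are taken from the measurements.

```python
def photocurrent_from_tfs(t_int_s, c_ph_f, v_dd, v_th):
    """Photocurrent (A) for a linear discharge from V_DD to V_th in T_int."""
    if t_int_s <= 0:
        raise ValueError("T_int must be positive")
    return c_ph_f * (v_dd - v_th) / t_int_s


# Hypothetical operating point, for illustration only.
C_PH = 10e-15   # assumed photodiode capacitance, 10 fF
V_DD = 0.7      # supply voltage after the dc-dc converter (V)
V_TH = 0.35     # assumed comparator threshold (V)

t_int = 14e-3   # time to first spike (s)
i_ph = photocurrent_from_tfs(t_int, C_PH, V_DD, V_TH)

# A readout delay adds to the measured T_int and lowers the estimate.
delay = 1e-6    # assumed 1-us collision delay
i_ph_delayed = photocurrent_from_tfs(t_int + delay, C_PH, V_DD, V_TH)
print(f"I_ph = {i_ph:.3e} A")
print(f"relative error from delay = {(i_ph - i_ph_delayed) / i_ph:.2e}")
```

The relative error scales roughly as the delay divided by T int, which is why readout delays matter most for bright pixels with short integration times.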

C. Comparator
The comparator plays a crucial role in fixed-pattern noise (FPN) and temporal noise; as (3) shows, any variation in T int or V th modifies the signal. Because the mismatch of the pull-down transistors and other digital components entails negligible time variations relative to the signal, FPN has two primary sources, according to our study: the comparator's offset and the deviation of the leakage current of M p2 and M n2. The latter can be minimized using thick-oxide devices. The comparator's offset affects the effective value of V th, resulting in a random pattern in the pixel array. In addition, the temporal noise of the comparator translates into temporal noise in the pixel signal.
Offset-cancellation techniques can be employed to mitigate the effect of the comparator's offset in FPN. However, their implementation at the pixel level incurs a high cost in terms of area and increases the complexity of the operation. As such, the implementation of these techniques falls outside the scope of this study and merits further investigation in future research.
Additionally, the architecture depicted in Fig. 10 exhibits a systematic offset due to its asymmetrical topology. However, this offset does not affect the operation, as it is present in all pixels and can be compensated by adjusting the value of V th.
Finally, the value of V th is not trivial. It can be adjusted to alter T int and, therefore, the sensitivity of the pixel to light, as depicted in (3). On the one hand, a low T int value (high V th) reduces the period of time the sensor consumes energy; however, the signal-to-noise ratio decreases, as delays in the readout chain (primarily due to collisions in the arbitration circuitry) are significant compared to T int. The signal power is defined either by T int or (V DD − V th) in the time or voltage domain, respectively. On the other hand, a low value of V th enhances the signal power, minimizing the impact of the comparator's offset and noise, at the cost of reduced speed and increased energy consumption.

A. Open-Circuit Voltage
Fig. 11 shows the dependence of the short-circuit current and the open-circuit voltage on illumination. Both parameters determine the sensor's capability to harvest energy since the maximum power generated by the array of photodiodes is related to these parameters. Empirical studies [40] have represented this correlation as
where k 1 and k 2 are constants. The range of k 1 has been found to be 0.71-0.78, while k 2 is in the range of 0.78-0.92 [40]. Given an illumination value, V EH tends to reach V oc, which has a logarithmic dependence on illumination and decreases linearly with temperature [19]. On the other hand, as the value of k 2 approaches unity, the short-circuit current determines the rate at which energy can be harvested, that is, the slope with which the voltage of the supercapacitor increases when V EH < V oc. Consequently, the frame rate of the sensor is constrained by these two parameters and thus by the illumination. This is because the sensor must recover the energy expended during the image capture process to maintain continuous operation.
Fig. 12 shows an image captured with the sensor, rendered both as a color map and as an 8-bit grayscale representation of the illumination values. The image was acquired in 14 ms. Fig. 13 depicts experimental data showing the amount of energy consumed and harvested by the sensor during the previous image acquisition. Fig. 13 (top panel) shows the percentage of pixels operating in sensing and harvesting mode during the image acquisition. Initially, all pixels start in the sensing mode. At this instant, all pixels consume energy, and the sensor experiences its maximum power consumption, 1.14 µW. Then, the most illuminated pixels start to toggle to the harvesting mode. As a result, the overall power consumption of the sensor decreases with time. Consequently, acquiring a low-light scene requires greater energy consumption and, therefore, a longer harvesting period to recover from it, as pixels operate in the sensing mode for longer. In Fig. 13 (bottom panel), the measured energy consumed and the estimated energy harvested are plotted.
Two situations have been considered to estimate the harvested energy: operating with an ideal Dickson dc-dc converter and with the implemented converter with an efficiency of η = 60%. The sensor harvests the energy required to acquire the image in 17 ms. Analyzing the plots, we can conclude that the maximum energy that the sensor consumes when all pixels operate in sensing mode is higher than the maximum amount of energy collected when all pixels harvest energy simultaneously. Thus, the sensor cannot operate continuously and needs to harvest energy between acquisitions. In the example, an effective frame rate of 58 frames/s was achieved with the self-powering operation.

C. Acquisition Time Versus Illumination
The experimental results of Fig. 13 show that during the acquisition time, the consumed power and energy are higher than the harvested ones. Hence, if the sensor is intended to acquire images continuously, the frame rate will be restricted by the time required for the sensor to recover the energy used during one acquisition, as discussed in Sections IV-A and IV-B. The experimental results determined that the maximum continuous frame rate for the sensor was 51.5 frames/s at an illumination level of 500 lx, and 19.9 frames/s at an illumination level of 100 lx.
When a reduced latency between frames is required, the sensor can operate in the burst mode, allowing for the capture of multiple frames before recovering the energy expended during the process. In this scenario, assuming that the dc-dc converter is fast enough, the frame rate will be limited by the readout time, specifically the value of T int of the darkest pixel. Fig. 14 depicts experimental data showing the maximum possible frame rate versus the illuminance. All pixels were exposed to the same illumination level. Provided that the external supercapacitor is precharged and stores enough energy, the sensor could acquire images at 50 frames/s with a chip illuminance of only 10 lx. It must be remarked that the energy balance is quite complex for visual scenes with a large intrascene dynamic range, with high- and low-illuminated pixels. When a high-speed acquisition is necessary, a defined acquisition time can be established for reading out the image. However, this comes at the cost of losing the information from pixels whose T int exceeds the defined acquisition time (low-illuminated pixels), resulting in a tradeoff between image quality and acquisition speed, which will vary depending on the specific scenario and the application requirements.
Fig. 15 shows the measured sensor output versus the chip illuminance. The digital number is normalized and ranges from 0 to 1000. All pixels were illuminated uniformly with a white Lambertian light source. In this experiment, the illuminance was varied while maintaining the same electrical conditions, allowing for the determination of the intrascene dynamic range, which was found to be 100 dB. The red trace corresponds to the linear data fitting. The coefficient of determination is r² = 0.9370. The sensor response to illumination is linear within an illumination range of four decades.
Below 10 lx, the sensor output linearity is limited by the current leakage introduced by the transistors connected to the anode and cathode of the photodiode in Fig. 6. This leakage is comparable to the photocurrent at low illumination and is responsible for the sensor nonlinearity in such circumstances. The current leakage also impacts the FPN of the pixels with low illumination because the leakage current has a large variability from pixel to pixel. The measured FPN was 5.34% with a sensor illuminance of 1 klx. This limitation could be mitigated by using transistors with a thicker gate oxide. Therefore, the lower limit of the dynamic range is determined by the leakage current of M p2 and M n2, while the bandwidth of the comparator determines the upper limit, that is, the digital number saturates when T int is shorter than the response time of the comparator. Sensor calibration is also possible to improve image quality. The procedure can be performed off-chip by subtracting from each frame the pixel values measured without illumination.
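The off-chip calibration mentioned above can be sketched as a per-pixel dark-frame subtraction. The function name, the clamping of negative results to zero, and the example values are assumptions made for illustration; the text only states that the values measured without illumination are subtracted from each frame.

```python
def subtract_dark_frame(frame, dark_frame):
    """Subtract a per-pixel dark reading from a frame, clamping at zero."""
    if len(frame) != len(dark_frame):
        raise ValueError("frame and dark_frame must have the same shape")
    return [
        [max(value - dark, 0) for value, dark in zip(row, dark_row)]
        for row, dark_row in zip(frame, dark_frame)
    ]


# Hypothetical 2 x 2 frames of normalized digital numbers.
frame = [[120, 40], [15, 900]]
dark = [[10, 12], [20, 8]]
print(subtract_dark_frame(frame, dark))
```

Because the leakage varies from pixel to pixel, the dark frame has to be stored per pixel rather than as a single global offset.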

V. BENCHMARKING
To evaluate the performance of the proposed sensor against existing solutions, the image figure of merit (iFoM) was calculated and compared. The energy required to capture a single frame was determined by integrating the blue trace in Fig. 13. However, for a fair comparison, the iFoM was calculated assuming a dark frame, in which the power consumption is approximately equal to the peak power consumption (1.14 µW) for the majority of the time, and a frame rate of 19.9 frames/s under 100-lx ambient lighting conditions. The resulting iFoM was found to be 3.41 fJ/pixel · code.
Table I compares the specifications of the implemented sensor against other relevant and recent self-powered sensors. Although different architectures for low-power imaging have been reported in the past [41], [42], [43], we only compare the proposed sensor to image sensors implementing energy harvesting capabilities. Among the solutions reported in the literature, the selected ones were chosen for their outstanding balance between power consumption and frame rate (i.e., iFoM) or their unique functionalities. The work presented in [11] reports a PWM sensor that operates at 0.32 V, being compatible with the native open-circuit voltages of silicon photodiodes. This work was continued in further publications [13], obtaining a better image quality and linearity. The work in [10] demonstrates an APS solution that features two stacked diodes (one for harvesting and the other for sensing) and operates at 0.6 V, while the work presented in [8] describes an APS architecture in which the same photodiode can work as a harvester or sensing unit. Also, Wang and Leon-Salas [16] reported an APS sensor in which the photodiodes can independently switch terminals, enabling each pixel to function in sensing or harvesting mode.
However, during image acquisition, the pixels cannot operate in both modes, and iFoM is worse than that of other reported sensors due to the lower resolution of the implemented ADCs. In summary, the proposed sensor exhibits the most favorable value of iFoM, followed by PWM sensors, and then by APS sensors.
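The iFoM arithmetic can be reproduced with a few lines. The sketch below assumes the usual energy-per-frame definition implied by the unit fJ/pixel · code, a 128 × 128 array (inferred from the 128-bit shift registers and 7-bit encoders), and 1024 output codes; the array size and the number of codes are assumptions, since neither is stated explicitly in the text. Under these assumptions the result comes out close to the reported 3.41 fJ/pixel · code.

```python
def ifom_fj_per_pixel_code(power_w, frame_rate_fps, n_pixels, n_codes):
    """Energy per frame divided by pixels and codes, in fJ/pixel.code."""
    energy_per_frame_j = power_w / frame_rate_fps
    return energy_per_frame_j / (n_pixels * n_codes) * 1e15


PEAK_POWER_W = 1.14e-6   # dark-frame power used for the comparison
FRAME_RATE_FPS = 19.9    # continuous frame rate at 100 lx
N_PIXELS = 128 * 128     # assumed array size (128-bit registers, 7-bit encoders)
N_CODES = 1024           # assumed number of output codes (10 bits)

ifom = ifom_fj_per_pixel_code(PEAK_POWER_W, FRAME_RATE_FPS, N_PIXELS, N_CODES)
print(f"iFoM = {ifom:.2f} fJ/pixel.code")
```

Using the peak power for the whole frame is the pessimistic dark-frame case described above; a brighter scene lets pixels switch to harvesting earlier and lowers the energy per frame.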
The proposed architecture is the only one compatible with simultaneous harvesting and image acquisition at the pixel level. This possibility optimizes the amount of energy harvested because the most illuminated pixels have an earlier and higher contribution to the sensor's self-powered operation. Moreover, the proposed asynchronous readout allows dynamically deciding the amount of time dedicated to acquiring one image. It is possible to render one image after receiving a certain number of events or update the image representation dynamically. Therefore, there is room for further research work to investigate the optimal balance between integration time, image quality, and power consumption.
The proposed architecture outperforms the state of the art in image acquisition and readout time by exploiting the fast asynchronous readout circuitry. Frames can be acquired within a short time interval. Therefore, the ratio between the time the sensor spends harvesting and the time it spends sensing is higher than in previous implementations. Even if we consider continuous sensor operation, with the sensor powered externally, the pixel energy consumption is competitive with the state of the art. The sensor also offers high intrascene dynamic range operation. The main limitation of the proposed architecture is the FPN level, which is higher than that of the state of the art. This can be mitigated in further implementations by carefully reducing the current leakage of the transistors that alters the photocurrent measurement. There is room for improvement by selecting different types of transistors or reducing the number of transistors connected to the photodiode.
Furthermore, the functionalities of the sensor can be extended to implement a Free Running mode. In this mode, pixels are reset after readout, and events are generated continuously. Therefore, the temporal error caused by the collisions is averaged. However, pixels are continuously consuming energy during this mode, limiting its use to periods when the average illumination is high enough. To implement this function, switches must be added to V p to keep the photodiode reverse-biased regardless of lock and a reset logic must be implemented to reset the pixel when lock is activated.

VI. CONCLUSION
A new self-powered asynchronous image sensor has been presented. The proposed architecture overcomes some of the limitations of classic synchronous pixel architectures for energy harvesting. On the one hand, pixels are autonomous entities that can harvest energy or sense illumination independently, so it is not necessary to divide the sensor operation into two separate phases for energy harvesting and sensing. On the other hand, the proposed asynchronous readout scheme is well suited to energy harvesting because it requires neither an A/D conversion nor reading out all the pixels to render one image. With the proposed sensor architecture, the system can trade off image quality, frame rate, and power consumption. All of these capabilities open the possibility for further research work to optimize sensor operation and the amount of energy harvested depending on the illumination conditions and image rendering requirements.