Analog Defect Injection and Fault Simulation Techniques: A Systematic Literature Review

Since the last century, the exponential growth of the semiconductor industry has led to the creation of tiny and complex integrated circuits, e.g., sensors, actuators, and smart power. Innovative techniques are needed to ensure the correct functionality of analog devices that are ubiquitous in every smart system. The ISO 26262 standard for functional safety in the automotive context specifies that fault injection is necessary to validate all electronic devices. For decades, standardization of defect modeling and injection mainly focused on digital circuits and, in a minor part, on analog ones. An initial attempt is being made with the IEEE P2427 draft standard that started to give a structured and formal organization to the analog testing field. Various methods have been proposed in the literature to speed up the fault simulation of the defect universe for an analog circuit. A more limited number of papers seek to reduce the overall simulation time by reducing the number of defects to be simulated. This literature survey describes the state-of-the-art of analog defect injection and the fault simulation methods. The survey is based on the preferred reporting items for systematic reviews and meta-analyses (PRISMA) methodological flow, allowing for a systematic and complete literature survey. Each selected paper has been categorized and presented to provide an overview of all the available approaches. In addition, the limitations of the various approaches are discussed by showing possible future directions.


I. INTRODUCTION
Nowadays, with the increase in the complexity of analog and mixed-signal circuits, guaranteeing the functional safety of both digital and analog circuits is fundamental to reducing the risk of failure in cyber-physical systems (CPSs) and industrial CPSs (ICPSs) [1]. Building efficient analog defect injection and simulation techniques has become strategic for manufacturers [2]. In this context, the maturity of the techniques used in the analog field lags behind that of the digital field. For the digital domain, the standard ISO 26262 [3] defines the guidelines for implementing a fault injection campaign to ensure the functional safety of road vehicles. With the focus on digital circuits, the literature refers to these methodologies as fault modeling and injection techniques because the functionality of a digital circuit (fault-free and faulty) can be seen as an abstraction of the analog circuit that implements the physical functionality. Instead, when referring to the analog domain, the correct term used to identify these techniques changes from fault to defect. Thus, the defect models that are characterized to mimic physical deformations and injected into a netlist can produce a fault after the simulation. For these historical reasons, the industry uses fault injection and simulation more than defect injection and simulation when referring to the analog testing field. This error is probably related to the fact that the first attempt to standardize the terminology in this field is the IEEE P2427 draft standard [4]. Previously, in some cases, these two terms were swapped and used without distinguishing between their different meanings in the digital and analog domains. By analyzing the literature terminology and maintaining compliance with the P2427 draft standard, this literature survey uses the terms analog defect injection and fault simulation techniques.
SPICE-based simulators are still the core technology to simulate defects in analog circuits described at the transistor level [5]. However, the circuits are becoming more complex as the number of transistors increases [6]; thus, more evolved methodologies are required. Simulating an analog circuit for a set of input stimuli can take from a fraction of a second to several days, depending on the complexity and the level of detail of the circuit under test (CUT). The complexity of a defect injection campaign is directly proportional to the number of defects to be simulated because, for each injected defect, a complete simulation using the nominal input stimuli needs to be computed to retrieve faulty data/matrices. In the literature, researchers from industry and academia have suggested different techniques to reduce the overall time required to simulate an entire set of defects, named the defect universe [4]. These techniques include parallel simulation on different cores [7], analysis in the frequency domain at specific operating points [8], usage of high-level models [9], Monte Carlo-based simulations [10], or fault sensitivity analysis (FSA) [11]. Other techniques aim at reducing the set of defects that need to be simulated, e.g., with fault grouping techniques. The defect models injected at the transistor level are different for each component, e.g., a defect for a metal-oxide-semiconductor field-effect transistor (MOSFET) [12] could be a stuck-on between drain and source, a stuck-off between drain and gate, and so forth. The first attempt to standardize the defect models for each component in a transistor-level description will be released with the standard IEEE P2427 [4], currently in a draft version.
Fig. 1. Methodology used to build this systematic survey. Starting from the definition of the search string and the list of databases, the entire set of articles is retrieved and processed to obtain the final set of articles included in the survey.
This article aims to present a systematic review that explores all the techniques proposed in the literature on analog defect injection and simulation. A notable literature review is available on digital fault injection [13]. The digital fault injection approaches are divided into hardware-based (physical), simulation-based, and emulation-based. For the analog domain, a noteworthy survey of analog fault diagnosis is presented in [14] and is related to the detection and diagnosis of analog faults. However, an analogous review of analog defect injection and simulation techniques has not been presented in the literature. To systematically review the selected articles, the Rayyan platform [15] is used to support the selection phases of the articles. It significantly improves the process of selecting and screening literature papers when more than two researchers collaborate on building the survey. Moreover, the preferred reporting items for systematic reviews and meta-analyses (PRISMA) [16] methodology is used to build a structured and complete survey. The PRISMA methodology was initially presented to build systematic surveys in the medical domain, and it is currently used in different scientific domains.
Each technique is analyzed in detail in this literature survey by showing its advantages and limitations, following the workflow described in Fig. 1. Starting from a list of keywords, the search string shown in (1) is built. Using the defined search string inside the Web of Science platform, 1580 articles are retrieved from the selected databases: IEEEXplore, Elsevier, Springer Nature, Wiley, and Association for Computing Machinery (ACM). The main contributions of this survey are as follows.
1) Analyze the literature systematically on analog defect injection and simulation at the transistor level.
2) Propose a classification of each selected work in ten different categories by unifying approaches that exploit common strategies.
3) Describe the features and limitations of each selected work.
4) Describe the advantages and limitations of the different techniques by analyzing them grouped in different categories.
Furthermore, the main research gaps in this area are discussed and motivated with a global vision. The remainder of this article is organized as follows. Section II describes the methodology used to build this systematic survey. Section III provides the basic definitions and types of analog defects and the state-of-the-art of defect modeling and injection. In Section IV, the core of the survey is presented. Then, Section V discusses the limitations and future directions of fault simulation techniques with reference to real applications. Finally, the conclusions are given in Section VI.

II. RESEARCH METHODOLOGY
This section explains the methodological flow used to create this literature survey, as schematized in Fig. 1. First, we searched the articles on different databases by defining a search string covering most papers related to analog fault injection and simulation. Then, we selected papers through a collaborative decision process on the Rayyan platform [15].

A. Papers Selection Methodology
Fig. 2 depicts the steps followed in this survey for the search and selection of the articles based on the PRISMA methodology [16]. We uploaded 1580 papers to the Rayyan collaborative platform [15]. This platform is helpful for screening a large number of papers systematically. In the screening process, duplicate papers and book chapters are removed by exploiting the Rayyan functions. Then, the most important step was to identify the papers most relevant to the target topic. For this purpose, three reviewers initially screened the papers by reading the title and abstract of each paper. During this step, it was noticed that some papers were related to digital fault injection, some were related to testing, and others belonged to other fields because faults can exist in other application areas as well. In this step, 1434 papers were removed, and 126 papers were selected for evaluation. All the manually excluded articles are related to physical domains other than the electrical one. Among these 126 selected papers, some were in conflict, i.e., the reviewers' decisions differed. The reviewers then resolved the conflicts by reading each paper in detail, agreeing to exclude 47 of the 50 papers in conflict. After screening and resolving conflicts, 79 papers were selected to be included in this survey. An important aspect of the selection is that there are different approaches to analog fault injection: simulation-based, hardware-based, and emulation-based. The majority of the selected papers are related to simulation-based fault injection. This is due to the practical complications in applying hardware-based or emulation-based fault injection.

B. Search String and Selection of Databases
The process of retrieving all the articles for inclusion in the survey starts by defining the principal keywords in this research field. These keywords include analog/analogue, fault, analysis, injection, and simulation. The search string combines the selected keywords by using AND and OR Boolean operators in (1). The query includes the conference papers and journal articles published from 2000 to 2021. The query does not include defect as a keyword because most articles in this field mainly use the "fault" keyword for historical reasons, as described in Section I.
The platform chosen to search the articles through the search string is Web of Science because it allows the selection of many different databases as sources and enables exporting the selected articles in multiple convenient formats. The articles found by using the search string on the Web of Science platform are matched by wrapping each keyword inside an ALL operator. The ALL operator searches the keyword in every metadata field associated with the articles, for instance: title, abstract, keywords, conference name, funding text, and so forth. Basically, it searches in all metadata fields except for the content (i.e., main text) of the article. Table I shows the list of sources: IEEEXplore, Elsevier, Springer Nature, Wiley, and ACM. It also shows the number of papers found in these databases, which is 1580 overall. Not all these papers are related to the analog fault injection field because faults can be part of other domains, such as geography and material sciences. Fig. 3 shows the subdivision of all articles obtained through the search string [see (1)] according to their scientific field. The total number of items included in this figure is 2777, which is larger than the initial set of articles retrieved through the search string. This is because an article can be related to several scientific fields simultaneously. Most publications related to analog defect injection and simulation are in the area of electrical and computer engineering and fewer in other areas, such as robotics and manufacturing. Fig. 4 instead shows the subdivision per year of the entire set of papers retrieved through the search string (highlighted in blue) and the papers selected for the survey (highlighted in orange). From the figure, it is evident that in the period 2000-2021, the number of articles related to analog defect injection and simulation remains low and constant, indicating that the community working in this area is small and not expanding.

III. BACKGROUND AND STATE OF THE ART
This section presents the definitions related to analog defects/faults and basic knowledge of the different classes of possible defect models. It starts by defining a list of terms used throughout the survey.
1) Defect: Unwanted physical change in a circuit element or connection between circuit elements that is not within fabrication specifications for the circuit element or connection.
2) Fault: A model of faulty behavior at a functional level higher than circuit elements, e.g., an unresponsive inverter output, a drain/source short, and a transistor stuck on/off.
3) Failure Mode: Deviant behavior of a subsystem that may cause the system to fail to execute its intended function, e.g., an operational amplifier (OPAMP) output that oscillates or that has a voltage or current offset deviating unacceptably from the fault-free behavior.
4) Defect Coverage: Percentage of defects detected during the defect injection campaign.
These terms are used in the context of transistor-level circuit realization. Starting from the circuit schematic and applying the synthesis process, the transistor-level layout can be obtained. An example of transistor-level circuit realization for an OPAMP circuit is shown in Fig. 5. The defect modeling techniques usually work at this abstraction level, not at the layout level. However, when it is necessary to model short and open defects in the interconnections, information from the layout level must be combined to annotate the transistor-level schematics correctly.

A. Analog Defect Models
According to the definition of the IEEE P2427 draft standard [4], a defect is considered an unexpected change in the physical structure of a circuit. The difference inside the physical structure affects the function of a subsystem, which can produce faulty behaviors. A fault in the circuit, instead, is an unexpected variation in a circuit module that has a performance specification. The defect models described in this section are designed to replicate defects that occur in a real circuit so that they can be simulated. Stuck-on/off defects are known to occur in a MOSFET because of excess trapped charge in the gate oxide or excess interface states [17]. As illustrated in Fig. 6, trapped charges, marked with a red "o," are fixed positive charges that are trapped in the gate oxide and have the primary effect of shifting the threshold voltage. Interface traps, represented with a red "x," are parasitic states that can capture and release minority carriers. These carriers are normally intended to populate the channel, but they remain blocked in the trap and can no longer participate in the current conduction in the channel.
Soft and hard defects are the two types of defects that can happen in analog circuits. Fig. 7 shows the locations of possible analog defects in a transistor-level description of an inverter circuit [1]. For transistor-level models, there is broad agreement in the industry on the common hard defects in the analog device-level models: transistors, resistors, capacitors, and interconnections. The situation is more complex for soft defects, which are also referred to as parametric defects. Random and systematic process variations in analog circuits cause changes in parametric values, such as capacitance and resistance [18], [19], [20]. The hard defects comprise open and short circuits between the connections of the device-level models that change the topology of the circuit. Hard defects are further divided into various categories depending on their impact on the primary current path between the drain and source in MOS transistors.
1) Hard defects can create a permanent direct or indirect conductive path between the drain and the source. The direct path defect is referred to as drain-source short, and the indirect path defect is referred to as drain-gate short.
2) Hard defects can also prevent the direct or indirect flow of current between the drain and source. The defect that causes direct prevention of current is defined as drain open or source open, and the defect that causes indirect prevention of current is defined as gate-source short or gate-body short.
3) Hard defects can also result in loss of control of the state of the transistor, which is referred to as an open gate.
These defects are injected by various techniques that will be described in Section IV. They are usually injected into a netlist described at the transistor level (see Fig. 5 for a transistor-level representation). A netlist can also be described in other languages, e.g., hardware description languages (HDLs), with an equivalent but more abstract behavior. Fig. 8 shows the balance between simulation performance and accuracy that can be achieved by exploiting different modeling techniques. For example, the family of SPICE languages allows for greater accuracy but requires more time to accomplish simulations, while a functional simulation requires less computational time because the simulated model is less accurate.
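As a minimal sketch of how such hard defects might be injected into a transistor-level netlist, the following Python fragment models a short as a small resistor bridging two nodes and an open as a large resistor inserted in series with one device terminal. The helper names, the two-transistor inverter, and the resistance values are assumptions made for illustration; they are not taken from the IEEE P2427 draft or from any surveyed tool.

```python
# Illustrative only: a SPICE-like netlist is held as a list of text lines.
# MOSFET lines follow the usual order: name drain gate source bulk model.
SHORT_OHMS = 10.0   # assumed resistance used to model a short
OPEN_OHMS = 1e9     # assumed resistance used to model an open


def inject_short(netlist, node_a, node_b, r_ohms=SHORT_OHMS, name="RDEF_SHORT"):
    """Return a copy of the netlist with a resistor bridging two nodes."""
    return netlist + [f"{name} {node_a} {node_b} {r_ohms:g}"]


def inject_open(netlist, device, terminal_index, r_ohms=OPEN_OHMS, name="RDEF_OPEN"):
    """Return a copy with a series resistor at one terminal of `device`."""
    out = []
    for line in netlist:
        tokens = line.split()
        if tokens and tokens[0] == device:
            old_node = tokens[terminal_index]
            new_node = f"{old_node}_{device}_open"
            tokens[terminal_index] = new_node      # detach the terminal
            out.append(" ".join(tokens))
            out.append(f"{name} {old_node} {new_node} {r_ohms:g}")
        else:
            out.append(line)
    return out


inverter = [
    "MP out in vdd vdd pmos",
    "MN out in 0 0 nmos",
]
ds_short = inject_short(inverter, "out", "0")   # drain-source short on MN
gate_open = inject_open(inverter, "MN", 2)      # open gate on MN (token 2 is the gate)
```

Each call returns a new faulty netlist and leaves the fault-free one untouched, so one nominal description can feed an entire defect list.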

IV. ANALOG FAULT INJECTION AND SIMULATION TECHNIQUES
This section presents all the articles selected from the literature to be included in this systematic survey. The complete list of articles included in the survey is presented in Table II, ordered by publication year with a brief description. The survey includes 79 articles retrieved by following the PRISMA methodology described in Section II. Fig. 9 describes the main categories in which the selected articles are subdivided and presented in the following sections. Furthermore, the principal tools used to inject defects in the transistor-level models and speed up the simulation are shown. The categories selected to group all selected articles enable covering all types of simulation techniques that exist for simulating faults caused by defects placed inside a netlist. These categories include standard transient fault simulation, DC fault simulation, concurrent fault simulation, and so forth. Again, this classification is just one of many that could be used to partition articles related to this topic. It was inspired by the commonalities between the articles selected for the survey. There are other categories included in this survey, like using random sampling to reduce the number of possible defects to be simulated or differentiating between the circuit types to which the defects can be applied.

A. Transient Fault Simulation
In analog circuits, defects are usually injected at the transistor level and simulated with SPICE-based simulators. SPICE-based simulation is time-consuming because it performs accurate simulations of transistor-level models. Consequently, the overall simulation time increases considerably; furthermore, the reliability of each simulation of a faulty circuit is often low due to convergence issues of the solver. A common problem encountered with transistor-level defect injection is that the value required to model the defect, e.g., the resistance of a short or an open, is unknown. This value is referred to as the parameter value of a defect, and it should correspond to the typical value observed for that defect. In the literature, many authors have proposed different values for the injection. The IEEE P2427 draft standard provides hints about the range of values suitable to model the parameter value: for a short, generally between a few milliohms and 20 kΩ, and for an open, between several megaohms and infinity. Unfortunately, these ranges of values are too broad and, in practice, extremely time-consuming to handle with mere brute-force simulation.
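To give a feeling for why a brute-force sweep over such ranges is costly, the sketch below builds a logarithmically spaced grid of short resistances and counts the resulting simulation runs. The lower bound of 1 mΩ, the density of two points per decade, and the count of 500 defect sites are hypothetical values chosen only for illustration.

```python
import math


def log_spaced(lo, hi, points_per_decade):
    """Return values from lo to hi (inclusive), evenly spaced on a log scale."""
    decades = math.log10(hi / lo)
    n = max(2, int(round(decades * points_per_decade)) + 1)
    step = decades / (n - 1)
    return [lo * 10 ** (i * step) for i in range(n)]


# Assumed sweep for a short: 1 milliohm up to 20 kilo-ohm, two points per decade.
short_values = log_spaced(1e-3, 20e3, points_per_decade=2)

# Hypothetical campaign size: one full simulation per (defect site, value) pair.
n_defect_sites = 500
n_runs = n_defect_sites * len(short_values)
```

Even this coarse grid multiplies the number of runs by the number of grid points, which is why the surveyed works look for ways to avoid exhaustive sweeps.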
The time-domain algorithm for fault simulation works in two nested iteration loops. The outer loop steps through the time instants, i.e., $t_{n+1} = t_n + h_n$, $n = 0, 1, \ldots$ The inner loop runs at each instant $t_{n+1}$ inside the outer loop to solve the nonlinear circuit equation by applying the Newton-Raphson (NR) method. A linear system is solved in each NR iteration with lower-upper (LU) factorization [24]. In [44], a new method for transient fault simulation for nonlinear analog circuits has been presented. An operational amplifier and an active band-pass filter were selected as test cases. Resistive bridge defects were chosen in this article as structural defects. In this technique, the speed of fault simulation is measured in terms of simulation latency. Here, simulation latency refers to computational redundancy, which is exploited by avoiding the unnecessary repetition of specific operations in the simulator. Another approach has been proposed in [57] for fast time-domain simulation. In this work, fast simulation can be carried out by combining different techniques (hierarchical simulation, on-the-fly decrease of the list of defects, prediction of neighboring problems by implicit sensitivity) together with the enhancement of hierarchical modeling using extra ports.
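The two nested loops can be illustrated with a single-node example. The sketch below is an assumption-laden toy, not the algorithm of [44] or [57]: a voltage source drives a node through a resistor, with a capacitor and a diode to ground, integrated with backward Euler and a fixed time step. Because there is only one unknown, the linear solve in each NR iteration reduces to a scalar division instead of an LU factorization. A resistive bridge defect to ground is modeled by the optional `r_defect` argument; all component values are hypothetical.

```python
import math


def simulate(vs=1.0, r=1e3, c=1e-6, r_defect=None, h=1e-5, steps=500,
             i_sat=1e-12, v_t=0.025, tol=1e-12, max_nr=100):
    """Backward-Euler transient of one node; returns the node-voltage trace."""
    g_def = 0.0 if r_defect is None else 1.0 / r_defect
    v = 0.0
    trace = [v]
    for _ in range(steps):              # outer loop: t_{n+1} = t_n + h
        v_prev = v                      # previous solution is the NR initial guess
        for _ in range(max_nr):         # inner loop: Newton-Raphson at t_{n+1}
            exp_term = math.exp(v / v_t)
            # Kirchhoff current law residual at the node
            f = (c * (v - v_prev) / h + (v - vs) / r
                 + i_sat * (exp_term - 1.0) + g_def * v)
            # derivative of the residual (a 1x1 Jacobian)
            df = c / h + 1.0 / r + i_sat * exp_term / v_t + g_def
            dv = max(-0.1, min(0.1, -f / df))   # limited step for robustness
            v += dv
            if abs(dv) < tol:
                break
        trace.append(v)
    return trace


fault_free = simulate()
faulty = simulate(r_defect=100.0)       # resistive bridge from the node to ground
```

Comparing the final samples of `fault_free` and `faulty` shows how the injected bridge pulls the node well below the diode-clamped nominal level.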

B. Fault Simulation Using High-Level Models
Fault simulation can be performed by using high-level models. Different techniques are proposed in the literature and exploit behavioral-level fault simulation or a hybrid multilevel fault simulation.
1) Behavioral Level Analog Fault Simulation: Fault simulation and injection can be performed at a high level, as described in [40]. Applying behavioral models is one of the methods proposed in the literature to increase fault simulation speed. These behavioral models can be a set of equations relating inputs and outputs or can be several lines of microcode. Verilog-A, Verilog-AMS, VHDL-A, VHDL-AMS, and SystemC are behavioral modeling languages used for accomplishing this task. VHDL-AMS simulators are powerful at a higher level and can make the modeling of analog or mixed signals easier, reducing the simulation time. Behavioral modeling [58] and simulation of failure modes in analog blocks are possible. However, this also involves a comprehensive analysis of the possible faults that mimic the behavior of a defect. Such behavioral modeling is possible using SPICE macro-models, VHDL-AMS descriptions, or other analog hardware description languages. These techniques are probably most effective when used in multilevel fault simulations. The idea of modeling faults at the behavioral level has been discussed in [9], [29], [31], [35], [53], [62], [66], [74], [78], [82], [89], and [91] to speed up the fault simulation process.
2) Multilevel/Mixed-Mode Analog Fault Simulation: Fault simulation can be sped up by using multilevel hierarchical analog fault simulation techniques. In multilevel fault simulation, defects can be modeled at the transistor level, whereas the fault-free parts of the circuit can be modeled at the behavioral level to improve the overall speed of fault simulation [82]. In some articles, multilevel fault simulation is referred to as mixed-mode simulation. Multilevel hierarchical analog fault simulation refers to the use of behavioral models for components defined at the transistor level, defect injection at various abstraction levels, and hierarchical handling of all different definitions of circuit components and defects during the process of fault simulation. It is a valuable method for dealing with the complexities of analog circuits and producing test signals with high defect coverage [82]. The simulation time can be decreased by using behavioral models for some components of the circuit. Behavioral models have a small number of variables, and the systems of equations representing the circuit's behavior have lower computational complexity [42]. The primary drawback of this technique is that defects in a circuit can take other parts of the circuit out of their normal operating regions, in which case the behavioral models of those parts may no longer be accurate. A similar technique is explained in [91], with the difference that defects are modeled at the behavioral level through the Verilog-AMS language while the circuit remains described at the transistor level.

C. Fault Simulation Using Fault Grouping/Equivalence
Fault grouping is another direction in which research has been conducted in the past decade to reduce the computation time required to perform a defect injection campaign. Grouping of faults can be performed using both transient simulations and frequency simulations. Fig. 10 shows the general flow used in different fault grouping techniques to reduce the number of defects to simulate. The details of the most relevant grouping techniques are shown in the following sections.
1) Fault Grouping Using Transient Simulation: The primary objective of fault grouping is to subdivide faults into groups [70]. Grouping can be performed by using different strategies related to the simulation method. Various authors have proposed several algorithms for grouping faults [21], [25], [34], [41], [46], [48], [54], [60], [65], [92]. The ultimate goal of grouping is to reduce the number of faults that must be simulated by creating clusters of similar/related faults and simulating just one representative fault for each cluster. A dynamic fault grouping method to improve concurrent fault simulation was presented in [21]. This algorithm is also applicable to nonlinear circuits. The primary aim of this article was to reduce the computational cost of fault simulation. In this work, faults were grouped dynamically at every time step. After the grouping, the faults in each group can be simulated in parallel. Every group had a different time step, but the faults in each group shared the same time step. Another approach for fault grouping was introduced in [65]. In this method, hierarchical clustering has been proposed for grouping faults at the component level. One drawback of this work is that the circuit needs to be simulated once for every fault present in the group, which is time-consuming.
In [41], a fault list compression using the stratified fault grouping technique has been proposed. This approach used stratified fault grouping for the identification of representative faults. This technique can reduce the number of randomly selected defects needed to achieve a target confidence interval. This method is generic and independent of the fault models used and can be applied to more complex fault models. In [34], a clustering algorithm and external cluster validation techniques are applied to obtain an optimal number of fault groups. However, these techniques rely on prior knowledge that requires additional data observation before applying the clustering methods. In [65], a fault grouping technique is proposed to counter this problem of the cluster analysis techniques proposed in [34]. It considers component failures as waveforms and does not require prior knowledge. At the circuit level, only one defect is selected from each fault group for simulation-based verification. There are two primary advantages of using this method: the first is that only representative defects are simulated, while the second is the minimization of the uncertainty of missing safety-critical defects.
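The idea of clustering faulty waveforms and keeping one representative per cluster can be sketched as follows. This is a generic greedy threshold grouping written for illustration, not the hierarchical clustering of [65] nor the stratified method of [41]; the fault names, waveform samples, and the distance threshold `eps` are all hypothetical.

```python
def group_faults(waveforms, eps):
    """Greedily group faults whose waveforms stay within eps of a representative.

    waveforms: dict mapping fault name -> list of output samples.
    Returns a list of groups; the first name in each group is its representative.
    """
    groups = []
    for name, wave in waveforms.items():
        for group in groups:
            rep = waveforms[group[0]]
            # maximum absolute deviation from the group representative
            if max(abs(a - b) for a, b in zip(wave, rep)) <= eps:
                group.append(name)
                break
        else:
            groups.append([name])       # no close representative: new group
    return groups


waveforms = {
    "M1_ds_short":    [0.0, 0.10, 0.10, 0.10],
    "M1_dg_short":    [0.0, 0.12, 0.11, 0.10],
    "M2_gate_open":   [0.0, 0.90, 1.00, 1.00],
    "M2_source_open": [0.0, 0.88, 0.99, 1.00],
    "R1_open":        [0.0, 0.50, 0.50, 0.50],
}
groups = group_faults(waveforms, eps=0.05)
representatives = [g[0] for g in groups]
```

In this toy data set, five faults collapse into three groups, so only three representative defects would need a detailed simulation. Note that obtaining the waveforms in the first place still requires some form of simulation or prediction, which is the cost that the surveyed grouping methods try to minimize.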
Digital circuits have efficient and cost-effective solutions for structural testing in the presence of faults. Testing digital circuits is different from testing analog/mixed-signal (AMS) circuits because, for the latter, the parametric deviation of circuits and the presence of continuous signals must also be considered. In [92], fault simulation based on fault equivalence has been presented for AMS circuits. Using this methodology, the number of transistor stuck-off and stuck-on defects that must be simulated decreases to 31% for the analog transistors in the mixed-signal circuit. However, the analog part comprises only 32% of all the transistors; consequently, the total reduction is about 69% of 32%, which is only about 20%. This technique was a good contribution to organizing fault simulation for test production in mixed-signal circuits. In [54], another fault list compression technique was explained for the structural testing of AMS circuits. In [46], the fault equivalence concept was applied to diagnose faults in linear circuits, providing a systematic approach to diagnosing analog faults.
2) Fault Grouping Using AC Analysis: Fault simulation using transient simulations is time-consuming. To overcome the problems of transient analysis, frequency-based analysis techniques have been proposed in [8] and [86] to reduce the simulation time of analog defects. Sanyal et al. [86] presented a fault clustering technique combining the faulty DC operating point (OP) and frequency domain analysis. Specifically, to perform the clustering, N transient simulations are required (i.e., one for each injected defect), and then, at each OP, an AC analysis is performed. Recently, a predictive fault grouping using faulty AC matrices was proposed in [8]. In this work, two grouping methods were presented: the first is AC-based grouping, and the second is circle-based grouping. The AC grouping feature vector uses the S-parameters [94], while the circle-based grouping feature vector is obtained from the fault circles.

D. Concurrent and Parallel Fault Simulation
The fault simulation of analog circuits is more challenging than that of digital circuits. In the simulation of digital circuits, only the information associated with the part of the circuit where faults propagate is used, whereas a defect in an analog circuit affects the current and voltage levels across all branches and circuit nodes. Concurrent [51] and parallel fault simulation [37] are considered efficient tools for the fault simulation of digital circuits. Fault simulation in analog circuits, in contrast, is often achieved by systematically repeating the defect injection process, which is very time-consuming. As a result, concurrent analog fault simulation techniques have opened a new application field in the analog domain [7]. A standard fault simulation method performs a series of sequential transient simulations, in which each run usually does not use the information acquired from the previous runs. A parallel transient simulation method is proposed in [95] to significantly reduce the simulation time by simulating many defects simultaneously within a transient analysis. This technique also provides better performance because the simulation can reuse some results and structures obtained from previous runs [30], [43]. In [51], an efficient fault simulation method has been proposed to simulate defects with a DC and a transient simulation simultaneously. The fault groups are divided dynamically depending on their impact on the transient response. The complexity is reduced using novel techniques, including state prediction, RFM computation, and fault ordering. Outcomes of the corresponding fault-free and faulty circuits are shared during the simulation process to speed it up; consequently, the outcomes from one fault help to simplify the simulation of the next one. A parallel paradigm-based approach has been discussed in [69] for automating fault simulation in analog circuits. The key concept was to use several computational resources to simulate defects in parallel. Schneider and Wunderlich [61] also proposed a parallelization approach to achieve high throughput for a series of transistor-level fault simulations.
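
As a rough illustration of the resource-level parallelism described in [69] and [61], the following Python sketch distributes independent faulty-circuit simulations over a pool of workers. The circuit (a resistive voltage divider), the defect list, the 10% detection threshold, and all function names are assumptions introduced for illustration; they are not taken from the cited works.

```python
from concurrent.futures import ThreadPoolExecutor

VIN = 1.0           # supply voltage in volts (assumed)
R1, R2 = 1e3, 1e3   # nominal divider resistors in ohms (assumed)

def divider_output(r1, r2):
    """Output voltage of a two-resistor divider."""
    return VIN * r2 / (r1 + r2)

def simulate_defect(defect):
    """Simulate one defect and return (name, output voltage, detected)."""
    name, r1, r2 = defect
    nominal = divider_output(R1, R2)
    vout = divider_output(r1, r2)
    # Assumed detection rule: output deviates by more than 10% of nominal.
    detected = abs(vout - nominal) > 0.1 * nominal
    return name, vout, detected

# Hypothetical defect list: (name, faulty R1, faulty R2).
DEFECTS = [
    ("R1_open", 1e9, R2),
    ("R1_short", 1e-3, R2),
    ("R2_open", R1, 1e9),
    ("R2_short", R1, 1e-3),
    ("R2_drift_5pct", R1, 1.05 * R2),
]

def run_campaign(defects, workers=4):
    """Run all defect simulations on a worker pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate_defect, defects))

if __name__ == "__main__":
    results = run_campaign(DEFECTS)
    for name, vout, detected in results:
        print(f"{name}: vout={vout:.4f} V, detected={detected}")
    coverage = sum(r[2] for r in results) / len(results)
    print(f"unweighted coverage: {coverage:.2f}")
```

Because the faulty circuits are independent of each other, the campaign scales with the number of workers. In this toy sketch each worker only evaluates a formula; in a realistic flow each worker would presumably launch a separate circuit simulator run.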

E. Monte Carlo Simulation
Monte Carlo [10] is one of the most widely used methods to model parametric variations; it is based on random combinations of values selected within the range of each parameter. However, the random sampling process must be repeated for each defect to obtain accurate results on the behavior of a circuit, which makes the Monte Carlo method very expensive in terms of processing time [23], [63]. Behavioral modeling and inductive fault analysis can be applied to reduce the processing time of the simulation. Nevertheless, there are works, unrelated to the analog domain, that propose sampling techniques aimed at overcoming this problem, as explained in [96] and [97]. These sampling techniques are easier to implement than the others. Stratigopoulos and Sunter [10], [47] made a reasonable effort to reduce the simulation time of the Monte Carlo technique. Monte Carlo methods usually generate a large number of samples of predictable manufacturing deviations in a circuit and then simulate only those samples that are most likely to generate failing or marginally passing circuits. Hence, most of the time needed to assess the defect parameter value distribution is spent in the Monte Carlo simulations, and analog designers generally spend much time on these types of simulations. In the absence of a known distribution of the defect parameter value, a uniform distribution is used to inject the defects. The main shortcoming of this technique is the possibility of generating an unlimited number of parameter variation combinations.
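
The cost of the Monte Carlo approach can be illustrated with a minimal Python sketch, in which each sample combines a random process variation with a defect parameter value drawn from a uniform distribution, as described above for the case of an unknown distribution. The divider circuit, the 1% resistor spread, the pass band, and the function names are hypothetical choices made for this example only.

```python
import random

VIN = 1.0                # supply voltage in volts (assumed)
R_NOM = 1e3              # nominal resistor value in ohms (assumed)
LIMITS = (0.45, 0.55)    # assumed pass band for the output voltage

def vout(r1, r2, r_bridge):
    """Divider output with a bridging resistor in parallel with R2."""
    r2_eff = r2 * r_bridge / (r2 + r_bridge)
    return VIN * r2_eff / (r1 + r2_eff)

def sample_resistor(rng, sigma=0.01):
    """Draw one resistor value with an assumed Gaussian process spread."""
    return R_NOM * (1.0 + rng.gauss(0.0, sigma))

def detection_probability(n_samples, r_min, r_max, seed=0):
    """Estimate how often a bridge defect pushes vout outside LIMITS.

    The bridge resistance is sampled uniformly in [r_min, r_max] because
    its real distribution is assumed to be unknown.
    """
    rng = random.Random(seed)
    detected = 0
    for _ in range(n_samples):
        r1 = sample_resistor(rng)
        r2 = sample_resistor(rng)
        r_bridge = rng.uniform(r_min, r_max)
        v = vout(r1, r2, r_bridge)
        if not (LIMITS[0] <= v <= LIMITS[1]):
            detected += 1
    return detected / n_samples

if __name__ == "__main__":
    p = detection_probability(5000, 1.0, 1e4)
    print(f"estimated detection probability: {p:.3f}")
```

Every sample corresponds to one circuit simulation, so a campaign with many defects and thousands of samples per defect quickly becomes expensive, which is the limitation discussed in this section.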

F. Inductive Fault Analysis
Fault simulation time can be reduced by reducing the number of defects to be simulated through inductive fault analysis [34], by reducing the simulation complexity through behavioral modeling, and by reducing the equation set-up time through a cache mechanism [35]. An efficient flow has been proposed in [38], and the method is demonstrated to be valid for a large-scale industrial analog circuit. The primary contributions of this work were the automatic extraction of possible defect locations, the computation of the likelihood [52], [63] of each defect, and the calculation of the defect coverage of the circuit test list. The test list can be optimized by analyzing the simulation results and keeping only those defects that allow the highest coverage to be achieved. Selecting the defects to simulate based on their probability of occurring can reduce their number while still abiding by a given confidence interval of the test coverage estimate. The main idea is to organize the simulation results so that the solution of the previous faulty circuit can be a good starting point for simulating the next faulty circuit. For all the faulty and good circuits that have similar behaviors, a one-step NR iteration is performed to create a good ordering.
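
The likelihood-based selection described above can be sketched in Python as follows. The defect names, the relative likelihood values, the detection flags, and the 80% target are hypothetical; the coverage is assumed here to be the likelihood-weighted ratio of detected defects, which is one plausible reading of the text rather than the exact procedure of [38].

```python
def select_most_likely(defects, target_fraction):
    """Return the shortest list of defects, ranked by descending likelihood,
    whose cumulative likelihood reaches target_fraction of the total."""
    total = sum(d["likelihood"] for d in defects)
    ranked = sorted(defects, key=lambda d: d["likelihood"], reverse=True)
    selected, cumulative = [], 0
    for d in ranked:
        if cumulative >= target_fraction * total:
            break
        selected.append(d)
        cumulative += d["likelihood"]
    return selected

def weighted_coverage(defects):
    """Likelihood-weighted fraction of detected defects."""
    total = sum(d["likelihood"] for d in defects)
    detected = sum(d["likelihood"] for d in defects if d["detected"])
    return detected / total

# Hypothetical defect universe with relative likelihoods (arbitrary units).
DEFECTS = [
    {"name": "M1_gate_open", "likelihood": 40, "detected": True},
    {"name": "M1_ds_short", "likelihood": 30, "detected": True},
    {"name": "R1_open", "likelihood": 15, "detected": False},
    {"name": "M2_gs_short", "likelihood": 10, "detected": True},
    {"name": "C1_short", "likelihood": 5, "detected": True},
]

if __name__ == "__main__":
    subset = select_most_likely(DEFECTS, 0.8)
    print("simulated defects:", [d["name"] for d in subset])
    print("coverage on subset:", round(weighted_coverage(subset), 3))
    print("coverage on universe:", round(weighted_coverage(DEFECTS), 3))
```

In this toy example three of the five defects already account for 85% of the total likelihood, so only those three need to be simulated, at the price of a coverage estimate that differs from the one computed on the full defect universe.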

G. Fault Sensitivity Analysis
Fault sensitivity analysis (FSA) is a technique for the efficient simulation of analog defects that does not affect the simulation accuracy of nonlinear circuits during transient analysis [11], [28], [33], [49], [75]. This methodology can speed up the simulation process by two orders of magnitude for each defect. In [26], defects are injected only for a specific duration to reduce the overall simulation time, as shown in Fig. 11.

H. DC Fault Simulation
A method for the efficient DC fault simulation of nonlinear analog circuits was presented in [64]. The main focus of this technique was reducing the NR iterations for the faulty circuit through two distinct methods. The first method aims at reducing the number of NR iterations for the faulty circuit to one and using the approximate solution for detecting faults. The second method, instead, reduces the number of NR iterations by exploiting a fault ordering technique. The idea is to sort the faults by the closeness of the approximate solutions derived from the one-step relaxation method. In this way, the previous faulty response is used as a good starting point for the next faulty circuit. In [36], an approach for fault analysis in the DC domain has been presented that reduces the number of required simulations. In the initial step, the nonlinear equations of the DC simulation are solved with the NR method; the resulting system, which consists of linear equations, is then solved. The numerical DC fault analysis technique proposed in that study decreases the number of simulations required to determine the resistance between arbitrary nodes of a circuit. The total simulation time for the selected faults can be reduced further by using many CPUs in parallel. In this work, the main focus is on reducing the simulation time of specific defects. One strong point of this method, which uses DC simulation, is that it is also applicable to nonlinear circuits.
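
The one-step approximation and the fault ordering idea attributed to [64] can be illustrated with a minimal Python sketch. The circuit (a voltage source, a series resistor, and a diode, with a resistive bridge across the diode as the defect), the component values, and the ordering criterion (distance of the one-step approximation from the fault-free solution) are assumptions made for this example and are not the implementation of [64].

```python
import math

VS, R, IS, VT = 5.0, 1e3, 1e-14, 0.02585  # assumed circuit values

def residual(v, r_bridge):
    """KCL residual at the diode node and its derivative.

    r_bridge models a resistive short across the diode (inf = fault-free).
    """
    e = math.exp(v / VT)
    g_bridge = 0.0 if math.isinf(r_bridge) else 1.0 / r_bridge
    f = (v - VS) / R + IS * (e - 1.0) + g_bridge * v
    df = 1.0 / R + IS * e / VT + g_bridge
    return f, df

def newton(v0, r_bridge, tol=1e-9, max_iter=500):
    """Newton-Raphson (NR) solve; returns (solution, iterations used)."""
    v = v0
    for k in range(1, max_iter + 1):
        f, df = residual(v, r_bridge)
        step = f / df
        v -= step
        if abs(step) < tol:
            return v, k
    raise RuntimeError("NR did not converge")

def one_step(v_good, r_bridge):
    """Approximate faulty solution: a single NR step from the good solution."""
    f, df = residual(v_good, r_bridge)
    return v_good - f / df

def ordered_campaign(faults):
    """Order faults by their one-step approximation, then warm-start each
    NR solve from the solution of the previously simulated faulty circuit."""
    v_good, _ = newton(0.0, math.inf)
    approx = {rb: one_step(v_good, rb) for rb in faults}
    order = sorted(faults, key=lambda rb: abs(approx[rb] - v_good))
    results, total_iters, v_prev = {}, 0, v_good
    for rb in order:
        v, iters = newton(v_prev, rb)
        results[rb] = v
        total_iters += iters
        v_prev = v
    return results, total_iters

def cold_campaign(faults):
    """Reference campaign: every faulty circuit is solved from v = 0."""
    results, total_iters = {}, 0
    for rb in faults:
        v, iters = newton(0.0, rb)
        results[rb] = v
        total_iters += iters
    return results, total_iters

if __name__ == "__main__":
    faults = [100.0, 1e5, 1e3, 1e4, 300.0]  # hypothetical bridge values (ohms)
    _, warm = ordered_campaign(faults)
    _, cold = cold_campaign(faults)
    print(f"NR iterations with ordering: {warm}, from cold start: {cold}")
```

Both campaigns reach the same operating points, but the ordered campaign needs fewer NR iterations in total because each solve starts close to its solution.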

I. Other Simulation Methods
There are other methods in the literature regarding defect injection and simulation of analog circuits. Fraccaroli et al. [71], [79] proposed different methodologies to speed up the simulation process by simulating functional-level models. Another method, proposed in [98], uses graph techniques to partition the circuit into independent subcircuits. Performing fault simulation on subcircuits is expected to be more time efficient than simulating the complete circuit at once. A novel approach has been presented in [23] for test vector generation and parametric fault simulation. The statistical models of the faulty and fault-free circuits are generated based on the sensitivity and the process information of the principal components of a circuit. In [80], an effective technique is proposed for the AC simulation of multiple catastrophic defects, either open circuits or short circuits. The technique uses the well-known Householder formula from matrix theory to determine the node voltage discrepancies caused by changes in certain circuit components. The major contribution of this article is to provide a systematic method for simulating various combinations of catastrophic defects.
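
The use of the Householder formula can be illustrated in its rank-one special case (also known as the Sherman-Morrison identity). For a nodal system Y v = i and a defect that adds a conductance g between nodes a and b, the faulty voltages follow from the nominal solution as v' = v - [g (u^T v) / (1 + g u^T z)] z, where u is the incidence vector of the defect and z = Y^{-1} u. The Python sketch below checks this on a three-node resistive ladder. The circuit, the element values, and the function names are assumptions for illustration and are not taken from [80]; real conductances are used for brevity, whereas an AC analysis would use complex admittances.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def incidence(n, a, b=None):
    """Vector with +1 at node a and -1 at node b (b=None means ground)."""
    u = [0.0] * n
    u[a] = 1.0
    if b is not None:
        u[b] = -1.0
    return u

def faulty_voltages(y, v_nominal, g, a, b=None):
    """Node voltages after adding conductance g between nodes a and b,
    computed from the nominal solution with a rank-one update."""
    u = incidence(len(v_nominal), a, b)
    z = solve(y, u)  # a real tool would reuse the nominal factorization
    uv = sum(ui * vi for ui, vi in zip(u, v_nominal))
    uz = sum(ui * zi for ui, zi in zip(u, z))
    scale = g * uv / (1.0 + g * uz)
    return [vi - scale * zi for vi, zi in zip(v_nominal, z)]

# Hypothetical ladder: 1 kOhm resistors, 1 mA source into node 0.
Y = [[2e-3, -1e-3, 0.0],
     [-1e-3, 3e-3, -1e-3],
     [0.0, -1e-3, 2e-3]]
I_SRC = [1e-3, 0.0, 0.0]

if __name__ == "__main__":
    v_nom = solve(Y, I_SRC)
    # Short modeled as a 1 S conductance between nodes 1 and 2.
    print("nominal:", v_nom)
    print("faulty :", faulty_voltages(Y, v_nom, 1.0, 1, 2))
```

The nominal system is solved once, and each defect then costs one additional solve with the unchanged matrix Y instead of the assembly and solution of a new faulty system. An open defect can be modeled in the same way by passing the negated branch conductance as g.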

J. Commercial Tools for Defect Injection and Simulation
In the literature, some articles describe tools for automatic fault injection and simulation [39], [88]. Cadence has introduced the Legato Reliability Solution to support analog defect simulation. The tool offers designers various options to accelerate the defect simulation and allows the testability of systems to be explored [84]. In [93], an open-source tool that can be used for transistor-level defect injection has been proposed. The impact of user-defined, open, short, and extreme variation defects modeled within a schematic or layout-extracted netlist can be measured in Tessent DefectSim by Siemens electronic design automation (EDA) [99]. Tessent DefectSim first randomly selects the defects to inject. In [85], an interface for analog defect injection and simulation based on Saber has been discussed.
Sunter [2] proposed a flow for more efficient simulation of defects, consistent with the IEEE P2427 draft standard. A MATLAB/Simulink-based framework for fault simulation of linear analog circuits has been proposed in [67]. The simulation starts by identifying the type of defect and then constructs a signal flow graph (SFG) of the corresponding faulty circuit. In [32], another framework, SLIDER, has been proposed for injecting and simulating layout-level defects.

V. RESEARCH/TOOLING GAPS AND FUTURE DIRECTIONS
We have presented several state-of-the-art techniques for injecting and simulating transistor-level defects. In this section, we describe the limitations that prevent those techniques from being used effectively. We also describe the possible links between those techniques and new approaches in this research field. Defect injection techniques are mainly characterized by two aspects: 1) which defect model they inject and 2) the location where it is injected inside the device under test (DUT). The defect models represent an unwanted physical change in a circuit element or connection that is not within fabrication requirements. Unfortunately, there are several ways of modeling a defect's physical behavior, which has led to the definition of different modeling techniques. Furthermore, each defect model can be injected into several locations of the DUT. Simulating all of them is infeasible given the number of defects resulting from the combination of those two dimensions (i.e., defect model and location). A designer must find a meaningful subset of all possible defects that allows fault diagnostic metrics to be computed properly and in the shortest possible time. Without standard guidelines on how to model defects, choose defect locations, and perform defect selection, each designer is bound to use user- or company-defined techniques and metrics. The first attempt to standardize analog defect models for the different components of SPICE-based languages is the IEEE P2427 draft standard [4], currently under final revision. Some works have been proposed to generalize the analog defect models by defining them at the behavioral level with the Verilog-A language [91]. These behavioral fault models are injected directly into transistor-level descriptions by using preprocessor commands. Consequently, this approach is challenging to apply, but it can be regarded as the first work that tries to define generic defect templates suitable for different simulators.
Table III summarizes the articles included in the survey, subdivided into macro-categories. Some categories report more publications in recent years (see Table II for the complete list ordered by year). For example, techniques that perform fault grouping or behavioral modeling are the most frequent in the latest articles published in this area. In contrast, some approaches are outdated and no longer pursued, e.g., Monte Carlo simulation-based approaches. The table presents only the articles related to fault simulation techniques and not the tools for simulating transistor-level descriptions with injected defects, so the total number of articles in Table III is lower than the total number of articles in Table II. All the techniques listed in Table III aim to reduce the overall simulation time required to perform a complete defect injection campaign based on the defect universe. These techniques can be subdivided into three groups: those that 1) try to reduce the simulation time required to simulate one defect at a time; 2) try to reduce the total number of defects to be simulated; and 3) combine both of the previous approaches. Performing a global defect injection campaign for a DUT has become a time-consuming task due to the ever-increasing size of analog circuits. These techniques are usually bound to (or developed for) a specific simulator, which heavily hinders the efforts of both industry and academia to improve them. The trend is to move away from continuous-time models of computation and to explore discrete-time or event-driven ones instead. However, this transition requires generating new abstract models, starting from the original SPICE descriptions. Consequently, many works in the literature concern techniques that try to reduce the overall number of defects to be simulated by relying on the equivalence of fault behaviors or by using abstracted behavioral-level models. Another critical aspect of some techniques is related to the probability distribution of the defects used to compute the appropriate set of defects to be injected, e.g., likelihood-based techniques. To improve these, techniques are needed to automatically relate the design layout of integrated circuits (ICs) with the probability distributions associated with the faults, in order to define appropriate metrics for each fault.

Fig. 12. Comparison between the different fault simulation techniques using different criteria: the ability to reduce the number of defects, fault injection throughput, ability to generate abstract defect models, structural simplification, convergence stability, reuse of partial results, and support for AMS simulation. The score is related to the different criteria used to evaluate the simulation techniques, and it goes from zero to five, where zero means that the corresponding technique does not meet the required capabilities, while five means that the technique can efficiently implement the required capabilities.

Each metric in Fig. 12 has an associated score ranging from zero to five. A score of zero means that the considered technique does not meet the corresponding capabilities, while a score of five means that the technique can efficiently implement the corresponding capabilities. For example, the techniques that perform fault simulation by exploiting fault grouping have a score of 5 in the metric "ability to reduce the number of defects to simulate" because none of the other techniques have similar technical characteristics. In contrast, simulation using fault sensitivity analysis techniques has a score of 1 in the metric "AMS simulation support" because its capability to support the co-simulation of the analog and digital parts is limited. A possible research direction to reduce the number of defects to simulate is combining layout-level descriptions with transistor-level descriptions to annotate the latter automatically. By exploiting this concept, it should be possible to inject and simulate only the defects that are reasonably likely within the defect universe, as specified in the IEEE P2427 draft standard. However, the work in [18] showed that simulating only the most likely defects can still result in too many defects to simulate for an industrial circuit. Another possible research direction is moving to event-driven simulations, that is, simulations based on discrete-time events in the analog domain. Generally, when people speak about reduced models, they think about real number models (RNMs), which are models based on simple resolution functions. A resolution function makes some assumptions about the circuit, and when a fault violates them, the violation is easy to identify with checkers. The problem is that these techniques do not allow defects on the interfaces to be modeled, and consequently, only a reduced subset of defects can be described. Recently, the technique attracting the most interest exploits SystemVerilog models refined by a novel proposal from Cadence [100], which defines linear elements and a resolved, bidirectional signal type, called EEnet, with its resolution function.

VI. CONCLUSION
Analog defect injection and fault simulation techniques are EDA methodologies used to compute the diagnostic coverage of analog circuits to increase the functional safety of ICs. These topics are interesting for researchers because there remain many research gaps and a lack of standard specifications. Defect simulation in large ICs is time-consuming and challenging for designers because it requires prior knowledge of the design layout, the probability distribution of the defects, and the defect models to be injected. This survey presents a complete literature review of defect injection and simulation techniques used in the analog domain. Moreover, the critical aspects and limitations of the different techniques, grouped in macro-categories, are discussed. Finally, the evolution trends of this research field are presented, together with some suggestions based on the authors' knowledge of this field.

Fig. 2. PRISMA-based flowchart used for the research and selection of the articles.

Fig. 3. Entire set of papers retrieved through the search string (see (1)), subdivided into scientific sectors. Articles can fall into several scientific sectors.

Fig. 5. Transistor-level description of the OPAMP model taken from [6], a set of AMS benchmark circuits for comparing fault-related techniques.

Fig. 6. MOS cross section with oxide trapped charges and interface traps.

Fig. 7. Locations of possible analog defects of a transistor-level description of an inverter.

Fig. 9. Categories in which the selected articles are subdivided in this survey. Each technique is explained in detail in Section IV.

Fig. 11. Fault sensitivity analysis overview (see [26]). (a) Calculation of the results at every time point. (b) Calculation of results only on selected test points.

In Fig. 11(a), the time points are shown for the fault simulation of an a priori chosen conductance bridge. A large number of time points are used to execute the standard transient simulation, whereas the number of time points with test measurements is low, as shown in Fig. 11(b). The aim is to compute the outcome of the faulty circuit only at the measurement time points.

Fig. 12 compares the fault simulation techniques presented in the survey based on different criteria. The criteria used to create the comparison are as follows.
1) Ability to Reduce the Number of Defects to Simulate: The overall number of defects included in the fault injection campaign.
2) Fault Injection Throughput Per Unit Time: The time required to perform the fault injection campaign.
3) Ability to Generate Abstract Defect Models: Identification of failure modes.
4) Structural Simplification: Ability to replace peripheral blocks with abstract representations.
5) Convergence Stability: Risk of nonconvergence of the simulation.
6) Reuse of Partial Results: The ability to reuse previous solutions to increase the solver efficiency.
7) AMS Simulation Support: The ability to support mixed analog-digital simulations.
These metrics make it possible to highlight the key technical characteristics of each simulation technique proposed in Table III.

TABLE I
NUMBER OF PAPERS RETRIEVED FROM THE SELECTED DATABASES THROUGH THE PRISMA-BASED FLOWCHART (SEE FIG. 2)

TABLE II
COMPLETE LIST OF ARTICLES INCLUDED IN THE SURVEY RELATED TO ANALOG DEFECT INJECTION AND FAULT SIMULATION

TABLE III
SUBDIVISION OF THE SELECTED ARTICLES IN MACRO-CATEGORIES RELATED TO FAULT SIMULATION TECHNIQUES