Applications of Phase Change Materials in Electrical Regime From Conventional Storage Memory to Novel Neuromorphic Computing

Phase-change materials, also well known as chalcogenide alloys, have received considerable attention during the last two decades owing to their widespread applications in the electrical storage market, such as phase-change random access memory and phase-change probe memory. In addition to storage devices, their unique electrical properties, which can be dynamically tuned by electrical excitation, have led to numerous novel applications represented by the memristor and memristor-based neuromorphic circuits. These emerging applications allow further exploitation of the potential of phase-change materials and make them advantageous over other storage media such as ferroelectric and magnetic materials. To help researchers understand the role of phase-change materials in these novel applications as well as their importance for citizens' daily life, a comprehensive review that covers not only the traditional storage applications but also these exotic devices becomes imperative. In this review, the chemical structure of phase-change materials and their remarkable electrical properties are first reviewed, followed by an introduction of their applications in the storage field. The physical principles of various emerging electrical devices using phase-change materials are subsequently overviewed together with their state-of-the-art progress. Finally, the prospects of phase-change materials for future non-volatile electrical applications that are yet to be explored are envisaged.


I. INTRODUCTION
'Digitalization' is now part of every citizen's daily life and generates a tremendous amount of digital data all over the world. To outpace the growth rate of global data, the capacities of current mass storage devices need to be considerably boosted. However, conventional mass storage devices such as the magnetic hard disk [1], the optical disc [2], and magnetic tape [3] are approaching their respective physical limits, which severely hinders any further enhancement of their storage capacity. On the other hand, the data processing speed of the modern computer is restricted by the well-known von Neumann bottleneck, which arises because the storage device (i.e., memory) is separated from the data processing device (i.e., central processing unit (CPU)) [4].
The associate editor coordinating the review of this manuscript and approving it for publication was Andrei Muller.
This architecture forces data to be transferred back and forth between the memory and the CPU, and thereby impairs the data processing efficiency. Additionally, artificial neural networks (ANNs) have been explored and successfully applied in various fields such as image and pattern recognition [5], machine translation [6], and beating humans at the game of Go [7]. In spite of this progress in neuromorphic computing, the hardware implementation of ANNs is impaired by the fact that the key component of modern computers, i.e., the digital transistor, does not operate in the same fashion as the analog synapse that serves as the basic building block of the biological neural network. As a result, more innovative technologies are highly desired so as to revolutionize the traditional storage memories and ANNs.
FIGURE 1 (caption, partially recovered). Crystalline state is achieved by heating amorphous PCMs above T_glass (glass transition temperature), while heating the crystalline state above T_melt (melting temperature), followed by a quick quenching, can reversibly turn it back to the amorphous state. (a) is reprinted with permission from [14].
One promising solution to the above issues is to exploit novel materials that exhibit distinct physical attributes at the nanoscale or even smaller regime.
Different physical properties can usually be used to represent binary codes (i.e., '1' and '0'), and sustaining such a property difference even when downscaled to the nanoscale implies the possibility of achieving ultra-high storage capacity. To find the targeted materials, the majority of efforts have been devoted to material characterization and synthesis, leading to the advent of several candidates such as ferroelectric materials [8]-[10], magnetic materials [11]-[13], and resistive phase-change materials (PCMs) [14]-[16]. Among these candidates, PCMs have attracted more attention due to their great downscalability, fast switching speed, long cycling endurance, and stable data retention, and have thus found widespread applications in the electrical storage market, mainly in phase-change random access memory (PCRAM) and phase-change electrical probe memory. Given these superior storage features, numerous research papers and US patents on phase-change memories have been published during the last two decades. As the electrical properties (e.g., resistivity) of PCMs can be continuously tailored by different external stimuli, PCMs have most recently opened several prospective applications ranging from the memristor to brain-inspired neuromorphic circuits. To help researchers understand the unique properties of PCMs and the current PCM-based approaches that aim to improve the performance of synaptic devices towards the hardware acceleration of ANNs, a thorough review that outlines the intrinsic electrical properties of PCMs and their respective physical principles when applied to different storage memories and novel electronic devices is of high importance. In this review, the chemical structure of PCMs and their remarkable electrical properties are first reviewed, followed by an introduction of their applications in the storage field, ranging from PCRAM to phase-change probe memory.
The physical principles of various emerging electrical devices using PCMs are subsequently overviewed in association with their state-of-the-art progress. The prospect of PCMs for the future non-volatile applications that are yet to be unraveled is finally envisaged.

II. PHASE-CHANGE MATERIALS
PCMs are generally considered to be any substances that exhibit either a crystalline form, with the atoms arranged in an ordered pattern, or an amorphous form lacking long-range atomic order. However, PCMs suitable for storage applications usually need to satisfy the following requirements [17]: (1) superb down-scalability in the nanoscale regime, for high storage capacity; (2) rapid phase transition, for fast write speed; (3) a relatively low phase-transition temperature, for small energy consumption; (4) great stability at room temperature, for long data retention; (5) a large property difference between the crystalline and amorphous forms, for discernible readout signals. As a result, most of the encouraging PCM candidates fall into the category of the chalcogenide family (Group VI elements, mainly O, S, Se, Te, Po), further divided into the materials along the GeTe-Sb2Te3 pseudo-binary line, the alloys along the GeTe-Sb pseudo-binary line, and the Te-based eutectic alloys, as illustrated in Figure 1(a). Within these groups, the compositions along the GeTe-Sb2Te3 pseudo-binary line, represented by Ge2Sb2Te5 (GST225), have been extensively applied in various phase-change memories [18]-[20].
To induce crystalline GST225, it is essential to heat amorphous GST225 above the glass transition temperature but below the melting point, followed by a slow quenching process. Amorphous GST225 can be achieved by heating its crystalline state above the melting temperature and subsequently subjecting it to rapid cooling. Such transition processes are schematically illustrated in Figure 1(b). It should be kept in mind that the crystallization of GST225 usually displays a so-called nucleation-dominant mechanism, meaning that its crystallization mainly depends on the formation of numerous nuclei inside the amorphous background and their subsequent expansion [21]; this holds particularly for the PCMs located on the GeTe-Sb2Te3 pseudo-binary line. In contrast to GST225, the undoped and doped SbTe materials exhibit a so-called growth-dominant mechanism [22], for which the crystallization proceeds from the crystalline-amorphous interfaces. This allows for a higher crystallization rate. Such crystallization discrepancy is illustrated in Figure 2.
FIGURE 2. Nucleation-dominant and growth-dominant crystallization mechanisms for Ge2Sb2Te5 (left column) and AgInSbTe (right column), respectively. Reprinted with permission from [22].
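The two thermal routes described above can be summarized in a minimal sketch. The temperature thresholds below are illustrative assumptions rather than values taken from this review, the names `Phase` and `phase_after_pulse` are hypothetical, and the model deliberately ignores the time needed for nucleation and growth.

```python
from enum import Enum


class Phase(Enum):
    AMORPHOUS = "amorphous"
    CRYSTALLINE = "crystalline"


# Illustrative thresholds only (assumptions, not measured data).
T_GLASS_C = 150.0  # temperature above which the amorphous phase can crystallize
T_MELT_C = 620.0   # assumed melting temperature


def phase_after_pulse(initial: Phase, peak_temp_c: float, fast_quench: bool) -> Phase:
    """Return the phase left behind by a single heating pulse."""
    if peak_temp_c >= T_MELT_C:
        # Molten material: rapid cooling freezes in the disorder,
        # slow cooling gives the atoms time to order again.
        return Phase.AMORPHOUS if fast_quench else Phase.CRYSTALLINE
    if peak_temp_c >= T_GLASS_C:
        # Between the two thresholds the material ends up crystalline,
        # whether it started amorphous or was already crystalline.
        return Phase.CRYSTALLINE
    # Below the lower threshold nothing changes.
    return initial
```

With these assumed thresholds, a 300 °C pulse crystallizes an amorphous region, whereas a 700 °C pulse followed by fast quenching returns it to the amorphous state.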
The amorphous-to-crystalline phase transition, exemplified by GST225, in fact passes through several intermediate crystalline phases (i.e., metastable phases) and eventually reaches a stable state at higher temperature [23]. It is well known that crystalline GST225 in its first metastable phase presents a rock-salt structure belonging to the cubic system [24], as illustrated in Figure 3(a). Note that in the first metastable phase the cation sites are randomly occupied by 40% Ge, 40% Sb, and 20% vacancies, whereas the anion sites in the cubic structure are occupied by Te atoms. This results in a mixed GeSb/vacancy sublattice that displaces the Ge and Sb atoms from their ideal positions within the Te(GeSb)6 octahedrons [24], [25]. In this case, the amount of vacancies in the cation sublattice determines the extent of the lattice distortions in the GST225 alloys, with a higher amount of vacancies usually causing larger lattice distortions. The aforementioned metastable phase is, however, energetically unfavourable, and upon heating the additional thermal energy gives rise to the rearrangement of Ge and Sb atoms as well as the gradual ordering of the structural vacancies into planes in the GST225 [26]-[29]. Different spatial distributions of the vacancies are thereby obtained, resulting in various metastable GST225 crystal structures (i.e., polymorphism of the metastable GST225 crystal structures). Among these metastable states, the GST225 phase with highly ordered vacancy layers, which comprises GST building blocks periodically separated by vacancy layers between adjacent Te-Te planes, is generally considered as the second metastable phase of GST225 [30], as shown in Figure 3(b). As in the case of the first metastable phase, off-center displacements of the Ge and Sb atoms within the Te(GeSb)6 octahedrons are observed, and the Te-Te distances across the vacancy layers vary with the degree of vacancy ordering.
Such a metastable phase is assumed to have a random distribution of the Ge and Sb atomic species within individual cation layers, while Sb atoms are preferentially located in the cation layers next to the vacancy layers and the Ge species tend to accumulate in the middle of the GST225 building units. The second metastable crystalline phase of GST225 was reported to have a superstructure of the rock-salt type with cubic stacking of the Te layers, in analogy to the vacancy-ordered chalcogenide alloys forming in the Ge-Sb-Te-Sn system [31]. Continued heating of GST225 transforms its metastable phase into the final stable state, which belongs to the trigonal crystal system and exhibits a layered structure with a 9P-type stacking sequence [32], as illustrated in Figure 3(c). The 9 layers make up a building unit of the stable phase, and these building blocks are stacked along the c-axis and periodically separated from each other by van der Waals gaps between the adjacent Te layers.
The GST225 alloy is well known to exhibit a sharp difference in electrical resistivity and optical reflectivity between the crystalline and amorphous states, as verified in Figures 4(a) and 4(b). Such property variations can be attributed to its structural disorder [33]-[35]. The vacancies in the first metastable phase of GST225 are randomly distributed and stabilize the structure [36]. Additionally, the energy is further reduced by the lattice distortions, and the high degree of structural disorder leads to strong electron scattering and low electron mobility, giving rise to the electrically insulating behavior [33], [37]. However, the subsequent ordering of the vacancies into vacancy layers gradually improves the structural order in the cation sublattice of the GST225 materials, which eliminates the localized states and causes a transition from an insulating to a metallic state. Unlike the electrical properties, the difference in optical properties is likely due to the changes in the free carrier absorption caused by the electron localization effects [23], [38], [39]. As revealed by the corresponding electrical measurements, the vacancy-ordered phase (the second phase) was found to have a slightly higher carrier concentration than the disordered phase (the first phase). This may indicate that the increased carrier mobility due to the vacancy ordering causes the contrast in optical parameters such as the absorption coefficient. Another crucial trait of GST225 is its threshold switching phenomenon, in which the resistance of amorphous GST225 experiences a sudden drop once the voltage across the material reaches the so-called threshold voltage [40], as depicted in Figure 4(c). Threshold switching allows a high current to flow at a relatively low voltage, thereby reducing the energy consumption significantly. The physical mechanism governing the threshold switching behavior was originally considered to be thermally induced [41], as the electrical conductivity of GST225 was found to increase at higher temperature. However, as the switching speed of the GST materials is faster than the thermal time constant, the current consensus is that threshold switching results from a purely electronic mechanism [42]. Several plausible models were therefore proposed to interpret the threshold switching behavior. One electronic model attributed threshold switching to the energy gain of the electrons in a high electric field due to a voltage-current instability [43], whereas another model considered the cause of threshold switching to be the formation of crystalline nuclei inside the amorphous region induced by the electric field [44], [45]. Differing from these two models, an impact ionization model attributes threshold switching to secondary carrier generation in the amorphous region [46]. Further experimental demonstration is therefore required to unravel the actual threshold switching mechanism. The thermal conductivity of GST225, which strongly depends on its metastable phases, ranges from 0.05 to 1.76 W m⁻¹ K⁻¹, implying a good heat insulation effect [47].
FIGURE 3. (a) ABC cubic stacking of the Te sublattice with the crystal structure of GST225 at its first metastable phase along <001> crystallographic projections. The bright dots are Te/GeSb/vacancies atomic columns; (b) Atomic-resolution HAADF-STEM image of GST225 at its second metastable phase with the corresponding structural model along the [1120] direction; the dark lines in the image represent vacancy layers. (c) Atomic-resolution Cs-corrected HAADF-STEM image of the trigonal GST225 with the trigonal GST crystal structure. The unit cell is marked by a rectangle and the main structural motif of the GST225 phase is marked by an octahedron. Reprinted with permission from [23].
In contrast to threshold switching, resistance drift, which causes the amorphous-state resistance to increase as a function of time, is an adverse feature of GST225 [48], [49], as demonstrated in Figure 5.
Although such resistance drift may lead to a higher reading contrast, it may severely impair possible multi-level storage applications because it makes different resistance states difficult to discriminate. To date, structural relaxation [49], [50] and mechanical stress relaxation [48] are considered the most likely causes of the resistance drift. Structural relaxation results from the decrease of structural defects such as vacancies and distorted bonds, thus increasing the band gap and the activation energy for conduction while decreasing the trap densities [49], [50]. This in turn extends the trap spacing and raises the barrier for conduction through the trap states, consequently inducing the resistance drift. Mechanical stress relaxation arises from the fact that the density difference between the amorphous and crystalline phases leads to relaxation of the compressive stress developed in the amorphous phase during solidification [51]. Although the theoretical calculations based on these two hypotheses matched the experimental measurements well, the actual physical mechanism governing the resistance drift remains unclear.
FIGURE 4 (caption, partially recovered). ... reflectivity measured as a function of increasing temperature (ramp rate of 10 °C/min) for 100 nm thick GeTe and Ge2Sb2Te5 films. The resistivity measured during cooling down is also shown. The amorphous phase exhibits a high electrical resistivity and a low optical reflectivity, while the crystalline phase shows a much lower resistivity and higher reflectivity. (c) Current-voltage curve for PCMs having threshold switching behavior. V_th denotes the threshold voltage. (c) is reprinted with permission from [40].
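To illustrate why drift hampers multi-level storage, the sketch below uses the empirical power law R(t) = R0 (t/t0)^ν that is commonly used to describe amorphous-state drift. This law and the drift exponent ν = 0.1 are assumptions brought in for illustration and are not stated in this review; the function names are hypothetical.

```python
def drifted_resistance(r0_ohm: float, t_s: float, nu: float, t0_s: float = 1.0) -> float:
    """Resistance after time t_s, assuming R(t) = R0 * (t / t0) ** nu."""
    if t_s <= 0 or t0_s <= 0:
        raise ValueError("times must be positive")
    return r0_ohm * (t_s / t0_s) ** nu


def time_to_reach(r_target_ohm: float, r0_ohm: float, nu: float, t0_s: float = 1.0) -> float:
    """Time at which a drifting level reaches r_target_ohm (inverse of the law above)."""
    if nu <= 0:
        raise ValueError("nu must be positive for the level to drift upwards")
    return t0_s * (r_target_ohm / r0_ohm) ** (1.0 / nu)


# A level programmed at 1 MOhm, with a read boundary to the next level at 2 MOhm.
# Both values and the exponent are illustrative assumptions.
crossing_time_s = time_to_reach(2e6, 1e6, nu=0.1)
```

Under these assumptions the programmed level crosses into the neighbouring resistance window after roughly a thousand seconds, which is the kind of misread the text refers to.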

III. PHASE-CHANGE MEMORIES
Owing to their remarkable property difference, PCMs have been extensively applied, particularly in the storage field. Several well-known PCM devices devised to satisfy various storage requirements have been proposed during the last decade, mainly including PCRAM and phase-change probe memory. PCRAM is undoubtedly the best-known phase-change memory and was invented to replace Flash or even dynamic RAM. The conventional PCRAM, also called Lance-type or ovonic unified memory, consists of a phase-change layer such as GST225 sandwiched between a top metal electrode and a resistive electrode called the heater [52], as illustrated in Figure 6(a). Programming is performed by injecting a programming current into the GST225 via the heater. Under this circumstance, a high current density is obtained at the GST225/heater interface due to the small contact region, which induces phase transformation at this interface once the corresponding phase-transition temperature is reached. During the readout process, a low voltage is applied between the top and bottom electrodes to produce a readout current that differs between the phase-transformed region and its surrounding untransformed area. The storage merits of PCRAM when compared with other non-volatile memories such as Flash, ferroelectric RAM (FeRAM), and magnetic RAM (MRAM) mainly arise from its ability to provide fast write speed, long data retention, low energy consumption, and superb downscaling [53]. Note that the crystallization process of PCRAM is usually named 'SET', while its amorphization process is called 'RESET'. The phase-transformed regions induced during the 'SET' and 'RESET' processes can be recognized in cross-sectional transmission electron microscopy (TEM) images [54], as shown in Figures 6(b) and 6(c). The write speed of PCRAM therefore fully depends on the switching speed of the amorphous-to-crystalline transition, i.e., the width of the 'SET' pulse.
The switching speed of the GST225-based PCRAM was reported to be less than 100 ns [55], lagging behind Flash and the conventional dynamic RAM. One solution is to make use of an atomic-level engineered GST225 (doped through reactive sputtering) to form Ge-Sb/Ge-Te metallic bonding similar to undoped GST225, exhibiting fast SET switching (down to 20 ns) in a 128 Mbit phase-change memory chip [56] (Figure 7(a)). Another promising way to increase the switching speed is to employ post-deposition processing by coating a dielectric capping layer over the PCM element [57]. Data retention time is another critical parameter: at least 10 years at 150 °C is demanded for automotive applications [58], and tens of seconds at ∼260 °C for pre-coded chips [59]. As retention time is dominated by the crystallization temperature, a relatively high crystallization temperature is desired so as to improve the thermal stability. A fairly low crystallization temperature of GST225 (∼150 °C) was found experimentally [60], implying a short retention time. This issue can be addressed using GST212, whose thermal stability can be further improved by increasing the Ge concentration [61], resulting in a crystallization temperature of ∼250 °C. It was found that doping additional elements such as N or C into the aforementioned highly Ge-rich composition allows for a longer retention time [62], [63].
FIGURE 9. Reset current scaling versus electrode contact area in PCM devices. The inset shows a schematic of a typical PCM cell, where A is the contact area and d is the contact diameter. Blue circles represent devices with metal contact electrodes that are lithographically defined, while the black squares are data taken on prototype devices with carbon nanotube electrodes. Reprinted with permission from [78].
The energy consumption of PCRAM is usually determined by its 'RESET' current, which needs to be greatly reduced. Note that the 'RESET' current is supplied by a cell selector connected in series with the PCRAM unit. The downscaling of the cell selector is therefore required to match the downscaling of PCRAM. On the other hand, the size of the cell selector cannot be reduced significantly, because it must provide an adequately high 'RESET' current to induce the density expansion. The traditional approaches for downscaling the 'RESET' current are to shrink either the heater/PCM interfacial region [64], [65] or the PCM programming volume [66]-[68], leading to the advent of several advanced cell structures such as edge-contact [69], µTrench [70], ring-shaped [71], pillar [72], pore [73], cross-spacer [74], and dash cells [75] (Figure 8). According to the 'RESET' current comparison among the aforementioned cells, the 'RESET' current was found to increase in proportion to the effective contact diameter, as illustrated in Figure 9, and an average 'RESET' current density of ∼40 MA/cm² is required to program the PCRAM cell [68]. Engineering the PCRAM cell to sublithographic sizes by introducing extra spacers or using carbon nanotube electrodes leads to a current density down to ∼10 MA/cm² [76]-[78].
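As a rough worked example of what the quoted current densities mean for a single cell, the sketch below multiplies a current density by the area of a circular contact. The uniform current density over a circular contact and the 50 nm diameter are simplifying assumptions made here for illustration; `reset_current_a` is a hypothetical helper name.

```python
import math


def reset_current_a(contact_diameter_nm: float, current_density_ma_per_cm2: float = 40.0) -> float:
    """RESET current in amperes for a circular contact, I = J * A.

    The default density is the ~40 MA/cm^2 average quoted in the text.
    """
    radius_cm = contact_diameter_nm * 1e-7 / 2.0  # 1 nm = 1e-7 cm
    area_cm2 = math.pi * radius_cm ** 2
    return current_density_ma_per_cm2 * 1e6 * area_cm2  # MA/cm^2 -> A/cm^2


# Hypothetical 50 nm contact at the two densities quoted in the text.
i_conventional = reset_current_a(50.0)          # ~40 MA/cm^2
i_sublitho = reset_current_a(50.0, 10.0)        # ~10 MA/cm^2
```

For the assumed 50 nm contact this gives a RESET current somewhat below 1 mA at ∼40 MA/cm², and a quarter of that at ∼10 MA/cm², which indicates the drive current the cell selector has to deliver.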
A paradigm that departs from the conventional methods of reducing the 'RESET' current is the interfacial phase-change memory concept, which utilizes a superlattice PCM stack formed by alternating two crystalline layers with different compositions [79], as illustrated in Figure 10. The principle of such interfacial phase-change memory depends on the alignment of the c-axis of a hexagonal Sb2Te3 layer and the <111> direction of a cubic GeTe layer in a superlattice, which switches Ge atoms between the octahedral sites and lower-coordination sites at the interfaces between superlattice layers [80]. As its switching mechanism was reportedly dominated by the limited movement of the Ge atoms, with the Te-Te conduction channel inducing the difference in resistance [81], the phase transition of the interfacial phase-change memory can be achieved without reaching the melting temperature, and the 'RESET' current was reduced to 3 µA with a Sn10Te90/Sb2Te3 superlattice PCRAM cell [79]. Besides the above properties, the enhancement of the storage capacity of PCRAM heavily relies on the scalability of the cell structure. The minimum PCM thickness was theoretically predicted to be ∼2 nm, at which the corresponding threshold voltage was estimated to be ∼0.1 V, assuming that the threshold voltage is proportional to the layer thickness [82]. Such a low threshold voltage cannot be distinguished from the readout voltage and causes readout errors. One possible solution is to develop a phase-change nanowire device that gives rise to a constant threshold voltage below 10 nm thickness [83]. An alternative strategy is to implement a PCRAM cell whose active device area is a nanoscale gap formed between two carbon nanotube electrodes through electrical breakdown [76], [77]. This novel design makes the threshold voltage vary proportionally with the nanogap size, suggesting an average field of 100 V/µm for GST225.
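The proportionality between threshold voltage and active size stated above can be checked with a short calculation. The sketch assumes a simple linear relation V_th = E × size; the 20 nm gap and the 0.2 V read voltage are hypothetical values chosen only for illustration.

```python
def threshold_voltage_v(size_nm: float, field_v_per_um: float) -> float:
    """Threshold voltage for a linear model V_th = E * size."""
    return field_v_per_um * size_nm * 1e-3  # 1 nm = 1e-3 um


def implied_field_v_per_um(vth_v: float, size_nm: float) -> float:
    """Average field implied by a given threshold voltage and size."""
    return vth_v / (size_nm * 1e-3)


def read_margin_v(vth_v: float, v_read_v: float) -> float:
    """Voltage headroom between reading and unintended switching."""
    return vth_v - v_read_v


# ~0.1 V at ~2 nm, as quoted in the text, corresponds to an average field of about 50 V/um.
film_field = implied_field_v_per_um(0.1, 2.0)

# Hypothetical 20 nm nanogap at the 100 V/um average field quoted for GST225.
gap_vth = threshold_voltage_v(20.0, 100.0)

# Negative headroom means a read pulse could itself trigger switching.
thin_film_margin = read_margin_v(0.1, 0.2)
```

With the assumed 0.2 V read voltage, the ∼2 nm film has negative headroom, which is the readout problem described in the text, while the hypothetical 20 nm gap switches at about 2 V.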
It is necessary to point out that the threshold voltage of a given PCRAM cell also varies with time due to the aforementioned structural relaxation effect, which raises the energy barrier for sub-threshold conduction [84] and thus results in an increase of resistance and a decrease of leakage current. The lower leakage current also gives rise to a higher threshold voltage, producing a drift behavior in which the threshold voltage increases linearly with the logarithm of time. As a result, 'SET' programming may become difficult once the threshold voltage rises above the nominal voltage for the 'SET' process. One feasible way to overcome such a threshold voltage drift problem is to tailor the programming voltage. Alternatively, it is possible to alleviate drift by adopting the bipolar switching mechanism, which uses opposite signal polarities for the 'SET' and 'RESET' processes [85]. Bipolar switching produces different resistance states through ionic migration rather than phase transformation. The high resistance state in the bipolar switching mode is obtained by applying a high negative voltage at the top electrode, causing a depletion of cation species (Ge, Sb) at the bottom electrode contact. As the 'RESET' state does not rely on an amorphous region, structural relaxation and drift are fully suppressed in this state. In spite of the aforementioned progress, the key requirement for improving the scalability of PCRAM is to devise a cell selector that not only provides a sufficiently high current density for 'RESET', but also exhibits great scalability. As a result, the conventional MOSFET selector was ruled out from the candidate list due to its failure to offer the required current density.
In this case, several alternative cell selectors such as bipolar transistors [86], p-n junction diodes [86], Schottky diodes [87], metal-insulator-transition devices [88], and ovonic threshold switches (OTS) [89] have been under extensive research. One prospective 3D-stackable access device based on mixed ionic-electronic conduction (MIEC) has more recently been proposed [90]. Such an access device, which depends on Cu ion motion in novel Cu-containing MIEC materials, can be fabricated at temperatures commensurate with back-end-of-line processing (∼400 °C), exhibiting a scalability of < 30 nm and a current density of up to 50 MA/cm² [91]. These encouraging findings have prompted the advent of an all-3D vertical chain cell type phase-change memory (VCCPCM) based on a poly-Si diode [92], giving rise to excellent scalability with a reasonable crystallization temperature.
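The threshold-voltage drift described earlier in this section, in which V_th increases linearly with the logarithm of time, can be sketched as follows. The initial threshold voltage, the slope per decade, and the nominal 'SET' voltage are illustrative assumptions, and the function names are hypothetical.

```python
import math


def threshold_voltage_after(t_s: float, vth0_v: float, slope_v_per_decade: float,
                            t0_s: float = 1.0) -> float:
    """V_th(t) = V_th0 + slope * log10(t / t0), linear in the logarithm of time."""
    if t_s <= 0 or t0_s <= 0:
        raise ValueError("times must be positive")
    return vth0_v + slope_v_per_decade * math.log10(t_s / t0_s)


def time_until_set_fails(v_set_v: float, vth0_v: float, slope_v_per_decade: float,
                         t0_s: float = 1.0) -> float:
    """Time at which the drifting V_th reaches the nominal SET voltage."""
    if slope_v_per_decade <= 0:
        raise ValueError("slope must be positive for V_th to drift upwards")
    return t0_s * 10.0 ** ((v_set_v - vth0_v) / slope_v_per_decade)


# Illustrative numbers: V_th starts at 1.0 V, drifts by 50 mV per decade,
# and the nominal SET pulse amplitude is 1.2 V.
t_fail_s = time_until_set_fails(1.2, 1.0, 0.05)
```

With these assumed values the threshold voltage reaches the nominal 'SET' voltage after four decades of time, which is why the text suggests tailoring the programming voltage to the expected drift.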
FIGURE 11. Designed scanning probe memory system (left) and its storage operations in write (top right) and readout (bottom right) modes. A 2D cantilever array is usually implemented for scanning probe storage applications to record data on a mobile sample. The write process is achieved by thermally inducing the phase transformation via a high write current, and sensing the resistance difference between the phase-transformed region and its surroundings realizes the readout process. Updated and reprinted with permission from [93], [94].
Scanning probe phase-change memory, which comprises a nanoscale electrical probe and a storage stack having a PCM layer sandwiched between a capping layer and a bottom electrode [93], [94], as illustrated in Figure 11, was proposed to compete with the magnetic hard disk as the next-generation mass storage device. Recording in phase-change probe memory is performed by injecting a write current into the PCM layer via a conductive probe, and the resulting Joule heating gives rise to the required phase transformation. To perform the readout process, a low voltage is applied to the PCM layer through the conductive probe, and the resulting readout current is detected to distinguish the phase-transformed region from the untransformed region. The advantages of PCRAM such as fast switching speed, low energy consumption, and long data retention are inherited by phase-change probe memory. Moreover, the size of the recorded bit is approximately proportional to the probe dimension. This clearly indicates that using a probe with a sharp tip apex (i.e., a small probe diameter) in general allows for ultra-high storage capacity. As the write current prefers to follow a conductive path, the recorded bit size in fact relies much more on the conductive diameter of the electrical probe than on its physical diameter.
As a result, the probe tip used for phase-change probe memory only needs to be electrically sharp [95], [96], and may have a blunt physical diameter. This advantage over conventional probe concepts reduces the pressure at the probe/sample interface and significantly improves the anti-wear characteristic of the probe.
The design of phase-change probe memory mainly focuses on its media stack and probe tip. Based on the aforementioned design rule, a SiO2-encapsulated Si probe with PtSi at the tip apex has been widely implemented for phase-change probe memory, as schematically described in Figure 12(a). The SiO2 encapsulation increases the physical diameter of the probe tip and thus extends its longevity. Thanks to the high electrical conductivity of PtSi, forming PtSi at the tip apex improves the electrical conduction performance of the probe tip. In addition, using PtSi also benefits the anti-wear characteristic of the probe tip (Figure 12(b)). More efforts have therefore been devoted to the optimization of the media stack, including the capping layer, the phase-change layer, and the bottom layer. The bottom layer, which acts as the bottom electrode to collect current, was reportedly found to have only a slight impact on the write and readout performances of phase-change probe memory [97], [98]. However, the bottom layer is commonly desired to have a large electrical conductivity and a low thermal conductivity to provide adequate current density and reduce heat dissipation towards the substrate [97]. Several bottom electrodes with different compositions such as diamond-like carbon (DLC) [99], metal [100], and TiN [101], [102] have been implemented in the past. The thermal conductivity of the DLC layer can be experimentally minimized to 0.2 W m⁻¹ K⁻¹ [103], giving rise to a good thermal insulation effect. However, its electrical conductivity remains low (∼100 Ω⁻¹ m⁻¹) even if its thickness is increased to 30 nm [104]. This increases the device resistance and causes larger energy consumption for phase transformation. A metal electrode such as Pt exhibits an ultra-high electrical conductivity as well as a high thermal conductivity. Such a high thermal conductivity causes serious heat loss towards the substrate and also costs extra write energy.
In this case, a TiN bottom electrode has been commonly adopted due to its high electrical conductivity (∼5 × 10⁶ Ω⁻¹ m⁻¹) and relatively low thermal conductivity (∼12 W m⁻¹ K⁻¹) [105]. The phase-change layer plays an important role in the write performance of phase-change probe memory, as the corresponding threshold voltage is strongly dominated by its thickness. As a result, a thin phase-change layer is preferable because it provides a lower threshold voltage [106]. However, the phase-change layer seems to barely influence the resulting reading contrast [107], which is likely due to the fact that the device resistance is mainly determined by the capping layer rather than the phase-change layer. The capping layer is mainly implemented to protect the phase-change layer from wear and oxidation. DLC, which has been used as a protective coating in objects such as magnetic storage disks, car parts, and optical windows due to its high hardness and great thermal stability [108], was previously considered the most appropriate composition for the capping layer [99], [105], [109]. More importantly, as the capping layer acts as a conductive bridge connecting the probe with the phase-change layer, its thickness and electro-thermal properties need to be carefully determined. A thin capping layer with a high electrical conductivity and a low thermal conductivity is favourable owing to its capability of generating the required phase transition under a low current bias [110]. In contrast, a larger reading contrast is usually obtained with a capping layer having a relatively low electrical conductivity, since a capping layer with higher electrical conductivity intensifies the current spreading effect and thus lowers the reading contrast.
Based on the results presented so far, a thin DLC capping layer with an intermediately high electrical conductivity and a low thermal conductivity is required in order to accomplish the expected phase transformation with high reading contrast and the least energy consumption. The optimized capping layer is therefore predicted to have a thickness of 5 nm, an electrical conductivity of ∼100 Ω⁻¹ m⁻¹, and a thermal conductivity of ∼0.5 W m⁻¹ K⁻¹ [111]. Nevertheless, similar to the bottom electrode, the capping layer in practice needs to be at least 30 nm thick in order to achieve the theoretically optimized value, which makes phase transformation extremely difficult. Besides, a DLC capping layer introduces a large contact resistance at the probe-capping interface and costs extra write energy [112]. As the contact resistance is mainly determined by the electrical resistivities of the probe and the capping layer, it is timely to search for a thin capping layer that nevertheless offers a high electrical conductivity and a low thermal conductivity. Triggered by this quest, two additional capping layers, made of TiN [113], [114] and ITO [115] respectively, have most recently been proposed. As suggested by its bottom electrode application, TiN enables a high electrical conductivity and a relatively low thermal conductivity at a thickness of 2 nm, satisfying the above requirement. Similarly, a 5 nm thick ITO layer exhibits an electrical conductivity of 10³ Ω⁻¹ m⁻¹ and a thermal conductivity of 0.84 W m⁻¹ K⁻¹ [115], consequently leading to a low contact resistance. Although both TiN and ITO capping layers provide satisfactory write performance, their ability to produce a distinguishable readout signal remains questionable.
This is because the capping layer presents a much more electrically conductive path than the phase-change layer in this case, and the readout current prefers to flow along the capping layer rather than through the phase-change layer, severely deteriorating the reading contrast.

FIGURE 13. (a) Readout mechanism of phase-change probe memory using an optical means (left) and device transmission variations along with the laser spot location for both crystalline and amorphous arrays (right). Tr₀ corresponds to the minimum device transmission within the various configurations previously investigated and ΔTr represents the transmission difference between any other configuration and Tr₀. (b) Simulated geometrical stack of the designed patterned probe phase-change memory in front view (left) and the writing of the crystalline bit array using the aforementioned design (right). (a) is reprinted with permission from [116]. (b) is reprinted with permission from [117].
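The write-side preference for a conductive capping layer can be illustrated with a rough one-dimensional estimate of the vertical resistance of each candidate film, R = t/(σA). The sketch below is only indicative: the 10 nm contact radius is a hypothetical value, the TiN conductivity is borrowed from the bottom-electrode figure quoted earlier, and current spreading and interfacial contact resistance are deliberately ignored.

```python
import math

def layer_resistance(thickness_m, conductivity_s_per_m, contact_radius_m):
    """Vertical resistance R = t / (sigma * A) of a film under a circular contact."""
    area = math.pi * contact_radius_m ** 2
    return thickness_m / (conductivity_s_per_m * area)

# Hypothetical tip contact radius (assumption, not taken from the text).
CONTACT_RADIUS = 10e-9

# (thickness in m, electrical conductivity in S/m) as quoted in the text.
capping_layers = {
    "DLC": (5e-9, 1e2),
    "ITO": (5e-9, 1e3),
    "TiN": (2e-9, 5e6),  # assumption: bottom-electrode conductivity reused
}

resistances = {
    name: layer_resistance(t, sigma, CONTACT_RADIUS)
    for name, (t, sigma) in capping_layers.items()
}

for name, r in resistances.items():
    print(f"{name}: {r:.3e} ohm")
```

Under these assumptions the DLC film is the most resistive and the TiN film the least, which is consistent with the lower write energy but also with the weaker reading contrast discussed for highly conductive capping layers.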
One possible solution to the above issue is to develop an optical means to distinguish the recorded mark from its background [116], as shown in Figure 13(a). A mobile laser beam, delivered through a scanning near-field optical microscope, is focused on the surface of a previously recorded phase-change layer. Due to the sharp transmission difference between the crystalline and amorphous states, the device transmission obtained when the laser beam is focused on the recorded mark differs strongly from that obtained when it is focused on the background. The recorded mark can therefore be detected by sensing the transmission variation. However, matching the focused laser spot exactly to the size of the recorded mark remains challenging. Another strategy, as shown in Figure 13(b), is to fabricate patterned PCM media [117]. Instead of a continuous PCM layer, the patterned PCM media comprise numerous patterned PCM cells isolated by insulator regions such as SiO₂. To protect the cells from oxidation and wear, a capping layer can also be patterned and deposited above each PCM cell. Owing to the great electrical and thermal insulation of the isolating regions, phase-change probe memory with patterned media eliminates the thermal and readout cross-talk effects and allows for ultra-high density with a discernible reading contrast [114], [117].
The deep understanding of PCMs and the technological development of phase-change memories have most recently revealed another promising application of PCMs: storage-class memory (SCM), whose target is to complement the existing memory hierarchy and operate at a performance and cost between those of NAND Flash and dynamic RAM (DRAM) [118]. The implementation of PCMs in the future SCM market can be categorized into a memory-type SCM, which prioritizes write speed and serves as a nonvolatile DRAM, and a storage-type SCM, whose goal is to lower the cost and replace NAND Flash [119]. Figure 14(a) shows the access time versus cost of various memory technologies and the desired range of operation required for SCM. SCM is expected to fill the gap defined by an access time below 10 µs and a die cost below $1/GB, and thus to complement the current market.
Developing a storage-type PCM-based SCM requires a significant enhancement of the recording density, which is likely to be achieved by integrating sub-10 nm feature size devices, high-density 4-6 F² structures, multiple stable resistance states, and a low reset current density (< 1 MA/cm²) [119]. In contrast, high endurance (≥ 10¹² cycles) and fast 'SET'/'RESET' speeds (< 50 ns) are two essential parameters for a memory-type PCM-based SCM. As a result, expanding the conventional 2-D phase-change devices into a 3-D architecture has recently attracted vast attention due to its ability to further increase the density and to improve the endurance. One intriguing 3-D structure, which has already been realized at the industry level, is the 3D XPoint memory [120], as shown in Figure 14(b). The 3D XPoint structure has multiple memory cells stacked vertically that alternately share either wordlines or bitlines with the stack above them. Although the key material compositions adopted in 3D XPoint remain controversial, it is generally suspected to make use of PCMs with an Ovonic threshold switch selector. As the 3D XPoint memory involves a lithography and patterning step for each layer, the number of layers is possibly restricted to single digits (the exact number depends greatly on the yield and cost of each step required). Such 3-D stacking can also reduce the device cost, because the peripheral circuitry requires a constant number of process steps whereas the steps to deposit the layers increase linearly with the number of layers. Stacking layers can therefore reduce the overall cost per bit, particularly when the cost of the peripheral circuitry is much higher than that of the memory stacks [121].
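The cost argument for stacking can be made concrete with a simple model in which the peripheral circuitry contributes a fixed cost and every memory layer adds the same cost and the same number of bits. This is only a sketch: the function name and the numerical values (in arbitrary units) are hypothetical and chosen purely for illustration.

```python
def cost_per_bit(n_layers, periphery_cost, layer_cost, bits_per_layer):
    """Cost per bit when a fixed periphery cost is shared by n stacked layers."""
    total_cost = periphery_cost + n_layers * layer_cost
    return total_cost / (n_layers * bits_per_layer)

# Hypothetical values in arbitrary units: periphery ten times costlier than a layer.
PERIPHERY_COST = 10.0
LAYER_COST = 1.0
BITS_PER_LAYER = 1.0

costs = [cost_per_bit(n, PERIPHERY_COST, LAYER_COST, BITS_PER_LAYER)
         for n in range(1, 9)]

for n, c in enumerate(costs, start=1):
    print(f"{n} layer(s): cost/bit = {c:.3f}")
```

In this model the cost per bit falls monotonically towards the per-layer cost as layers are added, and the saving is largest when the periphery dominates the total cost, in line with the qualitative statement above.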

IV. PHASE-CHANGE MEMRISTORS
The memristor, which describes a non-linear relationship between the magnetic flux and the electric charge, is well known as the fourth fundamental circuit element in addition to the resistor, capacitor, and inductor, as illustrated in Figure 15(a).
The mathematical model of the memristor, proposed by Chua, was defined as [122]:

$$v = R(w, i)\, i \qquad (1)$$

$$\frac{dw}{dt} = f(w, i) \qquad (2)$$

where $v$ is the voltage across the device, $i$ is the current through it, $w$ can be an internal state variable, and $R$ and $f$ can in general be explicit functions of time. Equations (1) and (2) relate the voltage across the device to the current through it at any particular time. In spite of the mature theory, the physical realization of a memristor device remained undeveloped until the presentation of a nanoscale TiO₂ device model that theoretically mimicked the memristive behavior [123]. This breakthrough brought about the prosperity of memristor technology in terms of both theoretical models [124], [125] and experimental fabrication [126], [127], leading to many technical publications and numerous US patents. The most attractive feature of the memristor arises from its current-voltage curve, which displays a pinched hysteresis loop, as illustrated in Figure 15(b). Although such a pinched hysteresis loop always passes through the origin, various shapes can be obtained depending on the frequency and amplitude of the input signals [128]. As the frequency of the input signal tends to infinity, the memristor degenerates into an ordinary resistor. In addition, Figure 15 reveals that the memristor maintains in its state variable the information on the electric charge/magnetic flux that has passed through it rather than the charge/flux itself, which suggests a capability of remembering its most recent state when powered off. This property makes the memristor ideally suited for non-volatile storage. Most importantly, the memristor resistance can be dynamically modified by the input excitation, whereby the memristor can be implemented to simulate the functions of a brain synapse whose biological state (i.e., weight) can be adjusted by the interactive strength between its pre- and post-neurons (refer to Section V for more details).
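The frequency dependence of the pinched hysteresis loop can be reproduced with a short numerical experiment. The sketch below assumes a linear ion-drift form of the state equation, which is commonly used for TiO₂-type devices; this specific form and all parameter values (on/off resistances, film thickness, mobility, initial state) are illustrative assumptions rather than values taken from the cited works.

```python
import math

def simulate_memristor(freq_hz, v0=1.0, r_on=100.0, r_off=16e3,
                       d=10e-9, mu=1e-14, x0=0.1, steps_per_period=20000):
    """Integrate one period of a sinusoidally driven linear-drift memristor.

    State x = w/D in [0, 1]; memristance M(x) = r_on*x + r_off*(1 - x);
    state equation dx/dt = (mu * r_on / d**2) * i (assumed form).
    Returns lists of voltage, current, and memristance samples.
    """
    k = mu * r_on / d ** 2
    dt = 1.0 / (freq_hz * steps_per_period)
    x = x0
    v_trace, i_trace, m_trace = [], [], []
    for n in range(steps_per_period + 1):
        v = v0 * math.sin(2.0 * math.pi * freq_hz * n * dt)
        m = r_on * x + r_off * (1.0 - x)
        i = v / m  # the current vanishes whenever the voltage does (pinched loop)
        v_trace.append(v)
        i_trace.append(i)
        m_trace.append(m)
        x = min(1.0, max(0.0, x + k * i * dt))  # forward Euler step, clamped
    return v_trace, i_trace, m_trace

def resistance_swing(freq_hz):
    """Difference between the largest and smallest memristance in one period."""
    _, _, m_trace = simulate_memristor(freq_hz)
    return max(m_trace) - min(m_trace)

swing_slow = resistance_swing(1.0)    # 1 Hz drive
swing_fast = resistance_swing(10.0)   # 10 Hz drive
print(f"swing at 1 Hz:  {swing_slow:.1f} ohm")
print(f"swing at 10 Hz: {swing_fast:.1f} ohm")
```

With these assumed parameters the resistance excursion shrinks markedly at the higher frequency, so the loop collapses towards a straight line, which is the resistor-like limit described above.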
The memristor has consequently found widespread applications ranging from non-volatile memory to the artificial brain, and memristive devices based on various materials such as resistive materials [124], [129], ferroelectric materials [130], and magnetic materials [126], [131] have been technologically investigated. Due to their eminent binary storage characteristic, phase-change memory devices, which belong to the resistive memory category, meet the memristor definition and have recently attracted considerable attention.
The memristive behavior of phase-change memory was first studied on a Cu/GST225/Pt stack [132]. The corresponding current-voltage curve with a pinched hysteresis loop, as shown in Figure 16(a), indicates that a positive voltage bias induces a current increase during the 'SET' process and a negative voltage bias results in a current decrease during the 'RESET' process. It should be kept in mind that memory switching takes place only when the top electrode is made of either Ag or Cu. The 'SET' process is realized by the formation of Cu metallic filaments connecting the top and bottom electrodes. On the other hand, the rupture of the Cu filaments induced by the negative bias polarity results in the 'RESET' process. This principle is schematically depicted in Figure 16(a). Another tri-layered phase-change memristor having GST225 sandwiched between the Cu top electrode and the Ag bottom electrode was recently proposed [133]. Varying the sweeping DC voltage and the limited current produced two I-V hysteresis loops for the amorphous and crystalline states, respectively, as shown in Figure 16(b). Its low resistive state is mainly caused by driving numerous Ag ions from the Ag top electrode towards the phase-change layer under an applied electric field, while the high resistive state is achieved through the annihilation of the Ag ion filament, a principle schematically interpreted in Figure 16(b). The phase structure of the phase-change layer remains unchanged when subjected to a low operating voltage and limited current. Increasing the operating voltage and limited current enlarges the resulting Joule heating and induces crystallization. As the aforementioned findings ascribe the memristive mechanism to metal ion migration driven by field-induced electrochemical redox reactions [132], these devices are usually considered electrochemical metallization (ECM) cells rather than phase-change devices.
The aforementioned memristors, following the mathematical model defined by Equations (1) and (2), exhibit a pronounced bipolar switching behavior. However, there exist other types of memristors that do not strictly meet the conventional mathematical model, exemplified by the unipolar switching memristor and the thermal memristor. In contrast to bipolar switching, phase-change memory also exhibits unipolar switching, in which the switching process is independent of the voltage/current polarity, meaning that the 'SET' and 'RESET' processes are achieved with the same signal polarity. To simulate the unipolar memristive behavior of phase-change memory, a comprehensive electro-thermal and phase-transition model consisting of the Laplace equation, the heat transfer equation, and the classical nucleation-growth equation was previously developed to calculate the corresponding I-V curve and the extent of phase transformation [134]. In this model, the well-known threshold switching phenomenon was described by the modified trap-limited transport theory [134], in which $N_{T1}$ is the concentration of deep traps at energy $E_{T1}$ aligned with the Fermi level, $N_{T2}$ is the concentration of shallow traps at energy $E_{T2}$ close to the conduction band, $E$ is the electric field inside the phase-change layer, $q$ is the unit charge, $z$ is the intertrap distance, $\tau_0$ is the attempt-to-escape time for a trapped electron, and $\gamma$ is a non-equilibrium term. Based on this sophisticated model, the resulting write current and phase-transformation degree of a phase-change memristor having a GST225 layer sandwiched between two TiW electrodes were calculated [135], as described in Figure 16(c). The crystallization extent of GST225 gradually increases with the repeated cycles of the applied excitation. This can be attributed to the formation of nuclei and their subsequent growth.
Simulation results also suggested that nucleation usually initiates from the regions at the electrode-GST225 interfaces, because the heterogeneous nucleation rate there is much higher than the homogeneous rate generally occurring inside the bulk GST225 region. Another intriguing feature, as observed from Figure 16(c), is that the rising edge of the I-V curve of one excitation cycle does not exactly follow the trailing edge of the preceding cycle, which differs from the conventional TiO₂ memristor. This is possibly caused by the field-dependent conductivity of the amorphous GST225, which yields different temperature distributions during the rising and trailing edges of successive cycles. Despite its discrepancy from the conventional resistive memristor, such unipolar memristive behavior still endows the designed phase-change memristor with the capability of remembering its present and previous states, making it suitable for future memristive applications. In addition to the electrical memristor, a novel concept of a thermal memristor using W-doped vanadium dioxide (VO₂) has most recently been proposed [136]. Instead of the I-V curve required for the electrical memristor, the memristive behavior of a phase-change thermal memristor is characterized by its pinched Lissajous-type q-T curve (heat flux versus temperature difference). Such a memristive function was realized by applying a sinusoidal thermal input across the two terminals, resulting in a pinched Lissajous-type q-T curve (Figure 17). Further observations indicated that increasing the temperature amplitude and the thermal conductivity ratio can strengthen the memristive effect.

FIGURE 17. Lissajous-type q-T curve of a thermal memristor system. To obtain the heat flux, it is essential to know not only the temperature difference but also the history of the q-T curve. Reprinted with permission from [127].
In addition, a dynamic negative differential thermal resistance and a closed loop in the κ_eff-T curve can also be obtained from this memristor, demonstrating its similarity to the electrical memristor. This may imply promising applications of the phase-change thermal memristor in information storage based on thermal energy history, neuromorphic circuits, and spintronics.

V. PHASE-CHANGE NEURO NETWORKS
One of the most important and exciting applications of the memristor arises from its adaptability for next-generation artificial neural networks (ANNs). Scientists have been striving to develop a next-generation computer that operates in a brain-inspired manner. This is because the human brain is well known as the most efficient computational entity due to its ultra-low power consumption, super-fast processing speed, and, most importantly, in-memory computing, in which computation and storage take place in the same location [137]. Today the realization of brain-like ANNs is mainly focused on hardware implementation using CMOS transistors [138]. On one hand, the integration density of digital transistors is approaching the limits of Moore's law, thus hampering device performance. On the other hand, transistor-based ANNs still follow the von Neumann computational mode, in which computation and storage are performed separately, remarkably differing from the human brain [139], [140]. As a result, the aforementioned memristor devices, as one of the promising candidates for next-generation computing systems, have attracted enormous attention, aiming to break the bottleneck of the conventional von Neumann architecture. The charm of the memristor, as presented in the last section, stems from its memristive function, which can imitate the biological brain synapse. The human brain was reported to include ∼10¹¹ neurons and 10¹⁵ synapses [141]. As illustrated in Figure 18, a neuron is made up of a soma, an axon, and dendrites. The soma takes up the majority of a neuron and is connected to the neighboring neurons through the axon and dendrites. The axon transmits information (output), and the dendrites receive signals from neighboring neurons (input). The small gap between the axon of the previous neuron and the dendrites of the next neuron is called a synapse.
The connection strength between two neighboring neurons is known as the 'weight', which can be depressed or potentiated through a process called synaptic plasticity. Synaptic plasticity can be simply categorized into short-term plasticity, whose changes last only a short time, and long-term plasticity, whose changes can last a long time [142]. Long-term synaptic plasticity has been considered responsible for learning and memory, and can be further divided into long-term potentiation (LTP, i.e., synaptic weight increase) and long-term depression (LTD, i.e., synaptic weight decrease). One famous mechanism that governs long-term plasticity is called spike-timing-dependent plasticity (STDP) [143]. STDP theory arises from the assumption that long-term plasticity is not determined by the activations of the pre- and postsynaptic neurons, but by the relative spike timing, that is, the time difference between the presynaptic spike and the postsynaptic spike. As indicated in Figure 18, a presynaptic spike preceding the postsynaptic spike results in LTP, while a presynaptic spike lagging the postsynaptic spike causes LTD. As a consequence, the true hardware realization of ANNs depends critically on the successful emulation of brain synapses.
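The timing rule described above is frequently summarized by an exponentially decaying learning window. The sketch below assumes this commonly used exponential form; the amplitudes and the 20 ms time constants are hypothetical values chosen for illustration and are not taken from the cited references.

```python
import math

def stdp_weight_change(delta_t_ms, a_plus=1.0, a_minus=1.0,
                       tau_plus_ms=20.0, tau_minus_ms=20.0):
    """Weight change for a spike pair, with delta_t = t_post - t_pre.

    delta_t > 0: the presynaptic spike precedes the postsynaptic spike (LTP).
    delta_t < 0: the presynaptic spike lags the postsynaptic spike (LTD).
    The exponential window and its parameters are assumptions.
    """
    if delta_t_ms > 0:
        return a_plus * math.exp(-delta_t_ms / tau_plus_ms)
    if delta_t_ms < 0:
        return -a_minus * math.exp(delta_t_ms / tau_minus_ms)
    return 0.0

for dt in (-50.0, -10.0, 10.0, 50.0):
    print(f"delta_t = {dt:+.0f} ms -> dw = {stdp_weight_change(dt):+.3f}")
```

The sign of the update follows the order of the two spikes, and its magnitude decays as the spikes move further apart in time.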
Based on comparisons between the biological synapse and the non-volatile phase-change memristor, it is exciting to notice that these two entities in fact share numerous common features in terms of switching speed, power consumption, and integration density [144], [145]. Additionally, the programming-current-dependent resistance of the phase-change memristor can ideally mimic the synaptic plasticity of the biological synapse, as interpreted in Figure 19. For these reasons, various PCM-based devices have been explored to study their practicality as artificial synapses. The most common approach to achieve a phase-change synapse is to make use of the multiple-pulse schemes pioneered by Kuzum et al. in 2011 [146]. According to their strategy, two electric pulses connected to the top and bottom electrodes of the phase-change memristor are defined as the presynaptic and postsynaptic spikes, respectively, and the resistance of the phase-change device corresponds to the synaptic weight. The presynaptic spike comprises a train of stepwise pulses with gradually increasing magnitudes for the 'RESET' state and gradually decreasing magnitudes for the 'SET' state, whereas the postsynaptic spike consists of only a single pulse with negative magnitude. As a result, the net electric potential across the phase-change memristor is equal to the magnitude difference between the presynaptic and postsynaptic spikes. When the presynaptic spike precedes the postsynaptic spike, the net potential across the phase-change memristor exceeds the crystallization threshold voltage, which induces crystallization and increases the conductance. This corresponds to an increase of the synaptic weight. When the presynaptic spike lags the postsynaptic spike, the net potential exceeds the amorphization threshold voltage. This consequently causes amorphization and reduces the device conductance, corresponding to a decrease of the synaptic weight.
As revealed in Figure 20(a), modulating the relative time delay between the pre and post spikes allows the post-spike to overlap with pre-spike pulses of different magnitudes and leads to various extents of either amorphous or crystalline states, corresponding to different resistance states (i.e., synaptic weights). This can closely mimic the biological STDP mechanism. Note that the performance of this electrical synapse strongly relies on the precise control of the intermediate resistance states, which may cause other issues such as capacitive line charging [147]. Besides, its STDP behavior involves the 'RESET' process, thus giving rise to high energy consumption [147]. To overcome this, the concept of a synapse based on two phase-change memories (2-PCM) was proposed, as illustrated in Figure 20(b). In this design, one phase-change memory is responsible for LTD, while the other accounts for LTP. The most promising feature of this design is that both LTP and LTD events operate in the crystalline regime and thus consume much less energy. Moreover, synaptic operation in the crystalline regime makes the device effectively immune to the resistance drift phenomenon that usually occurs inside amorphous PCM. Another phase-change memristor also exhibits STDP behavior only in the crystalline phase [148]. As in [146], the device resistance, defined as the synaptic weight, is adjusted through the aforementioned pulse schemes. It was reported that the negative pulses representing the potentiating spikes cause a conductance increase, while the positive pulses representing the depressing spikes decrease the conductance. The device was initially crystallized to a low resistance state, and the applied pulse sequence comprises a series of pulse trains with successively increasing amplitude for potentiation and a series of pulse trains with successively decreasing amplitude for the depressing spikes, whereas a single pulse was implemented for the post spike. Hence, varying the relative time delay between the pre and post spikes results in a net voltage drop with various amplitudes across the designed phase-change devices. This enables a continuous change of device resistance that matches the typical STDP forms well, as shown in Figure 20(c). Another pulse scheme employs a one-transistor one-resistor (1T1R) architecture in which a transistor connected to the phase-change device acts as the access selector [149]. The amplitudes of the pre and post spikes were determined by the access transistor, which allows programming only during the brief overlap of these two signals. Here the timing information between the spikes is translated into the amplitude of the overlapping pulse to implement various STDP schemes. Although the above pulse schemes provide effective means to emulate timing-based plasticity, complex timing circuitry is generally required to fulfill these schemes, which may not be suitable for practical use.

FIGURE 19. (a) Experimental demonstration of the LTP behavior in a biological synapse in terms of synaptic conductance as a function of time (left) and its electrical realization using a PCRAM device (right). The solid and hollow circles in the left section correspond to the conductance due to high-frequency tetanic stimulation that results in LTP and in the absence of tetanic stimulation, respectively; such a synaptic conductance increase is reflected in PCRAM by applying successive 'SET' pulses with the same amplitude to the amorphous region to continuously increase the device conductance. (b) Experimental demonstration of the LTD behavior in a biological synapse in terms of synaptic conductance as a function of time (left) and its electrical realization using a PCRAM device (right). The left section shows the change in synaptic conductance due to low-frequency stimulation, giving rise to LTD. The dotted horizontal line shows the conductance level when no stimulation is applied. Such LTD behavior can be reproduced by applying successive 'RESET' pulses with different amplitudes to the crystalline region, as demonstrated in the right section. Reprinted with permission from [144], [145].
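The 2-PCM synapse described above can be summarized in a few lines: both devices only ever receive crystallizing pulses, and the effective weight is the difference between the two conductances because the LTD contribution is inverted at the neuron. The following is a minimal sketch; the class name, the normalized conductance range, and the fixed saturating step per pulse are illustrative assumptions.

```python
class TwoPcmSynapse:
    """Toy model of a synapse built from one LTP device and one LTD device."""

    def __init__(self, g_min=0.0, g_max=1.0, step=0.1):
        self.g_min = g_min      # normalized conductance of the amorphous state
        self.g_max = g_max      # normalized conductance of the fully crystalline state
        self.step = step        # assumed conductance gain per partial-SET pulse
        self.g_ltp = g_min
        self.g_ltd = g_min

    def _set_pulse(self, g):
        """Partial crystallization: raise the conductance, saturating at g_max."""
        return min(self.g_max, g + self.step)

    def potentiate(self):
        """LTP event: crystallizing pulse on the LTP device."""
        self.g_ltp = self._set_pulse(self.g_ltp)

    def depress(self):
        """LTD event: crystallizing pulse on the LTD device."""
        self.g_ltd = self._set_pulse(self.g_ltd)

    @property
    def weight(self):
        """Effective weight: the LTD current is inverted at the neuron."""
        return self.g_ltp - self.g_ltd


synapse = TwoPcmSynapse()
for _ in range(3):
    synapse.potentiate()
synapse.depress()
print(f"effective weight = {synapse.weight:.2f}")
```

Because neither event requires a 'RESET' pulse in this picture, potentiation and depression both stay in the low-energy crystallization regime, at the price of eventual saturation of the two devices.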
FIGURE 20. (a) STDP implementation through the overlapping between a train of pulses and a gate pulse (left) and the resulting STDP measurements on PCM synapses by modifying the spacings and amplitudes of the pulses in the prespike (right). LTP1, LTP2, and LTP3 correspond to three different prespike configurations for long-term potentiation (LTP). LTD1, LTD2, and LTD3 correspond to three different prespike configurations for long-term depression (LTD). (b) Circuit schematic for the 2-PCM synapse. The input of the current from the LTD devices is inverted in the post-synaptic neuron. (c) Implementation of STDP with nanosecond-scale time windows in the PCM synapse with the antisymmetric Hebbian learning rule (left) and its ability to operate at ultralow voltage and tune the time window down to the nanosecond scale, whereas the time window of a biological synapse is about 60 ms (right). (d) The pulse algorithm adopted to emulate the STDP (left) and the generated STDP events using the designed circuit (right). (a) is reprinted with permission from [146]; (b) is reprinted with permission from [147]; (c) is reprinted with permission from [148]; (d) is reprinted with permission from [149].

In addition to the electrical emulation of the biological synapse, realizing a realistic brain-like neural network also depends on the imitation of the brain neuron, which involves the maintenance of the equilibrium potential, the transient dynamics, and the neurotransmission process. However, such complicated biological neuron behavior needs to be significantly simplified for hardware implementation [150]. The fundamental requirement for a phase-change neuron is the capability of 'firing' after receiving a sequence of pulse trains; these pulses modify an internal state that remains independent of the external conductance until the neuron fires upon exceeding the threshold. Moreover, the stochastic neuronal dynamics reported to be responsible for signal encoding and transmission need to be taken into account for achieving an artificial neuron [141]. This stochastic behavior was regarded as a consequence of various biological mechanisms, including ionic conductance noise, chaotic motion of charge carriers, inter-neuron morphologic variability, and other background noise [151], and is thus required to be reflected by the designed artificial neuron. One promising route to devise a phase-change neuron is to encode the evolution of the neuronal membrane potential in the phase configuration within the device [152], as illustrated in Figure 21. Such a device exhibits remarkable inter-neuronal and intra-neuronal randomness. Multiple integrate-and-fire cycles in a single phase-change neuron were shown to yield a distribution of interspike intervals; this intra-device stochasticity is usually attributed to the shot-to-shot variability in both the internal atomic configuration and the thickness of the melt-quenched amorphous region. According to [152], despite the slow firing rate of the individual neurons, the overall neuron population still allows for fast signals. The devised phase-change neuron was proven to offer the detection of temporal correlations within a large number of event-based data streams, resulting in the advent of a complete phase-change neuromorphic circuit containing both phase-change synapses and phase-change neurons.
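The integrate-and-fire behavior and the resulting spread of interspike intervals can be illustrated with a toy model in which each input pulse advances an internal state by a randomly varying amount and the neuron fires and is reset once a threshold is crossed. This is only a sketch: the Gaussian step variability, the threshold, and all parameter values are hypothetical stand-ins for the device-level crystallization dynamics.

```python
import random

def simulate_pcm_neuron(n_pulses, threshold=10.0, mean_step=1.0,
                        step_sigma=0.3, seed=0):
    """Return the interspike intervals (in pulses) of a stochastic neuron.

    Each pulse adds a random, non-negative increment to the internal state,
    mimicking shot-to-shot variability (assumed Gaussian here). When the
    state reaches the threshold the neuron fires and the state is reset,
    which stands in for re-amorphizing the device.
    """
    rng = random.Random(seed)
    state = 0.0
    last_fire = 0
    intervals = []
    for pulse in range(1, n_pulses + 1):
        state += max(0.0, rng.gauss(mean_step, step_sigma))
        if state >= threshold:
            intervals.append(pulse - last_fire)
            last_fire = pulse
            state = 0.0
    return intervals

intervals = simulate_pcm_neuron(2000)
mean_interval = sum(intervals) / len(intervals)
print(f"{len(intervals)} firing events, mean interval = {mean_interval:.2f} pulses")
print("distinct interval lengths:", sorted(set(intervals)))
```

Even though every pulse is nominally identical, the firing times scatter around the mean, which is the kind of intra-neuronal randomness referred to above.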
FIGURE 21. Schematic of a PCM-based neuron with an array of plastic synapses at its input. Reprinted with permission from [152].

A conventional ANN usually comprises input neurons, hidden neurons, and output neurons, as illustrated in Figure 22. The summation of the total inputs into one neuron is first calculated and subsequently compared with a threshold value. Each neuron passes its signal to the next layer once the summation exceeds the threshold. This procedure is repeated for each layer of the network until the output layer is reached. The correct output signal is eventually selected depending on which final neuron fires. The threshold value and the synaptic weight that determines the connection strength between the neurons therefore become two crucial metrics that need to be carefully determined. As mentioned above, the synaptic weight determines how much of the signal is passed on to each neuron in the next layer. In conventional ANNs, these threshold values are usually obtained from a so-called training process involving different algorithms of varying complexity [144]. A common feature of these algorithms is to train the system on a large series of static data before using it for recognition, which is known as offline learning [154]. This approach, however, is very time consuming and requires a large-scale data set. In contrast to offline learning, another route makes use of dynamic data to train the network, at the cost of more on-chip memory to store the new weight values and extensive peripheral circuits to perform the large number of weight update calculations in real time [155]. The trained network is able to operate by itself, and the closeness of its output to the realistic output depends on the efficiency of the training scheme. Depending on the type of network and the task complexity, different training approaches, such as online and offline learning, are adopted.
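The layer-by-layer procedure described above (weighted summation, comparison with a threshold, propagation to the next layer) can be sketched as follows. The network size, weights, and thresholds are hypothetical and chosen by hand purely to illustrate the mechanism; no training is performed.

```python
def layer_forward(inputs, weights, thresholds):
    """Each neuron sums its weighted inputs and fires (1) above its threshold."""
    outputs = []
    for weight_row, theta in zip(weights, thresholds):
        total = sum(w * x for w, x in zip(weight_row, inputs))
        outputs.append(1 if total > theta else 0)
    return outputs

def network_forward(inputs, layers):
    """Propagate the signal through every layer until the output layer."""
    signal = inputs
    for weights, thresholds in layers:
        signal = layer_forward(signal, weights, thresholds)
    return signal

# Hypothetical 3-2-2 network: (weight matrix, thresholds) per layer.
hidden_layer = ([[1.0, 1.0, 0.0],
                 [0.0, 1.0, 1.0]], [1.5, 1.5])
output_layer = ([[1.0, 0.0],
                 [0.0, 1.0]], [0.5, 0.5])
layers = [hidden_layer, output_layer]

out_a = network_forward([1, 1, 0], layers)
out_b = network_forward([0, 1, 1], layers)
print("input [1, 1, 0] ->", out_a)
print("input [0, 1, 1] ->", out_b)
```

In this toy network the identity of the output neuron that fires selects the result, and changing either a weight or a threshold changes which inputs trigger it, which is why both quantities must be determined during training.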
The ANNs can be categorized into two types: Spiking Neural Networks (SNN) and Deep Neural Networks (DNN).
SNNs have recently exhibited great potential to reproduce the biological system owing to their ability to process spike input signals, which is considered the main reason for the brain's unique capability in sequence recognition [156] and memory [157]. The STDP behavior that significantly affects plasticity in the brain needs to be mimicked successfully in order to bring SNNs close to biological networks, which has resulted in several novel paradigms. The conventional method to expand a single electronic synapse to the network level is to take advantage of the well-known crossbar structure that has been widely implemented for PCRAM [158] and other non-volatile memory applications [159], [160]. One example [161], as illustrated in Figure 23(a), harnesses a 10×10 PCM memory array in which each cell consists of one Lance-type PCRAM and one access selector. The bitline connected to the gate of the access selector and the wordline connected to the top electrode of the PCRAM element are used to access each cell. The synaptic events can be imitated using a pulse train that comprises a 'RESET' pulse followed by nine 'SET' pulses, thus giving rise to nine discernible resistance levels. Such networks can also be used to realize the associative learning function. The experiment consists of many learning epochs during which the synaptic weight update occurs depending on the firing neurons. A pattern having an incorrect pixel was displayed after the training, and the incorrect pixel was expected to be recalled in the recall phase after the training process. Accordingly, a complete pattern is presented during the training phase of an epoch and an incomplete pattern with an incorrectly OFF pixel is presented during the recall phase. Using the aforementioned methods reportedly allows for the associative recall and storage of the test pattern via Hebbian plasticity in a manner analogous to the biological brain. Recently, a three-layer perceptron network consisting of 164,885 synapses, as illustrated in Figure 23(b), was developed and trained on the MNIST database, allowing for a training accuracy of 82.2%. The synapses used in this network adopted the aforementioned 2-PCM cell structure and can achieve the optimum potentiation and depression characteristics through the application of successive pulses, as required for bidirectional synaptic devices in neuromorphic systems.

FIGURE 23. (a) is reprinted with permission from [161]; (b) is reprinted with permission from [162]; (c) is reprinted with permission from [163].
It was demonstrated in [162] that, the PCM devices with the bidirectional function can exhibit a linear conductance response for a high dynamic range, thus enabling high classification accuracies. The unsupervised learning and detection of the temporal correlations in parallel input can be also achieved using the crossbar-based neuromorphic architecture where both neurons and synapses were built using PCM devices, as verified in Figure 23 (c).
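As a rough illustration of the pulse-train programming and Hebbian recall described above, the following Python sketch models each synapse as a counter with nine SET levels above the RESET state and stores a 10-pixel pattern in a 10×10 array. The class and function names, the recall threshold, and the rule of one SET pulse per epoch for co-firing neurons are illustrative assumptions and are not taken from [161].

```python
# Toy model only: names, threshold and update rule are assumptions for illustration.
N_LEVELS = 9  # nine SET pulses after one RESET give nine levels above the RESET state


class ToySynapse:
    """Multi-level synapse programmed by SET pulses after an initial RESET."""

    def __init__(self):
        self.level = 0  # 0 = RESET state (lowest conductance)

    def reset(self):
        self.level = 0

    def set_pulse(self):
        # Each SET pulse moves the cell one level up until it saturates.
        self.level = min(self.level + 1, N_LEVELS)

    @property
    def weight(self):
        return self.level / N_LEVELS


def train(pattern, epochs):
    """Hebbian rule: potentiate synapse (i, j) when neurons i and j fire together."""
    n = len(pattern)
    array = [[ToySynapse() for _ in range(n)] for _ in range(n)]
    for _ in range(epochs):
        for i in range(n):
            for j in range(n):
                if i != j and pattern[i] and pattern[j]:
                    array[i][j].set_pulse()
    return array


def recall(array, cue, threshold=0.5):
    """A neuron fires if it is cued directly or if its weighted input reaches the threshold."""
    n = len(cue)
    result = []
    for j in range(n):
        drive = sum(array[i][j].weight * cue[i] for i in range(n))
        result.append(1 if cue[j] or drive >= threshold else 0)
    return result


pattern = [1, 1, 0, 1, 1, 0, 0, 1, 0, 1]
cue = list(pattern)
cue[3] = 0  # one pixel is incorrectly OFF in the recall phase
array = train(pattern, epochs=5)
recalled = recall(array, cue)
print(recalled == pattern)
```

In this toy setting the missing pixel is restored because the neurons that are still cued drive it through the potentiated synapses, whereas pixels that were never active during training receive no drive.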
DNNs have most recently received considerable attention owing to their promising potential to handle highly complex tasks with large amounts of training data [164]. DNNs, which usually comprise many layers, can achieve unsupervised, semi-supervised, and supervised learning, and are currently under intensive development, mainly stimulated by recent powerful parallel computation devices such as graphics processing units (GPUs) and field-programmable gate arrays (FPGAs) [165]. Under these circumstances, several DNN architectures with novel learning algorithms were proposed during the last five years. One intriguing feature of DNN training is that the forward and backward propagations can reportedly be performed imprecisely whereas the gradients need to be accumulated in high precision, which prompted the debut of a mixed-precision in-memory computing approach [166].
The key idea is to store the synaptic weights in phase-change devices, where the forward and backward passes are performed, whereas the weight changes are accumulated in high precision, as shown in Figures 24(a)-(e). The synaptic weights are changed by pulses applied to the memory devices once the accumulated weight change exceeds a threshold value. Inspired by this idea, a two-layer neural network in which each synaptic weight is represented by two phase-change devices in a differential configuration was utilized to solve the handwritten digit classification problem [166]. The test accuracy after 20 epochs of training was approximately 98%. In addition, another method to train ResNet-type convolutional neural networks, which leads to almost no accuracy loss when transferring the weights to analog in-memory computing hardware based on phase-change memory, was proposed [167], [168], as schematically shown in Figures 24(f)-(h). An as-programmed classification accuracy of 93.69% on the CIFAR-10 dataset with ResNet-32 was demonstrated on a network of 361,722 synaptic weights, each programmed on two phase-change devices deployed in a differential configuration, and the accuracy stays above 92.6% over a one-day period.
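The accumulate-then-program rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of [166]: the class name, the device granularity `epsilon`, and the idealized, noise-free device update are all hypothetical.

```python
import math


class MixedPrecisionWeight:
    """One synaptic weight: coarse analog device plus a high-precision accumulator."""

    def __init__(self, epsilon=0.1, w_min=-1.0, w_max=1.0):
        self.epsilon = epsilon  # weight change produced by one programming pulse (assumed)
        self.w_min = w_min
        self.w_max = w_max
        self.w_device = 0.0     # weight stored in the phase-change devices
        self.chi = 0.0          # weight change accumulated in high precision

    def accumulate(self, delta_w):
        """Add a computed weight change; program the device only past the threshold."""
        self.chi += delta_w
        n_pulses = int(abs(self.chi) / self.epsilon)  # whole pulses that fit in chi
        if n_pulses > 0:
            step = math.copysign(n_pulses * self.epsilon, self.chi)
            # Idealized device update, clipped to the available weight range.
            self.w_device = min(self.w_max, max(self.w_min, self.w_device + step))
            self.chi -= step  # keep the remainder for later updates
        return n_pulses


w = MixedPrecisionWeight(epsilon=0.1)
pulses = [w.accumulate(0.04) for _ in range(3)]
print(pulses, w.w_device, w.chi)
```

Small updates that would be lost on a coarse device are retained in `chi` until they add up to at least one programming pulse, which is the reason the accumulation is kept in high precision.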

VI. CHALLENGES OF PHASE-CHANGE DEVICES
One prominent issue of phase-change electrical devices arises from the variations of their electrical properties, such as the resistance states, the SET/RESET current, and the working voltages, from device to device and from cycle to cycle. The device-to-device variations are likely due to limitations of the fabrication techniques that cause non-uniform film morphology and poor homogeneity [169]. The cycle-to-cycle variations, on the other hand, manifest as temporal stochasticity induced by the randomness in atomic configurations over a long incubation period [169], [170], as revealed in Figure 25(a), which undoubtedly has adverse effects on neuromorphic applications.
One technique to circumvent the stochasticity is to introduce a long (∼10 ns) pretreatment to preseed nuclei inside the amorphous matrix, whereby the crystallization mainly depends on crystal growth [171]. Triggered by this idea, the crystallization of amorphous GeTe annealed at different temperatures above the glass transition point was modeled using density functional theory (DFT) simulations [172]. The results revealed that a large population of subcritical nuclei was formed at lower temperature with a large nucleation rate, and that this population can partially survive during fast annealing. This suggests that operating the system at lower temperature for a sufficient time enables a remarkable reduction of the incubation time and the nucleation stochasticity. In contrast to the aforementioned method, another strategy to reduce the stochasticity of crystal nucleation as well as the incubation time is to alloy the Sb2Te3 compound with scandium (Sc) [173]. The resulting Sc0.2Sb2Te3 composition leads to a write speed of ∼700 ps without the requirement for preprogramming, owing to its geometrically matched and robust Sc-Te bonds, which serve as crystal precursors that promote the formation of postcritical nuclei. Although stochasticity is unfavorable for storage applications, it is desirable for hardware security and neural network applications. The stochasticity of a single filament in amorphous Si exhibits a Poissonian distribution, thus enabling the imitation of a random number generator [169]. Such a random number generator can also be realized by exploiting the intrinsic stochasticity of the delay mechanism of the Ag diffusive memristor [174]. Additionally, the conductance changes in 90 nm Lance-type PCRAM cells were investigated in terms of granularity and stochasticity. The standard deviation of the conductance change was reportedly similar to the mean conductance change, and the classification accuracy was impaired considerably as the mean conductance change increased [168]. These findings make it possible to innovate device technologies and synaptic architectures that are more robust to these undesirable properties.
Figure 24 (caption fragment): ... A global drift compensation procedure is performed for every layer before every inference. A hardware model incorporating PCM conductance drift (ν = 0.06), device-to-device drift exponent variability with 0.1ν standard deviation, and instantaneous Gaussian read noise with 0.6 µS standard deviation is able to capture the experimental results fairly well. (a)-(e) are reprinted with permission from [166]; (f) is reprinted with permission from [168]; (g) and (h) are reprinted with permission from [167].
Figure 25 (caption fragment): ... [111] (right). The right section reveals that the SST device can repeatedly perform ultrafast SET and RESET operations up to 10^5 cycles with 800 ps pulses. (a) is reprinted with permission from [170]; (b) is reprinted with permission from [173]; (c) is reprinted with permission from [174].
Neuromorphic applications have been shown to be fairly robust to some kinds of device variability and non-ideality, while being sensitive to the asymmetry and nonlinearity of the conductance response [141]. An ideal non-volatile memory device is expected to have a near-linear response over most of its conductance range, with each programming pulse changing the conductance by only a small portion of the overall dynamic range. However, state-of-the-art PCM devices fail to satisfy these criteria. Although PCM exhibits a small and continuous conductance increase through partial crystallization, its conductance decrease caused by conventional melt-quenching is abrupt. Accordingly, the conductance response of the PCM device shows imperfections that mainly include nonlinearity, limited dynamic range, varying maxima, and asymmetry between the increasing and decreasing responses [162]. These imperfections severely affect the performance of the resulting neural networks. It was reported that some degree of nonlinearity can be tolerated provided that the conductance range over which the response is nonlinear is only a small fraction (e.g., ∼10%) of the overall conductance range [175]. One potential solution to overcome the limited dynamic range is to implement multiple conductances per synapse and periodic corrections.
∼20-50 programming steps between the minimum and maximum conductance, with a conductance pair representing a single weight, was recently proposed. Other promising scenarios include the quantification of the impact of parameter variation and device reliability on DNN training [176].
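To make the nonlinearity, the asymmetry, and the idea of a conductance pair with periodic correction more concrete, the following Python sketch uses a deliberately simple device model. It is an assumption-laden toy: the saturating update `potentiate`, the step fraction `ALPHA`, the normalized conductance range, and the `refresh` procedure are all hypothetical and are not taken from the cited works.

```python
G_MIN, G_MAX = 0.0, 1.0  # normalized conductance range (assumed)
ALPHA = 0.1              # fraction of the remaining range gained per SET pulse (assumed)


def potentiate(g):
    """Gradual SET: the step shrinks as the device approaches G_MAX (nonlinear response)."""
    return g + ALPHA * (G_MAX - g)


class DifferentialSynapse:
    """Weight = g_plus - g_minus; each device can only be increased gradually."""

    def __init__(self):
        self.g_plus = G_MIN
        self.g_minus = G_MIN

    @property
    def weight(self):
        return self.g_plus - self.g_minus

    def update(self, direction):
        # A weight decrease is realized by potentiating g_minus, because the
        # melt-quench RESET of a single device is abrupt rather than gradual.
        if direction > 0:
            self.g_plus = potentiate(self.g_plus)
        elif direction < 0:
            self.g_minus = potentiate(self.g_minus)

    def refresh(self, max_pulses=1000):
        """Periodic correction: RESET both devices, then reprogram the net weight."""
        target = self.weight
        g = G_MIN
        for _ in range(max_pulses):
            if g >= abs(target):
                break
            g = potentiate(g)
        if target >= 0:
            self.g_plus, self.g_minus = g, G_MIN
        else:
            self.g_plus, self.g_minus = G_MIN, g


syn = DifferentialSynapse()
for _ in range(5):
    syn.update(+1)
for _ in range(2):
    syn.update(-1)
before = syn.weight
syn.refresh()
print(before, syn.weight)
```

After the refresh, both devices sit lower in their range, so the pair regains headroom for further updates at the cost of a reprogramming error of at most one pulse step.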
Another stringent challenge that phase-change electrical devices currently encounter is their limited integration density. The well-known crossbar architecture has been widely implemented for PCRAM arrays and phase-change neural networks, as schematically illustrated in Figure 26. The crossbar array comprises perpendicular rows and columns, with a PCM device located at each crosspoint, and the device conductance denotes the synaptic weight. To compute the weighted sum, read voltages are applied to all the rows, and each read voltage is multiplied by the conductance of the electronic synapse it drives, which gives rise to a weighted current sum in each column. This analog current must then be converted to a digital output or to spikes by the neuron circuits at the end of the column. Note that, in practical operation, a digital number of pulses is preferred as the input signal over an analog voltage, because the analog voltage may distort the accuracy of the weighted sum [165]. The weight update is conducted either row by row or column by column, as programming voltages can be applied to both rows and columns. Although the crossbar integration technology has been extensively implemented for various PCM devices to construct phase-change networks, the number of phase-change cells per crossbar for most networks was reported to be less than 1000 [177], severely handicapping their application to brain-like neural networks. One possible factor that limits large-scale integration is the intensified effect of the parasitic wire resistance of the crossbar [177], [178], which degrades the precision of the weighted sum of the array. In addition, the aforementioned stochasticity also deteriorates the programming.
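The weighted-sum operation of the crossbar follows directly from Ohm's and Kirchhoff's laws, and can be sketched in Python as below. The sketch is idealized: it ignores the parasitic wire resistance discussed above, and the function names, the read voltage, the pulse width, and the firing threshold are hypothetical values chosen for illustration.

```python
def column_currents(read_voltages, conductances):
    """Ideal crossbar: I_j = sum_i V_i * G_ij (no wire resistance, no noise)."""
    n_cols = len(conductances[0])
    currents = [0.0] * n_cols
    for v, row in zip(read_voltages, conductances):
        for j, g in enumerate(row):
            currents[j] += v * g
    return currents


def column_charges(pulse_counts, conductances, v_read=0.2, t_pulse=1e-7):
    """Pulse-coded input: row i receives pulse_counts[i] identical read pulses.

    The charge collected on column j is v_read * t_pulse * sum_i n_i * G_ij.
    """
    scaled = [n * v_read * t_pulse for n in pulse_counts]
    return column_currents(scaled, conductances)


def neuron_outputs(column_values, threshold):
    """Column neuron circuit reduced to a comparator: 1 = spike, 0 = no spike."""
    return [1 if x >= threshold else 0 for x in column_values]


# 3 rows x 2 columns, conductances in siemens (assumed values).
G = [[1e-6, 2e-6],
     [3e-6, 4e-6],
     [5e-6, 6e-6]]
V = [0.2, 0.2, 0.0]  # read voltages in volts
I = column_currents(V, G)
spikes = neuron_outputs(I, threshold=1e-6)
print(I, spikes)
```

With the pulse-coded variant, every row sees the same fixed read amplitude and only the number of pulses varies, which is one way to picture why a digital pulse count is less prone to distortion than an analog input voltage.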
One effective solution to overcome the density limitation is to take advantage of 3D memristor crossbars [179], which can effectively suppress the parasitic wire resistance of each crossbar layer. The perpendicular conductors of the emerging 3D XPoint memory are reported to connect 128 billion densely packed memory cells, each of which stores a single bit of data [120]. In this case, stacking the memristor crossbars in 3D can significantly increase the integration density per unit area, thus providing the large number of weights usually demanded for cognitive operations.

VII. OUTLOOKS
Chalcogenide based PCMs have been a promising technology during the last two decades, enabling various exciting electrical devices closely related to citizens' daily life. The realization of their electrical function depends on the vast resistance difference between the amorphous and crystalline states. Considerable efforts have therefore been devoted to understanding the physical properties of Chalcogenide based materials. Firstly, the cubic metastable phase of the Chalcogenide alloy can undergo several modifications owing to a large number of structural vacancies following different distributions [23]. Such spatial distributions of the vacancies in the crystalline Chalcogenide lattices play an important role in determining the electronic properties of the crystalline PCMs. Secondly, metallic behavior is found in PCMs with ordered vacancy layers, whereas insulating behavior is found in PCMs with disordered vacancies. These variations can be ascribed to electron localization effects [180]. Finally, controlling the extent of disorder in PCMs becomes highly critical for multi-level data storage, which can greatly boost the storage density of phase-change memories and also benefit the weight update of memristor-based neural networks. It should be noted that achieving the multilevel function today relies on control of the volume ratio of the amorphous to crystalline states. Nevertheless, such control fails when the cell is scaled down to a few nanometers, thereby making it imperative to control the amorphous-crystalline volume ratio precisely [181]. This can be accomplished using epitaxial Chalcogenide thin films that not only enable interface-assisted crystal growth along preferred crystallographic directions, but also considerably increase the crystallization speed [182].
Most recently, the debut of highly textured TiTe2-Sb2Te3 based heterostructures has further reduced the energy consumption of PCM devices while enhancing their switching speed [183]. In spite of the conceptual demonstration of such devices, practical devices using epitaxial single-phase Chalcogenide materials at the industry level have yet to be devised. In this case, the resistive switching characteristics and atomic structures of this novel concept, together with other key physical parameters including crystal structure, degree of disorder, and density of bilayer defects, need intensive study in the near future.
The drawbacks of conventional phase-change memories, such as the relatively low 'SET' speed and high 'RESET' current, are also reflected in the phase-change memristor and memristor-based neural networks. A series heating resistance is therefore required to produce adequate Joule heating for phase transformation when induced by a low 'RESET' current [184]. Replacing the conventional Lance-type architecture with phase-change memristors that have a smaller interfacial area between the PCM and the electrode material (refer to Section 3) can achieve the same heating effect. In addition, doping conventional Ge-Sb-Te materials with other species such as C [185], SiO2 [186], or SiC [187] can also improve the thermal properties of the crystalline state. Moreover, the vacancies likely formed at the Chalcogenide-electrode interface during the repeated writing of PCMs may cause device failure in the high resistance state [188]. This issue can be effectively addressed by adding carbon dopants to the Ge-Sb-Te media to suppress the formation of interfacial vacancies [187], [189]. Furthermore, the pitch between neighboring synapses in the network is usually required to be very short to increase the integration density. However, thermal cross-talk, whereby one cell is unintentionally rewritten when its neighboring cell is programmed, may occur owing to heat diffusion. Such thermal disturbance becomes more pronounced as the cell dimension is scaled down. One potential solution is to find an appropriate dielectric material to prevent such thermal cross-talk [190], [191]. Recently, some oxides, such as VO2, have also been included in the PCM family owing to their fascinating insulator-to-metal phase transition occurring near room temperature and the ability to control this transition by applied current, electric field, and photoexcitation. These traits make VO2 attractive for memristor [192] and neuromorphic computing applications [193].
One encouraging fact is that the long retention time generally required for phase-change memories is not urgently needed for deep learning training, thus opening the door to new PCMs such as elemental Sb [194]. In addition to the aforementioned drawbacks, structural relaxation of the melt-quenched amorphous phase, as mentioned in Section 2, also poses a challenge to phase-change neural networks because of the resulting conductance shift. One prospective solution is to adopt the projected phase-change memory, in which a shunt path is provided for the read current to bypass the amorphous PCM [195]. Another factor that may harm the performance of phase-change neural networks is the limited endurance of PCMs (10^9 to 10^12 cycles), which may be satisfactory for memory applications but is not adequate for training applications; this can be partially mitigated by multi-memristive synaptic architectures [137].
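The conductance shift caused by structural relaxation is commonly described by a power law, G(t) = G(t0)·(t/t0)^(−ν). The Python sketch below combines this law with the drift parameters quoted earlier for the hardware model (ν = 0.06, device-to-device variability of 0.1ν, Gaussian read noise of 0.6 µS) and with one simple form of global drift compensation. The function names, the reference time `T0`, the programmed conductances, and the use of a single scaling factor computed from summed conductances are assumptions made for illustration, not the procedure of the cited works.

```python
import random

NU_MEAN = 0.06          # mean drift exponent quoted for the hardware model
NU_STD = 0.1 * NU_MEAN  # device-to-device spread of the drift exponent
READ_NOISE_US = 0.6     # standard deviation of the read noise in microsiemens
T0 = 1.0                # reference time after programming in seconds (assumed)


def make_devices(g0_list_us, rng, nu_std=NU_STD):
    """Assign each programmed conductance (in µS) its own fixed drift exponent."""
    return [(g0, rng.gauss(NU_MEAN, nu_std)) for g0 in g0_list_us]


def read_devices(devices, t, rng, noise_us=READ_NOISE_US):
    """Read every device at time t: power-law drift plus instantaneous read noise."""
    return [g0 * (t / T0) ** (-nu) + rng.gauss(0.0, noise_us) for g0, nu in devices]


def global_drift_factor(devices, g_read):
    """One scalar that restores the summed conductance to its programmed value."""
    return sum(g0 for g0, _ in devices) / sum(g_read)


rng = random.Random(0)                   # fixed seed for a repeatable illustration
targets = [5.0, 10.0, 15.0, 20.0, 25.0]  # programmed conductances in µS (assumed)
devices = make_devices(targets, rng)
one_day = 24 * 3600.0
raw = read_devices(devices, one_day, rng)
alpha = global_drift_factor(devices, raw)
compensated = [alpha * g for g in raw]
print(alpha, compensated)
```

A single scaling factor removes the average drift of the array but cannot remove the spread caused by device-to-device differences in ν or by read noise, which is consistent with the motivation for approaches such as the projected phase-change memory mentioned above.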