A Detailed Study of SOT-MRAM as an Alternative to DRAM Primary Memory in Multi-Core Environment

As the current primary memory technology is reaching its limits, it is essential to explore alternative memory technologies to accommodate modern applications and use cases. However, using new memory technology poses the challenge of deriving accurately estimated parameters for integrating new memory technology and doing reliable simulations. This study proposes a new approach incorporating Spin-Orbit-Torque-Magnetic-RAM (SOT-MRAM) into hybrid and full main memory architectures within a multi-core system, encompassing various memory configurations and capacities. The study addresses the challenge of evaluating SOT-MRAM-based memory systems when specific SOT-MRAM memory parameters are not publicly available. The research methodology includes micro-architectural (circuit-level) design space exploration and comprehensive full system simulations, which evaluate benchmark programs representing diverse application domains. The evaluation includes three memory structures with varying memory organizations and capacities. The results show that SOT-MRAM is a robust replacement for DRAM or hybrid memory, offering compelling advantages such as a remarkable 74.05% reduction in power consumption, a noteworthy 40.10% increase in bandwidth utilization, and a significant 72.85% reduction in Energy-Delay Product (EDP). The maximum latency penalties are also minimal, with a 3.71% increase for hybrid structures and a mere 0.07% for standalone SOT-MRAM memory structures.


I. INTRODUCTION
DRAM, the popular memory technology, faces issues with power and scalability when scaled to large sizes. A significant limitation of DRAM is that when scaled, the capacitors are more vulnerable to errors due to their reduced size [1]. Numerous alternatives have been suggested as substitutes for the current DRAM memories based on CMOS technology for multiprocessor systems. Replacing DRAM with Non-Volatile Memory (NVM) is the most promising option. NVM technologies that could be used as a replacement for traditional DRAM memory are Phase Change Memory (PCM), Spin-Transfer-Torque Magnetic RAM (STT-MRAM), and Resistive RAM (ReRAM). These technologies offer byte-addressability, non-volatility, and fast read times. STT magnetization-based switching memory was first introduced in [2], leading to the genesis of several memory technologies based on this method. STT-MRAM technology has a similar capacity, frequency, and device size as DRAM [1], making it a potential replacement for DRAM. However, due to the common path shared by read as well as write operations, STT-MRAM suffers from the read-write disturbance problem [3], [4].
The associate editor coordinating the review of this manuscript and approving it for publication was Mario Donato Marino.
Spin-Orbit-Torque (SOT) Magnetic RAM (MRAM) is being widely explored as a potential technology for cache memory applications. SOT-MRAM has separate paths for the read and write processes, rendering it an attractive and efficient memory device. The potential of SOT-MRAM as a low-power, high-speed spintronic device for on-chip memory and main memory in HPC and AI applications is highlighted by Zheng et al. [5]. The behaviour of SOT-MRAM in embedded systems was analyzed experimentally in [6], showing an average power reduction of 47.1% and a performance increase of 39.9% compared with DRAM. Reliable evaluation of SOT-MRAM-based main memory is challenging because parameters related to timing and current are not publicly available. The significant advantage of MRAM devices is scalability, and they do not require any refresh, reducing static power consumption. STT-MRAM has been explored as the main memory in recent research [1], [7], providing promising results. The average speedups of Open and Close Page Techniques with a 1.2x configuration, as mentioned in [1], are 2.4% and 2.7%, respectively. Additionally, there was 4.6 times more power consumption than DRAM. One setback for STT-MRAM is its write latency, write energy and reliability [7]. SOT-MRAM provides better read/write latency, retention time, reliability and endurance than STT-MRAM [5], [8].
SOT-MRAM has not yet been evaluated as the primary memory for multi-core systems using full system simulators and benchmark programs, such as PARSEC [9]. Given its attractive features, SOT-MRAM has the potential to serve as the main memory. This research seeks to propose and explore its behaviour in a multi-core environment when handling shared memory workloads. We have derived SOT-MRAM main memory's timing and current parameters using the approach in [1]. The authors in [10] and [1] have validated the method and the parameters from industry estimation. Figure 1 depicts the different memory architectures used in our study.
The contributions of this work are:
• To the best of our knowledge, this is the first work to integrate SOT-MRAM into multi-core systems, extending its application scope across three memory structures.
• To address the challenge of parameter absence, the study estimates and scales parameters, ensuring a robust analysis of three memory structures.
• A comprehensive survey and micro-architectural exploration reveal SOT-MRAM's superior circuit-level metrics.
• Thorough system-level simulations provide a holistic evaluation, covering power consumption, bandwidth utilization, EDP, and total latency.
The subsequent sections of the work are presented as follows: In Section II, we provide an introduction to MRAM memory. Section III comprehensively reviews the state-of-the-art literature in the field. Section IV elaborates on the parameters of SOT-MRAM main memory, detailing the derivation process and validation. Section V outlines the experimental setup and processes, while Section VI offers a detailed analysis of the results obtained under various scenarios. Finally, in Section VII, we draw conclusions based on our findings.

II. MRAM BACKGROUND
Scaling down memory based on charge-based storage is facing issues due to the physical limitation of the components used in the memory circuits. Spintronics-based devices use the electron's spin property to store data [11]. Magnetic random access memories were invented several decades ago, but they were not used in mass production for commercial use because of the vulnerabilities they faced in storing the data. With the advent of the magnetic tunnel junction (MTJ), these memory technologies returned to the limelight. MRAM devices use an MTJ to store information, made up of a couple of ferromagnetic layers decoupled by an insulating layer. Read operations involve assessing the tunnel junction's resistance, while write operations are performed by directing a current through the MTJ [12]. STT-MRAM technology has been explored as an alternative to DRAM and SRAM due to its density, scalability, and reduced power consumption benefits. However, because of the common read-write paths used, optimizing both operations independently is impossible [13]. To address these issues, alternatives like SOT-MRAM devices are explored. SOT-MRAM-based devices share several common features with STT-MRAM, with the added advantage of separate read-write paths. This makes it an ideal device in the memory hierarchy [14], [15].

III. RELATED WORK
Though SOT-MRAM is investigated as a caching technology, it has not been explored yet as a main memory device alternative to DRAM or STT-MRAM for multi-core systems. This section reviews the integration of NVM as primary memory in horizontal structure, vertical structure, and hybrid memory systems for various workloads. Although the feasibility of using SOT-MRAM technology as the primary system memory in embedded, HPC, or general-purpose computers is not yet thoroughly studied, some researchers have investigated the potential of using STT-MRAM or PCM as a hybrid or full main memory in various computing systems.
We first list (Table 2) all the works using commercially available hardware in the experiments. In [16], Intel's first NVM-based Optane DC persistent memory module was tested on a 24-core processor for high-performance computing (HPC) applications, known as the ''seven dwarfs''. The results show that DRAM-cached NVM enhances HPC application performance and allows for handling larger problems than DRAM alone. HPC applications' performance on uncached NVM was categorized into three levels: insensitive, scaled, and bottlenecked. They introduced two optimization methods: a predictive model and a write-aware data placement, which doubled performance and cut DRAM use by 60%. However, these findings on HPC applications may differ for other applications or workloads using NVM-based memory.
Liu et al. [17] have devised a hybrid DRAM/STT-MRAM memory for IoT systems to enhance STT-MRAM's write speed with a fast data migration method. They built this system using Micron DDR3 SDRAM [24] and Everspin DDR3 STT-MRAM [25], focusing on reducing power use in standby mode. Their findings have shown only a minor drop in STT-MRAM performance. Importantly, they have shared specific timing and power details, offering valuable insights for further research. Work in [18] examines the effects of commercially available STT-MRAM and DRAM on energy harvesting in IoT systems. The study evaluates different scenarios and finds that using STT-MRAM results in a 15% decrease in power consumption and a 714x faster data restoration time. However, the study is limited by a lack of published timing and power parameters for 32MB STT-MRAM, which may affect its accuracy in representing applications with higher memory requirements.
The rows (Table 2) with type simulator are open-source simulators with limited publicly available configuration, timing, and power parameters. In [19], a new system called CAHRAM is introduced to improve PCM in a hybrid setup. CAHRAM uses deduplication to reduce access overheads. Results using Gem5 [26] and NVMain [27] show that the system outperforms existing methods in speed, PCM lifetime, and efficiency. This work's effectiveness may be limited for workloads with low-redundancy data. The study lacks detailed timing and current parameters. Jing and Li [20] analyze advancements in heterogeneous memory with STT-MRAM, introducing three optimized schemes: hierarchical, parallel, and hybrid architectures. However, the study lacks full system simulation analysis with benchmark programs, highlighting the need for further research to test these schemes under diverse workloads.
In [21], Mahdavi et al. review STT-MRAM issues like read disturbances and write failures. They propose reducing write operation errors and optimising encoding to decrease read disturbances. Their simulations on a quad-core processor using Gem5 [26]-NVMain [27] show a 92% reduction in read disturbances and a 22% decrease in write failures. Despite minimal area, power, and performance overheads (0.1%, 1.52%, and 1.19%), the study lacks detailed timing and current parameters.
Focusing on NVM-based main memory systems, we review research with publicly available timing and current parameters for open-source simulator verification.
In the thesis [10], the author evaluates STT-MRAM in HPC systems with trace-based simulations, assuming STT-MRAM is 50% and 100% slower than DRAM. Performance degradation ranges from 2 to 10%. Another study [1] assesses STT-MRAM as primary memory in high-performance computing compared to DRAM. Using SPEC2006 benchmarks on ZSim and DRAMSim2 with timing parameter scaling of 1.2x, 1.5x, and 2.0x, they find an average performance drop of 5.4% and 11.3% in integer and floating-point benchmarks, respectively, when timing parameters are twice those of DRAM. Lastly, considering STT-MRAM's benefits like radiation resistance and zero leakage power, its suitability for real-time systems, including space, automotive, and avionics applications, is evaluated [28]. Using benchmarks tailored for these sectors, the study suggests STT-MRAM could be a strong candidate for them.
Ma et al. [22] introduced a framework to assess STT-MRAM's reliability and performance at the computer architecture level using GEM5+NVMain. Their results show that STT-MRAM's average latency and energy use could increase by up to 5.996% and 20.65% compared to standard models. These results indicate that reliability and performance issues at the system level might limit STT-MRAM's use in high-performance and real-time systems.
Another issue with STT-MRAM is its modified sensing methods. The SMART system in [23] uses a modified approach for STT-MRAM to address the sensing challenges more efficiently than DRAM. SMART's sense amplifier design reduces row buffer size, leading to higher activation energy and lower performance. However, it improves by delaying bit-line sensing until receiving a column access command. This approach offers numerous benefits, including larger page sizes, less reliance on sense amplifiers, reduced activation power, better parallelism, lower latency, and more effective repair of faulty columns. SMART surpasses traditional STT-MRAM and DRAM in energy efficiency and performance while being more compact. This study provides publicly available timing and current parameters.
Research on STT-MRAM has been extensive, exploring its use in various settings, from IoT to real-time systems, using real hardware and simulations. Despite some performance drawbacks, studies demonstrate its potential to replace DRAM in many computing systems. However, high write energy, slow write speeds, and reliability issues still hinder STT-MRAM's adoption as the main memory. The following section will discuss the potential of SOT-MRAM as an alternative. Table 2 summarizes this research, noting gaps and challenges for future studies. Marks in the table distinguish unresolved issues from cases where studies have provided solutions or key parameters. The final row outlines the contribution of our study in this work.

IV. SOT-MRAM MAIN MEMORY
To address the challenges and retain the benefits of STT-MRAM, researchers have investigated other memory technologies, such as SOT-MRAM. Recent studies have demonstrated that SOT-MRAM could serve as a promising alternative to SRAM, offering similar benefits while overcoming some of the limitations of STT-MRAM. F. Oboril et al. provided compelling evidence of SOT-MRAM as a viable alternative to SRAM in their pioneering work, as noted in [29] for multi-core systems. The research has demonstrated that SOT-MRAM offers a 60% reduction in energy consumption, a 1% performance improvement, and an outstanding 27-fold reduction in retention failure probability. The results provided in their work make a strong case for using SOT-MRAM as a cache memory [29].
The integration of SOT-MRAM with the CMOS manufacturing process is another crucial aspect that needs to be addressed. As demonstrated in [5], SOT-MRAM has been successfully integrated into CMOS technology for cache replacement.
According to a study by Garello et al. [30], SOT-MRAM has shown superior power efficiency and performance compared to SRAM, which makes it a promising choice for cache applications. However, the larger bit cell area is still considered a major disadvantage.
To solve the problem of larger bit cell area, a potential solution has been presented in [31] with an area-optimized cell design that achieved a remarkable area optimization of 42%. Also, a similar pioneering area-optimized SOT-MRAM bit cell and memory organization was proposed in [32]. The result shows an area reduction of 32%. However, the requirement of two access transistors per bit has made SOT memories area-intensive. Along similar lines to the previous works, the problem of area efficiency was solved by introducing a multiple-bit SOT-MRAM cell in [33]. A multibit SOT cell with a shared write channel among multiple bits is proposed to address this challenge, enabling an area-efficient memory design with improved device density.
The authors conclude that SOT-MRAM may become a preferred choice for a wide range of memory hierarchies [33] including main memory.
All of the above works give us strong evidence that SOT-MRAM can be a potential candidate for main memory. Previous studies have shown favourable outcomes for the main memory implemented using STT-MRAM, and SOT-MRAM is adopted as the cache memory technology in [29]. Since SOT-MRAM and STT-MRAM exhibit similarities, and STT-MRAM is employed as the primary memory, we have evaluated the effects of employing SOT-MRAM as the primary memory in a multi-core environment.
In the absence of reliable current and timing parameters from manufacturers, the proposed SOT-MRAM-based main memory for the multicore environment uses approaches demonstrated in [1], [10], and [23]. These methods use simulators and commercially available hardware to validate their timing and current parameter scaling methodology. The results strongly suggest that read and write operations have identical costs after the data is loaded into the row buffer regardless of the memory technology [1].
Given the compelling evidence presented by prior research, this work explores the viability of SOT-MRAM as a main memory technology. The evaluation involves deriving and using separate scaled parameters for row buffer-associated and non-associated parameters. We estimated and scaled the timing and power parameters using established and validated methods outlined in previous studies such as [1] and [23]. Additionally, we conducted circuit-level memory exploration experiments to further validate the scaling factors employed. The details of this analysis can be found in the results section. The following section lists and provides details of the timing parameters.
Our decision to use DDR3-1600 for comparisons is based on the accessible and validated parameters specific to the DDR3 standard, which the authors of [1] obtained with their industry partner, Everspin Technologies, under a co-operation agreement. However, the parameters used do not correspond to their commercial product parameters. Moreover, the DDR3-1600 parameters also apply to other DDRx standards, as highlighted in recent studies [10], [17]. This approach ensures that our findings are relevant across various DDR technologies.

A. PARAMETERS FOR ESTIMATION OF TIMING
The DDRx protocol compatibility of SOT-MRAM memory and its similarity in organization and CPU interface to STT-MRAM provide valuable insights into the timing parameters of SOT-MRAM memory. In the case of DRAM and STT-MRAM primary memory devices, a row buffer serves as an interface between the memory bus and cell arrays [1]. The same holds for SOT-MRAM as well. Only timing parameters associated with the row buffer differ between DRAM and SOT-MRAM. This is because, after a row of data is loaded into the buffer, the timing parameters for subsequent operations are identical for both types of memory [28]. The circuitry beyond the row buffer for DRAM and SOT-MRAM is essentially the same [10]. The timing parameter tCWD, which denotes the delay between issuing a column write command and placing the data on the bus, remains the same for DRAM and SOT-MRAM. Other timing parameters not related to row operations, such as tBURST, tCAS, and tWTR, are identical. Table 3 presents these timings, expressed in cycles of DDR3-1600. The primary distinction between SOT-MRAM and DRAM main memory is the technology used in their storage cells, namely the MTJ and the capacitor, respectively. The cell access mechanism differs between these two memory technologies, leading to differences in the timing parameters related to SOT-MRAM row operations compared to DRAM. The access operation of DRAM is voltage-based, whereas SOT-MRAM's access operation is current-based [28].
The timing parameter related to precharging a bit line to a reference voltage before the cell access is Row pre-charge (tRP) [10]. In DRAM, to access a cell, a bit line is first pre-charged, and then the word line is activated, enabling the sensing circuit to sense the data. SOT-MRAM cell array access is different from DRAM, as it uses a current operation mode to read data stored in the MTJ by activating a word line and applying a small amount of current to sense the data through the bit line [28]. tRCD is the timing parameter that represents the time required to access and retrieve the data from a row and have it ready in the row buffer. tRCD, tRP, and the row-to-row activation delay (tRRD) are the three values which differ between DRAM and SOT-MRAM.
The timing parameters specific to SOT-MRAM have not been standardized or publicly disclosed due to the constantly evolving nature of the technology. As a result, memory manufacturers working on STT-MRAM and SOT-MRAM are not disclosing these parameters. Therefore, there is a need to perform a sensitivity analysis on the parameters that vary from DRAM to SOT-MRAM. In this study, a conservative scaling of SOT-MRAM parameters from DDR3-1600 DRAM is adopted using the methodology in [1] and [34]. The scaling methodology was validated in [1] and [23]. The SOT-MRAM tRFC (refresh cycle time) and tRAS (activate-to-precharge delay) values are taken as 0. Parameters are listed in Table 4. Unlike DRAM, the SOT-MRAM access operation is non-destructive, so no restoration of the row is required. Consequently, the next row access operation in SOT-MRAM can begin sooner than in DRAM. The SOT-MRAM row cycle (tRC) is shorter than that of DRAM in certain cases, despite having a longer tRCD [35]. In this work, we assume the SOT-MRAM operation to be symmetric.
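The scaling rule described above can be sketched in a few lines of Python. This is only an illustration of the procedure as stated in the text (scale tRCD, tRP and tRRD, set tRFC and tRAS to 0, keep everything else), not the authors' tooling. The baseline cycle counts and the factor of 1.25 are placeholder assumptions and do not reproduce the values of Table 3 or Table 4; the function and variable names are hypothetical.

```python
import math

# Row-related timings that differ between DRAM and SOT-MRAM per the text.
ROW_PARAMS = ("tRCD", "tRP", "tRRD")
# Timings taken as 0 for SOT-MRAM (no refresh, non-destructive access).
ZEROED_PARAMS = ("tRFC", "tRAS")


def scale_timings(dram, factor):
    """Return SOT-MRAM timings (in cycles) derived from a DRAM baseline."""
    sot = dict(dram)  # non-row timings (tCAS, tCWD, tBURST, tWTR) stay unchanged
    for name in ROW_PARAMS:
        sot[name] = math.ceil(dram[name] * factor)  # round up to whole cycles
    for name in ZEROED_PARAMS:
        sot[name] = 0
    return sot


# Placeholder baseline in memory-clock cycles; NOT the values of Table 3/4.
dram_baseline = {"tRCD": 11, "tRP": 11, "tRRD": 5, "tRAS": 28, "tRFC": 208,
                 "tCAS": 11, "tCWD": 8, "tBURST": 4, "tWTR": 6}

sot_timings = scale_timings(dram_baseline, factor=1.25)  # hypothetical factor
print(sot_timings)
```

Rounding up to whole cycles is a conservative choice made for this sketch; a sensitivity analysis would simply repeat the call with several values of `factor`.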

B. PARAMETERS FOR ESTIMATION OF POWER
The fundamental distinction between the two main memory technologies is the storage cell. The power parameters linked with the access of these cells vary between SOT-MRAM and DRAM. With regard to the power consumption of DRAM and SOT-MRAM, three parameters differ in present models. These parameters are the Active pre-charge Current (IDD0), Active Read pre-charge Current (IDD1), and Operating Burst Current (IDD4(R/W)) [1]. SOT-MRAM uses current mode to access its cells, in comparison to DRAM's voltage-mode cell operations. So we choose the same methodology adopted in timing parameter estimation and scale the values to account for the current-based sensing methods of SOT-MRAM as in [1] and [23]. In the case of SOT-MRAM, since it does not require refresh, the Refresh Current (IDD5) and Self Refresh Current (IDD6) are set to 0. However, the current parameters which are not associated with any operation accessing the cell (precharge and active power-down and stand-by currents) remain unchanged from DRAM to SOT-MRAM. Table 5 lists in bold the row-related parameters; the rest are the same for both memory types except the refresh currents.
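The current-parameter derivation follows the same pattern and can be sketched as below. This is a hedged illustration only: the milliampere values and the factor of 1.25 are placeholder assumptions, not the entries of Table 5, and the dictionary keys (`IDD4R`, `IDD4W`, `IDD2P`, `IDD3N`) are hypothetical names chosen for the sketch.

```python
# Currents tied to cell access are scaled; refresh currents are set to 0.
SCALED_CURRENTS = ("IDD0", "IDD1", "IDD4R", "IDD4W")
REFRESH_CURRENTS = ("IDD5", "IDD6")


def scale_currents(dram_ma, factor):
    """Return SOT-MRAM IDD values (mA) derived from a DRAM baseline."""
    sot = dict(dram_ma)  # power-down and stand-by currents stay unchanged
    for name in SCALED_CURRENTS:
        sot[name] = round(dram_ma[name] * factor, 2)
    for name in REFRESH_CURRENTS:
        sot[name] = 0.0  # SOT-MRAM needs no refresh
    return sot


# Placeholder baseline in mA; NOT the values of Table 5.
dram_currents = {"IDD0": 100.0, "IDD1": 120.0, "IDD2P": 12.0, "IDD3N": 45.0,
                 "IDD4R": 200.0, "IDD4W": 210.0, "IDD5": 220.0, "IDD6": 12.0}

sot_currents = scale_currents(dram_currents, factor=1.25)  # hypothetical factor
print(sot_currents)
```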

DESTINY (Design Space Exploration for Non-volatile Memory Technology) [36] is a simulator for exploring and analyzing non-volatile memory (NVM) technologies. It is a powerful tool used in computer architecture and memory design to evaluate the performance, power consumption, and area characteristics of different NVM technologies and configurations. The DESTINY simulator core was employed for conducting micro-architectural circuit-level experiments, optimizing the main memory's results for various memory capacities.
The system-level experiments integrate two standard simulators, Gem5 [26] and NVMain [27]. Gem5 is a system-level simulator for computer architecture simulation in full system mode. NVMain supports non-volatile main memory technology in hybrid as well as stand-alone mode. The benchmark programs used in the experiments are summarized in Table 6.
TABLE 6. Workloads from the PARSEC [9] benchmark suite used in the study.
Benchmark programs for the experiments are taken from the popular PARSEC benchmark suite [37]. It is a collection of programs used to evaluate multicore machines. The accuracy of experimental results is crucial in evaluating a system's performance, which is often achieved through simulation environments. In this study, experiments were conducted using the full system (FS) simulation environment of the gem5-NVMain simulator. The FS mode provides a more accurate representation of system interactions with the operating system compared to system emulation (SE) mode [26].
The NVMain configuration files for SOT-MRAM were populated using the approach proposed to design new memory devices compatible with DDRx protocol standards [1].
In our experiments, we employed a lower scaling factor for SOT-MRAM, considering its improved performance and modified sensing methods while still adhering to the DDRx protocol, similar to DRAM. This conservative approach was based on our circuit-level analysis results and supported by strong evidence from [6], [23], and [38]. In contrast, STT-MRAM was scaled at 1.25x, resulting in comparatively inferior read-write latency [1]. Additionally, we set other parameters following the DDR3-1600 standard, utilizing values sourced from the Micron data sheet [39].
Table 7 lists all the multi-core environment-related parameters, viz. clock speed, caches, memory controller and OS kernel, used for FS simulation.

VI. RESULTS AND DISCUSSION
This section comprehensively analyses the circuit-level main memory parameters for three memory technologies: DRAM, STT-MRAM and SOT-MRAM. The system-level analysis considers the influence of a multi-core environment, different memory organizations, and various memory capacities on memory structures. The study encompasses various metrics across workloads, including latency, power consumption, and bandwidth. Additionally, it examines the impact of various design parameters on the performance of SOT-MRAM. To conduct this analysis, we employed the exploration Algorithm-1, which allowed us to obtain valuable circuit-level results and insights.

A. ANALYSIS OF MICRO-ARCHITECTURE LEVEL NVM MAIN MEMORY DESIGN EXPLORATION
This section presents the results of micro-architecture level main memory exploration, focusing on the circuit-level analysis of 1GB to 128GB main memory. The ensuing analysis is based on our in-depth understanding and on compelling evidence from [6] and [38].
For these experiments, we utilized a refined version of Algorithm-1, which we adapted and optimized based on the work in [34]. To enhance the circuit-level performance of the main memory across different memory capacities, we integrated the core of the DESTINY simulator [36] into our approach.
Algorithm-1 serves as the guiding instructions for the ideal tuning algorithm applied to the main memory. The power, performance, and area results of primary memory vary significantly based on the optimization target chosen in DESTINY [36]. The optimal configuration for each memory technology is selected independently using Algorithm-1 to obtain the optimal performance results.
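The idea of selecting the best configuration independently for each technology under a chosen optimization target can be sketched as follows. This is not Algorithm-1, which is not reproduced in this section, and it does not call DESTINY; the `Candidate` fields, the function names and all numbers are hypothetical and only illustrate how the choice of target changes the selected design point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One circuit-level design point (all fields are hypothetical)."""
    label: str
    read_latency_ns: float
    write_latency_ns: float
    read_energy_nj: float
    area_mm2: float


def best_candidate(candidates, target):
    """Return the design point with the smallest value of the target metric."""
    return min(candidates, key=lambda c: getattr(c, target))


def explore(candidates_by_tech, target):
    """Select the best design point for each technology independently."""
    return {tech: best_candidate(cands, target)
            for tech, cands in candidates_by_tech.items()}


# Made-up design points, for illustration only.
design_space = {
    "DRAM": [Candidate("d1", 20.0, 22.0, 1.5, 90.0),
             Candidate("d2", 18.0, 25.0, 1.8, 110.0)],
    "SOT-MRAM": [Candidate("s1", 4.0, 5.0, 0.4, 35.0),
                 Candidate("s2", 3.5, 6.0, 0.5, 30.0)],
}

for target in ("read_latency_ns", "area_mm2"):
    chosen = explore(design_space, target)
    print(target, {tech: c.label for tech, c in chosen.items()})
```

In this toy data the DRAM choice flips between `d2` (latency target) and `d1` (area target), which mirrors the observation that results depend strongly on the optimization target.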

1) TOTAL AREA
In Figure 2, we observe the impact of memory size on the area occupied by different main memory technologies, viz. DRAM, STT-MRAM, and SOT-MRAM. As we vary the memory size from 1GB to 128GB, it is evident that the overall memory area also increases by approximately two times across all memory technologies. A notable finding is that the area occupied by the main memory chip is reduced by approximately four times when STT-MRAM replaces DRAM. In parallel, our analysis also revealed a substantial reduction in the total chip area, amounting to about three times, when SOT-MRAM replaces DRAM. This finding underscores the efficiency and space-saving advantages of employing SOT-MRAM as an alternative memory technology. The same is listed in Table 9.
Positive values indicate an increase in the area while negative values indicate a decrease in the area (percentage-wise). The reduction in the area presents two significant advantages. Firstly, it allows for higher memory density in a given physical space, making it possible to accommodate larger memory sizes. Secondly, it provides the opportunity to maintain the same memory capacity as DRAM while utilizing MRAM technology with a more compact memory area. STT-MRAM enables a 4x increase in memory size, while SOT-MRAM offers 3x more memory capacity. SOT-MRAM is preferred over STT-MRAM as it overcomes the limitations associated with the latter.

2) READ AND WRITE LATENCIES
The analysis of main memory read latency (Fig. 3) reveals significant latency reductions that can be achieved by adopting STT-MRAM and SOT-MRAM over traditional DRAM. At 1GB memory size, STT-MRAM offers approximately 4.56 times faster read access than DRAM, while SOT-MRAM showcases an even more impressive 5.61 times improvement. These reductions continue across various memory sizes, with STT-MRAM and SOT-MRAM consistently outperforming DRAM in read access times.
Similarly, regarding write latency, STT-MRAM and SOT-MRAM demonstrate substantial improvements over DRAM. At 1GB memory size, STT-MRAM provides around 2.68 times faster write access, while SOT-MRAM achieves a remarkable 6.34 times improvement. These gains continue at larger memory sizes, with STT-MRAM offering approximately 3.56x faster write access on average compared to DRAM and SOT-MRAM delivering an impressive 5.16x improvement. Also, STT-MRAM takes 2x to 3x more write access time than SOT-MRAM at various memory capacities.

FIGURE 3. Analysis of access latency (ns).
In conclusion, adopting MRAM-based technologies, especially SOT-MRAM, presents a compelling opportunity to significantly reduce both read and write access times compared to traditional DRAM. These latency reductions have the potential to enhance overall system performance and make MRAM-based main memory systems a promising choice for future memory architectures. SOT-MRAM's ability to overcome the long and unreliable write times of STT-MRAM makes it a superior replacement for DRAM regarding access latencies.

3) DYNAMIC ENERGY
The analysis of access energy consumption in main memory reveals notable differences among memory technologies (Fig. 4). When DRAM main memory is compared to STT-MRAM main memory, significant reductions in read dynamic energy are observed, with an average decrease of approximately 72.18%. Similarly, SOT-MRAM has a 75.98% read energy reduction compared to DRAM. Further, SOT-MRAM reduces the read dynamic energy by 16.62% on average in comparison to STT-MRAM.
Using STT-MRAM in place of DRAM reduces the write dynamic energy (Fig. 4) by about 78.84% on average. The STT-MRAM to SOT-MRAM main memory comparison yields a further reduction of approximately 65.48%. Replacing DRAM with SOT-MRAM leads to the most significant energy reduction, with an average decrease of around 92.70% for write dynamic energy.
These findings highlight the potential benefits of adopting MRAM-based technologies, especially SOT-MRAM, in reducing the energy consumption of main memory systems. The considerable energy reductions achieved through these transitions can improve energy efficiency and lower power consumption in memory-intensive applications, making SOT-MRAM-based main memory systems a promising choice for energy-conscious memory architectures.
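As a quick sanity check on the write-energy figures above, successive percentage reductions compose multiplicatively: a reduction of r1 followed by a reduction of r2 leaves a fraction (1 - r1)(1 - r2) of the original energy. The sketch below applies this to the reported averages; the function name is hypothetical, and because the reported values are averages over several capacities, the composition is only expected to agree approximately.

```python
def compose_reductions(first_pct, second_pct):
    """Overall percentage reduction after two successive reductions."""
    remaining = (1 - first_pct / 100) * (1 - second_pct / 100)
    return 100 * (1 - remaining)


# Write dynamic energy: DRAM -> STT-MRAM (78.84%), then STT-MRAM -> SOT-MRAM (65.48%).
write_total = compose_reductions(78.84, 65.48)
print(f"DRAM -> SOT-MRAM write energy reduction: {write_total:.2f}%")
```

For the write path this composition lands very close to the reported DRAM to SOT-MRAM reduction of 92.70%. The read-path averages do not compose as tightly, which is unsurprising for averaged ratios.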

4) ENERGY-DELAY PRODUCT (EDP)
Fig. 5 presents the Energy-Delay Product (EDP) analysis in main memory micro-architecture, revealing significant advantages of transitioning from DRAM to MRAM-based technologies, particularly SOT-MRAM. At 1GB memory size, STT-MRAM offers approximately 93.71% lower EDP for read operations and around 85.99% lower EDP for write operations than DRAM. However, SOT-MRAM outperforms DRAM and STT-MRAM, with approximately 96.35% and 96.62% lower EDP for read and write operations, respectively.
At 8GB memory size, STT-MRAM shows around 97.16% and 98.86% EDP reduction for read and write operations, respectively, compared to DRAM. Yet, SOT-MRAM exhibits even greater improvements, with approximately 97.88% and 99.75% lower EDP for read and write operations, respectively. For the 64GB memory size, STT-MRAM achieves approximately 90.23% lower EDP for read operations and around 92.59% lower EDP for write operations compared to DRAM. On the other hand, SOT-MRAM again surpasses DRAM and STT-MRAM, showing approximately 92.35% and 98.58% lower EDP for read and write operations, respectively.
Finally, at 128GB memory size, STT-MRAM exhibits around 80.04% lower EDP for read operations and approximately 84.14% lower EDP for write operations compared to DRAM. Nevertheless, SOT-MRAM remains the superior choice, showcasing approximately 84.68% and 97.12% lower EDP for read and write operations, respectively.
On average, STT-MRAM provides an EDP reduction of approximately 92.73% for read operations and around 77.37% for write operations compared to DRAM. However, SOT-MRAM continues to exhibit the most significant improvements, with an average EDP reduction of approximately 94.55% for read operations and 98.57% for write operations compared to DRAM.
The substantial EDP reductions observed in SOT-MRAM indicate its superiority over DRAM and STT-MRAM. The technology's ability to deliver remarkably lower energy consumption and access delays makes it a highly favourable alternative for energy-efficient memory architectures, offering the potential for enhanced system performance and reduced power consumption.

5) LEAKAGE POWER
In Fig. 6, the main memory analysis reveals interesting insights regarding leakage power. STT-MRAM consumes approximately 1.8 times more leakage power than DRAM, while SOT-MRAM consumes around 1.3 times more than DRAM and has 0.5 times less leakage power than STT-MRAM. Despite this higher leakage power, other essential performance parameters strongly favour SOT-MRAM.

FIGURE 6. Analysis of leakage power (W).
In conclusion, the analysis reveals that STT-MRAM and SOT-MRAM consistently outperform DRAM regarding access time, with SOT-MRAM showing the most impressive performance, as shown in Table 9. On average, SOT-MRAM provides approximately 4.83 times faster read access and 5.16 times faster write access than DRAM, making it a superior option for memory-intensive tasks. Furthermore, SOT-MRAM leads in energy efficiency, achieving remarkable reductions in dynamic energy consumption compared to DRAM. On average, SOT-MRAM demonstrates a 75.98% reduction in read dynamic energy and a significant 92.70% reduction in write dynamic energy. While STT-MRAM also exhibits energy savings, SOT-MRAM's improvements are more substantial.
The Energy-Delay Product (EDP) further highlights the dominance of SOT-MRAM, offering approximately 94.54% and 98.56% reductions in read and write EDP compared to DRAM, respectively, outperforming both DRAM and STT-MRAM. Regarding chip area, SOT-MRAM is more efficient, occupying approximately three times less space than DRAM. Although STT-MRAM can provide higher memory density, it has significant disadvantages, such as read disturbance, high write time, and energy consumption.
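As a rough cross-check of how the averages above relate to one another, the following sketch assumes that EDP is the product of access energy and access time, so that a fractional energy reduction r combined with a speedup s gives an EDP ratio of (1 − r)/s. The function name is illustrative and not part of the study's tooling. Because the reported EDP figures are averages over per-capacity values, this estimate is only expected to land near them, not to reproduce them exactly.

```python
def edp_reduction(energy_reduction_pct, speedup):
    """Estimate the EDP reduction (%) from an energy reduction (%) and a speedup factor.

    Assumption: EDP = energy x delay, and delay_new = delay_old / speedup.
    """
    energy_ratio = 1.0 - energy_reduction_pct / 100.0
    delay_ratio = 1.0 / speedup
    return (1.0 - energy_ratio * delay_ratio) * 100.0


# Average SOT-MRAM vs. DRAM figures quoted in the text.
read_estimate = edp_reduction(75.98, 4.83)
write_estimate = edp_reduction(92.70, 5.16)

print(f"estimated read EDP reduction:  {read_estimate:.2f}%")
print(f"estimated write EDP reduction: {write_estimate:.2f}%")
```

Under these assumptions the estimates come out at roughly 95.0% for reads and 98.6% for writes, close to the reported averages.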
Despite having slightly higher leakage power than DRAM, SOT-MRAM exhibits nearly 16% less leakage than STT-MRAM. Combined with its superior access time, dynamic energy savings, lower EDP, and better memory density, as shown in Table 9 (fourth column), SOT-MRAM proves to be a compelling alternative to replace DRAM in primary memory systems. Its potential for improved memory performance and energy efficiency, along with overcoming the drawbacks of STT-MRAM (Table 9, third column), solidifies SOT-MRAM as the more promising choice for next-generation memory solutions.
Table 9 summarizes the average micro-architectural parameter values of the three memory technologies. Positive values denote an increase, whereas negative values indicate a reduction in the corresponding parameters. The first column lists the parameters, while the subsequent columns compare values between DRAM and STT-MRAM, STT-MRAM and SOT-MRAM, and DRAM and SOT-MRAM, respectively.
The analysis shows that at the circuit level, SOT-MRAM performs better than DRAM. However, when adapting an NVM cell for main memory using the same DDRx protocol, we took a cautious approach. In the following section, we delve into system-level simulations and their results, where we intentionally scaled the current and timing parameters for SOT-MRAM beyond what DRAM typically uses, as in [1] and [10]. Even with these conservative adjustments, SOT-MRAM not only exhibited similar performance to DRAM but even outperformed it in some aspects.

B. FULL SYSTEM ANALYSIS OF MULTI-CORE ENVIRONMENT
In this section, we examine how different applications in a multi-core environment perform in terms of power consumption, bandwidth utilization, EDP (Energy-Delay Product), and total latency when using DRAM, hybrid memory, and SOT-MRAM as the main memory structures.

1) ANALYSIS OF TOTAL POWER CONSUMPTION
This section analyses the total power consumed by different memory structures. Figures 7a, 8a, 9a, and 10a depict the total power consumed by different benchmark programs. The analysis was performed in single-core, dual-core, quad-core, and octa-core environments. The values in Table 10 shed light on the power consumption of each memory technology across different core configurations. Positive values indicate an increase in total power consumption, while negative values indicate a reduction. This analysis helps us understand how core counts influence the total power consumption of different memory technologies under various application workloads.
We consistently observe reductions when comparing DRAM and Hybrid memory power consumption across various core configurations. In a single-core setup, Hybrid memory consumes 62.16% less power than DRAM. This reduction remains consistent as the number of cores increases, with power reductions of 61.32%, 60.99%, and 61.59% in 2-core, 4-core, and 8-core environments, respectively.
Full SOT-MRAM main memory also outperforms DRAM regarding power consumption. Specifically, in a 1-core environment, full SOT-MRAM exhibits 74.78% less power consumption, a significant improvement over DRAM. This trend continues as the number of cores increases, with reductions of 73.73%, 73.97%, and 73.73% in 2-core, 4-core, and 8-core environments, respectively.
The analysis shows that SOT-MRAM consistently has lower power consumption than Hybrid memory. In a 1-core environment, the SOT-MRAM configuration shows a 33.36% decrease in power consumption compared to Hybrid memory. This reduction remains consistent in 2-core, 4-core, and 8-core environments, with 32.09%, 33.26%, and 31.61% less power consumption, respectively. These findings suggest that SOT-MRAM is more efficient and cost-effective for power-conscious applications than DRAM.

TABLE 10. Comparison of the memory structures' total power consumption (%).
In conclusion, regardless of the number of cores, full SOT-MRAM demonstrates the largest power reduction, 74.05% on average relative to DRAM, and it also consumes less power than the Hybrid memory configuration. Its consistent performance across various core configurations makes it a promising and energy-efficient memory technology for multi-core environments.
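The per-core-count figures quoted in this subsection can be related to one another with a few lines of arithmetic. The sketch below is an illustration rather than the study's own tooling: it averages the DRAM-relative changes and derives the SOT-MRAM-versus-Hybrid change from the two DRAM-relative changes, on the assumption that all three comparisons refer to the same underlying power values. The helper name `relative_change` is hypothetical.

```python
from statistics import mean

# Changes in total power relative to DRAM (%), as quoted in the text
# for the 1-, 2-, 4-, and 8-core runs.
hybrid_vs_dram = [-62.16, -61.32, -60.99, -61.59]
sot_vs_dram = [-74.78, -73.73, -73.97, -73.73]


def relative_change(a_vs_base_pct, b_vs_base_pct):
    """Change of B relative to A (%), given both changes relative to a common baseline."""
    a_ratio = 1.0 + a_vs_base_pct / 100.0
    b_ratio = 1.0 + b_vs_base_pct / 100.0
    return (b_ratio / a_ratio - 1.0) * 100.0


sot_avg = mean(sot_vs_dram)
sot_vs_hybrid = [relative_change(h, s) for h, s in zip(hybrid_vs_dram, sot_vs_dram)]

print(f"mean SOT-MRAM vs DRAM: {sot_avg:.2f}%")
print("derived SOT-MRAM vs Hybrid:", [round(x, 2) for x in sot_vs_hybrid])
```

The mean of the SOT-MRAM column reproduces the 74.05% average reduction, and the derived SOT-MRAM-versus-Hybrid values agree with the reported 33.36%, 32.09%, 33.26%, and 31.61% to within rounding.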

2) ANALYSIS OF BANDWIDTH
This section explores bandwidth utilization across different core configurations for the three memory structures. The values presented in Table 11 show the performance of each memory technology across varying core configurations. Positive values indicate an increase in bandwidth utilization, while negative values signify a reduction. This analysis elucidates the impact of core counts on the bandwidth utilization of different memory technologies. SOT-MRAM consistently demonstrates the highest bandwidth utilization across all core configurations. In a single-core setup, SOT-MRAM achieves a 40.35% increase in bandwidth utilization compared to DRAM. Hybrid memory attains a 29.24% improvement over DRAM, while its bandwidth utilization is 8.60% lower than that of SOT-MRAM.
As the core counts increase, this trend persists. SOT-MRAM maintains its bandwidth utilization superiority in dual-core, quad-core, and octa-core environments, sustaining an average increase of 40.10%. In contrast, DRAM and Hybrid memory show comparatively lower values. This consistent pattern underscores the efficiency of SOT-MRAM in managing data-intensive tasks across diverse computational workloads.
The average percentage changes confirm this trend. On average, SOT-MRAM exhibits 40.10% higher bandwidth utilization than DRAM, and Hybrid memory has a 28.34% advantage over DRAM. Relative to SOT-MRAM, however, Hybrid memory's bandwidth utilization is 9.16% lower (−9.16%), underscoring the consistent and superior performance of SOT-MRAM in optimizing bandwidth utilization.
In conclusion, this analysis highlights the significant influence of memory technology on bandwidth utilization, with SOT-MRAM standing out as the preferred option across a range of core configurations. Its ability to consistently maintain high bandwidth utilization under varying workloads underscores its potential to enhance overall system performance and efficiency.

3) ANALYSIS OF TOTAL LATENCY
Examining total latency across various core configurations and memory technologies yields significant insights into their performance dynamics. Table 12 compares the DRAM, Hybrid memory, and SOT-MRAM structures; each entry in the table represents the percentage change in total latency when transitioning between memory technologies for a given core count. Positive values indicate an increase in latency, and negative values indicate a decrease. Figures 7c, 8c, 9c, and 10c visually represent these trends, showing how core counts and memory technologies together influence latency. A lower latency value indicates better performance.
Analyzing the average latency across all core configurations provides a more comprehensive view. On average, Hybrid memory shows a latency of approximately 47.80 cycles, DRAM records a latency of about 46.14 cycles, and SOT-MRAM demonstrates the lowest latency at around 43.92 cycles. Averaged over the core counts, SOT-MRAM therefore offers the lowest latency, while Hybrid memory and DRAM follow closely.
In a single-core environment, replacing DRAM with Hybrid memory leads to an increase of 5.72% in total latency, indicating that Hybrid memory takes slightly longer to execute tasks in this configuration. Replacing DRAM with SOT-MRAM likewise increases latency, by 4.98%, so SOT-MRAM carries a small penalty in single-core scenarios. With two cores, both penalties shrink. For DRAM to Hybrid memory, the increase in latency drops to 1.13%, showing that the difference in latency between the two memory technologies diminishes with more cores. Replacing DRAM with SOT-MRAM in a dual-core setup yields a mere 0.22% increase in latency, implying a minimal impact on performance. The quad-core results show a varying impact. For DRAM to Hybrid memory, the latency increase becomes more significant at 4.37%, suggesting that Hybrid memory may become less favourable in quad-core environments. On the other hand, replacing DRAM with SOT-MRAM changes latency by −0.68%, a slight reduction that signifies SOT-MRAM's suitability for specific multi-core workloads. The octa-core results diverge further. DRAM to Hybrid memory results in a latency increase of 3.61%, while switching from DRAM to SOT-MRAM changes latency by −4.80%, a substantial reduction. This divergence highlights the potential of SOT-MRAM to excel in highly parallel computational scenarios.
One notable trend is the consistent latency increase when transitioning from DRAM to Hybrid memory. Across all core counts, the latency values for Hybrid memory exceed DRAM's. On average, Hybrid memory exhibits around 3.71% higher latency than traditional DRAM. This increase in latency aligns with the characteristics of Hybrid memory, which typically introduces some overhead due to its complex architecture. However, when assessing the shift from DRAM to SOT-MRAM, a different pattern emerges. The average latency change is almost negligible, with SOT-MRAM showing a mere 0.07% variation from DRAM. This outcome indicates that, on average, SOT-MRAM performs on par with, if not better than, DRAM in terms of latency: the small penalty at low core counts is offset by a reduction at higher core counts, reinforcing its suitability for diverse computational loads.
In summary, the analysis underscores memory technology's pivotal role in influencing total latency. Hybrid memory consistently introduces a modest latency increase compared to DRAM, with variations influenced by core counts. On the other hand, SOT-MRAM maintains latency levels comparable to or better than DRAM across different core configurations. This study advocates for SOT-MRAM's adoption, given its potential to enhance system responsiveness, particularly in multi-core environments. These findings contribute to the broader discourse on memory technology's impact on system performance and highlight SOT-MRAM as a compelling choice for memory system optimization.
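Because the SOT-MRAM latency changes have mixed signs, the average alone hides the spread across core counts. The short sketch below, which uses only the per-core-count percentages quoted in this subsection, prints the mean alongside the minimum and maximum for each structure. The dictionary layout and variable names are illustrative choices, not part of the study.

```python
from statistics import mean

# Change in total latency relative to DRAM (%), keyed by core count,
# as quoted in the text.
latency_change = {
    "Hybrid": {1: 5.72, 2: 1.13, 4: 4.37, 8: 3.61},
    "SOT-MRAM": {1: 4.98, 2: 0.22, 4: -0.68, 8: -4.80},
}

summary = {}
for name, per_core in latency_change.items():
    values = list(per_core.values())
    summary[name] = (mean(values), min(values), max(values))
    avg, low, high = summary[name]
    print(f"{name:9s} mean={avg:+.2f}%  min={low:+.2f}%  max={high:+.2f}%")
```

With these inputs the Hybrid mean is about +3.71%, while the SOT-MRAM mean is about −0.07%: a near-zero average whose magnitude matches the 0.07% quoted above, but which spans a range from +4.98% at one core to −4.80% at eight cores.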

4) BURST POWER
Figures 7b, 8b, 9b, and 10b show the burst power analysis conducted across various core configurations (ranging from single-core to octa-core) and memory technologies (DRAM, Hybrid memory, and SOT-MRAM). The burst power values show that both core count and memory technology influence power consumption patterns. On average, DRAM has the lowest burst power at 0.016 watts, followed by Hybrid memory at 0.023 watts, while SOT-MRAM has the highest burst power consumption at 0.034 watts. Considering the impact of core count, a consistent trend emerges: as the number of cores increases, the burst power consumption converges around the average values mentioned above across all memory technologies.

5) ANALYSIS OF EDP
The analysis conducted on the Energy-Delay Product (EDP) across diverse core configurations and memory technologies offers valuable insights. The comparison of substituting DRAM with Hybrid or SOT-MRAM memory technologies across varying core counts, along with the average EDP values, is presented in Table 13. Additionally, the findings illustrated in Fig. 11 underscore a notable trend: EDP values tend to cluster by the memory technology employed rather than by the number of cores. This analysis clarifies the relationship between core configurations, memory technologies, and EDP.
The comparative analysis between DRAM and Hybrid memory structures demonstrates a consistent average reduction of approximately 57.08% in EDP across varying core configurations. This underscores the energy efficiency improvements that Hybrid memory consistently provides over conventional DRAM, regardless of the number of cores in the system.
Similarly, the comparison between DRAM and SOT-MRAM memory structures reveals an average EDP reduction of approximately 72.85% across diverse core counts. This underscores the substantial energy-saving potential of SOT-MRAM relative to DRAM, irrespective of the system's core configuration.
Further, comparing the Hybrid and SOT-MRAM memory structures shows an average EDP reduction of approximately 36.65% for SOT-MRAM across varying core counts. SOT-MRAM therefore improves energy efficiency beyond what Hybrid memory achieves, irrespective of core count. The findings establish that the choice of memory technology exerts a substantial influence on EDP, whereas the impact of core count remains marginal. Hybrid memory and SOT-MRAM consistently emerge as superior alternatives to conventional DRAM in terms of energy efficiency, emphasizing their potential to optimize energy consumption and performance within memory systems.
Figs. 12 and 13 present the EDP, power, and performance analysis of the three memory organizations evaluated.
Looking at the total power consumption, SOT-MRAM-1 stands out as the most power-efficient, followed closely by Hybrid-1 and DRAM-3. Burst power favors Hybrid-1, with SOT-MRAM-1 and SOT-MRAM-2 showing comparable results. Regarding bandwidth, SOT-MRAM configurations exhibit higher values, with SOT-MRAM-1 leading the way. Regarding average latency, DRAM-3 has the lowest latency, while Hybrid-3 shows the highest. As for EDP, DRAM-3 demonstrates the lowest energy-delay product, indicating better overall performance. Comparing the different DRAM configurations, DRAM-2 and DRAM-3 show similar results in most parameters. Hybrid-1 offers the best performance among the Hybrid configurations, while Hybrid-2 and Hybrid-3 have higher latency values.
In the case of SOT-MRAM, SOT-MRAM-1 and SOT-MRAM-2 are generally more power-efficient and have lower latency compared to SOT-MRAM-3. When comparing the best-performing memory configurations among DRAM, Hybrid, and SOT-MRAM, Hybrid-1 and SOT-MRAM-1 stand out with lower power consumption and latency and higher bandwidth. In terms of EDP, SOT-MRAM-1 exhibits the most favourable results.
In conclusion, Hybrid-1 and SOT-MRAM-1 emerge as the best-performing memory organizations among the tested configurations, offering a balance of power efficiency, low latency, high bandwidth, and favorable EDP. These findings highlight the potential benefits of adopting Hybrid and SOT-MRAM technologies for next-generation memory solutions, showcasing their superior performance over traditional DRAM configurations.
Regarding total power consumption, Hybrid-1 shows a significant advantage with a 60.74% reduction compared to DRAM-2. However, SOT-MRAM-1 outperforms both DRAM-2 and Hybrid-1 with a 73.21% reduction in total power. Regarding burst power, SOT-MRAM-1 consumes the most, with a 102.81% increase compared to DRAM-2 and a 44.55% increase compared to Hybrid-1. Average latency is slightly better in the DRAM-2 and SOT-MRAM-1 configurations, by 4.55% and 3.94%, respectively, compared to Hybrid-1. For bandwidth, both DRAM-2 and SOT-MRAM-1 show improvements compared to Hybrid-1, with percentage increases of 20.66% and 31.03%, respectively. EDP reduction is most significant in SOT-MRAM-1, showing a 75.27% reduction compared to DRAM-2 and a 42.36% reduction compared to Hybrid-1.
Overall, SOT-MRAM-1 exhibits the best balance of power, performance, and energy efficiency among the three memory configurations, making it the most promising choice for next-generation memory solutions.

D. MEMORY CAPACITY BASED ANALYSIS
This section analyses how different main memory technologies with varying capacities, organized into three memory structures, affect system-level parameters when employing an optimal memory organization. The study investigates how these memory configurations impact performance and power consumption in the overall system. The results shed light on the relationship between memory capacity, organization, and key system-level metrics, helping to make informed decisions on memory architecture for improved overall system performance and efficiency. Fig. 14 presents the results for 4GB to 128GB main memory as the average of values across all workloads from PARSEC. Table 16 shows the various capacity hybrid memory compositions from 4GB to 128GB used in the analysis. The analysis of total power consumption for the given workloads is shown in Fig. 14a. The following points analyse the DRAM, Hybrid and SOT-MRAM memory structures.
• DRAM and Hybrid: The hybrid memory configuration shows a notable reduction in total power consumption compared to DRAM at most capacities, with changes ranging from -30.96% to -68.28%. The exception is the 16GB capacity, where the hybrid memory demonstrates a surprising 67.46% increase in power consumption, suggesting that it may not be the optimal choice for this specific capacity.
• DRAM and SOT-MRAM: The SOT-MRAM configuration consistently displays significant power savings over DRAM, with changes ranging from -37.3% to -84.62%. At 64GB capacity, the SOT-MRAM configuration shows a 61.89% decrease in power consumption, highlighting its superiority over DRAM for this capacity.
• Hybrid and SOT-MRAM: The SOT-MRAM memory configuration shows varying power savings compared to the hybrid memory configuration, with changes ranging from -9.25% to -62.75%. At 16GB capacity, the SOT-MRAM memory exhibits a 62.74% decrease in power consumption compared to hybrid memory, indicating its potential advantage over hybrid memory for this capacity.
Overall, the analysis reveals that SOT-MRAM consistently demonstrates the most significant power savings compared to DRAM and hybrid memory across different capacities. However, it is crucial to consider specific capacity requirements when selecting the optimal memory organization to achieve a balance between power consumption and performance.
Figure 14b illustrates the average Energy-Delay Product (EDP) analysis for optimal memory organizations suitable for 4GB to 128GB capacity.
• DRAM and Hybrid: The hybrid memory configuration shows an increase in EDP compared to DRAM, with the percentage change ranging from 79.61% to 481.14%. At 8GB capacity, the hybrid memory exhibits the highest increase in EDP, indicating that it may not be the most energy-efficient option for this specific capacity.
• DRAM and SOT-MRAM: The SOT-MRAM configuration demonstrates a reduction in EDP compared to DRAM, with the percentage change ranging from -33.75% to -82.17%. At 4GB capacity, the SOT-MRAM configuration exhibits the highest reduction in EDP, suggesting it may offer better energy efficiency for low-capacity scenarios.
• Hybrid and SOT-MRAM: The SOT-MRAM memory configuration shows a decrease in EDP compared to the hybrid memory configurations, with the percentage change ranging from -63.11% to -93.85%. At capacities of 8GB and 32GB, the SOT-MRAM memory demonstrates the most significant EDP reduction of -93.8% and -93.6%, respectively. This suggests that, for these specific capacities, SOT-MRAM could offer enhanced energy efficiency compared to hybrid memory.
In summary, the analysis reveals that SOT-MRAM demonstrates lower EDP values compared to DRAM, making it a more energy-efficient choice. However, the hybrid memory configuration shows higher EDP values than both DRAM and SOT-MRAM, suggesting it may not be the best option for optimizing energy efficiency. Selecting the appropriate memory organization should consider the specific capacity requirements and the desired trade-off between energy efficiency and performance.
Fig. 14c shows the average bandwidth utilization across all workloads for the three memory structures.
• DRAM and Hybrid: The hybrid memory configuration shows mixed changes in bandwidth utilization compared to DRAM, with the percentage change ranging from -6.66% to 15.36%. At 4GB capacity, the hybrid memory demonstrates the largest decrease in bandwidth utilization, indicating that it may not be the best choice for low-capacity scenarios.
• DRAM and SOT-MRAM: The SOT-MRAM configuration shows an increase in bandwidth utilization compared to DRAM, with the percentage change ranging from 10.09% to 84.45%. At 8GB capacity, the SOT-MRAM configuration exhibits the highest increase in bandwidth utilization, suggesting it may provide better performance for this specific capacity.
• Hybrid and SOT-MRAM: The hybrid memory configuration experiences varying changes in bandwidth utilization compared to SOT-MRAM, with the percentage change ranging from 17.94% to -4.18%. At 64GB capacity, the SOT-MRAM memory exhibits the highest increase in bandwidth utilization, indicating it may offer better performance for this particular capacity.
Overall, the analysis reveals that the hybrid memory shows lower bandwidth utilization compared to DRAM, whereas SOT-MRAM demonstrates improvements over DRAM depending on the capacity.
Fig. 14d presents the analysis of the three memory technologies' average total latency at various memory capacities.
• DRAM and Hybrid: The hybrid memory configuration exhibits a considerable increase in average latency compared to DRAM, ranging from 61.24% to 137.20%. At 32GB capacity, the hybrid memory shows the highest increase in average latency, indicating that it may not be the best choice for latency-sensitive applications.
• DRAM and SOT-MRAM: The SOT-MRAM configuration shows both slight increases and decreases in average latency compared to DRAM, ranging from 2.83% to −32.37%. At 8GB capacity, the SOT-MRAM configuration displays the largest change in average latency, at −32.37%, suggesting it may offer better performance for this specific capacity.
• Hybrid and SOT-MRAM: The SOT-MRAM memory configuration shows lower average latency than the hybrid memory at all capacities, with changes ranging from -36.23% to −71.49%. At 8GB capacity, the SOT-MRAM memory exhibits the largest decrease in average latency at −71.48%, indicating it may provide better latency performance for this particular capacity. On average, the SOT-MRAM memory can reduce the latency and speed up workloads by 49.90%.
The analysis indicates that hybrid memory exhibits higher average latencies than DRAM, while SOT-MRAM shows latencies that are better than or comparable to DRAM, contingent upon the capacity. Specific capacity needs and the intended balance between latency and performance should guide the choice of an optimal memory organization.
The analysis of average burst power across capacities from 4GB to 128GB reveals distinct power consumption patterns among the memory technologies in Fig. 14e. DRAM exhibits the lowest burst power consumption at 0.015 watts, followed by hybrid memory at 0.017 watts. Notably, SOT-MRAM demonstrates higher burst power consumption, registering 0.036 watts. This observation underscores the varying power efficiency profiles of these memory technologies.
In summary, Fig. 14 offers a comprehensive overview of the analysis conducted on different memory technologies across a wide range of capacities, spanning from 4GB to 128GB. Comparing DRAM to hybrid memory, there is a noteworthy reduction in average total power consumption of 23.56%, and in comparison to SOT-MRAM, the reduction is even more significant at 53.21%. The average total latency shows a substantial increase of 115.78% when comparing DRAM to hybrid memory, but a slight change of −2.2% when comparing DRAM to SOT-MRAM and a larger change of −49.90% for hybrid memory to SOT-MRAM.
Furthermore, the analysis shows improvements in bandwidth utilization. There are increases of 13.27%, 83.80%, and 62.54% for the DRAM to hybrid, DRAM to SOT-MRAM, and hybrid to SOT-MRAM comparisons, respectively. On the other hand, the Energy-Delay Product (EDP) shows an increase of 291.72% for DRAM to hybrid, with changes of −56.44% and −83.80% for the DRAM to SOT-MRAM and hybrid to SOT-MRAM comparisons, respectively.
It is important to note that burst power varies across different sizes, except for SOT-MRAM, where it remains consistent across all capacities. In conclusion, this analysis underscores the significance of memory technology selection based on specific capacity requirements and trade-offs between factors such as power consumption, latency, bandwidth utilization, and energy-delay product. Considering these factors, SOT-MRAM emerges as an appealing main memory candidate. Its combination of energy efficiency, competitive performance, and capacity scalability positions it as a technology that could address the evolving demands of modern computing systems while contributing to more sustainable and efficient computing practices.
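Large percentage changes such as those in the capacity summary are often easier to compare as multiplicative factors. The sketch below is only an illustration, and the helper name `change_to_factor` is hypothetical; it converts a few of the DRAM-relative averages quoted above into new/old ratios.

```python
def change_to_factor(pct_change):
    """Convert a signed percentage change into a multiplicative factor (new / old)."""
    return 1.0 + pct_change / 100.0


# Average changes relative to DRAM (%), as quoted in the capacity summary.
quoted = {
    "latency, DRAM -> hybrid": 115.78,
    "latency, DRAM -> SOT-MRAM": -2.2,
    "EDP, DRAM -> hybrid": 291.72,
    "EDP, DRAM -> SOT-MRAM": -56.44,
}

for label, pct in quoted.items():
    print(f"{label}: {pct:+.2f}% -> x{change_to_factor(pct):.3f}")
```

Read this way, a 291.72% increase in EDP corresponds to roughly 3.9 times the DRAM value, whereas a −56.44% change corresponds to a little under half of it.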

VII. CONCLUSION AND FUTURE WORK
This study brings a fresh approach by introducing SOT-MRAM into various memory structures within multi-core systems. It tackles the challenge of evaluating SOT-MRAM-based memory systems when specific parameters are missing. In-depth investigations at the micro-architectural level and extensive full-system simulations across diverse applications highlight SOT-MRAM's potential as a main memory technology.
At the circuit level, SOT-MRAM offers advantages like a 3x smaller footprint, 4x to 5x lower access latencies compared to DRAM at various capacities, 75.98% less read energy usage, and a 92.70% reduction in write energy. Furthermore, at the system level, during multi-core evaluations with real workloads, it demonstrates substantial power efficiency with a 74.05% reduction in power consumption, a 40.10% increase in bandwidth utilization, and a 72.85% reduction in Energy-Delay Product (EDP). It maintains minimal latency impact, which is vital for real-time applications. Our comprehensive evaluation, encompassing both circuit-level and system-level analyses, underscores SOT-MRAM's superiority over traditional DRAM. It suggests significant enhancements in performance and energy efficiency with minimal latency penalties. These findings position SOT-MRAM as a transformative technology with the potential to revolutionize memory systems across a wide spectrum of computing applications.
The application of these results extends to various computing domains, where SOT-MRAM can offer substantial benefits. Its reduced power consumption and enhanced bandwidth utilization can enhance the performance of energy-sensitive applications in fields like IoT and mobile devices. In high-performance computing, the minimal latency impact and favourable power reduction make it a promising candidate for optimizing memory-intensive tasks. Additionally, real-time systems, which demand low latency and energy efficiency, can greatly benefit from the superior attributes of SOT-MRAM. The potential for SOT-MRAM to replace or complement DRAM in diverse computing applications is a significant step toward improved system efficiency and performance.
We recognize that this work has a few limitations. We intend to optimize sense amplifiers in the memory array and evaluate performance against modern AI/ML workloads in future work. Furthermore, device parameters may differ, and manufacturing and operating conditions can influence results. As memory technologies continue to evolve, our findings will require updates to remain relevant. Despite these constraints, our research provides valuable insights into integrating SOT-MRAM into memory systems with reliable simulations.

FIGURE 7. Power and Performance analysis of 1-Core.

FIGURE 8. Power and Performance analysis of 2-Core.

Figures 7d, 8d, 9d, and 10d present the details of the bandwidth utilization for different memory technologies. For bandwidth utilization, a higher value indicates a better result.

FIGURE 9. Power and Performance analysis of 4-Core.

FIGURE 10. Power and Performance analysis of 8-Core.

FIGURE 13. Analysis of power and performance for memory organizations.

FIGURE 14. Various capacity memory structures power and performance.

TABLE 1. An overview of memory technologies.

TABLE 2. A concise overview of the various studies on NVM main memory systems.

TABLE 3. The SOT-MRAM timing parameters unrelated to row operation (scaled from DDR3-1600, in cycles).

TABLE 5. Current parameters (in mA) used in the study (scaled from DDR3-1600).

TABLE 8. Memory configuration details.

Table 8 provides information on the parameters associated with the memory configuration and scheduling algorithm used in the FS simulation.

TABLE 9. Average percentage change of circuit level parameters for main memory.

TABLE 11. Comparison of the memory structures' bandwidth utilization (%).

TABLE 12. Comparison of the memory structures' total latency (%).

TABLE 14. Different memory organizations for evaluation.

TABLE 15. Comparison of three memory organizations.

TABLE 16. Composition of hybrid memories.