Design of Memory Shifting System Based on Dual-Space Storage Architecture

Computer systems built on the traditional storage architecture face many bottlenecks in the transmission, analysis, and processing of massive data. The most prominent is the frequent migration or copying of data between memory and external storage, which prevents the CPU from reaching its full performance. To address this, Professor Y. Jin proposed a new solution, the dual-space storage architecture, based on non-volatile random-access memory (NVRAM) and a data security technique called the non-closable window. This paper proposes a corresponding underlying software management model for the new storage architecture, introduces in detail the design scheme of the memory shifting system in this model, and verifies the feasibility of the scheme through software simulation experiments.


I. INTRODUCTION
Since the beginning of the 21st century, computer technology has flourished and played a significant role in applications such as massive data storage and processing, in-memory computing, and data-intensive computing, thereby providing new opportunities for development in each field. However, the need for frequent data migration between memory and secondary storage has become an increasingly prominent defect of traditional computer storage architectures. In response, academic and industrial research has achieved remarkable results, which can be roughly divided into three categories, as shown in Figure 1. The first category uses NOR flash (or NAND flash) as a substitute or auxiliary tool for mechanical hard disks [1], [2]. This category is mainly used for applications with relatively low performance requirements, such as solid state drives (SSDs) that replace mechanical hard disk drives in PCs and netbooks, and NOR flash (or NAND flash) used as main memory in embedded systems. The second category uses storage class memory (SCM) [3], [4] as main memory and as the buffer between main memory and secondary storage, so that the boundaries of memory are no longer obvious. This category is primarily for enterprise-class or high-performance computing applications and is part of the multicore processor shared memory architecture. For example, IBM, Facebook, Intel, Google, and other research institutes or companies have been researching or adopting storage systems based on the storage class memory architecture [5], [6], [7], which use this structure to build hybrid storage arrays to increase DRAM access speed. The third category uses NVRAM as the only storage device [8], [9].
The associate editor coordinating the review of this manuscript and approving it for publication was Chong Leong Gan.
NVRAM technology [10], [11], [12], which combines low latency, low power consumption, non-volatility, high density, and byte addressability, has become more mature, and its manufacturing costs are declining year by year. For example, phase-change memory (PCM), magnetoresistive random-access memory (MRAM), resistive random-access memory (RRAM), and spin-transfer torque random-access memory (STT-RAM) have both the byte-addressable characteristics of memory (such as DRAM) and the non-volatile characteristics of external storage (such as HDD). However, most of these storage systems are still in the exploration phase or in small-scale applications, owing to the limited number of processor address lines or to the processor manufacturing process. Currently, the address space that the processor can directly access remains at 4 GB (32 address lines, personal computers) or 4 PB (52 address lines, advanced servers) [13], and the data to be computed or processed by the processor must still be copied frequently between memory and external storage. In 2013, Jin [14], [15] proposed the memory space shifting theory and its implementation technologies. In the same year, a small-capacity 1 GB dual-space storage for the experimental platform was successfully developed, which expanded the CPU addressable space from 2 MB to 1 GB and provided zero-copy data access.

II. DUAL SPACE STORAGE ARCHITECTURE AND ITS WORKING PRINCIPLE
A schematic diagram of the dual-space storage architecture is shown in Figure 2. It contains only one storage device, the dual-space storage, which comprises a word space and a block space that correspond to memory and secondary storage, respectively, in the traditional storage hierarchy. Because both spaces use the same storage medium, the state of a space is determined by the CPU according to the control signal. The dual-space architecture therefore consists of hardware and software components. The hardware part mainly includes a push and shift latch set, an output decoder, an input decoder, and the dual-space storage; the software part mainly includes window wall management, window frame management, shifting vector table management, and shutdown status table management. The core of building a dual-space storage architecture is the design of the shifting latch set, shown by the dotted blue box in Figure 2. This technology, called memory space shifting, automatically maps the processor's address space onto a storage space $2^{m+s-n}$ times ($m > n - s$) larger than itself. Its working principle is summarized as follows.
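As a worked check of the extension factor, the following uses the parameter values assumed later in the simulation of Section IV ($n = 21$ processor address lines, $s = 18$ directly connected low address lines, and $m = 12$ latch bits). These values are taken from that experiment and are only one possible configuration.

```latex
\[
\frac{2^{m+s}}{2^{n}} = 2^{m+s-n} = 2^{12+18-21} = 2^{9} = 512,
\qquad
2^{21}\,\text{B} = 2\,\text{MB} \;\longrightarrow\; 2^{30}\,\text{B} = 1\,\text{GB}.
\]
```

The condition $m > n - s$ holds here because $12 > 3$, so each of the $2^{n-s} = 8$ latches can select any of $2^{12} = 4096$ regions of $2^{18}$ bytes.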
First, the memory address lines output from the processor are divided into two parts: low and high address lines. The low address lines $AB_{(s-1)-0}$ are directly connected to the low address lines $CAB_{(s-1)-0}$ of the dual-space storage. The high address lines $AB_{(n-1)-s}$ are connected to decoder $Y_o$ through its input signal terminals $I_{(n-s)}$; the output signal terminal $P_i$ produced by the $Y_o$ decoding strobes the output control signal terminal OE of the corresponding latch $Latch_i$, and the value $(D_m D_{m-1} \cdots D_0)$ of $Latch_i$ is finally output, through the output signal terminal Q of the latch and the data line DB, to the high address lines $CAB_{(m+s-1)-s}$ of the dual-space storage.
Second, the low address lines $CAB_{(n-s-1)-0}$ of the dual-space storage are connected to decoder $Y_i$ through its input signal terminals $I_{(n-s)}$, and the output control signal terminal $P_i$ produced by the $Y_i$ decoding is gated to the write control signal terminal IE of the corresponding latch $Latch_i$. The value $(D_m D_{m-1} \cdots D_0)$ of $Latch_i$ is then modified to the value on the data line DB, thereby establishing a new mapping relationship and allowing the memory space to shift across the entire word space. In other words, users can access the entire word space by modifying the corresponding latch value as needed.
In general, a part of the dual-space storage, called the non-closable window, is used to store programs or important data, and the corresponding window frames cannot be moved. The value of the corresponding latch is either fixed in advance in hardware (in this case, the line marked with a red "X" in Figure 2 is turned off) or set by the system designer in software. For example, if every bit of latch $Latch_0$ in Figure 2 is set to 1 and cannot be arbitrarily modified, then window wall No. $2^m - 1$ is regarded as a non-closable window of the current storage system. Frequently used programs and data are stored in the non-closable window. These mainly include the initialization program involved in the security of the computer system, the shifting vector table, the interrupt vector table, the shutdown status table, the window wall management table, and the window frame management table, and this part of the space plays the role of resident memory.
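The output path, the input path, and the non-closable window described above can be sketched in software as follows. This is a minimal illustration rather than the authors' hardware design: the names `latch_set`, `translate`, and `set_latch` are hypothetical, the bit widths are borrowed from the simulation in Section IV, and a latch is treated as exactly 12 bits wide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative parameters (assumed, taken from the Section IV simulation). */
enum { S_BITS = 18, FRAME_BITS = 3, M_BITS = 12, NUM_LATCHES = 1 << FRAME_BITS };

typedef struct {
    uint32_t value[NUM_LATCHES];  /* window entity number held by each latch */
    bool     locked[NUM_LATCHES]; /* true = non-closable window, not modifiable */
} latch_set;

/* Output path: the high CPU address bits select a latch, and the latch
 * value replaces them as the high address bits of dual-space storage. */
static uint32_t translate(const latch_set *ls, uint32_t cpu_addr)
{
    assert(cpu_addr < (1u << (S_BITS + FRAME_BITS)));
    uint32_t frame  = cpu_addr >> S_BITS;               /* decoded by Y_o */
    uint32_t offset = cpu_addr & ((1u << S_BITS) - 1u); /* passed through */
    return (ls->value[frame] << S_BITS) | offset;
}

/* Input path: rewrite a latch to establish a new mapping, unless the
 * latch belongs to a non-closable window. Returns true on success. */
static bool set_latch(latch_set *ls, uint32_t frame, uint32_t entity)
{
    assert(frame < NUM_LATCHES && entity < (1u << M_BITS));
    if (ls->locked[frame])
        return false;
    ls->value[frame] = entity;
    return true;
}
```

In this sketch, locking latch 0 with all bits set to 1 models the example in which window wall No. $2^m - 1$ is a non-closable window: later calls to `set_latch` for frame 0 are rejected, while `translate` still resolves addresses through it.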

III. DESIGN OF MEMORY SHIFTING SYSTEM
Based on the dual-space storage architecture, we propose a dual-space storage management system model that consists of two core subsystems (as shown in Figure 3): a memory shifting system (blue dashed box) and a dual-space file system (green dashed box), which function similarly to the memory management module and the file system, respectively, in modern operating systems. In this paper, we discuss only the design scheme of the memory shifting system; the design scheme of the dual-space file system will be described in detail in a separate paper.

A. STRUCTURE DESIGN OF THE CORE DATA TABLE
The core data tables of the memory shifting system include (1) the window entity management table, (2) the window frame management table, and (3) the window frame status bitmap.

(2) Window frame management table

```c
/* Records the status of the window frame assignment. This table only needs
 * to be initialized when the system is booted for the first time and is
 * not loaded repeatedly thereafter. */
struct window_frame_table {
    unsigned long  w_frame_no;        /* window frame number */
    unsigned long  w_frame_movable;   /* whether the window frame can be moved */
    unsigned long  lnu_count;         /* least recently used counter */
    unsigned short distribution_mark; /* assignment mark, matches the window
                                         frame status bitmap */
    unsigned short shared_num;        /* if not 0, this window frame cannot be
                                         reclaimed temporarily; incremented by 1
                                         when a process occupies the frame,
                                         decremented by 1 when it is released,
                                         and 0 when there is no occupation */
    unsigned long  w_entity_no;       /* window entity number */
    /* ... remaining fields omitted ... */
};
```

(3) Window frame status bitmap

```c
/* Designed to quickly query the usage status of window frames; its data
 * comes from the window frame management table. */
struct frame_state_bmp {
    unsigned long  w_frame_no;        /* window frame number */
    unsigned short distribution_mark; /* assignment identifier: 0 means the
                                         window frame is not assigned, 1 means
                                         it is assigned */
};
```

The layout of the core data tables in dual-space storage is shown by the dashed red box in Figure 4. They are generally loaded into non-closable window entities (see the data structure of the window entity table), which can be controlled by hardware or by user programs.

B. WORKFLOW DIAGRAM OF MEMORY SHIFTING SYSTEM
According to the working principle of dual space storage, the memory shifting system workflow can be outlined as three steps.
Step 1: The user or program sends a request to the OS to access a file, and the OS passes the file name to the file system. The file system looks up the file name in the window entity management table. If the corresponding file is not found, an invalid-request message is sent back to the OS, and the OS notifies the user. If the corresponding file is found, the starting address (dual-space storage address) of the target file and its corresponding window frame and window entity numbers are obtained.
Step 2: Using the window frame number and window entity number obtained in Step 1, the memory shifting system checks whether the window frame management table records a correspondence between them. If it does, the file is accessed directly at the target address and the processing result is returned to the OS, which then notifies the user; the system then returns to Step 1 to await the next request. If no correspondence exists, the interrupt controller sends a shifting interrupt request to the CPU, and the process continues with Step 3.
Step 3: After the CPU receives the shifting interrupt request, it completes the instruction currently being executed and then immediately starts the context-saving (field protection) mechanism: it saves the current values of all registers involved and sends the interrupt response signal to the interrupt controller. Immediately afterward, the CPU jumps to the shifting interrupt handler, which checks window frame usage with the window frame status bitmap. If there are free frames, one of them is randomly (or sequentially) assigned to the current request, and the value of the corresponding latch is modified according to the obtained window entity number to establish the correspondence between the newly assigned window frame number and the window entity number of the target file. If there is no free window frame, a window frame is reclaimed according to the window frame reclamation policy (e.g., LRU), reassigned to the current request, and shifted. In both cases, the window entity management table, window frame management table, and window frame status bitmap are updated at the same time; the target file is then accessed directly from dual-space storage, the processing result is returned to the OS, and the OS notifies the user. At this point, the processing of the shifting interrupt is finished. The CPU immediately starts the recovery mechanism: the register values previously saved on the stack are popped and restored to the corresponding registers in reverse order, and the CPU resumes executing the program instructions that were suspended by the shifting interrupt.
The workflow diagram of the memory shifting system is shown in Figure 5.
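Steps 2 and 3 can be illustrated with a small software model. The sketch below is an assumption-laden simplification, not the authors' implementation: `shift_state`, `touch`, and `access_entity` are hypothetical names, register saving and restoring is omitted, the tables are collapsed into one structure, and the victim is chosen only by the largest `lnu_count` (the full two-step reclamation policy is described in the recycling strategy section).

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_FRAMES = 8 };

typedef enum { ACCESS_HIT, ACCESS_FREE_FRAME, ACCESS_RECLAIMED } access_kind;

typedef struct {
    uint32_t entity[NUM_FRAMES];    /* simulated latch value per window frame */
    uint8_t  assigned[NUM_FRAMES];  /* status bitmap: 0 = free, 1 = assigned */
    uint8_t  movable[NUM_FRAMES];   /* 0 = non-closable window frame */
    uint32_t lnu_count[NUM_FRAMES]; /* 0 = most recently used */
} shift_state;

/* Age every assigned frame, then mark `frame` as most recently used. */
static void touch(shift_state *st, int frame)
{
    for (int i = 0; i < NUM_FRAMES; i++)
        if (st->assigned[i])
            st->lnu_count[i]++;
    st->lnu_count[frame] = 0;
}

/* Resolve a request for `entity`; the frame finally used is written to
 * *frame_out and the return value says which path was taken. */
static access_kind access_entity(shift_state *st, uint32_t entity, int *frame_out)
{
    int free_frame = -1, victim = -1;
    for (int i = 0; i < NUM_FRAMES; i++) {
        if (st->assigned[i] && st->entity[i] == entity) {
            touch(st, i);            /* Step 2: correspondence exists */
            *frame_out = i;
            return ACCESS_HIT;
        }
        if (!st->assigned[i]) {
            if (free_frame < 0)
                free_frame = i;      /* first free frame (sequential) */
        } else if (st->movable[i] &&
                   (victim < 0 || st->lnu_count[i] > st->lnu_count[victim])) {
            victim = i;              /* least recently used movable frame */
        }
    }
    /* Step 3: body of the shifting interrupt handler. */
    access_kind kind = ACCESS_FREE_FRAME;
    int frame = free_frame;
    if (frame < 0) {
        assert(victim >= 0);         /* at least one movable frame assumed */
        frame = victim;
        kind = ACCESS_RECLAIMED;
    }
    st->entity[frame] = entity;      /* modify the shifting latch */
    st->assigned[frame] = 1;         /* update the status bitmap */
    st->movable[frame] = 1;
    touch(st, frame);
    *frame_out = frame;
    return kind;
}
```

With frames 0 and 1 pre-assigned as non-closable, repeated calls first consume the six free frames and only then start reclaiming, which mirrors the two branches of the interrupt handler in Step 3.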

C. ALLOCATION AND RECYCLING STRATEGY OF WINDOW FRAME

1) WINDOW FRAME ASSIGNMENT STRATEGY
This policy adopts the allocation principle of sequence or priority. If multiple processes (tasks) make access requests simultaneously, a request priority queue is first established, and window frames are then assigned according to the priority of the processes (tasks) from highest to lowest. Otherwise, window frames are assigned in the order of the requests. The implementation process is as follows.
Step 1: Randomly (or sequentially) assign a window frame to the current request if there is a free window frame based on the window frame status bitmap. Otherwise, go to Step 2.
Step 2: One of the allocated window frames is reclaimed according to the window frame reclamation policy, and this window frame is assigned to the current request. For additional details, see the window frame recycling strategy below.

2) WINDOW FRAME RECYCLING STRATEGY
When there is no free window frame to allocate, the least recently used (LRU) policy and the principle of spatial locality of programs are used to reclaim allocated frames (except for those corresponding to non-closable windows). The window frame with lnu_count = 0 is the most recently used window frame, and its corresponding window entity number is denoted W_entity_no. The reclamation process is as follows.
Step 1: For each entry in the window frame management table, calculate the absolute value of the difference between its window entity number and W_entity_no, sort these absolute values in descending order, and reclaim first the window frame whose window entity has the maximum absolute value. If several frames share the same maximum absolute value, go to Step 2.
Step 2: The lnu_count values in the window frame management table are sorted in descending order, and the window frame with the maximum lnu_count is reclaimed first. If several frames share the same maximum lnu_count, the first one is taken.
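The two-step reclamation rule can be expressed as a single pass over the table instead of two sorts. The following is a sketch under stated assumptions: `frame_entry` is a reduced, hypothetical copy of the window frame management table entry, `pick_victim` is an illustrative name, and frames with a nonzero `shared_num` are skipped because the table definition states that such frames cannot be reclaimed temporarily.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced copy of a window frame management table entry (assumed layout). */
struct frame_entry {
    unsigned long  w_frame_no;
    unsigned long  w_frame_movable; /* 0 = non-closable, never reclaimed */
    unsigned long  lnu_count;       /* 0 = most recently used */
    unsigned short shared_num;      /* nonzero = temporarily not reclaimable */
    unsigned long  w_entity_no;
};

static unsigned long abs_diff(unsigned long a, unsigned long b)
{
    return a > b ? a - b : b - a;
}

/* Returns the table index of the frame to reclaim, or -1 if none qualifies. */
static int pick_victim(const struct frame_entry *t, size_t n)
{
    size_t mru = n;                      /* first entry with lnu_count == 0 */
    for (size_t i = 0; i < n; i++) {
        if (t[i].lnu_count == 0) {
            mru = i;
            break;
        }
    }
    if (mru == n)
        return -1;

    int best = -1;
    unsigned long best_dist = 0;
    for (size_t i = 0; i < n; i++) {
        if (!t[i].w_frame_movable || t[i].shared_num != 0)
            continue;
        unsigned long d = abs_diff(t[i].w_entity_no, t[mru].w_entity_no);
        /* Step 1: farthest entity number wins; Step 2: on a tie, the larger
         * lnu_count wins; on a further tie, the first entry is kept. */
        if (best < 0 || d > best_dist ||
            (d == best_dist && t[i].lnu_count > t[best].lnu_count)) {
            best = (int)i;
            best_dist = d;
        }
    }
    return best;
}
```

Using strict comparisons keeps the first entry when both the distance and `lnu_count` are equal, which matches the tie-breaking rule in Step 2.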

IV. SIMULATION EXPERIMENT OF RESOURCE MANAGEMENT STRATEGY BASED ON DUAL SPACE STORAGE
Currently, many NVRAM technologies are still in the exploratory stage, and no products are available. Therefore, this experiment simulated the workflow of the memory shifting system on a PC to verify the feasibility of the proposed design scheme.

A. EXPERIMENTAL ENVIRONMENT AND PROCEDURE

1) EXPERIMENTAL ENVIRONMENT
The software and hardware used in this experiment include an Intel(R) Core(TM) i3-5005U CPU @ 2.00 GHz, 4 GB DDR3, Windows 7, VMware Workstation 14 Pro, CentOS 7.0, and a 16 GB USB flash drive.

2) EXPERIMENTAL DESIGN
According to the construction principle of dual-space storage in Section II, in this experiment we assume that the addressable space of the CPU is 2 MB (= 2^21 bytes). The three high address lines of the CPU are used to select the shifting latches (window frames); each latch is 12 bits wide and drives the high address lines of dual-space storage, represented by the shifting vector table. The remaining 18 low address lines are directly connected to the low address lines of dual-space storage. The addressing space is thus extended to 1 GB (= 2^30 bytes) by the memory shifting system, and its extension schematic is shown in Figure 6.
a) … the shifting vector table, the window frame management table, the window frame status bitmap, the shutdown status table, …;
b) Writing the shifting interrupt-handling simulation program;
c) Writing the window frame allocation and recycling simulation programs;
d) Creating a system boot integration image and burning it onto a USB flash drive;
e) Loading the integrated image into the specified location of the DDR3 memory using the USB flash drive bootloader, as shown in Figure 7;
f) Randomly inputting ten sets of dual-space storage addresses and verifying that the above simulation programs execute according to the preset flow.
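The address layout assumed in this experiment (a 12-bit window entity number, a 3-bit window frame number, and an 18-bit offset) can be made concrete with a short sketch. The helper names `split_ds_addr` and `make_pc_addr` and the `ds_parts` type are hypothetical and introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { OFFSET_BITS = 18, FRAME_BITS = 3, ENTITY_BITS = 12 };

typedef struct {
    uint32_t w_entity_no; /* high 12 bits of the dual-space storage address */
    uint32_t offset;      /* low 18 bits, shared with the CPU address */
} ds_parts;

/* Split a 30-bit dual-space storage address into entity number and offset. */
static ds_parts split_ds_addr(uint32_t ds_addr)
{
    assert(ds_addr < (1u << (ENTITY_BITS + OFFSET_BITS)));
    ds_parts p = { ds_addr >> OFFSET_BITS,
                   ds_addr & ((1u << OFFSET_BITS) - 1u) };
    return p;
}

/* Build the 21-bit processor address from a window frame number and offset. */
static uint32_t make_pc_addr(uint32_t w_frame_no, uint32_t offset)
{
    assert(w_frame_no < (1u << FRAME_BITS));
    assert(offset < (1u << OFFSET_BITS));
    return (w_frame_no << OFFSET_BITS) | offset;
}
```

For the target address 0x00008000 analyzed in the simulation results, `split_ds_addr` yields window entity 0 with offset 0x8000, and `make_pc_addr(0, 0x8000)` gives the processor address 0x00008000.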

B. SIMULATION RESULTS AND ANALYSIS
In this experiment, there are 8 window frames and 4096 window entities. It is assumed that window frame #0, corresponding to window entity #0, and window frame #1, corresponding to window entity #1, are solidified; these are called non-closable window entities. In other words, only six window frames can be shifted among window entities #2 to #4095. After the simulation program finishes initializing the relevant data and loading the program, the main interface appears, as shown in Figure 8. There are eleven options, and we mainly choose #7 and #9 to perform the test experiments. Option #7 performs the initialization of the core tables for dual-space storage, as shown in Figure 9. By inputting different target addresses, we focus on observing the changes in the four core data tables. For ease of description, the addresses in this simulation are written in 32-bit hexadecimal form. Considering the length of the article, we take only the target address ds_addr = 0X00008000 as an example to analyze the entire access process. According to Figure 6, its high 12-bit address is 000000000000 (w_entity_no = 0) and its low 18-bit address is 001000000000000000. The high 3-bit address 000 (w_frame_no = 0) and the low 18-bit address from ds_addr are then stitched together to form the processor's access address pc_addr = 0X00008000. According to the workflow diagram of the memory shifting system (Figure 5), the first step is to examine the window entity management table to determine whether a corresponding window frame exists for window entity #0. Because we have already assumed that window entity #0 is a non-closable window corresponding to window frame #0, a mapping relationship exists between the two, and the target address ds_addr can be accessed directly via pc_addr.
Once the address has been accessed, the window entity management table (Figure 8), window frame management table (Figure 9), and window frame status bitmap (Figure 10) must be updated at the same time. The shifting vector table (Figure 11) is mainly used in the simulation to modify the value of the shifting latch. In real hardware, this is implemented automatically by a circuit; therefore, it can be ignored here. The remaining nine sets of test data are presented in Table 1. In Table 1, the bold blue font indicates that a correspondence already exists between the window frame and the window entity, so the target address can be accessed directly. The bold red font indicates that the window frame and window entity have no correspondence, as for ds_addr = 0X00FF0020; in this case, a window frame must be recycled and then assigned to the current request.
From the test results in Table 1, the simulation experiment meets the expected goal, which proves the feasibility of the memory shifting system workflow and the correctness of the window frame allocation and recycling strategy.

V. COMPARISON WITH TRADITIONAL STORAGE ARCHITECTURE AND STORAGE MANAGEMENT MODEL
Compared with the traditional storage architecture, the dual-space storage architecture has many advantages.

A. MORE FLEXIBLE SCALABILITY AND COMPATIBILITY
Users do not need to replace major devices such as motherboards or CPUs, but simply choose the right size of dual-space storage modules to be installed in memory slots according to their needs.

B. FASTER ACCESS SPEED
Logically, dual-space storage can be divided into word and block spaces. They belong to the same medium and physical storage entity and have the same access speed. Next, the speed difference between the two storage architectures is discussed in terms of the time required for the CPU to read the same data D of size u MB. Let T_latch denote the time required to modify the shifting latch value (typically on the order of nanoseconds). According to the DDR4 SDRAM data sheet from the Micron website, the read/write speeds of current mainstream DRAM can reach 40 GB/s, written as V_R_DRAM = V_W_DRAM = 40 GB/s. From the literature [10], [12], we can deduce that the read speed of PCM is about 20 GB/s, written as V_R_PCM = 20 GB/s. According to the HDD/SSD data sheets from the Western Digital website, typical read/write speeds top out at 200 MB/s for HDDs and 7100 MB/s for SSDs, denoted as V_R_HDD = V_W_HDD = 200 MB/s and V_R_SSD = 7100 MB/s, respectively. In a conventional storage structure with DRAM/SSD (or HDD), the required time T_old depends on where D resides: if D is already in the DRAM at that moment, T_old is the time to read D from the DRAM; otherwise, D must first be copied from the secondary storage into the DRAM, which adds the time to read D from the secondary storage and to write it into the DRAM. In the new storage structure, the required time T_new is determined as follows: if the window entity where D is located already has a mapping relationship with a window frame, T_new is the time to read D from the PCM; otherwise, T_latch is added for modifying the shifting latch first. Clearly, T_latch becomes negligible as the size of D increases, and T_new < T_old whenever D must first be copied from secondary storage. For a conventional storage structure with DRAM/HDD, T_old becomes even larger. Although the write performance of PCM is unsatisfactory, this can be addressed in the future by improving the fabrication process.
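To make the comparison concrete, the sketch below evaluates one plausible timing model with the speeds quoted above. The model itself is an assumption: it treats the copy from secondary storage as a serial read followed by a DRAM write, converts 1 GB/s to 1000 MB/s, and uses hypothetical function names `t_old` and `t_new`. It is not a reconstruction of the authors' exact formulas.

```c
#include <assert.h>

/* Speeds in MB/s as quoted in the text (1 GB/s taken as 1000 MB/s). */
#define V_R_DRAM 40000.0
#define V_W_DRAM 40000.0
#define V_R_PCM  20000.0
#define V_R_SSD   7100.0
#define V_R_HDD    200.0

/* Conventional structure: seconds to read u_mb MB of data.
 * If the data is not in DRAM, it is first read from secondary storage
 * and written into DRAM (assumed to happen serially). */
static double t_old(double u_mb, int in_dram, double v_r_secondary)
{
    assert(u_mb >= 0.0 && v_r_secondary > 0.0);
    double t = u_mb / V_R_DRAM;
    if (!in_dram)
        t += u_mb / v_r_secondary + u_mb / V_W_DRAM;
    return t;
}

/* Dual-space structure: seconds to read u_mb MB of data from PCM.
 * If no mapping exists, one latch update of t_latch_s seconds is added. */
static double t_new(double u_mb, int mapped, double t_latch_s)
{
    assert(u_mb >= 0.0 && t_latch_s >= 0.0);
    double t = u_mb / V_R_PCM;
    if (!mapped)
        t += t_latch_s;
    return t;
}
```

Under this model, reading 1024 MB that is not yet mapped costs roughly 0.05 s in the dual-space structure, against roughly 0.2 s with DRAM/SSD and over 5 s with DRAM/HDD. When D is already resident in DRAM, the conventional structure is faster under these figures, because the quoted DRAM read speed is twice that of PCM.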

C. SUITABLE FOR MULTI-CORE OR MULTI-PROCESSOR SHARED STORAGE
To share the same storage space, multiple cores or processors only need to set their corresponding shifting latches to the same value. Therefore, the architecture is well suited to computation-intensive applications.

D. HIGHER SECURITY
The non-closable window can be controlled by hardware or software. For example, the shifting vector table can be solidified in hardware when the device leaves the factory. At the same time, the core data tables can be installed in the non-closable window, and general users are not allowed to access this space. Therefore, this technology can enhance system security.
Additionally, compared with the traditional storage management model, the dual-space storage management model has some advantages.

1) STORAGE SPACE MANAGEMENT MADE EASY
Storage space is no longer separated into memory and external storage; it is only logically divided into word space and block space within the same storage entity. All spaces allow random access in units of bytes. Other technologies, such as virtual memory and address relocation, are discarded and replaced by memory space shifting technology, which is implemented automatically in hardware. Therefore, from a space management perspective, we maintain only a few core data tables and leave the remainder of the work to hardware.

2) ZERO COPY
To be compatible with conventional storage devices, dual-space storage is logically divided into word and block spaces, which correspond to the internal and external memory of conventional storage devices, respectively. In practice, however, both parts of the space can be accessed randomly by byte. When there is no mapping relationship between a window entity and a window frame, it is sufficient to modify the value of the corresponding shifting latch. Therefore, there is no need to migrate data between the word space and the block space.

3) WORK-ON-START
When the computer system is about to shut down, it automatically records its current running status and writes it into the shutdown status table; when the computer system starts, it resumes directly from its last status. We therefore call this work-on-start (WOS). In this way, the non-closable windows provide a new fast startup mode and security mode, which makes users feel that no program initialization or application loading process occurs.

VI. CONCLUSION
In this paper, we first briefly introduce the dual-space storage architecture and its working principle. To address the management issues of this new storage architecture, the design of a memory shifting system is proposed, and its design and implementation processes are described in detail. The correctness and feasibility of the system design are verified through software simulation experiments. Finally, compared with traditional storage, the advantages and potential applications of this new architecture and its space management model are discussed from the perspectives of hardware and space management.
However, the implementation of the proposed memory shifting system relies on NVRAM technology that is still immature. Therefore, the system is verified only through software simulations, and only a preliminary prototype of the memory shifting system is established. This study provides a theoretical basis for future research into dual-space file systems.

DISCLOSURES
The authors declare that there are no conflicts of interest regarding the publication of this article.