What Is Memory Hierarchy?
In computer architecture, the memory hierarchy separates computer storage into levels based on response time. Since response time, complexity, and capacity are related, the levels may also be distinguished by their performance and controlling technologies. The memory hierarchy affects performance in computer architecture design, algorithm predictions, and lower-level programming constructs involving locality of reference.
- Due to the limitations of hardware technology, we can build memory that is small but fast, or large but slow, but not memory that is both large and fast.
- The memory hierarchy is the ordered arrangement of the levels of the storage system in a computer architecture. Each level is faster and smaller than the level below it.
- 1. Adding complexity slows the memory hierarchy down.
- 2. CMOx memory technology stretches the Flash space in the memory hierarchy.
- 3. One of the main ways to improve system performance is to minimize how far down the memory hierarchy one has to go to manipulate data.
- The number of levels in the memory hierarchy and the performance at each level have increased over time. For example, the memory hierarchy of an Intel Haswell mobile processor circa 2013 is:
- Processor registers: the fastest possible access (usually 1 CPU cycle), a few kilobytes in size.
- Cache:
  - Level 0 (L0) micro-op cache: 6 KiB in size.
  - Level 1 (L1) instruction cache: 128 KiB in size.
  - Level 1 (L1) data cache: 128 KiB in size. Best access speed is around 700 GiB/s.
  - Level 2 (L2) instruction and data (shared) cache: 1 MiB in size. Best access speed is around 200 GiB/s.
  - Level 3 (L3) shared cache: 6 MiB in size. Best access speed is around 100 GB/s.
  - Level 4 (L4) shared cache: 128 MiB in size. Best access speed is around 40 GB/s.
- Main memory (primary storage): gigabytes in size. Best access speed is around 10 GB/s. On NUMA machines, access times may not be uniform.
- Disk storage (secondary storage): terabytes in size. As of 2017, the best access speed, from a consumer SSD, is about 2000 MB/s.
- Nearline storage (tertiary storage): up to exabytes in size. As of 2013, the best access speed is about 160 MB/s.
- Offline storage.
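To get a feel for how far apart these levels are, the sketch below takes the best-case bandwidth figures quoted in the list above and estimates how long it would take to stream a fixed amount of data at each one. It is a rough illustration only: it ignores latency entirely, the 1 GB payload is an arbitrary assumption, and the names `LEVELS`, `transfer_seconds`, and `slowdown_vs_fastest` are made up for this example.

```python
# Best-case bandwidths from the list above, converted to bytes per second.
# GiB = 2**30 bytes; GB = 10**9 bytes; MB = 10**6 bytes.
GIB = 2**30
GB = 10**9
MB = 10**6

LEVELS = [
    ("L1 data cache", 700 * GIB),
    ("L2 cache", 200 * GIB),
    ("L3 cache", 100 * GB),
    ("L4 cache", 40 * GB),
    ("Main memory", 10 * GB),
    ("Consumer SSD (2017)", 2000 * MB),
    ("Nearline storage (2013)", 160 * MB),
]


def transfer_seconds(num_bytes, bytes_per_second):
    """Idealized streaming time: peak bandwidth, no latency."""
    return num_bytes / bytes_per_second


def slowdown_vs_fastest(levels):
    """Map each level name to how many times slower it is than the fastest level."""
    fastest = max(bandwidth for _, bandwidth in levels)
    return {name: fastest / bandwidth for name, bandwidth in levels}


if __name__ == "__main__":
    payload = 1 * GB  # hypothetical amount of data to stream
    for name, bandwidth in LEVELS:
        millis = transfer_seconds(payload, bandwidth) * 1000
        print(f"{name:26s} {millis:10.3f} ms")
```

Even under these generous assumptions, each step down the list costs a multiple of the level above it, which is why keeping the working data as high in the hierarchy as possible matters.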
- The lower levels of the hierarchy (from disk down) are also referred to as tiered storage. The formal differences between online, nearline and offline storage are:
- 1. Online storage is immediately available for I/O.
- 2. Nearline storage is not immediately available, but can be brought online quickly without human intervention.
- 3. Offline storage is not immediately available and requires some human intervention to bring online.
- For example, always-on spinning disks are online, while spinning disks that spin down when idle, such as a massive array of idle disks (MAID), are nearline. Removable media such as tape cartridges that can be loaded automatically, as in a tape library, are nearline, while cartridges that must be loaded manually are offline.
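The three definitions above reduce to two yes/no questions: is the storage immediately available, and if not, does bringing it online need a person? A minimal sketch of that decision, with the `Tier` enum and the `classify` function being hypothetical names chosen for this illustration:

```python
from enum import Enum


class Tier(Enum):
    ONLINE = "online"
    NEARLINE = "nearline"
    OFFLINE = "offline"


def classify(immediately_available: bool, needs_human: bool) -> Tier:
    """Apply the online / nearline / offline definitions from the text."""
    if immediately_available:
        return Tier.ONLINE
    if not needs_human:
        # Not ready right now, but the system can bring it online by itself.
        return Tier.NEARLINE
    return Tier.OFFLINE


if __name__ == "__main__":
    examples = {
        "always-on spinning disk": (True, False),
        "idle disk that has spun down": (False, False),
        "tape cartridge in a tape library": (False, False),
        "tape cartridge loaded by hand": (False, True),
    }
    for name, flags in examples.items():
        print(f"{name}: {classify(*flags).value}")
```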
- Most modern CPUs are so fast that, for most program workloads, the bottleneck is the locality of reference of memory accesses and the efficiency of caching and memory transfer between different levels of the hierarchy. As a result, the CPU spends much of its time idle, waiting for memory I/O to complete. This is sometimes called the space cost, because a larger memory object is more likely to overflow a small, fast level and require the use of a larger, slower level. The resulting load on memory use is known as pressure (respectively register pressure, cache pressure, and (main) memory pressure). The terms for data being missing from a higher level and needing to be fetched from a lower level are, respectively: register spilling (due to register pressure: register to cache), cache miss (cache to main memory), and (hard) page fault (main memory to disk). [4]
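The effect of locality of reference on cache misses can be illustrated with a toy model. The sketch below simulates a small direct-mapped cache and counts misses for two traversal orders of the same two-dimensional array stored row by row. Everything here is an assumption made for illustration: the cache geometry (64 lines of 64 bytes), the 8-byte element size, and the function names. Real caches are set-associative and have prefetchers, so actual numbers will differ.

```python
def count_misses(addresses, num_lines=64, line_bytes=64):
    """Count misses in a toy direct-mapped cache for a sequence of byte addresses."""
    tags = [None] * num_lines  # tag currently held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_bytes   # memory block containing this byte
        index = block % num_lines    # the one cache line this block can occupy
        tag = block // num_lines     # distinguishes blocks sharing that line
        if tags[index] != tag:
            misses += 1
            tags[index] = tag        # evict whatever was there before
    return misses


def row_major(rows, cols, elem_bytes=8):
    """Addresses visited when walking a row-stored array row by row."""
    for r in range(rows):
        for c in range(cols):
            yield (r * cols + c) * elem_bytes


def column_major(rows, cols, elem_bytes=8):
    """Addresses visited when walking the same array column by column."""
    for c in range(cols):
        for r in range(rows):
            yield (r * cols + c) * elem_bytes


if __name__ == "__main__":
    print("row-major misses:   ", count_misses(row_major(256, 256)))
    print("column-major misses:", count_misses(column_major(256, 256)))
```

In this model, the row-by-row walk misses once per 64-byte block and then hits on the next seven elements, while the column-by-column walk jumps 2048 bytes between accesses and misses every time. Both loops touch exactly the same data; only the access order differs.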
- Modern programming languages mainly assume two levels of memory, main memory and disk storage, although registers can be accessed directly in assembly language and through inline assembly in languages such as C. Taking full advantage of the memory hierarchy requires the cooperation of programmers, hardware, and compilers (as well as underlying support from the operating system):
- 1. Programmers are responsible for moving data between disk and memory through file I/O.
- 2. Hardware is responsible for moving data between memory and caches.
- 3. Optimizing compilers are responsible for generating code that, when executed, causes the hardware to use caches and registers efficiently.
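The first item, the programmer's share of the work, can be sketched as follows: a file is read from disk in fixed-size chunks so that only one chunk is held in main memory at a time. The function name `checksum_file` and the 64 KiB default chunk size are assumptions chosen for this illustration, not a recommendation.

```python
import os
import tempfile


def checksum_file(path, chunk_bytes=64 * 1024):
    """Sum every byte of a file while holding at most chunk_bytes in memory."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_bytes)  # explicit disk-to-memory transfer
            if not chunk:                # empty bytes object means end of file
                break
            total += sum(chunk)          # iterating bytes yields integers 0..255
    return total


if __name__ == "__main__":
    # Write a small temporary file, process it, then clean up.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(range(256)) * 10)
    try:
        print(checksum_file(tmp.name, chunk_bytes=100))
    finally:
        os.remove(tmp.name)
```

The chunk size is the knob the programmer controls here. The movement of each chunk between main memory and the caches, and between caches and registers, is left to the hardware and the compiler, as the other two items describe.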
- Many programmers assume a single level of memory. This works fine until the application hits a performance wall. The memory hierarchy is then assessed during code refactoring.