What Is a Memory Map?

A memory-mapped file is a mapping from a file to a region of memory. Win32 provides functions (such as CreateFileMapping) that allow applications to map files into a process's address space. Memory-mapped files are somewhat similar to virtual memory: a memory-mapped file reserves a region of address space and commits physical storage to that region. The storage backing the mapping comes from a file that already exists on disk, and the file must be mapped before it can be manipulated. Once a file on disk is accessed through a memory mapping, you no longer need to perform explicit I/O operations on it, which makes memory-mapped files very useful for processing large data files.

Memory map

We often in the program
Modern operating systems run in 32-bit protected mode, and each process can generally address 4 GB of virtual address space. But ours
The kernels are big now, so we need some kind of tool to read the huge
For this analysis I chose the linux-2.6.10 kernel. The latest kernel release is 2.6.25, but mainstream servers currently run Red Hat AS4, which uses the 2.6.9 kernel, and I chose 2.6.10 because it is very close to 2.6.9. Red Hat Enterprise Linux 4 is based on the Linux 2.6.9 kernel and is the most stable and powerful commercial product. During 2004,
Before the kernel enters protected mode, the paging function has not been enabled. Before this, the kernel must first create a temporary kernel.
4.1 Example code
Building on the preceding theoretical analysis, we write a simple program to examine how the kernel maps linear addresses to physical addresses.
[root@localhost temp]# cat test.c
#include <stdio.h>
void test(void)
{
    printf("hello, world.\n");
}
int main(void)
{
    test();
}
This code is very simple. We deliberately call the test function so that we can see how the virtual address of test() is mapped to a physical address.
4.2 Segment Mapping Analysis
We compile the file and disassemble the resulting executable:
[root@localhost temp]# gcc -o test test.c
[root@localhost temp]# objdump -d test
08048368 <test>:
 8048368:  55                push   %ebp
 8048369:  89 e5             mov    %esp,%ebp
 804836b:  83 ec 08          sub    $0x8,%esp
 804836e:  83 ec 0c          sub    $0xc,%esp
 8048371:  68 84 84 04 08    push   $0x8048484
 8048376:  e8 35 ff ff ff    call   80482b0 <printf@plt>
 804837b:  83 c4 10          add    $0x10,%esp
 804837e:  c9                leave
 804837f:  c3                ret
08048380 <main>:
 8048380:  55                push   %ebp
 8048381:  89 e5             mov    %esp,%ebp
 8048383:  83 ec 08          sub    $0x8,%esp
 8048386:  83 e4 f0          and    $0xfffffff0,%esp
 8048389:  b8 00 00 00 00    mov    $0x0,%eax
 804838e:  83 c0 0f          add    $0xf,%eax
 8048391:  83 c0 0f          add    $0xf,%eax
 8048394:  c1 e8 04          shr    $0x4,%eax
 8048397:  c1 e0 04          shl    $0x4,%eax
 804839a:  29 c4             sub    %eax,%esp
 804839c:  e8 c7 ff ff ff    call   8048368 <test>
 80483a1:  c9                leave
 80483a2:  c3                ret
 80483a3:  90                nop
From the output above, we can see that ld assigned the address 0x08048368 to the test() function. In an ELF executable, ld always places the code segment at the same low base address, in the region starting at 0x08000000 (the default i386 linker script uses 0x08048000, which matches the addresses above), and this is the same for every program. Where the program actually resides in physical memory during execution is decided by the kernel when it sets up the memory map for the process. The specific address depends on the physical memory pages allocated at that time.
Assume that the program is already running, the entire mapping mechanism has been established, and the CPU is executing the instruction "call 8048368" in main(), which transfers control to virtual address 0x08048368. The mapping of this virtual address to a physical address is described in detail below.
First comes the segment mapping stage. Because 0x08048368 is the entry point of a function and, more importantly, the instruction pointer EIP points to it during execution, the address lies in the code segment. Therefore the i386 CPU uses the current value of the code segment register CS as the selector for segment mapping, that is, as the index into the segment descriptor table. What is the value of CS?
Test with GDB:
(gdb) info reg
eax            0x10        16
ecx            0x1         1
edx            0x9d915c    10326364
ebx            0x9d6ff4    10317812
esp            0xbfedb480  0xbfedb480
ebp            0xbfedb488  0xbfedb488
esi            0xbfedb534  -1074940620
edi            0xbfedb4c0  -1074940736
eip            0x804836e   0x804836e
eflags         0x282       642
cs             0x73        115
ss             0x7b        123
ds             0x7b        123
es             0x7b        123
fs             0x0         0
gs             0x33        51
You can see that the value of CS is 0x73. Written out in binary, it is:
0000000001110011
The lowest 2 bits are 11 (3), so the RPL is 3. This is expected: our program itself runs in user space, so its RPL is naturally 3.
The third bit (TI) is 0, which indicates that the index refers to the GDT.
The upper 13 bits have the value 14, so the segment descriptor is entry 14 of the GDT. We can verify this in the kernel code:
In include/asm-i386/segment.h:
#define GDT_ENTRY_DEFAULT_USER_CS 14
#define __USER_CS (GDT_ENTRY_DEFAULT_USER_CS * 8 + 3)
You can see that the segment descriptor is indeed the 14th entry in the GDT table.
Next, let's look in the GDT to see the actual value of this entry. The contents of the GDT are defined in arch/i386/kernel/head.S:
ENTRY(cpu_gdt_table)
    .quad 0x0000000000000000 /* NULL descriptor */
    .quad 0x0000000000000000 /* 0x0b reserved */
    .quad 0x0000000000000000 /* 0x13 reserved */
    .quad 0x0000000000000000 /* 0x1b reserved */
    .quad 0x0000000000000000 /* 0x20 unused */
    .quad 0x0000000000000000 /* 0x28 unused */
    .quad 0x0000000000000000 /* 0x33 TLS entry 1 */
    .quad 0x0000000000000000 /* 0x3b TLS entry 2 */
    .quad 0x0000000000000000 /* 0x43 TLS entry 3 */
    .quad 0x0000000000000000 /* 0x4b reserved */
    .quad 0x0000000000000000 /* 0x53 reserved */
    .quad 0x0000000000000000 /* 0x5b reserved */
    .quad 0x00cf9a000000ffff /* 0x60 kernel 4GB code at 0x00000000 */
    .quad 0x00cf92000000ffff /* 0x68 kernel 4GB data at 0x00000000 */
    .quad 0x00cffa000000ffff /* 0x73 user 4GB code at 0x00000000 */
    .quad 0x00cff2000000ffff /* 0x7b user 4GB data at 0x00000000 */
    .quad 0x0000000000000000 /* 0x80 TSS descriptor */
    .quad 0x0000000000000000 /* 0x88 LDT descriptor */
    /* Segments used for calling PnP BIOS */
    .quad 0x00c09a0000000000 /* 0x90 32-bit code */
    .quad 0x00809a0000000000 /* 0x98 16-bit code */
    .quad 0x0080920000000000 /* 0xa0 16-bit data */
    .quad 0x0080920000000000 /* 0xa8 16-bit data */
    .quad 0x0080920000000000 /* 0xb0 16-bit data */
    /*
     * The APM segments have byte granularity and their bases
     * and limits are set at run time.
     */
    .quad 0x00409a0000000000 /* 0xb8 APM CS code */
    .quad 0x00009a0000000000 /* 0xc0 APM CS 16 code (16 bit) */
    .quad 0x0040920000000000 /* 0xc8 APM DS data */
    .quad 0x0000000000000000 /* 0xd0 - unused */
    .quad 0x0000000000000000 /* 0xd8 - unused */
    .quad 0x0000000000000000 /* 0xe0 - unused */
    .quad 0x0000000000000000 /* 0xe8 - unused */
    .quad 0x0000000000000000 /* 0xf0 - unused */
    .quad 0x0000000000000000 /* 0xf8 - GDT entry 31: double-fault TSS */
The entry selected by CS is entry 14:
.quad 0x00cffa000000ffff /* 0x73 user 4GB code at 0x00000000 */
We expand this value into binary:
0000000011001111111110100000000000000000000000001111111111111111
According to the segment descriptor format described earlier, the following conclusions can be drawn:
Base bits B0-B15 and B16-B31 are all 0, so the segment base address is 0.
Limit bits L0-L15 and L16-L19 are all 1, so the segment limit is 0xfffff.
The G bit is 1, so the segment limit is counted in 4 KB units.
The D bit is 1, so the default operand and address size in this segment is 32 bits.
The P bit is 1, so the segment is present in memory.
The DPL is 3, so the privilege level is 3.
The S bit is 1, so this is a code or data segment rather than a system segment.
The type is 1010, which indicates a code segment that is readable and executable and has not yet been accessed.
This descriptor therefore covers the entire 4 GB virtual address space starting at address 0, since (0xfffff + 1) x 4 KB = 4 GB, and the offset in the logical address is used unchanged as the linear address.
So after segment mapping, the logical address has been converted into an identical linear address. This is why logical addresses are equivalent to linear addresses in Linux.
4.3 Page Mapping Analysis
We now enter the page mapping stage. Each process in a Linux system has its own page directory (PGD), and a pointer to this directory is stored in the process's mm_struct data structure. Whenever a process is scheduled to run, the kernel sets the control register cr3 for the incoming process, and the MMU hardware always obtains the pointer to the current page directory from cr3. When our program transfers control to address 0x08048368, the process is running and cr3 has already been set to point to our process's page directory. First, we expand the linear address 0x08048368 into binary:
00001000000001001000001101101000
Comparing this with the linear address format, the highest 10 bits are 0000100000 in binary, which is 32 in decimal, so the MMU uses 32 as the index to find the corresponding entry in the page directory. The upper 20 bits of that directory entry point to a page table: the CPU appends twelve 0 bits to these 20 bits to obtain the address of the page table. Having found the page table, the CPU looks at the middle 10 bits of the linear address, 0001001000, which is 72 in decimal, and uses this value as the index to find the corresponding entry in the page table. The upper 20 bits of that entry give the start address of a physical memory page. Assume that this page starts at physical address 0x620000. The lowest 12 bits of the linear address are 0x368, so the physical entry address of the test() function is 0x620000 + 0x368 = 0x620368.
