What Is a Memory Map?
A memory-mapped file is a mapping from a file to a region of memory. Win32 provides functions (such as CreateFileMapping) that allow an application to map a file into a process's address space. Memory-mapped files are somewhat similar to virtual memory: a mapping reserves a region of the address space and commits physical storage to that region. The difference is that the storage backing the mapping comes from a file that already exists on disk, and the file must be mapped before it can be accessed. Once a file on disk is mapped, you no longer need to issue explicit file I/O calls to read or write it, which makes memory-mapped files especially useful for processing large data files.
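As a rough illustration of the idea, the sketch below uses the POSIX mmap call rather than the Win32 CreateFileMapping function named above. That substitution and the helper name read_via_mmap are assumptions made for this example. It maps a file read-only and copies its contents out of the mapping with an ordinary memcpy, without any read() call.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map `path` read-only and copy up to outlen-1 bytes
 * of it into `out` as a NUL-terminated string.
 * Returns the number of bytes copied, or -1 on error or for an empty file. */
long read_via_mmap(const char *path, char *out, size_t outlen)
{
    if (outlen == 0)
        return -1;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);              /* a zero-length mapping is not allowed */
        return -1;
    }

    size_t len = (size_t)st.st_size;
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;

    /* The file contents are now ordinary memory. */
    size_t n = len < outlen - 1 ? len : outlen - 1;
    memcpy(out, p, n);
    out[n] = '\0';

    munmap(p, len);
    return (long)n;
}
```

Calling read_via_mmap("data.txt", buf, sizeof buf) fills buf from the mapping; the kernel pages the file in on demand instead of the program copying it through a read buffer.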
- We often in the program
- Modern operating systems run in 32-bit protected mode, in which each process can generally address 4 GB of virtual address space. But ours
- The kernels are big now, so we need some kind of tool to read the huge
- For the analysis in this thesis I chose the linux-2.6.10 kernel. The latest kernel release is 2.6.25, but mainstream servers today run Red Hat AS4 machines, which use the 2.6.9 kernel, and I chose 2.6.10 because it is very close to 2.6.9. Red Hat Enterprise Linux 4 is based on the Linux 2.6.9 kernel and is the most stable and powerful commercial product. In 2004
- During the year,
- Before the kernel enters protected mode, the paging function has not been enabled. Before this, the kernel must first create a temporary kernel.
- 4.1 Example code
- Building on the preceding theoretical analysis, we write a simple program to examine how the kernel maps linear addresses to physical addresses.
- [root@localhost temp]# cat test.c
- #include <stdio.h>
- void test(void)
- {
-     printf("hello, world.\n");
- }
- int main(void)
- {
-     test();
- }
- This code is very simple. We deliberately call the test function just to see how its virtual address is mapped to a physical address.
- 4.2 Segment Mapping Analysis
- We compile test.c and disassemble the resulting binary:
- [root@localhost temp]# gcc -o test test.c
- [root@localhost temp]# objdump -d test
- 08048368 <test>:
- 8048368:  55                push   %ebp
- 8048369:  89 e5             mov    %esp,%ebp
- 804836b:  83 ec 08          sub    $0x8,%esp
- 804836e:  83 ec 0c          sub    $0xc,%esp
- 8048371:  68 84 84 04 08    push   $0x8048484
- 8048376:  e8 35 ff ff ff    call   80482b0 <printf@plt>
- 804837b:  83 c4 10          add    $0x10,%esp
- 804837e:  c9                leave
- 804837f:  c3                ret
- 08048380 <main>:
- 8048380:  55                push   %ebp
- 8048381:  89 e5             mov    %esp,%ebp
- 8048383:  83 ec 08          sub    $0x8,%esp
- 8048386:  83 e4 f0          and    $0xfffffff0,%esp
- 8048389:  b8 00 00 00 00    mov    $0x0,%eax
- 804838e:  83 c0 0f          add    $0xf,%eax
- 8048391:  83 c0 0f          add    $0xf,%eax
- 8048394:  c1 e8 04          shr    $0x4,%eax
- 8048397:  c1 e0 04          shl    $0x4,%eax
- 804839a:  29 c4             sub    %eax,%esp
- 804839c:  e8 c7 ff ff ff    call   8048368 <test>
- 80483a1:  c9                leave
- 80483a2:  c3                ret
- 80483a3:  90                nop
- From the output above, we can see that ld assigned the address 0x08048368 to the test() function. In an ELF executable, ld always lays out the code segment starting from a fixed base near 0x8000000 (0x08048000 with the default i386 linker settings), and this is the same for every program. Where the program actually sits in physical memory at run time is decided by the kernel when it sets up the memory mapping for the process; the specific address depends on which physical pages are allocated at that moment. Assume that the program is already running, that the whole mapping mechanism has been established, and that the CPU is executing the call 8048368 instruction in main(), which transfers control to virtual address 0x08048368. The mapping of this virtual address to a physical address is described in detail below.
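- As a side check on the disassembly, the sketch below shows how the bytes of a near relative call (opcode 0xe8) encode the target address: the 32-bit little-endian displacement is added to the address of the next instruction. The function name call_target is hypothetical, and the code assumes it is only handed 5-byte 0xe8 instructions.

```c
#include <stdint.h>

/* Hypothetical helper: compute the target of a 5-byte "call rel32"
 * instruction (opcode 0xe8) located at insn_addr. */
uint32_t call_target(uint32_t insn_addr, const uint8_t insn[5])
{
    /* Bytes 1..4 hold the signed displacement in little-endian order. */
    uint32_t rel = (uint32_t)insn[1]
                 | ((uint32_t)insn[2] << 8)
                 | ((uint32_t)insn[3] << 16)
                 | ((uint32_t)insn[4] << 24);

    /* The displacement is relative to the next instruction (insn_addr + 5).
     * Unsigned wraparound makes a negative displacement work as expected. */
    return insn_addr + 5u + rel;
}
```

- Applied to the bytes e8 c7 ff ff ff at 0x804839c, the displacement is -0x39 and the next instruction is at 0x80483a1, which gives 0x8048368, the address of test().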
- The first stage is segment mapping. Address 0x08048368 is the entry point of a function and, more importantly, it is the address that the CPU's instruction pointer EIP points to during execution, so it lies in the code segment. The i386 CPU therefore uses the current value of the code segment register CS as the selector for segment mapping, that is, as the index into the segment descriptor table. What is the value of CS?
- Check with GDB:
- (gdb) info reg
- eax     0x10        16
- ecx     0x1         1
- edx     0x9d915c    10326364
- ebx     0x9d6ff4    10317812
- esp     0xbfedb480  0xbfedb480
- ebp     0xbfedb488  0xbfedb488
- esi     0xbfedb534  -1074940620
- edi     0xbfedb4c0  -1074940736
- eip     0x804836e   0x804836e
- eflags  0x282       642
- cs      0x73        115
- ss      0x7b        123
- ds      0x7b        123
- es      0x7b        123
- fs      0x0         0
- gs      0x33        51
- You can see that the value of CS is 0x73. Written in binary:
- 0000000001110011
- The lowest 2 bits are 11 (3), so the RPL is 3. This is expected: our program runs in user space, so its RPL is naturally 3.
- The third bit (the TI bit) is 0, which indicates that the index refers to the GDT.
- The upper 13 bits have the value 14, so the segment descriptor is entry 14 of the GDT. We can verify this in the kernel code:
- In include/asm-i386/segment.h:
- #define GDT_ENTRY_DEFAULT_USER_CS 14
- #define __USER_CS (GDT_ENTRY_DEFAULT_USER_CS * 8 + 3)
- Here 14 * 8 + 3 = 115 = 0x73, so the segment descriptor is indeed entry 14 of the GDT.
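- The selector breakdown above can be sketched in C as follows. The struct and function names are hypothetical and chosen only for this illustration; the bit layout (13-bit index, TI bit, 2-bit RPL) is the one described in the text.

```c
#include <stdint.h>

/* Fields of an x86 segment selector (names are illustrative). */
struct selector {
    unsigned index; /* bits 15..3: index into the descriptor table */
    unsigned ti;    /* bit 2: 0 = GDT, 1 = LDT */
    unsigned rpl;   /* bits 1..0: requested privilege level */
};

/* Split a 16-bit selector value into its three fields. */
struct selector decode_selector(uint16_t sel)
{
    struct selector s;
    s.index = sel >> 3;
    s.ti    = (sel >> 2) & 1u;
    s.rpl   = sel & 3u;
    return s;
}

/* Inverse for GDT selectors, mirroring the kernel's (index * 8 + rpl) form. */
uint16_t make_gdt_selector(unsigned index, unsigned rpl)
{
    return (uint16_t)(index * 8u + (rpl & 3u));
}
```

- Decoding 0x73 yields index 14, TI 0 and RPL 3, and the data segment selector 0x7b from the GDB output decodes to index 15 in the same way.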
- Let's look at the GDT to see the actual value of this entry. The contents of the GDT are defined in arch/i386/kernel/head.S:
- ENTRY(cpu_gdt_table)
- .quad 0x0000000000000000 /* NULL descriptor */
- .quad 0x0000000000000000 /* 0x0b reserved */
- .quad 0x0000000000000000 /* 0x13 reserved */
- .quad 0x0000000000000000 /* 0x1b reserved */
- .quad 0x0000000000000000 /* 0x20 unused */
- .quad 0x0000000000000000 /* 0x28 unused */
- .quad 0x0000000000000000 /* 0x33 TLS entry 1 */
- .quad 0x0000000000000000 /* 0x3b TLS entry 2 */
- .quad 0x0000000000000000 /* 0x43 TLS entry 3 */
- .quad 0x0000000000000000 /* 0x4b reserved */
- .quad 0x0000000000000000 /* 0x53 reserved */
- .quad 0x0000000000000000 /* 0x5b reserved */
- .quad 0x00cf9a000000ffff /* 0x60 kernel 4GB code at 0x00000000 */
- .quad 0x00cf92000000ffff /* 0x68 kernel 4GB data at 0x00000000 */
- .quad 0x00cffa000000ffff /* 0x73 user 4GB code at 0x00000000 */
- .quad 0x00cff2000000ffff /* 0x7b user 4GB data at 0x00000000 */
- .quad 0x0000000000000000 /* 0x80 TSS descriptor */
- .quad 0x0000000000000000 /* 0x88 LDT descriptor */
- /* Segments used for calling PnP BIOS */
- .quad 0x00c09a0000000000 /* 0x90 32-bit code */
- .quad 0x00809a0000000000 /* 0x98 16-bit code */
- .quad 0x0080920000000000 /* 0xa0 16-bit data */
- .quad 0x0080920000000000 /* 0xa8 16-bit data */
- .quad 0x0080920000000000 /* 0xb0 16-bit data */
- /*
-  * The APM segments have byte granularity and their bases
-  * and limits are set at run time.
-  */
- .quad 0x00409a0000000000 /* 0xb8 APM CS code */
- .quad 0x00009a0000000000 /* 0xc0 APM CS 16 code (16 bit) */
- .quad 0x0040920000000000 /* 0xc8 APM DS data */
- .quad 0x0000000000000000 /* 0xd0 - unused */
- .quad 0x0000000000000000 /* 0xd8 - unused */
- .quad 0x0000000000000000 /* 0xe0 - unused */
- .quad 0x0000000000000000 /* 0xe8 - unused */
- .quad 0x0000000000000000 /* 0xf0 - unused */
- .quad 0x0000000000000000 /* 0xf8 - GDT entry 31: double-fault TSS */
- The entry selected by CS (index 14) is:
- .quad 0x00cffa000000ffff /* 0x73 user 4GB code at 0x00000000 */
- We expand this value into binary:
- 0000000011001111111110100000000000000000000000001111111111111111
- According to the segment descriptor format described earlier, the following conclusions can be drawn:
- B0-B15 and B16-B31 are all 0, so the base address is 0.
- L0-L15 and L16-L19 are all 1, so the segment limit is 0xfffff.
- The G bit is 1, so the limit is counted in 4 KB units.
- The D bit is 1, so code in the segment defaults to 32-bit operands and addresses.
- The P bit is 1, so the segment is present in memory.
- The DPL is 3, so the privilege level is 3.
- The S bit is 1, so this is a code or data segment rather than a system segment.
- The type is 1010, which indicates a code segment that is readable and executable and has not yet been accessed.
- This descriptor therefore covers the entire 4 GB linear address space starting at address 0 ((0xfffff + 1) x 4 KB = 4 GB), so a logical address is converted directly into the identical linear address.
- Therefore, after segment mapping, the logical address 0x08048368 becomes the linear address 0x08048368. This is why logical addresses are equivalent to linear addresses in Linux.
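- The field-by-field reading above can be reproduced with a small decoder. This is a sketch under the standard i386 descriptor layout; the struct and function names are hypothetical and do not come from the kernel source.

```c
#include <stdint.h>

/* Decoded fields of an 8-byte segment descriptor (illustrative names). */
struct seg_desc {
    uint32_t base;  /* B0-B31 */
    uint32_t limit; /* L0-L19 */
    unsigned type;  /* 4-bit type field */
    unsigned s;     /* 1 = code/data segment, 0 = system segment */
    unsigned dpl;   /* descriptor privilege level */
    unsigned p;     /* present bit */
    unsigned d;     /* default operation size: 1 = 32-bit */
    unsigned g;     /* granularity: 1 = limit in 4 KB units */
};

struct seg_desc decode_descriptor(uint64_t raw)
{
    struct seg_desc d;
    d.limit = (uint32_t)(raw & 0xffffu)
            | ((uint32_t)((raw >> 48) & 0xfu) << 16);
    d.base  = (uint32_t)((raw >> 16) & 0xffffffu)
            | ((uint32_t)((raw >> 56) & 0xffu) << 24);
    d.type  = (unsigned)((raw >> 40) & 0xfu);
    d.s     = (unsigned)((raw >> 44) & 1u);
    d.dpl   = (unsigned)((raw >> 45) & 3u);
    d.p     = (unsigned)((raw >> 47) & 1u);
    d.d     = (unsigned)((raw >> 54) & 1u);
    d.g     = (unsigned)((raw >> 55) & 1u);
    return d;
}

/* Size of the segment in bytes: (limit + 1) units of 1 byte or 4 KB. */
uint64_t segment_size_bytes(const struct seg_desc *d)
{
    uint64_t units = (uint64_t)d->limit + 1u;
    return d->g ? units << 12 : units;
}
```

- For 0x00cffa000000ffff this gives base 0, limit 0xfffff, G = D = P = S = 1, DPL 3 and type 1010, and a segment size of exactly 4 GB, matching the list above.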
- 4.3 Page Mapping Analysis
- Now we enter the page mapping stage. Each process in a Linux system has its own page directory (PGD), and a pointer to this directory is stored in the process's mm_struct data structure. Whenever a process is scheduled to run, the kernel sets the control register cr3 for that process, and the MMU hardware always obtains the pointer to the current page directory from cr3. When our program is about to transfer to address 0x08048368, the process is running, so cr3 has already been set up and points to the page directory of our process. First, expand the linear address 0x08048368 into binary:
- 00001000000001001000001101101000
- Comparing this with the linear address format, the highest 10 bits are 0000100000 in binary, which is 32 in decimal, so the MMU uses 32 as the index to find the directory entry in the page directory. The upper 20 bits of this directory entry point to a page table: the CPU appends 12 zero bits to these 20 bits to obtain the address of the page table. Having found the page table, the CPU looks at the middle 10 bits of the linear address, 0001001000, which is 72 in decimal, and uses this as the index to find the corresponding entry in the page table. The upper 20 bits of that entry point to the start address of a physical memory page. Assume that this page starts at physical address 0x620000. The lowest 12 bits of the linear address are 0x368, so the physical address of the entry of the test() function is 0x620000 + 0x368 = 0x620368.
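- The two-level walk described above can be sketched as a small user-space simulation. Everything here is an assumption made for illustration: the type and function names are hypothetical, host pointers stand in for the physical page-table addresses held in real directory entries, and only the present bit is modelled.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRIES 1024  /* entries per page directory and per page table */
#define PRESENT 0x1u  /* bit 0 of an entry: page is present */

/* Simulated structures; real directory entries hold physical addresses. */
typedef struct { uint32_t pte[ENTRIES]; } page_table;
typedef struct { page_table *pt[ENTRIES]; } page_dir;

/* Split a 32-bit linear address into its 10 + 10 + 12 bit fields. */
unsigned pgd_index(uint32_t linear)   { return linear >> 22; }
unsigned pte_index(uint32_t linear)   { return (linear >> 12) & 0x3ffu; }
unsigned page_offset(uint32_t linear) { return linear & 0xfffu; }

/* Walk the simulated tables. Returns 0 and stores the physical address in
 * *phys on success, or -1 if the address is not mapped. */
int translate(const page_dir *pgd, uint32_t linear, uint32_t *phys)
{
    const page_table *pt = pgd->pt[pgd_index(linear)];
    if (pt == NULL)
        return -1;

    uint32_t pte = pt->pte[pte_index(linear)];
    if (!(pte & PRESENT))
        return -1;

    /* Upper 20 bits: page frame base. Lower 12 bits: offset in the page. */
    *phys = (pte & 0xfffff000u) | page_offset(linear);
    return 0;
}
```

- With directory slot 32 pointing at a table whose entry 72 holds 0x620000 | PRESENT, translate() maps 0x08048368 to 0x620368, the same result as the manual calculation.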