What Is Protected Mode?
Protected mode is an operating mode of the 80286 and later x86-compatible CPUs. It adds features designed to improve multitasking and system stability, such as memory protection, paging, and hardware-supported virtual memory. Most of today's x86 operating systems run in protected mode, including Linux, FreeBSD, and Microsoft Windows 2.0 and later.
- Name: Protected mode
- Abbreviation: PMode
- Field: Computer operating systems, assembly language
- Corresponding concept: Real mode
- The other operating mode of the 80286 and later CPUs is real mode, a backward-compatible mode in which the protected mode features are turned off. It exists so that new chips can run old software. By design, every x86 CPU starts in real mode to ensure backward compatibility with older operating systems. Before any protected mode feature can be used, some program must explicitly switch the CPU into protected mode. On today's computers this switch is usually one of the first tasks an operating system performs at startup. While the CPU is in protected mode, virtual 8086 mode can also be used to run code written for real mode.
Overview of protected mode
- Protected mode is the counterpart of real mode. Before the 80286, the CPU had only real mode: the address bus was 20 bits wide and the registers were 16 bits wide, so at most 2 ^ 20 = 1 MB of memory could be addressed. The 80286 widened the address bus to 24 bits (16 MB), and the 80386 widened both the registers and the address bus to 32 bits, so that 2 ^ 32 = 4 GB of memory can be addressed. However, to ensure that later CPUs can still run software written for earlier ones, backward compatibility had to be preserved. Therefore the 80286 and later CPUs start in real mode and then enter protected mode through a switching mechanism.
Protected mode and real mode
- So what is real mode? When the CPU is reset or powered on, it starts in real mode. In real mode, memory is addressed in the same way as on the 8086: the content of a 16-bit segment register is multiplied by 16 (10H) to give the segment base address, and a 16-bit offset is added to form a 20-bit physical address. The maximum address space is 1 MB and the maximum segment size is 64 KB. 32-bit instructions can still be used, so a 32-bit x86 CPU in real mode behaves like a fast 8086. In real mode, all segments are readable, writable, and executable. [1]
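- The real-mode address calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name real_mode_address and the wrap_20_bits parameter are made up for this example, and the 20-bit wrap models an 8086-style address bus.

```python
def real_mode_address(segment, offset, wrap_20_bits=True):
    """Return the physical address for a real-mode segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    address = (segment << 4) + offset  # segment * 16 + offset
    if wrap_20_bits:
        address &= 0xFFFFF  # keep only 20 bits, as on the 8086 address bus
    return address


print(hex(real_mode_address(0x1234, 0x5678)))  # prints 0x179b8
```

- Note that different segment:offset pairs can name the same byte: 0x1234:0x0010 and 0x1235:0x0000 both give 0x12350.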
- Compared with real mode, protected mode differs in two main ways. First, it provides a protection mechanism between segments, which prevents the problems caused by programs accessing each other's addresses at will. Second, it enlarges the accessible memory space, as described above.
- In the 8086/8088 era the processor had only one operation mode. Since there were no other modes, this mode had no name. From the 80286 to the 80386, two more operation modes were added: protected mode (PM) and system management mode (SMM). The 8086/8088 mode was therefore named real-address mode (RM).
- Protected mode is the processor's native mode. In this mode the processor supports all instructions and all architectural features, and provides the highest performance and compatibility. It is the recommended mode for all new applications and operating systems. For compatibility, protected mode can execute RM programs in a protected, multitasking environment. This feature is called virtual-8086 mode, although it is not a true processor mode: it is actually an attribute of PM that any task can use.
- RM provides the programming environment of the Intel 8086 processor, with some extensions (such as the ability to switch to PM or SMM). After power-up or reset, the processor is in RM.
- SMM is a standard architectural feature of all Intel processors; it first appeared on the Intel386 SL chip. This mode gives the OS a transparent mechanism for implementing platform-specific functions (such as power management or system security). The processor enters SMM when the external SMM interrupt pin (SMI#) is activated or when an SMI is received from the APIC (Advanced Programmable Interrupt Controller). In SMM the processor saves the entire context of the currently running program and switches to a separate address space, where SMM-specific code can then be executed transparently. On return from SMM, the processor is restored to the state it was in before the system management interrupt.
- Because the machine is in RM after power-up or reset, while the Intel 80386 and later chips reach their full potential only in PM, we face the problem of switching from RM to PM.
- This article does not discuss SMM. The focus of this section is how to switch from RM to PM during the boot phase. I will not discuss the details of PM here, because the Intel Architecture Software Developer's Manual, Volume 3: System Programming gives a detailed, comprehensive, and accurate description.
Protected mode GDT
- The GDT (Global Descriptor Table) is an important and indispensable data structure in protected mode.
- Why is there a GDT? Let's first consider the programming model in real mode:
- In real mode, we access a memory address as Segment:Offset, where Segment gives the Base Address of a segment and the maximum length of a segment is 64 KB, the largest length that a 16-bit offset can express. Offset is the offset from the segment's Base Address, and Base Address + Offset is an absolute memory address. From this we can see that a segment has two attributes, Base Address and Limit (the maximum length of the segment), and that accessing a memory address requires stating which segment to use and the Offset relative to that segment's Base Address, which should be smaller than the segment's Limit. Of course, a 16-bit system does not need to specify a Limit: the default length is 64 KB, and a 16-bit Offset can never exceed it. In actual programming, we use the 16-bit segment registers CS (Code Segment), DS (Data Segment), and SS (Stack Segment) to specify segments. The CPU shifts the value in the segment register left by 4 bits and places the result on the 20-bit address lines as the 20-bit Base Address.
- In protected mode, there are two kinds of memory management, segment mode and page mode, and page mode is itself built on top of segment mode. In other words, the memory management of protected mode is really either pure segment mode or segment-page mode. Segment mode is mandatory, while page mode is optional: if page mode is used, the result is segment-page mode; otherwise it is pure segment mode.
- Let's leave page mode aside for now. In segment mode, it is natural to access a memory address as Segment:Offset. Since protected mode (on the 80386 and later) runs on a 32-bit system, the Base Address of a segment is 32 bits wide, and IA-32 allows it to be set to any value that 32 bits can represent. The Limit is a 20-bit field that counts either bytes or, when the granularity bit is set, units of 2 ^ 12 bytes, so a segment can be up to 4 GB long. This is unlike real mode, where the Base Address of a segment can only be a multiple of 16 (its lower 4 bits come from the left shift and are always 0, which is how a 16-bit segment register can represent a 20-bit Base Address) and the Limit of a segment is fixed at 64 KB. In addition, protected mode, as the name suggests, adds a protection mechanism to segment mode, which means that a segment must also specify its access rights (Access). Therefore, in protected mode the description of a segment has three parts, [Base Address, Limit, Access], which are packed together into a 64-bit data structure called a segment descriptor. If we were to refer to a segment directly through its 64-bit segment descriptor, we would need a 64-bit segment register to hold it. However, to maintain backward compatibility, Intel kept the segment registers at 16 bits (each segment register actually has a 64-bit invisible part, but to the programmer it is 16 bits wide), so obviously a 16-bit segment register cannot directly hold a 64-bit segment descriptor. What can be done?
- The solution is to put these 64-bit segment descriptors into an array and use the value in the segment register as an index into it (in fact, the upper 13 bits of the segment register are the index). This global array is the GDT. The GDT holds not only segment descriptors but other kinds of descriptors as well, all 64 bits long, which we will discuss later.
- The GDT can be placed anywhere in memory, so when a programmer refers to a segment descriptor through a segment register, the CPU must know where the GDT starts, that is, its base address. Intel's designers therefore provided a register, GDTR, to hold the entry address of the GDT. After the programmer has placed the GDT somewhere in memory, the LGDT instruction loads its entry address into this register. From then on, the CPU uses the content of this register as the entry point of the GDT.
- The GDT is a data structure that protected mode requires, and it is unique: there should not be, and cannot be, more than one. In addition, as its name (Global Descriptor Table) indicates, it is globally visible, the same for every task.
- Besides the GDT, IA-32 also allows programmers to build similar data structures called LDTs (Local Descriptor Tables). Unlike the GDT, there can be several LDTs in a system, and as the name suggests an LDT is not globally visible: it is visible only to the tasks that reference it, and each task can have at most one LDT. Furthermore, each LDT is itself a segment, and its segment descriptor is stored in the GDT.
- IA-32 also provides a register, LDTR, for the LDT entry address. Because only one task can be running at any moment, a single LDT register is enough. If a task has its own LDT, then before referring to it the task must load the LDT's segment descriptor into this register with LLDT. The LLDT and LGDT instructions differ in their operands: the operand of LGDT is a memory address at which a 16-bit GDT Limit and a 32-bit GDT entry address are stored, while the operand of LLDT is a 16-bit selector. The main content of this selector is the index, within the GDT, of the segment descriptor of the LDT being loaded, which follows the same pattern as referencing a segment through a segment register, discussed above.
Protected mode LDT
- The LDT is only an optional data structure, and you can do without it entirely. Using it may bring some convenience, but it also brings complexity. If you want to keep your OS kernel simple and portable, it is best not to use it.
- A segment described by a segment descriptor in the GDT or LDT is referenced through a 16-bit data structure called a Segment Selector. Its upper 13 bits are the index of the referenced segment descriptor within the GDT or LDT. Bit 2 specifies whether the referenced descriptor is in the GDT or the LDT. Bits 0 and 1 are the RPL (Requested Privilege Level), which is used for protection and is not discussed in detail here.
- The GDT / LDT index loaded into a segment register, as discussed earlier, is exactly this Segment Selector. When a memory address is referenced, the Segment:Offset form is still used. Concretely, the Segment Selector is loaded into the appropriate segment register; the selector is used to find the corresponding Segment Descriptor in the GDT or LDT; that descriptor records the Base Address of the segment, and adding the Offset gives the final memory address.
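- As a rough illustration of the selector layout just described, the following Python sketch splits a 16-bit Segment Selector into its three fields. The function names decode_selector and descriptor_offset and the sample values 0x10 and 0x0F are chosen for this example only.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into index, table indicator and RPL."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    return {
        "index": selector >> 3,                       # upper 13 bits
        "table": "LDT" if selector & 0x4 else "GDT",  # bit 2
        "rpl": selector & 0x3,                        # bits 0 and 1
    }


def descriptor_offset(selector):
    """Byte offset of the referenced 8-byte descriptor inside its table."""
    return (selector >> 3) * 8


print(decode_selector(0x10))  # index 2 in the GDT, RPL 0
print(decode_selector(0x0F))  # index 1 in the LDT, RPL 3
```

- Because each descriptor is 8 bytes long, the byte offset of a descriptor is simply the selector with its low three bits cleared.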
Protected mode: setting up the GDT
- As discussed in the previous section, the GDT is a data structure that protected mode requires, so we must set up the GDT before entering protected mode and load it into the corresponding register with LGDT.
- Although the GDT may be placed anywhere in memory, its elements (the descriptors) are all 64 bits, that is, 8 bytes, long. To let the CPU access the GDT as fast as possible, we should place the GDT's entry address on an 8-byte boundary, that is, at a multiple of 8.
- The first descriptor in the GDT must be a null descriptor, that is, its content should be all zeros. If this descriptor is referenced for a memory access, a General Protection exception is generated.
- If an OS did not use virtual memory, segment mode would be a good choice. But modern operating systems all use virtual memory, and the more convenient and effective memory management method for implementing virtual memory is page management. On IA-32, however, page management is only available as segment-page mode: there is no way to disable segment mode completely. What we can do is minimize its effect.
- IA-32 offers a segmentation model called the "Basic Flat Model" for this purpose. It requires at least two segment descriptors in the GDT, one for the Data Segment and one for the Code Segment. Both segments cover the entire linear address space, that is, Segment Limit = 4 GB. Even if the actual physical memory is far smaller, this space is defined so that virtual memory can later be implemented through page management.
- Here we are only in the startup stage, so a preliminary GDT is enough. How the OS sets up the GDT and which memory management mode it uses once the kernel is running is up to the kernel itself. For now, we only need to give the kernel's data and code segments the whole linear space.
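- The format of the segment descriptor is as follows: bits 0-15 hold Limit 15:0; bits 16-39 hold Base 23:0; bits 40-47 hold the access byte (Type in bits 40-43, S in bit 44, DPL in bits 45-46, P in bit 47); bits 48-51 hold Limit 19:16; bits 52-55 hold the flags AVL, L (reserved on the 80386), D/B and G; bits 56-63 hold Base 31:24.
- For code and data segments (S = 1), the Type field distinguishes the two: bit 43 is 1 for a code segment, in which case the lower three bits are C (conforming), R (readable) and A (accessed), and 0 for a data segment, in which case they are E (expand-down), W (writable) and A (accessed).
- The following is a temporary GDT used to enter protected mode during the startup phase. It defines three segment descriptors: the first is the null descriptor required by the architecture, the second is a code segment covering a 4 GB linear space, and the third is a data segment covering a 4 GB linear space. This is the minimal GDT required by the "Basic Flat Model", but it is sufficient for entering protected mode and for giving the kernel a contiguous, maximal linear space during the startup phase.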
- # Descriptor tables
- gdt:
- .word 0, 0, 0, 0 # dummy
- .word 0xFFFF # 4Gb-(0x100000 * 0x1000 = 4Gb)
- .word 0 # base address = 0
- .word 0x9A00 # code read / exec
- .word 0x00CF # granularity = 4096, 386
- # (+ 5th nibble of limit)
- .word 0xFFFF # 4Gb-(0x100000 * 0x1000 = 4Gb)
- .word 0 # base address = 0
- .word 0x9200 # data read / write
- .word 0x00CF # granularity = 4096, 386
- # (+ 5th nibble of limit)
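- To check how the four .word values of each descriptor in the listing above map onto Base Address, Limit and Access, the following Python sketch decodes them. It is an illustration only; the function name decode_descriptor and the dictionary keys are invented for this example.

```python
def decode_descriptor(w0, w1, w2, w3):
    """Decode a segment descriptor given as four 16-bit words, lowest first."""
    raw_limit = w0 | ((w3 & 0x000F) << 16)                     # Limit 15:0 and 19:16
    base = w1 | ((w2 & 0x00FF) << 16) | ((w3 & 0xFF00) << 16)  # Base 23:0 and 31:24
    access = (w2 >> 8) & 0xFF                                  # P, DPL, S, Type
    flags = (w3 >> 4) & 0xF                                    # G, D/B, L, AVL
    granular = bool(flags & 0x8)                               # G: limit in 4 KB units
    size = (raw_limit + 1) * 4096 if granular else raw_limit + 1
    return {
        "base": base,
        "size": size,
        "access": access,
        "is_32bit": bool(flags & 0x4),   # D/B bit
        "is_code": bool(access & 0x08),  # top bit of the Type field
    }


code = decode_descriptor(0xFFFF, 0x0000, 0x9A00, 0x00CF)
data = decode_descriptor(0xFFFF, 0x0000, 0x9200, 0x00CF)
print(code["base"], code["size"] == 2**32, code["is_code"], data["is_code"])
```

- Both descriptors decode to base 0 and a size of 4 GB with the D/B bit set; they differ only in the access byte (0x9A for code, 0x92 for data).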
Protected mode: loading the GDT
- After setting up the GDT, we need to load its entry address and its size into the GDTR register with the LGDT instruction.
- The GDTR register has two parts: a 32-bit linear base address and a 16-bit GDT size (in bytes). Note that the 32-bit linear base address must be an absolute address, not an offset relative to a segment; since paging is not yet enabled, it is also the physical address. Because we are in the startup stage and have not yet entered protected mode, CS and DS are probably not 0, so we must calculate the absolute address of gdt ourselves.
- To execute the LGDT instruction, these two values must be placed at some location in memory, and the address of that location is passed to LGDT as its operand. The LGDT instruction then loads the two values stored there into the GDTR register.
- # This is where the two parts of GDTR are stored
- gdt_48:
- .word 0x8000 # gdt limit = 0x8000 bytes (room for far more
- # than the three entries defined above)
- .word 0, 0 # gdt base (filled in later)
- # The following code computes the 32-bit linear address of the GDT and loads it into the GDTR register.
- xorl %eax, %eax # Compute gdt_base
- movw %ds, %ax # (Convert %ds:gdt to a linear ptr)
- shll $4, %eax
- addl $gdt, %eax
- movl %eax, (gdt_48 + 2)
- lgdt gdt_48 # load gdt with whatever is appropriate
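- The address arithmetic performed by the instructions above, and the 6-byte memory image that LGDT reads, can be modelled in Python as follows. This is a sketch under stated assumptions: the function names are invented, and the values %ds = 0x9020 and gdt offset 0x0200 are made-up sample values, not taken from any real boot loader.

```python
import struct


def gdt_linear_base(ds, gdt_offset):
    """Linear address of %ds:gdt in real mode: (ds << 4) + offset."""
    return ((ds << 4) + gdt_offset) & 0xFFFFFFFF


def gdtr_image(limit, base):
    """The 6 bytes read by LGDT: 16-bit limit, then 32-bit base, little-endian."""
    return struct.pack("<HI", limit, base)


base = gdt_linear_base(0x9020, 0x0200)  # hypothetical %ds and gdt offset
print(hex(base))                        # prints 0x90400
print(gdtr_image(0x8000, base).hex())   # prints 008000040900
```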
Protected mode: other preparations
- Before entering protected mode, besides setting up and loading the GDT, you also need to do the following:
- Mask all maskable interrupts;
- Load the IDTR (Interrupt Descriptor Table Register);
- Make sure all coprocessors are properly reset.
- Because interrupt handling differs between real mode and protected mode, all maskable interrupts must be disabled before entering protected mode. This can be done in one of two ways:
- Use the CLI instruction;
- Program the 8259A programmable interrupt controller to mask all interrupts.
- Even after entering protected mode, interrupts cannot be enabled immediately. The OS kernel must first properly initialize the data structures that protected mode interrupt handling requires; only then can interrupts be enabled, otherwise the processor will raise an exception.
- In real mode, interrupt handling uses the IVT (Interrupt Vector Table), while in protected mode it uses the IDT (Interrupt Descriptor Table), so we must set the IDTR before entering protected mode.
- The format of the IDTR is the same as that of the GDTR, and it is loaded in the same way. Because the interrupt handlers in the IDT have to be set up by the OS kernel, during the startup phase we only need to set the IDT base address and size in the IDTR to 0. After entering protected mode, the OS kernel sets it up properly.
- For the interrupt mechanism and interrupt handling, please refer to Interrupt & Exception; they are not repeated here.
- #
- # This is where the two parts of the IDTR are stored
- #
- idt_48:
- .word 0 # idt limit = 0
- .word 0, 0 # idt base = 0L
- # For IDTR processing, only this instruction is needed
- lidt idt_48 # load idt with 0,0
- #
- # By programming the 8259A PIC, mask all maskable interrupts
- #
- movb $0xFF, %al # mask all interrupts for now
- outb %al, $0xA1
- call delay
- movb $0xFB, %al # mask all irq's but irq2 which
- outb %al, $0x21 # is cascaded
- # Ensure that all coprocessors are reset correctly
- xorw %ax, %ax
- outb %al, $0xf0
- call delay
- outb %al, $0xf1
- call delay
- # Delay is needed after doing I/O
- delay:
- outb %al, $0x80
- ret
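- The two mask bytes written to the 8259A controllers above (0xFF and 0xFB) follow from the rule that a set bit in the interrupt mask register disables the corresponding IRQ line. A small Python sketch of that rule, with the helper name imr_value invented for this example:

```python
def imr_value(unmasked_irqs=()):
    """Mask byte for one 8259A: every line masked except those listed."""
    mask = 0xFF  # start with all eight lines masked
    for line in unmasked_irqs:
        if not 0 <= line <= 7:
            raise ValueError("an 8259A has IRQ lines 0 to 7")
        mask &= ~(1 << line) & 0xFF  # clear the bit to let this line through
    return mask


print(hex(imr_value()))     # prints 0xff: everything masked
print(hex(imr_value([2])))  # prints 0xfb: only IRQ2 (the cascade line) left open
```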
- Let's go
- OK, everything is ready.
- Whether the CPU is in protected mode or real mode is controlled entirely by the PE flag in the CR0 register: if PE = 1, the CPU switches to PM; otherwise it is in RM.
- There are two ways to set the CR0 PE bit:
Protected mode: first method
- The first is the LMSW instruction, introduced with the 80286. The 80386 and later CPUs kept this instruction for backward compatibility. It can only affect the lowest 4 bits of CR0, namely PE, MP, EM and TS, and has no effect on the others.
- #
- # Enter protected mode with the LMSW instruction
- #
- movw $0x0001, %ax # protected mode (PE) bit
- lmsw %ax # This is it!
Protected mode: second method
- The second is the method Intel recommends for entering PM on the 80386 and later CPUs: using the MOV instruction. The MOV instruction can set every field of the CR0 register.
- #
- # Enter protected mode with the MOV instruction
- #
- movl %cr0, %eax
- orb $0x01, %al # set PE = 1
- movl %eax, %cr0 # go !!
- The CPU is now in protected mode.
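- The effect of the two methods on CR0 can be modelled with plain integer arithmetic. The Python sketch below is only a model, not processor code: the function names are invented, the CR0 value 0x60000010 is an arbitrary sample, and the rule that LMSW can set PE but cannot clear it is included as an assumption about the instruction's behavior.

```python
CR0_PE = 0x1  # protection enable, bit 0 of CR0


def mov_set_pe(cr0):
    """MOV method: read CR0, OR in the PE bit, write the whole value back."""
    return cr0 | CR0_PE


def lmsw(cr0, msw):
    """LMSW method: only the low 4 bits (PE, MP, EM, TS) are taken from msw."""
    low = msw & 0xF
    if cr0 & CR0_PE:
        low |= CR0_PE  # assumed: LMSW cannot clear an already set PE bit
    return (cr0 & 0xFFFFFFF0) | low


cr0 = 0x60000010  # hypothetical CR0 value while still in real mode
print(hex(mov_set_pe(cr0)))    # prints 0x60000011
print(hex(lmsw(cr0, 0x0001)))  # prints 0x60000011
```

- Using OR rather than XOR matters: OR leaves PE set if it already is, whereas XOR would toggle it back to 0.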
Protected mode: booting the kernel
- We have entered protected mode from real mode, and now we are about to start the OS kernel.
- The OS kernel runs in 32-bit segment mode, but at this point we are still in 16-bit segment mode. How can that be? To understand this, we need to look at how IA-32 implements segment mode.
- IA-32 provides six 16-bit segment registers: CS, DS, SS, ES, FS, and GS. These 16 bits are only the part visible to the programmer; each register also has a 64-bit invisible part.
- The visible part is what the programmer loads, but once the load is complete the CPU uses only the invisible part, and the visible part plays no further role.
- What is in the invisible part? I have not found documentation of its exact format, but its content is certainly the same as that of a segment descriptor (see the segment descriptor format described earlier), even if the layout may differ. The layout does not matter for our understanding, because programmers cannot manipulate it directly.
- We take the CS register as an example; the same applies to the other registers:
- In real mode, when we execute an instruction that loads the CS register (jmp, call, ret, etc.), the value is loaded into the visible part of CS, and at the same time the CPU sets the invisible part according to the visible part. For example, after executing "ljmp $0x1234, $go", the visible part of CS contains 1234h, the 32-bit Base Address field of the invisible part is set to 00012340h (the segment value shifted left by 4 bits), and the Limit field is set to the fixed value FFFFh, that is, a 64 KB segment. Of the Access information we consider only the D/B bit: since this instruction executes in real mode, D/B is 0, indicating a 16-bit segment. Once the visible and invisible parts are set, the load of CS is complete. From then on, whenever the CPU computes an address through CS, it refers only to the invisible part.
- In protected mode, when we execute an instruction that loads the CS register, the segment selector is loaded into the visible part of CS. At the same time, the CPU uses the selector to find the corresponding segment descriptor in the appropriate descriptor table (GDT or LDT) and loads its content into the invisible part of CS. From then on, whenever the CPU computes an address through CS, it refers only to the invisible part.
- As this description shows, the CPU uses the segment registers for address calculation in exactly the same way in real mode and in protected mode. It also explains why the segment register contents set in real mode still refer to 16-bit segments after the switch to protected mode.
- So how do we make CS refer to a 32-bit segment? As discussed above, we use a jmp or call instruction that references a segment selector, so that a segment descriptor describing a 32-bit segment is loaded from the GDT.
- Note that if the content of the CS register says the current segment is a 16-bit segment, then the default operand and address size is also 16 bits, regardless of whether the CPU is in real mode or protected mode. The jmp or call instruction that loads the 32-bit segment must carry a 32-bit offset, and our boot code is 16-bit code, so we must put the operand-size prefix 66h before this jmp / call instruction.
- The following example uses the jmp instruction to load a 32-bit segment. The jmpi instruction is an intersegment (far) jump. Its opcode is EAh, and its format is: jmpi Offset, Segment Selector.
- # Since the current code is 16-bit code and we want to execute a jump with a 32-bit offset,
- # the instruction must be preceded by the operand-size prefix 66h. If we wrote the jmp instruction directly,
- # the assembler would not generate this prefix, so we emit the bytes directly.
- .byte 0x66, 0xea # prefix + jmpi-opcode
- .long 0x1000 # Offset
- .word __KERNEL_CS # CS segment selector
- The above code is equivalent to the 32-bit instruction:
- jmpi 0x1000, __KERNEL_CS
- If the segment descriptor referenced by the __KERNEL_CS segment selector covers the linear addresses [0, 4 GB] and the OS kernel is placed at physical address 1000h, then this jmpi instruction jumps to the entry point of the OS kernel and starts executing it.
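- The byte sequence emitted by the .byte / .long / .word directives above can be reproduced in Python to see exactly what the CPU fetches. This is a sketch: the function name far_jump_32 is invented, and the selector value 0x08 is an assumption (it would be the selector of the code descriptor if that descriptor is entry 1 of the GDT); the real value of __KERNEL_CS depends on the kernel's GDT layout.

```python
import struct


def far_jump_32(offset, selector):
    """Encode: prefix 66h, opcode EAh, 32-bit offset, 16-bit selector."""
    return struct.pack("<BBIH", 0x66, 0xEA, offset, selector)


encoded = far_jump_32(0x1000, 0x08)  # 0x08 is a hypothetical __KERNEL_CS
print(encoded.hex())                 # prints 66ea001000000800
```

- The offset and the selector are stored little-endian, offset first, which is why the offset comes before the segment selector in the listing.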
- At this point, the startup phase is over and the OS is officially running!