What Is a Stack Overflow?

The stack is an abstract data type widely used in computer science. Its defining property is that the last object placed on the stack is always the first one taken off, a discipline known as last-in, first-out (LIFO). Several operations are defined on a stack; the two most important are PUSH and POP. PUSH adds an element to the top of the stack. POP does the opposite: it removes the element at the top of the stack, reducing the size of the stack by one.
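The PUSH and POP operations described above can be sketched as a small fixed-capacity stack of integers. This is only an illustration: the names IntStack, stack_push, stack_pop and the capacity of 4 are assumptions made for the sketch, not part of any standard library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_CAPACITY 4   /* illustrative capacity, chosen for this sketch */

typedef struct {
    int    items[STACK_CAPACITY];
    size_t size;           /* number of elements currently stored */
} IntStack;

/* PUSH: place a value on top; returns false if the stack is full. */
bool stack_push(IntStack *s, int value)
{
    assert(s != NULL);
    if (s->size == STACK_CAPACITY)
        return false;
    s->items[s->size++] = value;
    return true;
}

/* POP: remove the top value into *out; returns false if the stack is empty. */
bool stack_pop(IntStack *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (s->size == 0)
        return false;
    *out = s->items[--s->size];
    return true;
}
```

Pushing 1, 2, 3 and then popping yields 3 first, which is the LIFO behavior. Both functions report failure instead of writing or reading outside the array.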

A stack overflow occurs when function calls are nested so deeply that the call stack can no longer hold the return addresses (and other frame data) of those calls. This usually arises in recursion: the most likely cause is infinite recursion, but a recursion that is finite and simply too deep can also exhaust the stack.
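As a minimal sketch of the recursion problem, the function below adds a depth guard so that a runaway recursion fails with an error value instead of exhausting the call stack, and an iterative version shows the same computation with constant stack use. The names sum_recursive, sum_iterative and the MAX_DEPTH limit of 1000 are assumptions chosen for this example.

```c
#include <assert.h>

#define MAX_DEPTH 1000   /* hypothetical limit on nesting, chosen for this sketch */

/* Sum 1..n recursively. Each call adds one stack frame, so the depth
 * guard returns -1 rather than letting the recursion grow unbounded. */
long long sum_recursive(long long n, int depth)
{
    assert(depth >= 0);
    if (depth > MAX_DEPTH)
        return -1;                      /* too deep: give up */
    if (n <= 0)
        return 0;
    long long rest = sum_recursive(n - 1, depth + 1);
    return rest < 0 ? -1 : n + rest;    /* propagate the error upward */
}

/* The same sum computed in a loop uses a single stack frame. */
long long sum_iterative(long long n)
{
    long long total = 0;
    for (long long i = 1; i <= n; i++)
        total += i;
    return total;
}
```

A call such as sum_recursive(100, 0) stays well within the limit, while sum_recursive(1000000, 0) is rejected by the guard; sum_iterative handles the larger input without deep nesting.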
Name: Stack overflow
Applied discipline: Computer science
Category: High-level language
Technology: Procedures and functions
RAM: Contiguous memory
Address: Fixed address
Field: Computer security

Stack overflow

In this sense, a stack overflow occurs when a program ignores the size of a local data block allocated on the stack and writes too much data into it, so that the data runs past the block's boundary and overwrites other data. An attacker can exploit this by embedding a piece of code in a long string and overwriting the procedure's return address with the address of that code, so that when the procedure returns, the program starts executing the attacker-supplied code.
For example, the following program:
#include <stdio.h>

int main(void)
{
    char name[8];                  /* room for 7 characters plus '\0' */

    printf("Please type your name: ");
    gets(name);                    /* unsafe: no bounds check (removed in C11) */
    printf("Hello, %s!\n", name);
    return 0;
}
Compile and run the program, then enter ipxodiAAAAAAAAAAAAAAAA. After gets(name) has executed, the stack looks like this:
Low memory addresses                          High memory addresses
             name     EBP    ret
<-------- [ipxodiAA] [AAAA] [AAAA] ............
           ^ &name
Top of stack                                  Bottom of stack
Because the string we entered is too long for the name array, gets keeps writing 'A' characters past the end of the array, toward higher memory addresses. Since the stack grows in the opposite direction, toward lower addresses, these 'A's overwrite older items on the stack: both the saved EBP and the return address (ret) are overwritten with 'A'. When main returns, it uses the ASCII codes of 'AAAA', 0x41414141, as the return address. The CPU tries to execute the instruction at 0x41414141, and the result is an error. This is a stack overflow! Allocating the buffer dynamically in advance keeps it off the stack, although every write must still be bounded by the buffer's size.
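One common way to avoid the overflow in the program above is to replace gets with a bounded read. The sketch below uses the standard fgets function, which never stores more than the given size; the helper name read_line is an assumption made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read one line from `in` into buf, storing at most size - 1 characters
 * plus the terminating '\0'. A trailing newline, if present, is removed.
 * Returns 0 on success, -1 on end of file or read error. */
int read_line(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);
    if (fgets(buf, (int)size, in) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';   /* strip the newline if it was read */
    return 0;
}
```

With char name[8], calling read_line(stdin, name, sizeof name) on the input ipxodiAAAAAAAAAAAAAAAA stores only "ipxodiA"; the saved EBP and the return address are left untouched, and the rest of the input remains unread in the stream.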

Stack overflow resolution

One approach is a detection tool that monitors the behavior of four functions: malloc, memset, memcpy, and free. (The stack itself is not checked; stack overflows are generally less frequent and easier to find by inspection. new and delete are not monitored because of the tool's limitations.) If an out-of-bounds operation is found, the tool prints a report and lets execution continue. In other words, the detection tool does not affect the behavior of the program. [1]
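The idea of checking a memory function for out-of-bounds operations can be sketched as a wrapper around memcpy that is told the size of the destination. The names checked_memcpy and violation_count are hypothetical and do not belong to the tool cited above. Note one deliberate difference: the tool described reports and leaves the program's behavior unchanged, whereas this sketch truncates the copy so that the example itself stays memory-safe.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Number of out-of-bounds copies detected so far (illustrative counter). */
static unsigned long violation_count = 0;

/* Copy n bytes from src to dst, where dst has room for dst_size bytes.
 * An oversized request is reported on stderr and truncated to dst_size,
 * then execution continues. Returns the number of bytes actually copied. */
size_t checked_memcpy(void *dst, size_t dst_size, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    if (n > dst_size) {
        violation_count++;
        fprintf(stderr, "out-of-bounds memcpy: %zu bytes into a %zu-byte buffer\n",
                n, dst_size);
        n = dst_size;
    }
    memcpy(dst, src, n);
    return n;
}
```

The caller has to supply dst_size explicitly because plain C gives the callee no way to recover the size of a buffer from a pointer; a real monitoring tool would have to track allocation sizes itself.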

Stack area

The stack is a contiguous block of memory that holds data. A register called the stack pointer (SP) points to the top of the stack. The bottom of the stack is at a fixed address. The size of the stack is adjusted dynamically by the kernel at run time. The CPU implements the instructions PUSH and POP, which add elements to and remove elements from the stack.
The stack consists of logical stack frames. A frame is pushed onto the stack when a function is called and popped from the stack when the function returns. The stack frame contains the parameters of the function, its local variables, and the data needed to restore the previous stack frame, including the value of the instruction pointer (IP) at the time of the call.
Depending on the implementation, the stack grows either downward (toward lower memory addresses) or upward. In our case the stack grows downward, which is how many computers work, including Intel, Motorola, SPARC and MIPS processors. What the stack pointer points to is also implementation-dependent: it can point to the last address in use on the stack, or to the next free address after it. In our discussion, SP points to the last address in use, that is, the lowest address at the top of the stack.
In addition to the stack pointer, it is convenient to have a pointer to a fixed address within a frame, called the frame pointer (FP). Some texts call it the local base pointer (LB). In theory, local variables can be referenced as SP plus an offset. However, these offsets change whenever a word is pushed or popped. In some cases the compiler can keep track of the number of words on the stack and correct the offsets, but in other cases it cannot, and in every case considerable bookkeeping is required. Furthermore, on some machines, such as Intel processors, accessing a variable at SP plus an offset takes multiple instructions.
Therefore, many compilers use a second register, FP, to reference both local variables and function parameters, because their distance from FP is not affected by PUSH and POP operations. On Intel CPUs, BP (EBP) is used for this purpose. On Motorola CPUs, any address register except A7 (the stack pointer) can serve as FP. Given the growth direction of our stack, function parameters are at positive offsets from FP and local variables are at negative offsets.
The first thing a routine must do when it is called is save the previous FP, so that it can be restored when the routine exits. It then copies SP into FP to create the new FP, and advances SP toward lower addresses to reserve space for its local variables. This is called the prologue of the routine. When the routine exits, the stack must be cleaned up again; this is called the epilogue. Intel's ENTER and LEAVE instructions and Motorola's LINK and UNLK instructions are provided to do most of the prologue and epilogue work efficiently.
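The prologue and epilogue steps can be sketched with a toy model of a downward-growing stack. This is a simulation for illustration, not real machine code: the ToyMachine type, its functions, and the 64-word stack are assumptions made for the example, and SP holds the index of the last word in use, matching the convention in the discussion.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_WORDS 64   /* size of the toy stack, chosen for this sketch */

typedef struct {
    int    mem[STACK_WORDS];
    size_t sp;   /* index of the last word in use; the stack grows toward 0 */
    size_t fp;   /* frame pointer: fixed reference point inside a frame */
} ToyMachine;

void toy_init(ToyMachine *m)
{
    m->sp = STACK_WORDS;   /* empty stack: nothing in use yet */
    m->fp = STACK_WORDS;
}

void toy_push(ToyMachine *m, int word)
{
    assert(m->sp > 0);
    m->mem[--m->sp] = word;
}

int toy_pop(ToyMachine *m)
{
    assert(m->sp < STACK_WORDS);
    return m->mem[m->sp++];
}

/* Prologue: save the old FP, copy SP into FP, reserve space for locals. */
void toy_prologue(ToyMachine *m, size_t nlocals)
{
    toy_push(m, (int)m->fp);
    m->fp = m->sp;
    assert(m->sp >= nlocals);
    m->sp -= nlocals;
}

/* Epilogue: discard the locals and restore the caller's FP. */
void toy_epilogue(ToyMachine *m)
{
    m->sp = m->fp;
    m->fp = (size_t)toy_pop(m);
}
```

If the caller pushes one argument and a return address before toy_prologue(&m, 2) runs, the argument sits at mem[fp + 2], the return address at mem[fp + 1], and the locals at mem[fp - 1] and mem[fp - 2]. A later toy_push moves SP but leaves all of these FP-relative offsets unchanged.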

Stack overflow attack

Stack overflow using JMP ESP

The payload format is NNNNNNRSSSSS, where N = NOP, R = RET (the address of a jmp esp instruction), and S = ShellCode. The buffer is filled with NOPs (no-operation instructions, which do nothing) up to the position of the saved EIP. That position is overwritten with the address of a jmp esp instruction found in one of the system's core DLLs, and the ShellCode follows immediately after it. Under normal circumstances, a function returns by executing the RET instruction, which is equivalent to POP EIP: the saved EIP value of the caller is restored, and execution resumes where the call interrupted it. Here, however, the saved EIP value has been overwritten with the address of jmp esp. After POP EIP, EIP holds that address, and the stack pointer ESP has moved past the return-address slot, so it now points to the beginning of the ShellCode. Execution continues at the address in EIP, the processor executes jmp esp, and control jumps straight into the ShellCode.
If the ShellCode opens a port, we can connect to the machine remotely; if it downloads and executes a file, we can make the target machine fetch and run a file from a web page. Whatever functionality is wanted, a suitable ShellCode can be devised to achieve it. [2]
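The NNNNNNRSSSSS byte layout can be sketched as a function that fills a buffer with padding, a 4-byte address slot, and trailing bytes. Everything specific here is an assumption for illustration: the function name build_layout, the address value 0x11223344, and the 0xCC marker bytes that stand in for the S portion. The only fixed facts relied on are that 0x90 is the x86 NOP opcode and that x86 stores addresses in little-endian byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOP_BYTE 0x90   /* x86 NOP opcode */

/* Write into out: pad_len NOP bytes (N), then ret_addr as 4 little-endian
 * bytes (R), then the tail bytes (S). Returns the total length written,
 * or 0 if the layout does not fit in out_size bytes. */
size_t build_layout(uint8_t *out, size_t out_size, size_t pad_len,
                    uint32_t ret_addr, const uint8_t *tail, size_t tail_len)
{
    assert(out != NULL && tail != NULL);
    if (pad_len > out_size || out_size - pad_len < 4 ||
        out_size - pad_len - 4 < tail_len)
        return 0;

    memset(out, NOP_BYTE, pad_len);                  /* N: padding  */
    out[pad_len + 0] = (uint8_t)(ret_addr & 0xFF);   /* R: low byte first */
    out[pad_len + 1] = (uint8_t)((ret_addr >> 8) & 0xFF);
    out[pad_len + 2] = (uint8_t)((ret_addr >> 16) & 0xFF);
    out[pad_len + 3] = (uint8_t)((ret_addr >> 24) & 0xFF);
    memcpy(out + pad_len + 4, tail, tail_len);       /* S: trailing bytes */
    return pad_len + 4 + tail_len;
}
```

For the name[8] example earlier in the article, the padding would be 12 bytes (8 for the array plus 4 for the saved EBP), assuming the compiler places nothing else between the array and the saved registers.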

Stack overflow using JMP EBX

The payload format is NNNNN JESSSSSS, where N = NOP, J = jmp 04, E = the address of a jmp ebx instruction, and S = ShellCode. The positions of J and E are the key: E sits at the entry position of the error handler, and J sits immediately in front of it. In the first method, the return address is overwritten with another, valid address. But what happens if it is overwritten with an invalid address, one whose target cannot be read or cannot be executed? Most users have seen the result: the system pops up a dialog box reporting an error, and when we click OK the program is terminated. This happens because Windows has a thorough error-handling mechanism. Put simply, when an error occurs at run time, Windows jumps to a location dedicated to handling errors and executes different code for different errors; in the case above, that code pops up the error dialog. So here we deliberately overwrite the return address with an invalid address. Windows then jumps to the error-handling entry, and at that moment ebx points to the 4 bytes immediately before the entry. If we overwrite the entry with the address of a jmp ebx instruction, execution jumps to those 4 bytes. How does it get from there to the ShellCode? That is where jmp 04 comes in: it skips forward 4 bytes, jumping over the overwritten entry value and landing on our ShellCode. [2]
