What Is a Core Dump?

A core dump is a disk file that the operating system writes when it terminates a running process that has received certain signals. The file contains the contents of the process address space at that moment, along with other information about the state of the process. This information is often used for debugging.
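
As a rough illustration of the signal-driven termination described above, the following sketch (assuming a POSIX system) forks a child that calls abort() and lets the parent inspect how the child died. The function name signal_that_killed_child is hypothetical. Whether a core file is actually written depends on system settings such as the core file size limit.

```cpp
#include <cassert>
#include <csignal>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: runs a child process that aborts and returns the
// number of the signal that terminated it (0 if the child exited normally,
// -1 if fork or waitpid failed).
int signal_that_killed_child() {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // Child: abort() raises SIGABRT, whose default action is to
        // terminate the process and (if permitted) write a core dump.
        std::abort();
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    if (!WIFSIGNALED(status)) return 0;
    return WTERMSIG(status);
}
```

Called from a parent process, signal_that_killed_child() is expected to return SIGABRT, which is one of the signals whose default action produces a core dump.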

In UNIX systems, the "master
The term core file is derived from core memory, the main random-access storage medium from the 1950s through the 1970s.
A core dump file produced when the program itself crashes can generally be used to analyze where the program went wrong.
The analysis tool commonly used for core dump files on the Linux platform is gdb; pstack and pflags are used on the Solaris platform; userdump and windbg are used on the Windows platform.
A dump triggered by an external program is generally used to analyze the running state of a process, such as its memory usage or thread status.
Umem, a common memory analysis tool on Solaris, works this way: you first obtain a core dump file with gcore pid and then analyze the memory situation from that file.
One of the more common problems C/C++ programmers run into is an unexpected core dump from their own code at run time. Core dumps have many causes, different causes call for different solutions, and the difficulty varies widely: some problems can be located within seconds or minutes, while others take several days to resolve, which is a real challenge for software developers. The author has worked in C/C++ software development for many years, has solved many such problems along the way, and has accumulated some experience. The common types of program core dumps are summarized below for other software developers.
1. Core dumps caused by invalid pointers
This is the most common type of core dump. There are roughly four causes:
(1) A null pointer was dereferenced.
(2) An uninitialized pointer was used.
(3) delete was called again on a pointer whose memory had already been released with delete (this is what happens when the pointer is not set to NULL after the first delete).
(4) Multiple threads accessed a global variable, leaving an abnormal value in memory and causing the program to dump core.
This type of problem is usually caused by an oversight when writing the code. It is a low-level fault and relatively easy to solve: open the generated core file in a debugging tool, compare the reported location with the code, and the problem can usually be located within about 10 minutes.
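A minimal sketch of the habit suggested in cause (3), resetting a pointer right after delete, together with a null check that guards against cause (1). The helper names delete_and_reset and value_or are hypothetical and are not taken from any library.

```cpp
#include <cassert>

// Hypothetical helper: releases the object and resets the caller's pointer,
// so that a second call becomes a harmless no-op instead of a double delete.
template <typename T>
void delete_and_reset(T*& p) {
    delete p;      // deleting a null pointer is defined to do nothing
    p = nullptr;
}

// Hypothetical helper: reads through a pointer only after a null check.
int value_or(const int* p, int fallback) {
    return p != nullptr ? *p : fallback;
}
```

For example, after int* p = new int(7), calling delete_and_reset(p) twice is safe, and value_or(p, -1) then yields -1 rather than touching freed memory. In newer C++ code, std::unique_ptr is a common way to get the same protection automatically.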
2. Core dumps caused by out-of-bounds pointers
This kind of core dump lies deeper and is harder to resolve. Debugging the core file with a debugging tool still pinpoints a line of code, but that line often turns out to have nothing wrong with it; it is just a "victim". In the author's experience, such a core dump is most likely caused by an out-of-bounds memory write made while other code was running, and it usually traces back to one of the following two factors.
The first factor: the line where the core dump occurs is a very simple operation, such as an assignment statement, and the natural reaction is "How could this be wrong?" If you comment out the statement and run the program again, the core dump occurs on the next line of code instead. In that case the line is probably operating on a global variable B, and you should turn your attention to where B is defined and look closely at the variables defined next to it: A before it and C after it. Which neighbor matters depends on the operating system: on some systems you need to look at the variable A defined before B, on others at the variable C defined after B. Search the code carefully for how A and C are handled and for any place that could write out of bounds. Most likely an out-of-bounds write during an operation on A or C corrupted B, and B is simply the unlucky one.
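The "victim" effect can be simulated safely inside a single struct. The sketch below is only an illustration under stated assumptions: the struct Neighbors and both function names are hypothetical, and the overrun is deliberately confined to one object, because a real overrun between separate global variables is undefined behavior and its outcome depends on how the compiler and operating system lay out memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical layout standing in for two adjacent variables.
struct Neighbors {
    char a[4];  // the buffer that gets overrun
    int  b;     // the "victim" stored right after it
};

// Buggy on purpose: n is never checked against sizeof(g.a), so any n larger
// than 4 spills into b. Callers must keep n <= sizeof(Neighbors) so the
// demonstration stays inside the object.
void fill_a_unchecked(Neighbors& g, const char* src, std::size_t n) {
    std::memcpy(reinterpret_cast<char*>(&g) + offsetof(Neighbors, a), src, n);
}

// Fixed version: the copy length is clamped to the size of a.
void fill_a_checked(Neighbors& g, const char* src, std::size_t n) {
    if (n > sizeof(g.a)) n = sizeof(g.a);
    std::memcpy(g.a, src, n);
}
```

With b set to 42, a call to fill_a_checked with an oversized length leaves b intact, while fill_a_unchecked with the same arguments silently changes b. Code that later reads b is where the failure would surface, even though the faulty line is the copy into a.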
The second factor: a memory variable at the location of the core dump somehow holds an out-of-range value. At this point you need to analyze the code and the processing flow carefully. First check whether the function itself handles its data correctly, focusing on error-prone calls such as memcpy, strstr, sprintf, strcpy, and strcat. If you are sure the function is correct, walk through the code along the processing flow. Patience and confidence are what you need most here. Problems of this kind arise because the code has entered some special logic path that lacks the necessary protection. If the problem has occurred only once and there is not enough logging, it is hard to analyze, and the values of the memory variables in the core file may not be enough to locate the cause. If it occurs again, however, the second core file is a valuable reference: the memory variables in the two core files must have something in common, and the fault should be analyzed and reproduced on that basis. The author once encountered a strstr call on an uninitialized buffer A that did not dump core at that point; the core dump appeared only much later in the flow, during an operation on another variable B. In another case, a bug in a linked-list algorithm in the module caused a pointer to go out of bounds.
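As one possible form of the "necessary protection" mentioned above, the sketch below wraps a string copy so that it can never write past the destination buffer, in contrast to an unchecked strcpy or sprintf. The helper name copy_bounded is hypothetical; it relies only on the standard snprintf.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: copies src into dst without writing past dst_size
// bytes. Returns true if the whole string fit, false if it was truncated
// or the arguments were unusable.
bool copy_bounded(char* dst, std::size_t dst_size, const char* src) {
    if (dst == nullptr || src == nullptr || dst_size == 0) return false;
    // snprintf writes at most dst_size bytes including the terminating NUL
    // and returns the length the full string would have needed.
    int needed = std::snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && static_cast<std::size_t>(needed) < dst_size;
}
```

With an 8-byte buffer, copying "abc" succeeds, while copying "0123456789" returns false and leaves a truncated, still NUL-terminated string in the buffer. Returning a flag lets the caller log the truncation, which also helps with the logging gap described above.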
3. Core dumps caused by operating system particularities
Beginners are bound to find this situation baffling: "How can such simple, standard code cause a problem like this?" Unlike type 2, this problem is easy to reproduce: the program dumps core as soon as it is run. Puzzling as this type of core dump is, remember that the more readily a problem reproduces, the easier it is to resolve. It is like compiling a program and getting hundreds of compilation errors at once. There is no need to worry: most likely a single line of code contains a few extra characters, and once you delete them and recompile, all of the hundreds of errors disappear.
Such problems typically arise in two cases.
The first case: a core dump caused by byte alignment. There are two possible reasons. First, the structure's byte alignment as defined in a partner's module differs from that in your own module. Second, a header file from another module is included in the middle of a byte-alignment declaration in your own file, which changes the byte alignment in effect. Such problems are not very common, but when they appear they often seem strange. They are actually fairly easy to eliminate at the source, and the key is a good habit: when defining interface messages, take a little more time to arrange the structure's fields so that they fill out to multiples of 4 bytes. There should be no need to economize on those few bytes of memory everywhere.
The second case: a core dump caused by incorrect link parameters used when building the program, which shows up when the running program repeatedly operates on large blocks of memory. After repeated rounds of cutting down the code, compiling, and running experiments, a pattern in the problem was finally found, which led to the suspicion that it was related to the operating system or the compilation, and the problem was finally solved.
The root cause of program core dumps is coding mistakes made by programmers during program design, and most of these mistakes come from failing to strictly follow the relevant coding standards. To fundamentally eliminate or reduce core dumps, one must therefore start with strict adherence to coding standards [2].
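The byte alignment case can be made concrete with a small sketch. It assumes a compiler that supports #pragma pack(push/pop), as GCC, Clang, and MSVC do, and all struct names are hypothetical. The same two fields occupy different offsets depending on the packing in effect, which is why two modules that disagree on packing read each other's messages incorrectly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Default alignment: padding is inserted after type so that length is
// aligned for a 32-bit integer.
struct NaturalMsg {
    std::uint8_t  type;
    std::uint32_t length;
};

// 1-byte packing: no padding, so length starts at offset 1.
#pragma pack(push, 1)
struct PackedMsg {
    std::uint8_t  type;
    std::uint32_t length;
};
#pragma pack(pop)  // restore the previous setting before anything else is
                   // declared or included, so the packing does not leak

// Declared after the pop, so it uses the default alignment again.
struct AfterPop {
    std::uint8_t  type;
    std::uint32_t length;
};

static_assert(sizeof(PackedMsg) == 5, "packed layout has no padding");
static_assert(sizeof(AfterPop) == sizeof(NaturalMsg),
              "pop restored the default alignment");
```

If the pop were missing, or if another module's header were included between the push and the pop, the structures in that header would be compiled with 1-byte packing here and with default alignment elsewhere, which corresponds to the second reason given for this case.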
