Linux: Several Ways to Generate a Core File

el/2024/3/25 16:40:14


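1. Summary

In some situations a process produces a core file (core dump) that records the state of the process and helps us locate a fault quickly.

For example:

  • When a process exits abnormally, e.g. on a segmentation fault, we can analyze the resulting core and inspect the call stack to find where a null pointer was dereferenced;
  • When a process blocks somewhere in its code, we can force a core to be generated and inspect the call stack to find out why it is blocked;
  • ……

A core can be generated in the following ways:

  • The code has a bug and the process exits abnormally, most commonly with a segmentation fault;
  • The process receives a signal such as SIGABRT, exits, and generates a core;
  • gcore (or gdb) is used to generate a core for the process, which keeps running and is not terminated.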

2. Environment

Operating system:

[test1280@test1280 20210113]$ uname -a
Linux test1280 2.6.32-642.el6.x86_64 #1 SMP Tue May 10 17:27:01 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[test1280@test1280 20210113]$ cat /etc/redhat-release 
CentOS release 6.8 (Final)

Resource limit:

[test1280@test1280 20210113]$ ulimit -c
unlimited

Note:

ulimit -c must not be 0; unlimited is best.

If ulimit -c is set to 0, no core file can be generated.

Further reading: https://blog.csdn.net/test1280/article/details/73655994


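3. Examples

3.1. Runtime faults

The most common is a segmentation fault:

  • Null pointer dereference
  • Out-of-bounds memory access
  • ……

Take a segmentation fault caused by writing through a null pointer as an example:

demo1.c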

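#include <stdio.h>   /* header names were lost from the source page; these three are assumed */
#include <stdlib.h>
#include <unistd.h>

struct student
{
    char *mname;
    char *maddr;
    int   mage;
};

int fun3()
{
    struct student* pstudent = NULL;
    /* Writing through the NULL pointer triggers a segmentation fault */
    pstudent->mname = "test1280";
}

int fun2()
{
    fun3();
}

int fun1()
{
    fun2();
}

int main()
{
    fun1();
}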

Compile and run:

[test1280@test1280 20210113]$ gcc -o demo1 demo1.c -g
[test1280@test1280 20210113]$ ./demo1 
Segmentation fault (core dumped)

Inspect the core file:

[test1280@test1280 20210113]$ gdb -c core.3348 demo1
......
Core was generated by `./demo1'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000400484 in fun3 () at demo1.c:16
16		pstudent->mname = "test1280";
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.192.el6.x86_64
(gdb) bt
#0  0x0000000000400484 in fun3 () at demo1.c:16
#1  0x000000000040049b in fun2 () at demo1.c:21
#2  0x00000000004004ab in fun1 () at demo1.c:26
#3  0x00000000004004bb in main () at demo1.c:31
(gdb) quit

The call stack at the time of the fault is main->fun1->fun2->fun3:

#0  0x0000000000400484 in fun3 () at demo1.c:16
#1  0x000000000040049b in fun2 () at demo1.c:21
#2  0x00000000004004ab in fun1 () at demo1.c:26
#3  0x00000000004004bb in main () at demo1.c:31

The cause of the fault:

Program terminated with signal 11, Segmentation fault.

Note: signal 11 = SIGSEGV

The faulting code (source file and source line):

#0  0x0000000000400484 in fun3 () at demo1.c:16
16		pstudent->mname = "test1280";

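Other errors can also produce a core, for example division by zero:

int fun3()
{
    int i = 0/0;
}

Running it prints:

Floating point exception (core dumped)

and gdb reports:

Program terminated with signal 8, Arithmetic exception.

Note: signal 8 = SIGFPE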
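
3.2. Signals

A signal can be raised by the process itself or sent manually.

3.2.1. abort

When a process reaches an error path, it can call the abort function (C library, stdlib.h) itself to exit and generate a core file.

demo2.c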

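#include <stdio.h>   /* header names were lost from the source page; these three are assumed */
#include <stdlib.h>
#include <unistd.h>

int fun3()
{
    /* Error path: terminate the process */
    abort();
}

int fun2()
{
    fun3();
}

int fun1()
{
    fun2();
}

int main()
{
    fun1();
}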

Compile and run:

[test1280@test1280 20210113]$ gcc -o demo2 demo2.c -g
[test1280@test1280 20210113]$ ./demo2 
Aborted (core dumped)

Inspect the core file:

[test1280@test1280 20210113]$ gdb -c core.3415 demo2
......
Core was generated by `./demo2'.
Program terminated with signal 6, Aborted.
#0  0x0000003da0e325e5 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.192.el6.x86_64
(gdb) bt
#0  0x0000003da0e325e5 in raise () from /lib64/libc.so.6
#1  0x0000003da0e33dc5 in abort () from /lib64/libc.so.6
#2  0x00000000004004cd in fun3 () at demo2.c:8
#3  0x00000000004004db in fun2 () at demo2.c:13
#4  0x00000000004004eb in fun1 () at demo2.c:18
#5  0x00000000004004fb in main () at demo2.c:23
(gdb) quit

The call stack at the time of the fault is main->fun1->fun2->fun3->abort->raise:

#0  0x0000003da0e325e5 in raise () from /lib64/libc.so.6
#1  0x0000003da0e33dc5 in abort () from /lib64/libc.so.6
#2  0x00000000004004cd in fun3 () at demo2.c:8
#3  0x00000000004004db in fun2 () at demo2.c:13
#4  0x00000000004004eb in fun1 () at demo2.c:18
#5  0x00000000004004fb in main () at demo2.c:23

When abort is called, it calls raise, which sends SIGABRT to the process itself.

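The cause of the fault:

Program terminated with signal 6, Aborted.

Note: signal 6 = SIGABRT

The faulting code (source file and source line):

#0  0x0000003da0e325e5 in raise () from /lib64/libc.so.6

Frame #0 is inside libc; the first frame in our own code is #2, fun3 () at demo2.c:8, which is the abort() call.

3.2.2. kill

  • Ctrl+\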

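Suppose the process runs in the foreground, for example:

demo3.c

[test1280@test1280 20210113]$ cat demo3.c 
#include <stdio.h>   /* header names were lost from the source page; these three are assumed */
#include <stdlib.h>
#include <unistd.h>

int fun3()
{
    while (1)
    {
        sleep(1);
    }
}

int fun2()
{
    fun3();
}

int fun1()
{
    fun2();
}

int main()
{
    fun1();
}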

Compile and run:

[test1280@test1280 20210113]$ gcc -o demo3 demo3.c -g
[test1280@test1280 20210113]$ ./demo3 
【demo3 runs in the foreground; the shell blocks until it exits】

Press Ctrl+\ in that shell to send SIGQUIT to the foreground process:

[test1280@test1280 20210113]$ gcc -o demo3 demo3.c -g
[test1280@test1280 20210113]$ ./demo3 
^\Quit (core dumped)

The foreground process terminates and a core file is generated.

Inspect the core file:

[test1280@test1280 20210113]$ gdb -c core.3575 demo3
......
Core was generated by `./demo3'.
Program terminated with signal 3, Quit.
#0  0x0000003da0eacbc0 in __nanosleep_nocancel () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.192.el6.x86_64
(gdb) bt
#0  0x0000003da0eacbc0 in __nanosleep_nocancel () from /lib64/libc.so.6
#1  0x0000003da0eaca50 in sleep () from /lib64/libc.so.6
#2  0x00000000004004d2 in fun3 () at demo3.c:9
#3  0x00000000004004e2 in fun2 () at demo3.c:15
#4  0x00000000004004f2 in fun1 () at demo3.c:20
#5  0x0000000000400502 in main () at demo3.c:25

Note: signal 3 = SIGQUIT

Program terminated with signal 3, Quit.

  • kill

The kill command (or a similar command) sends a chosen signal to a chosen process by hand.

Continuing with demo3: leave it running in the foreground and run kill from a second shell:

【Terminal 1】
[test1280@test1280 20210113]$ ./demo3 
【Terminal 1 blocks...】

【Terminal 2】
【First find the PID of demo3: 3587】
[test1280@test1280 ~]$ ps aux | grep demo3 | grep -v grep
test1280   3587  0.0  0.0   3920   328 pts/0    S    06:41   0:00 ./demo3
【Send SIGQUIT to process 3587 with kill】
[test1280@test1280 ~]$ kill -SIGQUIT 3587

【Terminal 1】
[test1280@test1280 20210113]$ ./demo3 
Quit (core dumped)
【demo3 received the SIGQUIT sent with kill from terminal 2 and exited; terminal 1 is no longer blocked】

3.3. gcore

If a process in production blocks abnormally and you want a core without stopping the process, you can use gcore.

gcore is a script that invokes gdb:

[test1280@test1280 20210113]$ which gcore
/usr/bin/gcore
[test1280@test1280 20210113]$ file `which gcore`
/usr/bin/gcore: POSIX shell script text executable
[test1280@test1280 20210113]$ vi `which gcore`
......

The man page describes gcore as follows:

Generate a core dump of a running program with process ID pid.
Produced file is equivalent to a kernel produced core file as if the process crashed (and if "ulimit -c" were used to set up an appropriate core dump limit).
Unlike after a crash, after gcore the program remains running without any change.

Continuing with demo3: leave it running in the foreground and run gcore from a second shell:

【Terminal 1】
[test1280@test1280 20210113]$ ./demo3 
【Terminal 1 waits for demo3 to exit and stays blocked】

【Terminal 2】
【First find the PID of demo3: 3644】
[test1280@test1280 20210113]$ ps aux | grep demo3 | grep -v grep
test1280   3644  0.0  0.0   3920   332 pts/0    S    06:52   0:00 ./demo3
【gcore <pid> generates the core】
[test1280@test1280 20210113]$ gcore 3644
0x0000003da0eacbc0 in __nanosleep_nocancel () from /lib64/libc.so.6
Saved corefile core.3644

【Terminal 1】
[test1280@test1280 20210113]$ ./demo3 
【Terminal 1 is still blocked waiting for demo3 to exit; demo3 keeps running after gcore】

The gcore script does its work by invoking gdb's gcore command:

[test1280@test1280 20210113]$ gdb
(gdb) help gcore
Save a core file with the current state of the debugged process.
Argument is optional filename.  Default filename is 'core.<process_id>'.
(gdb) quit

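4. Summary

A process can generate a core for many reasons, either on its own initiative or involuntarily.

In the end, though, almost all of them come down to a signal:

* SIGSEGV: segmentation fault
* SIGABRT: abort
* SIGQUIT: Ctrl+\ (or kill -SIGQUIT)

Besides these, other signals such as SIGILL and SIGTRAP also make a process dump core.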

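The key question is which situations cause a signal to be sent to the process.

Apart from signals, gcore invokes gdb's gcore command, which makes a process produce a core file without terminating it.

Open question: how is gcore implemented? Does it also send some particular signal that makes the process generate a core without terminating?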

