Coredump Debugging Notes: PHP Edition
扶墙
贝壳产品技术
September 6, 2019, 20:00
The author, 扶墙 (an internal company alias), currently leads customer-acquisition R&D for the new home-renovation business line at 贝壳找房 (Beike Zhaofang) and the overall technical architecture design of the center.

Starting from a real problem

A PHP script core-dumped in a Linux production environment. A group of engineers used a set of tools (strace to watch system calls, gdb for breakpoint analysis) to locate and analyze the fault. Once we had an environment that reproduced the problem reliably, we resolved it without further trouble.

Background and history

Central idea: a coredump has nothing to do with the programming language. This article covers only one of many coredump scenarios. Let us first look at where coredumps come from.

Glossary

core: "The name comes from magnetic core memory, the principal form of random access memory from the 1950s to the 1970s. The name has remained long after magnetic core technology became obsolete." The "core" in "core dump" is that same "core".

dump: "Earliest core dumps were paper printouts of the contents of memory, typically arranged in columns of octal or hexadecimal numbers (a "hex dump"), sometimes accompanied by their interpretations as machine language instructions, text strings, or decimal or floating-point numbers (cf. disassembler). As memory sizes increased and post-mortem analysis utilities were developed, dumps were written to magnetic media like tape or disk." A dump has a defined format and can be parsed with tools.

strace: a tool that combines diagnostics, debugging, and statistics. By tracing an application's system calls and signal delivery, we can analyze the application, either to solve a problem or to understand how it works. strace is no substitute for a real debugger such as gdb, because it is not a debugger.

gdb: a powerful command-line debugger released by the GNU project for UNIX/Linux. For a C++ programmer working on Linux, gdb is indispensable. It mainly helps with four things:
* Starting the program and running it exactly the way you specify.
* Stopping the program at breakpoints you specify; a breakpoint can be a conditional expression.
* Inspecting what has happened in the program once it is stopped, and tracing back how it got there.
* Changing the program's execution environment dynamically.

Linux: a free and open-source Unix-like operating system.

backtrace: library support that lets an application debug itself (self-debugging).

Summary

"In computing, a core dump, crash dump, memory dump, or system dump consists of the recorded state of the working memory of a computer program at a specific time, generally when the program has crashed or otherwise terminated abnormally. In practice, other key pieces of program state are usually dumped at the same time, including the processor registers, which may include the program counter and stack pointer, memory management information, and other processor and operating system flags and information. A snapshot dump (or snap dump) is a memory dump requested by the computer operator or by the running program, after which the program is able to continue. Core dumps are often used to assist in diagnosing and debugging errors in computer programs." (Wikipedia)

A simple way to picture it: the instant a high-resolution camera's shutter fires, only with much richer information.

History

Memory dumps printed on paper for analysis (relatively little content) → modern operating systems writing a specially formatted file (far richer content, see the summary above). The purpose has always been the same: to help engineers debug a program or find the root cause of an abnormal exit or crash.

Further reading: http://wikipedia.moesalih.com/Coredump

1. A closer look at coredump

1.1 How it works

When it is triggered: when an application crashes or terminates abnormally. This subsection explains which kinds of crash and abnormal termination produce a coredump.

Mechanism: a coredump is a signal-based mechanism for capturing memory contents. Common signals that can trigger one include SIGBUS, SIGSEGV, SIGILL, SIGABRT, and SIGQUIT; the man pages list the rest. The coredump gives the user an image of user-space memory so the program can be debugged after the fact.

"The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored." SIGKILL (kill -9) does not produce a coredump: its default action in signal(7) is Term rather than Core, and because it cannot be caught, that behavior cannot be changed. See the table of default actions in signal(7).

Further reading: http://man7.org/linux/man-pages/man7/signal.7.html

1.2 Impact of a coredump

Business impact: as the trigger scenarios above show, dumping core means the process terminates immediately and the service can no longer respond, so the service is degraded. The time taken to write the dump is proportional to the memory the process holds: the more memory, the longer the dump. For example, when a process holds more than 60 GB of memory, writing the complete core file to disk takes 15 minutes (quoted from the Baidu Search R&D blog, http://stblog.baidu-tech.com/p=1684). The traffic lost in that window is hard to estimate.

Mental burden: most engineers today work in high-level languages such as Java, PHP, and Python, and see coredumps far less often than those working in a low-level language like C. A C engineer can cause a coredump with a single out-of-bounds pointer (see the example in 1.3), whereas a language like Java shields us from this with its exception mechanism. As a result, many engineers have no idea where to start when they see a core file, and some turn pale at the very word "core".

1.3 How to generate a core file

Via SIGQUIT, that is, "ctrl+\":

```
work@VM_28_112_centos:~/duan/coredump$ sleep 10
^\Quit (core dumped)
```

With the gcore tool:

```
work@VM_28_112_centos:~/duan/coredump$ sleep 10 &
[3] 5299
work@VM_28_112_centos:~/duan/coredump$ gcore 5299
0x00007fb8f0b31550 in __nanosleep_nocancel () from /lib64/libc.so.6
warning: target file /proc/5299/cmdline contained unexpected null characters
Saved corefile core.5299
```

Through a fault in code:

```c
/* The three header names were lost in extraction; stdio.h and string.h
   are what the commented-out SIGSEGV variant needs. */
#include <stdio.h>
#include <string.h>

int main() {
    /* SIGSEGV */
    /* char buf[1]; */
    /* buf[1] = 1; */
    /* strcpy(buf, "111111111"); */
    /* printf("buf is %p:%s\n", buf, buf); */
    /* return 0; */
    /* ---------- */
    /* SIGFPE: integer division by zero */
    int a = 0;
    int b = 1;
    int c = b / a;
    return 0;
}
```

1.4 Core file format

```
work@VM_28_112_centos:~/duan/coredump$ readelf -h core.5299
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              CORE (Core file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          368160 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         15
  Size of section headers:           64 (bytes)
  Number of section headers:         17
  Section header string table index: 16
```

1.5 Coredump settings

Size limit: as noted above, a process holding more than 60 GB of memory needs 15 minutes to write its complete core file. If core file size is not limited, the disk can fill up and affect other services (co-locating services on one machine is common for the sake of utilization). The limit is a double-edged sword: once it is in place, core files get truncated, which hampers later analysis. The limit is set with ulimit -c; a value of 0 disables core file generation.

In addition, to improve availability we usually manage worker processes with Supervisor. When a worker dies unexpectedly, Supervisor notices and restarts it, which gives automatic recovery with little effort. In that setup, once the service process core-dumps, Supervisor brings it back up, it core-dumps again, and a vicious cycle begins.

Naming rule: for readability, the service name and process id are usually part of the core file name, for example:

```
cat /proc/sys/kernel/core_pattern
/tmp/core-%e-%p
```

pipe: rarely used in production at present, so it is not covered here.

2. Back to our case

2.1 The scenario

Everything so far was meant to give a fuller picture of coredumps and to lighten the mental burden. Now for the real case.

Background: a scheduled task core-dumped. It had happened before; the problem was sidestepped with a hack and never investigated. When it happened again recently, we took the rare opportunity to get to the bottom of it.

Checking the configuration:

```
work@VM_28_112_centos:~/duan/coredump$ cat /proc/sys/kernel/core_pattern
/tmp/core-%e-%p
// core files go to /tmp, named core-{executable name}-{process id}

work@VM_28_112_centos:~/duan/coredump$ ulimit -a
core file size          (blocks, -c) 29
// core file size is limited, so existing core files are very likely
// truncated (damaged and unusable for debugging); 1 block = 1024 bytes
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 127965
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65535
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 65535
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

work@VM_28_112_centos:~/duan/coredump$ ll -tr /tmp/core-*
-rw------- 1 www www 53248 Aug  1 15:50 /tmp/core-php-26527
-rw------- 1 www www 53248 Aug  1 15:51 /tmp/core-php-27365
-rw------- 1 www www 53248 Aug  1 15:52 /tmp/core-php-27854
-rw------- 1 www www 53248 Aug  1 15:53 /tmp/core-php-28451
-rw------- 1 www www 53248 Aug  1 15:54 /tmp/core-php-29338
-rw------- 1 www www 53248 Aug  1 15:55 /tmp/core-php-29878
-rw------- 1 www www 53248 Aug  1 15:56 /tmp/core-php-30619
-rw------- 1 www www 53248 Aug  1 15:57 /tmp/core-php-31314
-rw------- 1 www www 53248 Aug  1 15:58 /tmp/core-php-31859
```

Runtime environment: CentOS 7, PHP 7.0.13, Phalcon 3.2.4.

```
Web framework delivered as a C-extension for PHP
phalcon => enabled
Author => Phalcon Team and contributors
Version => 3.2.4
Build Date => Jan 29 2018 17:41:27
Powered by Zephir => Version 0.10.4-11e39849b0

Directive => Local Value => Master Value
phalcon.db.escape_identifiers => On => On
phalcon.db.force_casting => Off => Off
phalcon.orm.cast_on_hydrate => Off => Off
phalcon.orm.column_renaming => On => On
phalcon.orm.disable_assign_setters => Off => Off
phalcon.orm.enable_implicit_joins => On => On
phalcon.orm.enable_literals => On => On
phalcon.orm.events => On => On
phalcon.orm.exception_on_failed_save => Off => Off
phalcon.orm.ignore_unknown_columns => Off => Off
phalcon.orm.late_state_binding => Off => Off
phalcon.orm.not_null_validations => On => On
phalcon.orm.update_snapshot_on_save => On => On
phalcon.orm.virtual_foreign_keys => On => On

work@VM_28_112_centos:~/duan/coredump$ php -v
PHP 7.0.13 (cli) (built: Dec 14 2016 11:29:15) ( NTS )
Copyright (c) 1997-2016 The PHP Group
Zend Engine v3.0.0, Copyright (c) 1998-2016 Zend Technologies
    with Zend OPcache v7.0.13, Copyright (c) 1999-2016, by Zend Technologies
```

3. Getting to the bottom of it

3.1 The toolset

gdb: the tool of choice for debugging core files. Only the basic usage is shown here: gdb {exec_file} {corefile}. The exec_file must be exactly the executable that produced the coredump. More detailed gdb usage is easy to find online.

ulimit: manages the core file size limit, as described above.

strace: shows system calls.

3.2 Hands-on debugging

Note the core file sizes in the two listings under 2.1.

```
work@VM_28_112_centos:~/duan/coredump$ gdb /usr/local/matrix/bin/php /tmp/core-php-31859
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-110.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /usr/local/matrix/bin/php...Reading symbols from /usr/local/matrix/bin/php...(no debugging symbols found)...done.
(no debugging symbols found)...done.
BFD: Warning: /tmp/core-php-31859 is truncated: expected core file size >= 11030528, found: 53248.
[New LWP 31859]
Failed to read a valid object file image from memory.
Core was generated by `/usr/local/matrix/bin/php /data0/www/htdocs/xxxx-yyyy/cli/c'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f2f1e2e430b in ()
// the core file is truncated, so the stack cannot be read
Missing separate debuginfos, use: debuginfo-install matrix-php-7.0.13-1.x86_64
(gdb) bt
Python Exception Cannot access memory at address 0x7ffcdbea3260
(gdb)
```

Since the core file size is limited, we simply lift the limit with ulimit -c unlimited.

```
work@VM_28_112_centos:~/duan/coredump$ ulimit -c unlimited
-bash: ulimit: core file size: cannot modify limit: Operation not permitted
work@VM_28_112_centos:~/duan/coredump$ sudo ulimit -c unlimited
sudo: ulimit: command not found
```

Fix for "sudo: ulimit: command not found": run sudo su first, then run ulimit inside the root shell. sudo needs to find an executable program, and ulimit is a shell builtin rather than a standalone executable, as the following shows:

```
work@VM_28_112_centos:~/duan/coredump/phalcon$ whereis cd
cd: /usr/bin/cd /usr/share/man/man1/cd.1.gz
work@VM_28_112_centos:~/duan/coredump/phalcon$ whereis ulimit
ulimit: /usr/include/ulimit.h /usr/share/man/man1/ulimit.1.gz
```

Two ways to do it. Either run sudo su and then ulimit -c unlimited in the resulting shell, or use the one-liner from the Stack Overflow answer listed in the references, which raises the open-files limit; substitute -c unlimited for the core limit. Note that the commands must be joined with &&, not a single &, which would run the first command in the background:

```
sudo sh -c "ulimit -n 65535 && exec su $LOGNAME"
```

3.3 Reproducing the problem

In the backtrace below, EXT abbreviates /usr/local/matrix/lib/php/extensions/no-debug-non-zts-20121212, and the GDB startup banner, identical to the one in 3.2, is omitted.

```
work@VM_28_112_centos:~/duan/coredump$ /usr/local/matrix/bin/php /data0/www/htdocs/xxx-yyy/cli/cli.php history_deal_export_v2 main
Segmentation fault (core dumped)
work@VM_28_112_centos:~/duan/coredump$ gdb /usr/local/matrix/bin/php -c /tmp/core-php-13396
Reading symbols from /usr/local/matrix/bin/php...Reading symbols from /usr/local/matrix/bin/php...(no debugging symbols found)...done.
(no debugging symbols found)...done.
[New LWP 13396]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/local/matrix/bin/php /data0/www/htdocs/preview-deal.fang.lianjia.com/cli/c'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007fcc510e430b in zephir_create_instance.constprop.180 () from EXT/phalcon.so
Missing separate debuginfos, use: debuginfo-install matrix-php-7.0.13-1.x86_64
(gdb) bt
#0  0x00007fcc510e430b in zephir_create_instance.constprop.180 () from EXT/phalcon.so
#1  0x00007fcc51239686 in zim_Phalcon_Di_get () from EXT/phalcon.so
#2  0x00000000006c3fd4 in zend_call_function ()
#3  0x00007fcc510d77d1 in zephir_call_user_function () from EXT/phalcon.so
#4  0x00007fcc510d80ca in zephir_call_class_method_aparams.part.35 () from EXT/phalcon.so
#5  0x00007fcc511c4a4b in zim_Phalcon_Di_getShared () from EXT/phalcon.so
#6  0x00000000006c3fd4 in zend_call_function ()
#7  0x00007fcc510d77d1 in zephir_call_user_function () from EXT/phalcon.so
#8  0x00007fcc510d80ca in zephir_call_class_method_aparams.part.35 () from EXT/phalcon.so
#9  0x00007fcc51395de5 in zim_Phalcon_Dispatcher_dispatch () from EXT/phalcon.so
#10 0x00000000006c3fd4 in zend_call_function ()
#11 0x00007fcc510d77d1 in zephir_call_user_function () from EXT/phalcon.so
#12 0x00007fcc510d80ca in zephir_call_class_method_aparams.part.35 () from EXT/phalcon.so
#13 0x00007fcc511fe52d in zim_Phalcon_Cli_Console_handle () from EXT/phalcon.so
#14 0x000000000074bd7b in ZEND_DO_FCALL_SPEC_HANDLER ()
#15 0x000000000070dffb in execute_ex ()
#16 0x0000000000757767 in zend_execute ()
#17 0x00000000006d1b64 in zend_execute_scripts ()
#18 0x0000000000676260 in php_execute_script ()
#19 0x00000000007592ef in do_cli ()
#20 0x00000000004332df in main ()
```

Key information: the coredump happens inside Phalcon. The top of the stack is zephir_create_instance.constprop.180.

Next we bring in strace:

```
strace -s 2048 -T -tt -o /tmp/straceCore -v /usr/local/matrix/bin/php /data0/www/htdocs/someBiz/cli/cli.php history_deal_export_v2 main
```

The tail of the trace (repeated stat structures for the same file are abbreviated):

```
...
20:38:09.832524 stat("/data0/www/htdocs/someBiz/app/library/BaseTask.php", {st_dev=makedev(253, 16), st_ino=8655576, st_mode=S_IFREG|0755, st_nlink=1, st_uid=99, st_gid=99, st_blksize=4096, st_blocks=8, st_size=1029, st_atime=2019/08/02-11:11:02.613447598, st_mtime=2019/01/24-18:50:06.896831669, st_ctime=2019/01/24-18:50:07.760833714}) = 0
20:38:09.832619 lstat("/data0/www/htdocs/someBiz/app/library/BaseTask.php", {...same fields as above...}) = 0
20:38:09.832671 open("/data0/www/htdocs/someBiz/app/library/BaseTask.php", O_RDONLY) = 5
20:38:09.832708 fstat(5, {...same fields as above...}) = 0
20:38:09.832751 fstat(5, {...same fields as above...}) = 0
20:38:09.832812 fstat(5, {...same fields as above...}) = 0
20:38:09.832855 mmap(NULL, 1029, PROT_READ, MAP_SHARED, 5, 0) = 0x7fad52135000
20:38:09.832949 munmap(0x7fad52135000, 1029) = 0
20:38:09.832988 close(5) = 0
20:38:09.833074 stat("/data0/www/htdocs/someBiz/app/common/ExportConst.php", {st_dev=makedev(253, 16), st_ino=8391700, st_mode=S_IFREG|0755, st_nlink=1, st_uid=99, st_gid=99, st_blksize=4096, st_blocks=64, st_size=31879, st_atime=2019/08/02-15:49:02.358751551, st_mtime=2019/07/31-15:48:29.901078099, st_ctime=2019/07/31-15:48:31.044078262}) = 0
20:38:09.833160 lstat("/data0/www/htdocs/someBiz/app/common/ExportConst.php", {...same fields as above...}) = 0
20:38:09.833206 lstat("/data0/www/htdocs/someBiz/app/common", {st_dev=makedev(253, 16), st_ino=8390886, st_mode=S_IFDIR|0755, st_nlink=2, st_uid=99, st_gid=99, st_blksize=4096, st_blocks=8, st_size=4096, st_atime=2019/08/02-20:11:11.652054375, st_mtime=2019/07/31-15:48:31.044078262, st_ctime=2019/07/31-15:48:31.044078262}) = 0
20:38:09.833253 open("/data0/www/htdocs/someBiz/app/common/ExportConst.php", O_RDONLY) = 5
20:38:09.833290 fstat(5, {...same fields as the ExportConst.php stat above...}) = 0
20:38:09.833332 fstat(5, {...same fields as the ExportConst.php stat above...}) = 0
20:38:09.833374 fstat(5, {...same fields as the ExportConst.php stat above...}) = 0
20:38:09.833428 mmap(NULL, 31879, PROT_READ, MAP_SHARED, 5, 0) = 0x7fad5212e000
20:38:09.833783 munmap(0x7fad5212e000, 31879) = 0
20:38:09.833830 close(5) = 0
20:38:09.833890 --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x18} ---
20:38:09.849002 +++ killed by SIGSEGV (core dumped) +++
```

Key information: right after the program read /data0/www/htdocs/someBiz/app/common/ExportConst.php it received SIGSEGV, which led to the coredump. SEGV_MAPERR with si_addr=0x18 means an access to an unmapped address close to zero, which is consistent with reading a field through a NULL pointer.

3.4 Analysis together with the business code

Entry file:

```php
use Phalcon\Di\FactoryDefault\Cli as CliDI,
    Phalcon\Cli\Console as ConsoleApp;

define('ROOT_PATH', dirname(__DIR__));
define('APP_PATH', ROOT_PATH . DIRECTORY_SEPARATOR . 'app');
define("APP_MODE", get_cfg_var('lianjia.environment'));
define("APP_NAME", 'deal');

$di = new CliDI();
$config = include ROOT_PATH . '/config/' . APP_MODE . '.php';
include ROOT_PATH . '/config/loader.php';
include ROOT_PATH . '/config/cli_service.php';
$di->get('dispatcher')->setDefaultNamespace('someBiz\tasks');

// Set the PHP error log
$error_log = '/tmp/' . APP_NAME . '_php_error_' . date('Ymd') . '.log';
ini_set('error_log', $error_log);
\Framework\Context::init();

$console = new ConsoleApp();
$console->setDI($di);

/**
 * Process the console application's arguments
 */
$arguments = array();
foreach ($argv as $k => $arg) {
    if ($k == 1) {
        $arguments['task'] = $arg;
    } elseif ($k == 2) {
        $arguments['action'] = $arg;
    } elseif ($k >= 3) {
        $arguments['params'][] = $arg;
    }
}

// Define global constants for the current task and action
define('CURRENT_TASK', (isset($argv[1]) ? $argv[1] : null));
define('CURRENT_ACTION', (isset($argv[2]) ? $argv[2] : null));

try {
    // Handle the arguments
    $console->handle($arguments); // frame #13 0x00007fcc511fe52d in zim_Phalcon_Cli_Console_handle ()
} catch (\Exception $e) {
    echo $e->getMessage();
    exit(255);
}
```

Analysis together with the stack: judging by its name, zephir_create_instance creates an instance. Its definition:

```c
/**
 * Creates a new instance dynamically. Call constructor without parameters
 */
int zephir_create_instance(zval *return_value, const zval *class_name TSRMLS_DC)
{
    zend_class_entry *ce;

    if (Z_TYPE_P(class_name) != IS_STRING) {
        zephir_throw_exception_string(spl_ce_RuntimeException, SL("Invalid class name") TSRMLS_CC);
        return FAILURE;
    }

    ce = zend_fetch_class(Z_STRVAL_P(class_name), Z_STRLEN_P(class_name), ZEND_FETCH_CLASS_DEFAULT TSRMLS_CC);
    if (!ce) {
        ZVAL_NULL(return_value);
        return FAILURE;
    }

    object_init_ex(return_value, ce);
    if (zephir_has_constructor_ce(ce)) {
        return zephir_call_class_method_aparams(NULL, ce, zephir_fcall_method, return_value, SL("__construct"), NULL, 0, 0, NULL TSRMLS_CC);
    }
    return SUCCESS;
}
```

Following the stack, the call chain is:

cphalcon/phalcon/Cli/Console.zep (handle) → cphalcon/phalcon/Cli/Dispatcher.zep (runs dispatch of the parent class phalcon/Dispatcher/AbstractDispatcher.zep), which calls di getShared → cphalcon/phalcon/Di.zep (getShared) → cphalcon/phalcon/Di.zep (get), which calls → cphalcon/phalcon/Di.zep (create_instance) → ext/kernel/object.c (zephir_create_instance), where the coredump occurs.

The name zephir_create_instance.constprop made us think of constants. (The .constprop.N suffix is in fact what GCC appends to a clone of a function specialized by constant propagation, so the symbol refers to a compiler-generated copy of zephir_create_instance.) Combined with the strace log, where the first access to ExportConst.php comes from a member variable of the class being loaded, this pointed us to the member variables.

Bold hypothesis, careful verification: the crash is triggered when a member variable of that class takes its value from a constant that does not exist.

Taking it further: if that is the cause, then not only CLI mode should fail; the CGI service should have the same problem. So we kept digging.

Test endpoint: http://develop.api.home.ke.com/decorateplat/alphatest/versionemergency=1 {host 10.26.15.36 develop.api.home.ke.com}

Normal response:

```
{
    errno: 10004,
    error: "params version is missing",
    request_id: "2019080717544285115",
    data: {},
    cost: 26
}
```

Further testing produced an abnormal response:

```
502 Bad Gateway
The proxy server received an invalid response from an upstream server. Sorry for the inconvenience.
Please report this message and include the following information to us.
Thank you very much!
URL: http://develop.api.home.ke.com/decorateplat/alphatest/versionemergency=1
Server: vm_15_36_centos
Date: 2019/08/07 17:56:45
Powered by Tengine
```

Log output on develop.api.home.ke.com:

```
tail -f /usr/local/matrix/var/log/php-fpm.log
[02-Aug-2019 21:02:42] WARNING: [pool api.home.ke.com] child 25508 exited on signal 11 (SIGSEGV - core dumped) after 879742.488115 seconds from start
[02-Aug-2019 21:02:42] NOTICE: [pool api.home.ke.com] child 22853 started
[02-Aug-2019 21:04:47] WARNING: [pool api.home.ke.com] child 18875 exited on signal 11 (SIGSEGV - core dumped) after 1330478.281358 seconds from start
[02-Aug-2019 21:04:47] NOTICE: [pool api.home.ke.com] child 23251 started
[07-Aug-2019 17:35:44] WARNING: [pool api.home.ke.com] child 18878 exited on signal 11 (SIGSEGV - core dumped) after 1749935.065729 seconds from start
[07-Aug-2019 17:35:44] NOTICE: [pool api.home.ke.com] child 6541 started
[07-Aug-2019 17:51:53] WARNING: [pool api.home.ke.com] child 18876 exited on signal 11 (SIGSEGV - core dumped) after 1750904.354166 seconds from start
[07-Aug-2019 17:51:53] NOTICE: [pool api.home.ke.com] child 8994 started
[07-Aug-2019 17:56:45] WARNING: [pool api.home.ke.com] child 6541 exited on signal 11 (SIGSEGV - core dumped) after 1261.287392 seconds from start
[07-Aug-2019 17:56:45] NOTICE: [pool api.home.ke.com] child 9814 started
```

Conclusion and fix: the problem should now be clear. Under Phalcon 3.2.4, the program accesses an undefined constant and the service core-dumps. In CLI mode the process exits abnormally; in CGI mode the service responds with 502. The fixes:

* This is a typical code bug that proper testing would have caught. PS: the coredump occurs if and only if a member variable takes its value from a constant that does not exist; every other case results in a Fatal Error. Interested readers can verify this themselves.
* So far only version 3.2.4 is known to have the problem. Two versions coexist in production, 3.2.4 and 3.4.1; we verified that 3.4.1 has fixed it, so we recommend upgrading to 3.4.1.

Postscript

Why does the coredump happen only when a class member variable accesses an undefined constant? The target is the Phalcon framework, since we have already located the fault in zephir_create_instance.constprop.180 (see the top of the backtrace). Looking at the source, Phalcon 3.2.4 ext/kernel/object.c contains no zephir_create_instance.constprop.180, because that name is generated by the compiler. We will keep studying the Phalcon framework and Zephir, which touch on LLVM-related topics, and leave the analysis to a follow-up article.

Below is the session with b zend_fetch_class, showing the last breakpoint hit before the coredump. EXT abbreviates /usr/local/matrix/lib/php/extensions/no-debug-non-zts-20151012.

```
Breakpoint 3, 0x00000000006c4db0 in zend_fetch_class ()
(gdb)
Single stepping until exit from function zend_fetch_class,
which has no line number information.
0x00007fffe4029273 in zephir_create_instance.constprop.180 () from EXT/phalcon.so
(gdb) bt
#0  0x00007fffe4029273 in zephir_create_instance.constprop.180 () from EXT/phalcon.so
#1  0x00007fffe4250860 in zim_Phalcon_Di_Service_resolve () from EXT/phalcon.so
#2  0x00007fffeda6da35 in xdebug_execute_internal (current_execute_data=0x7ffff4414650, return_value=0x7fffffff9cf0) at /usr/local/src/xdebug-2.7.0RC2/xdebug.c:2026
#3  0x00000000006c4388 in zend_call_function ()
#4  0x00007fffe401c751 in zephir_call_user_function () from EXT/phalcon.so
#5  0x00007fffe401d04a in zephir_call_class_method_aparams.part.35 () from EXT/phalcon.so
#6  0x00007fffe417eac3 in zim_Phalcon_Di_get () from EXT/phalcon.so
#7  0x00007fffeda6da35 in xdebug_execute_internal (current_execute_data=0x7ffff44145d0, return_value=0x7fffffffa2c0) at /usr/local/src/xdebug-2.7.0RC2/xdebug.c:2026
#8  0x00000000006c4388 in zend_call_function ()
#9  0x00007fffe401c751 in zephir_call_user_function () from EXT/phalcon.so
#10 0x00007fffe401d04a in zephir_call_class_method_aparams.part.35 () from EXT/phalcon.so
#11 0x00007fffe410996b in zim_Phalcon_Di_getShared () from EXT/phalcon.so
#12 0x00007fffeda6da35 in xdebug_execute_internal (current_execute_data=0x7ffff4414560, return_value=0x7fffffffa8e0) at /usr/local/src/xdebug-2.7.0RC2/xdebug.c:2026
#13 0x00000000006c4388 in zend_call_function ()
#14 0x00007fffe401c751 in zephir_call_user_function () from EXT/phalcon.so
#15 0x00007fffe401d04a in zephir_call_class_method_aparams.part.35 () from EXT/phalcon.so
#16 0x00007fffe4142d4a in zim_Phalcon_Cli_Console_handle () from EXT/phalcon.so
#17 0x00007fffeda6da35 in xdebug_execute_internal (current_execute_data=0x7ffff44144f0, return_value=0x7ffff44144d0) at /usr/local/src/xdebug-2.7.0RC2/xdebug.c:2026
#18 0x000000000074bc14 in ZEND_DO_FCALL_SPEC_HANDLER ()
#19 0x000000000070dffb in execute_ex ()
#20 0x00007fffeda6d079 in xdebug_execute_ex (execute_data=0x7ffff4414030) at /usr/local/src/xdebug-2.7.0RC2/xdebug.c:1904
#21 0x0000000000757767 in zend_execute ()
#22 0x00000000006d1b64 in zend_execute_scripts ()
---Type <return> to continue, or q <return> to quit---
#23 0x0000000000676260 in php_execute_script ()
#24 0x00000000007592ef in do_cli ()
#25 0x00000000004332df in main ()
(gdb) c
Continuing.

Breakpoint 3, 0x00000000006c4db0 in zend_fetch_class ()
(gdb)
Continuing.

Breakpoint 3, 0x00000000006c4db0 in zend_fetch_class ()
(gdb)
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00007fffe402928b in zephir_create_instance.constprop.180 () from EXT/phalcon.so
(gdb) bt
#0  0x00007fffe402928b in zephir_create_instance.constprop.180 () from EXT/phalcon.so
#1  0x00007fffe417e5a6 in zim_Phalcon_Di_get () from EXT/phalcon.so
#2  0x00007fffeda6da35 in xdebug_execute_internal (current_execute_data=0x7ffff4414630, return_value=0x7fffffff9650) at /usr/local/src/xdebug-2.7.0RC2/xdebug.c:2026
#3  0x00000000006c4388 in zend_call_function ()
#4  0x00007fffe401c751 in zephir_call_user_function () from EXT/phalcon.so
#5  0x00007fffe401d04a in zephir_call_class_method_aparams.part.35 () from EXT/phalcon.so
#6  0x00007fffe410996b in zim_Phalcon_Di_getShared () from EXT/phalcon.so
#7  0x00007fffeda6da35 in xdebug_execute_internal (current_execute_data=0x7ffff44145c0, return_value=0x7fffffff9d20) at /usr/local/src/xdebug-2.7.0RC2/xdebug.c:2026
#8  0x00000000006c4388 in zend_call_function ()
#9  0x00007fffe401c751 in zephir_call_user_function () from EXT/phalcon.so
#10 0x00007fffe401d04a in zephir_call_class_method_aparams.part.35 () from EXT/phalcon.so
#11 0x00007fffe42d9fd5 in zim_Phalcon_Dispatcher_dispatch () from EXT/phalcon.so
#12 0x00007fffeda6da35 in xdebug_execute_internal (current_execute_data=0x7ffff4414560, return_value=0x7fffffffa8c0) at /usr/local/src/xdebug-2.7.0RC2/xdebug.c:2026
#13 0x00000000006c4388 in zend_call_function ()
#14 0x00007fffe401c751 in zephir_call_user_function () from EXT/phalcon.so
#15 0x00007fffe401d04a in zephir_call_class_method_aparams.part.35 () from EXT/phalcon.so
.....
```

References

* 【Wikipedia】(http://wikipedia.moesalih.com/Coredump)
* 【Baidu Search R&D blog】(http://stblog.baidu-tech.com/p=1684)
* 【man7.org manual pages】
* 【signal(7)】(http://man7.org/linux/man-pages/man7/signal.7.html)
* 【core(5)】(http://man7.org/linux/man-pages/man5/core.5.html)
* 【backtrace(3)】(http://man7.org/linux/man-pages/man3/backtrace.3.html)
* 【strace】(https://www.ibm.com/developerworks/cn/linux/l-tsl/index.html)
* 【gdb】(https://blog.csdn.net/21cnbao/article/details/7385161)
* 【Linux】(https://zh.wikipedia.org/wiki/Linux)
* 【About sudo】(https://stackoverflow.com/questions/17483723/command-not-found-when-using-sudo-ulimit)

Special thanks

克虏伯 (internal alias) for reviewing the draft
泽光 (internal alias) for guidance
远明志 and 五花肉 (internal aliases) for their support
飞流 (internal alias) for helping with the investigation

If you have questions, feel free to discuss them in the comments.

A closing ad: the new home-renovation division is hiring. Java, PHP, FE, iOS, and Android engineers, send us your résumés; plenty of P6 to P8 openings. hanxinxin@ke.com

Author: 扶墙 (internal alias)
Producers: 义斋, 克虏伯 (internal aliases)
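As a small supplement to the signal discussion in 1.1: a process killed by a signal is reported by its parent shell with exit status 128 plus the signal number, which is how a crashed job can be recognized in scripts and cron wrappers. The sketch below is only an illustration under stated assumptions: the helper name status_after_signal is hypothetical, the child is a throwaway sh process that signals itself, and the numbers 139 and 134 assume Linux signal numbering (SIGSEGV = 11, SIGABRT = 6).

```shell
#!/bin/sh
# Hypothetical helper (not from the article): start a child that kills
# itself with the named signal, then print the exit status the parent sees.
status_after_signal() {
    rc=0
    # ulimit -c 0 in the subshell keeps this demo from writing a core file.
    # The inner sh receives the signal name as $1 and signals its own pid.
    ( ulimit -c 0; exec sh -c 'kill -s "$1" "$$"' sh "$1" ) >/dev/null 2>&1 || rc=$?
    echo "$rc"
}

status_after_signal SEGV   # 128 + 11 on Linux
status_after_signal ABRT   # 128 + 6 on Linux
```

A status of 139 in a job log is therefore a quick hint that the process died from SIGSEGV, matching the "exited on signal 11" lines in the php-fpm log.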