
Analyzing an operating system crash dump to find the cause of a Grid cluster reboot

2022-06-27 10:03:11

Tags: 27 crash operating-system 23 grid 2022 x86


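By default, Linux cannot analyze a kernel core file (vmcore). You first need to install the kernel debuginfo packages and the crash analysis tool.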

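Download the kernel debuginfo RPMs that match the running kernel from the following URL and install them, then install crash: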
https://oss.oracle.com/ol7/debuginfo/
kernel-uek-debuginfo-4.14.35-1902.3.2.el7uek.x86_64.rpm
kernel-uek-debuginfo-common-4.14.35-1902.3.2.el7uek.x86_64.rpm
yum install crash

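After installation, verify that the debuginfo packages match the running kernel and that crash is installed: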

[root@ht02 ~]# rpm -qa|grep kernel-uek-debuginfo
kernel-uek-debuginfo-common-4.14.35-1902.3.2.el7uek.x86_64
kernel-uek-debuginfo-4.14.35-1902.3.2.el7uek.x86_64
[root@ht02 ~]# uname -r
4.14.35-1902.3.2.el7uek.x86_64
[root@ht02 ~]# rpm -qa|grep crash
crash-7.2.3-10.el7.x86_64
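
The check above is done by eye. As a minimal sketch, the same comparison can be scripted. The helper name debuginfo_matches_kernel is hypothetical (it is not part of rpm or crash), and the sample values are the ones from the listing above.

```shell
#!/usr/bin/env bash
# Sketch: confirm that a kernel-uek-debuginfo package matches a kernel release.
# debuginfo_matches_kernel is a hypothetical helper name used for illustration.

# $1: kernel release, as printed by `uname -r`
# $2: newline-separated package names, as printed by `rpm -qa`
debuginfo_matches_kernel() {
    local release="$1" packages="$2"
    # -x matches the whole line, -F treats the package name as a fixed string,
    # so the "-common" package alone does not count as a match.
    printf '%s\n' "$packages" | grep -qxF "kernel-uek-debuginfo-${release}"
}

# Example using the values shown in the listing above
pkgs='kernel-uek-debuginfo-common-4.14.35-1902.3.2.el7uek.x86_64
kernel-uek-debuginfo-4.14.35-1902.3.2.el7uek.x86_64'

if debuginfo_matches_kernel '4.14.35-1902.3.2.el7uek.x86_64' "$pkgs"; then
    echo "debuginfo matches kernel"
else
    echo "debuginfo does NOT match kernel"
fi

# On a real host one might call:
#   debuginfo_matches_kernel "$(uname -r)" "$(rpm -qa)"
```

If the versions differ, crash generally cannot load the vmcore against that vmlinux, so this is worth checking before a dump is needed.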

 

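On 19c, set the REBOOT_OPTS attribute on the cssd and cssdmonitor resources so that the operating system generates a core file when Grid evicts the node or crashes.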

Enable crash dump

/u01/app/grid/bin/crsctl modify type ora.cssd.type -attr "ATTRIBUTE=REBOOT_OPTS, TYPE=string, DEFAULT_VALUE=,FLAGS=CONFIG" -init
/u01/app/grid/bin/crsctl modify type ora.cssdmonitor.type -attr "ATTRIBUTE=REBOOT_OPTS,TYPE=string, DEFAULT_VALUE=,FLAGS=CONFIG" -init
/u01/app/grid/bin/crsctl modify res ora.cssd -attr "REBOOT_OPTS=CRASHDUMP" -init
/u01/app/grid/bin/crsctl modify res ora.cssdmonitor -attr "REBOOT_OPTS=CRASHDUMP" -init


Disable crash dump

/u01/app/grid/bin/crsctl modify res ora.cssd -attr "REBOOT_OPTS=" -init
/u01/app/grid/bin/crsctl modify res ora.cssdmonitor -attr "REBOOT_OPTS=" -init
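
The enable and disable commands differ only in the value assigned to REBOOT_OPTS. The sketch below prints (does not run) the two `crsctl modify res` commands for a given value, so they can be reviewed before being executed as root. The function name reboot_opts_cmds and the GRID_BIN variable are hypothetical. The script covers only the `modify res` step and assumes the `modify type` commands shown earlier have already been applied.

```shell
#!/usr/bin/env bash
# Sketch: print the crsctl commands that set REBOOT_OPTS on both resources.
# reboot_opts_cmds and GRID_BIN are hypothetical names used for illustration.

# Grid home bin directory; the default is the path used in this article.
GRID_BIN="${GRID_BIN:-/u01/app/grid/bin}"

# $1: value for REBOOT_OPTS ("CRASHDUMP" to enable, empty string to disable)
reboot_opts_cmds() {
    local value="$1" res
    for res in ora.cssd ora.cssdmonitor; do
        printf '%s/crsctl modify res %s -attr "REBOOT_OPTS=%s" -init\n' \
            "$GRID_BIN" "$res" "$value"
    done
}

echo "# enable:"
reboot_opts_cmds CRASHDUMP
echo "# disable:"
reboot_opts_cmds ""
```

Printing the commands rather than executing them keeps the sketch safe to run on any machine; piping the output to a root shell on the cluster node is left as a deliberate manual step.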

 

To enable crash dump on 11g, see the MOS note "Pre-11.2: Using Diagwait as a diagnostic to get more information for diagnosing Oracle Clusterware Node evictions" (Doc ID 559365.1).
[+ASM1]@ht01[/home/grid]$crsctl get css diagwait
CRS-4678: Successful get diagwait 0 for Cluster Synchronization Services.
[root@ht01 ~]# /u01/app/grid/bin/crsctl set css diagwait 13
CRS-4684: Successful set of parameter diagwait to 13 for Cluster Synchronization Services.
[+ASM1]@ht01[/home/grid]$crsctl get css diagwait
CRS-4678: Successful get diagwait 13 for Cluster Synchronization Services
Disable crash dump on 11g
crsctl unset css diagwait -force
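
To check the diagwait setting from a script, the value can be pulled out of the CRS-4678 message shown above. This is a minimal sketch: parse_diagwait is a hypothetical helper, and it assumes the message format is exactly the one printed in this article.

```shell
#!/usr/bin/env bash
# Sketch: extract the diagwait value from `crsctl get css diagwait` output.
# parse_diagwait is a hypothetical helper; the message format is assumed to
# match the CRS-4678 line shown in this article.

# $1: one line of output from `crsctl get css diagwait`
parse_diagwait() {
    printf '%s\n' "$1" |
        sed -n 's/^CRS-4678: Successful get diagwait \([0-9][0-9]*\) for .*/\1/p'
}

# Example using the message from the transcript above
value="$(parse_diagwait 'CRS-4678: Successful get diagwait 13 for Cluster Synchronization Services.')"
echo "diagwait=$value"

# 13 is the value set in this article; 0 is what the cluster reported before the change.
if [ "$value" = 13 ]; then
    echo "diagwait is set to the value used in this article"
fi
```

On a cluster node the argument would come from the real command, for example `parse_diagwait "$(crsctl get css diagwait)"`.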

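Killing the ocssd.bin process causes cssdmonitor to reboot the operating system automatically. After the reboot, open the resulting vmcore with crash: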

[root@ht02 ~]# crash /lib/debug/lib/modules/4.14.35-1902.3.2.el7uek.x86_64/vmlinux /var/crash/127.0.0.1-2022-06-23-05:27:58/vmcore

crash 7.2.3-10.el7
Copyright (C) 2002-2017 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

WARNING: kernel relocated [752MB]: patching 90846 gdb minimal_symbol values

please wait... (patching 90846 gdb minimal_symbol values)
KERNEL: /lib/debug/lib/modules/4.14.35-1902.3.2.el7uek.x86_64/vmlinux
DUMPFILE: /var/crash/127.0.0.1-2022-06-23-05:27:58/vmcore [PARTIAL DUMP]
CPUS: 4
DATE: Thu Jun 23 17:27:50 2022
UPTIME: 00:10:25
LOAD AVERAGE: 1.51, 1.43, 0.84
TASKS: 769
NODENAME: ht02
RELEASE: 4.14.35-1902.3.2.el7uek.x86_64
VERSION: #2 SMP Tue Jul 30 03:59:02 GMT 2019
MACHINE: x86_64 (3194 Mhz)
MEMORY: 14.6 GB
PANIC: "sysrq: SysRq : Trigger a crash"
PID: 3405
COMMAND: "cssdmonitor"
TASK: ffff96f176ddaf80 [THREAD_INFO: ffff96f176ddaf80]
CPU: 1
STATE: TASK_RUNNING (SYSRQ)
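
The fields that matter in this banner are PANIC, PID and COMMAND: the panic was triggered through SysRq by the cssdmonitor process (PID 3405), which is consistent with REBOOT_OPTS=CRASHDUMP being in effect. As a rough sketch, those fields can be extracted from a saved copy of the banner. crash_field is a hypothetical helper, and the sample text is copied from the output above.

```shell
#!/usr/bin/env bash
# Sketch: pull a single field out of the crash startup banner.
# crash_field is a hypothetical helper used for illustration.

# $1: field name (e.g. PANIC, PID, COMMAND); banner text is read from stdin
crash_field() {
    # Strip optional leading whitespace, the field name and the colon,
    # then print what remains of the first matching line.
    sed -n "s/^[[:space:]]*$1:[[:space:]]*//p" | head -n 1
}

# Sample lines copied from the banner above
banner='NODENAME: ht02
PANIC: "sysrq: SysRq : Trigger a crash"
PID: 3405
COMMAND: "cssdmonitor"'

for field in NODENAME PANIC PID COMMAND; do
    printf '%s -> %s\n' "$field" "$(printf '%s\n' "$banner" | crash_field "$field")"
done
```

Inside the crash session itself, the `bt` and `log` commands show the backtrace of the panicking task and the kernel message buffer, which helps confirm the same conclusion.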

Check ohasd_orarootagent_root.trc

2022-06-23 17:27:50.559 : CSSCLNT:3548346112: clsssRecvMsgA: got a disconnect from the server while waiting for message type 27
2022-06-23 17:27:50.559 :GIPCXCPT:3548346112:  gipcInternalSend: connection not valid for send operation endp 0x7f78b40811b0 [00000000000006de] { gipcEndpoint : localAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=)(GIPCID=dd87820c-fc4df1b5-3382))', remoteAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_ht02_)(GIPCID=fc4df1b5-dd87820c-3433))', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 3433, readyRef (nil), ready 0, wobj 0x7f78b406c760, sendp (nil) status 0flags 0x2003861e, flags-2 0x0, usrFlags 0x20010 }, ret gipcretConnectionLost (12)
2022-06-23 17:27:50.559 :GIPCXCPT:3548346112:  gipcSendSyncF [clsssServerRPC_int : clsss.c : 8292]: EXCEPTION[ ret gipcretConnectionLost (12) ]  failed to send on endp 0x7f78b40811b0 [00000000000006de] { gipcEndpoint : localAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=)(GIPCID=dd87820c-fc4df1b5-3382))', remoteAddr 'clsc://(ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_ht02_)(GIPCID=fc4df1b5-dd87820c-3433))', numPend 0, numReady 0, numDone 0, numDead 0, numTransfer 0, objFlags 0x0, pidPeer 3433, readyRef (nil), ready 0, wobj 0x7f78b406c760, sendp (nil) status 0flags 0x2003861e, flags-2 0x0, usrFlags 0x20010 }, addr 0000000000000000, buf 0x7f78d37eb6f8, len 80, flags 0x8000000
2022-06-23 17:27:50.559 : CSSCLNT:3548346112: clsssServerRPC: send failed with err 12, msg type 7

2022-06-23 17:27:50.559 : CSSCLNT:3548346112: clsssCommonClientExit: RPC failure, rc 3

2022-06-23 17:27:50.559 : USRTHRD:4038760192: [     INFO]  clsnpoll_BlockMsg: lost connection with CSS
2022-06-23 17:27:50.559 : USRTHRD:4038760192: [     INFO]  clsnpoll_BlockMsg: calling sync
Trace file /u01/app/11.2.0/grid/diag/crs/ht02/crs/trace/ohasd_cssdmonitor_root.trc
Oracle Database 19c Clusterware Release 19.0.0.0.0 - Production
Version 19.3.0.0.0 Copyright 1996, 2019 Oracle. All rights reserved.
    CLSB:429202688: [     INFO] Argument count (argc) for this daemon is 1
    CLSB:429202688: [     INFO] Argument 0 is: /u01/app/grid/bin/cssdmonitor
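
The trace shows the CSS connection being lost at 17:27:50.559, the same second as the DATE reported in the crash banner (Thu Jun 23 17:27:50 2022). A minimal sketch for pulling such timestamps out of a trace file follows. css_disconnect_times is a hypothetical helper, and the two search strings are simply the messages seen in the excerpt above, not an exhaustive list.

```shell
#!/usr/bin/env bash
# Sketch: list the timestamps at which a trace file reports a lost CSS connection.
# css_disconnect_times is a hypothetical helper; the patterns are taken from
# the trace excerpt in this article.

# Trace content is read from stdin; prints unique "date time" pairs.
css_disconnect_times() {
    grep -E 'lost connection with CSS|got a disconnect from the server' |
        awk '{print $1, $2}' |
        sort -u
}

# Example using two lines from the excerpt above
printf '%s\n' \
  '2022-06-23 17:27:50.559 : CSSCLNT:3548346112: clsssRecvMsgA: got a disconnect from the server while waiting for message type 27' \
  '2022-06-23 17:27:50.559 : USRTHRD:4038760192: [     INFO]  clsnpoll_BlockMsg: lost connection with CSS' |
  css_disconnect_times
```

Against a real trace file this would be run as `css_disconnect_times < ohasd_orarootagent_root.trc`, and the result compared with the DATE line of the crash banner.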

  

Source: https://www.cnblogs.com/omsql/p/16415216.html
