龙空技术网

云贝教育 |「技术文章」关闭OGG进程失败提示有长事务解决方案

云贝教育 94

前言:

而今看官们对“oracleogg112安装”大致比较重视,大家都需要知道一些“oracleogg112安装”的相关文章。那么小编同时在网络上汇集了一些对于“oracleogg112安装””的相关知识,希望看官们能喜欢,同学们一起来了解一下吧!

作者:炎燚小寳

原文链接:

一、事件概述

客户反馈OGG进程无法停止,提示有大事务在运行,无法停止,最终定位到异常会话的游标没有正常关闭导致

二、故障分析

(一)故障现象

数据库2节点OGG的EXT进程无法停止。

(二)故障分析

1. 查看OGG日志。

GGSCI (xx2) 8> stop extract EJ1_FPSending STOP request to EXTRACT EJ1_FP ...STOP request pending. There are open, long-running transactions.Before you stop Extract, make the archives containing data for those transactions available for when Extract restarts.To force Extract to stop, use the SEND EXTRACT EJ1_FP, FORCESTOP command.Oldest redo log file necessary to restart Extract is:Redo Log Sequence Number 616, RBA 135673360.2022-10-21 10:55:41 WARNING OGG-01742 Command sent to EXTRACT EJ1_FP returned with an invalid response.

2. 查看当前的长事务。

GGSCI (xx2) 8> send extract EJ1_FP,showtrans duration 20 MINSending showtrans request to EXTRACT EJ1_FP ...Oldest redo log files necessary to restart Extract are:Redo Thread 1, Redo Log Sequence Number 47308, SCN 4585.1856782301 (19694281834461), RBA 575467024Redo Thread 2, Redo Log Sequence Number 70563, SCN 4585.1916741249 (19694341793409), RBA 310660624------------------------------------------------------------XID: 214.19.198565Items: 0Extract: EJ1_FPRedo Thread: 1Start Time: 2022-10-13:09:40:20SCN: 4585.1856782301 (19694281834461)Redo Seq: 47308Redo RBA: 575467024Status: Running------------------------------------------------------------XID: 234.31.119068Items: 0Extract: EJ1_FPRedo Thread: 1Start Time: 2022-10-13:10:45:36SCN: 4585.1858630286 (19694283682446)Redo Seq: 47378Redo RBA: 817386000Status: Running------------------------------------------------------------XID: 6.19.504269Items: 0Extract: EJ1_FPRedo Thread: 1Start Time: 2022-10-18:13:22:51SCN: 4585.1922052463 (19694347104623)Redo Seq: 47529Redo RBA: 246746640Status: Running------------------------------------------------------------XID: 54.10.981207Items: 0Extract: EJ1_FPRedo Thread: 2Start Time: 2022-10-18:08:19:13SCN: 4585.1916741249 (19694341793409)Redo Seq: 70563Redo RBA: 310660624Status: Running

3. 通过UNDO事务定位到SQL源头

select inst_id,saddr,sid,serial#,paddr,username,sql_id,status from gv$session where taddr in (select addr from gv$transaction where xidusn in (54,6,234,214))登录用户为sys,程序类型为sqlplus发起SQLID:g72kdvcacxvtfDELETE FROM PLAN_TABLE WHERE STATEMENT_ID=:1SQLID:9babjv8yq8ru3BEGIN DBMS_OUTPUT.GET_LINES(:LINES, :NUMLINES); END;

4. 综合以上日志初步分析:ORACLE在执行DELETE FROM PLAN_TABLE WHERE STATEMENT_ID=:1一直未提交,导致长事务产生,进一步影响OGG进程关闭。

三、故障定位

根据以上日志报告分析,ORACLE内部递归调用的删除SQL一直未提交,产生长事务,为此次故障的根本原因。

四、故障解决措施及建议

(一)解决措施

如果我们单纯为了解决问题,KILL对应的长事务即可。但如果想知道的问题根源,该怎么办?

五、模拟重现长事务等待

SQL中涉及的目标表是PLAN_TABLE,跟执行计划有关,同时源头用户是SYS且程序为sqlplus,是不是有会话通过操作系统认证登录数据库查看执行计划了?而DBMS_OUTPUT.GET_LINES函数只是输出执行计划附带的而已。

接下来进行模拟测试

---session1---sqlplus / as sysdba登录数据库,生成一条SQL的执行计划,不退出会话。此操作用10046记录。-bash-4.1$ sqlplus / as sysdbaSQL*Plus: Release 11.2.0.4.0 Production on Sat Oct 15 06:15:39 2022Copyright (c) 1982, 2013, Oracle. All rights reserved.Connected to:Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit ProductionWith the Partitioning, OLAP, Data Mining and Real Application Testing optionsSQL> oradebug setmypidalter session set events '10046 trace name context forever ,level 12';Statement processed.SQL>Session altered.SQL> set autot traceSQL> select 1 from t1;Execution Plan----------------------------------------------------------Plan hash value: 2990879143-----------------------------------------------------------------------| Id | Operation | Name | Rows | Cost (%CPU)| Time |-----------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | 1 (0)| 00:00:01 || 1 | INDEX FULL SCAN | IDX_IDNAME | 1 | 1 (0)| 00:00:01 |-----------------------------------------------------------------------Note------ dynamic sampling used for this statement (level=2)Statistics----------------------------------------------------------0 recursive calls0 db block gets1 consistent gets0 physical reads0 redo size519 bytes sent via SQL*Net to client524 bytes received via SQL*Net from client2 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)1 rows processedSQL> alter session set events '10046 trace name context off';Session altered.SQL> oradebug close_traceStatement processed.SQL> oradebug tracefile_name/u01/app/oracle/diag/rdbms/d11gr4/d11gr4/trace/d11gr4_ora_14810.trc---session2---查看是否有INACTIVE的事务SQL> SELECT ktuxesiz FROM x$ktuxe WHERE ktuxesta != 'INACTIVE';KTUXESIZ----------2SQL> select addr,xidusn,xidslot,xidsqn,status from v$transaction ;ADDR XIDUSN XIDSLOT XIDSQN STATUS---------------- ---------- ---------- ---------- ----------------00000000E6B17E38 9 24 1605 ACTIVESQL> select sid,username,sql_id,PREV_SQL_ID,sql_hash_value from v$session where taddr='00000000E6B17E38';SID USERNAME SQL_ID PREV_SQL_ID SQL_HASH_VALUE---------- ------------------------------ ------------- ------------- --------------152 SYS g72kdvcacxvtf 0SQL> select sql_fulltext from v$sql t where t.sql_id='g72kdvcacxvtf';SQL_FULLTEXT--------------------------------------------------------------------------------DELETE FROM PLAN_TABLE WHERE STATEMENT_ID=:1DELETE FROM PLAN_TABLE WHERE STATEMENT_ID=:1DELETE FROM PLAN_TABLE WHERE STATEMENT_ID=:1--session 3--查看10046日志,果然找到上述SQLPARSING IN CURSOR #18446604434621518272 len=45 dep=0 uid=0 oct=7 lid=0 tim=237421908196 hv=349105966 ad='e11266d0' sqlid='g72kdvcacxvtf'DELETE FROM PLAN_TABLE WHERE STATEMENT_ID=:1END OF STMTPARSE #18446604434621518272:c=45,e=45,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=2637181423,tim=237421908195BINDS #18446604434621518272:Bind#0oacdty=01 mxl=128(42) mxlc=00 mal=00 scl=00 pre=00oacflg=03 fl2=1000000 frm=01 csi=873 siz=128 off=0kxsbbbfp=ffff80ffbdc5d8b0 bln=128 avl=14 flg=05value="PLUS4294967295"

从以上测试结果可以验证,就是用户通过操作系统认证以sys登录数据库,查看执行计划,未关闭会话导致的长事务,如何避免该问题咧?查看完执行计划之后关闭会话即可,这样会话对应的游标会自动提交,有兴趣的读者可以验证此点。

六、处理结果

通过以上处理,长事务己消失,OGG进程可以正常关闭。

GGSCI (XX2) 9> send extract EJ1_FP,showtrans duration 20 MINSending showtrans request to EXTRACT EJ1_FP ...No transactions foundOldest redo log files necessary to restart Extract are:Redo Thread 1, Redo Log Sequence Number 47308, SCN 4585.1856782301 (19694281834461), RBA 575467024Redo Thread 2, Redo Log Sequence Number 70563, SCN 4585.1916741249 (19694341793409), RBA 310660624. 

标签: #oracleogg112安装