龙空技术网

ORA-00600 「4194」 故障处理

数据和云智能 136

前言:

而今兄弟们对“oracle11g导入12cdmp”大致比较看重,咱们都需要学习一些“oracle11g导入12cdmp”的相关资讯。那么小编同时在网摘上网罗了一些关于“oracle11g导入12cdmp””的相关文章,希望姐妹们能喜欢,同学们快快来学习一下吧!

客户有一套测试库主机宕机,主机启动后,数据库启动报ORA-00600 [4194],本文介绍处理过程。

1. 问题现象

启动数据库,数据进行实例恢复完成后,能成功open,但数据库在几分钟后自动宕掉,通过alert日志,可以看到宕库前报错是ORA-00600 arguments: [4194]。

alter database openBeginning crash recovery of 1 threads parallel recovery started with 3 processesStarted redo scan     -----> 扫描redo,进行实例恢复Completed redo scan read 149 KB redo, 107 data blocks need recoveryStarted redo application at Thread 1: logseq 84918, block 3Recovery of Online Redo Log: Thread 1 Group 3 Seq 84918 Reading mem 0  Mem# 0: /data/orauser01/app/oradata/orcl/redo03.logCompleted redo application of 0.12MB      -----> 进行实例恢复完成Completed crash recovery at Thread 1: logseq 84918, block 302, scn 13343268733370 107 data blocks read, 107 data blocks written, 149 redo k-bytes readWed Mar 24 18:10:56 2021Thread 1 advanced to log sequence 84919 (thread open)Thread 1 opened at log sequence 84919  Current log# 1 seq# 84919 mem# 0: /data/orauser01/app/oradata/orcl/redo01.logSuccessful open of redo thread 1MTTR advisory is disabled because FAST_START_MTTR_TARGET is not setWed Mar 24 18:10:56 2021SMON: enabling cache recovery[3151] Successfully onlined Undo Tablespace 2.Undo initialization finished serial:0 start:30524 end:30584 diff:60 (0 seconds)Verifying file header compatibility for 11g tablespace encryption..Verifying 11g file header compatibility for tablespace encryption completedSMON: enabling tx recoveryDatabase Characterset is ZHS16GBKNo Resource Manager plan activereplication_dependency_tracking turned off (no async multimaster replication found)Starting background process QMNCWed Mar 24 18:10:56 2021QMNC started with pid=23, OS id=3405 Completed: alter database open                  -----> open成功Wed Mar 24 18:10:57 2021db_recovery_file_dest_size of 4182 MB is 0.00% used. This is auser-specified limit on the amount of space that will be used by thisdatabase for recovery-related files, and does not reflect the amount ofspace available in the underlying filesystem or ASM diskgroup.Wed Mar 24 18:10:57 2021Errors in file /data/orauser01/app/oracle/diag/rdbms/orcl/chnldev/trace/chnldev_m001_3409.trc  (incident=1905820):ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []Incident details in: /data/orauser01/app/oracle/diag/rdbms/orcl/chnldev/incident/incdir_1905820/chnldev_m001_3409_i1905820.trcUse ADRCI or Support Workbench to package the incident.See Note 411.1 at My Oracle Support for error and packaging details.Wed Mar 24 18:10:57 2021Starting background process CJQ0Wed Mar 24 18:10:57 2021CJQ0 started with pid=26, OS id=3422 Wed Mar 24 18:10:59 2021Dumping diagnostic data in directory=[cdmp_20210324181059], requested by (instance=1, osid=3409 (M001)), summary=[incident=1905820].Block recovery from logseq 84919, block 115 to scn 13343268734123Recovery of Online Redo Log: Thread 1 Group 1 Seq 84919 Reading mem 0  Mem# 0: /data/orauser01/app/oradata/orcl/redo01.logBlock recovery completed at rba 84919.3699.16, scn 3106.3100312748Block recovery from logseq 84919, block 60 to scn 13343268733483Recovery of Online Redo Log: Thread 1 Group 1 Seq 84919 Reading mem 0  Mem# 0: /data/orauser01/app/oradata/orcl/redo01.logBlock recovery completed at rba 84919.119.16, scn 3106.3100312115Errors in file /data/orauser01/app/orcle/diag/rdbms/orcl/chnldev/trace/chnldev_m001_3409.trc  (incident=1905821):ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []Incident details in: /data/orauser01/app/oracle/diag/rdbms/orcl/chnldev/incident/incdir_1905821/chnldev_m001_3409_i1905821.trcWed Mar 24 18:11:00 2021Errors in file /data/orauser01/app/oracle/diag/rdbms/orcl/chnldev/trace/chnldev_j001_3429.trc  (incident=1905844):ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []Incident details in: /data/orauser01/app/oracle/diag/rdbms/orcl/chnldev/incident/incdir_1905844/chnldev_j001_3429_i1905844.trcUse ADRCI or Support Workbench to package the incident.See Note 411.1 at My Oracle Support for error and packaging details.Wed Mar 24 18:11:01 2021Sweep [inc][1905844]: completedSweep [inc][1905821]: completedSweep [inc][1905820]: completedSweep [inc2][1905820]: completedUse ADRCI or Support Workbench to package the incident.See Note 411.1 at My Oracle Support for error and packaging details.Errors in file /data/orauser01/app/oracle/diag/rdbms/orcl/chnldev/trace/chnldev_m001_3409.trc:ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []Dumping diagnostic data in directory=[cdmp_20210324181101], requested by (instance=1, osid=3409 (M001)), summary=[incident=1905821].Dumping diagnostic data in directory=[cdmp_20210324181103], requested by (instance=1, osid=3429 (J001)), summary=[incident=1905844].Block recovery from logseq 84919, block 115 to scn 13343268734123Recovery of Online Redo Log: Thread 1 Group 1 Seq 84919 Reading mem 0  Mem# 0: /data/orauser01/app/oradata/orcl/redo01.logBlock recovery completed at rba 84919.3699.16, scn 3106.3100312748Block recovery from logseq 84919, block 60 to scn 13343268734143Recovery of Online Redo Log: Thread 1 Group 1 Seq 84919 Reading mem 0  Mem# 0: /data/orauser01/app/oradata/orcl/redo01.logBlock recovery completed at rba 84919.3841.16, scn 3106.3100312768Errors in file /data/orauser01/app/oracle/diag/rdbms/orcl/chnldev/trace/chnldev_j001_3429.trc  (incident=1905845):ORA-00600: internal error code, arguments: [600], [ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []], [], [], [], [], [], [], [], [], [], []Incident details in: /data/orauser01/app/oracle/diag/rdbms/orcl/chnldev/incident/incdir_1905845/chnldev_j001_3429_i1905845.trcUse ADRCI or Support Workbench to package the incident.See Note 411.1 at My Oracle Support for error and packaging details.Errors in file /data/orauser01/app/oracle/diag/rdbms/orcl/chnldev/trace/chnldev_j001_3429.trc:ORA-00600: internal error code, arguments: [600], [ORA-00600: internal error code, arguments: [4194], [], [], [], [], [], [], [], [], [], [], []], [], [], [], [], [], [], [], [], [], []Dumping diagnostic data in directory=[cdmp_20210324181105], requested by (instance=1, osid=3429 (J001)), summary=[incident=1905845].
2. 问题分析

参考MOS文档:Step by step to resolve ORA-600 4194 4193 4197 on database crash (Doc ID 1428786.1),有该错误的介绍和处理方法。

The following error is occurring in the alert.log right before the database crashes.ORA-00600: internal error code, arguments: [4194], [#], [#], [], [], [], [], []This error indicates that a mismatch has been detected between redo records and rollback (undo) records.ARGUMENTS:Arg [a] - Maximum Undo record number in Undo blockArg [b] - Undo record number from Redo blockSince we are adding a new undo record to our undo block, we would expect that the new record number is equal to the maximum record number in the undo block plus one. Before Oracle can add a new undo record to the undo block it validates that this is correct. If this validation fails, then an ORA-600 [4194] will be triggered.

通过文档介绍,该错误是由重做记录与回滚记录不匹配引发。Oracle在验证Undo record number时,会对比redo change 和回滚段中的undo record number,若发现二者存在差异则报该4194错误。其错误argument[a][b],a代表回滚块中的最大undo record number,b代表重做日志中记录的undo record number。

这个问题通常发生在掉电或硬件故障导致数据库crash,在启动时,数据库执行正常的前滚(重做),然后回滚(撤销),这就是回滚时产生错误的地方。

3. 处理思路

通常最好的办法是通过备份进行恢复。

如果没有备份,那么可以通过特殊的初始化参数进行强制启动,然后做进一步处理。

我这里先按照Doc ID 1428786.1里提供的方法尝试处理。

4. 处理过程(1)启动数据库到nomount,创建pfile,方便添加参数

SYS@chnldev> startup nomountORACLE instance started.Total System Global Area  901971968 bytesFixed Size                  2258360 bytesVariable Size             843057736 bytesDatabase Buffers           50331648 bytesRedo Buffers                6324224 bytesSYS@chnldev> create pfile='/data/orauser01/pfile.ora' from spfile;create pfile='/data/orauser01/pfile.ora' from spfile*ERROR at line 1:ORA-01565: error in identifying file '?/dbs/spfile@.ora'ORA-27037: unable to obtain file statusLinux-x86_64 Error: 2: No such file or directoryAdditional information: 3

报错文件不存在,那么数据库应是用pfile启动的,确认一下。

SYS@chnldev> show parameter spfile;NAME                                 TYPE                   VALUE------------------------------------ ---------------------- ------------------------------spfile                               string

确认是通过pfile启动,那可以直接去$ORACLE_HOME/dbs下去修改pfile。

(2)修改pfile,添加如下参数

undo_management = manualevent = '10513 trace name context forever, level 2'
(3)关闭数据库,restrict模式启动
SYS@chnldev> shutdown immediateORA-01507: database not mountedORACLE instance shut down.SYS@chnldev> SYS@chnldev> startup restrictORACLE instance started.Total System Global Area  901971968 bytesFixed Size                  2258360 bytesVariable Size             843057736 bytesDatabase Buffers           50331648 bytesRedo Buffers                6324224 bytesDatabase mounted.Database opened.
(4)查询online的回滚段
SYS@chnldev> select tablespace_name, status, segment_name from dba_rollback_segs where status != 'OFFLINE';TABLESPACE_NAME                STATUS           SEGMENT_NAME------------------------------ ---------------- ------------------------------SYSTEM                         ONLINE           SYSTEM

这里十分重要,我们这里所有的回滚段都是offlien状态(system回滚段是永久online的)。

如果有online的非system回滚段,那么处理过程会更加复杂。

(5)重建undo表空间,并保留现有undo_tablespace参数

SYS@chnldev> create undo tablespace new_undotbs datafile '/data/orauser01/app/oradata/orcl/new_undotbs01.dbf' size 2000M;Tablespace created.SYS@chnldev> show parameter undoNAME                                 TYPE        VALUE------------------------------------ ----------- ------------------------------undo_management                      string      MANUALundo_retention                       integer     900undo_tablespace                      string      UNDOTBS1
(6)修改undo_tablespace参数,并重启数据库

修改pfile,将undo_tablespace参数修改为new_undotbs,然后重启数据库。

SYS@chnldev> shutdown immediate;Database closed.Database dismounted.ORACLE instance shut down.SYS@chnldev> SYS@chnldev> startupORACLE instance started.Total System Global Area  901971968 bytesFixed Size                  2258360 bytesVariable Size             843057736 bytesDatabase Buffers           50331648 bytesRedo Buffers                6324224 bytesDatabase mounted.Database opened.
(7)观察alert日志,没有再报错
Thu Mar 25 10:15:19 2021ALTER DATABASE   MOUNTSuccessful mount of redo thread 1, with mount id 1595909047Database mounted in Exclusive ModeLost write protection disabledCompleted: ALTER DATABASE   MOUNTThu Mar 25 10:15:23 2021ALTER DATABASE OPENThread 1 opened at log sequence 84920  Current log# 2 seq# 84920 mem# 0: /data/orauser01/app/oradata/orcl/redo02.logSuccessful open of redo thread 1MTTR advisory is disabled because FAST_START_MTTR_TARGET is not setSMON: enabling cache recovery[27669] Successfully onlined Undo Tablespace 8.Undo initialization finished serial:0 start:57898154 end:57898234 diff:80 (0 seconds)Verifying file header compatibility for 11g tablespace encryption..Verifying 11g file header compatibility for tablespace encryption completedSMON: enabling tx recoveryDatabase Characterset is ZHS16GBKNo Resource Manager plan activereplication_dependency_tracking turned off (no async multimaster replication found)Starting background process QMNCThu Mar 25 10:15:24 2021QMNC started with pid=20, OS id=27672 Completed: ALTER DATABASE OPENThu Mar 25 10:15:25 2021db_recovery_file_dest_size of 4182 MB is 0.00% used. This is auser-specified limit on the amount of space that will be used by thisdatabase for recovery-related files, and does not reflect the amount ofspace available in the underlying filesystem or ASM diskgroup.Starting background process CJQ0Thu Mar 25 10:15:26 2021CJQ0 started with pid=22, OS id=27688
(8)取消添加的参数,重启数据库,删除原undo表空间

SYS@chnldev> drop tablespace UNDOTBS1 including contents and datafiles;

Tablespace dropped.

墨天轮原文链接:(复制到浏览器或者点击“了解更多”立即查看)

关于作者

周玉其,云和恩墨服务总监,从业10年,一直专注于Oracle的开发及运维工作。曾参与大型BI项目的开发,并担任项目经理。近年在电力、政府、物流、运营商等行业从事DBA工作,并作为负责人和主要实施人,参与过多个项目的交付,积累了丰富的实战经验。

标签: #oracle11g导入12cdmp