在Oracle中,如何修复由于主库NOLOGGING引起的备库ORA-01578和ORA-26040错误?
众所周知,DG数据同步是基于日志流的,这也是为什么在配置DG阶段需要将主库设置为FORCE LOGGING的原因。但是,这也会带来很多问题,例如,会导致DML类型的SQL执行效率变慢,尤其在大批量数据更新或导入的时候显得尤为明显。DBA在使用数据泵进行迁移时希望在最少停机时间内完成,这时候就可能会考虑到以最小日志导入的方式以加快导入速度,然后重新同步备库。在这些场景中,DBA可能会使用NOLOGGING操作去节省大量数据插入的时间,而这种操作所带来的问题就是,如果该库在有备库的情况下,因为主库的NOLOGGING插入操作不会生成Redo,所以不会在备库上传输和应用,这会导致备库的数据出现问题,报ORA-01578和ORA-26040的错误。
在一个具有主备关系的主库上将FORCE_LOGGING设置为NOLOGGING模式,然后创建一张表LHR.TESTDGNOLOG,设置为NOLOGGING模式:
1SQL> ALTER DATABASE NO FORCE LOGGING; 2SQL> CREATE TABLE LHR.TESTDGNOLOG TABLESPACE USERS PCTFREE 99 AS SELECT ROWNUM N FROM XMLTABLE('1 TO 100'); 3SQL> ALTER TABLE LHR.TESTDGNOLOG NOLOGGING;
之后使用/* +append*/插入数据并提交:
1SQL> INSERT /*+ APPEND */ INTO LHR.TESTDGNOLOG SELECT ROWNUM N FROM XMLTABLE('1 TO 1000'); 2SQL> COMMIT
这时候在备库对该表进行查询会看到如下报错信息:
1SQL>SELECT COUNT(1) FROM LHR.TESTDGNOLOG; 2SELECT COUNT(1) FROM LHR.TESTDGNOLOG 3 * 4ERROR at line 1: 5ORA-01578: ORACLE data block corrupted (file # 4, block # 819) 6ORA-01110: data file 4: '/data/data1/ORCL2/datafile/o1_mf_users_3ft1e9qb_.dbf' 7ORA-26040: Data block was loaded using the NOLOGGING option
对于这种情况,在Oracle的不同版本中有不同的处理办法。
(一)Oracle 11g
在Oracle 11g中,如果遇到这样的问题,可以通过将包含缺少数据的数据文件从主库复制到物理备库再重命名数据文件来解决问题。
1、查询主库
1SQL> SELECT NAME, UNRECOVERABLE_CHANGE# FROM V$DATAFILE; 2NAME UNRECOVERABLE_CHANGE# 3-------------------------------------------- --------------------- 4+DATADG/orcl/datafile/system.270.972381717 0 5+DATADG/orcl/datafile/sysaux.265.972381717 0 6+DATADG/orcl/datafile/undotbs1.261.972381717 0 7+DATADG/orcl/datafile/users.259.972381717 6252054
2、查询备库
1sys@ORCLDG>SELECT NAME, UNRECOVERABLE_CHANGE# FROM V$DATAFILE; 2NAME UNRECOVERABLE_CHANGE# 3--------------------------------------------------------- --------------------- 4/data/data1/ORCLDG/datafile/o1_mf_system_3dt1e9op_.dbf 0 5/data/data1/ORCLDG/datafile/o1_mf_sysaux_3ct1e9nb_.dbf 0 6/data/data1/ORCLDG/datafile/o1_mf_undotbs1_3gt1e9qq_.dbf 0 7/data/data1/ORCLDG/datafile/o1_mf_users_3ft1e9qb_.dbf 5383754
3、比较主数据库和备用数据库的查询结果
在以上两个查询结果中,比较UNRECOVERABLE_CHANGE#列的值。如果主库中UNRECOVERABLE_CHANGE#列的值大于备库中的同一列,那么需要将这些数据文件在备库恢复。
将主库对应的数据文件拷贝至备库:
1SQL> ALTER TABLESPACE USERS BEGIN BACKUP; 2SQL> EXIT 3ASMCMD>cp +DATADG/orcl/datafile/users.259.972381717 /tmp 4$ scp /tmp/users.259.972381717 10.10.10.123:/data/data1/ORCL2/datafile/ 5SQL> ALTER TABLESPACE USERS END BACKUP;
在备库上,将旧的数据文件RENAME至新的数据文件:
1SQL> STARTUP MOUNT FORCE 2SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 3SQL> ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=MANUAL; #在备库执行RENAME操作时,需要此参数为MANUAL 4SQL> ALTER DATABASE RENAME FILE '/data/data1/ORCLDG/datafile/o1_mf_users_3ft1e9qb_.dbf' TO '/data/data1/ORCLDG/datafile/users.259.972381717'; 5SQL> ALTER SYSTEM SET STANDBY_FILE_MANAGEMENT=AUTO; 6SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE USING CURRENT LOGFILE DISCONNECT FROM SESSION;
之后就可以在备库查询到实例表LHR.TESTDGNOLOG:
1SQL> SELECT COUNT(1) FROM LHR.TESTDGNOLOG; 2 COUNT(1) 3---------- 4 1100
(二)Oracle 12.1
对于这种情况,在Oracle 12.1版本中,RMAN提供了一种便捷的方式让DBA不再需要在主库上进行数据文件的备份传输而可以直接在备库使用restore database (or datafile ) from service进行恢复。
当然,如果数据文件是正常的状态,RMAN可以根据它们的数据文件头进行跳跃恢复。如果,由于NOLOGGING操作导致某些块被标记为损坏的,那么这部分数据文件就是需要恢复的。在恢复命令中有FORCE选项。因为有些时候数据文件是同步的,实时日志应用进程还是在运行的。这个时候,为了恢复,需要停止应用。一旦停止了应用,那么就不需要执行RESOTORE DATABASE FORCE操作,因为现在数据文件的状态是过旧的,就算不加FORCE选项RMAN也是不会跳过这些数据文件的。
备库关掉实时日志应用,并重启至MOUNT状态:
1SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL; 2SQL> SHUTDOWN IMMEDIATE 3Database closed. 4Database dismounted. 5ORACLE instance shut down. 6SQL> STARTUP MOUNT 7ORACLE instance started
备库登陆RMAN,使用restore database (or datafile ) from service进行恢复:
1RMAN> RESTORE DATABASE FROM SERVICE 'primary_db'; #这里的primary_db为备库至主库的TNS连接串的别名 2Starting restore at 2018-07-03 17:00:35 3using target database control file instead of recovery catalog 4allocated channel: ORA_DISK_1 5channel ORA_DISK_1: SID=29 device type=DISK 6channel ORA_DISK_1: starting datafile backup set restore 7channel ORA_DISK_1: using network backup set from service primary_db 8channel ORA_DISK_1: specifying datafile(s) to restore from backup set 9channel ORA_DISK_1: restoring datafile 00001 to /data/data1/ORCLDG/datafile/o1_mf_system_02t1t9ck_.dbf 10channel ORA_DISK_1: restore complete, elapsed time: 00:00:25 11channel ORA_DISK_1: starting datafile backup set restore 12channel ORA_DISK_1: using network backup set from service primary_db 13channel ORA_DISK_1: specifying datafile(s) to restore from backup set 14channel ORA_DISK_1: restoring datafile 00003 to /data/data1/ORCLDG/datafile/o1_mf_sysaux_03t1t9d3_.dbf 15channel ORA_DISK_1: restore complete, elapsed time: 00:00:15 16channel ORA_DISK_1: starting datafile backup set restore 17channel ORA_DISK_1: using network backup set from service primary_db 18channel ORA_DISK_1: specifying datafile(s) to restore from backup set 19channel ORA_DISK_1: restoring datafile 00004 to /data/data1/ORCLDG/datafile/o1_mf_undotbs1_04t1t9di_.dbf 20channel ORA_DISK_1: restore complete, elapsed time: 00:00:02 21channel ORA_DISK_1: starting datafile backup set restore 22channel ORA_DISK_1: using network backup set from service primary_db 23channel ORA_DISK_1: specifying datafile(s) to restore from backup set 24channel ORA_DISK_1: restoring datafile 00006 to /data/data1/ORCLDG/datafile/o1_mf_users_05t1t9dm_.dbf 25channel ORA_DISK_1: restore complete, elapsed time: 00:00:15 26Finished restore at 2018-07-03 17:01:34
当然要记得去起库并开启实时日志应用进程。以上恢复过程也可以直接恢复相关数据文件即可:
1RMAN> RESTORE DATAFILE 7 FROM SERVICE 'LHR122'; 2 3Starting restore at 2018-07-20 09:39:28 4using target database control file instead of recovery catalog 5allocated channel: ORA_DISK_1 6channel ORA_DISK_1: SID=24 device type=DISK 7 8channel ORA_DISK_1: starting datafile backup set restore 9channel ORA_DISK_1: using network backup set from service lhr122 10channel ORA_DISK_1: specifying datafile(s) to restore from backup set 11channel ORA_DISK_1: restoring datafile 00007 to /u04/oradata/lhr122dg/LHR122DG/datafile/users01_bk.dbf 12channel ORA_DISK_1: restore complete, elapsed time: 00:00:03 13Finished restore at 2018-07-20 09:39:32 14 15RMAN> alter database open; 16 17Statement processed
(三)Oracle 12.2
在Oracle 12.2中,Oracle提供了一种更方便的方式去进行恢复主库会将未记录的块的列表发送至备库,并记录在备库控制文件中,DBA可以从备库的V$NONLOGGED_BLOCK这个视图查看到相关信息。不需要发送主库的整个数据文件,而是在RMAN执行一个简单的命令来恢复它们:
1RECOVER DATABASE NONLOGGED BLOCK
首先,在备库停止实时日志应用:
1SQL> ALTER DATABASE RECOVER MANAGED STANDBY DATABASE CANCEL;
备库登陆RMAN执行:
1RECOVER DATABASE NONLOGGED BLOCK
注意:执行此步骤前请确认主备库的LOG_ARCHIVE_CONFIG参数已经设置:
1RMAN> Recover Database Nonlogged Block; 2Starting recover at 2018-07-03 14:54:22 3using target database control file instead of recovery catalog 4allocated channel: ORA_DISK_1 5channel ORA_DISK_1: SID=56 device type=DISK 6starting recovery of nonlogged blocks 7List of Datafiles 8================= 9File Status Nonlogged Blocks Blocks Examined Blocks Skipped 10---- ------ ---------------- --------------- -------------- 11File Status Nonlogged Blocks Blocks Examined Blocks Skipped 12---- ------ ---------------- --------------- -------------- 131 OK 0 0 102399 143 OK 0 0 63999 154 OK 0 0 8959 167 OK 0 3403 2836 17Details of nonlogged blocks can be queried from v$nonlogged_block view 18recovery of nonlogged blocks complete, elapsed time: 00:00:08 19Finished recover at 2018-07-03 14:54:32
恢复完成后,V$NONLOGGED_BLOCK视图中不再有数据。最后别忘了开启实时日志应用进程。
综上来看,在Oracle 12.2中这个特性在数据仓库等一些场景是可以尝试的。以往DBA开启FORCE_LOGGING造成大量的Redo日志并且影响一部分DML语句的执行效率。在Oracle 12.2中可以尝试使用NOLOGGING操作去节省大量数据插入的时间,然后在系统空闲时间进行备库恢复操作。但是,这种操作也存在弊端,因为备库的可用性就大大降低了。