环境:oracle rac 11g,node1和node2,java程序直接连的node1
现象:业务系统流程处理中,提交时失败,查看后台日志报错:
ORA-01115: 从文件 读取块时出现 IO 错误 (块 # ) ORA-01110: 数据文件 6: '+DATA/orcl/cms.dbf' ORA-15081: 无法将 I/O 操作提交到磁盘 ORA-27072: 文件 I/O 错误 Linux-x86_64 Error: 5: Input/output error Additional information: 4 Additional information: 43291184 Additional information: -1 01115. 00000 - "IO error reading block from file %s (block # %s)" *Cause: Device on which the file resides is probably offline *Action: Restore access to the device
查看磁盘,发现根磁盘已100%使用,于是清理一部分根磁盘没用的文件,问题依旧。
DBV检查数据文件:
dbv file='+DATA/orcl/cms.dbf' userid=grid/grid
直接报IO错误,基本上磁盘硬件出现了问题
通过java报错定位到对应是业务表处理报错,通过select count(*) from 流程表 就会如上的错误,刚开始怀疑是磁盘满导致的,最后发现是硬盘故障,磁盘阵列有几块盘坏了。
工程师换完硬盘后并对硬盘逻辑坏块做了修复,dbv后能列出具体坏块了:
DBVERIFY - Verification complete Total Pages Examined : 4187648 Total Pages Processed (Data) : 1665462 Total Pages Failing (Data) : 1 Total Pages Processed (Index): 1489155 Total Pages Failing (Index): 0 Total Pages Processed (Other): 938505 Total Pages Processed (Seg) : 0 Total Pages Failing (Seg) : 0 Total Pages Empty : 94487 Total Pages Marked Corrupt : 39 Total Pages Influx : 0 Total Pages Encrypted : 0 Highest block SCN : 0 (0.0)
通过rman统计哪些对象有坏块
RMAN>backup check logical validate datafile
最后通过sql查询坏块对应的对象列表
SELECT e.owner, e.segment_type, e.segment_name, e.partition_name, c.file# , greatest(e.block_id, c.block#) corr_start_block# , least(e.block_id+e.blocks-1, c.block#+c.blocks-1) corr_end_block# , least(e.block_id+e.blocks-1, c.block#+c.blocks-1) - greatest(e.block_id, c.block#) + 1 blocks_corrupted , null description FROM dba_extents e, v$database_block_corruption c WHERE e.file_id = c.file# AND e.block_id <= c.block# + c.blocks - 1 AND e.block_id + e.blocks - 1 >= c.block# UNION SELECT s.owner, s.segment_type, s.segment_name, s.partition_name, c.file# , header_block corr_start_block# , header_block corr_end_block# , 1 blocks_corrupted , 'Segment Header' description FROM dba_segments s, v$database_block_corruption c WHERE s.header_file = c.file# AND s.header_block between c.block# and c.block# + c.blocks - 1 UNION SELECT null owner, null segment_type, null segment_name, null partition_name, c.file# , greatest(f.block_id, c.block#) corr_start_block# , least(f.block_id+f.blocks-1, c.block#+c.blocks-1) corr_end_block# , least(f.block_id+f.blocks-1, c.block#+c.blocks-1) - greatest(f.block_id, c.block#) + 1 blocks_corrupted , 'Free Block' description FROM dba_free_space f, v$database_block_corruption c WHERE f.file_id = c.file# AND f.block_id <= c.block# + c.blocks - 1 AND f.block_id + f.blocks - 1 >= c.block# ORDER BY file#, corr_start_block#;
输出结果为坏块的表、索引等内容
解决办法:
1.索引直接重建
2.业务表可以把可用的数据导出,然后重新建表还原数据;有备份直接从备份恢复