UPDATE: What Worked…
So after reading the link from @Raj and reading @ora-600’s answer I tried to validate the database with the RMAN command
backup check logical validate database;. While this worked fine, it was also clear that it was not looking at everything that the
ANALYZE INDEX command would.
After trying many different variations, I finally discovered that this command would work:
ANALYZE TABLE .. VALIDATE STRUCTURE CASCADE offline;
Yep, just switching to OFFLINE appears to fix it. And that eventually led me to this bug# 5752105 on this page: http://www.eygle.com/case/10204_buglist.htm. I am not in a position to prove it right now (cannot apply any patches for the time being), but I strongly suspect that this is what I was running into.
So while the question is not fully answered, I am going to mark @ora-600’s very helpful answer as correct so that he can collect Paul White’s very generous bounty.
I think the article Raj quoted (https://www.pythian.com/blog/analyze-index-validate-structure-dark-side/) describes this pretty well. “clustering factor” was also my first guess while reading the description of your problem. I also prefer to use RMAN to check for corruption.
RMAN> backup check logical validate database;
Afterwards you can query
V$DATABASE_BLOCK_CORRUPTION for details on any corrupt block.
Normally there is a reason for huge clustering. You can check the following details:
- How many rows does the index have?
- How many blocks does it consume?
This may help you to determine if the clustering_factor is high. You can also query
dba_extents to determine the clustering_factor.
- What type of management does the tablespace use? I guess it’s not ASSM since the database is pretty old and was upgraded several times.
Why is clustering bad for performance? When Oracle reads data from disk or cache it always reads in block. If the block is half empty you loose 50% of your reading performance. In case you do a full table or full index scan Oracle scans all blocks belonging to the segment (index or table). It does not check if a block is empty or not. Oracle reads from the first block to the last (HWM). If your index has 10mil blocks but it only needs 1mil blocks Oracle reads 9mil trash blocks.
ASSM (Automatic Segment Space Management) helps a lot to reduce/prevent clustering. If possible you should migrate Manual Segment Space Managed Tbs to ASSM Tbs.
Why checking for corruption? I always start analyzing a database by gathering ASM or statspack reports from the production database (not a test/dev db). This does not hurt anyone and will give you a lot of details on the database and it’s little secrets. Corruption does usually not affect performance since the database does not try to repair it — it just crashes the current query.