In the previous entry regarding The CBO and Indexes, we saw how the CBO with a query that selected 5 distinct values in an IN list, representing 5% of the data, decided to use a FTS because the costs of doing so were less than that of using a corresponding index. These costs (using the I/O costing model) represented the number of expected I/Os and the FTS was basically going to perform fewer I/Os than the index. Less I/Os, less cost and so the FTS was selected as the preferred access path.
However, by default, the CBO when determining these costs via the I/O costing model makes two very important assumptions which may not necessarily be true.
Assumption one is that all I/Os are likely to be “physical I/Os” which all need to be costed and taken into account.
Assumption two is that all I/Os are costed equally, even though the size of a multiblock I/O performed typically during a FTS is larger and so potentially more costly than a single block I/O usually associated with an index access.
Today, I’m only going to focus on this second assumption.
Now, when performing and processing data from a multiblock I/O as performed during a FTS operation, it’s typical for such operations to be more resource intensive than that of a single block I/O as performed during an index range scan, as the associated overheads are likely be greater such as having to read more actual data off the disk, having to transfer more data into the SGA, having to process more data in each associated block, etc.
Therefore, not all I/Os are equal. However, by default the CBO ignores all these possible differences and costs all I/Os associated with a FTS (multiblock) and an index (single block) as being equivalent or the same.
Now, this hardly seems fair or indeed accurate and desirable when determining the true cost differences between an index and a FTS. Shouldn’t the fact that a single block I/O is likely to be less resource intensive and take less elapsed time to process be taken into consideration when determining these relative costs ?
Enter the optimizer_index_cost_adj parameter.
The purpose of this parameter is simply to “adjust” the corresponding costs associated with an index to (hopefully) more accurately reflect the relative I/O costs between using an index and a FTS. If for example a single block I/O only takes 1/2 the time and resources to perform compared to a multiblock I/O, shouldn’t these associated I/O cost differences be reflected when determining whether or not to use an index and perhaps reduce the index related costs by 1/2 as a result ?
This parameter has a very simple impact on how the CBO costs the use of an index based access path. It takes the value of the optimizer_index_cost_adj as a percentage and adjusts the cost of an index related range scan access path to only be the corresponding percentage of the total index cost. By default, it has a value of 100 meaning that a single block I/O is 100% when compared to that of a multiblock I/O which in turn means that the index related I/O costs are treated the same as that of a multiblock FTS I/O. A default value of 100 therefore has no effect on the overall cost of using an index related access path.
However, if the optimizer_index_cost_adj only has a value of (say) 25, it means that all single block I/O are only 25% as costly as that of a multiblock I/O and so index related range scan costs are adjusted to be only 25% of that of the total index access path cost.
Going back to the previous demo where the FTS was selected, I calculated the cost of using the index when retrieving the 5% of data to be:
index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)
2 + 5 x ceil(0.01 x 602) + ceil(0.05 x 854) = 2 + 5 x 7 + 43 = 37 + 43 = 80.
The cost of using a FTS was calculated as being only 65. A cost of 65 for the FTS is less than a cost of 80 for the index and so the FTS was selected.
This time, the linked demo sets the optimizer_index_cost_adj = 25 before running the exact same query again.
We notice of couple of key differences. The first obvious difference is that the plan has changed and that the CBO has now decided to use the index. The second difference is the associated cost relating to the use of the index. Previously, it was calculated as being 80 but now it only has a cost of 20. The maths is pretty simple as with an optimizer_index_cost_adj = 25, we need only mutliply the previous total with 0.25:
(2 + 5 x ceil(0.01 x 602) + ceil(0.05 x 854)) x 0.25 = (2 + 5 x 7 + 43) x 0.25= 80 x 0.25 = 20.
Note also that just the index range scan cost component was previously 2 + 5 x ceil(0.01 x 602) = 37, but is now also adjusted to 37 x 0.25 which rounds to9.
Basically by setting the optimizer_index_cost_adj = 25, we have effectively reduced the overall cost of using the index based execution path down from 80 to just 20, to just 25% of the previous total index cost.
The cost of the FTS remains unchanged at 65. The index access path at just 20 is now less than the FTS alternative and so the index is now chosen by the CBO.
Yes, all these numbers and costs make sense when one understands how the CBO performs its calculations and the effect of setting the optimizer_index_cost_adj parameter to a non-default value.
The optimizer_index_cost_adj parameter can therefore obviously have a very significant impact in the behaviour and subsequent performance of the database as the CBO will reduce (or maybe increase) the actual costs of index related access paths by the percentage denoted in the optimizer_index_cost_adj parameter. It can potentially dramatically increase (or decrease) the likelihood of an index access path being chosen over a FTS.
There are typically 3 very different ways in which this parameter is set, which I’ll list in increasing order of preference.
1) Set it arbitrarily to a very low figure such that indexes reign supreme as their associated costs get adjusted to such a low figure by the CBO that a FTS access path has little chance of being chosen (for example, here’s a suggestion to set it to a magical value of 12). Generally a very bad thing to do in any database …
2) Set it to a value that the DBA determines is an approximate percentage of the costs associated with a single block I/O when compared to a multiblock I/O. An improvement of option 1), but I still prefer the next option 3) …
3) Leave it at the default value of 100 such that it has no impact and the CBO does not use it to adjust the cost of an index access path
I’ll explain in Part II a sensible approach in setting the optimizer_index_cost_adj parameter and why option 3 is the preferred option with any currently supported version of Oracle.