Table of contents
Preamble
While reading a bit on 12cR1 new features I saw statistics feedback that was formally called cardinality feedback that is available starting with Oracle 11gR2. So before moving to 12cR1 I decided to give the cardinality feedback feature, which I did not know yet, a try !!
Testing has been done using a Oracle Enterprise Edition in 11.2.0.4 running under HP-UX 11iv3 (11.31).
Cardinality feedback configuration
I’m gonna use the below test table:
DROP TABLE test1; CREATE TABLE test1(id1 NUMBER, id2 NUMBER,descr VARCHAR2(100)) NOLOGGING TABLESPACE tools; DECLARE i NUMBER; nbrows NUMBER; BEGIN i:=1; nbrows:=10000000; LOOP EXIT WHEN i>nbrows; IF (MOD(i,200000) = 0) THEN INSERT INTO test1 VALUES(4,ROUND(dbms_random.VALUE(1,nbrows)),RPAD('A',100,'A')); ELSE INSERT INTO test1 VALUES(MOD(i,4),ROUND(dbms_random.VALUE(1,nbrows)),RPAD('A',100,'A')); END IF; i:=i+1; END LOOP; COMMIT; END; / |
The idea of this test table is the following (the first column can be seen in many blog post as it is a quite well famous method):
- First column to have a column that contains not too many distinct value and particularly skewed (value 5 occur much less than other values)
- Second column is a random number to allow few computation if needed
- Third column is one hundred characters to artificially increase table size and generate high number of block read in case of full table scan (FTS)
Then I create two indexes and gather table statistics with no histograms:
SQL> CREATE INDEX test1_idx_id1 ON test1(id1) TABLESPACE tools; INDEX created. SQL> CREATE INDEX test1_idx_id2 ON test1(id2) TABLESPACE tools; INDEX created. SQL> EXEC dbms_stats.gather_table_stats('yjaquier', 'test1'); PL/SQL PROCEDURE successfully completed. |
As we have already seen at first execution the gather of statistics will not generate any histograms:
SQL> SELECT object_id FROM dba_objects WHERE owner='YJAQUIER' AND object_name='TEST1'; OBJECT_ID ---------- 109195 SQL> SELECT * FROM sys.col_usage$ WHERE obj#=109195; no ROWS selected SQL> col COLUMN_NAME FOR a15 SQL> SELECT table_name,column_name,histogram FROM user_tab_columns WHERE table_name = 'TEST1' ORDER BY column_name; TABLE_NAME COLUMN_NAME HISTOGRAM ------------------------------ --------------- --------------- TEST1 DESCR NONE TEST1 ID1 NONE TEST1 ID2 NONE |
So we get below test table with id1 column distribution:
SQL> SELECT id1, COUNT(*) FROM test1 GROUP BY id1 ORDER BY id1; ID1 COUNT(*) ---------- ---------- 0 2499950 1 2500000 2 2500000 3 2500000 4 50 |
And the most important thing: as the id1 column is lacking histogram Cost Based Optimizer (CBO) is not aware of column distribution …
Then we have to find a query that would benefit from histograms on id1 column and so benefit from cardinality feedback feature. As, again, because we are lacking good statistics on our test table. I thought it would be easy but at then end it was clearly not piece of cake as all the simple queries that came to my mind were not working or because even if query benefited from cardinality feedback the explain plan did not change (??).
From Oracle official documentation cardinality feedback may be used for below conditions:
- tables with no statistics
- multiple conjunctive or disjunctive filter predicates on a table
- Predicates containing complex operators for which the optimizer cannot accurately compute selectivity estimates
So all below SQL statements are not providing good results (either cardinality feedback not used or used but not changing explain plan):
SQL> SELECT * FROM test1 WHERE id1=5; SQL> SELECT * FROM test1 WHERE id1=5 AND id2>5000000; |
When displaying explain plan with additional information if you get below message:
Warning: basic plan statistics not available. These are only collected when:
* hint ‘gather_plan_statistics’ is used for the statement or
* parameter ‘statistics_level’ is set to ‘ALL’, at session or system level
Then as self stated you can also alter your session or use (un-documented) gather_plan_statistics hint to activate plan extended statistics:
SQL> SELECT /*+ gather_plan_statistics */ ... SQL> ALTER SESSION SET statistics_level=ALL; |
Cardinality feedback testing
After long and painful testing I have found below (stupid) SQL statement that could be used to demonstrate feature (the comment is used to easily find the sql_id in V$SQL). The SQL statement used is exactly the one below (uppercase or lowercase make a difference when searching in V$SQL):
select /* Yannick */ sum(id2) from (select * from test1 where id1=4 and id2>5000000);
To prepare my environment I’m using:
SQL> ALTER SYSTEM flush shared_pool; SYSTEM altered. SQL> ALTER SYSTEM flush shared_pool; SYSTEM altered. SQL> SET lines 200 pages 200 TIMING ON SQL> col sql_text FOR a90 SQL> col PLAN_TABLE_OUTPUT FOR a180 SQL> ALTER SESSION SET statistics_level=ALL; SESSION altered. |
So let’s execute the query and see how explain plan is changing:
SQL> SELECT /* Yannick */ SUM(id2) FROM (SELECT * FROM test1 WHERE id1=4 AND id2>5000000); SUM(ID2) ---------- 165770312 Elapsed: 00:00:07.81 SQL> SELECT sql_id,child_number,sql_text FROM v$sql WHERE sql_text LIKE 'select /* Yannick */%'; SQL_ID CHILD_NUMBER SQL_TEXT ------------- ------------ ------------------------------------------------------------------------------------------ 1r81h55uhssm5 0 SELECT /* Yannick */ SUM(id2) FROM (SELECT * FROM test1 WHERE id1=4 AND id2>5000000) Elapsed: 00:00:00.01 SQL> SELECT sql_id,child_number,use_feedback_stats,reason FROM V$SQL_SHARED_CURSOR WHERE sql_id IN (SELECT sql_id FROM v$sql WHERE sql_text LIKE 'select /* Yannick */%'); SQL_ID CHILD_NUMBER U REASON ------------- ------------ - -------------------------------------------------------------------------------- 1r81h55uhssm5 0 Y Elapsed: 00:00:00.12 |
So query executed in around 8 seconds and we see that sql_id is candidate (use_feedback_stats column equals to Y) for cardinality feedback usage
Let’s execute a second time the exact same query:
SQL> SELECT /* Yannick */ SUM(id2) FROM (SELECT * FROM test1 WHERE id1=4 AND id2>5000000); SUM(ID2) ---------- 165770312 Elapsed: 00:00:00.00 SQL> SELECT sql_id,child_number,sql_text FROM v$sql WHERE sql_text LIKE 'select /* Yannick */%'; SQL_ID CHILD_NUMBER SQL_TEXT ------------- ------------ ------------------------------------------------------------------------------------------ 1r81h55uhssm5 0 SELECT /* Yannick */ SUM(id2) FROM (SELECT * FROM test1 WHERE id1=4 AND id2>5000000) 1r81h55uhssm5 1 SELECT /* Yannick */ SUM(id2) FROM (SELECT * FROM test1 WHERE id1=4 AND id2>5000000) Elapsed: 00:00:00.01 SQL> SELECT sql_id,child_number,use_feedback_stats,reason FROM V$SQL_SHARED_CURSOR WHERE sql_id IN (SELECT sql_id FROM v$sql WHERE sql_text LIKE 'select /* Yannick */%'); 2 SQL_ID CHILD_NUMBER U REASON ------------- ------------ - -------------------------------------------------------------------------------- 1r81h55uhssm5 0 Y <ChildNode><ChildNumber>0</ChildNumber><ID>3</ID><reason>Optimizer mismatch(13)< 1r81h55uhssm5 1 Y Elapsed: 00:00:00.07 |
Query second execution is instantaneous, we also see that a second child has been created for the same sql_id. We also see that second child is also candidate for cardinality feedback feature, so let’s execute a third time the exact same query:
SQL> SELECT /* Yannick */ SUM(id2) FROM (SELECT * FROM test1 WHERE id1=4 AND id2>5000000); SUM(ID2) ---------- 165770312 Elapsed: 00:00:00.01 SQL> SELECT sql_id,child_number,sql_text FROM v$sql WHERE sql_text LIKE 'select /* Yannick */%'; SQL_ID CHILD_NUMBER SQL_TEXT ------------- ------------ ------------------------------------------------------------------------------------------ 1r81h55uhssm5 0 SELECT /* Yannick */ SUM(id2) FROM (SELECT * FROM test1 WHERE id1=4 AND id2>5000000) 1r81h55uhssm5 1 SELECT /* Yannick */ SUM(id2) FROM (SELECT * FROM test1 WHERE id1=4 AND id2>5000000) 1r81h55uhssm5 2 SELECT /* Yannick */ SUM(id2) FROM (SELECT * FROM test1 WHERE id1=4 AND id2>5000000) Elapsed: 00:00:00.02 SQL> SELECT sql_id,child_number,use_feedback_stats,reason FROM V$SQL_SHARED_CURSOR WHERE sql_id IN (SELECT sql_id FROM v$sql WHERE sql_text LIKE 'select /* Yannick */%'); 2 SQL_ID CHILD_NUMBER U REASON ------------- ------------ - -------------------------------------------------------------------------------- 1r81h55uhssm5 0 Y <ChildNode><ChildNumber>0</ChildNumber><ID>3</ID><reason>Optimizer mismatch(13)< 1r81h55uhssm5 1 Y <ChildNode><ChildNumber>1</ChildNumber><ID>3</ID><reason>Optimizer mismatch(13)< 1r81h55uhssm5 2 N Elapsed: 00:00:00.08 |
Third execution, also executed instantaneously, generated a final child that is no more candidate for cardinality feedback feature. Now let’s compare the different explain plan. First child doing an index range scan on the two indexes and we can see the difference between E-Rows (Estimated rows) and A-Rows (Actual rows). This is where, as poor statistics, the optimizer if failing to estimate correct number of returned rows:
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id=>'1r81h55uhssm5', cursor_child_no=>'0', format=> 'all allstats')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID 1r81h55uhssm5, child NUMBER 0 ------------------------------------- SELECT /* Yannick */ SUM(id2) FROM (SELECT * FROM test1 WHERE id1=4 AND id2>5000000) PLAN hash VALUE: 2595670947 ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-ROWS |E-Bytes| COST (%CPU)| E-TIME | A-ROWS | A-TIME | Buffers | Reads | OMem | 1Mem | O/1/M | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | | 17815 (100)| | 1 |00:00:07.78 | 5852 | 3321 | | | | | 1 | SORT AGGREGATE | | 1 | 1 | 9 | | | 1 |00:00:07.78 | 5852 | 3321 | | | | |* 2 | VIEW | INDEX$_join$_002 | 1 | 1000K| 8789K| 17815 (2)| 00:03:16 | 21 |00:00:07.78 | 5852 | 3321 | | | | |* 3 | HASH JOIN | | 1 | | | | | 21 |00:00:07.78 | 5852 | 3321 | 1627K| 1627K| 1/0/0| |* 4 | INDEX RANGE SCAN| TEST1_IDX_ID1 | 1 | 1000K| 8789K| 2050 (3)| 00:00:23 | 50 |00:00:00.01 | 3 | 0 | | | | |* 5 | INDEX RANGE SCAN| TEST1_IDX_ID2 | 1 | 1000K| 8789K| 176K (4)| 00:32:27 | 4998K|00:00:03.94 | 5849 | 3321 | | | | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- Query Block Name / Object Alias (IDENTIFIED BY operation id): ------------------------------------------------------------- 1 - SEL$F5BB74E1 2 - SEL$838CAA44 / TEST1@SEL$2 3 - SEL$838CAA44 4 - SEL$838CAA44 / indexjoin$_alias$_001@SEL$838CAA44 5 - SEL$838CAA44 / indexjoin$_alias$_002@SEL$838CAA44 Predicate Information (IDENTIFIED BY operation id): --------------------------------------------------- 2 - filter(("ID1"=4 AND "ID2">5000000)) 3 - ACCESS(ROWID=ROWID) 4 - ACCESS("ID1"=4) 5 - ACCESS("ID2">5000000) COLUMN Projection Information (IDENTIFIED BY operation id): ----------------------------------------------------------- 1 - (#keys=0) SUM("TEST1"."ID2")[22] 2 - "ID2"[NUMBER,22], "ID1"[NUMBER,22] 3 - (#keys=1) "ID1"[NUMBER,22], "ID2"[NUMBER,22] 4 - ROWID[ROWID,10], "ID1"[NUMBER,22] 5 - ROWID[ROWID,10], "ID2"[NUMBER,22] |
Second is now doing an unique index range scan on one index, we still see than the table access by index rowid is still not well estimated (big difference between E-Rows and A-Rows columns). The note section of the explain plan is confirming cardinality feature usage with with cardinality feedback used for this statement:
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id=>'1r81h55uhssm5', cursor_child_no=>'1', format=> 'all allstats')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID 1r81h55uhssm5, child NUMBER 1 ------------------------------------- SELECT /* Yannick */ SUM(id2) FROM (SELECT * FROM test1 WHERE id1=4 AND id2>5000000) PLAN hash VALUE: 629791772 ---------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-ROWS |E-Bytes| COST (%CPU)| E-TIME | A-ROWS | A-TIME | Buffers | ---------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | | 2029 (100)| | 1 |00:00:00.01 | 53 | | 1 | SORT AGGREGATE | | 1 | 1 | 9 | | | 1 |00:00:00.01 | 53 | |* 2 | TABLE ACCESS BY INDEX ROWID| TEST1 | 1 | 1000K| 8789K| 2029 (2)| 00:00:23 | 21 |00:00:00.01 | 53 | |* 3 | INDEX RANGE SCAN | TEST1_IDX_ID1 | 1 | 53 | | 2027 (2)| 00:00:23 | 50 |00:00:00.01 | 3 | ---------------------------------------------------------------------------------------------------------------------------------------- Query Block Name / Object Alias (IDENTIFIED BY operation id): ------------------------------------------------------------- 1 - SEL$F5BB74E1 2 - SEL$F5BB74E1 / TEST1@SEL$2 3 - SEL$F5BB74E1 / TEST1@SEL$2 Predicate Information (IDENTIFIED BY operation id): --------------------------------------------------- 2 - filter("ID2">5000000) 3 - ACCESS("ID1"=4) COLUMN Projection Information (IDENTIFIED BY operation id): ----------------------------------------------------------- 1 - (#keys=0) SUM("TEST1"."ID2")[22] 2 - "ID2"[NUMBER,22] 3 - "TEST1".ROWID[ROWID,10] Note ----- - CARDINALITY feedback used FOR this statement |
Remark:
In my case the plan generated using the cardinality feedback feature is performing much better than the original one, unfortunately it is not always the case…
Third explain is not different from previous one except that now cardinality is well estimated:
SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor(sql_id=>'1r81h55uhssm5', cursor_child_no=>'2', format=> 'all allstats')); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ SQL_ID 1r81h55uhssm5, child NUMBER 2 ------------------------------------- SELECT /* Yannick */ SUM(id2) FROM (SELECT * FROM test1 WHERE id1=4 AND id2>5000000) PLAN hash VALUE: 629791772 ---------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-ROWS |E-Bytes| COST (%CPU)| E-TIME | A-ROWS | A-TIME | Buffers | ---------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | | 2029 (100)| | 1 |00:00:00.01 | 53 | | 1 | SORT AGGREGATE | | 1 | 1 | 9 | | | 1 |00:00:00.01 | 53 | |* 2 | TABLE ACCESS BY INDEX ROWID| TEST1 | 1 | 21 | 189 | 2029 (2)| 00:00:23 | 21 |00:00:00.01 | 53 | |* 3 | INDEX RANGE SCAN | TEST1_IDX_ID1 | 1 | 53 | | 2027 (2)| 00:00:23 | 50 |00:00:00.01 | 3 | ---------------------------------------------------------------------------------------------------------------------------------------- . . . Note ----- - CARDINALITY feedback used FOR this statement |
In 12cR1 the feature is called statistics feedback and add the below feature:
The optimizer also creates a SQL plan directive so that other SQL statements can benefit from the information obtained during this initial execution
There is also a new IS_REOPTIMIZABLE column in V$SQL:
This columns shows whether the next execution matching this child cursor will trigger a reoptimization. The values are:
Y: If the next execution will trigger a reoptimization
R: If the child cursor contains reoptimization information, but will not trigger reoptimization because the cursor was compiled in reporting mode
N: If the child cursor has no reoptimization information
Remark:
The above example does not work in 12cR1 (12.1.0.2) due to different optimizer, so to test SQL Plan Directive I will have to find something different…
References
- Statistics Feedback (Formerly Cardinality Feedback)
- Statistics Feedback – Frequently Asked Questions (Doc ID 1344937.1)
- Cardinality Feedback – Oracle Scratchpad
- How does the METHOD_OPT parameter work?