Wednesday, October 21, 2015

Extended Statistics

In Oracle Database 11g, extended statistics (multi-column statistics) have been introduced, which allow you to collect statistics on a group of columns as a whole and on functions, thus enabling the optimizer to calculate the cardinality of these predicates correctly. As a result, the optimizer is aware of the correct selectivity (cardinality). This blog demonstrates why you need extended statistics and how you can create them.

You can see the difference between actual rows(5773) return and expected rows(1501) in plan. This is because optimizer unable to expect the number of rows based upon multiple predicates (PROD_ID,PROMO_ID).  

SQL> SELECT COUNT(*) FROM SALES WHERE PROD_ID=13 AND PROMO_ID=999;
  COUNT(*)
----------
      5773
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR());
PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------------------------------
SQL_ID  ff4sb7w8gjsqv, child number 0
-------------------------------------
SELECT COUNT(*) FROM SALES WHERE PROD_ID=13 AND PROMO_ID=999
Plan hash value: 2677122151
-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name            | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                 |       |       |    58 (100)|          |    |  |
|   1 |  SORT AGGREGATE               |                 |     1 |     8 |            |          |    |  |
|   2 |   PARTITION RANGE ALL         |                 |  1501 | 12008 |    58   (0)| 00:00:01 |     1 |    28 |
|   3 |    BITMAP CONVERSION COUNT    |                 |  1501 | 12008 |    58   (0)| 00:00:01 |    |  |
|   4 |     BITMAP AND                |                 |       |       |            |          |    |  |
|*  5 |      BITMAP INDEX SINGLE VALUE| SALES_PROD_BIX  |       |       |            |          |     1 |    28 |
|*  6 |      BITMAP INDEX SINGLE VALUE| SALES_PROMO_BIX |       |       |            |          |     1 |    28 |
-----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   5 - access("PROD_ID"=13)
   6 - access("PROMO_ID"=999)
24 rows selected.

Lets create extended statistics on PROD_ID,PROMO_ID columns and check the difference.

SQL> select dbms_stats.create_extended_stats('SH','SALES', '(PROD_ID, PROMO_ID)') from dual;
DBMS_STATS.CREATE_EXTENDED_STATS('SH','SALES','(PROD_ID,PROMO_ID)')
------------------------------------------------------------------------------------------------
SYS_STUZ9TAAY#1ATP74RIO0#PT#ZL
SQL>  SELECT COUNT(*) FROM SALES WHERE PROD_ID=13 AND PROMO_ID=999;
  COUNT(*)
----------
  5773
SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR());
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------
SQL_ID  fgphzv18qvmju, child number 1
-------------------------------------
 SELECT COUNT(*) FROM SALES WHERE PROD_ID=13 AND PROMO_ID=999
Plan hash value: 332322895
--------------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name             | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |                  |       |       |   102 (100)|          |    |          |
|   1 |  SORT AGGREGATE                 |                  |     1 |     8 |            |          |    |          |
|*  2 |   VIEW                          | index$_join$_001 |  5102 | 40816 |   102  (15)| 00:00:02 |    |          |
|*  3 |    HASH JOIN                    |                  |       |       |            |          |    |          |
|   4 |     PARTITION RANGE ALL         |                  |  5102 | 40816 |    29   (0)| 00:00:01 |  1 |       28 |
|   5 |      BITMAP CONVERSION TO ROWIDS|                  |  5102 | 40816 |    29   (0)| 00:00:01 |    |          |
|*  6 |       BITMAP INDEX SINGLE VALUE | SALES_PROD_BIX   |       |       |            |          |  1 |       28 |
|   7 |     PARTITION RANGE ALL         |                  |  5102 | 40816 |    58   (0)| 00:00:01 |  1 |       28 |
|   8 |      BITMAP CONVERSION TO ROWIDS|                  |  5102 | 40816 |    58   (0)| 00:00:01 |    |          |
|*  9 |       BITMAP INDEX SINGLE VALUE | SALES_PROMO_BIX  |       |       |            |          |  1 |       28 |
--------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter(("PROD_ID"=13 AND "PROMO_ID"=999))
   3 - access(ROWID=ROWID)
   6 - access("PROD_ID"=13)
   9 - access("PROMO_ID"=999)
29 rows selected.

Now you can see actual value(5773) return and expected value(5102) in plan are both nearly same. Of course COST is high in this scenario, that we can ignore for now. 


OR

You can create extended stats using gather_table_stats proc.

SQL> execute dbms_stats.gather_table_stats('sh','customers',-
> method_opt =>'for all columns size skewonly -
> for columns(lower(cust_state_province)) size skewonly');

============================================================
Once the extensions are created, the data dictionary user_stat_extensions view provides information about them. 

Sources:
http://www.oracle.com/webfolder/technetwork/tutorials/obe/db/11g/r1/prod/perform/multistats/multicolstats.htm
https://blogs.oracle.com/optimizer/entry/how_do_i_know_what_extended_statistics_are_needed_for_a_given_workload
https://blogs.oracle.com/optimizer/entry/extended_statistics

-Asifkhan

No comments:

Post a Comment