Blogger templates

Rule Based Optimizer Vs Cost based optimizer


Rule Based Optimizer Vs Cost based optimizer



Rule Based Optimizer Vs Cost based optimizer


Rule Based Optimizer
                A long time ago, the only optimizer in the Oracle database was the Rule-Based Optimizer (RBO). Basically, the RBO used a set of rules to determine how to execute a query. One of the rules among them is like, if an index was available on a table, the RBO rules said to always use the index. There are some cases where the use of an index slowed down a query. For example, assume someone put an index on the GENDER column, which holds one of two values, MALE and FEMALE. Then someone issues the following query:

SELECT * FROM emp WHERE gender='FEMALE'; 
                If the above query returned approximately 50% of the rows, then using an index would actually slow things down. It would be faster to read the entire table and throw away all rows that have MALE values. Experts in Oracle query optimization have come to a rule of thumb that says if the number of rows returned is more than 5-10% of the total table volume, using an index would slow things down. The RBO would always use an index if present because its rules said to.
                It became obvious that the RBO, armed with its set of discrete rules, did not always make great decisions.
RBO rules
Always use the Index - If an index can be used to access a table, choose the index. Indexes are always preferred over a full-table scan of a sort merge join; (a sort merge join does not require an index).
Always starts with the driving table - The last table in ‘from’ clause will be the driving table – For the RBO, this should be the table that chooses the least amount of rows. The RBO uses this driving table as the first table when performing nested loop join operations.
Full-table scans as a last resort - The RBO are not aware of Oracle parallel query and multi-block reads, and do not consider the size of the table. Hence, the RBO dislikes full-table scans and will only use them when no index exists.
Any index will do - The RBO will sometimes choose a less than ideal index to service a query. This is because the RBO does not have access to statistics that show the selectivity of indexed columns.
The biggest problem with the RBO was that it did not take the data distribution into account. So the Cost-Based Optimizer (CBO) was born. The CBO uses statistics about the table, its indexes and the data distribution to make better informed decisions. Using our previous example, assume that the company has employees that are 95% female and 5% male. If you query for females, then you do not want to use the index. If you query for males, then you would like to use the index. The CBO has information at hand to help make these kinds of determinations that were not available in the old RBO.
Syntax:
To turn on the RBO, open init.ora file and edit the following parameter.
Optimizer_mode = rule/choose
For only a particular session
Alter session set optimizer_mode = rule/choose


Cost based Optimizer (CBO)
Cost based optimizer has two modes:
a.       all_rows
b.       first_rows

The all_rows optimizer mode is designed to minimize computing
resources and it favors full-table scans.  Index access (first_rows) adds additional I/O overhead, but they return rows faster, back to the originating query.

optimizer_mode=first_rows_      This CBO mode will return rows as soon as possible, even if the overall query runs longer or consumes more computing resources than other plans. The first_rows optimizer_mode usually involves choosing an index scan over a full-table scan because index access will return rows quickly. Because the first_rows mode favors index scans over full-table scans, the first_rows mode is more appropriate for OLTP systems where the end user needs to see small result sets as quickly as possible.
 
optimizer_mode=all_rows_         This CBO mode ensures that the overall computing resources are minimized, even if no rows are available until the entire query has completed. The all_rows access method often favors a parallel full-table scan over a full-index scan, and sorting over presorted retrieval via an index. Because the all_rows mode favors full-table scans, it is best suited for data warehouses, decision-support systems, and batch-oriented databases where intermediate rows are not required for real-time viewing.
 
optimizer_mode=first_rows_n    This Oracle9i Database optimizer mode enhancement optimizes queries for a small, expected return set. The values are first_rows_1, first_rows_10, first_rows_100, and first_rows_1000. The CBO uses the n in first_rows_n as an important driver in determining cardinalities for query result sets. By telling the CBO, a priori, that we only expect a certain number of rows back from the query, the CBO will be able to make a better decision about whether to use an index to access the table rows.
The optimizer mode can be set at the system-wide level, for an individual session, or for a specific SQL statement:
Syntax:
alter system set optimizer_mode=first_rows_10;
alter session set optimizer_goal = all_rows;
select /*+ first_rows(100) */ from student;


               

CBO gets different set of data from which it chooses the best execution plan. The data includes the following

                SQL query                                          CBO                                                      Execution Plan
 

                OS data                                                Disk data                             Table and Column data
CPU Consumption                                      RAID Level                                               Rows/Block
Sequential read time                                   Block size                                  Partitioning
Scattered read speed                                 RAID & Striping                       Parallelism

 

Optimizer Settings ( optimizer_index_cost_adj, optimizer_index_caching, Parallel_automatic_tunning, db_file_multiblock_read_count and some other parameters as following)

Initialization parameters that affect the CBO
              Apart from generating statistics, the setup of the below mentioned parameters play a very crucial role in dictating how your system will work. Setting these will depend a lot on what kind of environment you want to create, Online, Batch processing, Data warehousing or a combination of more than one.

OPTIMIZER_FEATURES_ENABLE
This is set to a version number such as- 8.1.5, 8.1.7, 9.0.0. Since new features and functionality are being added to CBO in every release, its behavior may change and result in different execution plans. You can set this to a version number for which your application is tuned. Please note setting it to a lower version will prevent the use of new features that have come in later versions.
e.g.: optimizer_features_enable = 8.1.7

OPTIMIZER_MAX_PERMUTATIONS
This parameter specifies the maximum number of permutations that should be considered for queries with joins, to choose an execution plan. This will influence the parse time of queries. This parameter should be set to a lower value. Make sure the other parameters mentioned in this section are set properly so that the optimizer finds an optimal execution plan within the specified limits. It defaults to 80000 in Oracle 8, which means no limits! In Oracle 9
i it is defaulted to 2000.
e.g.: optimizer_max_permutations = 2000

OPTIMIZER_INDEX_COST_ADJ
Optimizer_Index_Cost_Ad takes a value between 1 and 10000 (default 100). Setting this value makes the index more or less friendly to the optimizer. For example setting it to 50 tells the optimizer that an execution plan that uses index(es) is half as expensive as other execution plans that are not using them. The lower the value (less than 100), the less full table scan executions will take place in the system.
Use of indexes in OLTP is a crucial factor for deciding the response time, but sometimes, full table scans provide better throughput. Nested loop joins are influenced by this parameter value. Set this parameter to a value that is optimal for your application. I recommend set this parameter between 10 - 20 for OLTP and 50 for DSS Systems..
If you are having slow performance because the CBO first_rows optimizer mode is favoring too many full-table scans, you can reset the optimizer_index_cost_adj parameter to immediately tune all of the SQL in your database to favor index scans over full-table scans. This is a "silver bullet" that can improve the performance of an entire database in cases where the database is OTLP and you have verified that the full-table scan costing is too low.

OPTIMIZER_INDEX_CACHING
This tells optimizer to favor nested loops and IN-list iterators over sort-merge and hash joins. The default value is 100 and makes nested loops and IN-list operator look less expensive and more appealing to Optimizer. The value basically indicates the percentage of index blocks that the optimizer should assume are present in cache. I recommend set this parameter to 85.

OPTIMIZER_PERCENT_PARALLEL
OPTIMIZER_PERCENT_PARALLEL takes a value between 0 and 100. A low value favors indexes and a higher value will favor full table scans. The optimizer uses this parameter in working out the cost of a full table scan. A value of 100 makes use of degree of parallelism set at object level. I prefer setting it to zero to favor use of indexes and prevent use of parallel query in computing the costing.
It is renamed to “_OPTIMIZER_PERCENT_PARALLEL” in Oracle 9i and its value  should not be altered unless recommended by Oracle support.
e.g.: optimizer_percent_parallel = 0

COMPATIBLE
This parameter is used to provide backward compatibility with earlier releases. This may also restrict the use of some new features. CBO has undergone lot of changes in release 8. It is advisable to set this parameter to 8.1.0 or higher. Only three digits are required to be specified, however, you can specify more for record purposes.
e.g.: compatible = 8

DB_FILE_MULTIBLOCK_READ_COUNT
This parameter determines the number of database blocks read in one input/output operation during a full table scan. The value set is used in computing the cost of full table scans. OLTP systems will not benefit by setting a high value for this parameter as the computed cost for full table scan execution plans would reduced. The maximum size is Operating system dependent.
e.g.: db_file_multiblock_read_count = 1 (may be set to a higher value)

9) SORT_AREA_SIZE
This parameter defines the maximum memory space allocated per user process that requires sorting of data and for insert and updates to bitmap indexes. Optimizer uses the set value for determining the cost of sorting in a query. This defaults to 64K. Normally a size of 64K to 1M is appropriate for OLTP systems.
Sort-merge joins are influenced by this parameter value. The bigger the size the more appealing will be sort-merge joins over nested-loop and hash joins. The lower the size the more sorting will take place in temporary segments. So assign a value that has been evaluated for your setup.
Oracle recommends the use of PGA_AGGREGATE_TARGET instead of this parameter from Oracle 9
i.
e.g: sort_area_size = 1048576

SORT_MULTIBLOCK_READ_COUNT
This specifies the number of database blocks to be read each time a sort performs a read from a temporary segment. The default value is 2. General recommendation is to keep the default value. However, if you intend to hike up the SORT_AREA_SIZE considerably, you may evaluate the improvement by changing this parameter.
e.g.: sort_multiblock_read_count = 2

HASH_JOIN_ENABLED
Hash joins are available only in CBO. In hash joins, a hash table is created on the join key of the smallest sized table. It then joins the other tables to find the match. Hash joins may prove to be faster than other type of joins in some conditions, especially when the index is missing or search criteria is not very selective. Hash joins require a large amount of memory as the hash tables are retained; this may sometimes result in memory swapping.
Nested-loop joins return the first row faster than sort-merge and hash joins and are preferred for OLTP, but other types of joins cannot be ignored for running other aspects of the applications.
e.g.: hash_join_enabled = true

HASH_AREA_SIZE
This specifies the maximum amount of memory in bytes to be used for a hash join per process. It is defaulted to 2 times SORT_AREA_SIZE.
Oracle recommends the use of PGA_AGGREGATE_TARGET instead of this parameter from Oracle 9
i.
e.g.: hash_area_size = 2097152
Setting this to a very low number may sometimes result in the following error.
ORA-6580: Hash Join ran out of memory while keeping large rows in memory.

HASH_MULTIBLOCK_IO_COUNT
This specifies how many sequential blocks a hash join reads and writes in one Input/Output activity. Oracle recommends not changing or assigning a value to this parameter; this will let oracle decide on the appropriate value for each individual query. In such casees, the value of the parameter will appear as 0 in the V$PARAMETER view.
This parameter is renamed to _HASH_MULTIBLOCK_IO_COUNT as of Oracle 9
i.
e.g.: hash_multi_block_io_count = 0

BITMAP_MERGE_AREA_SIZE
This parameter is relevant for systems using bitmap indexes. It specifies the amount of memory Oracle uses to merge bitmaps retrieved from a range scan of a bitmap index. The default value is 1 MB, which is considered sufficient for most setups.
Oracle recommends use of PGA_AGGREGATE_TARGET instead of this parameter from Oracle 9
i.
e.g.: bitmap_merge_area_size = 1048576

QUERY_REWRITE_ENABLED
This parameter is relevant for systems using Materialized views, Function based indexes and stored outlines. Setting this parameter enables query rewrite for the database. The materialized views should also be query rewrite enabled to allow the optimizer to redirect a query to it rather than the actual tables listed in the FROM clause. Query rewriting is internally done by the optimizer based on what results are expected and whether these are already present in materialized view form.
e.g.: query_rewrite_enabled = true

QUERY_REWRITE_INTEGRITY
This parameter determines the extent to which the optimizer must enforce query rewriting to Materialized views. This determines the accuracy of the query rewrite. It can be set to ENFORCED, TRUSTED or STALE_TOLERATED. ENFORCED option uses Materialized views if they contain fresh data, thus guarantying the integrity and consistency of data. TRUSTED uses relationships that have been defined in the system for rewriting the query. STALE_TOLERATED allows the use of relationships that are not enforced and may use stale data. For OLTP systems, ENFORCED would be the ideal setup, as reports need to be up-to-date.
e.g.: query_rewrite_integrity = enforced

ALWAYS_ANTI_JOIN
This parameter specifies the join method for anti-joins, for example when a NOT IN operator is present in your query. It can be set to NESTED_LOOPS, MERGE or HASH. It is defaulted to NESTED_LOOPS in Oracle 8
i and CHOOSE in Oracle 9i.
This parameter is renamed to _ALWAYS_ANTI_JOIN as of Oracle 9
i.
e.g.: always_anti_join = nested_loops

ALWAYS_SEMI_JOIN
This parameter specifies the join method for semi-joins. These types of joins are carried out by Optimizer after transforming a query. In such joins, duplicate values from the inner table are removed and then the type of join specified in the parameter is used to perform a semi-join. It can be set to NESTED_LOOPS, MERGE or HASH. In Oracle 8
i, it is defaulted to STANDARD and in Oracle 9i it is defaulted to CHOOSE, to pick up an appropriate join.
This parameter is renamed to _ALWAYS_SEMI_JOIN as of Oracle 9
i.
e.g.: always_semi_join = nested_loops

STAR_TRANSFORMATION_ENABLED
This specifies whether query transformation will be applied to star queries. It can be set to TRUE, FALSE or TEMP_DISABLE (transformation will take place but will not use temporary tables). I presently set it to FALSE due to some known issues of sub-optimal queries being generated. If you intend to use this, please upgrade your version to 8.1.7.4 and above.
e.g.: star_transformation_enabled = false

PARALLEL_BROADCAST_ENABLED
This parameter refers to parallel executions in cluster databases. It is meant for improving hash and sort-merge join operations where a very large result set is joined with a very small result set. When this option is enabled, the optimizer broadcasts a copy of all rows in the smaller result set to all cluster databases that are processing some rows of the larger result set.
It is obsolete in release 9.2.0.
e.g.: parallel_broadcast_enabled = false

OPTIMIZER_DYNAMIC_SAMPLING
This parameter is introduced in release 9
i. It is meant for situations where tables are not analyzed. As CBO depends heavily on statistics, the parameter tells the optimizer to sample the unanalyzed tables that are being used in a query. A level of 0 to 10 can be specified, the higher the value the more time optimizer spends in sampling.
e.g.: optimizer_dynamic_sampling = 1

PARTITION_VIEW_ENABLED
This parameter is meant for backward compatibility to support partition views. Oracle recommends use of partition tables rather than partition views. If you are migrating to CBO, chances are that you may not be using partition views.
e.g.: partition_view_enabled = false

CURSOR_SHARING
This parameter determines what kind of SQL statements can share the same cursor. It can be set to FORCE, SIMILAR or EXACT. FORCE will try to squeeze statements that may differ in some literals to share the same cursor. SIMILAR is somewhat the same but will try to maintain the plan optimization for identical statements. EXACT allows statements with exact identical text to share a cursor.
Using FORCE may sometimes result in unexpected results.
e.g.: cursor_sharing = exact

PGA_AGGREGATE_TARGET
Introduced in Oracle 9
i, this parameter specifies the aggregate PGA memory available to all server processes attached to an instance. This parameter can be set for automatic sizing of SQL working areas. It replaces other existing parameters like SORT_AREA_SIZE, BITMAP_MERGE_AREA_SIZE and HASH_AREA_SIZE.
It can be set to a value between 10 MB to 4000 GB, depending on the setup requirement.

Make a first estimate for PGA_AGGREGATE_TARGET  based on the following rule:
PGA_AGGREGATE_TARGET  = (Total Physical Memory * 0.16)    - For OLTP                     systems  (16% of Server Physical Mem)
PGA_AGGREGATE_TARGET  = (Total Physical Memory * 0 .4  )   - For DSS                                    systems   (40% of Server Physical Mem)

The ideal optimizer settings depend on your environment and are heavily influenced by your system's costs for scattered disk reads versus sequential disk reads. Listing 1 contains a great script you can use to measure these I/O costs on your database.

Gathering statistics
Oracle provides more than one way of generating statistics.

1. DBMS_UTILITY
2. ANALYZE command
3. DBMS_DDL
4. DBMS_STATS

NOTE Oracle strongly recommends that you use the DBMS_STATS package rather than ANALYZE to collect optimizer statistics. That package lets you collect statistics in parallel, collect global statistics for partitioned objects, and fine tune your statistics collection in other ways. Further, the cost-based optimizer will eventually use only statistics that have been collected by DBMS_STATS
1. DBMS_UTILITY
                DBMS_UTILITY package provides a lots of procedure and functions to organize and analyze the Oracle database objects. Some the procedures / functions available under this package (for analysis) are listed below.
Analyze_databse
Analyzes all the tables, clusters, and indexes in a database.

Syntax:
dbms_utility.analyze_database (
method           IN VARCHAR2,
estimate_rows    IN NUMBER DEFAULT NULL, -- # of rows to est.
estimate_percent IN NUMBER DEFAULT NULL, -- % of rows for est.
method_opt       IN VARCHAR2 DEFAULT NULL);

-- method options:
ESTIMATE, COMPUTE and DELETE

-- method_opt options:
FOR TABLE
FOR ALL [INDEXED] COLUMNS] [SIZE n]
FOR ALL INDEXES

Analyze_part_object
                Equivalent to analyze table or index for partitioned objects
Syntax:
dbms_utility.analyze_part_object (
schema        IN VARCHAR2 DEFAULT NULL,
object_name   IN VARCHAR2 DEFAULT NULL,
object_type   IN CHAR DEFAULT 'T',
command_type  IN CHAR DEFAULT 'E',
command_opt   IN VARCHAR2 DEFAULT NULL,
sample_clause IN VARCHAR2 DEFAULT 'sample 5 percent');


-- command type options
C (compute statistics)
D (delete statistics)
E (estimate statistics)
V (validate structure)
Analyze_schema
                Analyzes all the tables, clusters, and indexes in a schema
Syntax:
dbms_utility.analyze_schema (
schema           IN VARCHAR2,
method           IN VARCHAR2,
estimate_rows    IN NUMBER   DEFAULT NULL,
estimate_percent IN NUMBER   DEFAULT NULL,
method_opt       IN VARCHAR2 DEFAULT NULL);

-- method options
COMPUTE
DELETE
ESTIMATE

2. ANALYSE Command
The ANALYZE command provides Oracle with the ability to collect statistics that are then used by the optimizer or to delete statistics about the object from the data dictionary or to validate the structure of the object or to identify migrated and chained rows of the table or cluster. The optimizer MUST have up to date statistics to allow your SQL to perform at its best. You must make sure that your statistics on a table and index are up to date, so you should analyze your tables on a regular basis.
However, before we go much further, you should note that you should use the Oracle supplied package, DBMS_STATS. You should not use the ANALYZE command as it does not deal well with many of the features that tables can utilize today. So please remember to always use DBMS_STATS to analyze your table.
Syntax:
 ANALYZE
    { INDEX [schema.]index
            { { COMPUTE STATISTICS
              | ESTIMATE STATISTICS [SAMPLE integer {ROWS | PERCENT}]
              | DELETE STATISTICS }
            | VALIDATE STRUCTURE }
    | {TABLE [schema.]table | CLUSTER [schema.]cluster}
            { { COMPUTE
              | ESTIMATE [SAMPLE integer {ROWS | PERCENT}]
              | DELETE } STATISTICS
            | VALIDATE STRUCTURE [CASCADE]
            | LIST CHAINED ROWS [INTO [schema.]table] } }

where:

INDEX
    identifies an index to be analyzed.  If you omit schema, Oracle
    assumes the index is in your own schema.

TABLE
    identifies a table to be analyzed.  If you omit schema, Oracle
    assumes the table is in your own schema.  When you collect
    statistics for a table, Oracle also automatically collects the
    statistics for each of the table's indexes.

CLUSTER
    identifies a cluster to be analyzed.  If you omit schema, Oracle
    assumes the cluster is in your own schema.  When you collect
    statistics for a cluster, Oracle also automatically collects the
    statistics for all the cluster's tables and all their indexes,
    including the cluster index.

COMPUTE STATISTICS
    computes exact statistics about the analyzed object and stores them
    in the data dictionary.

ESTIMATE STATISTICS
    estimates statistics about the analyzed object and stores them in
    the data dictionary.
            SAMPLE
                   specifies the amount of data from the analyzed object
                   Oracle samples to estimate statistics.  If you omit
                   this parameter, Oracle samples 1064 rows.  If you
                   specify more than half of the data, Oracle reads all
                   the data and computes the statistics.
            ROWS
                   causes Oracle to sample integer rows of the table or
                   cluster or integer entries from the index.  The
                   integer must be at least 1.
            PERCENT
                   causes Oracle to sample integer percent of the rows
                   from the table or cluster or integer percent of the
                   index entries.  The integer can range from 1 to 99.

DELETE STATISTICS
    deletes any statistics about the analyzed object that are currently
    stored in the data dictionary.

VALIDATE STRUCTURE
    validates the structure of the analyzed object.  If you use this
    option when analyzing a cluster, Oracle automatically validates the
    structure of the cluster's tables.

CASCADE
    validates the structure of the indexes associated with the table or
    cluster.  If you use this option when validating a table, Oracle
    also validates the table's indexes.  If you use this option when
    validating a cluster, Oracle also validates all the clustered
    tables' indexes, including the cluster index.

LIST CHAINED ROWS
    identifies migrated and chained rows of the analyzed table or
    cluster.  You cannot use this option when analyzing an index.
            INTO
                   specifies a table into which Oracle lists the
                   migrated and chained rows.  If you omit schema,
                   Oracle assumes the list table is in your own schema.
                   If you omit this clause altogether, Oracle assumes
                   that the table is named CHAINED_ROWS.  The list table
                   must be on your local database.

PREREQUISITES:
    The object to be analyzed must be in your own schema or you must have the ANALYZE ANY system privilege.
You should also remember that statistics get old and dated; you should regularly schedule to re-analyze your tables and indexes.

3. DBMS_DDL package
                Under this package too, we have some procedure to collect the statistics about the oracle schemas. Some of the procedures are listed below.
Analyze_object
                Equivalent to SQL ANALYZE TABLE,   CLUSTER, or INDEX



Syntax:
dbms_ddl.analyze_object(
type             VARCHAR2,
schema           VARCHAR2,
name             VARCHAR2,
method           VARCHAR2,
estimate_rows    NUMBER DEFAULT NULL,
estimate_percent NUMBER DEFAULT NULL,
method_opt       VARCHAR2 DEFAULT NULL,
partname         VARCHAR2 DEFAULT NULL);
METHOD: ESTIMATE', 'COMPUTE' or 'DELETE'

METHOD_OPT: [ FOR TABLE ],
            [ FOR ALL [INDEXED] COLUMNS] [SIZE n], or
            [ FOR ALL INDEXES ]


4. Statistics Gathering Procedures in the DBMS_STATS Package
 Procedure
Collects
GATHER_INDEX_STATS
Index statistics
GATHER_TABLE_STATS
Table, column, and index statistics
GATHER_SCHEMA_STATS
Statistics for all objects in a schema
GATHER_DATABASE_STATS
Statistics for all objects in a database
GATHER_SYSTEM_STATS
CPU and I/O statistics for the system

 

GATHER_INDEX_STATS Procedure
This procedure gathers index statistics. It is equivalent to running ANALYZE INDEX [ownname.]indname [PARTITION partname] COMPUTE STATISTICS | ESTIMATE STATISTICS SAMPLE estimate_percent PERCENT
It does not execute in parallel.

Syntax
DBMS_STATS.GATHER_INDEX_STATS (
   ownname          VARCHAR2,
   indname          VARCHAR2,
   partname         VARCHAR2 DEFAULT NULL,
   estimate_percent NUMBER   DEFAULT NULL,
   stattab          VARCHAR2 DEFAULT NULL,
   statid           VARCHAR2 DEFAULT NULL,
   statown          VARCHAR2 DEFAULT NULL );

Parameters
Parameter 
Description 
ownname 
Schema of index to analyze. 
indname 
Name of index. 
partname 
Name of partition. 
estimate_percent 
Percentage of rows to estimate (NULL means compute). The valid range is [0.000001,100). This value may be increased automatically to achieve better results. 
stattab 
User stat table identifier describing where to save the current statistics. 
statid 
Identifier (optional) to associate with these statistics within stattab
statown

Schema containing stattab (if different than ownname). 


Exceptions
ORA-20000: Index does not exist or insufficient privileges.
ORA-20001: Bad input value.


GATHER_TABLE_STATS Procedure
This procedure gathers table and column (and index) statistics. It attempts to parallelize as much of the work as possible, but there are some restrictions as described in the individual parameters. This operation does not parallelize if the user does not have select privilege on the table being analyzed.

Syntax
DBMS_STATS.GATHER_TABLE_STATS (
   ownname          VARCHAR2,
   tabname          VARCHAR2,
   partname         VARCHAR2 DEFAULT NULL,
   estimate_percent NUMBER   DEFAULT NULL,
   block_sample     BOOLEAN  DEFAULT FALSE,
   method_opt       VARCHAR2 DEFAULT 'FOR ALL COLUMNS SIZE 1',
   degree           NUMBER   DEFAULT NULL,
   granularity      VARCHAR2 DEFAULT 'DEFAULT',
   cascade          BOOLEAN  DEFAULT FALSE,
   stattab          VARCHAR2 DEFAULT NULL,
   statid           VARCHAR2 DEFAULT NULL,
   statown          VARCHAR2 DEFAULT NULL);

Parameters
Parameter 
Description 
ownname

Schema of table to analyze. 
tabname

Name of table. 
partname

Name of partition. 
estimate_percent

Percentage of rows to estimate (NULL means compute) The valid range is [0.000001,100). This value may be increased automatically to achieve better results. 
block_sample

Whether or not to use random block sampling instead of random row sampling. Random block sampling is more efficient, but if the data is not randomly distributed on disk, then the sample values may be somewhat correlated. Only pertinent when doing an estimate statistics. 
method_opt

Method options of the following format (the phrase 'SIZE 1' is required to ensure gathering statistics in parallel and for use with the phrase hidden):
FOR ALL [INDEXED | HIDDEN] COLUMNS [SIZE integer]
FOR COLUMNS [SIZE integer] column|attribute
[,column|attribute ...]
Optimizer related table statistics are always gathered. 
degree

Degree of parallelism (NULL means use table default value). 
granularity

Granularity of statistics to collect (only pertinent if the table is partitioned).
DEFAULT: Gather global- and partition-level statistics.
SUBPARTITION: Gather subpartition-level statistics.
PARTITION: Gather partition-level statistics.
GLOBAL: Gather global statistics.
ALL: Gather all (subpartition, partition, and global) statistics. 
cascade

Gather statistics on the indexes for this table. Index statistics gathering is not parallelized. Using this option is equivalent to running the gather_index_stats procedure on each of the table's indexes. 
stattab

User stat table identifier describing where to save the current statistics. 
statid

Identifier (optional) to associate with these statistics within stattab
statown

Schema containing stattab (if different than ownname). 


Exceptions
ORA-20000: Table does not exist or insufficient privileges.
ORA-20001: Bad input value.



GATHER_SCHEMA_STATS Procedure
This procedure gathers statistics for all objects in a schema.

Syntax
DBMS_STATS.GATHER_SCHEMA_STATS (
   ownname          VARCHAR2,
   estimate_percent NUMBER   DEFAULT NULL,
   block_sample     BOOLEAN  DEFAULT FALSE,
   method_opt       VARCHAR2 DEFAULT 'FOR ALL COLUMNS SIZE 1',
   degree           NUMBER   DEFAULT NULL,
   granularity      VARCHAR2 DEFAULT 'DEFAULT',
   cascade          BOOLEAN  DEFAULT FALSE,
   stattab          VARCHAR2 DEFAULT NULL,
   statid           VARCHAR2 DEFAULT NULL,
   options          VARCHAR2 DEFAULT 'GATHER',
   objlist     OUT  ObjectTab,
   statown          VARCHAR2 DEFAULT NULL);

Parameters
Parameter 
Description 
ownname

Schema to analyze (NULL means current schema). 
estimate_percent

Percentage of rows to estimate (NULL means compute): The valid range is [0.000001,100). 
block_sample

Whether or not to use random block sampling instead of random row sampling. Random block sampling is more efficient, but if the data is not randomly distributed on disk, then the sample values may be somewhat correlated. Only pertinent when doing an estimate statistics. 
method_opt

Method options of the following format (the phrase 'SIZE 1' is required to ensure gathering statistics in parallel and for use with the phrase hidden):
FOR ALL [INDEXED | HIDDEN] COLUMNS [SIZE integer]
This value is passed to all of the individual tables. 
degree

Degree of parallelism (NULL means use table default value). 
granularity

Granularity of statistics to collect (only pertinent if the table is partitioned).
DEFAULT: Gather global- and partition-level statistics.
SUBPARTITION: Gather subpartition-level statistics.
PARTITION: Gather partition-level statistics.
GLOBAL: Gather global statistics.
ALL: Gather all (subpartition, partition, and global) statistics.  
cascade

Gather statistics on the indexes as well.
Index statistics gathering is not parallelized. Using this option is equivalent to running the gather_index_stats procedure on each of the indexes in the schema in addition to gathering table and column statistics. 
stattab

User stat table identifier describing where to save the current statistics. 
statid

Identifier (optional) to associate with these statistics within stattab. 
options

Further specification of which objects to gather statistics for:
GATHER: Gather statistics on all objects in the schema.
GATHER STALE: Gather statistics on stale objects as determined by looking at the *_tab_modifications views. Also, return a list of objects found to be stale.
GATHER EMPTY: Gather statistics on objects which currently have no statistics. also, return a list of objects found to have no statistics.
LIST STALE: Return list of stale objects as determined by looking at the *_tab_modifications views.
LIST EMPTY: Return list of objects which currently have no statistics. 
objlist

List of objects found to be stale or empty. 
statown

Schema containing stattab (if different than ownname). 


Exceptions
ORA-20000: Schema does not exist or insufficient privileges.
ORA-20001: Bad input value.


GATHER_DATABASE_STATS Procedure
This procedure gathers statistics for all objects in the database.

Syntax
DBMS_STATS.GATHER_DATABASE_STATS (
   estimate_percent NUMBER   DEFAULT NULL,
   block_sample     BOOLEAN  DEFAULT FALSE,
   method_opt       VARCHAR2 DEFAULT 'FOR ALL COLUMNS SIZE 1',
   degree           NUMBER   DEFAULT NULL,
   granularity      VARCHAR2 DEFAULT 'DEFAULT',
   cascade          BOOLEAN  DEFAULT FALSE,
   stattab          VARCHAR2 DEFAULT NULL,
   statid           VARCHAR2 DEFAULT NULL,
   options          VARCHAR2 DEFAULT 'GATHER',
   objlist     OUT  ObjectTab,
   statown          VARCHAR2 DEFAULT NULL);

Parameters
Parameter 
Description 
estimate_percent

Percentage of rows to estimate (NULL means compute): The valid range is [0.000001,100). 
block_sample

Whether or not to use random block sampling instead of random row sampling. Random block sampling is more efficient, but if the data is not randomly distributed on disk, then the sample values may be somewhat correlated. Only pertinent when doing an estimate statistics. 
method_opt

Method options of the following format (the phrase 'SIZE 1' is required to ensure gathering statistics in parallel and for use with the phrase hidden):
FOR ALL [INDEXED | HIDDEN] COLUMNS [SIZE integer]
This value is passed to all of the individual tables. 
degree

Degree of parallelism (NULL means use table default value). 
granularity

Granularity of statistics to collect (only pertinent if the table is partitioned).
DEFAULT: Gather global- and partition-level statistics.
SUBPARTITION: Gather subpartition-level statistics.
PARTITION: Gather partition-level statistics.
GLOBAL: Gather global statistics.
ALL: Gather all (subpartition, partition, and global) statistics. 
cascade

Gather statistics on the indexes as well. Index statistics gathering is not parallelized. Using this option is equivalent to running the gather_index_stats procedure on each of the indexes in the database in addition to gathering table and column statistics. 
stattab

User stat table identifier describing where to save the current statistics.
The statistics table is assumed to reside in the same schema as the object being analyzed, so there must be one such table in each schema to use this option. 
statid

Identifier (optional) to associate with these statistics within stattab
options

Further specification of which objects to gather statistics for:
GATHER STALE: Gather statistics on stale objects as determined by looking at the *_tab_modifications views. Also, return a list of objects found to be stale.
GATHER EMPTY: Gather statistics on objects which currently have no statistics. also, return a list of objects found to have no statistics.
LIST STALE: Return list of stale objects as determined by looking at the *_tab_modifications views.
LIST EMPTY: Return list of objects which currently have no statistics. 
objlist

List of objects found to be stale or empty. 
statown

Schema containing stattab (if different than ownname). 




Exceptions
ORA-20000: Insufficient privileges.
ORA-20001: Bad input value.


In order to use the DBMS_STATS Package, the user needs to get the execute privilege. Run the following as SYS:
    grant execute on DBMS_STATS to user_xyz;
Then,
   Exec DBMS_STATS.GATHER_TABLE_STATS(‘schema_name’,table_name’,[..])


Points about CBO
  1. Tuned SQL cannot be magically improved by CBO.
  2. Best results with CBO are seen with the untuned applications.
  3. The CBO cannot distinguish grossly uneven key data spreads. If your data is not uniformly distributed, use inline hints to assist the execution plans
  4. The CBO will be used at least one table is involved in SQL that has been analyzed.
  5. When a table is analyzed, so are all associated indexes. If an index is subsequently dropped and recreated, it must be reanalyzed.


In closing
                For a well tuned SQL application, what ever the optimizer, will returns the same result. CBO is only proven better, for the un-tuned SQL applications. 

0 comments:

Post a Comment

http://helpallworld.blogspot.com/
VOTE US?