AWR mining for performance trend analysis

Preamble

Following a performance issue we had on a BI environment, we extracted from Automatic Workload Repository (AWR) reports a few SQL statements that were running for tens of minutes, not to say multiple hours. Before the mess started we had a hardware storage issue (I/O switches) that triggered additional I/Os to recover the situation. In parallel, the application team, which got no feedback that the situation was being recovered, started to reorganize tables to try to lower the High Water Mark and increase performance. Overall the effect was the opposite: when the storage issues were completely resolved we did not get back the exact performance we had before it all started.

The big question you must answer is: how was the database behaving before the issue, and has the execution time of those SQL statements changed?

Ideally you would have a baseline taken when performance was good, and you would be able to compare against it, as we have already seen in this blog post. Another option is to mine the AWR tables to see how SQL statements have diverged over time. For this you obviously need historical AWR snapshots, so it is a must to keep at least 15 days of history (not to say one month) and change the default retention of 7 days, which is too low. Example with a 1-hour snapshot interval and 30 days of history (both parameters are expressed in minutes):

SQL> EXEC dbms_workload_repository.modify_snapshot_settings (INTERVAL=>60, retention=>43200);
 
PL/SQL procedure successfully completed.
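To double-check the settings currently in force, you can query DBA_HIST_WR_CONTROL, whose SNAP_INTERVAL and RETENTION columns are INTERVAL DAY TO SECOND values:

```sql
-- Current AWR snapshot interval and retention for this database
SELECT snap_interval, retention
FROM   dba_hist_wr_control;
```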

Checking which SQL statements diverge is in itself very interesting information, and can trigger nice discoveries in your batch job scheduling (too early, too late, ...) and/or about jobs running in parallel, even if you are not stuck in a performance issue situation.

So far, testing has been done on Oracle Enterprise Edition 11.2.0.4 with the Tuning and Diagnostics packs, running on Red Hat Linux release 6.4 64-bit. I will complement this post with other releases and operating systems in the future...

Parallel downgrades

If you are using Cloud Control, one thing you have surely noticed (thanks to the red arrow) on the SQL Monitoring page is the decrease in allocated parallel processes when your server is overloaded or sessions are using parallelism excessively:

awr_mining01

This can also be seen at SQL level with something like:

SQL> SET lines 150 pages 1000
SQL> SELECT
  sql_id,
  sql_exec_id,
  TO_CHAR(sql_exec_start,'dd-mon-yyyy hh24:mi:ss') AS sql_exec_start,
  ROUND(elapsed_time/1000000) AS "Elapsed(s)",
  px_servers_requested AS "Requested DOP",
  px_servers_allocated AS "Allocated DOP",
  ROUND(cpu_time/1000000) AS "CPU(s)",
  buffer_gets AS "Buffer Gets",
  ROUND(physical_read_bytes /(1024*1024)) AS "Phys reads(MB)",
  ROUND(physical_write_bytes/(1024*1024)) AS "Phys writes(MB)",
  ROUND(user_io_wait_time/1000000) AS "IO Wait(s)"
FROM v$sql_monitor
WHERE px_servers_requested<>px_servers_allocated
ORDER BY sql_exec_start,sql_id;
 
SQL_ID        SQL_EXEC_ID SQL_EXEC_START       Elapsed(s) Requested DOP Allocated DOP     CPU(s) Buffer Gets Phys reads(MB) Phys writes(MB) IO Wait(s)
------------- ----------- -------------------- ---------- ------------- ------------- ---------- ----------- -------------- --------------- ----------
262dzg4ab75nt    16777216 02-dec-2016 15:22:34        263            32             0         10      511923           3625             810        250
883v5mk5bqwq8    16777216 02-dec-2016 15:23:43         41            64            14          8      156832           1426               0         27
7rykdg0zdyjz5    16777216 02-dec-2016 15:24:16        159            32             0          9      560335            802             856        151
amzmxuns5dctz    16777216 02-dec-2016 15:24:59        114            32             0          5       36383            569             548        108
414x9b7p4z10x    16777280 02-dec-2016 15:26:30          0            16            14          0           5              0               0          0
414x9b7p4z10x    16777281 02-dec-2016 15:26:31          0            16            10          0           5              0               0          0
414x9b7p4z10x    16777282 02-dec-2016 15:26:31          0            16            14          0           5              0               0          0
3472f0m6nm343    16778004 02-dec-2016 15:26:33          0            32            14          0         381              0               0          0
414x9b7p4z10x    16777283 02-dec-2016 15:26:35          0            16            14          0           5              0               0          0
3472f0m6nm343    16778005 02-dec-2016 15:26:40          0            32            14          0         381              0               0          0
3472f0m6nm343    16778006 02-dec-2016 15:26:46          0            32            14          0         381              0               0          0
 
11 rows selected.

The problem is that V$SQL_MONITOR has no historical version, so if you come onto the database too late you will not be able to get any past information from it...

What you can get is the overall parallelism situation of your database with:

SQL> SET lines 150 pages 1000
SQL> SELECT name, VALUE
FROM V$SYSSTAT
WHERE LOWER(name) LIKE '%parallel%'
ORDER BY 1;
 
NAME                                                                  VALUE
---------------------------------------------------------------- ----------
DBWR parallel query checkpoint buffers written                      1132544
DDL statements parallelized                                            1864
DFO trees parallelized                                               295265
DML statements parallelized                                              21
Parallel operations downgraded 1 to 25 pct                            55785
Parallel operations downgraded 25 to 50 pct                           12427
Parallel operations downgraded 50 to 75 pct                           78033
Parallel operations downgraded 75 to 99 pct                            8125
Parallel operations downgraded to serial                             150542
Parallel operations not downgraded                                   241815
queries parallelized                                                 208744
 
11 rows selected.

V$SYSSTAT only shows cumulative values since instance startup, but this view does have a historical equivalent, DBA_HIST_SYSSTAT, so you can get the (still cumulative) values recorded at each AWR snapshot:

SQL> SET lines 150 pages 1000
SQL> SELECT
  TO_CHAR(hsn.begin_interval_time,'dd-mon-yyyy hh24:mi:ss') AS begin_interval_time,
  TO_CHAR(hsn.end_interval_time,'dd-mon-yyyy hh24:mi:ss') AS end_interval_time,
  hsy.stat_name,
  hsy.VALUE
FROM dba_hist_sysstat hsy, dba_hist_snapshot hsn
WHERE hsy.snap_id = hsn.snap_id
AND hsy.instance_number = hsn.instance_number
AND LOWER(hsy.stat_name) LIKE '%parallel%'
ORDER BY hsn.snap_id DESC;
 
 
BEGIN_INTERVAL_TIME  END_INTERVAL_TIME    STAT_NAME                                                             VALUE
-------------------- -------------------- ---------------------------------------------------------------- ----------
30-nov-2016 23:00:18 01-dec-2016 00:00:48 queries parallelized                                                 182440
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations downgraded 25 to 50 pct                           10602
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations downgraded 1 to 25 pct                            48914
30-nov-2016 23:00:18 01-dec-2016 00:00:48 DML statements parallelized                                              17
30-nov-2016 23:00:18 01-dec-2016 00:00:48 DDL statements parallelized                                            1376
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations downgraded 50 to 75 pct                           56565
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations downgraded 75 to 99 pct                            7782
30-nov-2016 23:00:18 01-dec-2016 00:00:48 DBWR parallel query checkpoint buffers written                       936382
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations not downgraded                                   225266
30-nov-2016 23:00:18 01-dec-2016 00:00:48 DFO trees parallelized                                               248642
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations downgraded to serial                             117995
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations downgraded 50 to 75 pct                           56500
30-nov-2016 22:00:52 30-nov-2016 23:00:18 DFO trees parallelized                                               248352
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations not downgraded                                   225118
30-nov-2016 22:00:52 30-nov-2016 23:00:18 DBWR parallel query checkpoint buffers written                       919901
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations downgraded 75 to 99 pct                            7780
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations downgraded 25 to 50 pct                           10542
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations downgraded 1 to 25 pct                            48899
30-nov-2016 22:00:52 30-nov-2016 23:00:18 DML statements parallelized                                              17
30-nov-2016 22:00:52 30-nov-2016 23:00:18 queries parallelized                                                 182170
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations downgraded to serial                             117847
30-nov-2016 22:00:52 30-nov-2016 23:00:18 DDL statements parallelized                                            1356
.

With the LAG analytic function you can even get the trend of one particular system statistic. I have chosen Parallel operations downgraded to serial, which means all queries that have moved from parallel to serial execution; on some of them you might expect a big performance penalty:

SQL> SET lines 150 pages 1000
SQL> SELECT
  TO_CHAR(hsn.begin_interval_time,'dd-mon-yyyy hh24:mi:ss') AS begin_interval_time,
  TO_CHAR(hsn.end_interval_time,'dd-mon-yyyy hh24:mi:ss') AS end_interval_time,
  hsy.stat_name,
  hsy.VALUE - hsy.prev_value AS VALUE
  FROM (SELECT snap_id,instance_number,stat_name,VALUE,
               -- partition by instance so deltas are not mixed across RAC instances
               LAG(VALUE,1,VALUE) OVER (PARTITION BY instance_number ORDER BY snap_id) AS prev_value
        FROM dba_hist_sysstat
        WHERE stat_name = 'Parallel operations downgraded to serial') hsy,
        dba_hist_snapshot hsn
WHERE hsy.snap_id = hsn.snap_id
AND hsy.instance_number = hsn.instance_number
AND hsy.VALUE - hsy.prev_value<>0
ORDER BY hsn.snap_id DESC;
 
BEGIN_INTERVAL_TIME  END_INTERVAL_TIME    STAT_NAME                                                             VALUE
-------------------- -------------------- ---------------------------------------------------------------- ----------
30-nov-2016 23:00:18 01-dec-2016 00:00:48 Parallel operations downgraded to serial                                148
30-nov-2016 22:00:52 30-nov-2016 23:00:18 Parallel operations downgraded to serial                                264
30-nov-2016 21:00:15 30-nov-2016 22:00:52 Parallel operations downgraded to serial                                160
30-nov-2016 20:00:42 30-nov-2016 21:00:15 Parallel operations downgraded to serial                                 65
30-nov-2016 19:00:25 30-nov-2016 20:00:42 Parallel operations downgraded to serial                                 31
30-nov-2016 18:00:29 30-nov-2016 19:00:25 Parallel operations downgraded to serial                                  8
30-nov-2016 17:00:57 30-nov-2016 18:00:29 Parallel operations downgraded to serial                               1789
30-nov-2016 16:00:11 30-nov-2016 17:00:57 Parallel operations downgraded to serial                                588
30-nov-2016 15:00:37 30-nov-2016 16:00:11 Parallel operations downgraded to serial                               1214
30-nov-2016 14:01:07 30-nov-2016 15:00:37 Parallel operations downgraded to serial                                856
30-nov-2016 09:00:24 30-nov-2016 10:00:26 Parallel operations downgraded to serial                                603
30-nov-2016 07:00:03 30-nov-2016 08:00:22 Parallel operations downgraded to serial                                  6
30-nov-2016 06:00:42 30-nov-2016 07:00:03 Parallel operations downgraded to serial                                502
30-nov-2016 05:00:24 30-nov-2016 06:00:42 Parallel operations downgraded to serial                                  8
30-nov-2016 04:00:04 30-nov-2016 05:00:24 Parallel operations downgraded to serial                               3032
30-nov-2016 03:00:37 30-nov-2016 04:00:04 Parallel operations downgraded to serial                               1161
30-nov-2016 02:00:39 30-nov-2016 03:00:37 Parallel operations downgraded to serial                               1243
30-nov-2016 01:01:04 30-nov-2016 02:00:39 Parallel operations downgraded to serial                                492
30-nov-2016 00:01:05 30-nov-2016 01:01:04 Parallel operations downgraded to serial                                 51
29-nov-2016 23:00:00 30-nov-2016 00:01:05 Parallel operations downgraded to serial                                 94
29-nov-2016 22:00:09 29-nov-2016 23:00:00 Parallel operations downgraded to serial                               7150
29-nov-2016 21:00:07 29-nov-2016 22:00:09 Parallel operations downgraded to serial                                167
29-nov-2016 20:00:33 29-nov-2016 21:00:07 Parallel operations downgraded to serial                                124
29-nov-2016 19:00:41 29-nov-2016 20:00:33 Parallel operations downgraded to serial                                157
29-nov-2016 18:00:22 29-nov-2016 19:00:41 Parallel operations downgraded to serial                                820
29-nov-2016 17:00:59 29-nov-2016 18:00:22 Parallel operations downgraded to serial                                 10
29-nov-2016 16:00:29 29-nov-2016 17:00:59 Parallel operations downgraded to serial                                 46
29-nov-2016 15:00:59 29-nov-2016 16:00:29 Parallel operations downgraded to serial                                761
29-nov-2016 14:00:23 29-nov-2016 15:00:59 Parallel operations downgraded to serial                              10342
29-nov-2016 11:00:08 29-nov-2016 12:00:14 Parallel operations downgraded to serial                                  8
29-nov-2016 09:00:31 29-nov-2016 10:00:45 Parallel operations downgraded to serial                                163
29-nov-2016 08:00:33 29-nov-2016 09:00:31 Parallel operations downgraded to serial                                 47
29-nov-2016 07:00:07 29-nov-2016 08:00:33 Parallel operations downgraded to serial                              10920
29-nov-2016 06:00:17 29-nov-2016 07:00:07 Parallel operations downgraded to serial                               1409
.
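The delta extraction that LAG performs boils down to subtracting each snapshot's cumulative counter from the previous one. A minimal Python sketch of the same logic, with made-up snapshot values:

```python
# Cumulative "Parallel operations downgraded to serial" counter, one entry
# per AWR snapshot in snap_id order (made-up values for illustration).
snapshots = [117500, 117847, 117995, 118020]

# LAG(value, 1, value) OVER (ORDER BY snap_id): previous row's value,
# defaulting to the current one for the first row.
prev = [snapshots[0]] + snapshots[:-1]

# Per-interval downgrades, keeping non-zero deltas only (the query's
# hsy.value - hsy.prev_value <> 0 filter).
deltas = [cur - old for cur, old in zip(snapshots, prev) if cur != old]
print(deltas)  # -> [347, 148, 25]
```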

To be honest I was not expecting such high figures, as for a few 1-hour intervals I got more than ten thousand queries downgraded to serial (!!). The problem is, I think, coming from the maximum number of parallel processes, which we have capped to contain an application issue:

SQL> show parameter parallel_max_servers
 
NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_max_servers                 INTEGER     60

Unstable execution time

The goal of this chapter is to identify queries that have a highly divergent execution time. The table to use here is DBA_HIST_SQLSTAT, which contains tons of very useful information. In this view, never ever use the xx_TOTAL columns: if your statement has been aged out of the library cache the cumulative value restarts from 0, so stick to the xx_DELTA columns, which store the difference between two snapshots. Discussing with teammates, I decided to use the statistical function called standard deviation, available by default in Oracle as the STDDEV analytic function. In plain English, standard deviation measures how much values are spread around their average (it is the square root of the average squared difference to the average). Here below I have chosen to keep sql_ids that have a maximum average elapsed time above 5 minutes and where the standard deviation is more than twice the minimum execution time, to keep only extreme values:

SQL> SET lines 150 pages 1000
SQL> WITH sql_id_stdded AS (SELECT
  sql_id,
  SUM(total_exec) AS total_exec,
  ROUND(MIN(avg_elapsed_time),2) AS min_elapsed_time,
  ROUND(MAX(avg_elapsed_time),2) AS max_elapsed_time,
  ROUND(stddev_elapsed_time,2) AS stddev_elapsed_time
  FROM (SELECT
          sql_id,
          total_exec,
          avg_elapsed_time,
          STDDEV(avg_elapsed_time) OVER(PARTITION BY sql_id) AS stddev_elapsed_time
        FROM (SELECT
                hsq.sql_id,
                hsq.plan_hash_value,
                SUM(NVL(hsq.executions_delta,0)) AS total_exec,
                SUM(hsq.elapsed_time_delta)/DECODE(SUM(NVL(hsq.executions_delta,0)),0,1,SUM(hsq.executions_delta))/1000000 AS avg_elapsed_time
              FROM dba_hist_sqlstat hsq, dba_hist_snapshot hsn
              WHERE hsq.snap_id = hsn.snap_id
              AND hsq.instance_number = hsn.instance_number
              AND hsq.executions_delta > 0
              GROUP BY hsq.sql_id, hsq.plan_hash_value))
        GROUP BY sql_id, stddev_elapsed_time)
SELECT
  sql_id,
  total_exec,
  TO_CHAR(CAST(NUMTODSINTERVAL(min_elapsed_time,'second') AS INTERVAL DAY(2) TO SECOND(0))) AS min_elapsed_time,
  TO_CHAR(CAST(NUMTODSINTERVAL(max_elapsed_time,'second') AS INTERVAL DAY(2) TO SECOND(0))) AS max_elapsed_time,
  TO_CHAR(CAST(NUMTODSINTERVAL(stddev_elapsed_time,'second') AS INTERVAL DAY(2) TO SECOND(0))) AS stddev_elapsed_time
FROM sql_id_stdded
WHERE stddev_elapsed_time>2*min_elapsed_time
AND max_elapsed_time>5*60
AND total_exec>1
ORDER BY stddev_elapsed_time DESC;
 
SQL_ID        TOTAL_EXEC MIN_ELAPSED_T MAX_ELAPSED_T STDDEV_ELAPSE
------------- ---------- ------------- ------------- -------------
6hds16zkc8cgm          9 +00 00:09:04  +00 05:29:46  +00 01:49:56
76g9pn3z3a35u          8 +00 00:07:23  +00 02:47:35  +00 01:13:09
6w05thcf4w6pm          2 +00 00:02:30  +00 01:38:45  +00 01:08:04
bmtkpjynyhs88          8 +00 00:02:19  +00 02:38:59  +00 01:08:00
7k9cjk1q686f2         10 +00 00:09:38  +00 02:20:49  +00 00:53:56
56xyy7uq071g6          4 +00 00:08:56  +00 01:30:50  +00 00:44:57
cvbq6vqk8dbf3         13 +00 00:01:36  +00 02:25:31  +00 00:44:04
g5khnky3q36m1         18 +00 00:07:37  +00 01:38:04  +00 00:37:19
cru4zku27tv0p          2 +00 00:00:55  +00 00:47:09  +00 00:32:42
2m2ww08p9btvc          2 +00 00:09:50  +00 00:48:35  +00 00:27:24
08kdqk1abqm0v          2 +00 00:07:34  +00 00:45:43  +00 00:26:59
12txp882z4ucy         16 +00 00:04:17  +00 00:57:28  +00 00:22:17
1gta6uu65u8nw         14 +00 00:06:16  +00 00:48:53  +00 00:20:15
gcdtt7t6rf0hg         16 +00 00:02:59  +00 00:51:18  +00 00:18:29
3kffqq3kwta74          3 +00 00:00:43  +00 00:32:37  +00 00:18:13
1nqaj68tn9xp2          2 +00 00:05:46  +00 00:28:15  +00 00:15:54
b8pyc5puhh9c7          8 +00 00:00:49  +00 00:29:44  +00 00:15:40
2svyb8a5n6qb5         10 +00 00:00:56  +00 00:45:44  +00 00:15:31
8zsu7t63hj1zp         14 +00 00:03:46  +00 00:33:57  +00 00:14:13
1tdc0da6km50h         67 +00 00:01:09  +00 00:30:28  +00 00:14:04
3uvcgnm27xf1c          3 +00 00:06:21  +00 00:33:03  +00 00:13:22
ch3xpjhf3y2bf         11 +00 00:03:51  +00 00:22:21  +00 00:13:05
fdcpq4j8kxd49         13 +00 00:04:31  +00 00:22:50  +00 00:12:57
csctx8ttu58d9         25 +00 00:00:30  +00 00:26:47  +00 00:12:15
dv1cvuw300ny4         11 +00 00:03:14  +00 00:28:38  +00 00:10:40
gsx2423tf2a7f          7 +00 00:04:35  +00 00:29:30  +00 00:10:33
62nt4sxdsy586         19 +00 00:02:29  +00 00:27:16  +00 00:10:29
1jb164khmsyzj         60 +00 00:00:40  +00 00:19:42  +00 00:09:15
6mgwwp95jaxzz         13 +00 00:03:56  +00 00:15:46  +00 00:08:22
8v38gy0u6a2mk          5 +00 00:01:02  +00 00:15:50  +00 00:08:20
9mt8p7z1wjjkn          9 +00 00:02:12  +00 00:17:34  +00 00:08:09
byd0g9xfpwcuj          2 +00 00:00:10  +00 00:11:15  +00 00:07:51
7pm2wxdvu3mc7         11 +00 00:02:55  +00 00:13:27  +00 00:07:26
4u50sp3h1szcp          5 +00 00:02:52  +00 00:11:38  +00 00:06:12
7430xmabdv8av          3 +00 00:02:25  +00 00:12:17  +00 00:05:01
cd3xmx3sk7vrv         21 +00 00:00:22  +00 00:05:06  +00 00:03:21
 
36 rows selected.

If we take the first sql_id, the query says the minimum execution time is 9 minutes and 4 seconds while the maximum is 5 hours, 29 minutes and 46 seconds. What an amazing difference!
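As a sanity check on the filtering logic, the STDDEV criterion can be reproduced in plain Python. Oracle's STDDEV computes the sample standard deviation, which is statistics.stdev; the per-plan average elapsed times below are made up for the example:

```python
import statistics

# Average elapsed time in seconds per (snapshot, plan) for one sql_id
# (made-up values for illustration).
avg_elapsed_times = [544, 5673, 7671, 7906, 8640, 9316, 15650, 29582]

min_t = min(avg_elapsed_times)
max_t = max(avg_elapsed_times)
# Sample standard deviation, same definition as Oracle's STDDEV
stddev_t = statistics.stdev(avg_elapsed_times)

# The query's filter: spread greater than twice the best case,
# and a worst case longer than 5 minutes.
is_unstable = stddev_t > 2 * min_t and max_t > 5 * 60
print(is_unstable)  # -> True
```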

To focus on this sql_id you might use something like the query below. It is strongly suggested to execute it in a graphical query tool, because it is difficult to read anything in pure command line. All timings are in seconds while sizes are in megabytes:

SQL> SELECT
    hsq.sql_id,
    hsq.plan_hash_value,
    NVL(SUM(hsq.executions_delta),0) AS total_exec,
    ROUND(SUM(hsq.elapsed_time_delta)/1000000,2) AS elapsed_time_total,
    ROUND(SUM(hsq.px_servers_execs_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta)),2) AS avg_px_servers_execs,
    ROUND(SUM(hsq.elapsed_time_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta))/1000000,2) AS avg_elapsed_time,
    ROUND(SUM(hsq.cpu_time_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta))/1000000,2) AS avg_cpu_time,
    ROUND(SUM(hsq.iowait_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta))/1000000,2) AS avg_iowait,
    ROUND(SUM(hsq.clwait_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta))/1000000,2) AS avg_cluster_wait,
    ROUND(SUM(hsq.apwait_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta))/1000000,2) AS avg_application_wait,
    ROUND(SUM(hsq.ccwait_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta))/1000000,2) AS avg_concurrency_wait,
    ROUND(SUM(hsq.rows_processed_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta)),2) AS avg_rows_processed,
    ROUND(SUM(hsq.buffer_gets_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta)),2) AS avg_buffer_gets,
    ROUND(SUM(hsq.disk_reads_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta)),2) AS avg_disk_reads,
    ROUND(SUM(hsq.direct_writes_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta)),2) AS avg_direct_writes,
    ROUND(SUM(hsq.io_interconnect_bytes_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta))/(1024*1024),0) AS avg_io_interconnect_mb,
    ROUND(SUM(hsq.physical_read_requests_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta)),0) AS avg_phys_read_requests,
    ROUND(SUM(hsq.physical_read_bytes_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta))/(1024*1024),0) AS avg_phys_read_mb,
    ROUND(SUM(hsq.physical_write_requests_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta)),0) AS avg_phys_write_requests,
    ROUND(SUM(hsq.physical_write_bytes_delta)/DECODE(SUM(hsq.executions_delta),0,NULL,SUM(hsq.executions_delta))/(1024*1024),0) AS avg_phys_write_mb
FROM dba_hist_sqlstat hsq
WHERE hsq.sql_id='6hds16zkc8cgm'
GROUP BY hsq.sql_id, hsq.plan_hash_value;
 
SQL_ID        PLAN_HASH_VALUE TOTAL_EXEC ELAPSED_TIME_TOTAL AVG_PX_SERVERS_EXECS AVG_ELAPSED_TIME AVG_CPU_TIME AVG_IOWAIT AVG_CLUSTER_WAIT AVG_APPLICATION_WAIT AVG_CONCURRENCY_WAIT AVG_ROWS_PROCESSED AVG_BUFFER_GETS AVG_DISK_READS AVG_DIRECT_WRITES AVG_IO_INTERCONNECT_MB AVG_PHYS_READ_REQUESTS AVG_PHYS_READ_MB AVG_PHYS_WRITE_REQUESTS AVG_PHYS_WRITE_MB
------------- --------------- ---------- ------------------ -------------------- ---------------- ------------ ---------- ---------------- -------------------- -------------------- ------------------ --------------- -------------- ----------------- ---------------------- ---------------------- ---------------- ----------------------- -----------------
6hds16zkc8cgm       705417430          2           15342.95                   16          7671.48        42.59    7521.09                0                 3.14                77.09                959         7948659       739317.5                 0                      3                    187                3                       0                 0
6hds16zkc8cgm      2195747324          2            1087.57                   32           543.78         43.8     354.06                0                  1.6               140.49                961       6115419.5       739927.5                 0                      6                    299                6                       0                 0
6hds16zkc8cgm      4190635369          1           29581.68                   32         29581.68        44.39   29366.99                0                    0               131.88               1226         6573306         709640                 0                      7                    350                7                       0                 0
6hds16zkc8cgm      1445916266          1            8639.51                   12          8639.51        37.77    8538.27                0                 7.14                50.05                919         4620163         698482                 0                      7                    342                7                       0                 0
6hds16zkc8cgm      3246633284          1           15650.16                   12         15650.16        47.34   15533.76                0                  .42                51.99               1364         7318402         731331                 0                      1                     73                1                       0                 0
6hds16zkc8cgm       463109863          1            7905.76                   16          7905.76        41.31    7761.58                0                    0                72.88                890         4709496         756848                 0                      7                    330                7                       0                 0
6hds16zkc8cgm      2566346142          1            9315.98                   27          9315.98        53.88    9131.03                0                  5.2                93.75               1394        11788790         754218                 0                      5                    211                5                       0                 0
6hds16zkc8cgm      2387297713          1            5672.93                   12          5672.93        43.45    5565.33                0                27.04                35.22               1074         8563338         697194                 0                      0                     31                0                       0                 0
 
8 rows selected.

The PLAN_HASH_VALUE column is the one to use to understand whether two SQL plans are identical, rather than comparing plans line by line. Here above we see that the plan of this same SQL statement is almost never the same, which could explain the huge difference in response time. We also see huge differences in average I/O wait time, from 354 seconds to 29,366 seconds...

To display all the different plans and start to dig into them you can use (I'm not displaying mine as it would be too long):

SELECT * FROM TABLE(dbms_xplan.display_awr(sql_id=>'6hds16zkc8cgm',format=>'all allstats'));

You should correlate the above figures with what you can find in DBA_HIST_ACTIVE_SESS_HISTORY. Do not use the TIME_WAITED column of this view; instead count 10 seconds per row. 10 seconds because DBA_HIST_ACTIVE_SESS_HISTORY keeps one sample out of ten from V$ACTIVE_SESSION_HISTORY, which itself samples active sessions every second (that's why you would simply use COUNT(*) if selecting from V$ACTIVE_SESSION_HISTORY):

SQL> SET lines 150 pages 1000
SQL> col wait_class FOR a15
SQL> col event FOR a35
SQL> SELECT
     sql_id, sql_plan_hash_value, actual_dop, TO_CHAR(sql_exec_start,'dd-mon-yyyy hh24:mi:ss') AS sql_exec_start,
     wait_class, event, COUNT(*)*10 AS "time_waited (s)"
     FROM
     (SELECT sql_id,sql_plan_hash_value,TRUNC(px_flags / 2097152) AS actual_dop,
      sql_exec_start,
      DECODE(NVL(wait_class,'ON CPU'),'ON CPU',DECODE(session_type,'BACKGROUND','BCPU','CPU'),wait_class) AS wait_class,
      NVL(event,'ON CPU') AS event
      FROM dba_hist_active_sess_history
      WHERE sql_id='6hds16zkc8cgm') a
     GROUP BY sql_id, sql_plan_hash_value, actual_dop, sql_exec_start, wait_class,event
     ORDER BY sql_exec_start, sql_plan_hash_value, wait_class,event;
 
SQL_ID        SQL_PLAN_HASH_VALUE ACTUAL_DOP SQL_EXEC_START       WAIT_CLASS      EVENT                               time_waited (s)
------------- ------------------- ---------- -------------------- --------------- ----------------------------------- ---------------
6hds16zkc8cgm          4190635369         16 01-dec-2016 15:38:50 CPU             ON CPU                                           10
6hds16zkc8cgm          4190635369          0 01-dec-2016 15:38:50 Other           reliable message                                 20
6hds16zkc8cgm          4190635369         16 01-dec-2016 15:38:50 USER I/O        db file parallel read                          1830
6hds16zkc8cgm          4190635369         16 01-dec-2016 15:38:50 USER I/O        db file sequential read                         160
6hds16zkc8cgm          4190635369         16 01-dec-2016 15:38:50 USER I/O        direct path read                              27030
6hds16zkc8cgm          4190635369         16 01-dec-2016 15:38:50 USER I/O        read by other session                           180
6hds16zkc8cgm          2566346142          0 02-dec-2016 16:40:27 Application     enq: KO - fast object checkpoint                 10
6hds16zkc8cgm          2566346142         14 02-dec-2016 16:40:27 CPU             ON CPU                                          300
6hds16zkc8cgm          2566346142          0 02-dec-2016 16:40:27 Configuration   log buffer space                                 10
6hds16zkc8cgm          2566346142         14 02-dec-2016 16:40:27 USER I/O        db file sequential read                          10
6hds16zkc8cgm          2566346142         14 02-dec-2016 16:40:27 USER I/O        direct path read                               8240
6hds16zkc8cgm          2566346142         14 02-dec-2016 16:40:27 USER I/O        read by other session                           650
6hds16zkc8cgm          2195747324         16 16-nov-2016 16:17:50 CPU             ON CPU                                           20
6hds16zkc8cgm          2195747324         16 16-nov-2016 16:17:50 USER I/O        db file sequential read                         300
6hds16zkc8cgm          1759484988            17-nov-2016 18:11:25 CPU             ON CPU                                           30
6hds16zkc8cgm          1759484988          0 17-nov-2016 18:11:25 Other           reliable message                                 10
6hds16zkc8cgm          1759484988            17-nov-2016 18:11:25 USER I/O        db file parallel read                           140
6hds16zkc8cgm          1759484988            17-nov-2016 18:11:25 USER I/O        db file scattered read                           10
6hds16zkc8cgm          1759484988            17-nov-2016 18:11:25 USER I/O        db file sequential read                          70
6hds16zkc8cgm          2195747324            18-nov-2016 18:06:45 CPU             ON CPU                                           50
6hds16zkc8cgm          2195747324            18-nov-2016 18:06:45 USER I/O        db file parallel read                            30
6hds16zkc8cgm          2195747324            18-nov-2016 18:06:45 USER I/O        db file sequential read                          50
6hds16zkc8cgm          3246633284          0 19-nov-2016 16:19:49 Application     enq: KO - fast object checkpoint                 10
6hds16zkc8cgm          3246633284            19-nov-2016 16:19:49 CPU             ON CPU                                           20
6hds16zkc8cgm          3246633284            19-nov-2016 16:19:49 Scheduler       resmgr:cpu quantum                               10
6hds16zkc8cgm          3246633284            19-nov-2016 16:19:49 USER I/O        db file parallel read                           770
6hds16zkc8cgm          3246633284            19-nov-2016 16:19:49 USER I/O        db file sequential read                          30
6hds16zkc8cgm          2195747324            20-nov-2016 15:41:33 CPU             ON CPU                                           20
6hds16zkc8cgm          2195747324          0 20-nov-2016 15:41:33 Other           reliable message                                 10
6hds16zkc8cgm          2195747324            20-nov-2016 15:41:33 USER I/O        db FILE parallel read                           160
6hds16zkc8cgm          2195747324            20-nov-2016 15:41:33 USER I/O        db FILE sequential read                          20
6hds16zkc8cgm          2195747324            20-nov-2016 15:41:33 USER I/O        read BY other SESSION                            10
6hds16zkc8cgm          2195747324            21-nov-2016 15:49:06 CPU             ON CPU                                           20
6hds16zkc8cgm          2195747324          0 21-nov-2016 15:49:06 Other           reliable message                                 10
6hds16zkc8cgm          2195747324            21-nov-2016 15:49:06 USER I/O        db FILE parallel read                           140
6hds16zkc8cgm          2195747324            21-nov-2016 15:49:06 USER I/O        db FILE sequential read                          40
6hds16zkc8cgm           463109863          8 22-nov-2016 16:54:09 CPU             ON CPU                                           20
6hds16zkc8cgm           463109863          0 22-nov-2016 16:54:09 Other           reliable message                                 10
6hds16zkc8cgm           463109863          8 22-nov-2016 16:54:09 USER I/O        db FILE parallel read                           550
6hds16zkc8cgm           463109863          8 22-nov-2016 16:54:09 USER I/O        db FILE sequential read                          30
6hds16zkc8cgm           463109863          8 22-nov-2016 16:54:09 USER I/O        direct PATH read                               7120
6hds16zkc8cgm          2387297713          0 23-nov-2016 17:06:32 Application     enq: KO - fast object checkpoint                 20
6hds16zkc8cgm          2387297713          6 23-nov-2016 17:06:32 CPU             ON CPU                                           30
6hds16zkc8cgm          2387297713          0 23-nov-2016 17:06:32 Configuration   LOG buffer SPACE                                 10
6hds16zkc8cgm          2387297713          6 23-nov-2016 17:06:32 USER I/O        db FILE sequential read                          40
6hds16zkc8cgm          2387297713          6 23-nov-2016 17:06:32 USER I/O        direct PATH read                               5480
6hds16zkc8cgm          2387297713          6 23-nov-2016 17:06:32 USER I/O        read BY other SESSION                            60
6hds16zkc8cgm           705417430          6 24-nov-2016 16:48:13 CPU             ON CPU                                           60
6hds16zkc8cgm           705417430          6 24-nov-2016 16:48:13 Configuration   LOG buffer SPACE                                 20
6hds16zkc8cgm           705417430          6 24-nov-2016 16:48:13 USER I/O        db FILE sequential read                         400
6hds16zkc8cgm           705417430          6 24-nov-2016 16:48:13 USER I/O        direct PATH read                               5780
6hds16zkc8cgm           705417430          6 24-nov-2016 16:48:13 USER I/O        read BY other SESSION                           170
6hds16zkc8cgm          3246633284          6 25-nov-2016 15:39:23 CPU             ON CPU                                           70
6hds16zkc8cgm          3246633284          6 25-nov-2016 15:39:23 USER I/O        db FILE parallel read                          1980
6hds16zkc8cgm          3246633284          6 25-nov-2016 15:39:23 USER I/O        db FILE sequential read                          80
6hds16zkc8cgm          3246633284          6 25-nov-2016 15:39:23 USER I/O        direct PATH read                              13430
6hds16zkc8cgm          3246633284          6 25-nov-2016 15:39:23 USER I/O        read BY other SESSION                            10
6hds16zkc8cgm          2195747324         16 26-nov-2016 15:32:14 CPU             ON CPU                                           50
6hds16zkc8cgm          2195747324         16 26-nov-2016 15:32:14 USER I/O        db FILE sequential read                         140
6hds16zkc8cgm          2195747324         16 26-nov-2016 15:32:14 USER I/O        direct PATH read                                360
6hds16zkc8cgm          2195747324         16 26-nov-2016 15:32:14 USER I/O        read BY other SESSION                            10
6hds16zkc8cgm          2195747324         16 27-nov-2016 16:27:27 CPU             ON CPU                                           40
6hds16zkc8cgm          2195747324         16 27-nov-2016 16:27:27 USER I/O        direct PATH read                                280
6hds16zkc8cgm           705417430          0 28-nov-2016 16:42:39 Application     enq: KO - fast object checkpoint                 10
6hds16zkc8cgm           705417430         10 28-nov-2016 16:42:39 CPU             ON CPU                                           10
6hds16zkc8cgm           705417430         10 28-nov-2016 16:42:39 Concurrency     buffer busy waits                                90
6hds16zkc8cgm           705417430         10 28-nov-2016 16:42:39 Configuration   LOG buffer SPACE                                 10
6hds16zkc8cgm           705417430          0 28-nov-2016 16:42:39 Other           reliable message                                 10
6hds16zkc8cgm           705417430         10 28-nov-2016 16:42:39 USER I/O        direct PATH read                               8520
6hds16zkc8cgm          1445916266          0 29-nov-2016 16:33:40 Application     enq: KO - fast object checkpoint                 10
6hds16zkc8cgm          1445916266          6 29-nov-2016 16:33:40 CPU             ON CPU                                           30
6hds16zkc8cgm          1445916266          6 29-nov-2016 16:33:40 USER I/O        db FILE parallel read                           880
6hds16zkc8cgm          1445916266          6 29-nov-2016 16:33:40 USER I/O        db FILE sequential read                         120
6hds16zkc8cgm          1445916266          6 29-nov-2016 16:33:40 USER I/O        direct PATH read                               7520
6hds16zkc8cgm          1445916266          6 29-nov-2016 16:33:40 USER I/O        read BY other SESSION                            30
6hds16zkc8cgm           463109863          0 30-nov-2016 16:33:40 CPU             ON CPU                                           10
6hds16zkc8cgm           463109863         15 30-nov-2016 16:33:40 CPU             ON CPU                                           10
6hds16zkc8cgm           463109863         15 30-nov-2016 16:33:40 USER I/O        db FILE sequential read                         150
6hds16zkc8cgm           463109863         15 30-nov-2016 16:33:40 USER I/O        direct PATH read                                190
6hds16zkc8cgm           463109863          0                      CPU             ON CPU                                           20
6hds16zkc8cgm           463109863                                 CPU             ON CPU                                           10
6hds16zkc8cgm           463109863          0                      Concurrency     CURSOR: pin S wait ON X                         430
6hds16zkc8cgm           463109863          0                      Concurrency     LIBRARY cache LOCK                               10
6hds16zkc8cgm           705417430          0                      CPU             ON CPU                                           10
6hds16zkc8cgm           705417430                                 CPU             ON CPU                                           10
6hds16zkc8cgm           705417430          0                      Concurrency     CURSOR: pin S wait ON X                          90
6hds16zkc8cgm           705417430          0                      Concurrency     LIBRARY cache LOCK                               20
6hds16zkc8cgm          1445916266                                 CPU             ON CPU                                           10
6hds16zkc8cgm          2195747324                                 CPU             ON CPU                                           40
6hds16zkc8cgm          2195747324          0                      CPU             ON CPU                                           10
6hds16zkc8cgm          2195747324          0                      Concurrency     CURSOR: pin S wait ON X                         310
6hds16zkc8cgm          2387297713          0                      CPU             ON CPU                                           10
6hds16zkc8cgm          2387297713          0                      Concurrency     CURSOR: pin S wait ON X                         110
6hds16zkc8cgm          3246633284                                 CPU             ON CPU                                           10
6hds16zkc8cgm          3246633284          0                      CPU             ON CPU                                           10
6hds16zkc8cgm          3246633284          0                      Concurrency     CURSOR: pin S wait ON X                         110
6hds16zkc8cgm          3246633284                                 USER I/O        db FILE sequential read                          10
6hds16zkc8cgm          4190635369          0                      CPU             ON CPU                                           10
6hds16zkc8cgm          4190635369          0                      Concurrency     CURSOR: pin S wait ON X                         310
 
99 ROWS selected.

We see that the number one wait event while executing our query is direct path read. This wait event is new in 11g, where Oracle decided to bypass the buffer cache and read directly into the PGA for large tables that do not fit in the SGA. Overall it is not really an issue in itself; the SQL behind it must be tuned to favor more optimal choices. See the Direct path chapter for more information.

Direct path

On one of our databases we had the below Top Foreground wait events:

awr_mining02

Even if you will immediately focus on SQL tuning, you might first want to know which SQLs are mainly responsible for this:

SQL> SET lines 150 pages 1000
SQL> col wait_class FOR a10
SQL> col event FOR a25
SQL> SELECT
     sql_id, sql_plan_hash_value, TO_CHAR(sql_exec_start,'dd-mon-yyyy hh24:mi:ss') AS sql_exec_start,
     wait_class, event, COUNT(*)*10 AS "time_waited (s)"
     FROM
     (SELECT sql_id,sql_plan_hash_value,sql_exec_start,
      DECODE(NVL(wait_class,'ON CPU'),'ON CPU',DECODE(session_type,'BACKGROUND','BCPU','CPU'),wait_class) AS wait_class,
      NVL(event,'ON CPU') AS event
      FROM dba_hist_active_sess_history
      WHERE event LIKE '%direct%') a
     GROUP BY sql_id, sql_plan_hash_value, sql_exec_start, wait_class,event
     ORDER BY 6 DESC;
 
SQL_ID        SQL_PLAN_HASH_VALUE SQL_EXEC_START       WAIT_CLASS EVENT                     time_waited (s)
------------- ------------------- -------------------- ---------- ------------------------- ---------------
amzmxuns5dctz          1006290636 21-nov-2016 15:30:51 USER I/O   direct PATH read temp              117670
amzmxuns5dctz          2716462349 30-nov-2016 15:28:23 USER I/O   direct PATH read temp              101540
amzmxuns5dctz          2716462349 03-dec-2016 15:40:04 USER I/O   direct PATH read temp               96800
                                0                      USER I/O   direct PATH read                    90720
amzmxuns5dctz          1006290636 25-nov-2016 06:39:34 USER I/O   direct PATH WRITE temp              50740
f92r4f37kn015          2776764876 30-nov-2016 03:22:25 USER I/O   direct PATH WRITE temp              48010
114w500wy5039          2301194886 04-dec-2016 16:34:14 USER I/O   direct PATH WRITE temp              45480
cvbq6vqk8dbf3           720130659 25-nov-2016 15:42:05 USER I/O   direct PATH read                    36520
amzmxuns5dctz          2716462349 26-nov-2016 15:40:00 USER I/O   direct PATH read temp               33610
f92r4f37kn015          2776764876 24-nov-2016 00:57:35 USER I/O   direct PATH WRITE temp              33300
b2cwuxca44yw8          1919045833 01-dec-2016 16:30:03 USER I/O   direct PATH read                    31890
amzmxuns5dctz          2716462349 30-nov-2016 06:35:58 USER I/O   direct PATH read temp               31830
amzmxuns5dctz          1006290636 24-nov-2016 15:10:44 USER I/O   direct PATH read temp               28210
amzmxuns5dctz          2716462349 30-nov-2016 06:35:58 USER I/O   direct PATH WRITE temp              27350
6hds16zkc8cgm          4190635369 01-dec-2016 15:38:50 USER I/O   direct PATH read                    27030
amzmxuns5dctz          2716462349 28-nov-2016 15:13:16 USER I/O   direct PATH read temp               26850
6hds16zkc8cgm          3246633284 05-dec-2016 16:15:22 USER I/O   direct PATH read                    23020
aqh9x3yay6khc          2336987607 21-nov-2016 20:54:01 USER I/O   direct PATH read temp               21060
amzmxuns5dctz          2716462349 01-dec-2016 15:08:29 USER I/O   direct PATH read temp               20510
9mt8p7z1wjjkn          2327812543 04-dec-2016 11:00:32 USER I/O   direct PATH read                    20450
48091vbtjj4xa          1708748504 28-nov-2016 04:03:28 USER I/O   direct PATH read temp               20090
amzmxuns5dctz          2716462349 03-dec-2016 06:36:36 USER I/O   direct PATH WRITE temp              17190
f92r4f37kn015          2776764876 27-nov-2016 00:41:30 USER I/O   direct PATH read temp               17190
amzmxuns5dctz          2716462349 29-nov-2016 15:43:15 USER I/O   direct PATH read temp               16860
1wzd30k3m1ghn          1433478644 07-dec-2016 06:29:49 USER I/O   direct PATH read temp               15990
amzmxuns5dctz          2716462349 30-nov-2016 15:28:23 USER I/O   direct PATH WRITE temp              15830
amzmxuns5dctz          2716462349 07-dec-2016 06:35:28 USER I/O   direct PATH WRITE temp              14850
1wzd30k3m1ghn          1433478644 29-nov-2016 08:07:57 USER I/O   direct PATH read temp               14850
08kdqk1abqm0v          2813508085 04-dec-2016 15:49:02 USER I/O   direct PATH read                    14680
aqh9x3yay6khc          2336987607 22-nov-2016 21:07:34 USER I/O   direct PATH read temp               14450
5tjqq1cggd0c2          3765923229 23-nov-2016 19:04:17 USER I/O   direct PATH read                    14280
5tjqq1cggd0c2          1896303415 22-nov-2016 19:23:11 USER I/O   direct PATH read                    14210
5tjqq1cggd0c2          3837189456 29-nov-2016 19:34:04 USER I/O   direct PATH read                    14010
grb144xf2asf3           759327608 25-nov-2016 22:17:39 USER I/O   direct PATH read                    13790
f92r4f37kn015          2776764876 05-dec-2016 00:11:15 USER I/O   direct PATH read temp               13460
6hds16zkc8cgm          3246633284 25-nov-2016 15:39:23 USER I/O   direct PATH read                    13430
amzmxuns5dctz          1006290636 21-nov-2016 07:14:12 USER I/O   direct PATH read temp               12780
b4yrbsczwf9xw          3470921796 27-nov-2016 03:58:25 USER I/O   direct PATH read                    12720
grb144xf2asf3          1075986364 05-dec-2016 22:48:32 USER I/O   direct PATH read                    12310
g8n42y0duhggt           152545410 24-nov-2016 21:13:25 USER I/O   direct PATH read                    12060
a2xhxhppc23y1           327407190 06-dec-2016 22:56:59 USER I/O   direct PATH read                    11930
1wzd30k3m1ghn          1433478644 23-nov-2016 07:13:00 USER I/O   direct PATH read temp               11820
0pmqzssfmmdcv          3444765716 04-dec-2016 01:57:54 USER I/O   direct PATH read                    11480
1zzsqqwax38kk           660240898 27-nov-2016 06:58:14 USER I/O   direct PATH read                    11200
amzmxuns5dctz          2716462349 07-dec-2016 06:35:28 USER I/O   direct PATH read temp               11160
aqh9x3yay6khc          2336987607 29-nov-2016 20:01:40 USER I/O   direct PATH read temp               11110
ckapap92tfy3n          2695238043 23-nov-2016 23:47:49 USER I/O   direct PATH read                    11050
aqh9x3yay6khc          2336987607 03-dec-2016 22:33:33 USER I/O   direct PATH read temp               10990
amzmxuns5dctz          2716462349 03-dec-2016 06:36:36 USER I/O   direct PATH read temp               10950
1wzd30k3m1ghn          1433478644 26-nov-2016 06:48:59 USER I/O   direct PATH read temp               10950
1wzd30k3m1ghn          1433478644 06-dec-2016 06:45:53 USER I/O   direct PATH read temp               10870
1wzd30k3m1ghn          1433478644 02-dec-2016 06:41:28 USER I/O   direct PATH read temp               10690
57svfqqryy7h0          3182238936 28-nov-2016 05:17:30 USER I/O   direct PATH read                    10540
74447zmmmw0zk           316009422 29-nov-2016 02:41:26 USER I/O   direct PATH read                    10470
f92r4f37kn015          2776764876 28-nov-2016 02:22:45 USER I/O   direct PATH read temp               10310
0cy2upaz2mtp3           420311346 01-dec-2016 17:54:20 USER I/O   direct PATH read                    10220
daxpwau39csac          2945512965 29-nov-2016 18:07:05 USER I/O   direct PATH read                    10200
                                0                      USER I/O   direct PATH WRITE                   10060
...

The rows with a NULL sql_id come from the checkpoint process.
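To double check where those NULL sql_id samples come from, you can group on the program column of the ASH history. A quick sketch (program and session_type are regular columns of DBA_HIST_ACTIVE_SESS_HISTORY, and the *10 factor assumes the usual 10-second AWR sampling of ASH):

```sql
-- Which processes account for the direct path waits that have no sql_id ?
SELECT program, session_type, COUNT(*)*10 AS "time_waited (s)"
FROM dba_hist_active_sess_history
WHERE sql_id IS NULL
AND event LIKE 'direct path%'
GROUP BY program, session_type
ORDER BY 3 DESC;
```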

Above we see that sql_id amzmxuns5dctz is the one to focus on…

Checkpoint

After a reboot of our database we (luckily) noticed a huge increase in physical writes, as shown in the HP Performance Manager application:

awr_mining03

We investigated in all the directions we could: OS, database, SQL tuning,… We noticed we had mistakenly (a long time ago) set FAST_START_MTTR_TARGET to 300. To be honest I have never really understood the added value of this parameter, even if I understand its description. What's the point of tuning for a situation (recovery) that (hopefully) happens rarely, at the cost of impacting your performance all year long? I prefer to let the checkpoint occur at redo log switch and set FAST_START_MTTR_TARGET to 0 (the default value).
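To see how aggressive MTTR-driven incremental checkpointing currently is, V$INSTANCE_RECOVERY can be queried; a sketch (TARGET_MTTR and ESTIMATED_MTTR are in seconds, WRITES_MTTR counts the writes done only because of FAST_START_MTTR_TARGET):

```sql
-- Current MTTR target versus the estimate, plus the writes it triggered
SELECT target_mttr, estimated_mttr, ckpt_block_writes, writes_mttr
FROM v$instance_recovery;
```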

That said, we reset the parameter and, guess what? Physical writes decreased a lot:
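For reference, resetting the parameter can be done dynamically with something like the below (a sketch; 0 is the default value and disables MTTR-driven incremental checkpointing):

```sql
ALTER SYSTEM SET fast_start_mttr_target=0 SCOPE=both;
```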

awr_mining04

Then I wanted to see, at the Oracle level, the decrease in the number of checkpoints as well as the decrease in the number of writes due to incremental checkpointing activity. DBA_HIST_SYSSTAT comes to the rescue. In the meantime a teammate had changed the AWR frequency, so I had to tune my query a bit to get the sum per hour:

SELECT TO_CHAR(TRUNC(begin_interval_time,'HH'),'dd-mon-yyyy hh24:mi:ss') AS TIME,
stat_name,
SUM(VALUE) AS VALUE
FROM(
SELECT
hsn.snap_id,
  hsn.begin_interval_time,
  hsn.end_interval_time,
  hsy.stat_name,
  hsy.VALUE - hsy.prev_value AS VALUE
  FROM (SELECT snap_id,instance_number,stat_name,VALUE,
               LAG(VALUE,1,VALUE) OVER (PARTITION BY instance_number,stat_name ORDER BY snap_id) AS prev_value
        FROM dba_hist_sysstat
        WHERE stat_name IN ('DBWR checkpoints','DBWR checkpoint buffers written')) hsy,
        dba_hist_snapshot hsn
WHERE hsy.snap_id = hsn.snap_id
AND hsy.instance_number = hsn.instance_number)
GROUP BY TRUNC(begin_interval_time,'HH'),stat_name
ORDER BY TRUNC(begin_interval_time,'HH'),stat_name;

The two statistic names I will use are as follows (refer to the official documentation or V$STATNAME for the complete list of available statistics):

Name                            Description
DBWR checkpoint buffers written Number of buffers that were written for checkpoints
DBWR checkpoints                Number of times DBWR was asked to scan the cache and write all blocks marked for a checkpoint or the end of recovery. This statistic is always larger than "background checkpoints completed"
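The available checkpoint-related statistic names can be listed from V$STATNAME, for example:

```sql
-- List the checkpoint-related statistic names and their class
SELECT name, class
FROM v$statname
WHERE LOWER(name) LIKE '%checkpoint%'
ORDER BY name;
```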

I initially exported the result set to Excel and built the graphs there, but finally decided to test the charting capability of SQL Developer. To access it use the Reports tab, or activate it through the View and Reports menus. Then create a new report in User Defined Reports using the Chart style and Area as Chart Type, plus any other options you like. A good trick is to connect to a database and check Use Live Data in the Property/Data option to see directly how your report will look.

The queries to be displayed must be built to return the following three values: 'x axis value', 'series name' and 'y axis value'.
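As an illustration, the DBA_HIST_SYSSTAT query above can be reshaped into that three-column format; this is only a sketch following the same per-hour delta logic:

```sql
-- x axis: hour, series: statistic name, y axis: hourly delta sum
SELECT TO_CHAR(TRUNC(hsn.begin_interval_time,'HH'),'dd-mon-yyyy hh24:mi') AS x_value,
       hsy.stat_name AS series_name,
       SUM(hsy.VALUE - hsy.prev_value) AS y_value
FROM (SELECT snap_id, instance_number, stat_name, VALUE,
             LAG(VALUE,1,VALUE) OVER (PARTITION BY instance_number, stat_name
                                      ORDER BY snap_id) AS prev_value
      FROM dba_hist_sysstat
      WHERE stat_name = 'DBWR checkpoints') hsy,
     dba_hist_snapshot hsn
WHERE hsy.snap_id = hsn.snap_id
AND hsy.instance_number = hsn.instance_number
GROUP BY TRUNC(hsn.begin_interval_time,'HH'), hsy.stat_name
ORDER BY TRUNC(hsn.begin_interval_time,'HH');
```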

We see the number of checkpoints decreasing after the morning of December 7th (we made the change around 10:00 AM CET):

awr_mining05

Even more impressive is the number of buffers written:

awr_mining06

But we had not changed anything on the database before the application team started to complain that all their queries were slow. The change to FAST_START_MTTR_TARGET solved the performance trouble but does not explain the root cause of the issue. By restoring a backup we were able to extract and load the old AWR figures (see the chapter on this) and finally run AWR difference reports. I then obviously focused on the Top Segments Comparison by Physical Writes section of my AWR difference report (a good day before the issue versus a bad one) and saw this:

awr_mining07

I noticed many newcomers in the top physical writes, while the top one showed a 500% increase in the number of physical writes…

The table to use in this situation is DBA_HIST_SEG_STAT, but you need the object ID to query it:

SQL> SET lines 150 pages 1000
SQL> col object_name FOR a30
SQL> SELECT object_id,owner,object_name,object_type
     FROM dba_objects
     WHERE owner||'.'||object_name IN ('HUB.CRM_INVOICE_PREV_PK','E2DWH.BACKLOG_NEW',
     'E2DWH.E2_X_DWH_2_PK','E2DWH.E2_X_DWH2_IDX_SO_ITEM__ID','E2DWH.E2_X_DWH2_IDX_LAST_UPD');
 
 OBJECT_ID OWNER                          OBJECT_NAME                    OBJECT_TYPE
---------- ------------------------------ ------------------------------ -------------------
  43994766 E2DWH                          BACKLOG_NEW                    TABLE
  44003115 HUB                            CRM_INVOICE_PREV_PK            INDEX
  44005025 E2DWH                          E2_X_DWH2_IDX_SO_ITEM__ID      INDEX
  44005026 E2DWH                          E2_X_DWH2_IDX_LAST_UPD         INDEX
  44005027 E2DWH                          E2_X_DWH_2_PK                  INDEX

Or use the DBA_HIST_SEG_STAT_OBJ table (even if it is really difficult to guess what the key of this table is):

SQL> SET lines 150 pages 1000
SQL> col object_name FOR a50
SQL> SELECT DISTINCT obj#,owner||'.'||object_name||' ('||NVL2(subobject_name,object_type || ': ' || subobject_name,object_type)||')' AS object_name
     FROM DBA_HIST_SEG_STAT_OBJ
     WHERE owner||'.'||OBJECT_NAME IN ('HUB.CRM_INVOICE_PREV_PK','E2DWH.BACKLOG_NEW',
     'E2DWH.E2_X_DWH_2_PK','E2DWH.E2_X_DWH2_IDX_SO_ITEM__ID','E2DWH.E2_X_DWH2_IDX_LAST_UPD');
 
      OBJ# OBJECT_NAME
---------- --------------------------------------------------
  43994766 E2DWH.BACKLOG_NEW (TABLE)
  44005027 E2DWH.E2_X_DWH_2_PK (INDEX)
  44005026 E2DWH.E2_X_DWH2_IDX_LAST_UPD (INDEX)
  44003115 HUB.CRM_INVOICE_PREV_PK (INDEX)
  44005025 E2DWH.E2_X_DWH2_IDX_SO_ITEM__ID (INDEX)

Then you can access the physical writes of this object with something like:

SQL> SET lines 150 pages 1000
SQL> SELECT
     TO_CHAR(begin_interval_time,'dd-mon-yyyy hh24:mi:ss') AS begin_interval_time,
     TO_CHAR(end_interval_time,'dd-mon-yyyy hh24:mi:ss') AS end_interval_time,
     hss.physical_writes_delta,
     hss.physical_write_requests_delta
     FROM dba_hist_seg_stat hss, dba_hist_snapshot hsn
     WHERE hss.snap_id = hsn.snap_id
     AND hss.instance_number = hsn.instance_number
     AND hss.obj# = 44003115
     ORDER BY hss.snap_id
 
BEGIN_INTERVAL_TIME  END_INTERVAL_TIME    PHYSICAL_WRITES_DELTA PHYSICAL_WRITE_REQUESTS_DELTA
-------------------- -------------------- --------------------- -----------------------------
04-nov-2016 14:00:31 04-nov-2016 15:00:48                742817                        658007
05-nov-2016 14:00:08 05-nov-2016 15:00:26                749335                        652146
05-nov-2016 15:00:26 05-nov-2016 16:00:38                     0                             0
06-nov-2016 14:00:41 06-nov-2016 15:01:00                815897                        718208
07-nov-2016 14:00:20 07-nov-2016 15:00:36                722403                        613129
08-nov-2016 14:00:37 08-nov-2016 15:00:56                608903                        546690
09-nov-2016 14:00:51 09-nov-2016 15:00:08                633603                        549895
10-nov-2016 14:00:39 10-nov-2016 15:00:59                722215                        656331
11-nov-2016 14:00:54 11-nov-2016 15:00:14                606513                        535992
11-nov-2016 15:00:14 11-nov-2016 16:00:28                     0                             0
11-nov-2016 16:00:28 11-nov-2016 17:00:46                     0                             0
11-nov-2016 17:00:46 11-nov-2016 18:02:20                     0                             0
11-nov-2016 18:02:20 11-nov-2016 19:00:48                     0                             0
11-nov-2016 19:00:48 11-nov-2016 20:00:15                     0                             0
18-nov-2016 14:00:40 18-nov-2016 15:00:19               4514625                       4298799
19-nov-2016 14:00:22 19-nov-2016 15:00:56               4811683                       4610613
20-nov-2016 14:00:28 20-nov-2016 15:00:10               2404822                       2267424
21-nov-2016 14:00:08 21-nov-2016 15:00:34               2300106                       2145016
22-nov-2016 14:00:06 22-nov-2016 15:00:39               3589330                       3428962
22-nov-2016 15:00:39 22-nov-2016 16:00:04               1422715                       1373614
23-nov-2016 14:00:39 23-nov-2016 15:00:06               4505033                       4318507
23-nov-2016 17:00:40 23-nov-2016 17:10:56                     0                             0
24-nov-2016 14:00:34 24-nov-2016 15:00:59               2027948                       1873524
25-nov-2016 14:00:07 25-nov-2016 14:55:52               4477194                       4288564
26-nov-2016 14:00:59 26-nov-2016 15:00:18               2350258                       2210634
27-nov-2016 14:00:37 27-nov-2016 15:00:14               4886175                       4681261
28-nov-2016 14:00:19 28-nov-2016 15:00:45               2512397                       2394625
29-nov-2016 14:00:23 29-nov-2016 15:00:59               3476692                       3323396
29-nov-2016 15:00:59 29-nov-2016 16:00:29               1686445                       1632668
30-nov-2016 14:01:07 30-nov-2016 15:00:37               4468593                       4279528
01-dec-2016 14:00:40 01-dec-2016 15:00:02               2549539                       2426793
02-dec-2016 14:00:18 02-dec-2016 15:00:01               4781775                       4589442
 
32 ROWS selected.

We clearly see the trend, with a 7-8x increase in the number of writes.

If you check a segment that has no figures for some periods (appearing or disappearing objects), the query is a bit more complex to build. In other words, building a query that reports what appears in an AWR difference report is not so easy. Using standard deviation (STDDEV) I have tried to build a query that shows me the segments whose physical writes varied the most, even when the segment appears or disappears:

SQL> SET lines 150 pages 1000
SQL> col object_name FOR a50
SQL> SELECT
     TO_CHAR(hsn.begin_interval_time,'dd-mon-yyyy hh24:mi:ss') AS begin_interval_time,
     TO_CHAR(hsn.end_interval_time,'dd-mon-yyyy hh24:mi:ss') AS end_interval_time,
     --obj#,
     (SELECT DISTINCT owner||'.'||object_name||' ('||NVL2(subobject_name,object_type || ': ' || subobject_name,object_type)||')'
      FROM dba_hist_seg_stat_obj
      WHERE obj#=hss.obj# AND dbid=hss.dbid AND dataobj#=hss.dataobj# AND ts#=hss.ts#) AS object_name,
     hss.physical_writes_delta,
     hss.stddev_physical_writes_delta
     FROM
     (SELECT
     snap_id,
     dbid,
     instance_number,
     obj#,
     dataobj#,
     ts#,
     physical_writes_delta,
     ROUND(STDDEV(physical_writes_delta) over (PARTITION BY obj#)) AS stddev_physical_writes_delta,
     COUNT(*) OVER (PARTITION BY obj#) AS nb
     FROM dba_hist_seg_stat
     GROUP BY snap_id,dbid,instance_number,obj#,dataobj#,ts#,physical_writes_delta) hss, dba_hist_snapshot hsn
     WHERE hss.snap_id = hsn.snap_id
     AND hss.instance_number = hsn.instance_number
     AND hss.nb >= 5
     ORDER BY stddev_physical_writes_delta DESC,hss.snap_id;
 
BEGIN_INTERVAL_TIME  END_INTERVAL_TIME    OBJECT_NAME                                        PHYSICAL_WRITES_DELTA STDDEV_PHYSICAL_WRITES_DELTA
-------------------- -------------------- -------------------------------------------------- --------------------- ----------------------------
04-nov-2016 14:00:31 04-nov-2016 15:00:48 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   742817                      1763019
05-nov-2016 14:00:08 05-nov-2016 15:00:26 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   749335                      1763019
05-nov-2016 15:00:26 05-nov-2016 16:00:38 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
06-nov-2016 14:00:41 06-nov-2016 15:01:00 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   815897                      1763019
07-nov-2016 14:00:20 07-nov-2016 15:00:36 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   722403                      1763019
08-nov-2016 14:00:37 08-nov-2016 15:00:56 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   608903                      1763019
09-nov-2016 14:00:51 09-nov-2016 15:00:08 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   633603                      1763019
10-nov-2016 14:00:39 10-nov-2016 15:00:59 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   722215                      1763019
11-nov-2016 14:00:54 11-nov-2016 15:00:14 HUB.CRM_INVOICE_PREV_PK (INDEX)                                   606513                      1763019
11-nov-2016 15:00:14 11-nov-2016 16:00:28 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
11-nov-2016 16:00:28 11-nov-2016 17:00:46 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
11-nov-2016 17:00:46 11-nov-2016 18:02:20 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
11-nov-2016 18:02:20 11-nov-2016 19:00:48 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
11-nov-2016 19:00:48 11-nov-2016 20:00:15 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
18-nov-2016 14:00:40 18-nov-2016 15:00:19 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4514625                      1763019
19-nov-2016 14:00:22 19-nov-2016 15:00:56 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4811683                      1763019
20-nov-2016 14:00:28 20-nov-2016 15:00:10 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  2404822                      1763019
21-nov-2016 14:00:08 21-nov-2016 15:00:34 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  2300106                      1763019
22-nov-2016 14:00:06 22-nov-2016 15:00:39 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  3589330                      1763019
22-nov-2016 15:00:39 22-nov-2016 16:00:04 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  1422715                      1763019
23-nov-2016 14:00:39 23-nov-2016 15:00:06 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4505033                      1763019
23-nov-2016 17:00:40 23-nov-2016 17:10:56 HUB.CRM_INVOICE_PREV_PK (INDEX)                                        0                      1763019
24-nov-2016 14:00:34 24-nov-2016 15:00:59 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  2027948                      1763019
25-nov-2016 14:00:07 25-nov-2016 14:55:52 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4477194                      1763019
26-nov-2016 14:00:59 26-nov-2016 15:00:18 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  2350258                      1763019
27-nov-2016 14:00:37 27-nov-2016 15:00:14 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4886175                      1763019
28-nov-2016 14:00:19 28-nov-2016 15:00:45 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  2512397                      1763019
29-nov-2016 14:00:23 29-nov-2016 15:00:59 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  3476692                      1763019
29-nov-2016 15:00:59 29-nov-2016 16:00:29 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  1686445                      1763019
30-nov-2016 14:01:07 30-nov-2016 15:00:37 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4468593                      1763019
01-dec-2016 14:00:40 01-dec-2016 15:00:02 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  2549539                      1763019
02-dec-2016 14:00:18 02-dec-2016 15:00:01 HUB.CRM_INVOICE_PREV_PK (INDEX)                                  4781775                      1763019
04-nov-2016 00:00:55 04-nov-2016 01:00:44 HUB.CRM_INVOICE_PK (INDEX)                                        474743                      1141594
04-nov-2016 20:00:26 04-nov-2016 21:00:47 HUB.CRM_INVOICE_PK (INDEX)                                        765938                      1141594
05-nov-2016 21:00:38 05-nov-2016 22:00:04 HUB.CRM_INVOICE_PK (INDEX)                                        919735                      1141594
06-nov-2016 19:00:44 06-nov-2016 20:00:11 HUB.CRM_INVOICE_PK (INDEX)                                        107619                      1141594
06-nov-2016 20:00:11 06-nov-2016 21:00:38 HUB.CRM_INVOICE_PK (INDEX)                                        474607                      1141594
07-nov-2016 20:00:30 07-nov-2016 21:00:45 HUB.CRM_INVOICE_PK (INDEX)                                        725956                      1141594
08-nov-2016 20:00:48 08-nov-2016 21:00:02 HUB.CRM_INVOICE_PK (INDEX)                                        113880                      1141594
08-nov-2016 20:00:48 08-nov-2016 21:00:02 HUB.CRM_INVOICE_PK (INDEX)                                        339013                      1141594

In the above query I have limited the sample size to at least 5 elements because, below this threshold, I consider the standard variance figures meaningless. We almost get this information inside an AWR report, except for the appearing/disappearing segments…
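The shape of such a mining query can be sketched as below. This is not the exact query used above, only a minimal sketch: the choice of `physical_reads_delta` as the tracked metric is an assumption, though the views and columns (`DBA_HIST_SEG_STAT`, `DBA_HIST_SNAPSHOT`) are the standard AWR history views:

```sql
-- Sketch: per-segment I/O per snapshot interval, keeping only segments
-- with at least 5 samples so the stddev/variance figures are meaningful.
-- The metric column (physical_reads_delta) is an assumption for
-- illustration; any DBA_HIST_SEG_STAT delta column can be substituted.
SELECT o.owner || '.' || o.object_name AS segment_name,
       COUNT(*)                              AS nb_samples,
       ROUND(AVG(s.physical_reads_delta))    AS avg_phys_reads,
       ROUND(STDDEV(s.physical_reads_delta)) AS stddev_phys_reads
FROM   dba_hist_seg_stat s
       JOIN dba_objects o ON o.object_id = s.obj#
GROUP BY o.owner, o.object_name
HAVING COUNT(*) >= 5
ORDER BY stddev_phys_reads DESC;
```

Segments with a high standard deviation relative to their average are the ones whose workload has diverged the most over the retained AWR history.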

The Service Request we opened with Oracle Support reached the exact same conclusion…

AWR figures extract and load

If you really come too late to check performance, and thus lack the figures from when they were “good”, one solution is to restore a database backup and create a temporary database on a test server. You might not always be able to do it, but do it if you can afford it (at the same time it will validate your backup/restore policy)! Once this backup has been started, export the newest AWR figures from your production database using the awrextr.sql script located in $ORACLE_HOME/rdbms/admin; you just need to create a directory object and a bit of free disk space.
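A typical extract session looks like the sketch below; the directory name and path are hypothetical, chosen for illustration:

```sql
-- Hedged sketch of an AWR extract on production (directory name and
-- path are hypothetical). awrextr.sql is interactive and prompts for
-- DBID, number of days of snapshots, begin/end snapshot ids, the
-- directory object and the dump file name.
SQL> CREATE DIRECTORY awr_dump AS '/tmp/awr';
SQL> @?/rdbms/admin/awrextr.sql
-- A Data Pump dump file is written to /tmp/awr, ready to transfer
-- to the test server.
```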

Transfer the file to your test server, where you have restored the production backup (so it contains the old AWR figures), and load the latest AWR data into it with the awrload.sql script, also located in $ORACLE_HOME/rdbms/admin.
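The load side mirrors the extract; again the directory name and path below are hypothetical:

```sql
-- Hedged sketch of the AWR load on the restored test database
-- (directory name/path are hypothetical). awrload.sql prompts for the
-- directory object, the dump file name (without extension) and a
-- temporary staging schema.
SQL> CREATE DIRECTORY awr_dump AS '/tmp/awr';
SQL> @?/rdbms/admin/awrload.sql
-- The script imports the dump into a staging schema and then moves
-- the AWR figures to SYS.
```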

At first try I got below error message:

ERROR AT line 1:
ORA-20105: unable TO move AWR data TO SYS
ORA-06512: AT "SYS.DBMS_SWRF_INTERNAL", line 2984
ORA-20107: NOT allowed TO move AWR data FOR local dbid
ORA-06512: AT line 3

So I changed the DBID of my restored database with the NID (DataBase NewID) tool (of course, the restore must have completed cleanly first):

[oraxyz@server11 ~]$ nid target=sys DBNAME=sidxyz
 
DBNEWID: Release 11.2.0.4.0 - Production on Fri Dec 9 15:11:09 2016
 
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
 
Password:
Connected to database EDWHUB (DBID=3256370993)
 
Connected to server version 11.2.0
 
Control Files in database:
    /ora_edwhub1/ctrl/edwhub/control01.ctl
    /ora_edwhub1/ctrl/edwhub/control02.ctl
 
 
The following datafiles are offline immediate:
    /ora_edwhub1/data02/edwhub/logdmfact2_01.db (284)
    /ora_edwhub/data02/edwhub/logdmfact2_01.db (290)
 
NID-00122: Database should have no offline immediate datafiles
 
 
Change of database name failed during validation - database is intact.
DBNEWID - Completed with validation errors.
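In my case the validation failure came from the offline immediate datafiles reported above. A possible remediation, sketched below, is to recover the affected files and bring them online before re-running NID (whether recovery or dropping the files is appropriate depends on your situation; file number 284 is taken from the NID output above). After a successful DBID change the database must be opened with RESETLOGS:

```sql
-- Hedged sketch: clear the "offline immediate datafiles" validation
-- error, then redo the DBID change. File number 284 comes from the
-- NID output; adapt to your own file list.
SQL> RECOVER DATAFILE 284;
SQL> ALTER DATABASE DATAFILE 284 ONLINE;
-- Re-run nid with the database in MOUNT state. Once the DBID change
-- succeeds, the database must be opened with RESETLOGS:
SQL> STARTUP MOUNT;
SQL> ALTER DATABASE OPEN RESETLOGS;
```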

Once the recovery has completed and the DBID has been changed, the import will run successfully and you can then generate AWR difference reports…
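Difference (period compare) reports can be generated with the interactive awrddrpt.sql script, or directly through the DBMS_WORKLOAD_REPOSITORY package. In the sketch below the snapshot ids are hypothetical; the DBID is the one shown in the NID output above (it will differ for the second period after the DBID change):

```sql
-- Hedged sketch: compare two snapshot periods. Snapshot ids (100-110,
-- 200-210) are hypothetical placeholders.
-- Interactive report script:
SQL> @?/rdbms/admin/awrddrpt.sql
-- Or directly via the package (arguments: dbid, instance number and
-- begin/end snapshot id for each of the two periods):
SELECT output
FROM   TABLE(dbms_workload_repository.awr_diff_report_text(
         3256370993, 1, 100, 110,    -- first period (baseline)
         3256370993, 1, 200, 210));  -- second period (problem window)
```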
