Preamble
As usual, the question came from a colleague asking me whether I knew the BULK COLLECT and FORALL keywords in PL/SQL, which reduce context switches between SQL and PL/SQL. I knew of them, having read Steven Feuerstein's article: Bulk Processing with BULK COLLECT and FORALL.
But I had never taken the time to test them and to write a small post on the subject. The testing done by my colleague showed an impressive improvement, so I wanted to see it with my own eyes and practice a bit on it as well.
You might think (as I did for many years) that the SQL statements executed in your PL/SQL blocks are free of charge and do not impact the flow of your program at all. In reality they are not: Oracle has to switch from the PL/SQL engine to the SQL engine for each of them. This switch is called a context switch, and the aim of this post is to test how it can easily be reduced using the BULK COLLECT and FORALL keywords, together with collections, to process multiple rows at the same time. These two keywords are part of what Oracle calls the Bulk SQL feature. For a deeper explanation refer either to the Oracle documentation or to Steven's article…
Testing has been done using a 12cR1 (12.1.0.2) Enterprise Edition database running on an Oracle Linux Server release 7.3 virtual machine. My virtual machine has 8GB of memory and 4 threads, and the database has 2GB of memory (MEMORY_TARGET). Needless to say, if you plan to compare results you must not launch heavy tasks in parallel on your server (a desktop in my case)…
Setup
I create my usual test table with the id column as primary key:
DROP TABLE test01;

CREATE TABLE test01 (
  id NUMBER,
  val VARCHAR2(100)
)
NOLOGGING
TABLESPACE users;

ALTER TABLE test01 ADD CONSTRAINT test01_pk PRIMARY KEY (id)
USING INDEX TABLESPACE users ENABLE VALIDATE;

--ALTER TABLE TEST01 DROP CONSTRAINT TEST01_PK;
I fill it with 50 million rows, almost 8GB on disk and the maximum I could afford on my limited virtual machine:
DECLARE
  i NUMBER;
  nbrows NUMBER;
BEGIN
  i:=1;
  nbrows:=50000000;
  LOOP
    EXIT WHEN i>nbrows;
    INSERT INTO test01 VALUES(i,RPAD('A',100,'A'));
    IF (MOD(i,1000000)=0) THEN
      COMMIT;
    END IF;
    i:=i+1;
  END LOOP;
  COMMIT;
END;
/

EXEC dbms_stats.gather_table_stats(ownname => 'yjaquier', tabname => 'test01', degree => 2, -
method_opt => 'for all columns size auto', CASCADE => TRUE);
The (admittedly stupid) test will be an update of all table rows to change the value of the val column.
Classical method
The classical method is a cursor on the table and a loop fetching all rows, updating the current one using the primary key. I replace the 100 A's by 100 B's to be able to check that the update has been done. I also measure the execution time, and I flush the buffer cache and shared pool before each run so as not to be helped by blocks already in memory. Finally I commit every one million rows so as not to fill my 8GB undo tablespace:
ALTER SYSTEM flush buffer_cache;
ALTER SYSTEM flush shared_pool;

SET serveroutput ON SIZE 999999

DECLARE
  start_time NUMBER;
  end_time NUMBER;
  CURSOR cursor1 IS SELECT id,val FROM test01;
  item1 cursor1%ROWTYPE;
  i NUMBER;
BEGIN
  start_time:=dbms_utility.get_time;
  i:=1;
  OPEN cursor1;
  LOOP
    FETCH cursor1 INTO item1;
    EXIT WHEN cursor1%notfound;
    UPDATE /* Yannick */ test01 SET val=RPAD('B',100,'B') WHERE id=item1.id;
    i:=i+1;
    IF (MOD(i,1000000)=0) THEN
      COMMIT;
    END IF;
  END LOOP;
  CLOSE cursor1;
  COMMIT;
  end_time:=dbms_utility.get_time;
  dbms_output.put_line('Explicit Cursor Loop Update: ' || TO_CHAR((end_time - start_time)/100) || ' seconds');
END;
/
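To check that the A's have really been replaced by B's (my own quick sanity check, not part of the timed run), something like this does the job:

SQL> SELECT DISTINCT SUBSTR(val,1,1) FROM test01;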
I have got an average of around 2150 seconds over 7 runs.
Bulk SQL method
This is the method with BULK COLLECT and FORALL. To complement the code I have chosen to:
- Limit memory usage by capping the number of returned rows at 1,000,000 per fetch using the LIMIT keyword, and commit every million rows accordingly.
- Trap error messages using the SAVE EXCEPTIONS keyword.
The test:
ALTER SYSTEM flush buffer_cache;
ALTER SYSTEM flush shared_pool;

SET serveroutput ON SIZE 999999

DECLARE
  start_time NUMBER;
  end_time NUMBER;
  i NUMBER;
  CURSOR cursor1 IS SELECT id,val FROM test01;
  TYPE collection1 IS TABLE OF cursor1%ROWTYPE INDEX BY PLS_INTEGER;
  item1 collection1;
  dml_errors EXCEPTION;
  PRAGMA exception_init(dml_errors, -24381);
BEGIN
  start_time:=dbms_utility.get_time;
  OPEN cursor1;
  LOOP
    FETCH cursor1 BULK COLLECT INTO item1 LIMIT 1000000;
    EXIT WHEN item1.COUNT=0;
    FORALL i IN item1.FIRST..item1.LAST SAVE EXCEPTIONS
      UPDATE /* Yannick */ test01 SET val=RPAD('C',100,'C') WHERE id=item1(i).id;
    COMMIT;
  END LOOP;
  CLOSE cursor1;
  COMMIT;
  end_time:=dbms_utility.get_time;
  dbms_output.put_line('FORALL and BULK COLLECT Update: ' || TO_CHAR((end_time - start_time)/100) || ' seconds');
EXCEPTION
  WHEN dml_errors THEN
    IF cursor1%isopen THEN
      CLOSE cursor1;
    END IF;
    FOR i IN 1..sql%bulk_exceptions.COUNT LOOP
      dbms_output.put_line(SQLERRM(-(sql%bulk_exceptions(i).error_code)));
      dbms_output.put_line(sql%bulk_exceptions(i).error_index);
    END LOOP;
    COMMIT;
  WHEN OTHERS THEN
    dbms_output.put_line('Unrecognized error.');
    RAISE;
END;
/
I have got an average of around 980 seconds over 7 runs.
Overall the improvement is around 2x! Moreover, even if a few slight modifications are needed, it is not a complete redesign of your PL/SQL block. The example here is deliberately simple, but this type of loop updating a few columns is a classical way of working in BI environments. I rate this bulk SQL/bulk binding feature as a rapid and simple quick win to implement and benefit from!
DBMS_PARALLEL_EXECUTE method
In the above testing my colleague tried to use parallel DML, which, I discovered, is not activated by default: only parallel DDL and parallel query are enabled by default (phew). Obviously in the classical method, as the statement updates one row at a time, there is nothing to parallelize. But for the bulk SQL one you might have expected the PL/SQL block to be parallelized. With the help of the /* Yannick */ comment I can query the V$SQL view (you should flush the shared pool beforehand), and the PX_SERVERS_EXECUTIONS column tells you whether your statement has used parallel processes and how many.
SQL> col sql_text FOR a50 word_wrapped
SQL> SET lines 150 pages 1000
SQL> SELECT sql_id, SUBSTR(sql_text,1,200) AS sql_text, px_servers_executions, disk_reads, buffer_gets, plsql_exec_time, elapsed_time
     FROM v$sql
     WHERE sql_text LIKE '%/* Yannick */%'
     AND sql_text NOT LIKE 'select%';

SQL_ID        SQL_TEXT                                           PX_SERVERS_EXECUTIONS DISK_READS BUFFER_GETS PLSQL_EXEC_TIME ELAPSED_TIME
------------- -------------------------------------------------- --------------------- ---------- ----------- --------------- ------------
32h9ka1576yny DECLARE start_time NUMBER; end_time NUMBER;                             0    1662231   252397556       218299035   2196358064
              CURSOR cursor1 IS SELECT id,val FROM test01;
              item1 cursor1%ROWTYPE; i NUMBER; BEGIN
              start_time:=dbms_utility.get_time; i:=1; OPEN
              cursor1;

SQL_ID        SQL_TEXT                                           PX_SERVERS_EXECUTIONS DISK_READS BUFFER_GETS PLSQL_EXEC_TIME ELAPSED_TIME
------------- -------------------------------------------------- --------------------- ---------- ----------- --------------- ------------
9zjc7pp7sp0sa DECLARE start_time NUMBER; end_time NUMBER;                             0    1662256   111747162          435065    993450741
              i NUMBER; CURSOR cursor1 IS SELECT id,val FROM
              test01; TYPE collection1 IS TABLE OF
              cursor1%ROWTYPE INDEX BY PLS_INTEGER; item1
              collection1;
As you can see both runs are serial, so how could we benefit from modern multi-core processors and update the table in parallel from inside a PL/SQL block? For example, how to benefit from:
SQL> ALTER SESSION enable parallel dml;

Session altered.
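As a side note, you can confirm what is currently enabled in your session with a query like the one below (my addition; PDML_STATUS, PDDL_STATUS and PQ_STATUS are standard V$SESSION columns):

SQL> SELECT pdml_status, pddl_status, pq_status
     FROM v$session
     WHERE sid=SYS_CONTEXT('USERENV','SID');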
Well, reading around, you cannot with a simple UPDATE, as it cannot run in parallel on a non-partitioned table. The only option is to issue multiple jobs in parallel, each working on a subset of the table. DBMS_PARALLEL_EXECUTE comes to the rescue to ease this type of implementation. The principle of the package is to split the table you plan to update into multiple smaller chunks and to execute the task on all chunks in parallel. Each execution is done through an Oracle job (the CREATE JOB privilege is needed).
It starts with the creation of a task:
SQL> EXEC dbms_parallel_execute.create_task('task01');

PL/SQL procedure successfully completed.

SQL> SET lines 150 pages 1000
SQL> col task_name FOR a10
SQL> SELECT task_name, chunk_type, status FROM user_parallel_execute_tasks;

TASK_NAME  CHUNK_TYPE   STATUS
---------- ------------ -------------------
task01     UNDELARED    CREATED
You have the choice of creating chunks by rowid, by column value, or user-defined by SQL. I have chosen to create chunks based on the ID column value, with chunks of one million rows to keep the same spirit as the previous tests:
SQL> EXEC dbms_parallel_execute.create_chunks_by_number_col(task_name => 'task01', table_owner => 'YJAQUIER', -
     table_name => 'TEST01', table_column => 'ID', chunk_size => 1000000);

PL/SQL procedure successfully completed.
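For completeness, the rowid-based alternative mentioned above would look something like this (a sketch only, not used in this post); the update statement given to RUN_TASK would then filter on WHERE rowid BETWEEN :start_id AND :end_id instead of the id column:

SQL> EXEC dbms_parallel_execute.create_chunks_by_rowid(task_name => 'task01', table_owner => 'YJAQUIER', -
     table_name => 'TEST01', by_row => TRUE, chunk_size => 1000000);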
If you mess up when creating the chunks you can always drop them with:
SQL> EXEC dbms_parallel_execute.drop_chunks(task_name => 'task01');

PL/SQL procedure successfully completed.
You can check that the chunks have been created with:
SQL> SELECT chunk_id, status, start_id, end_id FROM user_parallel_execute_chunks WHERE task_name='task01' ORDER BY chunk_id;

  CHUNK_ID STATUS                 START_ID     END_ID
---------- -------------------- ---------- ----------
         1 UNASSIGNED                    1    1000000
         2 UNASSIGNED              1000001    2000000
         3 UNASSIGNED              2000001    3000000
         4 UNASSIGNED              3000001    4000000
         5 UNASSIGNED              4000001    5000000
         6 UNASSIGNED              5000001    6000000
         7 UNASSIGNED              6000001    7000000
         8 UNASSIGNED              7000001    8000000
         9 UNASSIGNED              8000001    9000000
        10 UNASSIGNED              9000001   10000000
        11 UNASSIGNED             10000001   11000000
        12 UNASSIGNED             11000001   12000000
        13 UNASSIGNED             12000001   13000000
        14 UNASSIGNED             13000001   14000000
        15 UNASSIGNED             14000001   15000000
        16 UNASSIGNED             15000001   16000000
        17 UNASSIGNED             16000001   17000000
        18 UNASSIGNED             17000001   18000000
        19 UNASSIGNED             18000001   19000000
        20 UNASSIGNED             19000001   20000000
        21 UNASSIGNED             20000001   21000000
        22 UNASSIGNED             21000001   22000000
        23 UNASSIGNED             22000001   23000000
        24 UNASSIGNED             23000001   24000000
        25 UNASSIGNED             24000001   25000000
        26 UNASSIGNED             25000001   26000000
        27 UNASSIGNED             26000001   27000000
        28 UNASSIGNED             27000001   28000000
        29 UNASSIGNED             28000001   29000000
        30 UNASSIGNED             29000001   30000000
        31 UNASSIGNED             30000001   31000000
        32 UNASSIGNED             31000001   32000000
        33 UNASSIGNED             32000001   33000000
        34 UNASSIGNED             33000001   34000000
        35 UNASSIGNED             34000001   35000000
        36 UNASSIGNED             35000001   36000000
        37 UNASSIGNED             36000001   37000000
        38 UNASSIGNED             37000001   38000000
        39 UNASSIGNED             38000001   39000000
        40 UNASSIGNED             39000001   40000000
        41 UNASSIGNED             40000001   41000000
        42 UNASSIGNED             41000001   42000000
        43 UNASSIGNED             42000001   43000000
        44 UNASSIGNED             43000001   44000000
        45 UNASSIGNED             44000001   45000000
        46 UNASSIGNED             45000001   46000000
        47 UNASSIGNED             46000001   47000000
        48 UNASSIGNED             47000001   48000000
        49 UNASSIGNED             48000001   49000000
        50 UNASSIGNED             49000001   50000000

50 rows selected.
Now execute the task with the code below. One important parameter is PARALLEL_LEVEL:
ALTER SYSTEM flush buffer_cache;
ALTER SYSTEM flush shared_pool;

SET serveroutput ON SIZE 999999

DECLARE
  start_time NUMBER;
  end_time NUMBER;
  sql_stmt VARCHAR2(32767);
BEGIN
  start_time:=dbms_utility.get_time;
  sql_stmt := 'update /* Yannick */ test01 set val=rpad(''D'',100,''D'') where id between :start_id and :end_id';
  dbms_parallel_execute.run_task(task_name => 'task01', sql_stmt => sql_stmt, language_flag => dbms_sql.native, parallel_level => 25);
  end_time:=dbms_utility.get_time;
  dbms_output.put_line('DBMS_PARALLEL_EXECUTE Update: ' || TO_CHAR((end_time - start_time)/100) || ' seconds');
END;
/
To know which jobs have been created for your task you need to get the JOB_PREFIX column from the USER_PARALLEL_EXECUTE_TASKS view and use this prefix with the LIKE keyword to get the list of job names (not so easy to explain, but the SQL below is self-explanatory, I hope):
SQL> col task_name FOR a10
SQL> col job_prefix FOR a15
SQL> SELECT task_name, status, job_prefix, parallel_level FROM user_parallel_execute_tasks;

TASK_NAME  STATUS              JOB_PREFIX      PARALLEL_LEVEL
---------- ------------------- --------------- --------------
task01     FINISHED            TASK$_442                   25

SQL> col job_name FOR a20
SQL> col run_duration FOR a15
SQL> SELECT job_name, status, run_duration FROM user_scheduler_job_run_details
     WHERE job_name LIKE (SELECT job_prefix || '%' FROM user_parallel_execute_tasks WHERE task_name='task01');

JOB_NAME             STATUS                         RUN_DURATION
-------------------- ------------------------------ ---------------
TASK$_442_11         SUCCEEDED                      +000 00:02:24
TASK$_442_10         SUCCEEDED                      +000 00:09:19
TASK$_442_2          SUCCEEDED                      +000 00:09:22
TASK$_442_1          SUCCEEDED                      +000 00:09:22
TASK$_442_13         SUCCEEDED                      +000 00:02:20
TASK$_442_3          SUCCEEDED                      +000 00:09:18
TASK$_442_8          SUCCEEDED                      +000 00:02:07
TASK$_442_14         SUCCEEDED                      +000 00:01:25
TASK$_442_15         SUCCEEDED                      +000 00:02:06
TASK$_442_18         SUCCEEDED                      +000 00:02:16
TASK$_442_19         SUCCEEDED                      +000 00:02:20
TASK$_442_21         SUCCEEDED                      +000 00:09:14
TASK$_442_23         SUCCEEDED                      +000 00:02:09
TASK$_442_25         SUCCEEDED                      +000 00:02:13
TASK$_442_5          SUCCEEDED                      +000 00:02:22
TASK$_442_9          SUCCEEDED                      +000 00:01:43
TASK$_442_22         SUCCEEDED                      +000 00:02:25
TASK$_442_24         SUCCEEDED                      +000 00:10:10
TASK$_442_6          SUCCEEDED                      +000 00:02:26
TASK$_442_12         SUCCEEDED                      +000 00:09:24
TASK$_442_4          SUCCEEDED                      +000 00:02:25
TASK$_442_7          SUCCEEDED                      +000 00:09:16
TASK$_442_16         SUCCEEDED                      +000 00:08:24
TASK$_442_17         SUCCEEDED                      +000 00:02:20
TASK$_442_20         SUCCEEDED                      +000 00:11:06

25 rows selected.
Even if the jobs all succeed, some chunks may not have been processed correctly:
SQL> SELECT chunk_id, status, start_id, end_id FROM user_parallel_execute_chunks WHERE task_name='task01' ORDER BY chunk_id;

  CHUNK_ID STATUS                 START_ID     END_ID
---------- -------------------- ---------- ----------
         1 PROCESSED                     1    1000000
         2 PROCESSED               1000001    2000000
         3 PROCESSED               2000001    3000000
         4 PROCESSED               3000001    4000000
         5 PROCESSED               4000001    5000000
         6 PROCESSED               5000001    6000000
         7 PROCESSED               6000001    7000000
         8 PROCESSED               7000001    8000000
         9 PROCESSED               8000001    9000000
        10 PROCESSED               9000001   10000000
        11 PROCESSED              10000001   11000000
        12 PROCESSED              11000001   12000000
        13 PROCESSED              12000001   13000000
        14 PROCESSED              13000001   14000000
        15 PROCESSED              14000001   15000000
        16 PROCESSED              15000001   16000000
        17 PROCESSED              16000001   17000000
        18 PROCESSED              17000001   18000000
        19 PROCESSED              18000001   19000000
        20 PROCESSED              19000001   20000000
        21 PROCESSED              20000001   21000000
        22 PROCESSED_WITH_ERROR   21000001   22000000
        23 PROCESSED              22000001   23000000
        24 PROCESSED_WITH_ERROR   23000001   24000000
        25 PROCESSED              24000001   25000000
        26 PROCESSED              25000001   26000000
        27 PROCESSED              26000001   27000000
        28 PROCESSED_WITH_ERROR   27000001   28000000
        29 PROCESSED              28000001   29000000
        30 PROCESSED_WITH_ERROR   29000001   30000000
        31 PROCESSED_WITH_ERROR   30000001   31000000
        32 PROCESSED              31000001   32000000
        33 PROCESSED_WITH_ERROR   32000001   33000000
        34 PROCESSED_WITH_ERROR   33000001   34000000
        35 PROCESSED_WITH_ERROR   34000001   35000000
        36 PROCESSED_WITH_ERROR   35000001   36000000
        37 PROCESSED              36000001   37000000
        38 PROCESSED              37000001   38000000
        39 PROCESSED_WITH_ERROR   38000001   39000000
        40 PROCESSED              39000001   40000000
        41 PROCESSED              40000001   41000000
        42 PROCESSED              41000001   42000000
        43 PROCESSED              42000001   43000000
        44 PROCESSED              43000001   44000000
        45 PROCESSED              44000001   45000000
        46 PROCESSED              45000001   46000000
        47 PROCESSED              46000001   47000000
        48 PROCESSED              47000001   48000000
        49 PROCESSED              48000001   49000000
        50 PROCESSED              49000001   50000000

50 rows selected.
You can try to understand why with the ERROR_CODE and ERROR_MESSAGE columns of the USER_PARALLEL_EXECUTE_CHUNKS view:
SQL> col error_message FOR a40 word_wrapped
SQL> SELECT error_code, error_message FROM user_parallel_execute_chunks WHERE chunk_id=22;

ERROR_CODE ERROR_MESSAGE
---------- ----------------------------------------
    -30036 ORA-30036: unable to extend segment by 8
           in undo tablespace 'UNDOTBS1'
And here you see the mistake I made: I tried to process the 50 chunks with 25 jobs running in parallel. It means I would need enough undo to store the update (without committing) of 25 million rows, which is really far from the previous methods I used. And to be honest it was not faster than the bulk method. I refined the approach by creating chunks of 100,000 rows (so 500 chunks) and processing 10 in parallel, and got 640.39 seconds…
With chunks of 50,000 rows (so 1000 chunks) and 20 jobs in parallel I got 972.92 seconds… This just shows that finding the right balance between chunk size and parallelism is not so easy and might require multiple tries…
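Note that when only some chunks end in PROCESSED_WITH_ERROR you do not have to restart everything from scratch: the package can retry just the failed chunks. I have not used it here, but the call is simply:

SQL> EXEC dbms_parallel_execute.resume_task('task01');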
Another cool query to see the progress of your task is:
SQL> SELECT status, COUNT(*) FROM user_parallel_execute_chunks WHERE task_name='task01' GROUP BY status ORDER BY status;

STATUS                 COUNT(*)
-------------------- ----------
ASSIGNED                      6
PROCESSED                   165
UNASSIGNED                  329
Over 7 runs (500 chunks of 100,000 rows and 10 jobs in parallel) I got an average of 610 seconds, which is again a roughly 40% improvement versus bulk SQL. Of course this is really more complex to implement, and you must be able to divide and conquer your processing to use the DBMS_PARALLEL_EXECUTE package. On top of that, as each processed chunk is committed, it also helps to decrease your undo footprint. I would have loved to be aware of this package sooner, when I developed a few BI PL/SQL processes…
At the end, to clean up the task, use:
SQL> EXEC dbms_parallel_execute.drop_task('task01');

PL/SQL procedure successfully completed.
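To wrap up, the whole recipe can also be packed into a single PL/SQL block. Below is a minimal sketch of what it could look like, reusing the same task name, table and chunking choices as above (the 'E' padding is just to distinguish this run from the previous ones):

DECLARE
  l_task VARCHAR2(30) := 'task01';
  l_sql  VARCHAR2(32767);
BEGIN
  -- 1. Create the task and the chunks on the ID column (100,000 rows per chunk)
  dbms_parallel_execute.create_task(l_task);
  dbms_parallel_execute.create_chunks_by_number_col(task_name => l_task, table_owner => 'YJAQUIER',
    table_name => 'TEST01', table_column => 'ID', chunk_size => 100000);
  -- 2. Run the chunked update, 10 jobs in parallel (RUN_TASK waits for completion)
  l_sql := 'update /* Yannick */ test01 set val=rpad(''E'',100,''E'') where id between :start_id and :end_id';
  dbms_parallel_execute.run_task(task_name => l_task, sql_stmt => l_sql,
    language_flag => dbms_sql.native, parallel_level => 10);
  -- 3. Drop the task only if every chunk has been processed successfully
  IF dbms_parallel_execute.task_status(l_task) = dbms_parallel_execute.FINISHED THEN
    dbms_parallel_execute.drop_task(l_task);
  END IF;
END;
/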
Raw results
The raw results (in seconds) I got with the three methods:
Classical | Bulk SQL | DBMS_PARALLEL_EXECUTE
--- | --- | ---
1784.23 | 1106.4 | 640.39
2195.67 | 1026.9 | 517.1
2306.09 | 870.39 | 640.56
1997.96 | 927.7 | 637.14
2196.34 | 968.7 | 631.4
2428.13 | 993.36 | 569.85
2115.6 | 919.14 | 565.35
References
- Getting ORA-04030 Error While Executing A Procedure Having BULK COLLECT (Doc ID 1221283.1)
- Excessive REDO Generation For FORALL Loop With SAVE EXCEPTION Clause (Doc ID 1995875.1)
- How to Perform Multiple Updates Using PL/SQL and Bulk Binds (Doc ID 728819.1)
- Bulk Binding – What it is, Advantages, and How to use it (Doc ID 74123.1)
- 8 Bulk Update Methods Compared
- Oracle Database – Parallel DML
- Working with Collections
- On Working in Parallel
- The Joy of Low-Hanging Fruit
- Loading data fast – regular insert vs. bulk insert
- Bulk Collect and FORALL
Anil M says:
Thanks for good example, I still have few doubts, please clarify.
As per the example considered above, the select query below covers all records of the table (it is an example of a full table update):
SELECT id,val FROM test01;
If we have a condition as in the select query below, is it possible to handle it using DBMS_PARALLEL_EXECUTE? If yes, please give an example.
SELECT id,val FROM test01 where id between 1 to 50000;
Yannick Jaquier says:
I have not tried it, but yes, it should not be a problem to add an additional where clause to the update statement.
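Untested on my side, but the idea would simply be to add the extra predicate to the chunked update statement, for example (chunks falling entirely outside the range would then just update zero rows):

sql_stmt := 'update test01 set val=rpad(''B'',100,''B'')
             where id between :start_id and :end_id
             and id between 1 and 50000';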
ARUL JEBIN says:
You have passed 25 as the parallel level, but only 6 chunks are ASSIGNED. What could be the reason?
Yannick Jaquier says:
Good catch! That came from a wrong copy/paste from a different test. I have updated the output…
Thanks,
Yannick.
James says:
Great article! One question…
JOB_NAME STATUS RUN_DURATION
-------------------- ------------------------------ ---------------
TASK$_728_5 SUCCEEDED +000 00:02:00
TASK$_728_7 SUCCEEDED +000 00:09:16
TASK$_728_8 SUCCEEDED +000 00:00:00
TASK$_728_9 SUCCEEDED +000 00:01:16
TASK$_728_10 SUCCEEDED +000 00:09:19
TASK$_728_2 SUCCEEDED +000 00:09:22
TASK$_728_3 SUCCEEDED +000 00:09:18
TASK$_728_6 SUCCEEDED +000 00:00:00
TASK$_728_4 SUCCEEDED +000 00:00:00
TASK$_728_1 SUCCEEDED +000 00:09:22
I see some of the jobs did not process any chunks but still says it’s succeeded. (TASK$_728_8, TASK$_728_6, TASK$_728_4). Do you know why that happens??
Yannick Jaquier says:
Thanks for comment !
Good catch, maybe the screenshot comes from a run where the tasks got an undo space error while processing chunks…
Pierre-yves COLLE says:
A big thanks to Yannick for this wonderful, very clear, detailed and easy-to-understand article.
Although I'm not a rookie in SQL and ORACLE, having spent multiple years programming applications,
I learned a great ORACLE feature that I was absolutely not aware of, and thanks to his valuable example I was able to test it.
Not only is the added value of this ORACLE feature well documented, but the example also lets you feel its interest immediately instead of staying purely theoretical, as is too often the case in other articles, whether on ORACLE or on other technologies.
BIG BIG THANKS AGAIN
Yannick Jaquier says:
Welcome and thanks a ton for enthusiastic comment !
Yannick.