Preamble
As usual, the question came from a colleague asking me whether I knew the BULK COLLECT and FORALL keywords in PL/SQL, which reduce context switches between SQL and PL/SQL. I knew of them, having read Steven Feuerstein's article: Bulk Processing with BULK COLLECT and FORALL.
But I had never taken the time to test them and to write a small post on the subject. The testing done by my colleague showed an impressive improvement, so I wanted to see it with my own eyes and practice a bit on it as well.
You might think (as I did for many years) that the SQL statements executed in your PL/SQL blocks are free of charge and do not impact the flow of your program at all. In reality they are not: Oracle has to switch from the PL/SQL engine to the SQL engine for each of them. This switch is called a context switch, and the aim of this post is to test how it can easily be reduced using the BULK COLLECT and FORALL keywords, together with collections, to process multiple rows at the same time. These two keywords are part of what Oracle calls the Bulk SQL feature. For a deeper explanation refer either to the Oracle documentation or to Steven's article…
Testing has been done using a 12cR1 (12.1.0.2) Enterprise Edition database running on an Oracle Linux Server release 7.3 virtual machine. My virtual machine has 8GB of memory and 4 threads, and the database has 2GB of memory (MEMORY_TARGET). Needless to say, if you plan to compare results you must not launch heavy tasks in parallel on your server (a desktop in my case)…
Setup
I create my usual test table with the id column as primary key:
DROP TABLE test01;

CREATE TABLE test01 (
  id NUMBER,
  val VARCHAR2(100)
)
NOLOGGING
TABLESPACE users;

ALTER TABLE test01 ADD CONSTRAINT test01_pk PRIMARY KEY (id)
USING INDEX TABLESPACE users ENABLE VALIDATE;

--ALTER TABLE TEST01 DROP CONSTRAINT TEST01_PK;
I fill it with 50 million rows, almost 8GB on disk and the maximum I could afford on my limited virtual machine:
DECLARE
  i NUMBER;
  nbrows NUMBER;
BEGIN
  i:=1;
  nbrows:=50000000;
  LOOP
    EXIT WHEN i>nbrows;
    INSERT INTO test01 VALUES(i,RPAD('A',100,'A'));
    IF (MOD(i,1000000)=0) THEN
      COMMIT;
    END IF;
    i:=i+1;
  END LOOP;
  COMMIT;
END;
/

EXEC dbms_stats.gather_table_stats(ownname => 'yjaquier', tabname => 'test01', degree => 2, -
method_opt => 'for all columns size auto', CASCADE => TRUE);
The (admittedly stupid) test will be an update of all table rows to change the value of the val column.
Classical method
The classical method is a cursor on the table and a loop fetching all rows, updating the current one using the primary key. I replace the 100 A's by 100 B's to be able to check that the update has been done. I also measure the execution time, and I flush the buffer cache and shared pool before each run so as not to be helped by blocks already in memory. Finally I commit every one million rows so as not to fill my 8GB undo tablespace:
ALTER SYSTEM flush buffer_cache;
ALTER SYSTEM flush shared_pool;

SET serveroutput ON SIZE 999999

DECLARE
  start_time NUMBER;
  end_time NUMBER;
  CURSOR cursor1 IS SELECT id,val FROM test01;
  item1 cursor1%ROWTYPE;
  i NUMBER;
BEGIN
  start_time:=dbms_utility.get_time;
  i:=1;
  OPEN cursor1;
  LOOP
    FETCH cursor1 INTO item1;
    EXIT WHEN cursor1%notfound;
    UPDATE /* Yannick */ test01 SET val=RPAD('B',100,'B') WHERE id=item1.id;
    i:=i+1;
    IF (MOD(i,1000000)=0) THEN
      COMMIT;
    END IF;
  END LOOP;
  CLOSE cursor1;
  COMMIT;
  end_time:=dbms_utility.get_time;
  dbms_output.put_line('Explicit Cursor Loop Update: ' || TO_CHAR((end_time - start_time)/100) || ' seconds');
END;
/
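To check that the A's have really been replaced by B's (my own quick sanity check, not part of the timed run), something like this does the job:

SQL> SELECT DISTINCT SUBSTR(val,1,1) FROM test01;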
I have got an average of around 2150 seconds over 7 runs.
Bulk SQL method
This is the method with BULK COLLECT and FORALL. To complement the code I have chosen to:
- Limit memory usage by capping the number of returned rows at 1,000,000 per fetch using the LIMIT keyword, and commit every million rows accordingly.
- Trap error messages using the SAVE EXCEPTIONS keyword.
The test:
ALTER SYSTEM flush buffer_cache;
ALTER SYSTEM flush shared_pool;

SET serveroutput ON SIZE 999999

DECLARE
  start_time NUMBER;
  end_time NUMBER;
  i NUMBER;
  CURSOR cursor1 IS SELECT id,val FROM test01;
  TYPE collection1 IS TABLE OF cursor1%ROWTYPE INDEX BY PLS_INTEGER;
  item1 collection1;
  dml_errors EXCEPTION;
  PRAGMA exception_init(dml_errors, -24381);
BEGIN
  start_time:=dbms_utility.get_time;
  OPEN cursor1;
  LOOP
    FETCH cursor1 BULK COLLECT INTO item1 LIMIT 1000000;
    EXIT WHEN item1.COUNT=0;
    FORALL i IN item1.FIRST..item1.LAST SAVE EXCEPTIONS
      UPDATE /* Yannick */ test01 SET val=RPAD('C',100,'C') WHERE id=item1(i).id;
    COMMIT;
  END LOOP;
  CLOSE cursor1;
  COMMIT;
  end_time:=dbms_utility.get_time;
  dbms_output.put_line('FORALL and BULK COLLECT Update: ' || TO_CHAR((end_time - start_time)/100) || ' seconds');
EXCEPTION
  WHEN dml_errors THEN
    IF cursor1%isopen THEN
      CLOSE cursor1;
    END IF;
    FOR i IN 1..sql%bulk_exceptions.COUNT LOOP
      dbms_output.put_line(SQLERRM(-(sql%bulk_exceptions(i).error_code)));
      dbms_output.put_line(sql%bulk_exceptions(i).error_index);
    END LOOP;
    COMMIT;
  WHEN OTHERS THEN
    dbms_output.put_line('Unrecognized error.');
    RAISE;
END;
/
I have got an average of around 980 seconds over 7 runs.
Overall the improvement is around 2x! Moreover, even if a few slight modifications are needed, it is not a complete redesign of your PL/SQL block. The example here is deliberately simple, but this type of loop updating a few columns is a classical way of working in BI environments. I rate this bulk SQL/bulk binding feature as a rapid and simple quick win to implement and benefit from!
DBMS_PARALLEL_EXECUTE method
In the above testing my colleague tried to use parallel DML, which, I discovered, is not activated by default: only parallel DDL and parallel query are enabled by default (phew). Obviously in the classical method, as the statement updates one row at a time, there is nothing to parallelize. But for the bulk SQL one you might have expected the PL/SQL block to be parallelized. With the help of the /* Yannick */ comment I can query the V$SQL view (you should flush the shared pool beforehand), and the PX_SERVERS_EXECUTIONS column tells you whether your statement has used parallel processes and how many.
SQL> col sql_text FOR a50 word_wrapped
SQL> SET lines 150 pages 1000
SQL> SELECT sql_id, SUBSTR(sql_text,1,200) AS sql_text, px_servers_executions, disk_reads, buffer_gets, plsql_exec_time, elapsed_time
     FROM v$sql
     WHERE sql_text LIKE '%/* Yannick */%'
     AND sql_text NOT LIKE 'select%';

SQL_ID        SQL_TEXT                                           PX_SERVERS_EXECUTIONS DISK_READS BUFFER_GETS PLSQL_EXEC_TIME ELAPSED_TIME
------------- -------------------------------------------------- --------------------- ---------- ----------- --------------- ------------
32h9ka1576yny DECLARE start_time NUMBER; end_time NUMBER;                             0    1662231   252397556       218299035   2196358064
              CURSOR cursor1 IS SELECT id,val FROM test01;
              item1 cursor1%ROWTYPE; i NUMBER; BEGIN
              start_time:=dbms_utility.get_time; i:=1; OPEN
              cursor1;

SQL_ID        SQL_TEXT                                           PX_SERVERS_EXECUTIONS DISK_READS BUFFER_GETS PLSQL_EXEC_TIME ELAPSED_TIME
------------- -------------------------------------------------- --------------------- ---------- ----------- --------------- ------------
9zjc7pp7sp0sa DECLARE start_time NUMBER; end_time NUMBER;                             0    1662256   111747162          435065    993450741
              i NUMBER; CURSOR cursor1 IS SELECT id,val FROM
              test01; TYPE collection1 IS TABLE OF
              cursor1%ROWTYPE INDEX BY PLS_INTEGER; item1
              collection1;
As you can see both runs are serial, so how could we benefit from modern multi-core processors and update the table in parallel from inside a PL/SQL block? For example, how to benefit from:
SQL> ALTER SESSION enable parallel dml;

Session altered.
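As a side note, you can confirm what is currently enabled in your session with a query like the one below (my addition; PDML_STATUS, PDDL_STATUS and PQ_STATUS are standard V$SESSION columns):

SQL> SELECT pdml_status, pddl_status, pq_status
     FROM v$session
     WHERE sid=SYS_CONTEXT('USERENV','SID');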
Well, reading around, you cannot with a simple UPDATE, as it cannot run in parallel on a non-partitioned table. The only option is to issue multiple jobs in parallel, each working on a subset of the table. DBMS_PARALLEL_EXECUTE comes to the rescue to ease this type of implementation. The principle of the package is to split the table you plan to update into multiple smaller chunks and to execute the task on all chunks in parallel. Each execution is done through an Oracle job (the CREATE JOB privilege is needed).
It starts with the creation of a task:
SQL> EXEC dbms_parallel_execute.create_task('task01');

PL/SQL procedure successfully completed.

SQL> SET lines 150 pages 1000
SQL> col task_name FOR a10
SQL> SELECT task_name, chunk_type, status FROM user_parallel_execute_tasks;

TASK_NAME  CHUNK_TYPE   STATUS
---------- ------------ -------------------
task01     UNDELARED    CREATED
You have the choice of creating chunks by rowid, by column value, or user-defined by SQL. I have chosen to create chunks based on the ID column value, with chunks of one million rows to keep the same spirit as the previous tests:
SQL> EXEC dbms_parallel_execute.create_chunks_by_number_col(task_name => 'task01', table_owner => 'YJAQUIER', -
     table_name => 'TEST01', table_column => 'ID', chunk_size => 1000000);

PL/SQL procedure successfully completed.
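For completeness, the rowid-based alternative mentioned above would look something like this (a sketch only, not used in this post); the update statement given to RUN_TASK would then filter on WHERE rowid BETWEEN :start_id AND :end_id instead of the id column:

SQL> EXEC dbms_parallel_execute.create_chunks_by_rowid(task_name => 'task01', table_owner => 'YJAQUIER', -
     table_name => 'TEST01', by_row => TRUE, chunk_size => 1000000);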
If you mess up when creating the chunks you can always drop them with:
SQL> EXEC dbms_parallel_execute.drop_chunks(task_name => 'task01');

PL/SQL procedure successfully completed.
You can check that the chunks have been created with:
SQL> SELECT chunk_id, status, start_id, end_id FROM user_parallel_execute_chunks WHERE task_name='task01' ORDER BY chunk_id;

  CHUNK_ID STATUS                 START_ID     END_ID
---------- -------------------- ---------- ----------
         1 UNASSIGNED                    1    1000000
         2 UNASSIGNED              1000001    2000000
         3 UNASSIGNED              2000001    3000000
         4 UNASSIGNED              3000001    4000000
         5 UNASSIGNED              4000001    5000000
         6 UNASSIGNED              5000001    6000000
         7 UNASSIGNED              6000001    7000000
         8 UNASSIGNED              7000001    8000000
         9 UNASSIGNED              8000001    9000000
        10 UNASSIGNED              9000001   10000000
        11 UNASSIGNED             10000001   11000000
        12 UNASSIGNED             11000001   12000000
        13 UNASSIGNED             12000001   13000000
        14 UNASSIGNED             13000001   14000000
        15 UNASSIGNED             14000001   15000000
        16 UNASSIGNED             15000001   16000000
        17 UNASSIGNED             16000001   17000000
        18 UNASSIGNED             17000001   18000000
        19 UNASSIGNED             18000001   19000000
        20 UNASSIGNED             19000001   20000000
        21 UNASSIGNED             20000001   21000000
        22 UNASSIGNED             21000001   22000000
        23 UNASSIGNED             22000001   23000000
        24 UNASSIGNED             23000001   24000000
        25 UNASSIGNED             24000001   25000000
        26 UNASSIGNED             25000001   26000000
        27 UNASSIGNED             26000001   27000000
        28 UNASSIGNED             27000001   28000000
        29 UNASSIGNED             28000001   29000000
        30 UNASSIGNED             29000001   30000000
        31 UNASSIGNED             30000001   31000000
        32 UNASSIGNED             31000001   32000000
        33 UNASSIGNED             32000001   33000000
        34 UNASSIGNED             33000001   34000000
        35 UNASSIGNED             34000001   35000000
        36 UNASSIGNED             35000001   36000000
        37 UNASSIGNED             36000001   37000000
        38 UNASSIGNED             37000001   38000000
        39 UNASSIGNED             38000001   39000000
        40 UNASSIGNED             39000001   40000000
        41 UNASSIGNED             40000001   41000000
        42 UNASSIGNED             41000001   42000000
        43 UNASSIGNED             42000001   43000000
        44 UNASSIGNED             43000001   44000000
        45 UNASSIGNED             44000001   45000000
        46 UNASSIGNED             45000001   46000000
        47 UNASSIGNED             46000001   47000000
        48 UNASSIGNED             47000001   48000000
        49 UNASSIGNED             48000001   49000000
        50 UNASSIGNED             49000001   50000000

50 rows selected.
Now execute the task with the code below. One important parameter is PARALLEL_LEVEL:
ALTER SYSTEM flush buffer_cache;
ALTER SYSTEM flush shared_pool;

SET serveroutput ON SIZE 999999

DECLARE
  start_time NUMBER;
  end_time NUMBER;
  sql_stmt VARCHAR2(32767);
BEGIN
  start_time:=dbms_utility.get_time;
  sql_stmt := 'update /* Yannick */ test01 set val=rpad(''D'',100,''D'') where id between :start_id and :end_id';
  dbms_parallel_execute.run_task(task_name => 'task01', sql_stmt => sql_stmt, language_flag => dbms_sql.native, parallel_level => 25);
  end_time:=dbms_utility.get_time;
  dbms_output.put_line('DBMS_PARALLEL_EXECUTE Update: ' || TO_CHAR((end_time - start_time)/100) || ' seconds');
END;
/
To know which jobs have been created for your task you need to get the JOB_PREFIX column from the USER_PARALLEL_EXECUTE_TASKS view and use this prefix with the LIKE keyword to get the list of job names (not so easy to explain, but the SQL below is self-explanatory, I hope):
SQL> col task_name FOR a10
SQL> col job_prefix FOR a15
SQL> SELECT task_name, status, job_prefix, parallel_level FROM user_parallel_execute_tasks;

TASK_NAME  STATUS              JOB_PREFIX      PARALLEL_LEVEL
---------- ------------------- --------------- --------------
task01     FINISHED            TASK$_442                   25

SQL> col job_name FOR a20
SQL> col run_duration FOR a15
SQL> SELECT job_name, status, run_duration FROM user_scheduler_job_run_details
     WHERE job_name LIKE (SELECT job_prefix || '%' FROM user_parallel_execute_tasks WHERE task_name='task01');

JOB_NAME             STATUS                         RUN_DURATION
-------------------- ------------------------------ ---------------
TASK$_442_11         SUCCEEDED                      +000 00:02:24
TASK$_442_10         SUCCEEDED                      +000 00:09:19
TASK$_442_2          SUCCEEDED                      +000 00:09:22
TASK$_442_1          SUCCEEDED                      +000 00:09:22
TASK$_442_13         SUCCEEDED                      +000 00:02:20
TASK$_442_3          SUCCEEDED                      +000 00:09:18
TASK$_442_8          SUCCEEDED                      +000 00:02:07
TASK$_442_14         SUCCEEDED                      +000 00:01:25
TASK$_442_15         SUCCEEDED                      +000 00:02:06
TASK$_442_18         SUCCEEDED                      +000 00:02:16
TASK$_442_19         SUCCEEDED                      +000 00:02:20
TASK$_442_21         SUCCEEDED                      +000 00:09:14
TASK$_442_23         SUCCEEDED                      +000 00:02:09
TASK$_442_25         SUCCEEDED                      +000 00:02:13
TASK$_442_5          SUCCEEDED                      +000 00:02:22
TASK$_442_9          SUCCEEDED                      +000 00:01:43
TASK$_442_22         SUCCEEDED                      +000 00:02:25
TASK$_442_24         SUCCEEDED                      +000 00:10:10
TASK$_442_6          SUCCEEDED                      +000 00:02:26
TASK$_442_12         SUCCEEDED                      +000 00:09:24
TASK$_442_4          SUCCEEDED                      +000 00:02:25
TASK$_442_7          SUCCEEDED                      +000 00:09:16
TASK$_442_16         SUCCEEDED                      +000 00:08:24
TASK$_442_17         SUCCEEDED                      +000 00:02:20
TASK$_442_20         SUCCEEDED                      +000 00:11:06

25 rows selected.
Even if the jobs all succeed, some chunks may not have been processed correctly:
SQL> SELECT chunk_id, status, start_id, end_id FROM user_parallel_execute_chunks WHERE task_name='task01' ORDER BY chunk_id;

  CHUNK_ID STATUS                 START_ID     END_ID
---------- -------------------- ---------- ----------
         1 PROCESSED                     1    1000000
         2 PROCESSED               1000001    2000000
         3 PROCESSED               2000001    3000000
         4 PROCESSED               3000001    4000000
         5 PROCESSED               4000001    5000000
         6 PROCESSED               5000001    6000000
         7 PROCESSED               6000001    7000000
         8 PROCESSED               7000001    8000000
         9 PROCESSED               8000001    9000000
        10 PROCESSED               9000001   10000000
        11 PROCESSED              10000001   11000000
        12 PROCESSED              11000001   12000000
        13 PROCESSED              12000001   13000000
        14 PROCESSED              13000001   14000000
        15 PROCESSED              14000001   15000000
        16 PROCESSED              15000001   16000000
        17 PROCESSED              16000001   17000000
        18 PROCESSED              17000001   18000000
        19 PROCESSED              18000001   19000000
        20 PROCESSED              19000001   20000000
        21 PROCESSED              20000001   21000000
        22 PROCESSED_WITH_ERROR   21000001   22000000
        23 PROCESSED              22000001   23000000
        24 PROCESSED_WITH_ERROR   23000001   24000000
        25 PROCESSED              24000001   25000000
        26 PROCESSED              25000001   26000000
        27 PROCESSED              26000001   27000000
        28 PROCESSED_WITH_ERROR   27000001   28000000
        29 PROCESSED              28000001   29000000
        30 PROCESSED_WITH_ERROR   29000001   30000000
        31 PROCESSED_WITH_ERROR   30000001   31000000
        32 PROCESSED              31000001   32000000
        33 PROCESSED_WITH_ERROR   32000001   33000000
        34 PROCESSED_WITH_ERROR   33000001   34000000
        35 PROCESSED_WITH_ERROR   34000001   35000000
        36 PROCESSED_WITH_ERROR   35000001   36000000
        37 PROCESSED              36000001   37000000
        38 PROCESSED              37000001   38000000
        39 PROCESSED_WITH_ERROR   38000001   39000000
        40 PROCESSED              39000001   40000000
        41 PROCESSED              40000001   41000000
        42 PROCESSED              41000001   42000000
        43 PROCESSED              42000001   43000000
        44 PROCESSED              43000001   44000000
        45 PROCESSED              44000001   45000000
        46 PROCESSED              45000001   46000000
        47 PROCESSED              46000001   47000000
        48 PROCESSED              47000001   48000000
        49 PROCESSED              48000001   49000000
        50 PROCESSED              49000001   50000000

50 rows selected.
You can try to understand why with the ERROR_CODE and ERROR_MESSAGE columns of the USER_PARALLEL_EXECUTE_CHUNKS view:
SQL> col error_message FOR a40 word_wrapped
SQL> SELECT error_code, error_message FROM user_parallel_execute_chunks WHERE chunk_id=22;

ERROR_CODE ERROR_MESSAGE
---------- ----------------------------------------
    -30036 ORA-30036: unable to extend segment by 8
           in undo tablespace 'UNDOTBS1'
And here you see the mistake I made: I tried to process the 50 chunks with 25 jobs running in parallel. It means I would need enough undo to store the update (without committing) of 25 million rows, which is really far from the previous methods I used. And to be honest it was not faster than the bulk method. I refined the approach by creating chunks of 100,000 rows (so 500 chunks) and processing 10 in parallel, and got 640.39 seconds…
With chunks of 50,000 rows (so 1000 chunks) and 20 jobs in parallel I got 972.92 seconds… This just shows that finding the right balance between chunk size and parallelism is not so easy and might require multiple tries…
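Note that when only some chunks end in PROCESSED_WITH_ERROR you do not have to restart everything from scratch: the package can retry just the failed chunks. I have not used it here, but the call is simply:

SQL> EXEC dbms_parallel_execute.resume_task('task01');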
Another cool query to see the progress of your task is:
SQL> SELECT status, COUNT(*) FROM user_parallel_execute_chunks WHERE task_name='task01' GROUP BY status ORDER BY status;

STATUS                 COUNT(*)
-------------------- ----------
ASSIGNED                      6
PROCESSED                   165
UNASSIGNED                  329
Over 7 runs (500 chunks of 100,000 rows and 10 jobs in parallel) I got an average of 610 seconds, which is again a roughly 40% improvement versus bulk SQL. Of course this is really more complex to implement, and you must be able to divide and conquer your processing to use the DBMS_PARALLEL_EXECUTE package. On top of that, as each processed chunk is committed, it also helps to decrease your undo footprint. I would have loved to be aware of this package sooner, when I developed a few BI PL/SQL processes…
At the end, to clean up the task, use:
SQL> EXEC dbms_parallel_execute.drop_task('task01');

PL/SQL procedure successfully completed.
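To wrap up, the whole recipe can also be packed into a single PL/SQL block. Below is a minimal sketch of what it could look like, reusing the same task name, table and chunking choices as above (the 'E' padding is just to distinguish this run from the previous ones):

DECLARE
  l_task VARCHAR2(30) := 'task01';
  l_sql  VARCHAR2(32767);
BEGIN
  -- 1. Create the task and the chunks on the ID column (100,000 rows per chunk)
  dbms_parallel_execute.create_task(l_task);
  dbms_parallel_execute.create_chunks_by_number_col(task_name => l_task, table_owner => 'YJAQUIER',
    table_name => 'TEST01', table_column => 'ID', chunk_size => 100000);
  -- 2. Run the chunked update, 10 jobs in parallel (RUN_TASK waits for completion)
  l_sql := 'update /* Yannick */ test01 set val=rpad(''E'',100,''E'') where id between :start_id and :end_id';
  dbms_parallel_execute.run_task(task_name => l_task, sql_stmt => l_sql,
    language_flag => dbms_sql.native, parallel_level => 10);
  -- 3. Drop the task only if every chunk has been processed successfully
  IF dbms_parallel_execute.task_status(l_task) = dbms_parallel_execute.FINISHED THEN
    dbms_parallel_execute.drop_task(l_task);
  END IF;
END;
/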
Raw results
The raw results (in seconds) I got with the three methods:
Classical | Bulk SQL | DBMS_PARALLEL_EXECUTE
--- | --- | ---
1784.23 | 1106.4 | 640.39
2195.67 | 1026.9 | 517.1
2306.09 | 870.39 | 640.56
1997.96 | 927.7 | 637.14
2196.34 | 968.7 | 631.4
2428.13 | 993.36 | 569.85
2115.6 | 919.14 | 565.35
References
- Getting ORA-04030 Error While Executing A Procedure Having BULK COLLECT (Doc ID 1221283.1)
- Excessive REDO Generation For FORALL Loop With SAVE EXCEPTION Clause (Doc ID 1995875.1)
- How to Perform Multiple Updates Using PL/SQL and Bulk Binds (Doc ID 728819.1)
- Bulk Binding – What it is, Advantages, and How to use it (Doc ID 74123.1)
- 8 Bulk Update Methods Compared
- Oracle Database – Parallel DML
- Working with Collections
- On Working in Parallel
- The Joy of Low-Hanging Fruit
- Loading data fast – regular insert vs. bulk insert
- Bulk Collect and FORALL
Anil M says:
Thanks for good example, I still have few doubts, please clarify.
As per the example considered above, the select query below covers all records of the table (it is an example of a full table update):
SELECT id,val FROM test01;
If we have a condition as in the select query below, is it possible to handle it using DBMS_PARALLEL_EXECUTE? If yes, please give an example.
SELECT id,val FROM test01 where id between 1 to 50000;
Yannick Jaquier says:
I have not tried it, but yes, it should not be a problem to add an additional where clause to the update statement.
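Untested on my side, but the idea would simply be to add the extra predicate to the chunked update statement, for example (chunks falling entirely outside the range would then just update zero rows):

sql_stmt := 'update test01 set val=rpad(''B'',100,''B'')
             where id between :start_id and :end_id
             and id between 1 and 50000';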
ARUL JEBIN says:
You have passed 25 as the parallel level, but only 6 chunks are ASSIGNED. What could be the reason?
Yannick Jaquier says:
Good catch! That came from a wrong copy/paste from a different test. I have updated the output…
Thanks,
Yannick.
James says:
Great article! One question…
JOB_NAME STATUS RUN_DURATION
-------------------- ------------------------------ ---------------
TASK$_728_5 SUCCEEDED +000 00:02:00
TASK$_728_7 SUCCEEDED +000 00:09:16
TASK$_728_8 SUCCEEDED +000 00:00:00
TASK$_728_9 SUCCEEDED +000 00:01:16
TASK$_728_10 SUCCEEDED +000 00:09:19
TASK$_728_2 SUCCEEDED +000 00:09:22
TASK$_728_3 SUCCEEDED +000 00:09:18
TASK$_728_6 SUCCEEDED +000 00:00:00
TASK$_728_4 SUCCEEDED +000 00:00:00
TASK$_728_1 SUCCEEDED +000 00:09:22
I see some of the jobs did not process any chunks but still says it’s succeeded. (TASK$_728_8, TASK$_728_6, TASK$_728_4). Do you know why that happens??
Yannick Jaquier says:
Thanks for comment !
Good catch, maybe the screenshot comes from a run where the tasks got an undo space error while processing chunks…
Pierre-yves COLLE says:
A big thanks to Yannick for this wonderful, very clear, detailed and easy-to-understand article.
Although I'm not a rookie in SQL and ORACLE, having spent multiple years programming applications,
I learned a great ORACLE feature that I was absolutely not aware of, and thanks to his valuable example I was able to test it.
Not only is the added value of this ORACLE feature well documented, but the example also lets you feel its interest immediately instead of staying purely theoretical, as is too often the case in other articles, whether on ORACLE or on other technologies.
BIG BIG THANKS AGAIN
Yannick Jaquier says:
Welcome and thanks a ton for enthusiastic comment !
Yannick.