Friday, March 1, 2013

Hot Topic - SELECT FOR UPDATE/WHERE CURRENT OF


"Holy Cow, two blog posts in one day!" - Harry Caray

Ben Thompson over at Informed Mix currently wrote about using "select for update/where current of" syntax and in the mother of all coincidences one of the developers that writes code that hits my Informix engines came over to tell me about the evolution of performance improvements he went through to speed up a bulk data delete application. Here is his story, from static SQL all the way to prepared statements using the "select for update/where current of" syntax.

First, the formalities. This particular developer is writing an application that will delete a bunch 'o data from a table. Since he is a nice guy he is writing the application in a way that will avoid long transactions and not hold too many locks at once.

Basically he is doing one of these:

select primary key fields of rows to delete

begin work

for each row:
   delete by primary key

   if time to commit:
      commit work
      begin work

commit work

In his first attempt, he built and executed a static delete sql string for each row to delete.

delete from table where field1 = "ABC" and field2 = "123";
delete from table where field1 = "DEF" and field2 = "456";
delete from table where field1 = "GHI" and field2 = "789";

Of course this works, but has the unpleasant side effect of being the worst way possible to do this since the engine has to parse, optimize and execute each statement every time.

This yielded a result of 463.18 deletes per second.

In his second attempt, he used a prepared statement for the delete which is better because the engine only has to parse and optimize the statement once.

delete from table where field1 = ? and field2 = ?;

This gave a result of 1016.50 deletes per second.

In his third (and currently, final) attempt he took advantage of the "select for update/update where current of" functionality.

declare cursor cur for select 1 from the_table where date_field < today - 180 for update

open cur

begin work

for each row in cur:
   delete from the_table where current of cur

   if time to commit:
      commit work
      begin work

commit work

This was a huge boost to performance since the engine already knows where the row we want to delete lives and we don't have to go through indexes and data pages to find it again.

Final results were 1923.00 deletes per second.

No comments:

Post a Comment