Monday, March 4, 2013
John Adamski posted a question to the IIUG SIGs about how to identify a session that caused the long transaction that eventually put his system in a Blocked:LONGTX state. A few of us came back with some responses, but it wasn't until John Miller III from IBM and Informix Fun Facts replied with "finding the session that caused your long transaction isn't very useful, you need to prevent this situation from happening with the LTXHWM and LTXEHWM ONCONFIG parameters" that I realized these config parameters are typically underutilized.
LTXHWM and LTXEHWM
LTXHWM is the percentage of logical log space a transaction can span before it is declared to be a long transaction and is rolled back by the engine.
John points out that this rollback only affects the session being rolled back (well, excluding any extra strain on the system rolling back a long transaction creates), nothing else is blocked.
LTXEHWM is the percentage of logical log space a transaction can span before the engine decides we are in a bad enough situation that we need to block all other modifications to try and ensure we have enough logical log space to complete the roll back. This does affect other sessions and if your system is in a Blocked:LONGTX your users will be very unhappy and you are likely going to have a very bad day.
Your day will get infinitely worse if even after blocking other sessions you do not have enough logical log space to complete the rollback of the long transaction. You can read about recovering from a long transaction hang in the manuals if you want. I warn you, it involves a point in time restore or a call to IBM Informix support.
The Default Values
Out of the box LTXHWM is 70 and LTXEHWM is 80. Are these too high? They probably are if you have significant logical log space.
On one of my instances I have 16 GB of logical log space. This means that one of my users can start a transaction that spans 11.2 GB of logical log space before he gets the boot and the engine starts rolling back his long transaction. That is going to be a lot of rolling back and if it took this long transaction 3 hours to get to the point it needed to roll back I can expect the rollback to last at least 3 hours.
Avoid the One Size Fits All LTXHWM/LTXEHWM Recommendations
I had always believed that the 70/80 defaults were too high,and I've always heard that a better alternative is 50/60 and admittedly that is what I have always set my values to.
I'm rethinking that now. I don't know why it took me this long to rethink this, probably because I haven't had any problems with long transactions lately so I've been thinking about other things. One thing is for sure, a LTXHWM of 50 is still way too high for my system.
This is an OLTP system with no long running transactions and a lot of logical log space configured so we can run for at least 48 hours without needing to back up a logical log in case we have a TSM outage.
My LTXHWM should be a lot lower, probably something as ridiculous sounding as 5. I know the kind of transactions that run on my system and no single transaction should span more than 1 GB of logical log space and if it does, something is wrong. I want to kill that transaction ASAP so I only have a performance hit while rolling back 1 GB of logical logs instead of 11 GB.
I think my LTXEHWM setting of 60 is OK, maybe it could even go a little bit higher. The only reason I would want to block my engine to allow a transaction to rollback is if I thought I was in danger of not being able to complete the rollback because of limited logical log space. Setting this ONCONFIG lower doesn't make any sense.
Posted by Andrew Ford at 4:17 PM