Thursday, January 29, 2015

Resolving In Place Alters - Part 4, The Slot Table

By the end of Resolving In Place Alters, Part 3 I was able to read the raw data pages that belong to the table with pending IPAs, but I wasn't able to see if a data page had a pending IPA or not. In this installment I will show you how I solved that problem using what I know about the Informix Data Page structure and the structure of my table before and after the in place alter was performed.

Anatomy of the Informix Data Page

When Informix stores data on disk, it doesn't just write your row data. It also writes some "housekeeping" data at the beginning of the page and at the end of the page that help the engine to know what actually is stored in this data page.

The first 24 bytes of a page contain the page header with all kinds of information that you're going to have a hard time finding documentation about. The last N bytes of a data page contain a timestamp and the slot table, which you may have better luck finding some documentation on. In between the page header and page footer (slot table + timestamp) you'll find your actual data. We will need information from each of the 3 parts of the data page.

Friday, January 23, 2015

Resolving In Place Alters - Part 3

In Resolving In Place Alters - Part 2 I decided that I want to try to identify only the pages with a pending IPA and only "fix" those data pages in an attempt to speed up the resolving of IPAs and limit the work load added to the system when doing so.

To do this, I will need to read the actual data pages stored on disk and use information in the page header and slot table to determine if a page needs to be fixed. More on this in the next blog, today I just want to talk about how to read in the raw data pages.

Thursday, January 15, 2015

Resolving In Place Alters - Part 2

In Part 1 I talked about how I knew that the prescribed method of resolving the pending in place alters in my large table by simply issuing a single large dummy update statement to update each row won't work for large tables. There are just too many problems with long transactions, locks and excessive I/O that can trash the engine.

I decided the first thing I would try would be to update every row in the table, but do it in multiple transactions to avoid long transactions and holding on to a lot of locks. Nothing too interesting, pretty standard stuff really. The logic looks like this: