Friday, January 23, 2015

Resolving In Place Alters - Part 3

In Resolving In Place Alters - Part 2 I decided that I want to try to identify only the pages with a pending IPA and only "fix" those data pages in an attempt to speed up the resolving of IPAs and limit the work load added to the system when doing so.

To do this, I will need to read the actual data pages stored on disk and use information in the page header and slot table to determine if a page needs to be fixed. More on this in the next blog, today I just want to talk about how to read in the raw data pages.

As far as I know, I only have 2 good options. Bypass the engine and open and read the data in the chunk directly or use the sysmaster:sysrawdsk table. I tried both options and did not notice a performance difference either way, so I decided to use sysrawdsk simply because it made things easier since I can use SQL to query and get the raw data pages.

sysmaster:sysrawdsk looks like this (for those concerned about the Internal Use Only warning, just don't do any of this outside and you'll be fine):

    create table informix.sysrawdsk             { Internal Use Only             }
        pg_chunk        integer,        { physical page address - chunk }
        pg_offset       integer,        { physical page address - offset}
        offset          smallint,       { bytes into page               }
        loc             char(23),       { location representation       }
        hexdata         char(40),       { 16 bytes hexdumped from offset}
        ascdata         char(16)        { 16 bytes ascii-dumped         }
    create unique index informix.sysrawdskidx on sysrawdsk (pg_chunk, pg_offset, offset);

Each row in the table represents 16 bytes of raw data. For example:

   pg_chunk = 4 and 
   pg_offset = 1000 and 
   offset = 64;


34313037 30353632 36302020 20202020

Gives me the 16 bytes of data (in hex) that starts on the 64th byte of page 1000 in chunk 4.

We are making progress. Now that I can query the raw data to use to my advantage later, I need to figure out how to find the pages that belong to the table I'm trying to fix. sysmaster to the rescue yet again, this time in the form of the sysextents table.

   dbsname = "mydb" and
   tabname = "mytab"
order by
   offset asc;

      chunk      offset        size

          4      851706       77423
          4      929129       77423
          4     1006552       77423
          4     1083975       77423
          4     1161398       77423
          4     1238821       77423
          4     1316244       77423
          4     1393667       77423
          4     1471090       77423
          4     1548513       77423
          4     1625936       77423
          4     1703535      524288
          4     2227823      524288
          4     2752111      524288
          4     3276399      524288
          4     3800687      524288
          4     4324975      524288
          4     4849263      524288
          4     5373551      524288
In review, we now have:
  1. A way to read the raw data pages from sysrawdsk
  2. A way to find the pages that belong to our table
Now we need to find the data pages in these extents, back to good old sysmaster.

The sysmaster:sysptnbit table holds the partition bit maps, this data will tell us if a page we are looking at is a "Data Page with Room for another Row" (bitmap value of 4), a "Data Page without Room for another Row" (bitmap value of 12) or some page type we don't care about. Now would be a good time to mention that my row size is fixed with no variable length fields and one row fits completey on a page so there are no remainder pages. If you have either of these conditions, well things are going to be a little more difficult for you.
{ Partition Bit Maps }
    create table informix.sysptnbit
        pb_partnum      integer,        { partnum for this partition    }
        pb_pagenum      integer,        { logical pagenum represented   }
        pb_bitmap       integer         { bitmap value for page         }
    create unique index informix.sysptnbitidx on sysptnbit (pb_partnum, pb_pagenum);

Here things get a little annoying for us. The data in sysptnbit is referenced by logical page and the pages we are looking at in sysrawdsk are referenced by physical page. If you need a refresher course on how to convert a physical page reference (chunk and offset) to a logical page reference (partnum and page number) then here is my best attempt...

Logical page numbers in a partition are sequential starting with page 0 in the first extent. The second page in the first extent is page number 1, the third page in the first extent is page number 2 and so on and so forth. If the first extent has 1000 pages, the last page in the first extent will be logical page number 999 and the first page in the second extent will be logical page number 1000. The second page in the second extent will be page number 1001, the third page in the second extent will be 1002 and so on and so forth. We can use this logic and the sysmaster:sysptnext table to find the logical page number for any physical page.

Here is an example with chunk 4, page 5373552.  

Step 1 - Find the first page of the extent that this page belongs to
   chunk = 4 and
   5373552 between offset and offset + size;


Step 2 - Find the partnum and extent number that maps to this extent
   pe_chunk = 4 and
   pe_offset = 5373551; -- first page of the extent that page 5373552 of chunk 4 belongs to

 pe_partnum pe_extnum

    4194317         1

Step 3 - Find the total number of pages that belong to the extents created before this extent
   nvl(sum(pe_size), 0) pages
   pe_chunk = 4 and
   pe_partnum = 4194317 and
   pe_extnum < 1;


Step 4 - We now know the logical page number for the first page of the extent that our page belongs to, just need to calculate our page number
Our Logical Page Number = 77423 + (5373552 - 5373551) = 77424

Taking our new found logical page address (partnum and pagenum) we can query sysptnbit and see if this is a page we are interested in (bitmap value 4 or 12)

   pb_partnum = 4194317 and
   pb_pagenum = 77424;



One last thing to note before I conclude Part 3, if no bitmap row had been found then I would know that I am at the end of valid data for this extent and can stop looking at the pages it contains and move on to the next extent.

Next time - Part 4: How to figure out if the data page I just found has a pending IPA or not.

Now, have some ads. I insist.

No comments:

Post a Comment