Tuesday, December 14, 2010

Table Level Restore - Pretty Useful Stuff

"Hello Andrew. Yeah, well, um we kind of ran an update with a bad where clause and messed up some of our data. Can you fix just the rows we updated?"

"Oh yeah, sure. We've got the backups and the logs so we'll be OK."

"How long will it take? This is production and it is kind of a big deal."

"You're in luck. Informix has a feature called Table Level Restore that allows me to extract only the rows we want from a backup in a few minutes. Without this feature we would have to perform a full restore to a test engine, which can take hours, and then extract the rows from the restored instance."

"Oh thank heavens we run Informix and not some other database engine that doesn't have this feature."

Table-Level Restore

You can use Table-Level Restore (TLR) to extract an entire table or a subset of rows in a table from your backups. The feature has been around since version 10, is easy to use and can come in really handy if you find yourself in the situation I described above.
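Under the hood, TLR is driven by the archecker utility and a small schema command file. Here is a hedged sketch of what extracting a subset of rows from an ontape backup looks like; the database, table, dbspace, file path and filter below are all made up for illustration, and the exact command-file syntax should be verified against the Backup and Restore Guide for your version:

```sql
-- tlr.cmd: archecker schema command file (all names are illustrative)
database blog;

-- Declare the source table exactly as it existed at backup time
create table blog_post (
   id       serial not null,
   blog_id  integer not null,
   title    varchar(255)
) in datadbs01;

-- Destination: an external table that writes extracted rows to a flat file
create external table blog_post_ext sameas blog_post
   using ("/tmp/blog_post.unl");

-- Pull only the rows clobbered by the bad update
insert into blog_post_ext
   select * from blog_post where blog_id = 2;

restore to current;
```

You then run something like `archecker -tvs -f tlr.cmd` against an ontape backup (`-b` instead of `-t` for ON-Bar) and load the recovered rows back with a dbaccess LOAD. Again, treat the flags and syntax as assumptions to check against your docs, not gospel.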

Saturday, December 11, 2010

Promoting the 2011 IIUG Informix Conference

Bear with me for one more "Non Technical Pimp the 2011 IIUG Informix Conference" post. After this we'll get back to the good stuff. I had to take advantage of Table Level Restore last night and will be writing about that experience soon.

One of the things the IIUG Conference Planning Committee is doing to promote Informix and the conference this year is to run a Google Ads campaign. We're running text ads worldwide that are displayed when you search for Informix related terms and on web sites containing content relevant to Informix. We're also running banner ads across the globe that appear on the web sites we think Informix users frequent.

This is a very inexpensive way for us to promote the conference, which we like since our main objective is to provide a great conference to Informix users while keeping the registration fee very affordable (you can register for the 3 day conference for as low as $525 between now and January 31!). The jury is still out as to how effective this form of advertising will be for us, but it is worth every penny to see Informix appear in an advertisement when I'm online.

I have two favors to ask:

Saturday, December 4, 2010

2011 IIUG CPC - Day 2, Matrigrid Continues to Take Shape

Making the Final Decisions on Sessions and Scheduling them to Avoid Conflicts


Friday, December 3, 2010

2011 IIUG Conference Planning Committee - Day 1, The Matrigrid Starts to Take Shape

Hand picking the 80+ sessions from a record number of submitted abstracts

Getting a Handle on the Hands on Labs

I spy with my little eye: nutritional snacks, a borrowed bell and a potential electrical fire

Wednesday, December 1, 2010

Bloggin Ain't Easy

Have I really not written anything since August? Wow, my bad.

What can I say? I haven't been wearing my Informix DBA hat at work lately. I've been playing Python developer, so my attention has been focused on other things. That's one of the great things about Informix: it runs so well that I don't have to devote 100% of my time to keeping the engines humming. This is great for me because I get to do some development work, which I find interesting, and my company gets to use me as a Software Engineer when the DBA workload is low.

So what got me focused on Informix again? Well, I just printed my boarding pass for a flight tomorrow to attend an IIUG Conference Planning Committee meeting this weekend. This is my first year volunteering for the CPC and I'm excited to be part of this group that has consistently put on a great conference for the Informix community.

If you haven't heard, the 2011 IIUG Informix Conference will be held May 15th - 18th at the Overland Park Marriott in Kansas and you can find a lot more information at the conference website.

The theme of the conference for 2011 is to Go Cruising with Informix. Come learn from the best and the brightest Informix speakers about how you can take advantage of your Informix engines, and take advantage of the cruise-like all-inclusive nature of the conference, which includes meals, drinks and entertainment.

Among other conference planning logistics, the CPC will be putting together the session grid for the conference this weekend, no easy task. We've received A TON of submissions this year and it will be our job to pick the right sessions and schedule them appropriately to maximize your training dollars.

Speaking of your training dollars: if you have them, then there is no better way to spend them than on the IIUG conference. Of course I have to say this as a member of the IIUG Board of Directors and now the CPC, but here are some numbers and you can be the judge.

Early Bird Registration (Before February 1, 2011) is $625 and if you're an IIUG member (it is freakin' free to join) you get $100 off conference registration.

That's $525 for the conference registration. Show me where you can get 3 full days of Informix training by the people who actually wrote the code, great food (breakfast, lunch and food at every night time activity), drinks (open bar dude!), entertainment, free internet, access to the vendors in the EXPO hall and free IBM certification training for $525. You can't. You simply can not beat the value this conference provides.

If you need something a little more formal than the promise of an open bar to show your boss, check the Conference Website for a full Conference Justification write up.

Friday, August 20, 2010

Free Discover Informix Event - North America

I just heard about a free Informix event in San Jose, CA. If you're in the area, register for the event and hear from Steve Shoaf, Worldwide Sales Manager, Informix, Jerry Keesee, Director, Informix Database Development and Craig Humphreys, Vice President, Technical Services. After the talky talky there will be lunch and a complimentary Tech Museum tour.

Discover the New Face of Informix

When: September 16, 2010 8:00 AM - 12:00 PM
Where: The Tech Museum, San Jose, CA

Participants will:
  • Learn how to achieve faster time to value for data analysis and reporting with Informix Warehouse
  • Find out how Informix is the ideal platform for applications that require embedded data management
  • Get an up-close look at the capabilities that set Informix apart, including exceptional performance, high availability, easy administration and the ability to scale on demand to meet your changing business needs
  • Hear IBM's Informix product strategy and roadmap
  • Meet your industry peers, Business Partners and Informix experts
  • Take the opportunity to have your questions answered
  • Network with other forward-thinking IT leaders
Register for this event today!

Tuesday, August 17, 2010

Discover hassle-free data management with IBM Informix

Not sure if this is old news or not but Infoworld is hosting an ebook sponsored by IBM about Informix, the overview goes a little something like this:

"Today, unprecedented data volumes and shrinking budgets are stretching IT departments to their limits. Now, leading businesses are discovering that success can be derived from the right database - one that does not tie up resources, yet performs better at lower costs. Read this ebook to discover how IBM® Informix® can help your business."

The ebook talks about the low maintenance overhead, scalability and efficient use of hardware, reliability and embeddability of Informix and even touches on Informix as a warehouse that can utilize compression features to reduce storage, energy and administration costs.

Take a look see for yourself.

Thursday, August 12, 2010

Take a 10 Minute 1 Page Survey - Win a Giant iPod

Everyone has their favorite Informix uptime story. Mine is about the time I came across a 5.0 instance that had been running for years. I thought it was neat, I had never seen a version earlier than 7.2x and when I asked the other people in my group why they hadn't told me about this instance they were just as shocked as I was to find it still running. The damn thing had been running quietly in the background, still doing work mind you, and since it caused no problems they simply forgot it existed!

Oninit, Advanced Data Tools and IBM are conducting a survey to collect Informix uptime and DBA resource data to be used in a "Continuous Availability White Paper" that will be distributed, as Advanced Data Tools puts it, to every CTO, IT Director and IT Manager along with a list of all major application vendors now porting their applications to Informix and a list of key reasons to choose Informix.

I took the survey yesterday and it was quick and painless. Seriously. 1 page, in and out in 10 minutes, and as a special thank you for filling out the survey I was entered into a drawing for an iPad.

If you have not yet taken the survey you have to hurry because it is over soon (Friday?).

Here be the link: http://www.advancedatatools.com/Informix/Survey.html

Wednesday, August 11, 2010

Informix HDR Will Save Your Butt

If I had to guess, I would say that most production database engines utilize RAID technology to protect against the inevitable disk failure and the ones that don't probably should. Disk is cheap and the revenue saved by avoiding an extended outage can be enough to pay for disk mirroring many times over.

If I had to guess again, I would say that not nearly enough production database engines utilize High Availability Data Replication (HDR) to protect against the inevitable server failure. Why is this? Servers can fail too. Sure, servers are more expensive than disks, and sure, their MTTF is longer, but the money lost during an extended outage that could have been avoided with HDR will probably exceed the cost of implementing an HDR solution.

HDR continuously replicates the changes made to a Primary server to a Secondary server that can be quickly converted to a Primary if the original Primary fails. As an added bonus, the Secondary server can be used for reads and writes allowing you to make use of this hardware to improve performance instead of letting it sit there idle. You could also implement multiple Remote Standalone Secondary (RSS) or Shared Disk Secondary (SDS) servers to create a grid if your Informix Edition supports this. I'm going to focus on a single HDR Secondary which is available for no cost in Innovator-C.

As with most Informix features, HDR is incredibly easy to configure and does not require much administration.
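To give you a flavor of how little there is to it, here is roughly what pairing two engines looks like. I'm assuming server names prim_srv and sec_srv that already know about each other via sqlhosts and trusted-host entries; the names and the exact flow are a sketch to verify against the Administrator's Guide for your version:

```shell
# On the primary: register the secondary, then take a level-0 backup
onmode -d primary sec_srv
ontape -s -L 0

# On the secondary: physically restore that backup (no logical recovery),
# then join the pair; it catches up from the primary's logical logs
ontape -p
onmode -d secondary prim_srv

# Check replication state on either server
onstat -g dri
```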

Monday, August 9, 2010

Installing and Configuring Informix Connect on the Client

It is likely that some of the clients connecting to your Informix engine will run on a different server than the engine.  Enabling a remote server for Informix connections is a simple three-step process: set up some environment variables, install the free Informix Connect product and create a sqlhosts file.


We need to create an INFORMIXDIR to hold our Connect install.  This is very similar to what we did with the engine install, except I like to name the INFORMIXDIR /opt/informix-connect instead of just /opt/informix.

root> mkdir /opt/informix-connect-3.50.FC7
root> chown informix:informix /opt/informix-connect-3.50.FC7
root> ln -s /opt/informix-connect-3.50.FC7 /opt/informix-connect
root> ls -ltr /opt
total 4
drwxr-xr-x  2 informix informix 4096 Aug  1 20:56 informix-connect-3.50.FC7
lrwxrwxrwx  1 root     root       30 Aug  1 20:56 informix-connect -> /opt/informix-connect-3.50.FC7
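Once Connect is installed in there, the client-side environment and sqlhosts end up looking something like the sketch below. The server name blog_tcp, hostname and port are assumptions for illustration; they must match the DBSERVERNAME/DBSERVERALIASES and sqlhosts entries on the engine side.

```shell
# Environment for any user running client programs (e.g. in ~/.profile)
export INFORMIXDIR=/opt/informix-connect
export INFORMIXSERVER=blog_tcp
export INFORMIXSQLHOSTS=$INFORMIXDIR/etc/sqlhosts
export PATH=$INFORMIXDIR/bin:$PATH

# $INFORMIXDIR/etc/sqlhosts on the client: one line per engine
# servername  nettype   hostname         servicename/port
# blog_tcp    onsoctcp  db1.example.com  9088
```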

Thursday, August 5, 2010

Autoextenze: Dbspace Enhancement for Informix. No Gimmick . . . Just Science!

Recently Google alerted me about new Informix stuff on the web; this time it was a question on stackoverflow.com from an Oracle DBA who was also administering an Informix engine and wanted to know if Informix had autoextend functionality for chunks similar to what Oracle has for datafiles in a tablespace. It was late, but I was up playing some poker trying to kill time before a 2 AM maintenance, so I replied.

I answered with "No, there isn't anything built into the engine, but you should be able to use the SQL Admin API and the Informix Scheduler to accomplish the same thing." I've never implemented anything like this and I don't know if anyone else has, but it seemed like an interesting project so I gave it a shot.

My Informix Autoextend functionality consists of:
  • A Stored Procedure that is run by the Scheduler and adds a chunk to a dbspace if needed
  • A Stored Procedure that executes the ALARMPROGRAM to notify us when a chunk is added
  • A modification to alarmprogram.sh to capture my new Event Classes for autoextend
  • A Task run by the Scheduler for checking each dbspace I want to autoextend
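The heart of the first procedure is the 'add chunk' command of the SQL Admin API. Here is a stripped-down, hypothetical sketch, not my actual code: the threshold, chunk path and sizes are made-up values, and a real version needs error handling.

```sql
-- Runs in the sysadmin database. Adds a 1 GB chunk (task() sizes are in KB)
-- when the free pages in a dbspace fall below a threshold.
CREATE PROCEDURE check_dbspace(p_dbspace VARCHAR(128))
   DEFINE v_free INT;

   -- Total free pages across all chunks in the dbspace
   SELECT SUM(c.nfree) INTO v_free
     FROM sysmaster:sysdbspaces d, sysmaster:syschunks c
    WHERE d.dbsnum = c.dbsnum
      AND d.name = p_dbspace;

   IF v_free < 25000 THEN
      -- For cooked chunks the file must already exist with the
      -- right ownership and permissions
      EXECUTE FUNCTION task('add chunk', p_dbspace,
         '/home/informix/chunks/' || p_dbspace || '.next', '1048576', '0');
   END IF;
END PROCEDURE;
```

The Scheduler side is then just a row in sysadmin:ph_task telling the engine to run check_dbspace() for each dbspace on an interval.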

Tuesday, August 3, 2010

Configuring Informix Virtual Processors

Virtual Processor (VP) is the technical term for an Informix process. Informix forks off a handful of these VPs at engine startup to take advantage of multiple CPUs or cores and increase parallelism. Each VP is multithreaded allowing Informix to efficiently use the CPU time each process is awarded by the OS, letting ready threads do work while other threads are busy waiting on something else to finish.

There are multiple types of VPs, each performing a specific job.  Some VPs are configurable (i.e. the number of VPs and what CPU they should run on, etc.) through the VPCLASS ONCONFIG parameter and some are not.  The 2 VPs you need to be concerned about right now are:
  • CPU - Runs all session threads and some system threads.  Runs KAIO (Kernel Asynchronous I/O) if enabled. This process will consume most of the CPU cycles given to Informix by the OS and is configurable.
  • AIO - Performs all I/O except for physical or logical logging if KAIO is disabled. Performs I/O to cooked chunks if KAIO is enabled and is configurable.
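In the ONCONFIG these are set with VPCLASS lines like the sketch below. The counts are illustrative starting points, not recommendations; size them to your core count and workload.

```
VPCLASS cpu,num=4,noage    # 4 CPU VPs, priority aging disabled
VPCLASS aio,num=2          # 2 AIO VPs; add more if KAIO is off and chunks are cooked
```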
Initial CPUVP Configuration

Sunday, August 1, 2010

IIUG Insider #121 - New Informix Packaging

I was up late playing some poker on the interwebs and the IIUG Newsletter hit my Inbox. I was knocked out of the tournament just shy of the money shortly thereafter. Gary, you owe me the $1.75 I was going to win from my freeroll because the two events are clearly related.

In case you don't get these delivered to your Inbox (did you know IIUG membership is free?) the new Insider can be found on the IIUG Website and contains a super awesome writeup (much better than my feeble attempt) on the new Informix packaging.

Wow! Look at all that stuff that comes with Growth and Ultimate Editions now. I'm really happy to see Enterprise Replication included for no additional charge in the Choice, Growth and Ultimate Editions. Even awesomer is ER getting included in the free for production Innovator-C Edition.

If you have never used ER, let me just say it is freakin' awesome. The day we replaced our trigger based replication (and the in house application from the 9th circle of Hell called change_log) was a good day.

Saturday, July 31, 2010

Increasing Buffers and Reducing I/O with Informix

If you're following along with my posts on installing, initializing, adding space and creating databases and tables in Informix, then you're ready to start doing some basic performance tuning. You can improve performance dramatically by making some simple changes to the ONCONFIG and bouncing the engine. The first and most important change to make is to increase Buffers: this is where Informix caches all of the data you read in from disk. More data cached means less I/O, and less I/O means better performance. Since I/O will most likely be your current bottleneck, reducing it should yield the biggest performance gain.
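In 11.50 buffers are configured per page size with the BUFFERPOOL ONCONFIG parameter. A sketch, where the buffer count is a placeholder (250000 2K buffers is roughly 500 MB of cache, so size it to the RAM you can spare):

```
BUFFERPOOL size=2K,buffers=250000,lrus=8,lru_min_dirty=50.00,lru_max_dirty=60.00
```

Bounce the engine and watch the read-cache percentage in onstat -p climb as the pool warms up.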

How Informix Caches Data

Thursday, July 29, 2010

CWTL - Informix Post End of Service Support Options and Upgrade Best Practices

The next Informix Chat With The Labs has been scheduled for Thursday August 26, 11:30 AM Eastern Time.

Join in and listen to Albert Martin (Director, WW Data Management Support), Charles Gonsalves (Sr. Manager, WW Informix Support), Ron Privett (Sr. Engineer, Americas Down Systems and Diagnostics Team) and Chris Fender (Program Director, WW IM S&S Strategy and Sales) talk about your support options after your Informix version reaches End of Service and the best practices for upgrading to a new version of Informix.

Or as IBM puts it:

When a software version reaches its End of Service date - when support is no longer available through standard Software Subscription and Support – there is no need to tackle support issues on your own. Discover new flexible options available to you when your Informix software reaches End of Service. Available for Informix versions 7.3, 9.4 and 10, these options will provide you with access to expert IBM Support professionals and will help you maintain business continuity until you upgrade to the latest version of Informix. We will also cover best practices when upgrading from Informix 10 to Informix 11.50; including upgrading standalone and replicated environments.

Register for this free event today and grab your presentation materials before the webcast.

Wednesday, July 28, 2010

Creating Dbspaces, Databases, Tables and Indexes in Informix

I guess at some point you'd like to actually create a database and some tables with indexes to hold some data.  This logging and backup stuff is fun and all, but what good is a database without data?  We know that database objects go in dbspaces, and we have 2 of those already, the rootdbs and llogdbs01, but we don't want to put our data in these dbspaces.  If we have multiple disks at our disposal we would put the chunks that make up these dbspaces on different disks to minimize I/O contention and increase parallelism, and we would want to do the same thing with the dbspaces that hold our tables and indexes.  Even if you are limited to one or two disks it is a good idea to keep your data in separate dbspaces.  First, our llogdbs01 dbspace only has 11 pages available, so we can't fit much data in there; second, you want to avoid filling up the rootdbs so Informix has space for housekeeping.

Create a new dbspace for data called datadbs01 using the steps we used to create the llogdbs01 dbspace.  I'll create this dbspace with one 4 GB chunk; feel free to create a smaller or larger dbspace depending on the amount of storage you need/have available.  We will use the default page size of 2K, even though sizes ranging from 2K to 16K are available; that is a topic for another day.

For performance and data integrity reasons do not use journaled filesystems (EXT3, EXT4, ZFS, etc.) for any of your cooked chunks, including the rootdbs and llogdbs01 dbspace chunks we created earlier.  Art Kagel explains why in his blog post New Journaled Filesystem Rant.

informix> touch /home/informix/chunks/DATADBS01.01
informix> chmod 660 /home/informix/chunks/DATADBS01.01
informix> onspaces -c -d datadbs01 -p /home/informix/chunks/DATADBS01.01 -o 0 -s 4194304
Verifying physical disk space, please wait ...
Space successfully added.

** WARNING **  A level 0 archive of Root DBSpace will need to be done.
Take a Level 0 backup like we're asked to and verify the dbspace was added by running onstat -d:

informix> ontape -s -L 0 -d
informix> onstat -d
IBM Informix Dynamic Server Version 11.50.UC7IE -- On-Line -- Up 01:06:04 -- 144148 Kbytes

address  number   flags      fchunk   nchunks  pgsize   flags    owner    name
4ae5b808 1        0x40001    1        1        2048     N  B     informix rootdbs
4ae5bc80 2        0x40001    2        1        2048     N  B     informix llogdbs01
4be72410 3        0x40001    3        1        2048     N  B     informix datadbs01
 3 active, 2047 maximum

address  chunk/dbs     offset     size       free       bpages     flags pathname
4ae5b968 1      1      0          1048576    518077                PO-B- /home/informix/chunks/ROOTDBS.01
4ae5bde0 2      2      0          1048576    11                    PO-B- /home/informix/chunks/LLOGDBS01.01
4be72570 3      3      0          2097152    2097099               PO-B- /home/informix/chunks/DATADBS01.01
 3 active, 32766 maximum

NOTE: The values in the "size" and "free" columns for DBspace chunks are
      displayed in terms of "pgsize" of the DBspace to which they belong.

Expanded chunk capacity mode: always
Easy peasy lemon squeezy.

Now let's create an unbuffered logging database named 'blog' in datadbs01 using dbaccess, a curses-based Informix utility for running SQL statements.  Typically I would use dbaccess in Menu mode for something like this, but to make the examples clearer I will use dbaccess in Interactive Non-Menu mode (by supplying a hyphen as the second command line argument to dbaccess).

informix> dbaccess - -
> create database blog in datadbs01 with log;

Database created.
I have successfully created the blog database and am currently connected to it.  When I told Informix to create blog in datadbs01, all of the housekeeping stuff for the blog database was put in datadbs01, and datadbs01 will be the default dbspace for any tables, indexes, etc. that I create.

Now let's create some tables.  How about some tables to record blog entries for multiple blogs; call them blog and blog_post.  For this type of stuff, things that involve more typing and are a little more complicated, I like to have dbaccess execute a .sql file that I prepare beforehand.

informix> vi create_tabs.sql

create table blog (
   id             serial not null,
   name           varchar(255)
) in datadbs01 extent size 256 next size 256 lock mode row;

create table blog_post (
   id             serial not null,
   blog_id        integer not null,
   title          varchar(255)
) in datadbs01 extent size 1024 next size 1024 lock mode row;

informix> dbaccess blog create_tabs.sql

Database selected.

Table created.

Table created.

Database closed.

If you're familiar with databases, create_tabs.sql should look familiar enough except for the Informix specific stuff.  The Serial data type is an auto-incrementing integer, in datadbs01 puts the table in the datadbs01 dbspace, extent size and next size specify the first and next extent sizes in KB and finally lock mode row enables row level locking on this table vs. page level locking.

Looks like we need some referential constraints and some indexes on those tables.  You could create these as part of the table creation, but if you do, Informix will give the supporting indexes funky names, and if you have to do something with them later it will be a pain.  Because of this I like to manually create the indexes first and create the primary and foreign keys on top of them afterwards.

informix> vi create_idx.sql

create unique index blog_pk on blog (id) in datadbs01;
alter table blog add constraint primary key (id)
    constraint blog_pk;

create unique index blog_post_pk on blog_post (id) in datadbs01;
alter table blog_post add constraint primary key (id)
    constraint blog_post_pk;

create index blog_post_fk1 on blog_post (blog_id) in datadbs01;
alter table blog_post add constraint foreign key (blog_id)
    references blog (id)
    constraint blog_post_fk1;

informix> dbaccess blog create_idx.sql

Database selected.

Index created.

Table altered.

Index created.

Table altered.

Index created.

Table altered.

Database closed.
I created 2 unique indexes with 2 primary keys on top of them and 1 non-unique index with a foreign key on top of it.  This way requires a little more typing, but it is how I like to do it, and if you want to put the underlying referential constraint indexes in a non-default dbspace and give them names, then you should do it this way too.

Time to put some data in these tables.  I'm going to use an INSERT INTO ... VALUES statement to put data into the blog table and a dbaccess LOAD FROM ... INSERT INTO statement to load data from a pipe delimited file into blog_post.

informix> vi load.sql

insert into blog (id, name) values (0, "Informix DBA");
insert into blog (id, name) values (0, "Informix technology");

select * from blog;

informix> dbaccess blog load.sql

Database selected.

1 row(s) inserted.

1 row(s) inserted.

id    1
name  Informix DBA

id    2
name  Informix technology

2 row(s) retrieved.

Database closed.

informix> vi load.unl

0|1|Creating Dbspaces, Databases, Tables and Indexes in Informix|
0|1|ZOMG, FYI - IM Informix Tech Support FTW|
0|1|Informix Backup and Restore - The Bare Minimum|
0|2|A bug can undermine your troubleshooting|
0|2|Informix Editions revisited|
0|2|New Informix editions: Bargain time?|

informix> dbaccess blog -

Database selected.

> load from load.unl insert into blog_post;

6 row(s) loaded.

> select * from blog_post;

id       1
blog_id  1
title    Creating Dbspaces, Databases, Tables and Indexes in Informix

id       2
blog_id  1
title    ZOMG, FYI - IM Informix Tech Support FTW

id       3
blog_id  1
title    Informix Backup and Restore - The Bare Minimum

id       4
blog_id  2
title    A bug can undermine your troubleshooting

id       5
blog_id  2
title    Informix Editions revisited

id       6
blog_id  2
title    New Informix editions: Bargain time?

6 row(s) retrieved.

The last thing I want to do is show you how to add a chunk to an existing dbspace, something you might want to do if you need more space for your data.  This is very similar to creating a dbspace and is done via the same onspaces command:

informix> touch /home/informix/chunks/DATADBS01.02
informix> chmod 660 /home/informix/chunks/DATADBS01.02
informix> onspaces -a datadbs01 -p /home/informix/chunks/DATADBS01.02 -o 0 -s 4194304
Verifying physical disk space, please wait ...
Chunk successfully added.

informix> onstat -d

IBM Informix Dynamic Server Version 11.50.UC7IE -- On-Line -- Up 00:26:37 -- 144148 Kbytes

address  number   flags      fchunk   nchunks  pgsize   flags    owner    name
4ae5b808 1        0x60001    1        1        2048     N  B     informix rootdbs
4ae5bb88 2        0x40001    2        1        2048     N  B     informix llogdbs01
4bd65d28 3        0x60001    3        2        2048     N  B     informix datadbs01
 3 active, 2047 maximum

address  chunk/dbs     offset     size       free       bpages     flags pathname
4ae5b968 1      1      0          1048576    518259                PO-B- /home/informix/chunks/ROOTDBS.01
4ae5bce8 2      2      0          1048576    11                    PO-B- /home/informix/chunks/LLOGDBS01.01
4bb51cc8 3      3      0          2097152    2095214               PO-B- /home/informix/chunks/DATADBS01.01
4bdb73e0 4      3      0          2097152    2097149               PO-B- /home/informix/chunks/DATADBS01.02
 4 active, 32766 maximum

NOTE: The values in the "size" and "free" columns for DBspace chunks are
      displayed in terms of "pgsize" of the DBspace to which they belong.

Expanded chunk capacity mode: always

Carlton Doe has a good book on Informix administration available on Amazon.

Tuesday, July 27, 2010

ZOMG, FYI - IM Informix Tech Support FTW

I work in a relatively small office where everyone is at most an 18 second walk away from each other.  Care to guess what the most common form of communication between us is?  Instant messaging.  Is it because we're too lazy to get up from our desks or we're nerdy introverts who resist human interaction (gasp)?  Well, that may be some of it, but the real reason is that IM is often the most effective form of communication for what we need to talk about.  Sure, if we need to talk about something that is going to involve a whiteboard then IM is not the best choice, but if someone is asking me about the command line arguments to start an application with verbose output and a user specified config file or the whereabouts of a log file on a server then I'm going to prefer to type that information out.

Informix Technical Support has a new support offering that can take advantage of the power of IM and render the Phonetic Alphabet obsolete.

Support over the Phone: Can you run onstat dash g as in Golf space r as in Romeo q as in Quebec m as in Mike?
Me: Did I catch a Niner in there?

Support over Chat: Can you run onstat -g rqm and email me the output?
Me: It is in your inbox.

The Informix Technical Support Chat web page describes the new offering like so:

Technical Support Chat allows you to quickly interact 1:1 with one of our Level 2 Support Professionals online instead of over the phone. You can discuss your request and conveniently share command strings and URLs. At any time during the chat, you or the Support Professional can choose to resolve the issue over the phone.

I recently had the opportunity to take advantage of this new service and the experience was as advertised.  My Support Engineer (shout out to my man Jay P., thanks for the help) was able to ask me to run commands and collect diagnostics info and I was able to give accurate feedback instantly, either through the IM client or via email for the bigger output.  This was much easier than attempting to describe what I'm seeing over the phone to Jay and I'll definitely be using Technical Support Chat again.


Monday, July 26, 2010

Informix Backup and Restore - The Bare Minimum

Christian Szell: Is it safe?... Is it safe?

Babe: You're talking to me?

Christian Szell: Is it safe?

Babe: Is what safe?

Christian Szell: Is it safe?

Babe: I don't know what you mean. I can't tell you something's safe or not, unless I know specifically what you're talking about.

Christian Szell: Is it safe?

Babe: Tell me what the "it" refers to.

Christian Szell: Is it safe?

Babe: Yes, it's safe, it's very safe, it's so safe you wouldn't believe it.

Christian Szell: Is it safe?

Babe: No. It's not safe, it's... very dangerous, be careful.

- Marathon Man, 1976

Hopefully you will never need to restore from a backup, but chances are you will, and it can be as scary as dental torture.  Disks fail, servers fail, disasters destroy data centers, users accidentally delete or update the wrong data, and other generally sucky things happen to the databases we administer.  Day in and day out, database backups are annoying little things that suck up I/O resources and storage space only to be discarded when the next backup comes along, but there is no better feeling than being able to restore the last backup and get your system back online with minimal data loss after a failure.  There is no worse feeling than needing a backup and not having one.  The good news is that basic backup and restore is pretty simple and straightforward in recent versions of Informix.

Saturday, July 24, 2010

Informix Quick Start Guide

I added a page to my blog called Informix Innovator-C Quick Start Guide to capture the steps required to install, initialize and administer Informix that I've been burying in my blog posts.

The information is relevant to most versions of Informix and not just Innovator-C as the name implies.

I'll continue to add to this page and hopefully it will be helpful to the people using Informix for the first time.

Friday, July 23, 2010

Blogging About Logging - Informix Physical and Logical Logging

What rolls down stairs alone or in pairs
Rolls over your neighbor's dog?

What's great for a snack and fits on your back?
It's Log, Log, Log!

It's Log, Log.  It's big, it's heavy, it's wood.
It's Log, Log.  It's better than bad, it's good!

Everyone wants a Log!  You're gonna love it, Log!
Come on and get your Log!  Everyone needs a Log!

- Ren and Stimpy

Logical Logging

Each time you insert, update or delete a row, a logical log record containing information about what was changed and how is written to the logical log (if transaction logging is enabled for your database. You do have this enabled, right?).  Each time you create or alter a database object (table, index, sequence, etc.), a logical log record is written to the logical log (even if transaction logging is disabled).  Each time you run a backup, a checkpoint happens, you add a chunk or dbspace, or a new extent is allocated, a logical log record is written to the logical log.
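You can watch all of this happening from the command line; a quick sketch of the usual suspects:

```shell
onstat -l     # log usage per logical log; flags: U=used, C=current, B=backed up
onmode -l     # manually switch to the next logical log
ontape -a     # back up full logical logs so they can be reused
```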

Thursday, July 22, 2010

IBM Redbooks - Migrating from Microsoft SQL Server to IBM Informix

Just noticed a new IBM Redbook was published today and is available for download here.

I just downloaded it, but have not read it yet.  I will be taking a look at it today.

Here is the abstract from the IBM Redbooks website:

In this IBM® Redbooks® publication, we discuss considerations, and describe a methodology, for transitioning from Microsoft® SQL Server 2008 to the Informix® Dynamic Server. We focus on the topic areas of data, applications, and administration, providing information about the differences in features and functionality, including the data types, data manipulation language, data definition language, and stored procedures. Understanding the features and functionality of the two products assists you in developing a migration plan. 

Tuesday, July 20, 2010

The New Phone Book's Here! Informix Repackaging and Repricing - Part Deux

Page 73 - Johnson, Navin R.! I'm somebody now! Millions of people look at this book everyday! This is the kind of spontaneous publicity - your name in print - that makes people. I'm in print! Things are going to start happening to me now.

- The Jerk, 1979

IBM has tweaked the repackaging of Informix that they did on May 25th of this year and has put together a matrix showing how the newly priced Informix products stack up against similarly priced Microsoft and Oracle products.

The big news in repackaging is that Ultimate-C for Windows and Mac are gonzo and Innovator-C is now your free Informix engine of choice for all operating systems, not just Linux.

But remember, for almost everyone needing a free database engine this will be OK.  I broke down each feature that is excluded from Innovator-C here, here, here and here and there aren't any major show stoppers.

Monday, July 19, 2010

Initializing Informix Innovator-C on Linux

If you are new to Informix, say this over and over and over again.  Then say it one more time.

Engine initialization is something you do once.  Engine initialization is not the same as starting the engine.

When you initialize an engine you give birth to a new instance.  A previous instance that was occupying the same location as this new instance is overwritten with the data for the new instance.  If you have unintentionally reinitialized an engine, the only way to bring it back is from a restore of your last backup.

oninit -iy instructs Informix to initialize the engine without asking if you're sure you'd like to wipe out any existing instances.  I don't recommend using the -y option when running oninit; your brain will get used to typing it and you can accidentally wipe out your whole instance without realizing it until it is too late.  I feel queasy just typing that command, even in a blog.
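To keep the distinction straight, here are the three invocations side by side (shown for reference; do not run -i against an instance you care about):

```shell
oninit        # start an existing instance -- normal, day-to-day operation
oninit -i     # INITIALIZE a new instance, destroying whatever is there, after asking
oninit -iy    # same, but without asking -- avoid building this habit
```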

More Free Stuff - Training by Art Kagel from Advanced DataTools

In the spirit of Informix and free stuff, Advanced DataTools continues their Informix Performance Tuning Webcast Series by Art Kagel.  If you want to attend this free 2-hour webcast you have to act fast because the event is tomorrow, July 20 at 2:00 PM EST, and registration is open now.

The focus is on performance tuning historical data collection:
  • What is worth collecting over time?
  • Collection frequency
  • Now that I have the data, what do I do with it?
Again, this is a free webcast but you have to register here.  I registered earlier today and it was quick and painless, only my name and email address were required.

Thursday, July 15, 2010

Informix Chat with the Labs Replay

IBM hosted the monthly Informix Chat with the Labs call a few days ago.  The topic was Best Practices for Running Informix in VMWare vSphere Virtual Environments.

If you missed it, you can find the audio replay and presentation materials for this and past CWTL sessions at developerWorks.

Some of the topics you will find there:
  • Best Practices for Using Auto-Update Statistics
  • What's New in Informix 11.50.xC6
  • HDR Best Practices and Performance Tuning
  • Informix Availability Best Practices
  • Update Anywhere Clusters in 11.50
  • Informix 11.50 Continuous Availability Feature
IBM has also been translating the CWTL to Spanish. ¡Qué bueno!

If there is a topic you would like to see chatted about but you don't know who to ask, let me know and I'll pass along the request.

Wednesday, July 14, 2010

You Know What Else is Free?

Membership to the International Informix Users Group is free.

If you are new to Informix and are not a member of the IIUG I strongly suggest you join today.
What you get for a $0 investment and 10 minutes of your time:
  • Monthly newsletter to keep you up to date with everything Informix
  • Access to the Software Repository where you can download Member supplied DBA utilities, scripts and applications and Free IBM Software
  • Access to the SIGs, a collection of message boards dedicated to Informix where you can ask the community for answers and technical advice
  • Discounts on conference registration
  • Access to presentation materials from previous conferences
  • Access to the Job Board

Tuesday, July 13, 2010

Installing Innovator-C on Linux

You want to take advantage of this new free for production database engine, Innovator-C, but you have no experience with Informix and don't know where to begin.

If the above statement describes you, then you have come to the right place as I will attempt to describe in great detail (maybe too much detail) the Informix engine installation process.

What you will need:

  • A machine running a supported version of Linux as documented in the machine notes. Most major and recent Linux distributions are supported (RHEL4 and 5, SLES 10 and 11, Ubuntu 6.06.01 and Debian 4.0). I am using 64 bit Red Hat Enterprise Linux 5
  • root access. The install is performed by root and root will be needed to create the informix user and informix group
  • Innovator-C (32 bit version for 32 bit OS, 64 bit version for 64 bit OS), found here
Get the system ready

As root, create the informix group, create the informix user as part of the informix group and set the password for the informix user.

root> groupadd informix
root> useradd -g informix informix
root> passwd informix

Sunday, July 11, 2010

Innovator-C Excluded Features - The Final Installment

Here is the last installment of the features that are not available with the Innovator-C free version of Informix.

No Multiple triggers and views

I'll admit it, I don't really understand what this feature is. Triggers and views are certainly available in Innovator-C, but what does it mean to disallow multiple triggers and views? Does it mean you can't have multiple Insert triggers on the same table? Probably. If so, work around this limitation by creating a stored procedure that does what you want on an Insert and create an Insert trigger that calls this procedure. I can't find anything in the Informix docs that defines Multiple Triggers or Multiple Views. When I find out which feature is excluded here I'll let you know.

No Web feature service

I'm assuming this is the Web Datablade. Never had a need for it and I'm assuming people needing an Innovator-C type database won't miss it either.

No Node DataBlade

Same goes for the Node Datablade, which allows you to use the Node datatype (right?). I won't cry about this feature being left out.

Saturday, June 5, 2010

Innovator-C and What Ain't Free, Part Three

Picking up where we left off yesterday...

No Parallel backup/restore

If you're using Onbar and a Storage Manager to manage your backups you can back up dbspaces in parallel and reduce your backup times.

This is only beneficial if both of the following are true:

1. Your Storage Manager is writing to multiple media (i.e. you're backing up to multiple disk drives or tapes) in parallel.

2. Your Dbspaces are on different disk drives so reading 2 dbspaces in parallel actually reads from 2 drives in parallel.

And this really only comes into play if your databases are big enough to actually make you concerned about backup times.

My guess is that a lot of adopters of Innovator-C will simply use ontape to disk as a backup solution, probably with the following command

ontape -s -L 0 -t STDIO | gzip -c > backup.gz
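The matching restore from that compressed archive would look something like this (a sketch, assuming the same STDIO support used for the backup):

```shell
gunzip -c backup.gz | ontape -r -t STDIO
```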

No Partitioning

Partitioning is splitting up the data in a table, usually by some kind of expression.

Something along the lines of put rows for a table with column1 = 1 in partition1, column1 = 2 in partition2, column1 = 3 in partition3 and everything else in partitionRemainder.

One benefit of doing this is partition elimination for queries. If your SQL where clause has 'and column1 = 2' in it, Informix knows it only has to look at the rows in partition2 and won't scan partitions 1, 3 and Remainder, and this can improve performance of your big daddy reporting type queries.
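The scheme described above would be declared with expression-based fragmentation, roughly like this (table, column and dbspace names are made up for illustration):

```sql
-- hypothetical expression-based fragmentation across dbspaces
CREATE TABLE mytable (
    column1 INTEGER,
    data    VARCHAR(64)
)
FRAGMENT BY EXPRESSION
    column1 = 1 IN dbspace1,
    column1 = 2 IN dbspace2,
    column1 = 3 IN dbspace3,
    REMAINDER IN dbspace4;
```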

Friday, June 4, 2010

Excluded Innovator-C Features - Dude, It Will Be Ok

Back to the breakdown of the excluded features in the free for production Innovator-C.

No Private Memory Cache for VP

I enabled this on my not-for-free production environments when the feature came out and I didn't get any kudos for improving performance. It is a nice feature that probably benefits larger, more active engines than mine (and I consider my engines to be pretty active). All I'm saying is that you're not going to cry yourself to sleep on your huge pillow because this feature was left out of Innovator-C.

What is going on with this feature is you can enable a memory cache for each CPUVP giving individual CPUVP processes their own chunk of memory to allocate to the threads they are running. How does this help? There is no coordination or locking to be done with the other CPUVPs when you are dealing with private memory cache because only one CPUVP is manipulating the private cache (or at least this is how I understand this feature to work at the most basic level).

No Direct I/O for Cooked Files

If you are using cooked files (filesystem files) for your dbspace chunks to hold your data, you cannot enable Direct I/O for these cooked files in Innovator-C and this *can* result in an I/O performance decrease of, oh, I don't know, 5% or somewhere in that neighborhood.

Use raw devices and this feature exclusion is moot. Not a deal breaker IMO even if you do use cooked chunks. This is a relatively new feature and people were using cooked chunks without Direct I/O to do real stuff before it was added.


I don't know exactly what this feature is, so I guess I won't miss it. I thought it might mean I can't reference another database on another engine with the following syntax on server1

select * from dbname@server2:table1;

But I tried this on my Innovator-C install and it worked. Any explanations welcomed in the comments.

Unavailable Features in Innovator-C - Don't Panic

If you have been following IBM's recent announcement of Innovator-C, a free version of Informix for all supported platforms that can be used in production environments, you may have seen a list that looks something like this:

Maximum of 4 CPUVPs
Maximum of 2GB memory allocated to Informix
Maximum of 2 Enterprise Replication Admin nodes
1 Read/Write HDR Server or 1 Read/Write RSS Server
No DBSpace Prioritization during backup/restore
No Recovery Time Objective Policy
No Private Memory Cache for VP
No Direct I/O for Cooked Files
No Parallel Data Query (PDQ)
No High Performance Loader (HPL)
No Parallel Index Build
No Parallel backup/restore
No Partitioning
No Column Level Encryption
No Compression
No Label based access control (LBAC)
No Last commit concurrency
No Multiple triggers and views
No Web feature service
No Node DataBlade
No Auto-gather statistics during index build
No Point-in-time table recovery
No SQL Warehouse
No Shared Disk Secondary

That would be the excluded features list. At first glance you may be all WTF, but if you take a closer look at what is actually in the list you will see that IBM didn't really give away a crippled version of the engine for free. They simply eliminated some of the features that are only needed by very large engines and niche features.

Here is my take on the restrictions and excluded features:

Thursday, May 27, 2010

Need a Free Database?

On May 25th IBM announced they will be repackaging their Informix Dynamic Server offerings.

Enterprise Edition and Workgroup Edition are enjoying a well deserved retirement and have been replaced with two new options: Ultimate Edition and Growth Edition. Developer Edition, the unpaid intern, is still around.

There is a lot of good stuff going on here, previously for cost features now included at no additional cost and previously unavailable features in certain editions now included at no additional cost.

But I don't want to talk about that. I want to talk about the free stuff, the C versions.

IBM announced two additional editions, Innovator-C and Ultimate-C. I've read that C stands for Community, as in Community provided support I guess. The kind of community support one can find on comp.databases.informix and the IIUG SIGs.

These C versions are FREE and can be used in production environments. I couldn't believe it either until I saw it explicitly stated in the IBM License Information document:

1) IBM Base License

This Program has been provided to Licensee at no charge.

This License Information serves as Licensee's Proof of Entitlement (PoE).

Well hot damn. So what's the catch?

Community support is free, IBM support ain't. If you want to be able to call IBM if you have a problem or question you will have to purchase some support. This is pretty reasonable.

You can't redistribute the C versions without a distribution agreement and this will probably cost money. It is OK to install a free version in your data center and build a production application on top of it, it is not OK to bundle Informix with your application and distribute it for free.

And of course there are some restrictions on the features, resources and platforms available for the C versions.

Innovator-C Edition - Available for all platforms

Maximum of 4 CPUVPs (VPCLASS cpu,num=4)
Maximum of 2GB memory allocated to Informix (SHMTOTAL 2097152)
Maximum of 2 Enterprise Replication Admin nodes
1 Read/Write HDR Server or 1 Read/Write RSS Server
No DBSpace Prioritization during backup/restore
No Recovery Time Objective Policy
No Private Memory Cache for VP
No Direct I/O for Cooked Files
No Parallel Data Query (PDQ)
No High Performance Loader (HPL)
No Parallel Index Build
No Parallel backup/restore
No Partitioning
No Column Level Encryption
No Compression
No Label based access control (LBAC)
No Last commit concurrency
No Multiple triggers and views
No Web feature service
No Node DataBlade
No Auto-gather statistics during index build
No Point-in-time table recovery
No SQL Warehouse
No Shared Disk Secondary

Ultimate-C Edition - Available on Windows and Mac OS X only

Maximum of 16 CPUVPs (VPCLASS cpu,num=16)
Maximum of 16GB memory allocated to Informix (SHMTOTAL 16777216)
1 Read/Write HDR Server or 1 Read/Write RSS Server
No Compression
No Shared Disk Secondary
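A quick note on those SHMTOTAL numbers: the parameter is expressed in kilobytes, so the memory caps convert like this:

```shell
# SHMTOTAL is specified in KB; convert the edition memory caps
echo $((2 * 1024 * 1024))    # Innovator-C, 2 GB  -> 2097152
echo $((16 * 1024 * 1024))   # Ultimate-C, 16 GB  -> 16777216
```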

If you're not familiar with Informix and you've stumbled across this blog looking for alternatives to other free database engines, this may look like a scary list of things you don't get. I can assure you it is not. These restrictions are a very small subset of the Informix feature set and none of these feature or resource restrictions should prevent you from using Innovator-C or Ultimate-C as your free database engine of choice.

One very important limitation that does NOT exist is the limitation on data size. If you have a 5 petabyte database, Innovator-C and Ultimate-C will support it.

I should say that all of this is just my interpretation of the IBM documentation on the new Informix packaging, I am not an IBMer and I could be wrong on some things, but I don't think I am.

Thursday, May 13, 2010

OLTP Benchmark - Update Statistics

Thought I'd post a quick entry about something that you're probably thinking I should have done already, update statistics.

Well it didn't do much good, which is kind of what I expected since Informix now does some updating of statistics automatically when you build an index and these statistics are enough to let the optimizer pick the correct index for my benchmark SQLs.

Side note: I think this is a great addition to the engine; it sure seems to have reduced the number of "Did you update statistics" posts on C.D.I. But I wish there was a way to disable this functionality. When update statistics runs as part of the index build it runs slower than if I were updating my statistics manually (something I'm going to do anyway after my index build), and this extends the time that I am officially creating an index.

I guess if I have to find something to complain about in the engine I'm happy it is a small complaint like this one.

OLTP Benchmark - OPTOFC

I had always intended to try OPTOFC and the other settings that help reduce network traffic between the clients and the server, but upon the advice of Cesar and The Clown I decided to try them now. I'll admit that I didn't expect this setting to be the breakthrough setting I was looking for, but it sure was!

OPTOFC is an environment variable that you set on the client that optimizes the opening, fetching and closing of cursors by deferring the opening of a cursor until the first fetch is executed and automatically closing the cursor once the last row is returned to the client. In the benchmark I'm doing a lot of single row lookups; before, each lookup took 3 network messages: one to open the cursor, another to fetch the row and a third to close the cursor. With OPTOFC set this is reduced to 1 network message that opens the cursor, fetches the 1 row and closes it, and boy howdy did it do the trick, increasing my TPM to 9607! That's an improvement of 31.5%, not bad for a simple environment variable change.
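Enabling it is a one-liner in the client environment (the export assumes a Bourne-style shell), and the claimed improvement checks out with quick shell arithmetic against the earlier 7307 TPM run:

```shell
export OPTOFC=1   # enable open-fetch-close optimization for this client session

# sanity-check the claimed improvement: (9607 - 7307) / 7307
awk 'BEGIN { printf "%.1f%%\n", (9607 - 7307) / 7307 * 100 }'   # prints 31.5%
```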

10,000 TPM is now within reach.

Without the 3 cheat codes from the previous post I was able to reach a TPM of 7855, a 27% improvement.

Wednesday, May 12, 2010

OLTP Benchmark - Cheats

I am tired of not making any real progress on improving my TPM so it is now time to cheat. I call the next 3 things cheats because they are not something I would do in a production environment or they artificially improve the benchmark TPMs. People who do benchmarks for a living would probably consider this just part of playing the benchmarking game, so lets play.

Cheat #1 - Buffered Logging

I can't turn off logging altogether, but I can improve performance by using buffered logging instead of unbuffered logging. The downside to using buffered logging in a production environment: a user commits a transaction, the commit is written to the logical log buffer, and a response goes back to the user saying the transaction was committed. If the engine crashes before the logical log buffer is flushed to disk, that transaction will be rolled back when the engine is restarted and recovery happens, even though the user was told it was committed.

The upside to buffered logging is it is faster. Changing LOGBUFF to 256 and turning on buffered logging for the ptpcc database improved my TPM to 6785, an improvement of almost 10%.
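The logging-mode half of that change can be done with ondblog (a sketch, using the benchmark database name); the LOGBUFF change is an ONCONFIG edit plus an engine bounce.

```shell
ondblog buf ptpcc    # switch the ptpcc database to buffered logging
```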

Cheat #2 - Those Pesky Checkpoints

A funny thing happens around 11.5 minutes into my 15 minute benchmark: my performance falls off a cliff for 1 to 2 minutes while I flush all my data to disk during a checkpoint. It is nice to see processing not stop altogether as it would in the days of blocking checkpoints (I want to marry non-blocking checkpoints because I love them so much), but it would be nicer to not see them at all.

If we're just playing the benchmarking game all we need to do is make it through the 15 minute benchmark time without doing any writes (foreground, LRU or chunk). So far I don't have any foreground or LRU writes so I should just have to avoid any checkpoints during my run.

I chose to change the RTO_SERVER_RESTART ONCONFIG from 120 to 1800. This parameter tells the engine to fire a checkpoint when the time to recover from a failure reaches 30 minutes. Not too practical for a real world production OLTP engine, but great for a 15 minute benchmark.

I suppose I could also have disabled auto checkpoints and simply set the old school CKPTINTVL ONCONFIG parameter to 1800. I will do this if RTO_SERVER_RESTART starts firing checkpoints during our 15 minute run.
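Either knob can be adjusted without hand-editing the ONCONFIG file by using onmode -wf, assuming the parameter accepts dynamic updates on your version:

```shell
onmode -wf RTO_SERVER_RESTART=1800   # push the recovery time objective out to 30 minutes
onmode -wf CKPTINTVL=1800            # the old-school checkpoint interval, if you go that route
```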

TPM with no checkpoints = 7032 an improvement of 3.6% and I broke the 7000 TPM barrier!

Cheat #3 - Prime the buffer cache

After the last benchmark run I checked the last few lines of onstat -P to see how many of my 1.5 million buffers were actually filled with data. Turns out that 45.5% of my buffers had gone unused. This is just a factor of my benchmark time being short and not having enough time to populate the buffers with read in pages. I'm going to take advantage of all this space by prepopulating my buffers with the data I think the benchmark will need, specifically the customer and stock data pages.

If I run the following 3 SQLs before I run my benchmark I should populate the buffers with the data I want and hopefully get a better read cache hit rate.

select {+full (oorder)} count(distinct o_c_id) from oorder;
select {+full (customer)} count(distinct c_zip) from customer;
select {+full (stock)} count(distinct s_quantity) from stock;

Doing this increased my read cache hit % up to 99.81 from around 98.5 for a new TPM of 7307.

This also reopened an area I can tune. Now that the oorder, customer and stock tables are fully cached before the benchmark and we are no longer doing any writes during the benchmark, my I/O to physical disks has changed a lot.

dbspace4 had 0 I/Os during the last benchmark and dbspace3 had just a handful of I/Os. I should redistribute my data and indexes to take advantage of this.

Tuesday, May 11, 2010


In my ongoing quest to minimize I/O I will next try to reduce the number of index pages I have. 4 tables in the benchmark get rows inserted: history, order_line, oorder and new_order and the history table doesn't have any indexes on it.

This means the remaining tables have no inserts and I can set FILLFACTOR to 100 and put as much index data on a page that will fit and hopefully reduce my I/O.

For the 3 tables that currently do have indexes and are inserted into I will start by changing the FILLFACTOR from the default of 80 to 90 and see what happens. Over the course of the benchmark I increase the number of rows in these tables by 5%, so a more optimal FILLFACTOR may be 95%. I'll try that second and then try a FILLFACTOR of 100% for all tables just for fun.
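Changing FILLFACTOR means rebuilding the index; a hypothetical rebuild looks like this (the index, table and column names are placeholders, not the actual benchmark schema):

```sql
-- hypothetical index rebuild with a higher FILLFACTOR
DROP INDEX my_insert_table_ix;
CREATE INDEX my_insert_table_ix ON my_insert_table (col1) FILLFACTOR 90;
```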

FILLFACTOR 90% for my 3 insert tables - TPM 6190, meh, not much improvement over 6187.


FILLFACTOR 100% - TPM 6165

I think it is interesting that FILLFACTOR 100% outperformed FILLFACTOR 95%, other than that nothing of any real excitement going on here.

Back to the drawing board.

Monday, May 10, 2010


Add another parameter to the list of things that do not improve the benchmark TPM.

MAX_FILL_DATA_PAGES is a newer ONCONFIG parameter that when set to 1 will try to cram more rows with variable length data into a data page than it normally would. I thought this would be a good way to decrease I/O by reducing the number of data pages but I was wrong.

With MAX_FILL_DATA_PAGES set to 0 the engine will only store a row with variable length data on a page if the page has enough free space to allow the new row to grow to the maximum size. When MAX_FILL_DATA_PAGES is set to 1 the engine will store the row on a page if there is at least 10% of the page still free after inserting the new row.

One disadvantage to using this parameter is that a row with variable length data could need to grow, and that growth could cause the row to be split across multiple pages if there is not enough free room. As we all learned in elementary school, this is a very bad thing.

With MAX_FILL_DATA_PAGES set to 1 and a drop/recreate/reload of the ptpcc database (something I do before each benchmark, BTW) the TPM dropped to 6095.

Well, I'm not making much progress. I'm wondering if I can get over 7000 TPM at this point. I'm not too concerned yet. I've only done some easy ONCONFIG changes and some spreading of I/O over my extra disks. There are still a lot of things to try.

OLTP Benchmark - CPUVPs

I currently have 4 CPUVPs configured for my 2 dual core CPUs. Can I get more from my physical CPUs if I create more CPUVPs?

The last TPM running 4 CPUVPs was 6187.

If I increase CPUVPs to 6, what happens? I get a decreased TPM of 6042 is what happens.

Maybe I was too aggressive, what if I just increase CPUVPs to 5, just 1 more CPUVP than real processors? Well, I get a TPM of 6027 and I'm done trying to increase CPUVPs to increase performance. At least on my hardware for this benchmark anyway.

What if I reduce CPUVPs to 3? I get a TPM of 6173, better than when I increased CPUVPs but worse (but not by much) than running 4 CPUVPs on a machine with 4 CPUs.

For now I'm going to stick with my original configuration as far as number of CPUVPs goes.

What about affinity? Would it help to pin my 4 CPUVPs down to a physical CPU and prevent my oninit processes from jumping around from CPU to CPU?

Unfortunately, that didn't work too well either, giving me a TPM of 6083. Pffffffffffffft, what a bummer. No improvements today.

Friday, May 7, 2010

OLTP Benchmark - Spreading out the I/O

The Fastest DBA Contest benchmark was run against a server with only 1 disk so spreading out the I/O over multiple dbspaces wouldn't do too much good. My server has access to multiple RAID10 sets that are made up of 6 to 10 drives each so I think I can get some benefit from spreading out the I/O. dbspace1 has 6 drives, dbspaces 2 and 3 have 8 drives and dbspace 4 has 10 drives.

After the last benchmark I ran the following SQL against the sysmaster database to identify the I/O for each table and index

select
    tabname::char(20) tab,
    (pagreads + pagwrites) io,
    pagreads,
    pagwrites
from
    sysptprof
where
    dbsname = "ptpcc" and
    tabname not matches ("sys*")
order by
    (pagreads + pagwrites) desc;


tab                             io    pagreads   pagwrites

stock                       833785      417350      416435
customer                    226990      134711       92279
order_line                   75287       42206       33081
stock_pk                     45504       45504           0
oorder_uix1                  36457       23932       12525
oorder                       29367       26773        2594
order_line_pk                27214       11809       15405
oorder_pk                    19565       16879        2686
customer_pk                  17865       17865           0
new_order_pk                  8496        6196        2300
district                      4793        4745          48
customer_ix1                  3894        3894           0
item                          3773        3773           0
new_order                     1979         773        1206
history                       1522          12        1510
item_pk                        815         815           0
warehouse                       11           5           6
district_pk                      7           7           0
warehouse_pk                     1           1           0
Using this information I can place my tables and indexes in the appropriate dbspaces to spread the reads and writes across my disks evenly (giving the RAID10 set with 10 disks more I/Os to do and the RAID10 set with only 6 disks less I/Os to do) and hopefully improve my benchmark TPM.

Here is how I chose to distribute my data

stock dbspace4
customer dbspace3
order_line dbspace2
stock_pk dbspace1
oorder_uix1 dbspace2
oorder dbspace2
order_line_pk dbspace1
oorder_pk dbspace1
customer_pk dbspace2
new_order_pk dbspace1
district dbspace2
customer_ix1 dbspace1
item dbspace2
new_order dbspace1
history dbspace2
item_pk dbspace1
warehouse dbspace2
district_pk dbspace1
warehouse_pk dbspace2
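Moving an existing table or index into its new dbspace can be done with ALTER FRAGMENT; a sketch using the stock table from the plan above:

```sql
-- move the stock table and its primary key index per the distribution plan
ALTER FRAGMENT ON TABLE stock INIT IN dbspace4;
ALTER FRAGMENT ON INDEX stock_pk INIT IN dbspace1;
```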

Making these changes and rerunning the benchmark gives us a TPM of 6187, an improvement of almost 5%. Not exactly the barn burning results I was hoping for, most likely because we already reduced our I/O by increasing our buffers, but it is an improvement.

OLTP Benchmark - Buffers

What is the first thing everyone does when tuning an engine for the first time? Check the buffers and increase them if they haven't been changed from the default setting. We want to minimize I/O because I/O sucks and takes a long time. Putting as much likely-to-be-used-again-soon data into cache as possible is a good way to minimize I/O.

Our baseline was 3651 TPM with buffers set to 50000. I'm sure we can improve on that number if we reduce the number of disk reads and writes. Taking advantage of the fact that this is an OLTP benchmark I can configure nearly all of the available memory on the box to buffers (3GB or 1572864 2K buffers) and configure just enough memory to the virtual segment to handle all of the sessions (250 MB).
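In 11.50 the buffer pool is sized with the BUFFERPOOL ONCONFIG parameter; a 2K-page pool of 1,572,864 buffers would look something like this (the LRU settings here are illustrative, not from my actual config):

```
BUFFERPOOL size=2K,buffers=1572864,lrus=16,lru_min_dirty=60,lru_max_dirty=70
```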

Running the benchmark again I get the following results

2010-05-07 16:35:07: benchmark - 100 terminals running for 900 seconds
2010-05-07 16:35:07:                       sql        total          min          max          avg          num
2010-05-07 16:35:07:          dbSelDistOrderId   366.158430     0.000592     3.143735     0.103085         3552
2010-05-07 16:35:07:             dbSelDistPymt  1435.361208     0.000605     0.837176     0.037623        38151
2010-05-07 16:35:07:             dbSelCustById  2255.096493     0.000784     1.295496     0.059110        38151
2010-05-07 16:35:07:          dbSelNewestOrder   595.117024     0.000480     5.193018     0.172648         3447
2010-05-07 16:35:07:             dbSelCustData   150.336596     0.000696     0.435675     0.038957         3859
2010-05-07 16:35:07:           dbSelOrderLines    69.367385     0.000373     0.499633     0.020124         3447
2010-05-07 16:35:07:          dbSelCustByOrder  1348.661282     0.000582     0.652879     0.038489        35040
2010-05-07 16:35:07:               dbUpdOOrder   677.845172     0.000256     0.566081     0.019345        35040
2010-05-07 16:35:07:                 dbUpdDist   763.428072     0.000250     0.676018     0.019180        39803
2010-05-07 16:35:07:                dbSelOrder   135.996560     0.000631     0.804294     0.039454         3447
2010-05-07 16:35:07:             dbUpdDistPymt  4233.915816     0.000257     8.117577     0.110978        38151
2010-05-07 16:35:07:                dbUpdStock  7224.883937     0.000263     0.797545     0.019243       375455
2010-05-07 16:35:07:            dbSelCustCount   150.360743     0.000795     1.164771     0.069676         2158
2010-05-07 16:35:07:             dbUpdWhsePymt  4263.935629     0.000254     8.083272     0.111765        38151
2010-05-07 16:35:07:   dbUpdCustBalanceAndData    71.946746     0.000414     0.333886     0.018644         3859
2010-05-07 16:35:07:             dbInsHistPymt   728.578052     0.000265     0.503783     0.019097        38151
2010-05-07 16:35:07:          dbUpdCustBalance   666.788779     0.000283     0.656385     0.019444        34292
2010-05-07 16:35:07:        dbSelCustByNameOrd    94.072595     0.000762     0.373746     0.046318         2031
2010-05-07 16:35:07:             dbSelNewOrder  2017.488472     0.003504     5.246324     0.057577        35040
2010-05-07 16:35:07:             dbSelCustWhse  5384.940138     0.000726     6.392744     0.135290        39803
2010-05-07 16:35:07:          dbSelCustByIdOrd    82.164412     0.000758     0.852723     0.058026         1416
2010-05-07 16:35:07:            dbUpdOrderLine  1070.116667     0.000348     1.189829     0.030540        35040
2010-05-07 16:35:07:                dbSelStock 19846.689419     0.000703     1.215853     0.052860       375455
2010-05-07 16:35:07:         dbSelOrderLineAmt  1375.268063     0.000681     0.623969     0.039249        35040
2010-05-07 16:35:07:             dbSelWhsePymt  1448.440517     0.000599     0.677760     0.037966        38151
2010-05-07 16:35:07:            dbInsOrderLine  7268.765215     0.000281     0.920966     0.019360       375455
2010-05-07 16:35:07:               dbInsOOrder   764.685358     0.000281     0.520975     0.019212        39803
2010-05-07 16:35:07:             dbDelNewOrder   770.737759     0.000263     7.087722     0.021996        35040
2010-05-07 16:35:07:                 dbSelDist  5383.367946     0.000620     6.788749     0.135250        39803
2010-05-07 16:35:07:             dbInsNewOrder   767.917783     0.000262     0.398216     0.019293        39803
2010-05-07 16:35:07:         dbUpdCustDelivery  1270.802428     0.000293     1.166629     0.036267        35040
2010-05-07 16:35:07:           dbSelCountStock   118.598752     0.006119     1.274016     0.033389         3552
2010-05-07 16:35:07:                 dbSelItem 14574.856113     0.000633     0.837902     0.038819       375455
2010-05-07 16:35:07:                    totals 87376.689563            -            -            -      2240081
2010-05-07 16:35:07:
2010-05-07 16:35:07: time: 901.060969, total transactions: 88584, TPM: 5898.646354, SPM: 149162.891976
Maxing out the buffers increases our TPM by almost 62% to 5,899. Not bad for a simple config change and an engine bounce.
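To double-check that figure, the improvement follows directly from the two TPM numbers reported by the benchmark (3650.840630 for the baseline run, 5898.646354 for this run):

```python
# Sanity check on the claimed improvement using the TPM figures
# reported by the two benchmark runs.
baseline_tpm = 3650.840630  # baseline run
tuned_tpm = 5898.646354     # run after maxing out buffers

pct = (tuned_tpm - baseline_tpm) / baseline_tpm * 100
print("TPM improvement: %.1f%%" % pct)  # roughly 61.6%
```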

Next I think I'll take advantage of my 3 unused dbspaces to spread what I/O I have left around to my idle disks.

OLTP Benchmark - Baseline

I'm ready to start benchmarking, but first we need to look at the hardware I'm running the benchmark against and produce some baseline results.

Server/Storage Properties

2 Dual Core CPUs
OS: 64 bit RHEL5
Informix: 11.50.FC6

rootdbs: 8GB RAID10 spread over 6 physical drives, currently holds a 4GB physical log
llogdbs: 8GB RAID10 spread over 8 physical drives, currently holds 32 logical logs of 256MB each
dbspace1: 49GB RAID10 spread over 6 physical drives
dbspace2: 108GB RAID10 spread over 8 physical drives
dbspace3: 69GB RAID10 spread over 8 physical drives
tempdbs: 4GB RAID10 spread over 10 physical drives

The client (the ptpcc.py process that runs the benchmark) will run on a separate but identical server.

For the baseline benchmark, here is a sampling of the ONCONFIG values I used

NETTYPE soctcp,1,200,NET
VPCLASS cpu,num=4,noage
VPCLASS aio,num=10,noage
BTSCANNER num=1,threshold=5000,rangesize=-1,alice=6,compression=default
BUFFERPOOL size=2K,buffers=50000,lrus=8,lru_min_dirty=50,lru_max_dirty=60

All tables used in the baseline benchmark were created with row level locking in dbspace1 and the original set of indexes.

Before each benchmark I will bounce the engine and let the benchmark run for 5 minutes without collecting statistics. After this 5 minute ramp-up, the benchmark will run for 15 minutes and produce some statistics that look like this:

2010-05-07 15:11:26: benchmark - 100 terminals running for 900 seconds
2010-05-07 15:31:28:                       sql        total          min          max          avg          num
2010-05-07 15:31:28:          dbSelDistOrderId   264.492183     0.000589     4.998652     0.120443         2196
2010-05-07 15:31:28:             dbSelDistPymt   666.518798     0.000611     0.874549     0.028071        23744
2010-05-07 15:31:28:             dbSelCustById  3865.562448     0.000819     2.452900     0.162802        23744
2010-05-07 15:31:28:          dbSelNewestOrder  3540.358412     0.000530    13.473251     1.700460         2082
2010-05-07 15:31:28:             dbSelCustData    69.895807     0.000703     0.853232     0.030310         2306
2010-05-07 15:31:28:           dbSelOrderLines    33.498745     0.000373     0.484876     0.016090         2082
2010-05-07 15:31:28:                dbSelOrder    69.810965     0.000608     0.529128     0.033531         2082
2010-05-07 15:31:28:          dbSelCustByOrder  1081.806514     0.000592     1.636825     0.049017        22070
2010-05-07 15:31:28:               dbUpdOOrder  1071.331497     0.000262     2.102251     0.048542        22070
2010-05-07 15:31:28:                 dbUpdDist   369.739051     0.000248     0.864587     0.015036        24591
2010-05-07 15:31:28:           dbSelCountStock  5653.657284     0.006115    14.663261     2.574525         2196
2010-05-07 15:31:28:             dbUpdDistPymt  3211.747694     0.000262     6.936206     0.135266        23744
2010-05-07 15:31:28:                dbUpdStock  3607.889070     0.000283     1.362896     0.015618       231006
2010-05-07 15:31:28:            dbSelCustCount   207.250488     0.017086     2.808686     0.166466         1245
2010-05-07 15:31:28:             dbUpdWhsePymt  2955.763594     0.000259     5.403844     0.124485        23744
2010-05-07 15:31:28:   dbUpdCustBalanceAndData    32.769989     0.000403     0.146913     0.014211         2306
2010-05-07 15:31:28:             dbInsHistPymt   368.780551     0.000269     1.330070     0.015532        23744
2010-05-07 15:31:28:          dbUpdCustBalance   345.092451     0.000297     1.336634     0.016097        21438
2010-05-07 15:31:28:        dbSelCustByNameOrd    45.854596     0.000788     0.640742     0.038566         1189
2010-05-07 15:31:28:             dbSelNewOrder  1986.016858     0.003527     6.152908     0.089987        22070
2010-05-07 15:31:28:             dbSelCustWhse  5416.771947     0.000852     6.570103     0.220275        24591
2010-05-07 15:31:28:          dbSelCustByIdOrd   108.330620     0.003462     1.446552     0.121311          893
2010-05-07 15:31:28:            dbUpdOrderLine  1515.644348     0.000358     2.189919     0.068674        22070
2010-05-07 15:31:28:                dbSelStock 26748.686979     0.000714     5.385313     0.115792       231006
2010-05-07 15:31:28:         dbSelOrderLineAmt   688.818336     0.000692     1.372060     0.031211        22070
2010-05-07 15:31:28:             dbSelWhsePymt   708.854190     0.000605     0.904277     0.029854        23744
2010-05-07 15:31:28:            dbInsOrderLine  4126.250988     0.000295     2.563897     0.017862       231006
2010-05-07 15:31:28:               dbInsOOrder   604.548543     0.000291     2.471327     0.024584        24591
2010-05-07 15:31:28:             dbDelNewOrder   479.087660     0.000269     8.575327     0.021708        22070
2010-05-07 15:31:28:                 dbSelDist  3854.715855     0.000630     7.683246     0.156753        24591
2010-05-07 15:31:28:             dbInsNewOrder   412.719221     0.000263     2.343799     0.016783        24591
2010-05-07 15:31:28:         dbUpdCustDelivery  2324.551898     0.000327     2.426604     0.105326        22070
2010-05-07 15:31:28:                 dbSelItem 12098.294207     0.000647     1.985179     0.052372       231006
2010-05-07 15:31:28:                    totals 88535.111786            -            -            -      1383948
2010-05-07 15:31:28:
2010-05-07 15:31:28: time: 901.863525, total transactions: 54876, TPM: 3650.840630, SPM: 92072.556084

The last line shows the total results for the entire benchmark. time is the actual run time of the benchmark in seconds, TPM is the number of transactions per minute and SPM is the number of SQL statements executed per minute.

Above the total results we see metrics for each SQL statement that makes up the transactions. sql is a name for the SQL statement, total is the total number of seconds spent executing the SQL, min, max and avg are the minimum, maximum and average number of seconds for a single execution and num is the number of times the SQL was executed.
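The avg column is simply total divided by num. For example, plugging in the dbSelStock row from the baseline output above:

```python
# avg should equal total / num; check it against the dbSelStock row
# from the baseline benchmark output.
total = 26748.686979  # seconds spent in dbSelStock
num = 231006          # number of executions
avg = total / num
print("dbSelStock avg: %.6f" % avg)  # prints 0.115792, matching the report
```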

Now the fun can begin and will most likely involve increasing the buffers to something reasonable to take advantage of the 4GB of memory I have on this machine.
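For context, the baseline BUFFERPOOL line above (size=2K, buffers=50000) gives the engine under 100MB of buffer cache, a small fraction of the 4GB available:

```python
# Size of the baseline 2K bufferpool from the ONCONFIG above.
page_size = 2 * 1024  # 2K pages, in bytes
buffers = 50000
pool_mb = page_size * buffers / (1024.0 * 1024.0)
print("bufferpool: %.1f MB" % pool_mb)  # prints 97.7 MB
```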

Fastest Informix DBA Contest at IIUG 2010

One of my favorite parts of the IIUG conference is the Advanced Data Tools Fastest DBA contest. This year Lester gave us a real challenge: tune an engine for a TPC-C-like OLTP benchmark. This alone was a fun challenge, but what really put you to the test was the amount of time you had. You had 1 hour from start to finish, which included 10 minutes to initialize the engine and 10 minutes to run the benchmark, leaving you only 40 minutes to get familiar with the benchmark and do any tuning or index building you wanted to do.

For me the 40 minutes felt like 4. I didn't think of anything special to do to tune the engine and finished in 4th place, but I did manage to claim the fastest Middle-Aged Informix DBA title. The prize for winning this category was a very nice shirt and many jokes at my expense from Tom Girsch and others.

Now that I am back home and have more than 40 minutes to play around with the benchmark I want to see how fast I can get it to run.

The first thing I did was rewrite the Java version in Python. Why would I do this? I don't know, I just like Python, and rewriting the original benchmark was a good way for me to understand what it was doing. During the rewrite I added some logic to capture, for each SQL statement, the total execution time, the number of times it was executed, and the minimum and maximum execution times, so I can see what needs to be improved and how much I've improved things.
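The timing logic is not much more than a dictionary of counters keyed by SQL name. A minimal sketch of the idea (the class and method names here are illustrative, not the actual ptpcc.py code):

```python
import time

class SqlStats(object):
    """Track total/min/max execution time and call count per SQL name."""
    def __init__(self):
        self.stats = {}

    def record(self, name, seconds):
        # fold one timed execution into the running stats for this SQL
        total, lo, hi, num = self.stats.get(name, (0.0, float("inf"), 0.0, 0))
        self.stats[name] = (total + seconds,
                            min(lo, seconds),
                            max(hi, seconds),
                            num + 1)

    def report(self, name):
        # return (total, min, max, avg, num) for one SQL
        total, lo, hi, num = self.stats[name]
        return (total, lo, hi, total / num, num)

# usage: wrap each SQL execution with a timer and record the elapsed time
stats = SqlStats()
start = time.time()
# ... cursor.execute(sql) would go here ...
stats.record("dbSelStock", time.time() - start)
```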

Now that I have a good idea of what the benchmark does, I can give a brief overview of what is going on, something I couldn't do a week ago during my 40 minutes at the IIUG conference.

The benchmark simulates an order processing system. The system can take orders, schedule orders for delivery, process payments, check stock levels and check order statuses. There are 50 warehouses that each contain different quantities of the 100,000 items in stock. Each warehouse has 10 districts and each district has 3,000 customers (50 * 10 * 3,000 = 1.5 million customers) who can place orders, take delivery of items, make payments, etc. from 1 of the 100 terminals that are running in parallel.

When the benchmark starts, 100 terminal threads are created and run for 10 minutes, continuously picking a transaction type to run.

43% of the time a payment transaction is picked
4% of the time a order status check transaction is picked
4% of the time a delivery transaction is picked
4% of the time a stock level check transaction is picked
45% of the time a new order transaction is picked

Each transaction type consists of multiple SQL statements that select, insert, update and delete, simulating an OLTP environment.
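The percentages above can be implemented with a single random draw per iteration. A minimal sketch of one way to do it (the names here are illustrative, not the actual benchmark code):

```python
import random

def pick_transaction(roll=None):
    """Pick a transaction type using the mix described above.

    roll is a number in [1, 100]; if None, one is drawn at random.
    """
    if roll is None:
        roll = random.randint(1, 100)
    if roll <= 43:
        return "payment"       # 43% of the time
    elif roll <= 47:
        return "order_status"  # 4% of the time
    elif roll <= 51:
        return "delivery"      # 4% of the time
    elif roll <= 55:
        return "stock_level"   # 4% of the time
    else:
        return "new_order"     # remaining 45% of the time
```

Each terminal thread would just loop on pick_transaction() for the duration of the run.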

Next time I'll get into the configuration of my test system and share the results of my baseline and then start changing stuff and see if it helps or hurts.

Thursday, April 29, 2010

dbdelete in python

At the 2010 IIUG Informix conference Nick Geib from IBM and I did a technical session on python and Informix. One of the examples we gave of how an Informix DBA can leverage python to make his or her life easier was a dbdelete utility: a simple python script that deletes a large amount of data without causing a long transaction or excessive locks, and that can run some onstats during the delete to determine if we should stop deleting for a little while because it is impacting performance.

One of the session attendees, a skeptical perl zealot, was so blown away by python and my super awesome public speaking skills that he asked for the complete dbdelete.py source code so he could immediately start learning more about python or find something stupid that I've done so he can give me shit.

Either way, here is the code, Darryl:


import os
import sys
import getopt
import time
import informixdb

def usage(server):
    print "%s -S <server> -d <database> -w <where clause> -n <num rows> -s <sleep seconds> [-v]" % sys.argv[0]
    print "   -S : DBSERVERNAME, default %s" % server
    print "   -d : database name, required"
    print "   -w : where clause, required"
    print "   -n : number of rows per transaction, optional, default 10"
    print "   -s : sleep seconds in between each transaction, optional, default 1"
    print "   -v : verbose output, optional, default off"

# wt4logbf executes onstat to identify the number of threads waiting for a log buffer.
# if more than maxlogbfwt threads waiting on logbf are found we will sleep for sleepSeconds.
# threads waiting for logbf is an indication that HDR is behind and if we do not throttle
# back our deletes engine performance will drop
def wt4logbf(maxlogbfwt, sleepSeconds):
    # execute onstat -g ath and count the number of threads waiting on logbf
    logbfwt = int(os.popen("onstat -g ath | grep logbf | wc -l").readlines()[0])

    # sleep sleepSeconds and recheck the number of waiting threads,
    # repeating until the number of threads waiting for logbf is below maxlogbfwt
    while logbfwt >= maxlogbfwt:
        print "max logbf waits reached [%d >= %d] sleeping %d seconds" % (logbfwt, maxlogbfwt, sleepSeconds)

        time.sleep(sleepSeconds)

        logbfwt = int(os.popen("onstat -g ath | grep logbf | wc -l").readlines()[0])

def main():
    server = os.getenv("INFORMIXSERVER")
    database = None
    where = None
    numDelPerTransaction = 10
    sleepSeconds = 1
    verbose = False

    # parse command line arguments
    opts, args = getopt.getopt(sys.argv[1:], "S:d:w:n:s:v?")

    for opt, val in opts:
        if opt == "-S":
            server = val
        if opt == "-d":
            database = val
        if opt == "-w":
            where = val
        if opt == "-n":
            numDelPerTransaction = int(val)
        if opt == "-s":
            sleepSeconds = int(val)
        if opt == "-v":
            verbose = True
        if opt == "-?":
            usage(server)
            sys.exit(1)

    # if the required arguments were not passed display the usage and exit
    if (numDelPerTransaction < 1) or (sleepSeconds < 0) or (database is None) or (where is None):
        usage(server)
        sys.exit(1)

    # sql to select the primary key fields (pkcol1 and pkcol2) from table1 that
    # meet the user defined where clause
    sqlSelect = """
        select
            pkcol1,
            pkcol2
        from
            table1
        where
            %s
    """ % (where, )

    # sql to delete a row by the primary key of table1
    sqlDelete = """
        delete from
            table1
        where
            pkcol1 = :1 and
            pkcol2 = :2
    """

    # connect to the database
    try:
        dbCon = informixdb.connect("%s@%s" % (database, server), autocommit = False)
    except informixdb.DatabaseError, e:
        print "unable to connect to %s@%s, %ld" % (database, server, e.sqlcode)
        sys.exit(1)

    # define select and delete cursors
    try:
        dbSelectCursor = dbCon.cursor(rowformat = informixdb.ROW_AS_OBJECT, hold = True)
        dbDeleteCursor = dbCon.cursor()
    except informixdb.DatabaseError, e:
        print "unable to define cursors, %ld" % (e.sqlcode, )
        sys.exit(1)

    # set some session attributes
    try:
        dbSelectCursor.execute("set lock mode to wait")
        dbSelectCursor.execute("set isolation dirty read")
    except informixdb.DatabaseError, e:
        print "unable to set session attributes, %ld" % (e.sqlcode, )
        sys.exit(1)

    # select the primary key of all rows in table1 that meet our where clause
    try:
        dbSelectCursor.execute(sqlSelect)

        numRowsInTransaction = 0
        totalRows = 0
        startTime = time.time()

        # for each row that meets our where clause, delete it,
        # committing the transaction and checking engine load at the user
        # defined intervals
        for dbRow in dbSelectCursor:
            if verbose:
                print "deleting row pkcol1 = %ld and pkcol2 = %ld" % (dbRow.pkcol1, dbRow.pkcol2)

            # attempt to delete this row
            try:
                dbDeleteCursor.execute(sqlDelete, (dbRow.pkcol1, dbRow.pkcol2))
                numRowsInTransaction = numRowsInTransaction + 1
                totalRows = totalRows + 1
            except informixdb.DatabaseError, e:
                print "unable to delete row pkcol1 = %ld and pkcol2 = %ld, %ld" % (dbRow.pkcol1, dbRow.pkcol2, e.sqlcode)

            # if we have met our rows to delete per transaction limit,
            # commit the transaction, sleep and check engine load
            if numRowsInTransaction == numDelPerTransaction:
                dbCon.commit()

                print "deleted %d rows [%f rows/second], sleeping %d seconds" % (totalRows, totalRows / (time.time() - startTime), sleepSeconds)

                numRowsInTransaction = 0
                time.sleep(sleepSeconds)

                wt4logbf(2, 30)

        # commit the last transaction
        dbCon.commit()

        print "deleted %d rows" % totalRows

    except informixdb.DatabaseError, e:
        print "unable to execute %s, %ld" % (sqlSelect, e.sqlcode)

if __name__ == "__main__":
    main()