Friday, June 4, 2010

Unavailable Features in Innovator-C - Don't Panic


If you have been following IBM's recent announcement of Innovator-C, a free version of Informix for all supported platforms that can be used in production environments, you may have seen a list that looks something like this:

Maximum of 4 CPUVPs
Maximum of 2GB memory allocated to Informix
Maximum of 2 Enterprise Replication Admin nodes
1 Read/Write HDR Server or 1 Read/Write RSS Server
No DBSpace Prioritization during backup/restore
No Recovery Time Objective Policy
No Private Memory Cache for VP
No Direct I/O for Cooked Files
No I-STAR
No Parallel Data Query (PDQ)
No High Performance Loader (HPL)
No Parallel Index Build
No Parallel backup/restore
No Partitioning
No Column Level Encryption
No Compression
No Label based access control (LBAC)
No Last commit concurrency
No Multiple triggers and views
No Web feature service
No Node DataBlade
No Auto-gather statistics during index build
No Point-in-time table recovery
No SQL Warehouse
No Shared Disk Secondary


That would be the excluded features list. At first glance you may be all WTF, but if you take a closer look at what is actually in the list you will see that IBM didn't really give away a crippled version of the engine for free. They simply left out some features that are only needed by very large engines, plus a few niche features.

Here is my take on the restrictions and excluded features:

Maximum of 4 CPUVPs

In the simplest of terms this restricts your Informix engine to 4 CPUs' worth of processing power. If you have a single quad-core CPU you can use all of the processing power for Informix. If you have 2 dual-core CPUs you can use all of the processing power for Informix. If you have 4 quad-core CPUs you can only use 1/4 (4 cores out of the 16 available) of the processing power of the server for Informix.
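To make the arithmetic concrete, here is a minimal Python sketch. It assumes, as a simplification, that one CPUVP keeps roughly one core busy. The function and constant names are made up for illustration and are not part of Informix.

```python
# Innovator-C limit taken from the restriction list above.
INNOVATOR_C_MAX_CPUVPS = 4


def usable_core_fraction(sockets: int, cores_per_socket: int,
                         max_cpuvps: int = INNOVATOR_C_MAX_CPUVPS) -> float:
    """Fraction of the server's cores that Informix can keep busy.

    Simplifying assumption: one CPUVP occupies about one core.
    """
    total_cores = sockets * cores_per_socket
    if total_cores <= 0:
        raise ValueError("the server needs at least one core")
    return min(total_cores, max_cpuvps) / total_cores


# The three configurations discussed in the text.
for sockets, cores in [(1, 4), (2, 2), (4, 4)]:
    share = usable_core_fraction(sockets, cores)
    print(f"{sockets} x {cores}-core: {share:.0%} of the CPU power usable")
```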

Informix does a great job of minimizing CPU overhead through efficient use of threads and as a result can do a lot with these 4 CPUs. In the real world I am a DBA for a telecommunications provider that uses Informix to handle billions of queries each day with sub-second response times, and I don't have more than 4 CPUVPs configured for any of my engines.

Maximum of 2GB memory allocated to Informix

Oh how spoiled we have become. Only 2GB of memory, oh the humanity! What could we possibly do with a measly 2 Gigabytes of memory?

A lot.

Considering that PDQ is unavailable in Innovator-C (more on this later) you will most likely want to dedicate a large percentage of this 2GB to Buffers. I'm thinking a good start would be about 75% of the memory to buffers and 25% to the virtual segment (where all the sorting, grouping and session memory lives).

That's 1.5GB of your most frequently used data on disk cached in memory.
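Here is a rough Python sketch of that 75/25 split. The 2 KB page size is an assumption (the default page size depends on the platform), and the sketch ignores per-buffer overhead and the other shared memory segments, so treat the numbers as a starting point rather than a tuned configuration. The function name is hypothetical.

```python
def split_memory(total_mb: int = 2048, buffer_share: float = 0.75,
                 page_size_kb: int = 2):
    """Split the memory limit between buffers and the virtual segment.

    Returns (buffer_mb, virtual_mb, buffer_pages).
    page_size_kb is an assumption; check the page size of your platform.
    """
    buffer_mb = int(total_mb * buffer_share)
    virtual_mb = total_mb - buffer_mb
    # Number of pages of page_size_kb that fit in the buffer share.
    buffer_pages = buffer_mb * 1024 // page_size_kb
    return buffer_mb, virtual_mb, buffer_pages


buffer_mb, virtual_mb, buffer_pages = split_memory()
print(f"buffers: {buffer_mb} MB ({buffer_pages} pages of 2 KB)")
print(f"virtual segment: {virtual_mb} MB")
```

In practice these values would presumably feed settings such as BUFFERPOOL and SHMVIRTSIZE in the onconfig file, adjusted for whatever else the engine needs to fit under the 2GB limit.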

In my production environments I dedicate 2GB to buffers, 500MB more than what I would use with Innovator-C, and I get a cache read hit percentage of around 99.7%.

Informix does an awesome job of caching the right pages, and I feel that if I dropped down to 1.5GB of buffer cache my cache read hit percentage wouldn't suffer much. I still think it would be above 99%, and I definitely don't think my users would notice a difference.
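For reference, the read cache percentage is, as far as I know, derived from two counters that onstat -p reports: buffer reads and disk reads. A minimal Python sketch of that calculation follows. The counter values are invented for illustration and are not taken from a real engine.

```python
def read_cache_pct(bufreads: int, dskreads: int) -> float:
    """Percentage of buffer reads that did not need a disk read."""
    if bufreads <= 0:
        raise ValueError("bufreads must be positive")
    return 100.0 * (bufreads - dskreads) / bufreads


# Invented counters: 1,000,000 buffer reads, 3,000 of which went to disk.
print(f"{read_cache_pct(1_000_000, 3_000):.1f}% read cache hit rate")
```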

Maximum of 2 Enterprise Replication Admin nodes

What? Are you kidding me? IBM is giving me Enterprise Replication for free? Oh no, I can only have 2 Administrator or root nodes, waaaaaaaaaah.

I only have 1 root node in my production environment, so it is definitely possible to accomplish something with ER and 2 root nodes.

ER is a wonderful feature. Simply put, ER replicates database changes (inserts, updates, deletes) performed on one table to 1 or more tables on different engines.

There are no limits to the number of leaf nodes and non-root nodes configured for Innovator-C, just root nodes.

1 Read/Write HDR Server or 1 Read/Write RSS Server

Wow, not only can you use ER for free, you can also use some of the MACH11 features for fault tolerance and failure recovery for free. Amazing.

An HDR server is a backup server, typically in the same data center, that is continuously and automatically kept in sync with the Primary server. The HDR Secondary can be brought into service if the Primary server fails. HDR is different from ER because it covers the entire engine rather than working on a table by table basis. HDR is ridiculously easy to set up and requires very little hand holding after it is set up.

To keep things simple, an RSS server is like an HDR server but is meant to be in a different data center to provide geographical redundancy for your database solution. Like HDR, RSS is easy to set up and administer and can be brought into service quickly to replace a failed Primary.

And the icing on the cake is that HDR and RSS servers are read/writable. You are free to use the secondary servers for inserts, updates, deletes and selects while they are acting as secondaries.

No DBSpace Prioritization during backup/restore

I'm pretty sure this is a feature recently added by IBM to reduce the time spent backing up/restoring dbspaces in parallel. If it is, meh, not the end of the world if this feature isn't there. Backups are taken online so there is no downtime associated with them, and this would really only come into play if your dbspace sizes are very different from each other.

No Recovery Time Objective Policy

I was sad to see this feature excluded, but just because I think it is a smart way to initiate checkpoints and not because I require it to run production stuff.

There are 2 ways (not really, I'm ignoring the other things that fire checkpoints) you can initiate checkpoints: every N seconds, or when so much work has been done that it would take N seconds to recover from a failure.

The latter is Recovery Time Objective, or RTO, and it is neat. Say you need fast recovery from a failure to take no more than 90 seconds. You would set RTO to 90 seconds and the engine will monitor what work has been done; when it determines that a server failure would take 90 seconds to recover from, it initiates a checkpoint. This is great because checkpoints suck (even the non-blocking checkpoints that Informix does): they involve I/O, and I/O sucks, so doing this only when you need to is a great advantage.

Without this feature in Innovator-C you have to revert to the old school every-N-seconds checkpoint mechanism (typically every 5 to 15 minutes).
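The difference between the two policies can be sketched with a toy Python model. To be clear, this is not how the engine estimates recovery time. It assumes, purely for illustration, that every unit of work adds a fixed 10 ms to fast recovery, and all names and default values are hypothetical.

```python
def count_checkpoints(work_per_second, policy, *, interval_s=300,
                      rto_s=90, recovery_ms_per_unit=10):
    """Count the checkpoints fired over a workload under one policy.

    work_per_second: units of work done in each one-second tick.
    policy: "interval" (every interval_s seconds) or "rto" (when the
            estimated recovery time reaches rto_s seconds).
    recovery_ms_per_unit is a made-up cost model, not an Informix value.
    """
    if policy not in ("interval", "rto"):
        raise ValueError(f"unknown policy: {policy}")
    checkpoints = 0
    elapsed_s = 0       # seconds since the last checkpoint
    pending_units = 0   # work done since the last checkpoint
    for work in work_per_second:
        elapsed_s += 1
        pending_units += work
        if policy == "interval":
            fire = elapsed_s >= interval_s
        else:
            fire = pending_units * recovery_ms_per_unit >= rto_s * 1000
        if fire:
            checkpoints += 1
            elapsed_s = 0
            pending_units = 0
    return checkpoints


quiet_hour = [1] * 3600    # almost idle engine
busy_hour = [100] * 3600   # heavily loaded engine
for name, load in [("quiet", quiet_hour), ("busy", busy_hour)]:
    print(name,
          "interval:", count_checkpoints(load, "interval"),
          "rto:", count_checkpoints(load, "rto"))
```

In this model the interval policy fires the same number of checkpoints whether the engine is idle or busy, while the RTO policy fires none during the quiet hour and more often under load, which is the "only when you need to" behavior described above.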

Not really the end of the world and not something that would keep me from installing in a production environment.


Ok, enough for now. It IS a super long list, but we haven't come across a feature or limitation that cripples Innovator-C. Spoiler alert, we're not going to come across any next time either.


4 comments:

  1. Good article, Andrew. I'm not quite sure what you mean by "No Multiple triggers and views" though. "No multiple triggers" means that an action cannot activate more than one trigger, but where do views come into the picture?

  2. This article is a great help which gives me a practical-technical reason to use Innovator-C.

    -F

  3. Thanks Andrew for explaining the prohibited components.
    Just one question: do we need to get a POE from IBM
    to run IDS in a production environment?
    The License Information has a reference to POE.

    Thanks
    Krishnan

  4. It is my understanding that you do not need to obtain a POE from IBM if you want to use Innovator-C in a production environment; it is just there for you to use for free.

    On the other hand, if you want support you can buy this on a yearly basis from IBM at a very reasonable cost.
