If I had to guess, I would say that most production database engines utilize RAID technology to protect against the inevitable disk failure and the ones that don't probably should. Disk is cheap and the revenue saved by avoiding an extended outage can be enough to pay for disk mirroring many times over.
If I had to guess again, I would say that not nearly enough production database engines utilize High Availability Data Replication (HDR) to protect against the inevitable server failure. Why is this? Servers can fail too. Sure, servers are more expensive than disks and sure the MTTF is longer than disks but the money lost during an extended outage that could have been avoided with HDR is probably going to be more than the cost of implementing an HDR solution.
HDR continuously replicates the changes made to a Primary server to a Secondary server that can be quickly converted to a Primary if the original Primary fails. As an added bonus, the Secondary server can be used for reads and writes allowing you to make use of this hardware to improve performance instead of letting it sit there idle. You could also implement multiple Remote Standalone Secondary (RSS) or Shared Disk Secondary (SDS) servers to create a grid if your Informix Edition supports this. I'm going to focus on a single HDR Secondary which is available for no cost in Innovator-C.
As with most Informix features, HDR is incredibly easy to configure and does not require much administration.
Get Yourself Some More Hardware
To enable HDR you will need another server. The Secondary doesn't have to be identical to the Primary in every way, but if you expect it to take over during a failure you're going to want the same amount of memory, CPUs, etc. to ensure it can handle the load. Here is what is required of servers participating in HDR:
- Both servers must run the same Informix version
- Both servers must be able to run the same Informix executable. Ubuntu and Red Hat run the same Informix executable; HP-UX and Red Hat do not. Why would you ever want to do this anyway?
- Both servers must have network capabilities
- The Secondary server must have at least as much disk space for dbspaces as the Primary. The dbspace chunk types (cooked or raw) do not have to be identical
- Dbspace chunk path names must be identical; symbolic links can help here
- Not really a hardware requirement but any databases you want replicated must be logged. Unbuffered logging is preferred
Follow the steps from Installing Innovator-C on Linux on a new server named blogsvr02. Create 0 byte files with the touch command to mirror the dbspace chunks on the primary server.
informix@blogsvr02> mkdir /home/informix/chunks
informix@blogsvr02> touch /home/informix/chunks/ROOTDBS.01
informix@blogsvr02> touch /home/informix/chunks/LLOGDBS01.01
informix@blogsvr02> touch /home/informix/chunks/DATADBS01.01
informix@blogsvr02> touch /home/informix/chunks/DATADBS01.02
informix@blogsvr02> chmod 660 /home/informix/chunks/*

Copy the /etc/profile.d/informix.sh file from the Primary to the Secondary and change INFORMIXSERVER.
root@blogsvr02> scp blogsvr01:/etc/profile.d/informix.sh /etc/profile.d/informix.sh
root@blogsvr02> vi /etc/profile.d/informix.sh
export INFORMIXSERVER=blogsvr02

Copy the ONCONFIG file from the Primary to the Secondary, change DBSERVERNAME, and add a DBSERVERALIASES entry to both ONCONFIGs that will be used exclusively for HDR.
informix@blogsvr01> vi $INFORMIXDIR/etc/$ONCONFIG
DBSERVERALIASES blogsvr01_hdr
informix@blogsvr02> scp blogsvr01:/opt/informix/etc/onconfig.blogsvr01 $INFORMIXDIR/etc/$ONCONFIG
informix@blogsvr02> vi $INFORMIXDIR/etc/$ONCONFIG
DBSERVERNAME blogsvr02
DBSERVERALIASES blogsvr02_hdr

Do we need a dedicated connection for HDR? No, but I feel doing so gives me two advantages:
- I can put HDR traffic on a separate network if I want
- Both HDR servers must trust each other; I can use the more secure $INFORMIXDIR/etc/hosts.equiv to accomplish this if HDR runs on a dedicated port
To let the Secondary accept writes, set UPDATABLE_SECONDARY in the ONCONFIG.

informix> vi $INFORMIXDIR/etc/$ONCONFIG
UPDATABLE_SECONDARY 2

Add a new port to /etc/services on both servers for HDR.
root> vi /etc/services
idshdr01 1528/tcp # Informix HDR

Modify the sqlhosts file on both the Primary and the Secondary so they both contain connectivity information for both servers. Use the s=6 security option for the HDR ports to indicate that only Replication traffic is allowed on these ports, giving us the ability to use $INFORMIXDIR/etc/hosts.equiv to establish trusts.
informix> vi $INFORMIXSQLHOSTS
# blogsvr01
blogsvr01 onsoctcp blogsvr01 idstcp01
blogsvr01_hdr onsoctcp blogsvr01 idshdr01 s=6
# blogsvr02
blogsvr02 onsoctcp blogsvr02 idstcp01
blogsvr02_hdr onsoctcp blogsvr02 idshdr01 s=6

Bounce the Primary server for ONCONFIG changes to take effect.
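If you would rather not hand-type the sqlhosts entries on two servers, here is a minimal sketch that prints them so they can be reviewed and pasted in. The helper name hdr_sqlhosts_entries is hypothetical (it is not an Informix utility), and the sketch assumes the same naming scheme used above: a `_hdr` alias per server and the service names idstcp01 and idshdr01.

```shell
#!/bin/sh
# Sketch only: hdr_sqlhosts_entries is a hypothetical helper, not part of Informix.
# It prints the two sqlhosts lines for one server: the regular client entry
# and the dedicated HDR alias with the s=6 option.
# Arguments: host name, client service name, HDR service name.
hdr_sqlhosts_entries() {
    host=$1
    client_service=$2
    hdr_service=$3
    printf '# %s\n' "$host"
    # Regular client connections
    printf '%s onsoctcp %s %s\n' "$host" "$host" "$client_service"
    # Dedicated HDR alias on its own port
    printf '%s_hdr onsoctcp %s %s s=6\n' "$host" "$host" "$hdr_service"
}

# Print the entries for both servers in the pair.
for server in blogsvr01 blogsvr02; do
    hdr_sqlhosts_entries "$server" idstcp01 idshdr01
done
```

The output is only text on stdout; nothing is written to $INFORMIXSQLHOSTS until you copy it there yourself.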
Create or Modify the hosts.equiv Files
The hosts.equiv file will contain the hostname of each server that is allowed to make a trusted connection. You must also change the permissions of the file so only the informix user can write to it.
informix@blogsvr01> vi $INFORMIXDIR/etc/hosts.equiv
blogsvr02
informix@blogsvr01> chmod 640 $INFORMIXDIR/etc/hosts.equiv
informix@blogsvr02> vi $INFORMIXDIR/etc/hosts.equiv
blogsvr01
informix@blogsvr02> chmod 640 $INFORMIXDIR/etc/hosts.equiv

Note: Later, when we start HDR, if you see messages in your online.log (onstat -m output) that look like this:
12:12:16 listener-thread: err = -956: oserr = 0: errstr = informix@blogsvr02.prod.informix-dba.com[blogsvr02]: Client host or user informix@blogsvr02.prod.informix-dba.com[blogsvr02] is not trusted by the server.

then you need to put the full hostname, blogsvr02.prod.informix-dba.com, in hosts.equiv.
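The edit-then-chmod steps above can also be scripted. This is a minimal sketch, assuming a hypothetical helper named trust_host; it appends a host name only if it is not already listed and then applies the 640 permissions used above.

```shell
#!/bin/sh
# Sketch only: trust_host is a hypothetical helper, not an Informix command.
# Arguments: path to the hosts.equiv file, host name to trust.
trust_host() {
    equiv_file=$1
    trusted_host=$2
    # Create the file if it does not exist yet
    touch "$equiv_file"
    # -x matches whole lines only, -F treats the host name as a literal string
    if ! grep -qxF -- "$trusted_host" "$equiv_file"; then
        printf '%s\n' "$trusted_host" >> "$equiv_file"
    fi
    # Owner read/write, group read, nothing for others
    chmod 640 "$equiv_file"
}
```

On blogsvr01 you would call it as `trust_host "$INFORMIXDIR/etc/hosts.equiv" blogsvr02`, and the other way around on blogsvr02. If you hit the -956 error shown above, call it again with the fully qualified hostname.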
Restore Secondary Server Using a Backup from the Primary
The first step in actually starting HDR is to perform a physical restore of the Primary to the Secondary. After this is complete we will start HDR and Informix will automatically sync the Secondary with the Primary by processing the logical log records that have been written since the Primary's backup was taken.
One of my favorite Informix features is ontape to STDIO. You can use this feature to simultaneously take a Level 0 backup of your Primary, ship the data over the network and pipe it directly into a physical restore on the Secondary. This is a lot easier than performing an Imported Restore. Like to see it? Here it goes.
informix@blogsvr01> ontape -s -L 0 -F -t STDIO | ssh informix@blogsvr02 ". /etc/profile.d/informix.sh; ontape -p -t STDIO"

While this is running, you can use onstat -D on both servers to see the reading of pages on the Primary and the writing of pages on the Secondary in parallel. After the backup and restore completes the Secondary server will be in Fast Recovery mode.
informix@blogsvr02> onstat -m
IBM Informix Dynamic Server Version 11.50.UC7IE -- Fast Recovery -- Up 00:00:40 -- 1164976 Kbytes
Message Log File: /opt/informix-ids-11.50.UC7IE/tmp/online.log
13:38:11 Maximum server connections 0
13:38:11 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 0, Llog used 0
13:38:11 Checkpoint Completed: duration was 0 seconds.
13:38:11 Tue Aug 3 - loguniq 10, logpos 0x1816018, timestamp: 0x4a722 Interval: 721
13:38:11 Maximum server connections 0
13:38:11 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 0, Llog used 0
13:38:11 Checkpoint Completed: duration was 0 seconds.
13:38:11 Tue Aug 3 - loguniq 10, logpos 0x1816018, timestamp: 0x4a728 Interval: 722
13:38:11 Maximum server connections 0
13:38:11 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 0, Llog used 0
13:38:12 Physical Restore of rootdbs, llogdbs01, datadbs01 Completed.
13:38:12 Checkpoint Completed: duration was 0 seconds.
13:38:12 Tue Aug 3 - loguniq 10, logpos 0x1816018, timestamp: 0x4a739 Interval: 722
13:38:12 Maximum server connections 0

and you are ready to start HDR.
Starting HDR
Start HDR on the Primary with the onmode -d primary command. In this command you will tell Informix that this is a Primary HDR server and the Secondary is blogsvr02.
informix@blogsvr01> onmode -d primary blogsvr02

Start HDR on the Secondary with the onmode -d secondary command. This will tell Informix that this is a Secondary HDR server and the Primary is blogsvr01.
informix@blogsvr02> onmode -d secondary blogsvr01

The two servers will connect, and after the Secondary clears its logical logs and receives all of the logical log records from the Primary the HDR setup is complete.
informix@blogsvr02> onstat -m
IBM Informix Dynamic Server Version 11.50.UC7IE -- Updatable (Sec) -- Up 00:05:12 -- 1164976 Kbytes
Message Log File: /opt/informix-ids-11.50.UC7IE/tmp/online.log
13:42:09 Updates from secondary allowed
13:42:09 DR: Secondary server needs failure recovery
13:42:10 DR: Failure recovery from disk in progress ...
13:42:10 Logical Recovery Started.
13:42:10 10 recovery worker threads will be started.
13:42:10 Start Logical Recovery - Start Log 10, End Log ?
13:42:10 Starting Log Position - 10 0x1816018
13:42:10 Clearing the physical and logical logs has started
13:42:46 Cleared 3059 MB of the physical and logical logs in 36 seconds
13:42:48 Started processing open transactions on secondary during startup
13:42:48 Finished processing open transactions on secondary during startup.
13:42:48 DR: HDR secondary server operational
13:42:49 B-tree scanners disabled.
13:42:50 Checkpoint Completed: duration was 0 seconds.
13:42:50 Tue Aug 3 - loguniq 10, logpos 0x181e018, timestamp: 0x4a7af Interval: 723
13:42:50 Maximum server connections 0
13:42:50 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 14, Llog used 0

You really don't have to do anything else from this point forward to administer HDR, just sit back and relax. Your data is safer now.
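If you want a quick scripted check instead of reading the log by eye, here is a rough sketch that looks for the "DR: HDR secondary server operational" message near the end of the online.log. The function name hdr_secondary_operational is hypothetical, and the check is only a heuristic: an old message still inside the inspected tail would count as a match, so treat it as a convenience rather than an authoritative status.

```shell
#!/bin/sh
# Sketch only: hdr_secondary_operational is a hypothetical helper.
# Succeeds (exit status 0) if the "operational" message appears in the
# last N lines of the given online.log; N defaults to 40.
# Arguments: path to online.log, optional number of lines to inspect.
hdr_secondary_operational() {
    log_file=$1
    lines=${2:-40}
    tail -n "$lines" "$log_file" |
        grep -q 'DR: HDR secondary server operational'
}

# Example use on the Secondary (path taken from the onstat -m header above):
#   if hdr_secondary_operational /opt/informix-ids-11.50.UC7IE/tmp/online.log; then
#       echo "HDR secondary looks operational"
#   fi
```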
What do I do when the Secondary Server Fails?
If the Secondary fails and the logical log that was current at the time of the failure has not been reused (they're circular, remember) on the Primary then you can simply restart the Secondary and it will automatically resync.
informix@blogsvr02> oninit
informix@blogsvr02> tail -40 $INFORMIXDIR/tmp/online.log
14:46:39 DR: ENCRYPT_HDR is 0 (HDR encryption Disabled)
14:46:39 Event notification facility epoll enabled.
14:46:39 IBM Informix Dynamic Server Version 11.50.UC7IE Software Serial Number AAA#B000000
14:46:40 IBM Informix Dynamic Server Initialized -- Shared Memory Initialized.
14:46:40 Started 1 B-tree scanners.
14:46:40 B-tree scanner threshold set at 5000.
14:46:40 B-tree scanner range scan size set to -1.
14:46:40 B-tree scanner ALICE mode set to 6.
14:46:40 B-tree scanner index compression level set to med.
14:46:40 Physical Recovery Started at Page (1:5623).
14:46:40 Physical Recovery Complete: 0 Pages Examined, 0 Pages Restored.
14:46:40 DR: Trying to connect to primary server = blogsvr01_hdr
14:46:41 Dataskip is now OFF for all dbspaces
14:46:41 Restartable Restore has been ENABLED
14:46:41 Recovery Mode
14:46:45 DR: Secondary server connected
14:46:46 Updates from secondary allowed
14:46:46 Updates from secondary allowed
14:46:46 DR: Using default behavior of failure-recovering Secondary server
14:46:47 DR: Failure recovery from disk in progress ...
14:46:47 Logical Recovery Started.
14:46:47 10 recovery worker threads will be started.
14:46:47 Start Logical Recovery - Start Log 10, End Log ?
14:46:47 Starting Log Position - 10 0x182e018
14:46:48 Started processing open transactions on secondary during startup
14:46:48 Finished processing open transactions on secondary during startup.
14:46:48 DR: HDR secondary server operational
14:46:49 Logical Log 10 Complete, timestamp: 0x4a92d.
14:46:50 Logical Log 11 Complete, timestamp: 0x4a944.
14:46:51 Logical Log 12 Complete, timestamp: 0x4a975.
14:46:52 Logical Log 13 Complete, timestamp: 0x4a987.
14:46:54 B-tree scanners disabled.
14:46:55 Checkpoint Completed: duration was 0 seconds.
14:46:55 Tue Aug 3 - loguniq 14, logpos 0x9018, timestamp: 0x4a9a4 Interval: 729
14:46:55 Maximum server connections 0
14:46:55 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 15, Llog used 0

If your Secondary has been down for a while and the logical logs have rolled over there are 2 ways to recover. The easy way and the hard way.
The easy way is to reinitialize HDR by restoring the Primary to the Secondary again and running onmode -d secondary blogsvr01_hdr on the Secondary.
The hard way is to restart the Secondary and when you see this message in the online.log
15:03:21 DR: Start failure recovery from tape ...

you can perform a Logical Restore to the Secondary using the logical log backups from the Primary. If you're backing up to a directory, copy the necessary logical log backups from the Primary to the Secondary, rename each backup to include the Secondary server name and use ontape -l -d to perform a Logical Restore.
informix@blogsvr02> scp blogsvr01:/home/informix/backup/llog/* .
blogsvr01_0_Log0000000008 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000009 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000010 100% 1440KB 1.4MB/s 00:00
blogsvr01_0_Log0000000011 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000012 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000013 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000014 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000015 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000016 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000017 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000018 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000019 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000020 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000021 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000022 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000023 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000024 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000025 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000026 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000027 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000028 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000029 100% 96KB 96.0KB/s 00:00
blogsvr01_0_Log0000000030 100% 96KB 96.0KB/s 00:00
informix@blogsvr02> script_i_made_to_rename_the_files.ksh
informix@blogsvr02> ls -l /home/informix/backup/llog
total 3648
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000008
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000009
-rw-rw---- 1 informix informix 1474560 Aug 3 14:56 blogsvr02_0_Log0000000010
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000011
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000012
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000013
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000014
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000015
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000016
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000017
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000018
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000019
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000020
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000021
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000022
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000023
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000024
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000025
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000026
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000027
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000028
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000029
-rw-rw---- 1 informix informix 98304 Aug 3 14:56 blogsvr02_0_Log0000000030
informix@blogsvr02> ontape -l -d
Roll forward should start with log number 14
Restore is using file /home/informix/backup/llog/blogsvr02_0_Log0000000014 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000014 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000015 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000016 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000017 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000018 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000019 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000020 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000021 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000022 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000023 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000024 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000025 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000026 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000027 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000028 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000029 ...
Using the backup and restore filter /bin/gunzip.
Rollforward log file /home/informix/backup/llog/blogsvr02_0_Log0000000030 ...
Program over.
informix@blogsvr02> tail -46 $INFORMIXDIR/tmp/online.log
15:03:21 DR: Start failure recovery from tape ...
15:03:28 Logical Recovery Started.
15:03:28 10 recovery worker threads will be started.
15:03:28 Start Logical Recovery - Start Log 14, End Log ?
15:03:28 Starting Log Position - 14 0x9018
15:03:29 Started processing open transactions on secondary during startup
15:03:29 Finished processing open transactions on secondary during startup.
15:03:29 DR: HDR secondary server operational
15:03:29 Logical Log 14 Complete, timestamp: 0x4a9e2.
15:03:29 Logical Log 15 Complete, timestamp: 0x4a9f9.
15:03:29 Logical Log 16 Complete, timestamp: 0x4aa0b.
15:03:29 Logical Log 17 Complete, timestamp: 0x4aa0b.
15:03:29 Logical Log 18 Complete, timestamp: 0x4aa33.
15:03:29 Logical Log 19 Complete, timestamp: 0x4aa45.
15:03:29 Logical Log 20 Complete, timestamp: 0x4aa45.
15:03:29 Logical Log 21 Complete, timestamp: 0x4aa6a.
15:03:29 Logical Log 22 Complete, timestamp: 0x4aa6a.
15:03:29 Logical Log 23 Complete, timestamp: 0x4aa94.
15:03:29 Logical Log 24 Complete, timestamp: 0x4aaa6.
15:03:29 Logical Log 25 Complete, timestamp: 0x4aaa6.
15:03:29 Logical Log 26 Complete, timestamp: 0x4aace.
15:03:29 Logical Log 27 Complete, timestamp: 0x4aace.
15:03:29 Logical Log 28 Complete, timestamp: 0x4aaf2.
15:03:29 Checkpoint Completed: duration was 0 seconds.
15:03:29 Tue Aug 3 - loguniq 29, logpos 0x18, timestamp: 0x4aafc Interval: 730
15:03:29 Maximum server connections 0
15:03:29 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 16, Llog used 0
15:03:30 Logical Log 29 Complete, timestamp: 0x4ab1f.
15:03:33 DR: Failure recovery from disk in progress ...
15:03:33 Logical Log 30 Complete, timestamp: 0x4ae61.
15:03:33 Checkpoint Completed: duration was 0 seconds.
15:03:33 Tue Aug 3 - loguniq 31, logpos 0x15018, timestamp: 0x4aea4 Interval: 731
15:03:33 Maximum server connections 0
15:03:33 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 13, Llog used 0
15:03:33 Checkpoint Completed: duration was 0 seconds.
15:03:33 Tue Aug 3 - loguniq 31, logpos 0x17018, timestamp: 0x4aeaa Interval: 732
15:03:33 Maximum server connections 0
15:03:33 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 0, Llog used 0
15:03:35 B-tree scanners disabled.
15:03:36 Checkpoint Completed: duration was 0 seconds.
15:03:36 Tue Aug 3 - loguniq 31, logpos 0x20018, timestamp: 0x4aed2 Interval: 733
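The rename script used in the hard way isn't shown, so here is a minimal sketch of what such a script could look like. This is not the original script: the function name rename_log_backups is hypothetical, and it assumes the backup file names follow the servername_N_LogNNNNNNNNNN pattern visible in the listing, where only the leading server name needs to change.

```shell
#!/bin/sh
# Sketch only: rename_log_backups is a hypothetical helper, not the
# script used in the article. It swaps the server-name prefix on every
# logical log backup file in a directory.
# Arguments: backup directory, old server name, new server name.
rename_log_backups() {
    backup_dir=$1
    old_server=$2
    new_server=$3
    for old_path in "$backup_dir/${old_server}"_*_Log*; do
        # If nothing matched, the pattern is left as a literal string; skip it
        [ -e "$old_path" ] || continue
        file_name=${old_path##*/}
        # Everything after the old server name, e.g. _0_Log0000000014
        suffix=${file_name#"$old_server"}
        mv -- "$old_path" "$backup_dir/$new_server$suffix"
    done
}
```

For the example above the call would be `rename_log_backups /home/informix/backup/llog blogsvr01 blogsvr02`. Files that don't match the pattern are left alone.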
What do I do when the Primary Fails?
When your Primary fails you can quickly make the Secondary server a Standalone (i.e. no HDR) server. Even if you have configured an Updatable Secondary you will need to do this since the writes on a Secondary are sent to the Primary under the covers.
Make the Secondary a Standalone server with the onmode -d standard command
informix@blogsvr02> onmode -d standard
informix@blogsvr02> onstat -m
IBM Informix Dynamic Server Version 11.50.UC7IE -- On-Line -- Up 00:30:32 -- 1164976 Kbytes
Message Log File: /opt/informix-ids-11.50.UC7IE/tmp/online.log
15:38:26 Logical Recovery Complete.
15:38:27 Quiescent Mode
15:38:27 Checkpoint Completed: duration was 0 seconds.
15:38:27 Tue Aug 3 - loguniq 31, logpos 0x4c018, timestamp: 0x4b0a6 Interval: 740
15:38:27 Maximum server connections 0
15:38:27 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns blocked 0, Plog used 0, Llog used 1
15:38:27 Started 1 B-tree scanners.
15:38:27 B-tree scanner threshold set at 5000.
15:38:27 B-tree scanner range scan size set to -1.
15:38:27 B-tree scanner ALICE mode set to 6.
15:38:27 B-tree scanner index compression level set to med.
15:38:27 DR: Reservation of the last logical log for log backup turned on
15:38:27 SCHAPI: Started dbScheduler thread.
15:38:27 DR: new type = standard
15:38:27 Booting Language from module <>
15:38:27 Loading Module
15:38:27 SCHAPI: Started 2 dbWorker threads.
15:38:28 On-Line Mode

When the old Primary is fixed and ready to be brought back online you have options.
Option 1 is to reinitialize HDR just like we did when setting up HDR for the first time, except now blogsvr02 will be the Primary and blogsvr01 will be the Secondary. I like this option because it doesn't require any downtime.
Option 2 is to make blogsvr01 the Primary again (easier to do if the logs have not rolled over on blogsvr02.) This requires some downtime and assumes that the disks on blogsvr01 were not the reason it went down and all of the data is still intact.
Switch blogsvr02 to Quiescent Mode
informix@blogsvr02> onmode -s

Change the HDR status of blogsvr02 to Secondary
informix@blogsvr02> onmode -d secondary blogsvr01_hdr

Start Informix on the Primary
informix@blogsvr01> oninit

If the logical logs have rolled over on the Secondary (while it was Standalone) you will need to do what we did before. Move the logical log backups that you need from blogsvr02 to blogsvr01, change their names and run ontape -l -d.
If everything works as advertised the Secondary will ship over the logs the Primary needs, they will be applied to the Primary and HDR will be restored.
Pretty cool stuff that has saved my butt more than a couple of times.
Hi Andrew, I have a simple question for you. Can I use the secondary server to take a backup? And can I use that backup to restore the primary in case of catastrophic failure (both servers dead, maybe an earthquake or something)?
Thanks in advance,
Manuel
The secondary cannot be used to take a backup; this can only be performed against the primary.
If you are concerned about geographical redundancy you should take a look at RSS nodes in a different location if you can afford it.
Hi Andrew,
Awesomely documented process. Very clear, concise and easy to understand.
The only thing perhaps missing: you could include a simple start and stop procedure (for those that are new to HDR) for when you just need to shut down the servers (say, for hardware maintenance).
Thanks
Hi Andrew,
My name is Roberto. I would like to have the commands to check if replication is OK (up and running) on both sides, primary and secondary.
Please, can you help with this?
Hi Roberto, I am not Andrew, but just type onstat -g dri on any server.
I have a problem. I am new to Informix. I don't have backups of my databases and I used oninit -ivy. Now all my databases are gone. How can I get them back?
Hi Andrew, I successfully managed to implement HDR on my Informix database. I want to know what to do if I need to take a level 0 backup on the primary. Do I need to stop HDR? Also, if I need to add a chunk on the primary, how do I do it in an HDR setup?
Thank you.
No need to stop HDR in order to take a backup.
To add a chunk, all you have to do is ensure the chunk path exists on the secondary just like you would do on the primary or a stand alone instance and then run the onspaces command on the primary.
Hi Andrew,
Thank you for your article. Very much helpful. Enjoying HDR now.
Hi Andrew
Thanks for this nice article. I'm new to Informix replication and just got the pair of servers working by going through the article. I also combined your article with the IBM one.
I have to present a DR simulation to a client. The simulation will run for a day. Assuming the Primary is down, once I switch to the Secondary the users will access the server to verify the data and do minor transactions. At this point the Primary is still down and can't be accessed.
At the end of the activity, I need to switch back so that the Primary is the Primary and the Secondary is the Secondary, as they were initially.
What is the flow of tasks I need to do? I am expecting no data loss. This is a real production DB.