Reference

Disaster Recovery version 2.7

Operations

Symptoms:

  • Mail gets stuck in the queue. The queue size keeps growing, and the mail log shows that mail cannot be delivered because the file system is read-only.
 cat /var/log/maillog
  • Check the system log for drbd messages.
 grep drbd /var/log/messages
 It will show lines similar to those below:
 kernel: drbd1: rw=0, want=20134015632, limit=31455270
 kernel: EXT3-fs error (device drbd1): ext3_free_branches: Read failure, inode=1149189, block=2516751953
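
The log check above can be narrowed to the two kinds of kernel message shown. This is a minimal sketch, assuming the messages follow the patterns in the sample lines; the function name `drbd_fs_errors` is hypothetical and not part of the product.

```shell
# Hypothetical helper: print DRBD-related kernel error lines from a log file.
# Usage: drbd_fs_errors /var/log/messages
drbd_fs_errors() {
    logfile="$1"
    # Match the EXT3 error lines and the out-of-range access lines shown above.
    grep -E 'EXT3-fs error \(device drbd[0-9]+\)|drbd[0-9]+: rw=[0-9]+, want=[0-9]+, limit=[0-9]+' "$logfile"
}
```

If the function prints nothing, the log contains no lines of either kind, and the queue problem may have another cause.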

Steps to recover the DR servers

Each step below is given first for MAIN_HOST and then for DR_HOST.
MAIN_HOST: Put the server into maintenance mode.
/mithi/mcs/bin/setMaintenanceModeStatus.sh -status t
DR_HOST: Do nothing.
MAIN_HOST: Stop the agents.
/mithi/mcs/bin/stopagents.sh
DR_HOST: Do nothing.
MAIN_HOST: Stop the services.
/mithi/mcs/bin/manageservices.sh --stopall
DR_HOST: Do nothing.
MAIN_HOST: Wait until synchronization completes.
/mithi/mcs/bin/drbd_diagnostics.sh
The output should read: REPLICATION OF DATA IS COMPLETED SUCCESSFULLY !!!
DR_HOST: Do nothing.
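
The wait for synchronization can be scripted rather than checked by hand. This is a sketch only: it assumes that `drbd_diagnostics.sh` prints the success message quoted in this step and that it is safe to run repeatedly. The function names and the 60-second interval are hypothetical.

```shell
# Hypothetical helper: succeed when the diagnostics output on stdin
# contains the success message quoted in the step above.
replication_done() {
    grep -q 'REPLICATION OF DATA IS COMPLETED SUCCESSFULLY'
}

# Hypothetical polling loop: re-run the diagnostics until replication is reported complete.
# The 60-second interval is an assumption, not a documented value.
wait_for_replication() {
    until /mithi/mcs/bin/drbd_diagnostics.sh 2>&1 | replication_done; do
        sleep 60
    done
}
```

Calling `wait_for_replication` on MAIN_HOST returns only once the message has appeared.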
MAIN_HOST: Disconnect drbd.
drbdadm disconnect all
DR_HOST: Disconnect drbd.
drbdadm disconnect all
MAIN_HOST: Back up the mailstore data and mcsdata to a separate temporary machine. Use tar for mcsdata and rsync for mailstore, then confirm that the backup is consistent.
Example using tar:
tar -cvzf mailstore.tar.gz /mailstore
tar -cvzf mcsdata.tar.gz /mcsdata
DR_HOST: Do nothing.
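
One way to confirm the consistency of a tar backup is to read the archive back from start to end. The sketch below assumes gzip-compressed archives like those in the example; the function name `verify_tarball` is hypothetical. For an rsync copy, a checksum dry run is a comparable check; the host and path in that comment are placeholders.

```shell
# Hypothetical helper: succeed if the gzip-compressed tar archive can be listed end to end.
# A truncated or corrupted archive makes tar exit with a non-zero status.
verify_tarball() {
    archive="$1"
    tar -tzf "$archive" > /dev/null 2>&1
}

# Usage sketch:
# verify_tarball mcsdata.tar.gz || echo "mcsdata backup is damaged" >&2
#
# For an rsync copy of mailstore (backuphost and the target path are placeholders),
# a dry run with checksums lists files that differ without copying anything:
# rsync -anci /mailstore/ backuphost:/path/to/mailstore/
```

Listing the archive shows that it is readable, not that every file was captured, so comparing file counts against the source is still worthwhile.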
MAIN_HOST: Disable the services.
/mithi/mcs/bin/disableservices.sh

Make the server secondary.

/mithi/mcs/bin/setas_secondary.sh
DR_HOST: Do nothing.
MAIN_HOST: Stop the drbd service.
/mithi/mcs/components/mithi-drbd/bin/drbd stop

Format the '/backup' partition.

1. BACKUP_PARTION_IN_DR_FILE=`grep PRI_DISK_BACKUP /etc/mithi/mcs/dr.conf |
   cut -d "=" -f2`
2. mkfs -t ext3 "$BACKUP_PARTION_IN_DR_FILE"
DR_HOST: Stop the drbd service.
/mithi/mcs/components/mithi-drbd/bin/drbd stop

Format the '/backup' partition.

1. BACKUP_PARTION_IN_DR_FILE=`grep SEC_DISK_BACKUP /etc/mithi/mcs/dr.conf |
   cut -d "=" -f2`
2. mkfs -t ext3 "$BACKUP_PARTION_IN_DR_FILE"
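
Because the value read from dr.conf is passed straight to mkfs, it may be worth checking it first. This is a minimal sketch, assuming dr.conf holds plain KEY=value lines as the grep and cut commands above imply; both function names are hypothetical.

```shell
# Hypothetical helper: print the value of KEY from a file of KEY=value lines.
dr_conf_get() {
    key="$1"
    file="$2"
    # An exact match on the key avoids picking up other lines that merely contain it.
    awk -F= -v k="$key" '$1 == k { print $2; exit }' "$file"
}

# Hypothetical guard: succeed only if the argument is non-empty and names a block device.
require_block_device() {
    [ -n "$1" ] && [ -b "$1" ]
}

# Usage sketch on MAIN_HOST (use SEC_DISK_BACKUP on DR_HOST):
# part=$(dr_conf_get PRI_DISK_BACKUP /etc/mithi/mcs/dr.conf)
# require_block_device "$part" && mkfs -t ext3 "$part"
```

The guard stops mkfs from running with an empty or mistyped device name.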
MAIN_HOST: Start the drbd service.
/mithi/mcs/components/mithi-drbd/bin/drbd start

Check that the servers are connected and the resources are in the Inconsistent state.

cat /proc/drbd

If they are not connected, execute the command below.

drbdadm connect all
DR_HOST: Start the drbd service.
/mithi/mcs/components/mithi-drbd/bin/drbd start

Check that the servers are connected and the resources are in the Inconsistent state.

cat /proc/drbd

If they are not connected, execute the command below.

drbdadm connect all
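
The connection check can be done by extracting the cs: field from each resource line. This is a sketch, assuming /proc/drbd uses the line layout shown in the sample outputs on this page; the function names are hypothetical.

```shell
# Hypothetical helper: print the connection state (cs:) of every resource line on stdin.
drbd_cs_states() {
    sed -n 's/^ *[0-9][0-9]*: *cs: *\([A-Za-z]*\).*/\1/p'
}

# Hypothetical helper: succeed if any resource is waiting for a connection or standing alone.
drbd_needs_connect() {
    drbd_cs_states | grep -qE '^(WFConnection|StandAlone)$'
}

# Usage sketch:
# drbd_needs_connect < /proc/drbd && drbdadm connect all
```

Running `drbdadm connect all` only when it is needed keeps the step safe to repeat.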
MAIN_HOST: Execute the command below.
drbdadm -- --do-what-I-say primary all

Wait until the first synchronization completes.

watch cat /proc/drbd

Check:
1. Synchronization should start here. The connection state on this server for both resources, mailstore and mcsdata, should be either "cs:SyncSource" or "cs:PausedSyncS".

DR_HOST: Execute the command below.
watch cat /proc/drbd

Check:
1. Synchronization should start here. The connection state on this server for both resources, mailstore and mcsdata, should be either "cs:SyncTarget" or "cs:PausedSyncT".
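
The two checks above can be expressed as one test that every resource reports an allowed connection state. This is a sketch, assuming the /proc/drbd line layout shown elsewhere on this page; the function name `drbd_cs_all_in` is hypothetical.

```shell
# Hypothetical helper: succeed when every resource line on stdin has a cs: value
# from the allowed list, given as alternatives separated by "|".
drbd_cs_all_in() {
    allowed="$1"
    awk -v allowed="$allowed" '
        /^ *[0-9]+: *cs:/ {
            state = $0
            sub(/^ *[0-9]+: *cs: */, "", state)   # drop everything up to the state name
            sub(/ .*/, "", state)                 # drop the remaining fields
            seen++
            if (state !~ ("^(" allowed ")$")) bad++
        }
        END { exit ((seen > 0 && bad == 0) ? 0 : 1) }
    '
}

# Usage sketch:
# On MAIN_HOST: drbd_cs_all_in 'SyncSource|PausedSyncS' < /proc/drbd
# On DR_HOST:   drbd_cs_all_in 'SyncTarget|PausedSyncT' < /proc/drbd
```

The function fails when no resource lines are found, so an unloaded drbd module is not mistaken for a healthy state.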

MAIN_HOST: Format the '/dev/drbd0' and '/dev/drbd1' resources using the commands below.
i. For the /dev/drbd0 resource:
mkfs.ext3 /dev/drbd0

ii. For the /dev/drbd1 resource:

mkfs.ext3 /dev/drbd1
DR_HOST: Do nothing.
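
Before formatting, it can help to confirm that neither device is still mounted. This is a minimal sketch, assuming a mount table in the /proc/mounts layout with the device in the first column; the function name `is_mounted` is hypothetical.

```shell
# Hypothetical helper: succeed if DEV appears as a mounted device
# in the mount-table text read from stdin.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage sketch:
# for dev in /dev/drbd0 /dev/drbd1; do
#     if is_mounted "$dev" < /proc/mounts; then
#         echo "$dev is mounted; not formatting" >&2
#     else
#         mkfs.ext3 "$dev"
#     fi
# done
```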
MAIN_HOST: Execute the command below.
drbdadm secondary all
DR_HOST: Do nothing.
MAIN_HOST: Check that the servers are connected and the resources are in the Consistent state.
cat /proc/drbd

Output:
0: cs:Connected st:Secondary/Secondary ld:Consistent
1: cs:Connected st:Secondary/Secondary ld:Consistent

DR_HOST: Check that the servers are connected and the resources are in the Consistent state.
cat /proc/drbd

Output:
0: cs:Connected st:Secondary/Secondary ld:Consistent
1: cs:Connected st:Secondary/Secondary ld:Consistent
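
The expected output can also be checked mechanically. The sketch below assumes each resource line carries the cs:, st: and ld: fields in the form shown in the sample output; the function name `drbd_expect_role` is hypothetical. Passing a different role pair, such as Primary/Secondary, lets the same check serve any status check in this procedure.

```shell
# Hypothetical helper: succeed when every resource line on stdin is Connected,
# Consistent, and reports the given role pair (for example Secondary/Secondary).
drbd_expect_role() {
    role="$1"
    awk -v want="st:$role" '
        /^ *[0-9]+: *cs:/ {
            seen++
            ok = 0
            if (index($0, "cs:Connected") && index($0, "ld:Consistent")) {
                # Compare whole fields so the role pair must match exactly.
                for (i = 1; i <= NF; i++) if ($i == want) ok = 1
            }
            if (!ok) bad++
        }
        END { exit ((seen > 0 && bad == 0) ? 0 : 1) }
    '
}

# Usage sketch:
# drbd_expect_role Secondary/Secondary < /proc/drbd || echo "drbd is not in the expected state" >&2
```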

MAIN_HOST: Make the server primary.
/mithi/mcs/bin/setas_primary.sh

Check the drbd status on this server.

cat /proc/drbd

Output:
0: cs:Connected st:Primary/Secondary ld:Consistent
1: cs:Connected st:Primary/Secondary ld:Consistent
Note: If the server is not in the connected state (WFConnection, StandAlone), execute

drbdadm connect all
DR_HOST: Check the drbd status on this server.
cat /proc/drbd

Output:
0: cs:Connected st:Secondary/Primary ld:Consistent
1: cs:Connected st:Secondary/Primary ld:Consistent
Note: If the server is not in the connected state (WFConnection, StandAlone), execute

drbdadm connect all
MAIN_HOST: Restore the data (you can use any one of the following).
Copy the backed-up mailstore data and mcsdata to the /mailstore and /mcsdata partitions.
For details on rsync, refer to Rsync.
DR_HOST: Do nothing.
MAIN_HOST: Set the services to start automatically using the following command.
/mithi/mcs/bin/enableservices.sh
DR_HOST: Do nothing.
MAIN_HOST: Reboot the server.
reboot
DR_HOST: Do nothing.
MAIN_HOST: Check the drbd status on the server.
cat /proc/drbd

Output:
0: cs:Connected st:Primary/Secondary ld:Consistent
1: cs:Connected st:Primary/Secondary ld:Consistent
Note: If the server is not in the connected state (WFConnection, StandAlone), execute

drbdadm connect all
DR_HOST: Check the drbd status on the server.
cat /proc/drbd

Output:
0: cs:Connected st:Secondary/Primary ld:Consistent
1: cs:Connected st:Secondary/Primary ld:Consistent
Note: If the server is not in the connected state (WFConnection, StandAlone), execute

drbdadm connect all

Note

  • Check that the server is working: services are running, agents are started, and mail is flowing.
  • Remove maintenance mode from the Primary server (MAIN_HOST).
/mithi/mcs/bin/setMaintenanceModeStatus.sh -status f