Configuring split brain behavior
Split brain notification
DRBD invokes the split-brain handler, if configured, whenever split brain is detected. To configure this handler, add the following item to your resource configuration:
resource resource {
  handlers {
    split-brain handler;
    ...
  }
  ...
}
handler may be any executable present on the system.
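Because the handler can be any executable, a site-specific script can stand in for a notification script. The following is a minimal sketch in Python and is not part of the DRBD distribution: it writes a critical message to the system log. The log identifier, the message wording, and the reliance on a DRBD_RESOURCE variable being present in the handler's environment are assumptions made for illustration; the sketch falls back to "unknown" when the variable is absent.

```python
#!/usr/bin/env python3
"""Hypothetical split-brain handler: records the event in the system log."""
import os
import sys
import syslog


def build_message(environ, argv):
    """Compose the log line from the environment and any handler arguments."""
    # Assumption: DRBD exports the resource name as DRBD_RESOURCE.
    resource = environ.get("DRBD_RESOURCE", "unknown")
    message = f"DRBD split brain detected on resource {resource}"
    # Arguments given after the script path in drbd.conf are appended verbatim.
    extra = " ".join(argv[1:])
    return f"{message} ({extra})" if extra else message


def main():
    # "drbd-split-brain" is an illustrative log identifier, not a DRBD name.
    syslog.openlog("drbd-split-brain", syslog.LOG_PID, syslog.LOG_DAEMON)
    syslog.syslog(syslog.LOG_CRIT, build_message(os.environ, sys.argv))
    return 0


if __name__ == "__main__":
    main()
```

Saved under a path of your choosing and made executable, such a script would be named in the split-brain line of the handlers section in place of handler.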
Since DRBD version 8.2.6, the DRBD distribution contains a split brain handler script that installs as /usr/lib/drbd/notify-split-brain.sh. It simply sends a notification e-mail message to a specified address. To configure the handler to send a message to root@localhost (which is expected to be an email address that forwards the notification to a real system administrator), configure the split-brain handler as follows:
resource resource {
  handlers {
    split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    ...
  }
  ...
}
After you have made this modification on a running resource (and synchronized the configuration file between nodes), no additional intervention is needed to enable the handler. DRBD will simply invoke the newly configured handler on the next occurrence of split brain.
Automatic split brain recovery policies
DRBD offers several configuration options for enabling and configuring its automatic split brain recovery policies. DRBD applies its split brain recovery procedures based on the number of nodes in the Primary role at the time the split brain is detected. To that end, DRBD examines the following options, all found in the resource's net configuration section:
- after-sb-0pri. Split brain has just been detected, but at this time the resource is not in the Primary role on any host. For this option, DRBD understands the following keywords:
  - disconnect. Do not recover automatically; simply invoke the split-brain handler script (if configured), drop the connection, and continue in disconnected mode.
  - discard-younger-primary. Discard and roll back the modifications made on the host which assumed the Primary role last.
  - discard-least-changes. Discard and roll back the modifications on the host where fewer changes occurred.
  - discard-zero-changes. If there is any host on which no changes occurred at all, simply apply all modifications made on the other and continue.
- after-sb-1pri. Split brain has just been detected, and at this time the resource is in the Primary role on one host. For this option, DRBD understands the following keywords:
  - disconnect. As with after-sb-0pri, simply invoke the split-brain handler script (if configured), drop the connection, and continue in disconnected mode.
  - consensus. Apply the same recovery policies as specified in after-sb-0pri. If a split brain victim can be selected after applying these policies, automatically resolve. Otherwise, behave exactly as if disconnect were specified.
  - call-pri-lost-after-sb. Apply the recovery policies as specified in after-sb-0pri. If a split brain victim can be selected after applying these policies, invoke the pri-lost-after-sb handler on the victim node. This handler must be configured in the handlers section and is expected to forcibly remove the node from the cluster.
  - discard-secondary. Whichever host is currently in the Secondary role, make that host the split brain victim.
- after-sb-2pri. Split brain has just been detected, and at this time the resource is in the Primary role on both hosts. This option accepts the same keywords as after-sb-1pri except discard-secondary and consensus.
Note: DRBD understands additional keywords for these three options, which have been omitted here because they are very rarely used. Refer to drbd.conf(5) for details on split brain recovery keywords not discussed here.
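The relationship between the number of Primary nodes, the option DRBD consults, and the keywords described in this section can be summarized in a short Python sketch. This is an illustrative model only, not DRBD code: the names AFTER_SB_KEYWORDS, option_for, and check_net_section are hypothetical, and the keyword sets cover only the keywords listed here, not the rarely used ones documented in drbd.conf(5).

```python
# Keywords per option, exactly as described in this section.
AFTER_SB_KEYWORDS = {
    "after-sb-0pri": {
        "disconnect",
        "discard-younger-primary",
        "discard-least-changes",
        "discard-zero-changes",
    },
    "after-sb-1pri": {
        "disconnect",
        "consensus",
        "call-pri-lost-after-sb",
        "discard-secondary",
    },
}
# after-sb-2pri: same as after-sb-1pri, minus discard-secondary and consensus.
AFTER_SB_KEYWORDS["after-sb-2pri"] = AFTER_SB_KEYWORDS["after-sb-1pri"] - {
    "discard-secondary",
    "consensus",
}


def option_for(primary_count):
    """Name the option consulted for a given number of Primary nodes."""
    if primary_count not in (0, 1, 2):
        raise ValueError("expected 0, 1, or 2 nodes in the Primary role")
    return f"after-sb-{primary_count}pri"


def check_net_section(net):
    """Return a list of problems in a {option: keyword} mapping."""
    problems = []
    for option, keyword in net.items():
        allowed = AFTER_SB_KEYWORDS.get(option)
        if allowed is None:
            continue  # not a split brain recovery option; ignored here
        if keyword not in allowed:
            problems.append(f"{option}: {keyword!r} is not described in this section")
    return problems


print(option_for(1))
print(check_net_section({"after-sb-2pri": "consensus"}))
```

A keyword flagged by check_net_section is not necessarily invalid in DRBD; it is only absent from the keywords discussed in this section.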
For example, a resource which serves as the block device for a GFS or OCFS2 file system in dual-Primary mode may have its recovery policy defined as follows:
resource resource {
  handlers {
    split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    ...
  }
  net {
    after-sb-0pri discard-zero-changes;
    after-sb-1pri discard-secondary;
    after-sb-2pri disconnect;
    ...
  }
  ...
}