About node failure management
The following table lists the node failure scenarios and how to manage them.

Node failure | Description |
|---|---|
Planned node failure | For routine node maintenance, we recommend that you set the node to maintenance mode on the MVG server to reduce the impact on the deduplication ratio and the risk of job failures. The client policy assignments on the node are not changed automatically while the node is in maintenance mode. Run the following MSDP commands to set, unset, or query the maintenance mode manually: `cacontrol --mvg set-mvg-maintenance <msdp-server>`, `cacontrol --mvg unset-mvg-maintenance <msdp-server>`, `cacontrol --mvg get-mvg-maintenance`. Alternatively, on the NetBackup web UI, go to the Disk Pool page of an MVG volume, select and edit a disk volume of the MVG volume, then switch the mode between "Maintenance" and "Normal". |
Unplanned node failure | The MVG server has a rebalance freezing timeout. A client policy is moved to another node only when the next backup starts and the current node has been unreachable for longer than the rebalance freezing timeout. The value is configurable in the MVG tuning API with the keyword `asmt_rb_freezing_timeout`. The default is 0.5 hours. |
The node is operational again after the failure | If a node stays down for a long time, its client policy assignments are moved to other nodes when NetBackup runs backup jobs with the backup policies. When the node is back, most of the original client policy combinations are moved back to keep the system balanced. |
The node is full | If a disk volume is full, a client policy is moved to another node. It is not moved back unless the new node is inactive or full and the original node has space available again. |
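
The rules in the table can be summarized as a small decision sketch. This is an illustrative model only, not NetBackup code: the `NodeState` class, both function names, the use of seconds as the timeout unit, the treatment of "inactive" as "unreachable", and the order in which the checks are applied are all assumptions made for this example. Only the 0.5 hour default and the keyword `asmt_rb_freezing_timeout` come from the table.

```python
from dataclasses import dataclass

# Default from the table: 0.5 hour. Expressing it in seconds is an assumption
# of this sketch; the unit of asmt_rb_freezing_timeout is not stated above.
DEFAULT_FREEZING_TIMEOUT_S = 0.5 * 3600


@dataclass
class NodeState:
    """Hypothetical snapshot of a node, taken when the next backup starts."""
    reachable: bool = True
    unreachable_for_s: float = 0.0  # how long the node has been unreachable
    full: bool = False
    maintenance: bool = False


def should_move_policy(node, timeout_s=DEFAULT_FREEZING_TIMEOUT_S):
    """Return True if a client policy would leave this node at backup start."""
    if node.maintenance:
        # Planned failure: assignments are not changed automatically.
        return False
    if node.full:
        # Full node: the client policy is moved to another node.
        return True
    if not node.reachable:
        # Unplanned failure: move only after the freezing timeout has passed.
        return node.unreachable_for_s > timeout_s
    return False


def should_move_back_from_full(original, new):
    """Full-node case: move back only if the new node is inactive or full
    and the original node has space available again."""
    new_unusable = (not new.reachable) or new.full
    return new_unusable and not original.full
```

For example, under these assumptions a node that has been unreachable for 10 minutes keeps its client policies at the next backup, while one that has been unreachable for 40 minutes does not.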