Troubleshooting

7 February 2024

This page provides information on the possible execution status errors you may encounter while executing a DR plan.

The following sections describe common scenarios, with examples, to help you understand how to recover a DR plan from a failed state. They are not an exhaustive list.

Configuration Error

Cause

A DR plan transitions to the Inactive (Configuration Error) state if the DR plan is configured incorrectly before activation, or if a change made to the DR plan or its protection settings after activation is incorrect. Examples of configuration errors are:

  • The Protection Group or Protection Policy is not configured in the cluster or has been removed from the cluster.

  • Replication is not enabled in the Protection Policy.

  • One or more VMs are not included in the Protection Group.

Solution

The error message can help you identify the root cause of the problem. To see the error message, on the Disaster Recovery Plans page, click the name of the DR plan. The error message is displayed along with the other details of the DR plan.

You may try the following to troubleshoot the error:

  • Log in to the Cohesity cluster (or access it via Helios) and fix the protection settings of your VMs.

  • Modify the DR plan to update the VMs defined in the plan.
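The example causes listed above can be treated as a short checklist. The following is a minimal sketch of that checklist, not part of SiteContinuity: the `PlanSettings` class, its field names, and the VM names are all hypothetical, and the values are assumed to be filled in by hand from what you see in the Cohesity UI.

```python
from dataclasses import dataclass, field


@dataclass
class PlanSettings:
    """Hypothetical snapshot of settings read off the UI by hand."""
    protection_group_exists: bool
    protection_policy_exists: bool
    replication_enabled: bool
    plan_vms: set = field(default_factory=set)              # VMs defined in the DR plan
    protection_group_vms: set = field(default_factory=set)  # VMs in the Protection Group


def configuration_problems(settings):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not settings.protection_group_exists:
        problems.append("Protection Group is missing from the cluster")
    if not settings.protection_policy_exists:
        problems.append("Protection Policy is missing from the cluster")
    if not settings.replication_enabled:
        problems.append("Replication is not enabled in the Protection Policy")
    # VMs that the plan expects but the Protection Group does not cover.
    unprotected = sorted(settings.plan_vms - settings.protection_group_vms)
    if unprotected:
        problems.append("VMs not in the Protection Group: " + ", ".join(unprotected))
    return problems


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    example = PlanSettings(True, True, False, {"vm-a", "vm-b"}, {"vm-a"})
    for problem in configuration_problems(example):
        print(problem)
```

The checklist only mirrors the examples on this page; the error message shown on the DR plan remains the authoritative description of the problem.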

System Error

Cause

SiteContinuity transitions the DR plan to the Inactive (System Error) state if a problem caused by external factors is impeding the ongoing operation of that plan. This state usually also indicates that the plan itself has no configuration errors. Examples of system errors are:

  • Network connectivity issue between the Cohesity cluster and the vCenter

  • vCenter is down

  • Cluster services are slow or unresponsive

Solution

The error message can help you identify the root cause of the problem. To see the error message, on the Disaster Recovery Plans page, click the name of the DR plan. The error message is displayed along with the other details of the DR plan.

You may try the following to troubleshoot the error:

  • Verify network connectivity between the Cohesity cluster and the vCenter.

  • Ensure the vCenter server is running.

  • Examine the vCenter logs to diagnose the error.

  • Log in to the Cohesity cluster and check the cluster status.
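As a quick first check for the first two items above, you can test whether the vCenter accepts TCP connections on its HTTPS port. The following is a minimal sketch using only the Python standard library. The hostname is a placeholder, and port 443 is assumed to be the port your vCenter listens on. Note that it tests reachability from the machine where you run it, which is not necessarily the same network path the Cohesity cluster uses.

```python
import socket


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder hostname; replace it with the address of your vCenter.
    vcenter = "vcenter.example.com"
    state = "reachable" if can_connect(vcenter) else "NOT reachable"
    print(f"{vcenter}:443 is {state} from this machine")
```

A successful connection only shows that the port is open. It does not confirm that the vCenter services behind it are healthy, so the vCenter logs and the cluster status are still worth checking.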

Failover Failed

Cause

If the failover of a DR plan fails, the status is displayed as Failover Failed.

Solution

To see the error message and identify the root cause of the problem:

  1. In SiteContinuity, navigate to DR Plans.

  2. On the Disaster Recovery Plans page, click on the name of that DR plan. The error message is displayed along with the other details of the DR plan.

  3. In the Activity tab, click on the Failover activity. The Log tab displays all the events of the Failover activity in chronological order, including the specific event that encountered the error.

    Examine the error messages and the log to determine whether the errors are due to external factors or configuration errors. Fix the errors, and then retry the failover. If the failover fails again, contact Cohesity Support.

Failback Failed

Cause

If the failback of a DR plan fails, the status is displayed as Failback Failed.

Solution

To see the error message and identify the root cause of the problem:

  1. In SiteContinuity, navigate to DR Plans.

  2. On the Disaster Recovery Plans page, click on the name of that DR plan. The error message is displayed along with the other details of the DR plan.

  3. In the Activity tab, click on the Failback activity. The Log tab displays all the events of the Failback activity in chronological order, including the specific event that encountered the error.

    Examine the error messages and the log to determine whether the errors are due to external factors or configuration errors. Fix the errors, and then retry the failback. If the failback fails again, contact Cohesity Support.

Test Failover Failed

Cause

If the Test Failover of a DR plan fails, the icon in the Checks column of the Disaster Recovery Plans page shows an error symbol. Click the icon to see the error message.

Solution

You can either edit the DR plan to fix the issue or delete the DR plan.

Test Failback Failed

Cause

If the Test Failback of a DR plan fails, the icon in the Checks column of the Disaster Recovery Plans page shows an error symbol. Click the icon to see the error message.

Solution

You can either edit the DR plan to fix the issue or delete the DR plan.

Health Check Failed

Cause

If the Health Check of a DR plan fails, the Health Check icon in the Checks column of the Disaster Recovery Plans page shows an error symbol. Click the icon to see the error message.

Solution

You may try the following to troubleshoot the error:

  • Ensure the SLA of the DR plan is met. The SLA might not be met if replication between the clusters is failing, or if protection runs are configured to run shorter cycles than the SLA defined in the DR plan.

  • Verify the remote cluster connection on both the primary and DR Cohesity clusters:

    1. Log in to the Cohesity cluster.

    2. Navigate to Infrastructure > Remote Clusters.

    3. Verify the remote cluster connection and details.

  • Verify that the primary and DR sites are connected to Helios:

    1. Log in to the Cohesity cluster.

    2. Check the Helios icon in the top right corner of the Cohesity Dashboard. When the cluster is connected to Helios, a green check mark is displayed in the icon.