Cloud Scale upgrade shows the environment as Failed or partially READY
During a Cloud Scale upgrade, the environment may report a partial READY state (for example, 3/4 READY) or appear as Failed.
The status can also fluctuate between phases such as:
Upgrading MSDPScaleouts
Upgrading MediaServers
At the same time:
PrimaryServer and MediaServer components may report Success
MSDPScaleout may not yet be fully READY
Additional indicators include:
Operator logs showing:
Not all MSDP resources are ready
Failed to get media servers registered to storage server
HTTP timeout to storage-server API
Primary server logs showing NBSL/RDSM errors for the same period, followed by successful operations later.
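The operator-log indicators listed above can be checked mechanically. The following is a minimal Python sketch and not part of the product. The function name find_transient_indicators and the sample log lines are hypothetical, and the matching assumes the messages appear in the log text as quoted above; the actual log format may differ.

```python
# Substrings taken from the indicators listed above. Real operator log
# lines may wrap them in timestamps or structured fields (assumption).
TRANSIENT_INDICATORS = (
    "Not all MSDP resources are ready",
    "Failed to get media servers registered to storage server",
    "HTTP timeout to storage-server API",
)


def find_transient_indicators(log_text):
    """Return a dict mapping each known indicator to its occurrence count."""
    counts = {indicator: 0 for indicator in TRANSIENT_INDICATORS}
    for line in log_text.splitlines():
        lowered = line.lower()
        for indicator in TRANSIENT_INDICATORS:
            # Case-insensitive substring match, one count per matching line.
            if indicator.lower() in lowered:
                counts[indicator] += 1
    return counts


# Hypothetical sample input, for illustration only.
sample_log = """\
2024-01-01T10:00:00Z ERROR Not all MSDP resources are ready
2024-01-01T10:00:30Z ERROR HTTP timeout to storage-server API
2024-01-01T10:01:00Z INFO reconcile complete
2024-01-01T10:01:30Z ERROR Not all MSDP resources are ready
"""

for indicator, count in find_transient_indicators(sample_log).items():
    print(f"{count:3d}  {indicator}")
```

Counts that stop growing once MSDP stabilizes are consistent with the transient condition described in this article; counts that keep growing point toward the escalation criteria.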
The Cloud Scale upgrade workflow depends on MSDPScaleout availability, including storage services and RDSM. While MSDP engines are starting, restarting, or temporarily unstable, upgrade reconcilers retry dependent operations. During this phase, transient errors can be reported until all required components become fully healthy.
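The retry behavior described above can be illustrated with a small simulation. This is a sketch of the general reconcile-and-retry pattern, not the operator's actual code. The names reconcile_until_ready and dependency_ready, and the attempt counts, are hypothetical.

```python
def reconcile_until_ready(dependency_ready, max_attempts=10):
    """Retry a dependent step until its dependency reports ready.

    dependency_ready: callable that takes the attempt number (starting
    at 1) and returns True once the dependency is healthy.
    Returns (succeeded, errors), where errors lists the transient
    failures recorded before success or before the attempts ran out.
    """
    errors = []
    for attempt in range(1, max_attempts + 1):
        if dependency_ready(attempt):
            return True, errors
        # A real reconciler would requeue after a delay; this sketch
        # only records the transient error and tries again.
        errors.append(f"attempt {attempt}: Not all MSDP resources are ready")
    return False, errors


# Hypothetical scenario: the MSDP engines become stable on attempt 4.
succeeded, errors = reconcile_until_ready(lambda attempt: attempt >= 4)
print("succeeded:", succeeded)
for message in errors:
    print(message)
```

In this scenario three transient errors are recorded before the step succeeds, which mirrors how errors can appear in the logs during the upgrade and then stop without intervention.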
Workaround:
Perform the following:
Check Environment, MSDPScaleout, PrimaryServer, and MediaServer resources together rather than relying on the Environment status alone.
Allow time for the upgrade to progress and recheck the status; in many cases, the condition resolves automatically once MSDP becomes stable.
If the issue persists:
Verify that all MSDP pods and engines are running and stable.
Correlate NetBackup operator logs with primary server logs for storage or RDSM-related errors.
Escalate the issue if:
MSDPScaleout still does not reach the Ready state.
MSDP pods are crash-looping.
PrimaryServer or MediaServer custom resources remain Failed and API calls never succeed.
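The workaround and escalation criteria above can be summarized as a small decision helper. This is a minimal sketch under stated assumptions: the function triage, its parameters, the state strings, and the 60-minute default are all hypothetical. This article does not define a specific waiting period, and the inputs would have to be collected separately, for example from the cluster's resource status output.

```python
def triage(msdp_ready, component_states, crash_looping_pods,
           minutes_waited, max_wait_minutes=60):
    """Return 'healthy', 'wait', or 'escalate'.

    msdp_ready: True when MSDPScaleout reports fully READY.
    component_states: dict such as {"PrimaryServer": "Success",
        "MediaServer": "Failed"} (state strings are assumptions).
    crash_looping_pods: names of MSDP pods seen crash-looping.
    minutes_waited: time since the partial state was first seen.
    max_wait_minutes: hypothetical threshold; choose one that suits
        the environment.
    """
    # Crash-looping MSDP pods are an escalation criterion on their own.
    if crash_looping_pods:
        return "escalate"
    all_success = all(s == "Success" for s in component_states.values())
    if msdp_ready and all_success:
        return "healthy"
    # Still not ready or still failed after the allowed waiting period.
    if minutes_waited >= max_wait_minutes:
        return "escalate"
    return "wait"


# Hypothetical example: MSDPScaleout is still starting ten minutes in.
states = {"PrimaryServer": "Success", "MediaServer": "Success"}
print(triage(False, states, [], minutes_waited=10))
```

The helper evaluates all resources together rather than the Environment status alone, in line with the first workaround step.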