NetBackup™ Backup Planning and Performance Tuning Guide

About tape I/O error handling

Note:

This topic does not cover the number of times NetBackup retries a failed backup or restore. Retries are controlled by the global configuration parameter Backup Tries for backups and the bp.conf entry RESTORE_RETRIES for restores.

The algorithm that is described here determines whether I/O errors on tape should cause media to be frozen or drives to be downed.

When a read/write/position error occurs on tape, the error that is returned by the operating system does not identify whether the tape or drive caused the error. To prevent the failure of all backups in a given time frame, bptm tries to identify a bad tape volume or drive based on past history.

To do so, bptm uses the following logic:

  • Each time an I/O error occurs on a read/write/position, bptm logs the error in the following file.

    Linux/UNIX

    /usr/openv/netbackup/db/media/errors

    Windows

    install_path\NetBackup\db\media\errors

    The error message includes the time of the error, media ID, drive index, and type of error. The following examples illustrate the entries in this file:

    07/21/96 04:15:17 A00167 4 WRITE_ERROR 
    07/26/96 12:37:47 A00168 4 READ_ERROR
  • Each time an entry is made, the past entries are scanned. The scan determines whether the same media ID or drive has had this type of error in the past "n" hours. "n" is known as the time_window. The default time window is 12 hours.

    During the history search for the time_window entries, EMM notes the past errors that match the media ID, the drive, or both. The purpose is to determine the cause of the error. For example, if a media ID gets write errors on more than one drive, the tape volume may be bad and NetBackup freezes the volume. If more than one media ID gets a particular error on the same drive, the drive goes to a "down" state. If the only past errors found are on the same drive with the same media ID, EMM assumes that the volume is bad and freezes it.

  • The freeze or down operation is not performed on the first error.

    Note two other parameters: media_error_threshold and drive_error_threshold. For both of these parameters, the default is 2. For a freeze or down to happen, more than the threshold number of errors must occur. By default, at least three errors must occur in the time window for the same drive or media ID.

    If either media_error_threshold or drive_error_threshold is 0, a freeze or down occurs the first time an I/O error occurs. media_error_threshold is checked first, so if both values are 0, a freeze overrides a down. Veritas does not recommend that these values be set to 0.

    A change to the default values is not recommended without good reason. One obvious change would be to put very large numbers in the threshold files. Very large values would effectively disable the mechanism, so that a tape is never frozen and a drive is never downed.

    Freezing and downing are primarily intended to benefit backups. If read errors occur on a restore, a freeze of media has little effect. NetBackup still accesses the tape to perform the restore. In the restore case, downing a bad drive may help.
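
The logic above can be illustrated with a short Python sketch. It is not NetBackup code and does not reflect how bptm or EMM is implemented. It assumes that errors are counted per media ID and per drive within the time window, that only errors of the same type are counted, that the new error is included in the count, and that the media check runs before the drive check. The names TapeError, parse_entry, and decide are hypothetical, and the second and third log lines in the usage example are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class TapeError:
    """One entry of the media errors file (hypothetical helper type)."""
    when: datetime
    media_id: str
    drive_index: int
    kind: str  # for example WRITE_ERROR or READ_ERROR


def parse_entry(line: str) -> TapeError:
    """Parse a line in the format shown in the examples in this topic."""
    date, time, media_id, drive_index, kind = line.split()
    when = datetime.strptime(f"{date} {time}", "%m/%d/%y %H:%M:%S")
    return TapeError(when, media_id, int(drive_index), kind)


def decide(history, new, time_window_hours=12,
           media_error_threshold=2, drive_error_threshold=2):
    """Return FREEZE_MEDIA, DOWN_DRIVE, or NO_ACTION for a new error.

    Assumption: an action is taken when the number of matching errors
    in the window, including the new one, exceeds the threshold.
    """
    cutoff = new.when - timedelta(hours=time_window_hours)
    # Keep only past errors of the same type inside the time window.
    recent = [e for e in history
              if cutoff <= e.when <= new.when and e.kind == new.kind]
    media_count = 1 + sum(e.media_id == new.media_id for e in recent)
    drive_count = 1 + sum(e.drive_index == new.drive_index for e in recent)
    # The media threshold is checked first, so a freeze overrides a down.
    if media_count > media_error_threshold:
        return "FREEZE_MEDIA"
    if drive_count > drive_error_threshold:
        return "DOWN_DRIVE"
    return "NO_ACTION"


# Three different volumes fail on drive 4 within 12 hours.
history = [
    parse_entry("07/21/96 04:15:17 A00167 4 WRITE_ERROR"),
    parse_entry("07/21/96 06:02:40 A00168 4 WRITE_ERROR"),  # invented entry
]
new = parse_entry("07/21/96 09:30:05 A00169 4 WRITE_ERROR")  # invented entry
print(decide(history, new))  # prints DOWN_DRIVE
```

Under these assumptions the sketch reproduces the three cases described above: one volume that fails on several drives is frozen, several volumes that fail on one drive cause the drive to be downed, and repeated failures of one volume on one drive freeze the volume because the media count is checked first. With both thresholds set to 0, the first error results in a freeze.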

For further tuning information on tape backup, see the following topics:

See About the threshold for media errors.
