Veritas NetBackup™ for Hadoop Administrator's Guide

Protecting Hadoop data using NetBackup

NetBackup protects Hadoop data by using the NetBackup Parallel Streaming Framework (PSF).

The following diagram provides an overview of how Hadoop data is protected by NetBackup.

Also, review the definitions of the terms used in this guide. See NetBackup for Hadoop terminologies.

Figure: Architectural overview

As illustrated in the diagram:

  • The data is backed up in parallel streams: the DataNodes stream data blocks simultaneously to multiple backup hosts. Using multiple backup hosts and parallel streams accelerates job processing.

  • Communication between the Hadoop cluster and NetBackup is enabled by the NetBackup plug-in for Hadoop.

    The plug-in is installed as part of the NetBackup installation.

  • To enable NetBackup communication, configure a BigData policy and add the related backup hosts.

  • You can configure a NetBackup media server, client, or master server as a backup host. Also, depending on the number of DataNodes, you can add or remove backup hosts. You can scale up your environment easily by adding more backup hosts.

  • The NetBackup Parallel Streaming Framework enables agentless backup: the backup and restore operations run on the backup hosts, so there is no agent footprint on the cluster nodes. NetBackup is also unaffected by Hadoop cluster upgrades or maintenance.
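
The parallel-stream idea described in the list above can be illustrated with a small, purely conceptual sketch. It is not NetBackup code and does not reflect how PSF actually distributes work. The function names (assign_blocks, stream_to_host, run_backup), the host names, and the round-robin assignment are all assumptions made for illustration. The sketch only shows why adding backup hosts spreads the data across more concurrent streams.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def assign_blocks(blocks, backup_hosts):
    """Assign blocks to backup hosts round-robin (illustrative assumption only)."""
    if not backup_hosts:
        raise ValueError("at least one backup host is required")
    plan = defaultdict(list)
    for index, block in enumerate(blocks):
        host = backup_hosts[index % len(backup_hosts)]
        plan[host].append(block)
    return dict(plan)


def stream_to_host(host, blocks):
    """Stand-in for one backup stream: returns the host and the total size it received."""
    return host, sum(size for _name, size in blocks)


def run_backup(blocks, backup_hosts):
    """Run one simulated stream per backup host concurrently."""
    plan = assign_blocks(blocks, backup_hosts)
    # One worker per host that has work; at least one so an empty plan is valid.
    with ThreadPoolExecutor(max_workers=max(1, len(plan))) as pool:
        futures = [pool.submit(stream_to_host, h, b) for h, b in plan.items()]
        return dict(future.result() for future in futures)


if __name__ == "__main__":
    # Hypothetical block names and sizes (in MB); not real HDFS data.
    demo_blocks = [("blk_1", 128), ("blk_2", 128), ("blk_3", 64), ("blk_4", 128)]
    print(run_backup(demo_blocks, ["host-a", "host-b"]))
```

In this sketch, passing a longer list of hypothetical backup hosts to run_backup lowers the amount of data each simulated stream handles, which mirrors the scaling behavior the list describes.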

For more information:

  • See Backing up Hadoop data.

  • See Restoring Hadoop data.

  • See Limitations.

  • For information about the NetBackup Parallel Streaming Framework (PSF), refer to the NetBackup Administrator's Guide, Volume I.
