We are a few nights away from an exciting new release of Arcserve UDP 6.5, but before I explain the new features I thought it would be a good idea to start with the basics and create a UDP blog series. In today’s blog I will explain the components of Arcserve UDP and how our image-based backup, i2 and dedupe work.
To start off, I will explain the components. This won’t be too technical, as I will go more in depth on certain roles in later posts.
Arcserve UDP consists of the following components:
- Unified Management Console
- Recovery Point Server (RPS)
- Gateway server (Optional)
- Proxy
- Agents
The console is a web-based GUI for managing the UDP solution. In small environments the console tends to be installed on the RPS server itself. For larger deployments, customers choose to separate the console from the RPS server.
The RPS server is the heart of UDP: it is the main backup server. The RPS has many features, such as deduplication, replication, virtual standby and instant VM, and many of these will be explained in later blogs.
The gateway is used at remote sites (often in combination with an RPS server). The gateway connects to the main console and sets up a secure link between the console and the remote site, which enables the administrator to manage remote sites from a single console.
The proxy is used for agentless backup of VMware / Hyper-V environments. It enables single-pass backup of all Windows and non-Windows virtual machines without installing agents on each virtual machine.
Agents are used for physical servers or servers that cannot be protected agentlessly (such as RDM virtual machines or non-VMware / Hyper-V environments like KVM). Agents can be installed remotely via the console and do not need any reboots.
Image based backup
The main focus of today’s post is image-based i2 backup with deduplication.
When doing a backup with UDP, the first backup will always be a full backup. This is the only full backup you will take of a protected node; after this full backup our i2, or infinite incremental, will kick in. So how does this work?
When a backup is started, the specified volume is divided into a number of subordinate data blocks. The first full backup is considered the parent backup and establishes the baseline blocks to be monitored. When the next backup kicks in, the proxy collects Changed Block Tracking (CBT) data for an agentless backup, or an internal monitoring driver checks for changed blocks in an agent backup. UDP then backs up only those blocks that have changed since the last backup. These incremental backups (child backups) can be scheduled as frequently as every 15 minutes.
If a restore is needed, the most recent backup version of each block is located and the volume is rebuilt using these blocks. For example, suppose a full backup is taken, followed by 4 hourly incremental backups. A restore after 4 hours is done using the most recent version of each block.
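To make the parent/child idea concrete, here is a minimal Python sketch of block-level incremental backup and restore. It is only an illustration: the function names, the tiny 4-byte block size and the comparison of two in-memory copies are my own assumptions for the demo, not how UDP is implemented (UDP learns the changed blocks from CBT or its monitoring driver).

```python
# Illustrative sketch only; all names here are hypothetical, not UDP APIs.
BLOCK_SIZE = 4  # bytes per block, deliberately tiny for the demo


def split_blocks(volume: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Divide a volume into fixed-size blocks."""
    return [volume[i:i + block_size] for i in range(0, len(volume), block_size)]


def full_backup(volume: bytes) -> dict[int, bytes]:
    """Parent backup: store every block, keyed by block index."""
    return dict(enumerate(split_blocks(volume)))


def incremental_backup(previous: bytes, current: bytes) -> dict[int, bytes]:
    """Child backup: store only the blocks that changed since the last backup.

    Comparing two full copies stands in for CBT / a change-tracking driver.
    Assumes the volume does not shrink between backups.
    """
    old = split_blocks(previous)
    new = split_blocks(current)
    return {i: b for i, b in enumerate(new) if i >= len(old) or old[i] != b}


def restore(parent: dict[int, bytes], children: list[dict[int, bytes]]) -> bytes:
    """Rebuild the volume from the most recent version of each block."""
    latest = dict(parent)
    for child in children:  # oldest to newest, so newer blocks win
        latest.update(child)
    return b"".join(latest[i] for i in sorted(latest))


v0 = b"AAAABBBBCCCCDDDD"  # state at the full backup
v1 = b"AAAAXXXXCCCCDDDD"  # one block changed
v2 = b"AAAAXXXXCCCCYYYY"  # another block changed

parent = full_backup(v0)
children = [incremental_backup(v0, v1), incremental_backup(v1, v2)]
restored = restore(parent, children)
```

Each child holds a single changed block here, yet the restore still returns the full latest state of the volume.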
So far, this is a fairly standard incremental backup. Now for the smart part: i2, i.e. infinite incremental. As said earlier, with UDP there is no need to perform a full backup ever again. To explain how this is done, I copied the next section from our manual:
“If left alone, the incremental snapshots (backups) would continue, as often as 96 times each day (every 15 minutes). These periodic snapshots will accumulate a large chain of backed up blocks to be monitored each time a new backup is performed, and require added space to store these ever-growing backup images. To minimize this potential problem, Arcserve UDP Agent (Windows) utilizes the Infinite Incremental Backup process, which intelligently creates incremental snapshot backups forever (after the initial full backup) and uses less storage space, performs faster backups, and puts less load on your production servers. Infinite Incremental Backups allow you to set a limit for the number of incremental child backups to be stored.
When the specified limit is exceeded, the earliest (oldest) incremental child backup is merged into the parent backup to create a new baseline image consisting of the “parent plus oldest child” blocks (unchanged blocks will remain the same). This cycle of merging the oldest child backup into the parent backup repeats for each subsequent backup, allowing you to perform Infinite Incremental (I2) snapshot backups while maintaining the same number of stored (and monitored) backup images.”
In UDP, the specified limit referred to in the manual is also called the retention point or recovery point. Retention can be set on daily, weekly, monthly and custom schedules. When the oldest recovery point expires, it is merged into the parent.
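The merge cycle from the manual excerpt can be sketched in a few lines of Python. This is a simplified model under my own assumptions: backups are plain dictionaries of block index to block data, and `apply_retention` and `limit` are hypothetical names, not UDP settings.

```python
# Illustrative sketch only; names are hypothetical, not UDP APIs.
Backup = dict[int, bytes]  # block index -> block data


def apply_retention(
    parent: Backup, children: list[Backup], limit: int
) -> tuple[Backup, list[Backup]]:
    """Merge the oldest children into the parent until at most `limit` remain."""
    parent = dict(parent)      # work on copies, leave the inputs untouched
    children = list(children)
    while len(children) > limit:
        oldest = children.pop(0)
        # Changed blocks replace the parent's copy; unchanged blocks stay as-is.
        parent.update(oldest)
    return parent, children


parent = {0: b"AAAA", 1: b"BBBB", 2: b"CCCC"}
children = [{1: b"XXXX"}, {2: b"YYYY"}, {0: b"ZZZZ"}]  # oldest first

new_parent, kept = apply_retention(parent, children, limit=2)
```

With a limit of 2, the oldest child is folded into the parent, so the new baseline is "parent plus oldest child" and two children remain.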
On top of i2 backup, Arcserve UDP also provides global deduplication. For those who don’t know what deduplication is: it is a technique for eliminating duplicate copies of repeating data. In simple words, deduplication removes the repeated blocks and keeps a reference to the existing block.
This helps customers save even more of the storage space needed for backup and replication. Deduplication is enabled on the RPS server and consists of 4 destination folders:
- Data store folder – This is where the recovery point folder structure will be created
- Data folder – This folder contains the actual deduplicated data
- Index folder – Contains the mapping information for matching the dedupe blocks to the recovery points
- Hash folder – Contains the hashes for each deduplicated block
I always recommend storing the hash folder on a low-cost SSD to increase performance. The hashes can also be held in memory, although memory is more expensive.
During the setup of a data store you can set the dedupe block size to 4KB, 8KB, 16KB (default) or 32KB.
During the backup, a hash is created for each block and sent to the RPS server. The RPS server checks whether the hash already exists. If it does, the data is not copied and only the index is updated; if it does not, the data is copied and the index is updated.
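As a rough feel for what the block size means, the small calculation below counts how many fixed-size blocks (and therefore how many hashes, at most) 1 TiB of source data splits into. It is plain arithmetic with a helper name of my own (`block_count`), and it ignores the savings from deduplication itself.

```python
# Illustrative arithmetic only; block_count is a hypothetical helper.
TIB = 1024 ** 4
KIB = 1024


def block_count(data_bytes: int, block_size_kb: int) -> int:
    """Number of fixed-size blocks needed to cover the data, rounding up."""
    block_bytes = block_size_kb * KIB
    return -(-data_bytes // block_bytes)  # ceiling division


counts = {kb: block_count(TIB, kb) for kb in (4, 8, 16, 32)}
for kb, n in counts.items():
    print(f"{kb:>2} KB blocks: {n:,}")
```

Halving the block size doubles the number of blocks to hash and track, which is the trade-off to keep in mind when choosing a value.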
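The check-then-copy flow above can be sketched as follows. This is a minimal in-memory model, not UDP code: the `DedupeStore` class is hypothetical, its three attributes only loosely mirror the hash, data and index folders, and SHA-256 is my assumption since the hash algorithm is not named here.

```python
import hashlib


class DedupeStore:
    """Toy deduplicating store; a hypothetical stand-in for an RPS data store."""

    def __init__(self) -> None:
        self.hashes: set[str] = set()          # role of the hash folder
        self.data: dict[str, bytes] = {}       # role of the data folder
        self.index: dict[str, list[str]] = {}  # role of the index folder

    def backup(self, recovery_point: str, blocks: list[bytes]) -> int:
        """Store the blocks for a recovery point; return how many were copied."""
        copied = 0
        refs = []
        for block in blocks:
            digest = hashlib.sha256(block).hexdigest()  # assumed algorithm
            if digest not in self.hashes:
                # New block: copy the data and remember its hash.
                self.hashes.add(digest)
                self.data[digest] = block
                copied += 1
            # Known or new, the index always gets a reference to the block.
            refs.append(digest)
        self.index[recovery_point] = refs
        return copied

    def restore(self, recovery_point: str) -> bytes:
        """Rebuild a recovery point by following its index references."""
        return b"".join(self.data[h] for h in self.index[recovery_point])


store = DedupeStore()
first = store.backup("node1-rp1", [b"AAAA", b"BBBB", b"AAAA"])
second = store.backup("node2-rp1", [b"BBBB", b"CCCC"])
```

Because the hash set is shared across nodes, the second node only copies the one block the store has not seen before, which is the idea behind global deduplication.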
(Source: Arcserve bookshelf UDP Manual)