VMCE v9 Study Guide – Module 3 – Core Components – Backup Repository

Continuing along with the core components section of Module 3, we will now look at the backup repository, both the basic type as well as the new Scale-Out Backup Repository which was introduced in v9.

So what is a backup repository?

This is where our backup data resides.  It actually holds more than just VM backups – it keeps backup chains, VM copies, and metadata for our replicated VMs.  There are three types of backup repositories in Veeam:

1. Simple Backup Repository

Typically a simple backup repository is just a folder or directory located on the backup storage where we can store our backups.  We can have multiple backup repositories and assign them to different jobs in order to limit the number of simultaneous jobs each one is processing, helping to spread the load.  A Simple Backup Repository can be installed on:

  • Windows server with local or direct attached storage – storage can be a local disk, direct attached disk (USB Drive) or an iSCSI/FC LUN mounted to the box.  Can be physical or virtual.  When a Windows based repository is added the data mover service is installed and utilized to connect to whatever proxy is sending the backup data, helping to speed up the transfer and processing of data.  Windows repositories can also be configured to run vPower, giving them the ability to mount their backups directly to ESXi hosts over NFS.
  • Linux server with local, DAS, or mounted NFS storage – Similar to Windows, we can use a Linux instance with directly attached storage, iSCSI/FC LUNs, or mounted NFS shares.  When a task addresses a Linux target, the data mover service is deployed and run, again establishing a connection to the source proxy.
  • CIFS or SMB share – An SMB share can be utilized to store your Veeam backups, however it doesn’t have the ability to run the data mover service.  In this case, the gateway server (explained later) will be used to retrieve and write data to the SMB share.  This affects your deployment: if you are writing to an SMB share at a remote location, you may want to deploy a gateway server at that site in order to help performance.
  • Deduplicated storage appliance – Veeam does support EMC Data Domain, ExaGrid and HPE StoreOnce as backup repositories as well.

Interesting tidbits around simple backup repositories

  • Data Domain does not necessarily improve performance, but reduces load on network
  • Data Domain does not support reverse incremental, and incremental backup chains on it cannot exceed 60 restore points.
  • ExaGrid jobs actually achieve a lower deduplication ratio when using multi-task processing.  It’s better to do a single task at a time.
  • When using StoreOnce, Veeam needs the Catalyst agent installed on the gateway server.
  • HPE StoreOnce always uses per-VM backup files
  • HPE StoreOnce does not support reverse incremental nor does it support the defrag and compact full backup options.
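
The Data Domain and StoreOnce restrictions above can be turned into a quick pre-flight check when planning a job.  This is only a study aid, not anything Veeam ships: the `LIMITS` table and the `check_job` function are made-up names, and a `None` chain limit simply means this guide does not state one.

```python
# Hypothetical lookup table built only from the tidbits in this guide.
LIMITS = {
    "Data Domain": {"reverse_incremental": False, "max_chain_points": 60},
    "StoreOnce": {"reverse_incremental": False, "max_chain_points": None},
}


def check_job(appliance, reverse_incremental, chain_points):
    """Return a list of problems for a planned job (empty list means OK)."""
    limits = LIMITS[appliance]
    problems = []
    if reverse_incremental and not limits["reverse_incremental"]:
        problems.append("reverse incremental is not supported")
    cap = limits["max_chain_points"]
    if cap is not None and chain_points > cap:
        problems.append(f"chain exceeds {cap} restore points")
    return problems


if __name__ == "__main__":
    print(check_job("Data Domain", reverse_incremental=True, chain_points=61))
```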

2. Scale-Out Backup Repository

The Scale-Out Backup Repository (SOBR) essentially takes several similar simple repositories and groups them together into one large pooled backup repository.  This way, as you approach your capacity within the SOBR, you can simply add another repository, or extent, to the pool, increasing your overall capacity.

When a simple backup repository is added as an extent to a SOBR, Veeam creates a definition.erm file.  This file contains all of the descriptive information about the SOBR and its respective extents.

One setting that must be set up on a SOBR is the Backup file placement policy.  This basically determines how the backup files will be distributed between extents.  There are two Backup file placement policies available:

  1. Data Locality
    • All backup files which belong to the same chain will be stored on the same extent.
    • A new full backup could reside on another extent, in which case the incrementals that follow it would also be placed on this new extent – whereas the old full and old incrementals would remain on the original extent.
  2. Performance
    • Full and incremental backups that belong to the same chain are stored on different extents.
    • Improves performance on transforms if raw devices are in use as it spreads the I/O load across extents.
    • If an extent containing any part of a targeted backup chain is missing, Veeam will not be able to perform the backup.  That said, you can set the ‘Perform full backup when required extent is offline’ setting in order to have a full backup performed in the event it can’t piece together the chain, even if an incremental is scheduled.

All this said, the placement policy is not strict – Veeam will always try and complete a backup on another extent with enough free space if an extent is not available, even if you have explicitly said to place full backups on a certain extent.

When selecting extents to place backups, Veeam goes through the following processes.

  1. Looks at the availability of extents and their backup files.  If an extent containing part of the chain is not available, Veeam triggers a full backup to a different extent
  2. It then takes into consideration the backup placement policy
  3. Then it looks at free space on the extents – the backup file is placed on the extent with the most free space.
  4. Availability of the backup files from the chain, meaning an extent that has incremental backups from the current backup chain will have a higher priority than an extent that doesn’t
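
To make the ordering of these checks easier to remember, here is a rough sketch of the selection logic.  It is not Veeam's actual algorithm: it assumes the Data Locality policy, the `Extent` class and `choose_extent` function are invented for illustration, and the exact way chain priority and free space are combined is my own simplification.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    name: str
    online: bool
    free_gb: float
    holds_chain: bool  # already stores files from the current backup chain


def choose_extent(extents, needed_gb, want_full):
    """Return (extent_name or None, do_full) for the next backup file."""
    # Step 1: if part of the chain sits on an offline extent, force a full.
    chain_broken = any(e.holds_chain and not e.online for e in extents)
    do_full = want_full or chain_broken

    # Only online extents with enough estimated free space are candidates.
    candidates = [e for e in extents if e.online and e.free_gb >= needed_gb]
    if not candidates:
        return None, do_full

    # Steps 2-4 (Data Locality assumed): an incremental prefers the extent
    # that already holds its chain; ties are broken by the most free space.
    best = max(
        candidates,
        key=lambda e: (e.holds_chain and not do_full, e.free_gb),
    )
    return best.name, do_full


if __name__ == "__main__":
    pool = [Extent("ext-a", True, 500, False), Extent("ext-b", True, 200, True)]
    print(choose_extent(pool, needed_gb=50, want_full=False))
```

With both extents online, the incremental stays with its chain on `ext-b` even though `ext-a` has more free space; take `ext-b` offline and the same call falls back to a full backup on `ext-a`.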

At the start of a job, Veeam estimates how much space a backup file will require and compares that to what is available on the extents.  It does this in a couple of different ways depending on your backup file settings.

  • Per-VM Backup Chains – In determining the full backup file size it calculates by taking 50% of the source VM size.  Incrementals are 10% of the source VM size
  • Single File Backup Chain – The size of the full is equal to 50% of the source VMs in the job.  The first incremental is determined by taking 10% of the source VMs size – subsequent incrementals are equal to that of the size of the incremental before them.
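
The two estimation rules above work out as follows.  This is just the arithmetic from the bullets written as a sketch; the function names are made up, and sizes are in GB purely for the example.

```python
def estimate_per_vm(vm_size_gb, is_full):
    """Per-VM chains: full = 50% of the VM size, incremental = 10%."""
    percent = 50 if is_full else 10
    return vm_size_gb * percent / 100


def estimate_single_file(vm_sizes_gb, is_full, previous_incremental_gb=None):
    """Single-file chains: sizes are based on all source VMs in the job."""
    total = sum(vm_sizes_gb)
    if is_full:
        return total * 50 / 100
    if previous_incremental_gb is None:
        # First incremental in the chain: 10% of the source VMs.
        return total * 10 / 100
    # Later incrementals: same size as the incremental before them.
    return previous_incremental_gb


if __name__ == "__main__":
    print(estimate_per_vm(100, is_full=True))
    print(estimate_single_file([100, 300], is_full=False))
```

So a job with a 100 GB and a 300 GB VM in a single-file chain would need an estimated 200 GB free for the full and 40 GB for the first incremental.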

Extents within a SOBR also have some service actions that can be performed as explained below

  • Maintenance Mode – This is mainly used if you need to perform some kind of maintenance on the server hosting the underlying extent such as adding memory or replacing hardware.  When an extent is in maintenance mode you cannot perform any tasks targeted at the extent nor can you restore any data that resides on this extent or backup chains that have data on the extent.  When entering maintenance mode Veeam first checks to see if any jobs are currently using the extent.  If they aren’t, it immediately goes into maintenance mode – if they are, it gets placed into a Maintenance pending state and waits for the tasks to complete, once done, it enters maintenance mode.
  • Backup Files Evacuation – This is used if you would like to remove an extent from a SOBR that contains backup files.  When doing this, Veeam moves the backup files on this extent to other extents that belong to the same SOBR.  Before evacuating, you must first place extents into maintenance mode.  Veeam attempts to abide by its placement policies when looking where to place the evacuated backup files.
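
The maintenance mode behaviour is really a small state machine, which can be sketched like this.  The state names mirror the wording above, but the functions themselves are hypothetical and only model the rules in this guide.

```python
def request_maintenance(active_tasks):
    """State an extent enters when maintenance mode is requested."""
    # No running tasks: maintenance starts immediately.
    # Otherwise the extent waits in a pending state.
    return "maintenance" if active_tasks == 0 else "maintenance pending"


def on_task_finished(state, active_tasks):
    """Re-evaluate the state each time a task on the extent completes."""
    if state == "maintenance pending" and active_tasks == 0:
        return "maintenance"
    return state


def can_evacuate(state):
    """Backup files can only be evacuated from an extent in maintenance mode."""
    return state == "maintenance"


if __name__ == "__main__":
    state = request_maintenance(active_tasks=2)
    state = on_task_finished(state, active_tasks=1)
    state = on_task_finished(state, active_tasks=0)
    print(state, can_evacuate(state))
```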

Some interesting tidbits around SOBR

  • Extents can be mixed and matched, meaning we can have Windows repositories, Linux repositories and dedup appliances all providing storage for one SOBR.
  • Used for Backup, Backup Copy, and VeeamZIP jobs only – note the difference – no configuration backups or replication metadata are stored on a SOBR.  If you try to add a repository as an extent while it is configured inside of any other jobs, it will not add – you will first need to target these jobs to another repository.  Furthermore, if a backup repository is configured as a SOBR extent, you will not be able to use it for any other jobs.
  • Only Available in Enterprise and Enterprise Plus, however Enterprise does have limitations.  Only one SOBR can be created, and can only contain 3 extents.  If you downgrade licenses while you have a SOBR you will still be able to restore from it, but jobs targeted at it will no longer run.
  • When a backup repository is converted to an extent the following information is inherited to the extent
    • Number of Simultaneous tasks
    • Read and write data limit
    • Data compression settings
    • Block alignment
    • Limitations on the underlying repository – EMC Data Domain has a backup chain limit of 60 points, therefore if we use this as an extent in our SOBR, our SOBR will have the same chain limit.
    • Settings that are not inherited include any rotated drive settings as well as Per-VM backup file settings.  Per VM needs to be configured globally on the SOBR.

3. Rotated Drive Backup Repositories

Backup repositories can also use rotated drives.  Think of storing backups on external USB drives which you regularly swap in and out to take offsite.  This is set up by using the ‘This repository is backed by rotated drives’ option on the backup repository.

A backup that targets rotated drives goes through the following process.

  1. Veeam creates the backup chain on whatever drive is currently attached
  2. Upon a new session, Veeam checks if the backup chain on the currently connected drive is consistent, meaning it has a full backup as well as subsequent incrementals to restore from.  If the drives had been swapped, or the full/incremental backups are missing from the drive then Veeam will start a new chain, creating a new full backup on the drive which will then be used for subsequent incrementals.  If it is a backup copy job Veeam simply creates a new incremental and adds it to the chain.
  3. For any external drives attached to Windows Servers Veeam will process any outdated restore points from the retention settings and remove them from the drive if need be.
  4. When any original drives get added back into the mix, Veeam repeats this process creating full backups if need be.
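
The decision made in step 2 boils down to a couple of conditions.  Here is a minimal sketch of it; the function name and the job type strings are invented for the example, and it only covers what is described above.

```python
def next_restore_point(job_type, chain_consistent):
    """Decide what a new session writes to the currently attached drive.

    job_type: "backup" or "backup copy" (labels made up for this sketch).
    chain_consistent: True if the drive holds a full backup plus the
    subsequent incrementals needed to restore from.
    """
    if job_type == "backup copy":
        # Backup copy jobs simply add a new incremental to the chain.
        return "incremental"
    # Regular backup jobs start a new chain with a full backup when the
    # drive was swapped or files are missing.
    return "incremental" if chain_consistent else "full"


if __name__ == "__main__":
    print(next_restore_point("backup", chain_consistent=False))
    print(next_restore_point("backup copy", chain_consistent=False))
```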

Interesting tidbits about repositories backed by rotated drives

  • Veeam can remember and keep track of drives on Windows Servers even if the drive letter changes.  It does this by storing a record about the drive within its configuration database.
    • When a drive is first inserted Veeam has no idea about it, so it must have the exact same letter that is associated in the path to folder setting on the repository.  After this, Veeam stores the information in regards to the drive in the database.
    • After reinserting a drive that is already in the configuration database, Veeam will still use this successfully, even if the drive letter doesn’t match that of the path to folder.
  • GFS Full Backups cannot be created with Backup Copy jobs on rotated drives
  • Per-VM backup files are not supported on rotated drives
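
The drive letter behaviour on Windows described above can be pictured with a small sketch.  How Veeam actually identifies a drive is not covered in this guide, so `drive_id` is a stand-in for whatever record it keeps in the configuration database, and the function name is made up.

```python
def drive_is_usable(known_drives, drive_id, current_letter, configured_letter):
    """Return True if the inserted drive can be used by the repository.

    known_drives: set of drive ids already recorded (stand-in for the
    configuration database).
    """
    if drive_id in known_drives:
        # Already recorded: the drive letter no longer has to match.
        return True
    if current_letter == configured_letter:
        # First insertion: the letter must match the path to folder
        # setting, and the drive is then remembered.
        known_drives.add(drive_id)
        return True
    return False


if __name__ == "__main__":
    known = set()
    print(drive_is_usable(known, "usb-1", "F", "E"))
    print(drive_is_usable(known, "usb-1", "E", "E"))
    print(drive_is_usable(known, "usb-1", "G", "E"))
```

The first call fails because the drive is unknown and the letter is wrong, the second registers the drive, and the third succeeds even though the letter has changed.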