Mirroring in 2012.10 and 2014.01 Release Series

Table of Contents

  1. General Notes on Mirroring
  2. Enabling and Disabling Mirroring
  3. Restoring Data from a Mirror

General Notes on Mirroring

Starting with the 2012.10 release series, FhGFS supports mirroring of metadata and file contents. Mirroring capabilities are integrated into the normal FhGFS services, so no separate services or third-party tools are needed. The two types of mirroring (metadata and file contents) can be used independently of each other.

The initial mirroring implementation enables data restore after a storage target or metadata server is lost. It does not yet provide failover or high availability capabilities, so the file system will not automatically remain accessible while a server is down. Those features will be added in later FhGFS releases.

Mirroring can be enabled on a per-directory basis, so some data in the file system can be mirrored while other data is not.

Primary servers/targets and mirror servers/targets are chosen individually for each new directory/file. Thus, there are no passive (i.e. mirror-only) servers/targets in a file system instance.

The granularity of metadata mirroring is a complete directory: the entries of a single directory are always mirrored on the same mirror server, but metadata for different directories may be mirrored on different metadata servers. The granularity of file contents mirroring is an individual file: each file has its own set of primary and mirror targets, even if the files are in the same directory.

Both types of mirroring can be used with an even or odd number of servers or storage targets.

Metadata is mirrored asynchronously, i.e. the client operation completes before the redundant copy has been transferred to the mirror server.
File contents are mirrored synchronously, i.e. the client operation completes only after both copies of the data have been transferred to the servers. File contents mirroring is based on a hybrid client- and server-side approach, where the client decides whether it sends the mirror copy directly to the other server or whether the storage server is responsible for forwarding the data to the mirror.

Enabling and Disabling Mirroring

By default, mirroring is disabled for a new file system instance. Both types of mirroring can be enabled with the fhgfs-ctl command line tool. (The fhgfs-ctl tool is contained in the fhgfs-utils package and is usually run from a client node.)

The mirroring settings of a directory are applied to new file entries and are inherited by new subdirectories. For instance, if metadata mirroring is enabled for directory /mnt/fhgfs/mydir1, then a new subdirectory /mnt/fhgfs/mydir1/mydir2 will also automatically have metadata mirroring enabled.

To enable metadata mirroring for a certain directory, see the built-in help of the fhgfs-ctl tool:
 $ fhgfs-ctl --mirrormd --help 

Metadata mirroring currently cannot be disabled after it has been enabled for a certain directory. However, it is possible to create a new subdirectory without metadata mirroring. See:
 $ fhgfs-ctl --createdir --nomirror --help 

To enable file contents mirroring for a certain directory, see the built-in help of the fhgfs-ctl tool:
 $ fhgfs-ctl --setpattern --raid10 --help 

File contents mirroring can be disabled afterwards by using the fhgfs-ctl mode --setpattern without the --raid10 option. However, files that were created while mirroring was enabled will remain mirrored.

To check the metadata and file contents mirroring settings of a certain directory or file, use:
 $ fhgfs-ctl --getentryinfo /mnt/fhgfs/mydir/myfile 
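
To check several entries at once, for example a directory and a subdirectory that should have inherited its settings, the command above can be wrapped in a small loop. This is only a minimal sketch: the function name show_entry_info and the FHGFS_CTL variable are not part of FhGFS and are introduced here purely for illustration.

```shell
#!/bin/sh
# Print the mirroring settings of several entries in one go.
# FHGFS_CTL is a hypothetical override (not an FhGFS setting) so the loop
# can be dry-run, e.g. with FHGFS_CTL=echo, where fhgfs-utils is missing.
FHGFS_CTL=${FHGFS_CTL:-fhgfs-ctl}

show_entry_info() {
    for entry in "$@"; do
        # Header line so the output of each entry can be told apart
        echo "== $entry"
        "$FHGFS_CTL" --getentryinfo "$entry"
    done
}
```

It could then be called as show_entry_info /mnt/fhgfs/mydir1 /mnt/fhgfs/mydir1/mydir2.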

Restoring Data from a Mirror

Mirrored data is stored in a subdirectory named mirror inside the normal storage directory of other servers.

For example, if you have three storage servers with target IDs 1, 2, and 3 and have lost storage server 1, then you would have directories named <storage_path>/mirror/1.chunks on the other two storage servers 2 and 3.

To restore storage target 1, copy the contents of those directories to the storage directory of target 1, leaving out the leading ID part "1." of the directory name. With the scp tool, this would be:
storage2$ scp -rp <storage_path>/mirror/1.chunks/* storage1:<storage_path>/chunks/
storage3$ scp -rp <storage_path>/mirror/1.chunks/* storage1:<storage_path>/chunks/
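
The naming rule used above (drop the leading "<ID>." from the mirror directory name to get the destination name) can be sketched as a small shell helper. The function names and the path /data/fhgfs_storage are hypothetical placeholders for this illustration. The helper only prints the scp command for review and does not execute it.

```shell
#!/bin/sh
# Map a mirror directory name such as "1.chunks" to its restore
# destination name ("chunks") by dropping the leading "<ID>." part.
strip_target_id() {
    printf '%s\n' "${1#*.}"
}

# Print (not run) the scp command to execute on one surviving server.
# Arguments: ID of the lost target, host of the lost target, storage path.
print_restore_cmd() {
    lost_id=$1 lost_host=$2 storage_path=$3
    src="$storage_path/mirror/$lost_id.chunks"
    dst="$storage_path/$(strip_target_id "$lost_id.chunks")/"
    printf 'scp -rp %s/* %s:%s\n' "$src" "$lost_host" "$dst"
}

# /data/fhgfs_storage is a placeholder for <storage_path>
print_restore_cmd 1 storage1 /data/fhgfs_storage
# prints: scp -rp /data/fhgfs_storage/mirror/1.chunks/* storage1:/data/fhgfs_storage/chunks/
```

Printing the command first makes it easy to verify source and destination before any data is copied to the replacement target.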

Like mirrored file contents, mirrored metadata is stored in a subdirectory named mirror inside the normal storage directory of other metadata servers. In this case, you would need to copy the contents of <storage_path>/mirror/<ID>.dentries and <storage_path>/mirror/<ID>.inodes to the storage directory of the lost server.

However, note that mirrored metadata uses hardlinks and extended attributes, which are not preserved when data is copied via scp. Thus, you would need to use a tool like tar or rsync to copy the data to the lost server. See the Metadata Backup/Restore Example page for examples of how to copy metadata while preserving hardlinks and extended attributes.
Depending on the amount of data to be copied and whether you want to preserve sparse files, tools other than scp might also be more appropriate for copying storage server chunk files.
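
As a rough illustration of the tar approach mentioned above, the following sketch copies a directory tree through a tar pipe. It assumes GNU tar, which keeps hardlinks between files in the same archive by default and stores extended attributes when --xattrs is given. The function name copy_preserving_links is hypothetical, and the sketch copies between two local directories only.

```shell
#!/bin/sh
# Copy a directory tree while keeping hardlinks and extended attributes.
# Assumes GNU tar: hardlinks within one archive are kept by default,
# and --xattrs stores and restores extended attributes.
copy_preserving_links() {
    src=$1 dst=$2
    mkdir -p "$dst"
    # Pack on the source side, unpack on the destination side;
    # -p keeps the file permissions.
    (cd "$src" && tar --xattrs -cpf - .) | (cd "$dst" && tar --xattrs -xpf -)
}
```

For a real restore, the unpacking side would run on the lost server (for example via ssh). With rsync, the options -H (hardlinks), -X (extended attributes) and -S (sparse files) serve the same purpose, provided the installed rsync was built with support for them.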

Integrated FhGFS tools that allow for a more convenient restore of lost data are already work in progress and will be added in later releases.
