Striping

Striping in BeeGFS can be configured on a per-directory and per-file basis. Each directory has a specific stripe pattern configuration, which is inherited by new subdirectories and applied to any file created inside the directory. There are currently two basic parameters that can be configured for stripe patterns: the desired number of storage targets for each file and the chunk size (or block size) for each file stripe.

The stripe pattern parameters can be configured with the command-line control tool beegfs-ctl. It lets you view or change the stripe pattern details of each file or directory in the file system at runtime.

The following command will show you the current stripe settings of your BeeGFS mount root directory (in this case /mnt/beegfs):

$ beegfs-ctl --getentryinfo /mnt/beegfs

Use the subcommand --setpattern to apply new striping settings to a directory. For example, to stripe files across 4 storage targets with a chunk size of 1 MB, run:

$ beegfs-ctl --setpattern --numtargets=4 --chunksize=1M /mnt/beegfs

Stripe settings will be applied to new files, not to existing files in the directory. To apply the pattern to existing files, recreate them by performing a deep copy.
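One way to recreate an existing file is to copy it to a temporary name in the same directory and then rename the copy over the original. The sketch below assumes that the file is not in use and has no hard links; the function name restripe_file and the temporary suffix are hypothetical and not part of BeeGFS.

```shell
# Hypothetical helper: recreate a file so that its data is written into a
# newly created file, which picks up the directory's current stripe pattern.
restripe_file() {
    src=$1
    tmp="$src.restripe.$$"      # temporary name in the same directory
    # Copy the data, preserving mode, ownership and timestamps.
    cp -p -- "$src" "$tmp" || { rm -f -- "$tmp"; return 1; }
    # Replace the original with the freshly written copy.
    mv -f -- "$tmp" "$src"
}

# Example call (the path is a placeholder):
#   restripe_file /mnt/beegfs/mydir/myfile
```

Running beegfs-ctl --getentryinfo on the file before and after is a simple way to check whether the new pattern was applied.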

Buddy Mirroring

If you have buddy mirror groups (see Mirroring) defined in your system, you can set the stripe pattern to use buddy groups as stripe targets, instead of individual storage targets. In order to do that, add the option --pattern=buddymirror to the command, as follows. In this particular example, the data will be striped across 4 buddy groups with a chunk size of 1 MB:

$ beegfs-ctl --setpattern --numtargets=4 --chunksize=1M --pattern=buddymirror /mnt/beegfs

To switch back to non-mirrored mode, set the pattern to raid0.

Impact on network communication

The data chunk size has an impact on the communication between client and storage servers in several ways.

When a process writes data to a file located on BeeGFS, the client identifies the storage targets that contain the data chunks to be modified (by querying the metadata servers) and sends modification messages containing the modified data to the corresponding storage servers. The maximum size of such messages is determined by the data chunk size of the file.

If you define chunksize=1M, 1 MB will be the maximum size of each message. If the amount of data written to the file is larger than the maximum message size, more messages will have to be sent to the servers and this may cause performance loss. So, slightly increasing the chunk size to a few MB has the effect of reducing the number of messages, and this can have a positive performance impact, even in a system with a single target.
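The effect of the chunk size on the message count can be estimated with a little shell arithmetic. This is a rough model only: it assumes the write starts on a chunk boundary and that each message carries at most one chunk. The function name messages_for_write is hypothetical.

```shell
# Estimate the number of write messages for a single write, assuming
# (simplification) a chunk-aligned write and at most one chunk per message.
messages_for_write() {
    write_bytes=$1
    chunk_bytes=$2
    # Integer ceiling division: round up to whole messages.
    echo $(( (write_bytes + chunk_bytes - 1) / chunk_bytes ))
}

# A 16 MB write with 1 MB chunks, then with 4 MB chunks.
messages_for_write $((16 * 1024 * 1024)) $((1024 * 1024))       # prints 16
messages_for_write $((16 * 1024 * 1024)) $((4 * 1024 * 1024))   # prints 4
```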

On the other hand, it is important to make sure that a data chunk fits in the RDMA buffers available on the client (see Client Node Tuning). Otherwise, the messages are split, which again increases the number of messages.

You also have to consider the file cache settings. When the client is using the buffered cache (tuneFileCacheType = buffered), it uses a file cache buffer of 512 KB to accumulate changes on the same data. This data is sent to the servers only when data from outside the boundaries of that buffer is needed by the client. So, the larger this buffer, the less communication will be needed between the client and the servers. You should set this buffer size to a multiple of the data chunk size. For example, adding tuneFileCacheBufSize = 2097152 to the BeeGFS client configuration file will raise the file cache buffer size to 2 MB.
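To keep the buffer size a whole multiple of the chunk size, the value can be computed rather than typed by hand. The helper below is a small sketch; the function name file_cache_buf_line is hypothetical, and it only prints the configuration line without modifying any file.

```shell
# Print a tuneFileCacheBufSize line for a buffer that holds a whole
# number of chunks (chunk size in bytes times the given multiple).
file_cache_buf_line() {
    chunk_bytes=$1
    multiple=$2
    echo "tuneFileCacheBufSize = $(( chunk_bytes * multiple ))"
}

# Two chunks of 1 MB each give the 2 MB buffer from the example above.
file_cache_buf_line $((1024 * 1024)) 2   # prints: tuneFileCacheBufSize = 2097152
```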
