Benchmarking Tools



Table of Contents

  1. Built-in Benchmarking Tools
    1. StorageBench
    2. NetBench Mode
  2. External Benchmarking Tools
    1. IOR
    2. mdtest
  3. Recommendations
 


Note: To avoid caching effects, sustained streaming benchmarks typically read/write at least two times the size of the server-side RAM.
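As a rough aid for sizing a run, the rule above can be turned into a small calculation. This is only a sketch: it assumes a Linux storage server where the RAM size is taken from the MemTotal line of /proc/meminfo (in kB), and the function name min_bench_gib is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: given RAM in kB (as reported by MemTotal in
# /proc/meminfo), print the minimum benchmark data size in GiB,
# i.e. two times the RAM, rounded up to a whole GiB.
min_bench_gib() {
    ram_kb=$1
    kb_per_gib=$((1024 * 1024))
    echo $(( (2 * ram_kb + kb_per_gib - 1) / kb_per_gib ))
}

# Usage on a storage server (assumes Linux /proc/meminfo):
#   min_bench_gib "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
```

With several storage servers, the value for the server with the most RAM is the safer basis for the amount of data written per server.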



Built-in Benchmarking Tools


BeeGFS includes a built-in storage targets benchmark (StorageBench) and a built-in network benchmark (NetBench).


StorageBench


The storage targets benchmark is intended to determine the maximum performance of BeeGFS on the storage targets and to detect defective or misconfigured storage targets.

This benchmark measures the streaming throughput of the underlying file system and devices independent of the network performance. To simulate client IO, this benchmark generates read/write work packages locally on the servers without any client communication.

Note that without any network communication, file striping cannot be simulated, so the benchmark results are comparable to client IO with striping disabled (i.e. one target per file).

It is possible to benchmark only specific targets or all targets together.

The storage benchmark is started and monitored with the beegfs-ctl tool.

The following example starts a write benchmark on all targets of all BeeGFS storage servers with an IO blocksize of 512 KB, using 10 threads (i.e. simulated client streams) per target, each of which will write 200 GB of data to its own file.
$ beegfs-ctl --storagebench --alltargets --write --blocksize=512K --size=200G --threads=10


To query the benchmark status/result of all targets, execute the command below.
$ beegfs-ctl --storagebench --alltargets --status


You can use the watch command to repeat the query at a given interval (in seconds), as shown below.
$ watch -n 5 beegfs-ctl --storagebench --alltargets --status


The generated files will not be automatically deleted when a benchmark is complete. You can delete them by using the following command.
$ beegfs-ctl --storagebench --alltargets --cleanup


More details about the storage benchmark and its options are available in the help of the beegfs-ctl tool, as follows.
$ beegfs-ctl --storagebench --help
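The commands in this section could be collected into a small wrapper so that a run can be previewed before it is started. The following is a minimal sketch, not part of BeeGFS: the function names and the DRY_RUN variable are made up for illustration, and only beegfs-ctl options that appear in this section are used.

```shell
#!/bin/sh
# Sketch of a StorageBench wrapper. Function names and the DRY_RUN
# variable are hypothetical; the beegfs-ctl options are the ones
# shown in this section.

# Print the command when DRY_RUN=1, otherwise execute it.
ctl() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "beegfs-ctl $*"
    else
        beegfs-ctl "$@"
    fi
}

# $1 = IO block size, $2 = data per thread, $3 = threads per target
sb_write()   { ctl --storagebench --alltargets --write --blocksize="$1" --size="$2" --threads="$3"; }
sb_status()  { ctl --storagebench --alltargets --status; }
sb_cleanup() { ctl --storagebench --alltargets --cleanup; }
```

For example, setting DRY_RUN=1 and calling sb_write 512K 200G 10 prints the write command from the example above without starting a benchmark.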


NetBench Mode


The netbench mode is intended for network streaming throughput benchmarking. In this mode, write and read requests are transmitted over the network from the client to the storage servers just as during normal operation (i.e. with netbench mode disabled). The difference is that with netbench mode enabled, the servers discard received write requests instead of submitting the received data to the underlying file system. The same applies vice versa to read requests: only memory buffers are sent to the clients instead of data read from the underlying file system on the servers.
Thus, this mode helps to detect slow network connections and can be used to test the maximum network throughput between the clients and the storage servers, as throughput in this mode is independent of the underlying disks.

To test streaming throughput, you can use any tool that writes data to the BeeGFS mountpoint, e.g. dd or IOR. (Note that due to write operations being discarded on the servers, written files will continue to have length 0 after writing, so it is normal that some benchmark tools might print a warning about the unexpected file size.)

All other operations, such as file creation and unlink, work normally with netbench mode enabled; only write and read operations are affected.

Netbench mode is enabled via the client runtime configuration in /proc/fs/beegfs. The following command will enable netbench mode for the particular client on which it is executed (other clients are not affected). Enabling it does not require a remount of the client; remounting the client resets netbench mode to disabled.
$ echo 1 > /proc/fs/beegfs/<clientID>/netbench_mode


Obviously, it is important to disable netbench mode after the benchmarking is done to re-enable normal reads and writes to the file system. This can be done at runtime via the following command.
$ echo 0 > /proc/fs/beegfs/<clientID>/netbench_mode


Note that this command will only affect the client on which it is executed. If you enabled netbench mode on multiple clients, you also have to run this command on all of those clients.
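Because a forgotten netbench mode silently discards writes, it can help to script the toggle. The sketch below sets the mode for every client instance found under the procfs directory described above. The function name and the optional second argument (an alternative root directory, useful for trying the function out) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: set netbench mode for every BeeGFS client instance found under
# a procfs root. The function name and the optional root argument are
# hypothetical; the root defaults to /proc/fs/beegfs as described above.
set_netbench_mode() {
    mode=$1                       # 1 = enable, 0 = disable
    root=${2:-/proc/fs/beegfs}
    for f in "$root"/*/netbench_mode; do
        [ -e "$f" ] || continue   # no client instance matched the glob
        echo "$mode" > "$f"
    done
}

# Typical use (as root on the client): enable, run the test, and make
# sure the mode is switched off again even if the test is interrupted.
#   set_netbench_mode 1
#   trap 'set_netbench_mode 0' EXIT INT TERM
#   dd if=/dev/zero of=/mnt/beegfs/netbench.tmp bs=1M count=10000
```

The function only affects the client on which it runs, so it still has to be executed on each client that takes part in the test.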


External Benchmarking Tools


This section shows some of the commonly used benchmarks for file IO and metadata performance.


IOR


IOR is a benchmark tool to measure the performance of a single client or multiple clients with one or more processes per client. IOR is based on MPI for distributed execution. It can be used to measure streaming throughput or small random IO performance (IOPS).

A fork of IOR with support for tuning BeeGFS parameters is available on the ThinkParQ GitHub page. Please install the beegfs-client-devel package before building to enable BeeGFS support.

The value for the number of processes ${NUM_PROCS} depends on the number of clients to test and the number of processes per client. The block size ${BLOCK_SIZE} can be calculated with ((3 * RAM_SIZE_PER_STORAGE_SERVER * NUM_STORAGE_SERVERS) / ${NUM_PROCS}).
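The block size formula can be evaluated directly in the shell. The following is a sketch with example values (64 GiB of RAM per storage server, 4 storage servers, 16 processes), all of which are assumptions; the helper name ior_block_size_gib is hypothetical. The result is rounded up to a whole GiB and printed with the g size suffix.

```shell
#!/bin/sh
# Sketch of the BLOCK_SIZE formula above, in whole GiB (rounded up).
# ior_block_size_gib is a hypothetical helper name.
# $1 = RAM per storage server in GiB, $2 = number of storage servers,
# $3 = number of IOR processes
ior_block_size_gib() {
    total=$((3 * $1 * $2))
    echo "$(( (total + $3 - 1) / $3 ))g"
}

# Example values (assumptions): 64 GiB RAM, 4 servers, 16 processes.
NUM_PROCS=16
BLOCK_SIZE=$(ior_block_size_gib 64 4 "$NUM_PROCS")
echo "$BLOCK_SIZE"
```

With these example values, each of the 16 processes handles 48 GiB, for a total of 768 GiB, which is three times the combined RAM of the storage servers.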

Multi-stream Throughput Benchmark
$ mpirun -hostfile /tmp/nodefile --map-by node -np ${NUM_PROCS} /usr/bin/IOR -wr -i5 -t2m -b ${BLOCK_SIZE} -g -F -e -o /mnt/beegfs/test.ior


Shared File Throughput Benchmark
$ mpirun -hostfile /tmp/nodefile --map-by node -np ${NUM_PROCS} /usr/bin/IOR -wr -i5 -t1200k -b ${BLOCK_SIZE} -g -e -o /mnt/beegfs/test.ior


Note: We've picked 1200k just as an example for a transfer size that is not aligned to the BeeGFS chunksize.
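To see why 1200k counts as unaligned, the transfer size can be checked against the chunk size. The sketch below assumes a chunk size of 512 KiB purely for illustration; the actual value depends on the stripe pattern of the test directory. It also includes a helper for rounding a block size up to a multiple of the transfer size, since IOR expects the block size (-b) to be a multiple of the transfer size (-t). The helper names are made up.

```shell
#!/bin/sh
# Sizes in KiB. The 512 KiB chunk size is an assumption for illustration;
# check the actual stripe pattern of the test directory.
CHUNK_KIB=512
TRANSFER_KIB=1200

# Succeeds (exit status 0) when $1 is a multiple of $2.
is_multiple() { [ $(( $1 % $2 )) -eq 0 ]; }

# Round $1 up to the next multiple of $2.
round_up() { echo $(( ($1 + $2 - 1) / $2 * $2 )); }

if is_multiple "$TRANSFER_KIB" "$CHUNK_KIB"; then
    echo "aligned"
else
    echo "unaligned"
fi
```

For a block size of 48 GiB (50331648 KiB), round_up 50331648 1200 gives 50332800 KiB, which could be passed to IOR as -b 50332800k.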

IOPS Benchmark
$ mpirun -hostfile /tmp/nodefile --map-by node -np ${NUM_PROCS} /usr/bin/IOR -w -i5 -t4k -b ${BLOCK_SIZE} -F -z -g -o /mnt/beegfs/test.ior


BeeGFS Tuning Parameters
-O beegfsNumTargets=<n> Number of storage targets to use for striping
-O beegfsChunkSize=<b> striping chunk size, in bytes. Accepts k=kilo, M=mega, G=giga, ...

mpirun Parameters
-hostfile $PATH (file with the hostnames of the clients/servers to benchmark)
-np $N (number of processes)

IOR Parameters
-w (write benchmark)
-r (read benchmark)
-i $N (repetitions)
-t $N (transfer size, for dd it is the block size)
-b $N (block size, amount of data for a process)
-g (use barriers between open, write/read, and close)
-e (perform fsync upon POSIX write close, to make sure reads are only started after all writes are done)
-o $PATH (path to file for the test)
-F (one file per process)
-z (random access to the file)

References
IOR project git repository: https://github.com/IOR-LANL/ior
IOR BeeGFS fork: https://github.com/ThinkParQ/ior
IOR project git repository (old): https://github.com/chaos/ior
IOR project homepage: http://sourceforge.net/projects/ior-sio/

mdtest


mdtest is a metadata benchmark tool, which needs MPI for distributed execution. It can be used to measure values like file creations per second or stat operations per second of a single process or of multiple processes.

The value for the number of processes ${NUM_PROCS} depends on the number of clients to test and the number of processes per client to test. The number of directories can be calculated as ${NUM_DIRS} = (parameter -b ^ parameter -z). The total number of files should always be higher than 1 000 000, so ${FILES_PER_DIR} is calculated as ${FILES_PER_DIR} = (1000000 / ${NUM_DIRS} / ${NUM_PROCS}).
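The two formulas can be worked through in the shell. This is a sketch with illustrative function names; NUM_PROCS=16 is an example value, and the division is rounded up so that the total number of files does not fall below the target.

```shell
#!/bin/sh
# Sketch of the two formulas above; function names are hypothetical.

# NUM_DIRS = b ^ z, computed with a loop to stay within POSIX sh.
# $1 = branching factor (-b), $2 = tree depth (-z)
num_dirs() {
    result=1
    i=0
    while [ "$i" -lt "$2" ]; do
        result=$((result * $1))
        i=$((i + 1))
    done
    echo "$result"
}

# FILES_PER_DIR, rounded up so the total does not fall below the target.
# $1 = target total files, $2 = NUM_DIRS, $3 = NUM_PROCS
files_per_dir() {
    echo $(( ($1 + $2 * $3 - 1) / ($2 * $3) ))
}

NUM_PROCS=16                       # example value (assumption)
NUM_DIRS=$(num_dirs 8 2)           # -b 8 -z 2, as in the command below
FILES_PER_DIR=$(files_per_dir 1000000 "$NUM_DIRS" "$NUM_PROCS")
echo "NUM_DIRS=$NUM_DIRS FILES_PER_DIR=$FILES_PER_DIR"
```

With -b 8 and -z 2 this gives 64 directories and, for 16 processes, 977 files per directory, i.e. 977 * 64 * 16 = 1 000 448 files in total.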

File Create/Stat/Remove Benchmark
$ mpirun -hostfile /tmp/nodefile --map-by node -np ${NUM_PROCS} mdtest -C -T -r -F -d /mnt/beegfs/mdtest -i 3 -I ${FILES_PER_DIR} -z 2 -b 8 -L -u


mpirun Parameters
-hostfile $PATH (file with the hostnames of the clients/servers to benchmark)
-np $N (number of processes)

mdtest Parameters
-C (perform create tests)
-T (perform stat tests)
-r (perform remove tests)
-F (perform only file tests)
-d $PATH (path to test directory)
-i $N (iterations)
-I $N (number of files per directory)
-z $N (depth of the directory structure)
-b $N (how many subdirectories to be created per directory of a higher "-z" level)
-L (use leaf level of the tree for file tests)
-u (each task gets its own working directory)

mdtest project git repository: https://github.com/MDTEST-LANL/mdtest


Recommendations


Regardless of which tool you use, it is important to take some points into consideration when benchmarking a BeeGFS file system.
