Quota Support
Introduction
BeeGFS allows the definition of system-wide quotas for disk space allocation and number of chunk files, on a per-user or per-group basis. This can be used to organize users into different access tiers with different levels of restriction, and to prevent any individual from consuming all of the file system's resources.
The BeeGFS quota management mechanism is composed of two features: quota tracking and quota enforcement. Quota tracking allows the query of the amount of data and the number of chunk files that users and groups are using in the system, without imposing any restriction.
Quota enforcement allows the definition and application of quota limits in the whole system. When this feature is enabled, the BeeGFS management daemon collects quota reports from all storage targets at regular intervals, checks for exceeded quota limits, and informs the rest of the system about which users are no longer allowed to consume more resources.
BeeGFS quota management relies on quota data provided by the underlying file systems of storage server targets. Therefore, the capabilities of such file systems determine which types of quota BeeGFS is able to manage. For example, if the storage targets use a version of ZFS prior to 0.7.4, BeeGFS will allow the definition of quotas only for used space, not for number of files, as the latter is not supported by old releases of ZFS. If you use ZFS 0.7.4 or later, the latest version of BeeGFS will allow you to define both types of quota.
Quota limits can be configured globally, or separately for each storage pool.
The creation of new files will be prohibited when either the global or the per pool limit is reached.
The following sections explain in more detail how these features work and how they can be configured.
Quota tracking
This section provides information on how to enable tracking of used disk space and number of chunk files on the storage targets.
Requirements and general notes
To enable quota tracking, the BeeGFS release of all server and client services must be 2012.10-r11 or higher. Quota tracking is designed to generally work with any underlying local file system on the storage servers that supports user and group quota (reported through the syscall quotactl()), but has only been fully tested with ext4, XFS and ZFS. If ZFS is used as the underlying file system of storage targets, the release of BeeGFS storage services must be at least 2015.03-r10.
Make sure that the local systems of all nodes are correctly configured to query passwd and group databases, by running the commands below. The first command should print the complete list of user IDs. The second one should print the complete list of group IDs.
$ getent passwd
$ getent group
If the commands above do not list all users and groups, you will not be able to use the command "beegfs-ctl --getquota --all" to query used space for all users at once and you will not be able to use "quotaQueryType = system" in file beegfs-mgmtd.conf for quota enforcement. However, there are alternatives to both, which you will find in further sections.
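A quick way to compare nodes is to count the distinct IDs that each one reports. The sketch below assumes the standard passwd and group line format, where the numeric ID is the third colon-separated field. The helper name count_ids is hypothetical and not part of BeeGFS.

```shell
# count_ids: read passwd- or group-format lines on stdin and print the
# number of distinct numeric IDs (third colon-separated field).
count_ids() {
    awk -F: '!seen[$3]++ { n++ } END { print n + 0 }'
}

# Example (run on every node and compare the numbers):
#   getent passwd | count_ids
#   getent group  | count_ids
```

If the counts differ between nodes, the name service configuration of those nodes is probably not consistent, and the alternatives mentioned above may be the safer choice.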
If you also create files on the storage targets outside of the BeeGFS storage directory, note that the blocks and inodes occupied by those files also count as resources used by their owner. The reports will also be distorted if multiple storage targets are located within the same local file system instance.
Files stored in the disposal directory (which do not appear under the BeeGFS client mountpoint) also count towards the amount of space used by users. Therefore, try to clear the disposal directory if you think that the reported used space differs from the disk space actually in use.
Quota tracking has no requirement concerning metadata targets.
It is important to note that quota limits of number of files concern data chunk files created on storage targets and not files created by end-users under the BeeGFS mount point. It is also important to understand that such quota limits do not concern number of directories created in the system.
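To get a feeling for how user files translate into chunk files, the following rough sketch estimates the number of chunk files occupied by a single user file. It rests on assumptions that are not stated in this guide: plain round-robin striping without mirroring, and chunk files that exist only on targets that actually hold data. The function name and its parameters are hypothetical, so verify the result against the stripe settings of your own system.

```shell
# chunk_files SIZE_BYTES CHUNK_BYTES NUM_TARGETS
# Estimate how many chunk files one user file occupies, assuming plain
# round-robin striping without mirroring and lazy chunk file creation.
chunk_files() {
    size=$1 chunk=$2 targets=$3
    # Number of chunk-sized blocks needed for the file (ceiling division).
    blocks=$(( (size + chunk - 1) / chunk ))
    # A file cannot occupy more targets than its stripe pattern allows.
    if [ "$blocks" -lt "$targets" ]; then
        echo "$blocks"
    else
        echo "$targets"
    fi
}

# Example: a 1 MiB file with 512 KiB chunks striped over 4 targets
#   chunk_files 1048576 524288 4
```

Under these assumptions, many small files consume roughly one chunk file each, while a few large files consume one chunk file per stripe target, which is worth keeping in mind when choosing a limit for the number of files.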
Enabling quota during a new BeeGFS installation
Walk through these steps if you are about to set up a new BeeGFS instance that should support quota.
In this example, we assume that /dev/sdb is the underlying disk or RAID array of a storage target, which is mounted to the directory /data.
1) Start by enabling quota support for the underlying file system on the storage targets, as described below for ext4, XFS, and ZFS.
ext4: Enable quota support for ext4:
# Mount device with quota support for users and groups
$ mount /dev/sdb /data -t ext4 -orw,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv1,...
# Create quota database files
$ quotacheck -cug /data
# Calculate current quota values
$ quotacheck -vug /data
# Enable quota counting
$ quotaon -vug /data
XFS: Enable quota support for XFS:
# Mount device with quota support for users and groups
$ mount /dev/sdb /data -t xfs -orw,uqnoenforce,gqnoenforce,...
ZFS: Enable quota support for ZFS:
Make sure that the package libzfs2-devel is installed on your system.
On Debian/Ubuntu systems install libzfslinux-dev.
Nothing else needs to be done, because quota tracking is supported automatically based on libzfs.
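For ext4 and XFS, it can be worth confirming that the quota options are really active on the mounted target. The sketch below checks a comma-separated mount option string for the user and group quota options used in this guide. The helper name has_quota_opts is hypothetical, and the check assumes that the kernel reports these options under the same names that were passed to mount. ZFS targets are not covered.

```shell
# has_quota_opts OPTIONS: succeed if the comma-separated mount option string
# contains both a user and a group quota option from this guide
# (usrjquota=/grpjquota= for ext4, uqnoenforce/gqnoenforce for XFS).
has_quota_opts() {
    opts=",$1,"
    case "$opts" in
        *,usrjquota=*|*,uqnoenforce,*) ;;
        *) return 1 ;;
    esac
    case "$opts" in
        *,grpjquota=*|*,gqnoenforce,*) ;;
        *) return 1 ;;
    esac
}

# Example for the storage target mounted at /data:
#   has_quota_opts "$(findmnt -n -o OPTIONS /data)" && echo "quota options active"
```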
2) Enable quota in the BeeGFS client configuration file /etc/beegfs/beegfs-client.conf on all client nodes:
quotaEnabled = true
This setting will cause the client to transfer extra user data to the servers, namely the uid and gid of the user making every IO syscall. This extra data allows BeeGFS to correctly compute disk space use and number of files created by each user. If this setting is not made on a client node, all syscalls performed on that node will affect the quota consumption of the root user, instead of the actual caller.
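Because a single client without this setting silently charges its I/O to root, it may help to check the option on every client node. The sketch below reads a value from a "key = value" style configuration file. The helper name conf_value is hypothetical, and the path /etc/beegfs/beegfs-client.conf in the example is an assumption about where the client configuration is located.

```shell
# conf_value KEY FILE: print the value assigned to KEY in a "key = value"
# style config file, ignoring comment lines. If KEY is assigned more than
# once, the last assignment wins. Prints nothing if KEY is not set.
conf_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*\([^[:space:]#]*\).*/\1/p" "$2" | tail -n 1
}

# Example (run on each client node):
#   [ "$(conf_value quotaEnabled /etc/beegfs/beegfs-client.conf)" = "true" ] \
#       || echo "quotaEnabled is not set to true on $(hostname)"
```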
Enabling quota for an existing BeeGFS installation
Take these steps if you want to enable quota support for an existing BeeGFS instance that was previously used without quota support.
In this example, we assume that /dev/sdb is the underlying disk or RAID array of a storage target, which is mounted to the directory /data.
1) Stop all BeeGFS server and client services.
2) Enable quota support for the underlying file system on the storage targets, as described below for ext4, XFS and ZFS.
ext4: Enable quota support for ext4:
# Mount device with quota support for users and groups
$ mount /dev/sdb /data -t ext4 -orw,usrjquota=aquota.user,grpjquota=aquota.group,jqfmt=vfsv1,...
# Create quota database files
$ quotacheck -cug /data
# Calculate current quota values
$ quotacheck -vug /data
# Enable quota counting
$ quotaon -vug /data
XFS: Enable quota support for XFS:
# Mount device with quota support for users and groups
$ mount /dev/sdb /data -t xfs -orw,uqnoenforce,gqnoenforce,...
ZFS: Enable quota support for ZFS:
Make sure that the package libzfs2-devel is installed on your system.
On Debian/Ubuntu systems install libzfslinux-dev.
Nothing else needs to be done, because quota tracking is supported automatically based on libzfs.
3) Enable quota in the BeeGFS client configuration file /etc/beegfs/beegfs-client.conf on all client nodes:
quotaEnabled = true
This setting will cause the client to transfer extra user data to the servers, namely the uid and gid of the user making every IO syscall. This extra data allows BeeGFS to correctly compute disk space use and number of files created by each user. If this setting is not made on a client node, all syscalls performed on that node will affect the quota consumption of the root user, instead of the actual caller.
4) Start all BeeGFS services.
5) Run the following command on one of the client nodes to update the ownership information of the existing data chunk files on the storage servers for quota tracking. This command can take a while to complete, but it is executed only once and the system can be online while the chunk files are being updated.
$ beegfs-fsck --enablequota
This command can be re-executed if you discover later that some clients did not have the option quotaEnabled set to true, and you want to update the ownership information of the data chunk files created in the meantime.
Querying quota information
Quota information can be queried with "beegfs-ctl --getquota". The command directly collects quota reports from all storage servers and quota limits from the management service (if defined) and aggregates all the quota information. A table will be printed for each storage pool. Here are some usage examples.
- Show quota information for all normal users:
$ beegfs-ctl --getquota --uid --all
- Show quota information for the user ID 1000:
$ beegfs-ctl --getquota --uid 1000
- Show quota information for group IDs range 1000 to 1500:
$ beegfs-ctl --getquota --gid --range 1000 1500
- Show the default quota limits:
$ beegfs-ctl --getquota --defaultlimits
- To get quota information for a specific storage pool, include the `--storagepoolid=X` option in the command. For example:
$ beegfs-ctl --getquota --uid 1000 --storagepoolid=2
- Show more examples and general help:
$ beegfs-ctl --getquota --help
If the underlying file system of the storage targets is a ZFS release without support for quota on the number of files (prior to 0.7.4), the values in the column for used files/inodes will be shown as a dash ("-").
Quota enforcement
This section provides information on how to activate quota enforcement in a BeeGFS system.
Requirements
Quota enforcement requires quota tracking to be enabled, as described above. In addition, all server and client services must be running BeeGFS release 2014.01-r10 or higher. Since release 2015.03-r20, all BeeGFS server daemons get the quota enforcement configuration from the management daemon, making this configuration much simpler. Therefore, 2015.03-r20 is the minimal recommended release.
Enable quota enforcement
1) Set the option below to true in the storage configuration file /etc/beegfs/beegfs-storage.conf.
quotaEnableEnforcement = true
2) Restart the storage service daemon.
quotaUpdateIntervalMin = 10
quotaQueryType = system

quotaQueryType = range
quotaQueryUIDRange = 1200,2000
quotaQueryGIDRange = 15000,20000

quotaQueryType = file
quotaQueryGIDFile = /etc/beegfs/groupIDs
quotaQueryUIDFile = /etc/beegfs/userIDs
quotaEnableEnforcement = true
5) These changes won't be noticed by the other server services until they are restarted. Therefore, restart the storage service daemons and the metadata service daemons.
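When quotaQueryType = file is used, the ID files have to be produced somehow. The sketch below filters passwd- or group-format input down to the numeric IDs inside a given range. The helper name ids_in_range is hypothetical, and the example assumes that the ID files accept one numeric ID per line, which this guide does not confirm; check the comments in beegfs-mgmtd.conf for the exact file format before relying on it.

```shell
# ids_in_range MIN MAX: read passwd- or group-format lines on stdin and
# print each distinct numeric ID (third field) between MIN and MAX, inclusive.
ids_in_range() {
    awk -F: -v lo="$1" -v hi="$2" \
        '($3 + 0) >= (lo + 0) && ($3 + 0) <= (hi + 0) && !seen[$3]++ { print $3 }'
}

# Example, assuming one ID per line is an accepted file format:
#   getent passwd | ids_in_range 1200 2000   > /etc/beegfs/userIDs
#   getent group  | ids_in_range 15000 20000 > /etc/beegfs/groupIDs
```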
Setting quota limits
Quota limits can be set with the command "beegfs-ctl --setquota". Here are some usage examples.
- Set quota limit for user ID 1000 to 1 gigabyte and 500 chunk files:
$ beegfs-ctl --setquota --uid 1000 --sizelimit=1G --inodelimit=500
- Set quota limit for group ID 1289 to 10 gigabytes and 50 chunk files:
$ beegfs-ctl --setquota --gid 1289 --sizelimit=10G --inodelimit=50
- Set quota limit for user ID 1000 to unlimited size and 500 chunk files:
$ beegfs-ctl --setquota --uid 1000 --sizelimit=unlimited --inodelimit=500
- Set quota limit for user ID 1000 to unlimited size and reset the chunk file limit to the default quota limit:
$ beegfs-ctl --setquota --uid 1000 --sizelimit=unlimited --inodelimit=reset
- Set quota limit for group ID 1289 to 10 gigabytes and unlimited chunk files:
$ beegfs-ctl --setquota --gid 1289 --sizelimit=10G --inodelimit=unlimited
- Set default quota limits for the users to 10 gigabytes and unlimited chunk files:
$ beegfs-ctl --setquota --uid --default --sizelimit=10G --inodelimit=unlimited
- Similar to the --getquota mode, it is possible to set the quota limits via --all, --range and --list parameters. The --setquota mode also allows the import of quota limits from a file. Each line defines the limit for a user or group. Only one type of ID (either user or group) can be given in a quota file. The quota file line format is: <ID or name>,<size limit>,<inode limit>
Example file contents for user quota limits (e.g. located at /tmp/user_quota_limits.txt):
2345,1T,500
8999,5G,20
dbadmin,20G,5000
To load the example user quota limits file and apply the user quota limits:
$ beegfs-ctl --setquota --uid --file=/tmp/user_quota_limits.txt
- Quota can be configured per storage pool by specifying a storage pool id when running the setquota command. For example:
$ beegfs-ctl --setquota --uid 1000 --sizelimit=1G --inodelimit=500 --storagepoolid=2
- To show general help:
$ beegfs-ctl --setquota --help
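Before importing a quota file with --file, a simple syntax check can catch malformed lines early. The sketch below only verifies the line format given above, <ID or name>,<size limit>,<inode limit>, and does not validate the values themselves. The helper name check_quota_file is hypothetical, and skipping empty lines is a choice made in this sketch rather than documented beegfs-ctl behavior.

```shell
# check_quota_file FILE: succeed only if every non-empty line has exactly
# three comma-separated, non-empty fields:
#   <ID or name>,<size limit>,<inode limit>
# Malformed lines are reported on stdout.
check_quota_file() {
    awk -F, '
        NF == 0 { next }    # skip empty lines
        NF != 3 || $1 == "" || $2 == "" || $3 == "" {
            printf "line %d: malformed entry: %s\n", NR, $0
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Example:
#   check_quota_file /tmp/user_quota_limits.txt \
#       && beegfs-ctl --setquota --uid --file=/tmp/user_quota_limits.txt
```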
Project directory quota tracking
The BeeGFS quota management mechanism is based on user and group quota. Group quota can be used for project directories by using the setgid flag on a directory ("chmod g+s /mnt/beegfs/project01"). If this flag is set, all files created in the directory will automatically have the group of the directory instead of the primary group of the user who created the file.
With this approach, it is useful to also create a separate group for the project, e.g. a group project01 and apply it to the project directory ("chown root:project01 /mnt/beegfs/project01"). To avoid conflicts with per-user quota limits, the same approach can be used not only for shared project directories, but also for user directories, in which case each user has its own group.
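The project directory setup described above can be sketched as a small helper. The function name make_project_dir is hypothetical. The group project01 and the group ID 1289 in the example are placeholders, and the group is assumed to exist already.

```shell
# make_project_dir DIR [GROUP]: create DIR, optionally assign it to GROUP,
# and set the setgid bit so that new files inherit the directory's group.
make_project_dir() {
    mkdir -p "$1" || return 1
    if [ -n "${2:-}" ]; then
        chgrp "$2" "$1" || return 1
    fi
    chmod g+s "$1"
}

# Example (as root), followed by a group quota limit for the project,
# assuming 1289 is the numeric ID of the group project01:
#   make_project_dir /mnt/beegfs/project01 project01
#   beegfs-ctl --setquota --gid 1289 --sizelimit=10G --inodelimit=unlimited
```

The group is assigned before the setgid bit is set, so that the bit is applied last and does not depend on how a later ownership change treats it.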
Alternatively, if you want to track used space or number of files based on subdirectory trees, you might want to look at the Robinhood Policy Engine.
Robinhood can run parallel scans of the file system in regular intervals and store the discovered file and directory information in a SQL database. On the one hand, this has the advantage of enabling various queries of the database with fast results. On the other hand, automatic actions for certain events can be defined in Robinhood, e.g. if the defined used space limit for a certain subdirectory tree is exceeded.
As BeeGFS keeps all the metadata for such scans readily available on the metadata servers (usually flash storage), crawling a file system in parallel is fast. To make sure that the SQL database of Robinhood does not reduce the scan speed, it is recommended to have the Robinhood database also on flash storage.