Striping API


Table of Contents

  1. Overview
  2. Installation
  3. API Specification
  4. Code Examples
    1. Example build command
    2. Code Example: Create a file with a special stripe pattern
    3. Code Example: Retrieve the stripe pattern of a file
 

Overview


The Striping API was developed during DEEP-ER, an exascale project funded by the European Commission. The API allows the application developer to define some parameters of the stripe pattern. Because the application developer knows the I/O pattern of each file the application uses, they can choose the striping settings per file to get the best performance for the application.

A file is striped across multiple targets. A file can also be striped across multiple targets and mirrored for resiliency. Both the number of storage targets used for striping and the chunksize can be changed. The application developer can control these settings using the Striping API. The file system administrator can also set the stripe pattern (RAID0 or BuddyMirroring) using beegfs-ctl, while some striping settings (number of targets and chunksize) can also be changed by the user.

The following picture shows file striping in BeeGFS.

[Figure: Striping of files in BeeGFS.]
The picture shows that Target 1 and Target 3 belong to BuddyMirrorGroup 1, and that Target 2 and Target 4 belong to BuddyMirrorGroup 2. Every target belongs to a BuddyMirrorGroup so that it can store mirrored files, but the BeeGFS design also allows unmirrored files to be stored on such a target. File #1 is mirrored and striped across BuddyMirrorGroup 1 and BuddyMirrorGroup 2. So the 1st chunk of File #1 is stored on Target 1 and Target 3, the 2nd chunk is stored on Target 2 and Target 4, the 3rd chunk is stored again on Target 1 and Target 3, and so on. File #2 is only mirrored on BuddyMirrorGroup 1. The 1st chunk of File #2 is stored on Target 1 and Target 3, the 2nd chunk is stored again on Target 1 and Target 3, and so on. File #3 is only striped across all 4 targets, but not mirrored. The 1st chunk of File #3 is stored on Target 2, the 2nd chunk is stored on Target 3, the 3rd chunk is stored on Target 4, the 4th chunk is stored on Target 1, the 5th chunk is stored again on Target 2, and so on.

More details about BeeGFS are available on the System Architecture wiki page.


Installation


Install the package beegfs-client-devel.noarch from the repository that matches your installed BeeGFS version. The package contains the required header files. No library is required because the API uses ioctls (input/output control calls) to communicate with the file system. The include file is installed as /usr/include/beegfs/beegfs.h.


API Specification


#define BEEGFS_IOCTL_NODESTRID_BUFLEN   256      // max buffer size for the node string ID

// stripe pattern types
#define BEEGFS_STRIPEPATTERN_INVALID      0      // Stripe pattern is invalid
#define BEEGFS_STRIPEPATTERN_RAID0        1      // Stripe pattern RAID0
#define BEEGFS_STRIPEPATTERN_RAID10       2      // Stripe pattern RAID10
						 // (deprecated since 2015.03)
#define BEEGFS_STRIPEPATTERN_BUDDYMIRROR  3      // Stripe pattern for Buddy Mirroring
						 // (supported since 2015.03)


/**
 * Struct for details of a stripe target
 */
struct BeegfsIoctl_GetStripeTargetV2_Arg
{
   /* inputs */
   uint32_t targetIndex;

   /* outputs */
   uint32_t targetOrGroup; // target ID if the file is not buddy mirrored, otherwise mirror group ID

   uint32_t primaryTarget; // target ID != 0 if buddy mirrored
   uint32_t secondaryTarget; // target ID != 0 if buddy mirrored

   uint32_t primaryNodeID; // node ID of target (if unmirrored) or primary target (if mirrored)
   uint32_t secondaryNodeID; // node ID of secondary target, or 0 if unmirrored

   char primaryNodeStrID[BEEGFS_IOCTL_NODESTRID_BUFLEN];
   char secondaryNodeStrID[BEEGFS_IOCTL_NODESTRID_BUFLEN];
};


/**
 * Get the path to the client config file of an active BeeGFS mountpoint.
 * 
 * @param fd file descriptor pointing to file or dir inside BeeGFS mountpoint.
 * @param outCfgFile buffer for config file path; will be malloc'ed and needs to be free'd by
 *        caller if success was returned.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getConfigFile(int fd, char** outCfgFile);

/**
 * Get the path to the client runtime config file in procfs.
 * 
 * @param fd file descriptor pointing to file or dir inside BeeGFS mountpoint.
 * @param outCfgFile buffer for config file path; will be malloc'ed and needs to be free'd by
 *        caller if success was returned.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getRuntimeConfigFile(int fd, char** outCfgFile);

/**
 * Test if the underlying file system is a BeeGFS.
 * 
 * @param fd file descriptor pointing to some file or dir that should be checked for whether it is
 *        located inside a BeeGFS mount.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_testIsBeeGFS(int fd);

/**
 * Get the mountID aka clientID aka nodeID of client mount aka sessionID.
 * 
 * @param fd file descriptor pointing to some file or dir that should be checked for whether it is
 *        located inside a BeeGFS mount.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getMountID(int fd, char** outMountID);

/**
 * Get the stripe info of a file.
 * 
 * @param fd file descriptor pointing to some file inside a BeeGFS mount.
 * @param outPatternType type of stripe pattern (BEEGFS_STRIPEPATTERN_...)
 * @param outChunkSize chunk size for striping.
 * @param outNumTargets number of targets for striping.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getStripeInfo(int fd, unsigned* outPatternType, unsigned* outChunkSize, uint16_t*
	  outNumTargets);

/**
 * Get the stripe target of a file (with 0-based index).
 * 
 * @param fd file descriptor pointing to some file inside a BeeGFS mount.
 * @param targetIndex index of target that should be retrieved (start with 0 and then call this
 *        again with index up to "*outNumTargets-1" to retrieve remaining targets).
 * @param outTargetNumID numeric ID of target at given index.
 * @param outNodeNumID numeric ID to node to which this target is assigned.
 * @param outNodeStrID string ID of the node to which this target is assigned; buffer will be 
 *        alloc'ed and needs to be free'd by caller if success is returned.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getStripeTarget(int fd, uint16_t targetIndex, uint16_t* outTargetNumID,
	  uint16_t* outNodeNumID, char** outNodeStrID);

/**
 * Get the stripe target of a file (with 0-based index).
 * 
 * @param fd file descriptor pointing to some file inside a BeeGFS mount.
 * @param targetIndex index of target that should be retrieved (start with 0 and then call this
 *        again with index up to "*outNumTargets-1" to retrieve remaining targets).
 * @param outTargetInfo pointer to struct that will be filled with information about the selected
 *        stripe target
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_getStripeTargetV2(int fd, uint32_t targetIndex,
	  struct BeegfsIoctl_GetStripeTargetV2_Arg* outTargetInfo);

/**
 * Create a new regular file with stripe hints.
 *
 * As the stripe pattern cannot be changed when a file is already created, this is an exclusive
 * create, so it will return an error if the file already existed.
 *
 * @param fd file descriptor pointing to parent directory for the new file.
 * @param filename name of created file.
 * @param mode permission bits of new file (i.e. symbolic constants like S_IRWXU or 0644).
 * @param numtargets desired number of storage targets for striping; 0 for directory default; ~0 to
 *        use all available targets.
 * @param chunksize chunksize per storage target for striping in bytes; 0 for directory default;
 *        must be 2^n >= 64KiB.
 * @return true on success, false on error (in which case errno will be set).
 */
bool beegfs_createFile(int fd, const char* filename, mode_t mode, unsigned numtargets,
	  unsigned chunksize);

/**
 * Checks if the required API version of the application is compatible to current API version
 *
 * @param required_major_version the required major API version of the user application
 * @param required_minor_version the minimal required minor API version of the user application
 * @return true if the required version and the API version are compatible, if not false is returned
 */
bool beegfs_checkApiVersion(const unsigned required_major_version,
	  const unsigned required_minor_version);



Code Examples


The header files of the Striping API are located in the default system include path and will be found automatically by your compiler. There is no additional shared library that needs to be linked to your application.


Example build command


g++ $SOURCE_FILE -o $BINARY_NAME -I /usr/include/



Code Example: Create a file with a special stripe pattern


#include <beegfs/beegfs.h>

#include <dirent.h>
#include <errno.h>
#include <iostream>
#include <libgen.h>
#include <stdlib.h>
#include <string>
#include <sys/stat.h>



static const mode_t MODE_FLAG = S_IRWXU | S_IRGRP | S_IROTH;
static const unsigned numtargets = 8;
static const unsigned chunksize = 1048576; // 1 Mebibyte


int main(int argc, char** argv)
{
   // check if a path to the file is provided
   if(argc != 2)
   {
	  std::cout << "Usage: " << argv[0] << " $PATH_TO_FILE" << std::endl;
	  exit(-1);
   }

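   std::string file(argv[1]);

   // basename() and dirname() may modify their argument, so each call gets its own copy
   std::string pathForBasename(file);
   std::string pathForDirname(file);
   std::string fileName(basename(&pathForBasename[0]) );
   std::string parentDirectory(dirname(&pathForDirname[0]) );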

   // check if we got a file name from the given path
   if(fileName.empty() )
   {
	  std::cout << "Can not get file name from given path: " << file << std::endl;
	  exit(-1);
   }

   // check if we got the parent directory path from the given path
   if(parentDirectory.empty() )
   {
	  std::cout << "Can not get parent directory path from given path: " << file << std::endl;
	  exit(-1);
   }

   // open the directory to get a directory stream 
   DIR* parentDir = opendir(parentDirectory.c_str() );
   if(parentDir == NULL)
   {
	  std::cout << "Can not get directory stream of directory: " << parentDirectory
		 << " errno: " << errno << std::endl;
	  exit(-1);
   }
   
   // get a fd of the parent directory
   int fd = dirfd(parentDir);
   if(fd == -1)
   {
	  std::cout << "Can not get fd from directory: " << parentDirectory
		 << " errno: " << errno << std::endl;
	  exit(-1);
   }

   // check if the parent directory is located on a BeeGFS, because the striping API works only on
   // BeeGFS (Results of the BeeGFS ioctl on other file systems are undefined.)
   bool isBeegfs = beegfs_testIsBeeGFS(fd);
   if(!isBeegfs)
   {
	  std::cout << "The given file is not located on a BeeGFS: " << file << std::endl;
	  exit(-1);
   }

   // create the file with the given stripe pattern
   bool isFileCreated = beegfs_createFile(fd, fileName.c_str(), MODE_FLAG, numtargets, chunksize);
   if(isFileCreated)
   {
	  std::cout << "File successfully created: " << file << std::endl;
   }
   else
   {
	  std::cout << "Can not create file: " << file << " errno: " << errno << std::endl;
	  exit(-1);
   }
}



Code Example: Retrieve the stripe pattern of a file


#include <beegfs/beegfs.h>

#include <errno.h>
#include <fcntl.h>
#include <iostream>
#include <stdlib.h>
#include <string>
#include <sys/stat.h>



static const mode_t MODE_FLAG = S_IRWXU | S_IRGRP | S_IROTH;
static const int OPEN_FLAGS = O_RDWR;


int main(int argc, char** argv)
{
   // check if a path to the file is provided
   if(argc != 2)
   {
	  std::cout << "Usage: " << argv[0] << " $PATH_TO_FILE" << std::endl;
	  exit(-1);
   }

   std::string file(argv[1]);

   // open the provided file
   int fd = open(file.c_str(), OPEN_FLAGS, MODE_FLAG);
   if(fd == -1)
   {
	  std::cout << "Open: can not open file: " << file << " errno: " << errno << std::endl;
	  exit(-1);
   }

   // check if the file is located on a BeeGFS, because the striping API works only on BeeGFS
   // (Results of the BeeGFS ioctls on other file systems are undefined.)
   bool isBeegfs = beegfs_testIsBeeGFS(fd);
   if(!isBeegfs)
   {
	  std::cout << "The given file is not located on a BeeGFS: " << file << std::endl;
	  exit(-1);
   }

   unsigned outPatternType = 0;
   unsigned outChunkSize = 0;
   uint16_t outNumTargets = 0;

   // retrieve the stripe pattern of the file and print it to the console
   bool stripeInfoRetVal = beegfs_getStripeInfo(fd, &outPatternType, &outChunkSize, &outNumTargets);
   if(stripeInfoRetVal)
   {
	  std::string patternType;
	  switch(outPatternType)
	  {
		 case BEEGFS_STRIPEPATTERN_RAID0:
			patternType = "RAID0";
			break;
		 case BEEGFS_STRIPEPATTERN_RAID10:
			patternType = "RAID10";
			break;
		 case BEEGFS_STRIPEPATTERN_BUDDYMIRROR:
			patternType = "BUDDYMIRROR";
			break;
		 default:
			patternType = "INVALID";
	  }
	  std::cout << "Stripe pattern of file: " << file << std::endl;
	  std::cout << "+ Type: " << patternType << std::endl;
	  std::cout << "+ Chunksize: " << outChunkSize << " Byte" << std::endl;
	  std::cout << "+ Number of storage targets: " << outNumTargets << std::endl;
	  std::cout << "+ Storage targets:" << std::endl;

	  // get the targets which are used for the file and print them to the console
	  for (int targetIndex = 0; targetIndex < outNumTargets; targetIndex++)
	  {
		 struct BeegfsIoctl_GetStripeTargetV2_Arg outTargetInfo;

		 bool stripeTargetRetVal = beegfs_getStripeTargetV2(fd, targetIndex, &outTargetInfo);
		 if(stripeTargetRetVal)
		 {
			if(outPatternType == BEEGFS_STRIPEPATTERN_BUDDYMIRROR)
			{
			   std::cout << "  + " << outTargetInfo.targetOrGroup
				  << " @ " << outTargetInfo.primaryTarget
				  << " @ " << outTargetInfo.primaryNodeStrID
				  << " [ID: "<< outTargetInfo.primaryNodeID << "]" << std::endl;
			   std::cout << "  + " << outTargetInfo.targetOrGroup
				  << " @ " << outTargetInfo.secondaryTarget
				  << " @ " << outTargetInfo.secondaryNodeStrID
				  << " [ID: "<< outTargetInfo.secondaryNodeID << "]" << std::endl;
			}
			else
			{
			   std::cout << "  + " << outTargetInfo.targetOrGroup
				  << " @ " << outTargetInfo.primaryNodeStrID
				  << " [ID: "<< outTargetInfo.primaryNodeID << "]" << std::endl;
			}
		 }
		 else
		 {
			std::cout << "Can not get stripe targets of file: " << file << std::endl;
			exit(-1);
		 }
	  }
   }
   else
   {
	  std::cout << "Can not get stripe info of file: " << file << std::endl;
	  exit(-1);
   }
}


