{"id":626,"date":"2021-01-22T13:20:00","date_gmt":"2021-01-22T12:20:00","guid":{"rendered":"https:\/\/www.beegfs.io\/c\/?p=626"},"modified":"2022-03-28T23:40:26","modified_gmt":"2022-03-28T21:40:26","slug":"beegfs-data-management","status":"publish","type":"post","link":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/","title":{"rendered":"BeeGFS Data Management"},"content":{"rendered":"<p><em>by Peter Braam, CTO, ThinkParQ, <\/em><em>January 2021<\/em><\/p>\n<p><span style=\"font-weight: 400;\">Welcome to the BeeGFS blog.\u00a0 I intend to occasionally publish some of our architectural thoughts, specifically to solicit feedback, publicly in blog comments or on our <\/span><a href=\"https:\/\/groups.google.com\/g\/fhgfs-user\" target=\"_blank\" rel=\"noopener noreferrer\"><span style=\"font-weight: 400;\">mailing list<\/span><\/a><span style=\"font-weight: 400;\">. Alternatively you can send an email to cto &lt;at&gt; thinkparq &lt;dot&gt; com. Probably we\u2019ll find that posts here will be most useful during our early consideration of features, and perhaps there will be 2-4 posts per year.<\/span><!--more--><\/p>\n<p><span style=\"font-weight: 400;\">Many partners ask about managing data across collections of servers in their clusters.\u00a0 For example, how can data be rebalanced over emptier, new servers and fuller, old servers? Technically we use the word pool, e.g. 
old pool and new pool, to refer to the available storage on the older and newer servers.\u00a0 There are variations on this question, for example, how can we leverage servers with NVMe drives that form a fast storage tier?\u00a0 Here the word tier is another technical term designating a collection of storage services.\u00a0 Utilizing tiers means that we want to move files to that tier to have them ready for fast transfer to clients and move them back to a pool on a slower tier when they have been idle for some time and merely consume capacity on the expensive, fast devices.\u00a0 In yet another scenario we would want to move files that have not been accessed for some time, or that belong to a finished project into an S3 object store.<\/span><\/p>\n<p><img decoding=\"async\" class=\"alignnone wp-image-629 size-full\" src=\"https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003.png\" alt=\"BeeGFS cluster requiring space rebalancing and fast tier management\" width=\"1280\" height=\"720\" srcset=\"https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003-200x113.png 200w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003-300x169.png 300w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003-400x225.png 400w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003-600x338.png 600w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003-768x432.png 768w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003-800x450.png 800w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003-1024x576.png 1024w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003-1200x675.png 1200w, 
https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003.png 1280w\" sizes=\"(max-width: 1280px) 100vw, 1280px\" \/><\/p>\n<p style=\"text-align: center;\"><b>Figure 1: BeeGFS cluster requiring space rebalancing and fast tier management<\/b><\/p>\n<p><span style=\"font-weight: 400;\">During discussions with our partners, several other features were requested and we began to see how closely related they are.\u00a0 The following features appear to be at the top of the wishlist:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">handling tiers &#8211; placing or moving data into faster or slower tiers<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">data rebalancing &#8211; within and between pools<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">data movement &#8211; parallel data movement within, into, and out of BeeGFS<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">feeds into iRODS, S3-like object storage, HSM systems, and auditing systems<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">directory subtree quotas<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400;\">ThinkParQ prefers to keep the BeeGFS file system fast, nimble, and easy to use, and in this spirit, the architecture proposed here combines minor features in the file system with a few external utilities, some of which exist as open source software or can be derived from it.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Consider the following file system interfaces and external utilities:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">BeeGFS event logging &#8211; including read-only file access and quota events<\/span><\/li>\n<li style=\"font-weight: 400;\" 
aria-level=\"1\"><span style=\"font-weight: 400;\">Atomically swap the data between two BeeGFS files<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Block or abort concurrent writes on a single file<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">A powerful data-movement tool, like <\/span><a href=\"https:\/\/www.lanl.gov\/projects\/ultrascale-systems-research-center\/software.php\"><span style=\"font-weight: 400;\">Los Alamos\u2019 pftool<\/span><\/a><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">A sub-tree aware file system inventory database<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Directory quota management through client file system directory tree<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">In this blog post, we are going to briefly discuss these interfaces and utilities and see how they will enable the features above.\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Let\u2019s start with a summary.\u00a0 All of the feature requests share a common requirement, namely having knowledge about collections of files.\u00a0 Such knowledge can be extracted from the existing <\/span><a href=\"https:\/\/www.beegfs.io\/wiki\/FilesystemModificationEvents\" target=\"_blank\" rel=\"noopener noreferrer\"><span style=\"font-weight: 400;\">BeeGFS event log<\/span><\/a><span style=\"font-weight: 400;\"> which provides file access and modification events on a socket, in real-time, on a file by file basis.\u00a0 For historical knowledge about larger collections of files, we will use a subtree-aware inventory database. 
Below, we will illustrate what these databases can do and how they are maintained.\u00a0 Most of the features under discussion involve data movement and we include a powerful parallel file moving tool as one of the desirable utilities.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Now, let\u2019s look at the additional logic to provide the core functionality of the features.\u00a0<\/span><\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\">If we want to exploit a fast tier dynamically, a dynamic-fast-tier (DFT) utility should observe the <\/span><b>event log<\/b><span style=\"font-weight: 400;\"> to detect common access patterns. For example, it can anticipate that an application is going to iterate through all the files in a directory when it sees that a few have already been opened. <img decoding=\"async\" class=\"alignnone wp-image-647 size-full\" src=\"https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_2_v01_300dpi-002-1.png\" alt=\"event log and dynamic adaptation for fast tiers\" width=\"3000\" height=\"1688\" srcset=\"https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_2_v01_300dpi-002-1-200x113.png 200w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_2_v01_300dpi-002-1-300x169.png 300w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_2_v01_300dpi-002-1-400x225.png 400w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_2_v01_300dpi-002-1-600x338.png 600w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_2_v01_300dpi-002-1-768x432.png 768w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_2_v01_300dpi-002-1-800x450.png 800w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_2_v01_300dpi-002-1-1024x576.png 1024w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_2_v01_300dpi-002-1-1200x675.png 1200w, 
https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_2_v01_300dpi-002-1-1536x864.png 1536w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_2_v01_300dpi-002-1.png 3000w\" sizes=\"(max-width: 3000px) 100vw, 3000px\" \/><\/span><\/p>\n<p style=\"text-align: center;\"><b>Figure 2: event log and dynamic adaptation for fast tiers<\/b><\/p>\n<p><span style=\"font-weight: 400;\">When the DFT utility observes such a pattern, it will invoke the <\/span><b>data mover<\/b><span style=\"font-weight: 400;\"> to copy the file to a faster tier.\u00a0 LANL has created a sophisticated parallel data mover called pftool, which carefully manages work queues across a collection of data mover nodes and includes interfaces that can perform I\/O with object stores. We will see below why a powerful utility will be useful.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">But there is a subtlety, and just copying the file to the fast tier isn\u2019t quite enough.\u00a0 During this copy operation, we may want to <\/span><b>abort the copy or block concurrent writes<\/b><span style=\"font-weight: 400;\"> to the file by another process.\u00a0 This is necessary to keep the data migration process consistent, and it is not a complicated feature.\u00a0 We simply abort our copy when another process opens the migrating file for writing.\u00a0 When the copy is done, the original file should point to the data on the new tier.\u00a0 A convenient interface for this is to <\/span><b>swap the data in the original and copied file, <\/b><span style=\"font-weight: 400;\">which is a quick metadata-only operation.\u00a0 After this, we can delete the inode to which we copied, which now has the data on the old tier.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Is it reasonable to expect that these relatively simple interfaces and a copy tool can give a reasonable solution to the tiering problem?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The other 
requested features are similar in spirit, and most are somewhat simpler than the tiering problem we just discussed.\u00a0 One difference to note is that in many cases the event logging system cannot provide the right stream of filenames upon which to act. This is where an inventory database comes into play.\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We want the inventory database to record, for each directory and for each of a set of conditions (predicates), the number of files in that directory\u2019s subtree that satisfy the condition and the aggregate bytes they consume.\u00a0 With no condition at all, we record the file count and bytes in the subtree of each directory.\u00a0 This is already quite useful because it gives a snapshot of the subtree quota (often called directory quota).\u00a0 A near-instantaneous lookup will produce these numbers, which are normally obtained by running a utility like \u201cdu\u201d.\u00a0\u00a0<\/span><\/p>\n<p style=\"text-align: left;\"><span style=\"font-weight: 400;\">Another condition might state that the file must reside on the fast tier of storage.\u00a0 For this condition we record the count of files on that tier and the bytes they consume in the subtree of each directory, allowing us to see how full the tier is and to initiate migration from it. \u00a0 Alternatively, we can look at files accessed or modified before or after a certain date, or at files owned by a particular user or group. If extended attributes are used, additional user-managed predicates can be leveraged as well. 
In fact, tracking storage location in BeeGFS is done with extended attributes.\u00a0 Figure 3 illustrates how the directory summaries that we record in the inventory add up what is in their subtree.<\/span><\/p>\n<p style=\"text-align: center;\"><b><img decoding=\"async\" class=\"alignnone wp-image-648 size-full\" src=\"https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_3_v01_300dpi-002-1-e1611320076622.png\" alt=\"Subtree inventory database records add up file counts and bytes consumed by files in every subtree satisfying a property\" width=\"1800\" height=\"1341\" \/>Figure 3: Subtree inventory database records add up file counts and bytes consumed by files in every subtree satisfying a property\u00a0<\/b><\/p>\n<p><span style=\"font-weight: 400;\">But before discussing the database further, let\u2019s see how we can implement a directory quota system and the other features mentioned above.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">A possible way to enforce directory quota is to introduce two extended attributes for directories subject to directory quota.\u00a0 One is the administrator\u2019s choice of the maximum allowed quota, and the other attribute contains the consumed quota derived from the inventory database.\u00a0 Enforcement can proceed, for example, by not allowing files under the directory to be opened for writing when the consumed quota exceeds the maximum quota (rigorous enforcement must also account for quota changes arising from already-open files, which we omit from the discussion here).\u00a0 The BeeGFS client file system can enforce this in the open system call with modest overhead.\u00a0 It would traverse from the file that is being opened to the ancestor in the dentry tree that has the quota attributes.\u00a0 When a user is denied opening a file and subsequently removes some files to free up space, the client could create a file system event that is consumed by a directory-quota (DQ) 
utility.\u00a0 The DQ utility updates the inventory database and resets the quota consumed, and the user can try again.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The data movement cases for rebalancing pools, servers, and migration to and from the cloud or HSM mentioned above are simpler.\u00a0 A pool-rebalancing (PR) utility will use the following logic.\u00a0 Periodically after the inventory database has been updated, it queries the inventory to see if too much space is consumed in a particular pool.\u00a0 If that is the case, it uses the inventory to find directories (using a fast logarithmic search) in which candidate files for migration can be found. Note that a nice predicate for the search could select files that reside in the pool, are of reasonably large size, and haven\u2019t been accessed too recently.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">A common scenario for pool rebalancing is after servers have been added, and in this case perhaps many files have to be migrated.\u00a0 Here a powerful parallel file mover can maintain migration jobs evenly and efficiently across a set of mover nodes.\u00a0<\/span><\/p>\n<p style=\"text-align: center;\"><img decoding=\"async\" class=\"alignnone wp-image-649 size-full\" src=\"https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_4_v01_300dpi-003.png\" alt=\"Space rebalancing and fast tier management with pftool, inventory, and events\" width=\"3000\" height=\"1688\" srcset=\"https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_4_v01_300dpi-003-200x113.png 200w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_4_v01_300dpi-003-300x169.png 300w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_4_v01_300dpi-003-400x225.png 400w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_4_v01_300dpi-003-600x338.png 600w, 
https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_4_v01_300dpi-003-768x432.png 768w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_4_v01_300dpi-003-800x450.png 800w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_4_v01_300dpi-003-1024x576.png 1024w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_4_v01_300dpi-003-1200x675.png 1200w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_4_v01_300dpi-003-1536x864.png 1536w, https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_4_v01_300dpi-003.png 3000w\" sizes=\"(max-width: 3000px) 100vw, 3000px\" \/><b>Figure 4: Space rebalancing and fast tier management with pftool, inventory, and events<\/b><\/p>\n<p><span style=\"font-weight: 400;\">The use of an inventory database is not a new idea: it was discussed in academia as early as the 1990s and falls in the general class of Merkle tree structures.\u00a0 More recently, Apple included some of these features in APFS, and <\/span><a href=\"https:\/\/www.lanl.gov\/projects\/ultrascale-systems-research-center\/software.php\"><span style=\"font-weight: 400;\">Los Alamos\u2019 grand unified file index (GUFI<\/span><\/a><span style=\"font-weight: 400;\">) was introduced in 2018, while perhaps the simplest inventory\u00a0 <\/span><a href=\"https:\/\/storageconference.us\/2017\/Presentations\/CampaignStorage-slides.pdf\"><span style=\"font-weight: 400;\">database was described at MSST<\/span><\/a><span style=\"font-weight: 400;\"> in 2017.\u00a0 A key difference between the database we propose to use and an HSM database such as Robinhood is that there will be entries only for directories.\u00a0 There will be no per-file database entries, and this is a deliberate choice to keep the database much smaller than the file system metadata itself.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The creation of such a database 
requires a file system scan, and one could propose to do this only occasionally at a low priority. \u00a0 However, the inventory database can often be updated very efficiently. Because the database has an additive structure, one can perform an update in a small subtree and propagate the changes to the root by adding them into ancestors. \u00a0 Even better, a database update can skip subtrees that have not been accessed since the previous update.\u00a0 Particularly for interactive situations such as the one we encountered for directory quota, there usually is a quick way to restore a user\u2019s standing with respect to quota. If snapshots and differences between snapshots are available, the updates to this database can be very efficient, as they merely need to add and subtract counts and space used based on the differences.\u00a0\u00a0\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The database will not be 100% accurate, and while that will not lead to any data loss, unlikely but unfortunate scenarios are conceivable. 
For example, an out-of-date database may contain too few candidates to migrate out of a pool, because the pool was updated more recently than the database.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Surprisingly many features can be created without complex changes to the file system.\u00a0 We are excited about this possibility and curious if you see further applications or drawbacks.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Our next post will be about a very different topic, again driven by our users.\u00a0 We will explore how BeeOND, the much-loved BeeGFS On Demand configuration tool, can evolve further.<\/span><\/p>\n<p>&#8212;<\/p>\n<p>We value your feedback and comments; please comment below <a href=\"https:\/\/groups.google.com\/g\/fhgfs-user\">or on the BeeGFS User Forum.<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>by Peter Braam, CTO, ThinkParQ, January 2021 Welcome to the BeeGFS blog.\u00a0 I intend to occasionally publish some of our architectural thoughts, specifically to solicit feedback, publicly in blog comments or on our mailing list. Alternatively you can send an email to cto &lt;at&gt; thinkparq &lt;dot&gt; com. 
Probably we\u2019ll <a href=\"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/\"> <span>Read More<\/span><\/a><\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"image","meta":{"_acf_changed":false,"footnotes":""},"categories":[6],"tags":[],"class_list":["post-626","post","type-post","status-publish","format-image","hentry","category-blog","post_format-post-format-image"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.7 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>BeeGFS Data Management - BeeGFS - The Leading Parallel Cluster File System<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"BeeGFS Data Management - BeeGFS - The Leading Parallel Cluster File System\" \/>\n<meta property=\"og:description\" content=\"by Peter Braam, CTO, ThinkParQ, January 2021 Welcome to the BeeGFS blog.\u00a0 I intend to occasionally publish some of our architectural thoughts, specifically to solicit feedback, publicly in blog comments or on our mailing list. Alternatively you can send an email to cto &lt;at&gt; thinkparq &lt;dot&gt; com. 
Probably we\u2019ll Read More\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/\" \/>\n<meta property=\"og:site_name\" content=\"BeeGFS - The Leading Parallel Cluster File System\" \/>\n<meta property=\"article:published_time\" content=\"2021-01-22T12:20:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-03-28T21:40:26+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003.png\" \/>\n<meta name=\"author\" content=\"Troy Patterson\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Troy Patterson\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"10 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/beegfs-data-management\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/beegfs-data-management\\\/\"},\"author\":{\"name\":\"Troy Patterson\",\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/#\\\/schema\\\/person\\\/889fafb6e064ad194bf6b995f2e5147f\"},\"headline\":\"BeeGFS Data 
Management\",\"datePublished\":\"2021-01-22T12:20:00+00:00\",\"dateModified\":\"2022-03-28T21:40:26+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/beegfs-data-management\\\/\"},\"wordCount\":1906,\"commentCount\":0,\"image\":{\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/beegfs-data-management\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/wp-content\\\/uploads\\\/2021\\\/01\\\/BeeGFS_Figure_1_v02_300dpi-003.png\",\"articleSection\":[\"Blog\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.beegfs.io\\\/c\\\/beegfs-data-management\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/beegfs-data-management\\\/\",\"url\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/beegfs-data-management\\\/\",\"name\":\"BeeGFS Data Management - BeeGFS - The Leading Parallel Cluster File System\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/beegfs-data-management\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/beegfs-data-management\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/wp-content\\\/uploads\\\/2021\\\/01\\\/BeeGFS_Figure_1_v02_300dpi-003.png\",\"datePublished\":\"2021-01-22T12:20:00+00:00\",\"dateModified\":\"2022-03-28T21:40:26+00:00\",\"author\":{\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/#\\\/schema\\\/person\\\/889fafb6e064ad194bf6b995f2e5147f\"},\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/beegfs-data-management\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.beegfs.io\\\/c\\\/beegfs-data-management\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/beegfs-data-management\\\/#primaryimage\",\"url\":\"https:\\\
/\\\/www.beegfs.io\\\/c\\\/wp-content\\\/uploads\\\/2021\\\/01\\\/BeeGFS_Figure_1_v02_300dpi-003.png\",\"contentUrl\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/wp-content\\\/uploads\\\/2021\\\/01\\\/BeeGFS_Figure_1_v02_300dpi-003.png\",\"width\":1280,\"height\":720},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/beegfs-data-management\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"BeeGFS Data Management\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/#website\",\"url\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/\",\"name\":\"BeeGFS - The Leading Parallel Cluster File System\",\"description\":\"\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/#\\\/schema\\\/person\\\/889fafb6e064ad194bf6b995f2e5147f\",\"name\":\"Troy Patterson\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/3aedb776f814472f0e8914ee35ac325890f5c0d2d64f65d2ab44c6377bff6e6a?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/3aedb776f814472f0e8914ee35ac325890f5c0d2d64f65d2ab44c6377bff6e6a?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/3aedb776f814472f0e8914ee35ac325890f5c0d2d64f65d2ab44c6377bff6e6a?s=96&d=mm&r=g\",\"caption\":\"Troy Patterson\"},\"url\":\"https:\\\/\\\/www.beegfs.io\\\/c\\\/author\\\/tpatterson\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"BeeGFS Data Management - BeeGFS - The Leading Parallel Cluster File System","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/","og_locale":"en_US","og_type":"article","og_title":"BeeGFS Data Management - BeeGFS - The Leading Parallel Cluster File System","og_description":"by Peter Braam, CTO, ThinkParQ, January 2021 Welcome to the BeeGFS blog.\u00a0 I intend to occasionally publish some of our architectural thoughts, specifically to solicit feedback, publicly in blog comments or on our mailing list. Alternatively you can send an email to cto &lt;at&gt; thinkparq &lt;dot&gt; com. Probably we\u2019ll Read More","og_url":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/","og_site_name":"BeeGFS - The Leading Parallel Cluster File System","article_published_time":"2021-01-22T12:20:00+00:00","article_modified_time":"2022-03-28T21:40:26+00:00","og_image":[{"url":"https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003.png","type":"","width":"","height":""}],"author":"Troy Patterson","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Troy Patterson","Est. 
reading time":"10 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/#article","isPartOf":{"@id":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/"},"author":{"name":"Troy Patterson","@id":"https:\/\/www.beegfs.io\/c\/#\/schema\/person\/889fafb6e064ad194bf6b995f2e5147f"},"headline":"BeeGFS Data Management","datePublished":"2021-01-22T12:20:00+00:00","dateModified":"2022-03-28T21:40:26+00:00","mainEntityOfPage":{"@id":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/"},"wordCount":1906,"commentCount":0,"image":{"@id":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/#primaryimage"},"thumbnailUrl":"https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003.png","articleSection":["Blog"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.beegfs.io\/c\/beegfs-data-management\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/","url":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/","name":"BeeGFS Data Management - BeeGFS - The Leading Parallel Cluster File 
System","isPartOf":{"@id":"https:\/\/www.beegfs.io\/c\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/#primaryimage"},"image":{"@id":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/#primaryimage"},"thumbnailUrl":"https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003.png","datePublished":"2021-01-22T12:20:00+00:00","dateModified":"2022-03-28T21:40:26+00:00","author":{"@id":"https:\/\/www.beegfs.io\/c\/#\/schema\/person\/889fafb6e064ad194bf6b995f2e5147f"},"breadcrumb":{"@id":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.beegfs.io\/c\/beegfs-data-management\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/#primaryimage","url":"https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003.png","contentUrl":"https:\/\/www.beegfs.io\/c\/wp-content\/uploads\/2021\/01\/BeeGFS_Figure_1_v02_300dpi-003.png","width":1280,"height":720},{"@type":"BreadcrumbList","@id":"https:\/\/www.beegfs.io\/c\/beegfs-data-management\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.beegfs.io\/c\/"},{"@type":"ListItem","position":2,"name":"BeeGFS Data Management"}]},{"@type":"WebSite","@id":"https:\/\/www.beegfs.io\/c\/#website","url":"https:\/\/www.beegfs.io\/c\/","name":"BeeGFS - The Leading Parallel Cluster File System","description":"","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.beegfs.io\/c\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.beegfs.io\/c\/#\/schema\/person\/889fafb6e064ad194bf6b995f2e5147f","name":"Troy 
Patterson","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/3aedb776f814472f0e8914ee35ac325890f5c0d2d64f65d2ab44c6377bff6e6a?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/3aedb776f814472f0e8914ee35ac325890f5c0d2d64f65d2ab44c6377bff6e6a?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/3aedb776f814472f0e8914ee35ac325890f5c0d2d64f65d2ab44c6377bff6e6a?s=96&d=mm&r=g","caption":"Troy Patterson"},"url":"https:\/\/www.beegfs.io\/c\/author\/tpatterson\/"}]}},"_links":{"self":[{"href":"https:\/\/www.beegfs.io\/c\/wp-json\/wp\/v2\/posts\/626","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.beegfs.io\/c\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.beegfs.io\/c\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.beegfs.io\/c\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.beegfs.io\/c\/wp-json\/wp\/v2\/comments?post=626"}],"version-history":[{"count":27,"href":"https:\/\/www.beegfs.io\/c\/wp-json\/wp\/v2\/posts\/626\/revisions"}],"predecessor-version":[{"id":1144,"href":"https:\/\/www.beegfs.io\/c\/wp-json\/wp\/v2\/posts\/626\/revisions\/1144"}],"wp:attachment":[{"href":"https:\/\/www.beegfs.io\/c\/wp-json\/wp\/v2\/media?parent=626"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.beegfs.io\/c\/wp-json\/wp\/v2\/categories?post=626"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.beegfs.io\/c\/wp-json\/wp\/v2\/tags?post=626"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}