30th June 2023
If you have seen the latest BeeGFS release notes, you may be puzzled why we now have not two, but THREE release branches. Worry not: this is just another sign of the times. The buzziest file system is growing up. Record scratch: you’re probably wondering how we ended up here.
Sometimes, BeeGFS is referred to as the “baby Lustre”, a comparison we’ve always found amusing. It is, indeed, a fitting analogy, as BeeGFS has evolved by learning from the path paved by other High-Performance Computing (HPC) storage systems. Like a younger sibling benefiting from their elder’s experiences, BeeGFS is becoming the embodiment of the next generation of HPC storage — user-friendly and opinionated about what features are important to prevent software bloat and deliver fast and reliable performance. However, we suspect that some who call it “baby Lustre” imply BeeGFS is less mature and unable to handle large-scale or enterprise workloads. To them, we respectfully beg to differ.
Historically, if you exclude external factors like hardware failures from the equation, BeeGFS would usually operate happily indefinitely. But all good software should be designed for the hardware it runs on, and in the past, BeeGFS didn’t always react as nicely as we would have liked to the myriad of possible hardware failures. That’s why, over the past few years, the engineering team has prioritized hardening the core product over adding flashy new features to ensure it can thrive in the real world. This is especially true in a modern world where things run in containers that may need to restart at a moment’s notice to accommodate the dynamic nature of cloud environments. Enterprises are also adopting AI and HPC techniques and bringing new expectations for reliability.
If you’ve been a regular at the BeeGFS User Group meetings over the past few years, you’ve undoubtedly heard us emphasize this focus on quality. So, we wouldn’t blame you for asking, “Are we there yet?” Now, with the introduction of a new minor version (boasting the most substantial feature since the introduction of NVIDIA Magnum IO GPUDirect Storage in 7.3.0), it feels like an opportune moment to reflect on the current state of the core BeeGFS software. As we perused the release notes from our 7.3 branch, it became apparent that BeeGFS, in its own subtle and steady way, has truly matured over the past few years.
Reflecting on Our Progress
Firstly, you’ll notice the frequency of BeeGFS releases has significantly increased. If you were to look inside the ThinkParQ engineering lab, you’d see a state-of-the-art continuous integration (CI) and testing environment run by our top-notch DevOps team. By transitioning from manual to automated testing, we’ve considerably reduced the overhead of shipping BeeGFS. This means we can ship smaller releases, each with fewer fixes and enhancements, and ship them more frequently. It also ensures that BeeGFS has undergone rigorous testing before it ever handles your data.
Secondly, for the past several releases, changes to the core of BeeGFS have focused on hardening functionality, tightening security, enhancing the user experience, improving interoperability, and addressing edge cases. Here are the highlights from the last few releases:
Version 7.3.0 was mainly about GPUDirect Storage (GDS) and a number of related enhancements. We added client support for multi-rail networking, critical to maximizing performance with modern high-density servers featuring multiple NUMA zones (like NVIDIA’s DGX servers). We also added support for Linux 5.10 and ARM processors, ensuring BeeGFS can be used in heterogeneous environments and the client can run on new hardware platforms such as NVIDIA’s BlueField DPUs or be used at the edge where low-power processing is important.
Version 7.3.1 was all about security. We made connection authentication opt-out and prevented the use of setuid binaries by default. We also set in motion a long-term plan to drastically overhaul how we handle client and server authentication going forward. This release also introduced client support for Linux 5.15, enabling RHEL 9 and Ubuntu 22.04, and we began reconsidering how we support multiple kernel versions to make that support more sustainable and future-proof. Additionally, we snuck in official BeeGFS containers, providing a new path for deploying filesystems.
Version 7.3.2 delivered a multitude of minor fixes with significant impacts. We substantially improved metadata stat and read performance and reinforced support for using the kernel page cache (native mode) on the client, which markedly improves performance for deep learning and other workloads that re-read the same datasets. We also opened up configuration of RDMA/TCP connection timeouts to better align with modern network capabilities and offer faster error recovery.
Version 7.3.3 significantly improved the overall state of high availability in BeeGFS. Server nodes now automatically detect changes in network interface availability and propagate them to the clients, minimizing how long I/O stalls when a device fails. With these changes, BeeGFS is well-suited to run in Linux HA or Kubernetes clusters, where services need to fail over between nodes with shared disks. We also optimized how the management service recovers from system crashes and expanded the scenarios our fsck tool can address.
This leads us to the present day and the release of BeeGFS versions 7.2.10, 7.3.4, and 7.4.0.
A Tale of Three Release Branches
BeeGFS strives to follow semantic versioning, which dictates that we should increment a minor version when we add functionality in a backward-compatible manner. For BeeGFS, this typically occurs when we alter the network communication protocols between clients and servers or modify on-disk data structures. This is why we added support for GDS as 7.3.0 and cross-directory hard links as 7.4.0. Unless you’re planning to utilize the new features, these releases should generally be backward-compatible. However, we still advise against running mixed minor versions of clients and servers beyond your upgrade window. Achieving the level of testing we would need to confidently declare that this will work in 99% of scenarios is challenging. But based on our experience, mixed versions typically operate smoothly without any catastrophic failures, although older components may log messages about unfamiliar functionality.
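To make the advice about mixed minor versions concrete, here is a minimal sketch of how you might flag a cluster whose components span more than one release branch. It assumes plain major.minor.patch version strings; the helper names and the component labels are hypothetical and are not part of any BeeGFS tooling.

```python
from typing import Mapping, NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse a 'major.minor.patch' string such as '7.3.4'."""
    major, minor, patch = (int(part) for part in text.strip().split("."))
    return Version(major, minor, patch)


def same_release_branch(a: str, b: str) -> bool:
    """True when both versions share the same major and minor number."""
    va, vb = parse_version(a), parse_version(b)
    return (va.major, va.minor) == (vb.major, vb.minor)


def mixed_minor_components(versions: Mapping[str, str]) -> bool:
    """True when the given components span more than one release branch."""
    branches = {parse_version(v)[:2] for v in versions.values()}
    return len(branches) > 1


if __name__ == "__main__":
    # Hypothetical inventory taken part-way through an upgrade window.
    cluster = {"client": "7.3.4", "meta": "7.4.0", "storage": "7.4.0"}
    if mixed_minor_components(cluster):
        print("Mixed minor versions detected; finish the upgrade soon.")
```

A check like this only tells you that versions differ; it says nothing about whether a particular combination has been tested.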
While we aren’t considering 7.4.0 a feature preview or beta release, we understand some users might inevitably perceive it as such. We don’t want users to have to choose between being able to upgrade for the latest bug fixes/enhancements and a healthy skepticism of releases ending in “0”. And despite all the testing we do to ensure new features work and perform as expected, we’ve found that large-scale users are still very good at sniffing out a few small bugs we missed. Therefore, in line with our goal of providing stable releases that users feel confident upgrading to, we’ve decided to maintain three release branches, albeit briefly:
The 7.2.x branch is kept alive to backport critical fixes and as many enhancements as possible for our customers using older distributions and kernels, mainly RHEL/CentOS 7. The plan is to eventually discontinue this branch when those distributions reach their end of support.
The 7.3.4 release is anticipated to be the final release of the 7.3 release branch. It primarily extends BeeGFS client support up through Linux 6.1. We understand that quick support for new releases of major Linux distributions is important to the community, so we’ve been heavily investing in keeping up with the ever-evolving state of file system interfaces in the kernel. This release is a continuation of our work over the last few releases to support a wide range of kernel versions seamlessly. Our goal is for the 7.4 branch to become the natural successor to 7.3, supporting the same operating system and driver versions with a low-friction upgrade path.
The 7.4.0 release introduces support for cross-directory hard links and enhances our monitoring capabilities. Previously, hard links in BeeGFS could only exist in the same directory, or you had to use the sysCreateHardlinksAsSymlinks option as a partial workaround. If you are using hard links today, you’ll need to migrate using the new mode in beegfs-fsck. Additionally, this release extends monitoring capabilities by integrating new system metrics collected by Telegraf, which are now displayed alongside BeeGFS metrics in Grafana. These improvements have been developed collaboratively with our DevOps and support teams, drawing heavily from their real-world experience to curate the most relevant metrics and dashboards for performance tuning and troubleshooting. Other than the hard links feature and enhancements to system-level monitoring, the content of this release mirrors that of 7.3.4.
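For readers less familiar with the distinction, the sketch below shows what a cross-directory hard link looks like from an application's point of view, using only generic POSIX calls from the Python standard library. The function name and the directory and file names are hypothetical. It runs against any directory you pass in; pointing it at a BeeGFS mount is an assumption about your setup, and how releases before 7.4.0 report the failure is not covered here.

```python
import os
import tempfile


def create_cross_directory_hard_link(base_dir: str) -> bool:
    """Create a file in one directory and hard-link it from another.

    Returns True when both names refer to the same inode and the link
    count is 2, i.e. the link is a real hard link and not a symlink.
    """
    src_dir = os.path.join(base_dir, "dir_a")
    dst_dir = os.path.join(base_dir, "dir_b")
    os.makedirs(src_dir)
    os.makedirs(dst_dir)

    source = os.path.join(src_dir, "data.txt")
    link = os.path.join(dst_dir, "data_link.txt")
    with open(source, "w") as handle:
        handle.write("hello")

    # os.link raises OSError if the file system refuses the operation.
    os.link(source, link)

    return os.path.samefile(source, link) and os.stat(source).st_nlink == 2


if __name__ == "__main__":
    # A temporary directory on a local file system stands in for a mount point.
    with tempfile.TemporaryDirectory() as scratch:
        print(create_cross_directory_hard_link(scratch))
```

Checking the link count is what separates this from the symlink workaround: a symbolic link would leave the count of the original file at 1.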
As the company behind BeeGFS, ThinkParQ remains obsessively focused on reliability and performance. No two metrics are more crucial for a filesystem. With a decreasing number of support tickets even as our install base has nearly doubled, we’re confident we’re on the right track. This progress attests to the hard work of our engineering teams and the foresight of our leadership team in prioritizing quality.
However, if you’re hoping for an answer to “are we there yet,” we’ve been in this long enough to know we’ll never truly be there, even though our baby has grown up and graduated college. There will always be new Linux versions, workloads, and use cases that challenge BeeGFS in novel and intriguing ways. But we’re excited about what’s to come as we gradually transition our focus towards new feature development and finalizing our vision for BeeGFS 8.
Joe McCormick, Senior Software Engineer, ThinkParQ
Philipp Falk, Head of Engineering, ThinkParQ