Note: You must log into RX and authenticate to view any article links within the Release Notes.
Multi-Cluster Reservations Postmortem
We released the ability to add multiple Clusters to a single Reservation at the end of April. Here are the highlights of what went wrong, what we did, and where we're going.
AOS / Hypervisor Management
Before the migration, we knew picking the "right" AOS / Hypervisor was going to be an issue. Every cluster in RX has an assigned datacenter, which has an assigned NFS server holding all AOS / Hypervisor builds for that DC. Creating workloads on a reservation before any cluster is added raises the question: how does RX know which builds are "available"? Currently, it doesn't. RX shows you everything you have permission to see. We understand it's incredibly frustrating to try to get clusters imaged when all you want is a 6.7.1.5 image and it's deployed everywhere EXCEPT the datacenter where your cluster is hosted.
Short-term Solution: We've implemented a stop-gap that filters the available builds based on the clusters already added to the reservation. Use this for now; structural changes coming to "Deployment Planning" will further alleviate this issue. In the long term, we are planning changes to how builds are managed in the backend to remove this issue altogether.
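To illustrate the stop-gap, here is a minimal sketch of that kind of filter. It is not RX's actual code: the `Build` and `Cluster` types, the `selectable_builds` function, and the datacenter names are all hypothetical, and it assumes a build is offered when it is present in the datacenter of at least one cluster on the reservation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Build:
    version: str
    datacenters: frozenset  # hypothetical: DCs whose NFS server holds this build


@dataclass(frozen=True)
class Cluster:
    name: str
    datacenter: str


def selectable_builds(builds, clusters):
    """Return the builds to offer for a reservation (illustrative only)."""
    if not clusters:
        # No cluster added yet, so there is nothing to filter on:
        # fall back to showing everything the user may see.
        return list(builds)
    reserved_dcs = {c.datacenter for c in clusters}
    # Assumption: keep a build if at least one reserved datacenter holds it.
    return [b for b in builds if b.datacenters & reserved_dcs]


# Made-up inventory for the example.
builds = [
    Build("6.7.1.5", frozenset({"DC-A", "DC-B"})),
    Build("6.8", frozenset({"DC-C"})),
]
offered = selectable_builds(builds, [Cluster("cluster-1", "DC-C")])
print([b.version for b in offered])
```

With an empty cluster list the function returns every build, which mirrors the "RX shows you everything" behavior described above.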
Migration of "Legacy" Reservations and Stability
As part of our migration plan, we wrote a script that migrated all pre-existing "legacy" Reservations to new database schemas. For the most part, we got this right. However, we missed a few cases that created failure patterns for specific reservation configurations. At this point, most migrated, active reservations have come and gone. We're closing in on all "legacy" reservations expiring. As these reservations aged out, we saw improved stability over time.
Rocky Linux, Language, and MariaDB Upgrades
We released new code while also doing an OS refresh and upgrading everything simultaneously. This led to post-release issues and made it hard to pinpoint their cause, which greatly impacted the first 72 hours of the release. We won't pair a "big-bang" code release with patching everything simultaneously again.
Recently Released Features
Dashboard Improvements
The redesigned Dashboard groups Reservations into Draft, Active, and Future categories, with the following improvements:
Enhanced visibility of Start and End Dates, with a 36-hour countdown warning
Reservation IDs and Reasons
Color-coded cluster Status Indicators
Quick copy of your password and quick sharing of a reservation
Quick Action menu with the ability to End or Cancel a Reservation
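As a rough sketch of how a countdown warning like this could be computed: the function name and the exact rule are assumptions, namely that the warning appears once the End Date is 36 hours or less away and disappears after the reservation has ended.

```python
from datetime import datetime, timedelta

WARNING_WINDOW = timedelta(hours=36)  # from the release note


def countdown_warning(end, now):
    """Return the time remaining if it falls inside the warning window, else None."""
    remaining = end - now
    if timedelta(0) < remaining <= WARNING_WINDOW:
        return remaining
    return None


# Hypothetical usage: a reservation ending 10 hours from "now".
now = datetime(2024, 5, 1, 12, 0)
print(countdown_warning(now + timedelta(hours=10), now))
```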
Improved Account and Opportunity Lookup
We made the following enhancements to the Salesforce fields:
If you're not part of a Salesforce-integrated Organization, you won't see the Account and Opportunity fields at all.
We updated the way RX communicates with Salesforce to provide better suggestions. Accounts you own, or whose team you belong to, now appear at the top of the list. Opportunity suggestions favor those where you're the SE Owner or a member of the Opportunity team.
If you select an Account first, RX now filters Opportunities to those related to that Account.
If you enter multiple Accounts, Opportunities are disabled.
As you search through Accounts, the ones you own are marked as yours in the results list.
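The lookup rules above can be sketched roughly as follows. This is an illustration only, not the RX or Salesforce implementation: the `Account` and `Opportunity` types and both function names are hypothetical, and it assumes that with no Account selected all Opportunities remain selectable.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    name: str
    owner: str
    team: set = field(default_factory=set)


@dataclass
class Opportunity:
    name: str
    account: str  # name of the related Account


def rank_accounts(accounts, user):
    """Put Accounts the user owns or is a team member of at the top."""
    def priority(acc):
        return 0 if (acc.owner == user or user in acc.team) else 1
    # sorted() is stable, so the original order is kept within each group.
    return sorted(accounts, key=priority)


def selectable_opportunities(opportunities, selected_accounts):
    """Apply the Account-first filter and the multiple-Account rule."""
    if len(selected_accounts) > 1:
        return []  # multiple Accounts entered: Opportunities are disabled
    if not selected_accounts:
        return list(opportunities)  # assumption: no Account means no filtering
    chosen = selected_accounts[0].name
    return [o for o in opportunities if o.account == chosen]
```

For example, `rank_accounts(accounts, "alice")` would list every Account owned by or shared with the hypothetical user `alice` before all others.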
Additional Updates and Fixes
⚒️ Available clusters are now always sorted to the top of the list, eliminating the need for the "Available Only" checkbox.
🪲 Fixed a bug that's been around in RX for YEARS that prevented reservation of clusters which had a preceding reservation that started and stopped at the same time.
🪲 Corrected issues within the Start and Stop Date selections when adding a cluster.
🪲 Switching a cluster from a workload with a defined hypervisor to one that uses Bundled AHV now properly updates the hypervisor.
🪲 Fixed an issue preventing collaborator access to reservations for Nutants who had no other access to RX resources.
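Two of the items above lend themselves to a short sketch: sorting available clusters to the top, and keeping a reservation that starts and stops at the same time from blocking later ones. The names here are hypothetical and the overlap rule is an assumption about one way to handle this, not a description of the actual fix in RX.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    available: bool


def sort_available_first(clusters):
    """Available clusters first; order within each group is preserved."""
    # False sorts before True, so key on "not available".
    return sorted(clusters, key=lambda c: not c.available)


def conflicts(existing_start, existing_end, new_start, new_end):
    """Half-open interval overlap check for two reservations."""
    if existing_start == existing_end:
        # Assumption: a zero-length reservation occupies no time,
        # so it never blocks a new one.
        return False
    return existing_start < new_end and new_start < existing_end
```

The start and end values can be `datetime` objects or any other comparable timestamps.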