Achieving Rapid Ceph OSD Recovery Using SBB Technology | Ceph storage solution and service provider. Full-Stack software for Ceph.

Using SBB servers in Ceph storage enables faster OSD recovery, minimizes downtime, ensures service continuity, and reduces the risks of data loss and performance degradation. | Ceph storage solution and service provider. Full-Stack software for Ceph.


In modern data centers, uninterrupted data availability is critical. Ceph's CRUSH-based data placement and replication tolerate failures and protect data integrity, but hardware redundancy remains crucial for high availability. Introducing Storage Bridge Bay (SBB) servers into the Ceph infrastructure improves resilience by shortening service interruptions during hardware failures.


Challenges of Traditional Ceph Deployments

In a conventional Ceph deployment, each storage server typically hosts multiple Object Storage Daemons (OSDs). If that server suffers a hardware failure, such as a motherboard malfunction or a network card failure, all OSDs on the host go offline simultaneously. The Placement Groups (PGs) stored on those OSDs become degraded, data redundancy is reduced, and Ceph starts a recovery process to rebuild the missing copies on other hosts.

Recovering from such an event can take a significant amount of time, depending on the volume of data and available resources, which can lead to prolonged degraded performance and an increased risk of data loss or service disruption.
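To give a feel for why recovery time depends on data volume and available resources, here is a back-of-the-envelope sketch. The function name, the per-OSD data volume, and the aggregate recovery throughput are all illustrative assumptions, not measurements from a real cluster; actual recovery speed also depends on throttling settings, client load, and network capacity.

```python
def estimate_recovery_hours(data_tb: float, recovery_gb_per_s: float) -> float:
    """Rough time to re-replicate `data_tb` terabytes of data
    at an aggregate recovery throughput of `recovery_gb_per_s` gigabytes/second."""
    if recovery_gb_per_s <= 0:
        raise ValueError("recovery throughput must be positive")
    seconds = (data_tb * 1000.0) / recovery_gb_per_s  # TB -> GB, then GB / (GB/s)
    return seconds / 3600.0


# Hypothetical failed host: 12 OSDs holding 8 TB each, recovered at 2 GB/s.
hours = estimate_recovery_hours(data_tb=12 * 8, recovery_gb_per_s=2.0)
print(f"{hours:.1f} h")  # prints "13.3 h"
```

Under these assumed numbers the cluster would stay degraded for more than half a day, which is the window that faster OSD reactivation aims to shrink.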

Introducing Storage Bridge Bay (SBB) Servers

Storage Bridge Bay (SBB) is a standardized dual-node server architecture designed for high availability. An SBB server houses two independent computing nodes connected to shared storage in a JBOD (Just a Bunch of Disks) configuration. Typically, these servers support dual-port NVMe or SAS drives, providing robust hardware redundancy.

How SBB Enhances Ceph High Availability

In an SBB-based Ceph deployment, the two nodes operate in active-active mode, meaning both nodes run Ceph OSD services at the same time. For example, a typical SBB server equipped with 24 NVMe SSDs splits them equally between the two nodes, so each node initially manages 12 OSDs.

This design ensures that if one node fails, only half of the OSDs become temporarily unavailable, rather than all at once, significantly reducing the severity and impact of the failure.
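The arithmetic behind this comparison can be written out as a small sketch. The function name is hypothetical, and it assumes one OSD per drive with drives split evenly across the nodes of a chassis, as in the 24-drive example above.

```python
def osds_offline_on_node_failure(total_osds: int, nodes_per_chassis: int) -> int:
    """OSDs that go offline when one node fails, assuming an even split."""
    if nodes_per_chassis < 1 or total_osds % nodes_per_chassis != 0:
        raise ValueError("OSDs must divide evenly across at least one node")
    return total_osds // nodes_per_chassis


conventional = osds_offline_on_node_failure(24, nodes_per_chassis=1)
sbb = osds_offline_on_node_failure(24, nodes_per_chassis=2)
print(conventional, sbb)  # prints "24 12"
```

After failover, the surviving SBB node would run all 24 OSDs, so it needs enough CPU and memory headroom for twice its normal OSD count.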

Rapid OSD Failover Scenario

When a failure occurs on one node within an SBB server, half of the OSDs become inaccessible. Ambedded Technology has developed a robust script designed to rapidly migrate and reactivate the affected OSDs onto the surviving node.

Here's how the rapid migration process occurs:

1. Obtain the Ceph Container Image: Quickly retrieve the container image reference required for Ceph operations.

2. Remove OSD-specific CRUSH Location: Update the OSD configuration by removing node-specific CRUSH location details.

3. Activate OSD with ceph-volume: Reactivate the OSD services using the ceph-volume utility.

4. Adopt OSD using cephadm: Integrate the activated OSDs back into the Ceph cluster, restoring service swiftly.
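The four steps above can be sketched as a dry-run command plan. This is not Ambedded's script, which is not published in this article; it is a minimal illustration that only builds command lines and does not execute anything. It assumes that the image reference is stored in the `container_image` config option, that the CRUSH location was set per OSD in the central config database as `crush_location`, and that the OSDs are LVM-based. The function names, the OSD id, the fsid, and the image string are placeholders.

```python
import shlex
from typing import List


def image_query_command(osd_id: int) -> List[str]:
    """Step 1: ask the cluster which container image the OSD is configured to use."""
    return ["ceph", "config", "get", f"osd.{osd_id}", "container_image"]


def failover_commands(osd_id: int, osd_fsid: str, image: str) -> List[List[str]]:
    """Steps 2-4 for one OSD, to be run on the surviving node (assumed mapping)."""
    name = f"osd.{osd_id}"
    return [
        # Step 2: drop the per-OSD CRUSH location that pointed at the failed node.
        ["ceph", "config", "rm", name, "crush_location"],
        # Step 3: activate the OSD's logical volume without creating systemd units.
        ["ceph-volume", "lvm", "activate", str(osd_id), osd_fsid, "--no-systemd"],
        # Step 4: hand the activated OSD over to cephadm as a managed container.
        ["cephadm", "--image", image, "adopt", "--style", "legacy", "--name", name],
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(shlex.join(image_query_command(12)))
    for cmd in failover_commands(12, "<osd-fsid>", "<image-from-step-1>"):
        print(shlex.join(cmd))
```

Building the plan as data before running it makes it easy to review or log the exact commands for each affected OSD; a real script would also need error handling and a check that the failed node is truly down before its drives are taken over.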

Benefits of Using SBB for Ceph

1. Minimized Downtime: Rapid OSD reactivation significantly reduces the time spent in degraded states, swiftly restoring PGs to the active+clean state.

2. Enhanced Service Continuity: Prevents prolonged interruptions, maintaining consistent service delivery.

3. Simplified Maintenance: Immediate hardware repairs become less urgent, as services remain operational on the surviving node.

4. Reduced Risk of Data Loss and Performance Degradation: Accelerated recovery processes and hardware redundancy minimize potential risks associated with hardware failures.
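As a companion to the first benefit, the following sketch shows one way to confirm that PGs have returned to active+clean after a failover. It assumes the status has already been fetched with `ceph status --format json` and parsed into a dictionary, and that PG state counts appear under `pgmap.pgs_by_state`; the function name and the sample dictionaries are illustrative, not captured cluster output.

```python
from typing import Any, Dict


def all_pgs_active_clean(status: Dict[str, Any]) -> bool:
    """Return True only if every reported PG state is exactly 'active+clean'."""
    states = status.get("pgmap", {}).get("pgs_by_state", [])
    # An empty list means no PG information, which is treated as "not healthy".
    return bool(states) and all(
        entry.get("state_name") == "active+clean" for entry in states
    )


# Hand-written sample inputs for illustration.
degraded = {"pgmap": {"pgs_by_state": [
    {"state_name": "active+clean", "count": 96},
    {"state_name": "active+undersized+degraded", "count": 32},
]}}
healthy = {"pgmap": {"pgs_by_state": [
    {"state_name": "active+clean", "count": 128},
]}}

print(all_pgs_active_clean(degraded), all_pgs_active_clean(healthy))  # prints "False True"
```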

Summary and Conclusion

Integrating Storage Bridge Bay (SBB) servers with Ceph deployments dramatically enhances the resilience and operational efficiency of storage infrastructures. By leveraging an active-active configuration and rapid OSD reactivation capabilities, organizations can significantly reduce downtime and simplify management.

Ambedded Technology's Ceph appliance, Mars 624, exemplifies this integration by offering a turnkey solution that harnesses the benefits of SBB architecture. Organizations looking to improve Ceph availability and streamline maintenance should consider upgrading to Mars 624 to achieve unparalleled storage reliability and efficiency.

Additionally, Ambedded's UniVirStor full-stack Ceph software fully supports any storage servers built on SBB technology, such as Supermicro's SSG-640SP-DE2CR60, ensuring flexibility and compatibility for diverse infrastructure environments.


Founded in Taiwan in 2013, Ambedded Technology Co., Ltd. is a leading provider of block, file, and object storage solutions based on Ceph software-defined storage. We specialize in delivering high-efficiency, scalable storage systems for data centers, enterprises, and research institutions. Our offerings include Ceph-based storage appliances, server integration, storage optimization, and cost-effective Ceph deployment with simplified management.

Ambedded provides turnkey Ceph storage appliances and full-stack Ceph software solutions tailored for B2B organizations. Our Ceph storage platform supports unified block, file (NFS, SMB, CephFS), and S3-compatible object storage, reducing total cost of ownership (TCO) while improving reliability and scalability. With integrated Ceph tuning, intuitive web UI, and automation tools, we help customers achieve high-performance storage for AI, HPC, and cloud workloads.

With over 20 years of experience in enterprise IT and more than a decade in Ceph storage deployment, Ambedded has delivered 200+ successful projects globally. We offer expert consulting, cluster design, deployment support, and ongoing maintenance. Our commitment to professional Ceph support and seamless integration ensures that customers get the most from their Ceph-based storage infrastructure — at scale, with speed, and within budget.