How to choose the Erasure Code K & M numbers?

How erasure coding works:
The Ceph erasure code parameters K and M determine the usable capacity efficiency and the redundancy against hardware failure. K is the number of data chunks and M is the number of coding chunks. When a client writes a data object to the Ceph storage cluster, the object is split into K equal-sized data chunks. Ceph then encodes M coding chunks from the data chunks. The coding chunks are used to recalculate lost data chunks when hardware fails.

The K data chunks and M coding chunks are distributed across K+M failure domains of the specified type, one chunk per failure domain. The pool can lose the chunks in at most M failure domains without losing data. The available failure domains depend on how the storage servers are allocated within the physical infrastructure. In a small-scale Ceph cluster, the failure domain can be a disk or a server host. In larger-scale clusters, the failure domain can be a server rack, a server room, or a data center.
When a client reads the data, the object is reconstructed from the data chunks. If some data chunks are unavailable, they are first recalculated from the remaining data chunks and the coding chunks.
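The split, encode, and recover steps described above can be sketched with the simplest possible erasure code: a single XOR parity chunk (M=1). This is only an illustration of the idea, not Ceph's implementation; Ceph's erasure code plugins use more general codes that support M greater than 1. All function names here are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_into_chunks(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-sized chunks, zero-padding the tail."""
    chunk_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")
    return [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]


def encode_parity(chunks: list[bytes]) -> bytes:
    """Compute one coding chunk (M=1) as the XOR of all data chunks."""
    return reduce(xor_bytes, chunks)


def recover_chunk(surviving: list[bytes]) -> bytes:
    """XOR of all surviving chunks (data + parity) yields the missing one."""
    return reduce(xor_bytes, surviving)


data = b"hello erasure coding"
k = 4
chunks = split_into_chunks(data, k)   # K data chunks
parity = encode_parity(chunks)        # M = 1 coding chunk

# Simulate losing the failure domain that held data chunk 2.
lost_index = 2
surviving = [c for i, c in enumerate(chunks) if i != lost_index] + [parity]
rebuilt = recover_chunk(surviving)

# Reassemble the object and strip any padding.
restored = chunks[:lost_index] + [rebuilt] + chunks[lost_index + 1:]
original = b"".join(restored)[:len(data)]
```

With M=1 only one lost chunk can be recovered; tolerating M lost chunks in general requires a code that produces M independent coding chunks.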

Storage performance

Compared to Ceph data replication, erasure coding usually has better usable space efficiency. Because erasure coding needs more disk I/O operations to complete each client I/O, it is less suited to IOPS-demanding workloads. The larger K+M is, the more disk IOPS each I/O operation consumes.

With a larger K and a smaller M, fewer total bytes are transferred over the Ceph cluster network for the same amount of client data. This can increase the I/O throughput of large data objects.
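The effect of K and M on the number of bytes moved can be estimated with simple arithmetic. The sketch below assumes each chunk is exactly object_size / K bytes and ignores padding, metadata, and the initial transfer from the client; the function names and the 4 MiB object size are illustrative choices, not values from Ceph.

```python
def ec_bytes_written(object_size: int, k: int, m: int) -> float:
    """Total bytes stored for one object: K+M chunks of object_size / K each."""
    return object_size * (k + m) / k


def replica_bytes_written(object_size: int, replicas: int) -> int:
    """Total bytes stored for one object in a replicated pool."""
    return object_size * replicas


MIB = 1024 * 1024
object_size = 4 * MIB

for k, m in [(4, 2), (8, 2)]:
    total = ec_bytes_written(object_size, k, m)
    print(f"K={k}, M={m}: {total / MIB:.1f} MiB written for a 4 MiB object")

replica_total = replica_bytes_written(object_size, 3)
print(f"replica 3: {replica_total / MIB:.1f} MiB written for a 4 MiB object")
```

Under these assumptions, raising K from 4 to 8 with M fixed at 2 lowers the bytes written per object, and both profiles write less than a replica 3 pool.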

Space efficiency

The usable space efficiency of an erasure code pool is K/(K+M). For example, the space efficiency of a K=4, M=2 erasure code pool is 4/6 = 66.7%. This is twice the efficiency of a replica 3 pool (1/3 = 33.3%), which provides higher IOPS performance with the same level of hardware redundancy.

A reasonable K is larger than M, because K equal to or smaller than M gives a space efficiency of 50% or less. For a given M, a larger K gives better space efficiency.
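The formula K/(K+M) is easy to tabulate when comparing candidate profiles. The following is a minimal sketch; the list of profiles is an arbitrary selection for illustration, and a replica pool with N copies is treated as having an efficiency of 1/N.

```python
def space_efficiency(k: int, m: int) -> float:
    """Usable fraction of raw capacity for an erasure code pool."""
    return k / (k + m)


def replica_efficiency(replicas: int) -> float:
    """Usable fraction of raw capacity for a replicated pool."""
    return 1 / replicas


# (K, M) pairs chosen only as examples.
profiles = [(2, 2), (4, 2), (8, 2), (8, 3)]

for k, m in profiles:
    print(f"K={k}, M={m}: {space_efficiency(k, m):.1%} usable, "
          f"tolerates {m} lost chunks")

print(f"replica 3: {replica_efficiency(3):.1%} usable, tolerates 2 lost copies")
```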

Number of server hosts

K+M also determines the number of hosts, or larger failure domains, that the cluster requires.

A typical erasure code pool requires a minimum of K+M server hosts so that every EC chunk is stored on a different host.
An advanced erasure code configuration allows multiple EC chunks to be stored in each failure domain. This reduces the number of servers required to distribute the EC chunks, but a single server failure then loses several chunks at once, so fewer server failures can be tolerated.
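The trade-off between server count and failure tolerance can be estimated as follows. This sketch assumes chunks are spread evenly, with at most chunks_per_domain chunks placed in any one failure domain; the function and parameter names are hypothetical and do not correspond to Ceph settings.

```python
import math


def min_failure_domains(k: int, m: int, chunks_per_domain: int = 1) -> int:
    """Fewest failure domains (e.g. hosts) needed to place all K+M chunks."""
    return math.ceil((k + m) / chunks_per_domain)


def tolerated_domain_failures(m: int, chunks_per_domain: int = 1) -> int:
    """Whole failure domains that can be lost while at most M chunks are lost."""
    return m // chunks_per_domain


k, m = 4, 2
for per_domain in (1, 2):
    hosts = min_failure_domains(k, m, per_domain)
    failures = tolerated_domain_failures(m, per_domain)
    print(f"K={k}, M={m}, {per_domain} chunk(s) per host: "
          f"at least {hosts} hosts, tolerates {failures} host failure(s)")
```

For K=4, M=2 this gives six hosts and two tolerated host failures with one chunk per host, or three hosts and one tolerated host failure with two chunks per host.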

Summary of how K and M influence an erasure code pool:

M determines the number of failure domains that can fail without data loss.
A larger K+M reduces small-object IOPS performance for clients, while a larger K with a smaller M improves throughput for large objects.
Storage space efficiency = K/(K+M).
K+M determines the minimum number of servers required.

How to choose the Erasure Code K & M numbers? | Ceph Storage Solutions; Ceph Appliances & Software|Ambedded

