How to choose the Erasure Code K & M numbers?
How erasure coding works
The Ceph erasure code parameters K and M determine the usable capacity efficiency and the redundancy against hardware failure. K is the number of data chunks and M is the number of coding chunks. When a client writes a data object to the Ceph storage cluster, the object is split into K equal-sized data chunks. Ceph then encodes M coding chunks from the data chunks; these are used to recompute lost data chunks when hardware fails.
The data chunks and coding chunks are distributed across K+M failure domains of the specified type. The pool can lose up to M chunks, that is, up to M failure domains, without losing data. The available failure domains depend on how the storage servers are laid out within the physical infrastructure. For a small-scale Ceph cluster, the failure domain could be a disk or a server host. For larger-scale clusters, the failure domain can be a server rack, a server room, or a data center.
When a client reads the data, the object is reassembled from the data chunks. If some data chunks are unavailable, they are first recomputed from the remaining data and coding chunks.
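The split, encode, and reconstruct steps can be illustrated with a toy example. This is only a sketch: it uses a single XOR parity chunk (so M is fixed at 1) as a simplified stand-in for the Reed-Solomon-style codes that real erasure code plugins use, and the function names `encode` and `decode` are hypothetical, not part of any Ceph API.

```python
from functools import reduce
from typing import List, Optional


def split_into_chunks(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-sized chunks, zero-padding the tail."""
    chunk_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(chunk_len * k, b"\0")
    return [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(k)]


def xor_chunks(chunks: List[bytes]) -> bytes:
    """Byte-wise XOR of equal-length chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def encode(data: bytes, k: int) -> List[bytes]:
    """Return k data chunks followed by one XOR coding chunk (M = 1)."""
    data_chunks = split_into_chunks(data, k)
    return data_chunks + [xor_chunks(data_chunks)]


def decode(chunks: List[Optional[bytes]], original_len: int) -> bytes:
    """Rebuild the object from k + 1 chunks, at most one of which is None."""
    missing = [i for i, c in enumerate(chunks) if c is None]
    if len(missing) > 1:
        raise ValueError("XOR parity (M = 1) can recover only one lost chunk")
    if missing:
        # XOR of all surviving chunks equals the single missing chunk.
        survivors = [c for c in chunks if c is not None]
        chunks = list(chunks)
        chunks[missing[0]] = xor_chunks(survivors)
    # Drop the coding chunk and the padding added during the split.
    return b"".join(chunks[:-1])[:original_len]


obj = b"hello ceph erasure coding"
stored = encode(obj, k=4)      # 4 data chunks + 1 coding chunk
stored[1] = None               # simulate losing one failure domain
recovered = decode(stored, len(obj))
```

Losing two chunks in this toy scheme raises an error, which mirrors the rule that a pool can survive at most M lost chunks.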
Storage performance
Compared to Ceph data replication, erasure coding usually has better usable space efficiency. Because erasure coding needs more disk I/O operations to complete each client I/O, it is less suited to IOPS-demanding workloads. The larger K+M is, the more disk IOPS each I/O operation consumes.
With a larger K and a smaller M, fewer total bytes are transferred over the Ceph cluster network. This can increase the I/O throughput for large data objects.
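The trade-off above can be made concrete with some back-of-the-envelope arithmetic. The sketch below is a simplified model, not a measurement: it assumes one disk write per chunk for a full-object write and ignores stripe units, metadata, write-ahead logging, and partial overwrites. The function names are hypothetical.

```python
def ec_write_cost(object_bytes: int, k: int, m: int) -> dict:
    """Estimate chunk writes and raw bytes for one full-object EC write."""
    chunk_bytes = -(-object_bytes // k)  # each chunk holds 1/k of the object
    return {"chunk_writes": k + m, "raw_bytes": chunk_bytes * (k + m)}


def replica_write_cost(object_bytes: int, replicas: int) -> dict:
    """Estimate copy writes and raw bytes for one replicated write."""
    return {"chunk_writes": replicas, "raw_bytes": object_bytes * replicas}


MIB = 1024 * 1024
for label, cost in [
    ("replica 3", replica_write_cost(4 * MIB, 3)),
    ("EC 4+2", ec_write_cost(4 * MIB, 4, 2)),
    ("EC 8+2", ec_write_cost(4 * MIB, 8, 2)),
]:
    print(f"{label}: {cost['chunk_writes']} writes, "
          f"{cost['raw_bytes'] / MIB:.1f} MiB raw")
```

Under these assumptions, moving from 4+2 to 8+2 touches more disks per write (more IOPS consumed) but moves fewer bytes in total, which is why large objects can benefit while small ones suffer.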
Space efficiency
The usable space efficiency of an erasure code pool is K/(K+M). For example, the space efficiency of a K=4, M=2 erasure code pool is 4/6 = 66.7%. This is twice the efficiency of a replica 3 pool (1/3 = 33.3%), which tolerates the same number of failures and provides higher IOPS performance.
K is normally chosen larger than M for the sake of usable space efficiency. The larger K is relative to M, the better the space efficiency.
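As a quick way to compare candidate profiles, the formula K/(K+M) can be evaluated directly. The profiles listed here are illustrative choices, not recommendations, and the helper names are hypothetical.

```python
def space_efficiency(k: int, m: int) -> float:
    """Usable fraction of raw capacity for an erasure code pool."""
    return k / (k + m)


def replica_efficiency(replicas: int) -> float:
    """Usable fraction of raw capacity for a replicated pool."""
    return 1 / replicas


# (K, M) pairs chosen only for illustration.
profiles = [(2, 1), (4, 2), (8, 3), (8, 4)]
for k, m in profiles:
    print(f"K={k}, M={m}: {space_efficiency(k, m):.1%} usable, "
          f"tolerates {m} lost chunks")
print(f"replica 3: {replica_efficiency(3):.1%} usable, tolerates 2 lost copies")
```

Note that 2+1, 4+2, and 8+4 all give the same efficiency because their K to M ratio is the same; what differs is the number of lost chunks each can tolerate and the number of failure domains each needs.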
Number of server hosts
K+M also determines how many hosts, or larger failure domains, the cluster requires.
- A typical erasure code pool requires a minimum of K+M server hosts to fully distribute all EC chunks.
- An advanced erasure code configuration allows storing multiple EC chunks per failure domain. This configuration reduces the required number of servers for distributing EC chunks.
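The two cases above can be sketched as a small calculation. It assumes chunks are spread evenly, with the same number of chunks in every failure domain, and that a failed domain takes all of its chunks with it; the parameter `chunks_per_domain` and both function names are hypothetical.

```python
import math


def min_failure_domains(k: int, m: int, chunks_per_domain: int = 1) -> int:
    """Fewest failure domains needed to place all K+M chunks."""
    return math.ceil((k + m) / chunks_per_domain)


def tolerable_domain_failures(m: int, chunks_per_domain: int = 1) -> int:
    """Whole failure domains that can be lost while losing at most M chunks."""
    return m // chunks_per_domain


# Typical layout: one chunk per host.
print(min_failure_domains(4, 2), tolerable_domain_failures(2))

# Multiple chunks per host: fewer hosts, but each host failure costs more chunks.
print(min_failure_domains(4, 2, chunks_per_domain=2),
      tolerable_domain_failures(2, chunks_per_domain=2))
```

Under these assumptions, placing two chunks per host halves the host count for a 4+2 pool, but the pool then survives the loss of only one host instead of two.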
Summary of erasure code K & M influences:
- M determines how many failure domains can be lost without losing data.
- A larger K+M reduces small-object IOPS performance for clients, while a larger K relative to M improves throughput for large objects.
- Storage space efficiency = K/(K+M)
- K+M determines the minimum number of servers (or other failure domains) required.