adamant-kite-43734
08/25/2023, 1:40 PMswift-sunset-4572
08/25/2023, 1:55 PMcreamy-pencil-82913
08/25/2023, 4:31 PMswift-sunset-4572
08/25/2023, 5:20 PMcreamy-pencil-82913
08/25/2023, 5:25 PMcreamy-pencil-82913
08/25/2023, 5:27 PMswift-sunset-4572
08/25/2023, 7:15 PMcreamy-pencil-82913
08/25/2023, 7:17 PMcreamy-pencil-82913
08/25/2023, 7:17 PMcreamy-pencil-82913
08/25/2023, 7:18 PMswift-sunset-4572
08/25/2023, 7:20 PMswift-sunset-4572
08/28/2023, 7:41 AM
Fast disks are the most critical factor for etcd deployment performance and stability.
A slow disk will increase etcd request latency and potentially hurt cluster stability. Since etcd's consensus protocol depends on persistently storing metadata to a log, a majority of etcd cluster members must write every request down to disk. etcd also incrementally checkpoints its state to disk so it can truncate this log. If these writes take too long, heartbeats may time out and trigger an election, undermining the stability of the cluster.
etcd is very sensitive to disk write latency. Typically 50 sequential IOPS (e.g., a 7200 RPM disk) is required. For heavily loaded clusters, 500 sequential IOPS (e.g., a typical local SSD or a high performance virtualized block device) is recommended. Note that most cloud providers publish concurrent IOPS rather than sequential IOPS; the published concurrent IOPS can be 10x greater than the sequential IOPS. To measure actual sequential IOPS, we suggest using a disk benchmarking tool such as diskbench or fio.
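If neither diskbench nor fio is at hand, a rough first look at synced sequential write latency can be had with a few lines of Python. This is only a sketch, not a replacement for a real benchmark: the function names, the 2300-byte block size, and the write count are all assumptions chosen for illustration, and the script should be pointed at a directory on the same disk that would hold etcd's data.

```python
import os
import tempfile
import time


def measure_sync_write_latencies(directory, writes=200, block_size=2300):
    """Append `writes` blocks to a scratch file, syncing after each one.

    Returns the per-write latency in seconds (write plus sync).
    block_size is an arbitrary small value picked for illustration.
    """
    # os.fdatasync is not available on every platform; fall back to os.fsync.
    sync = getattr(os, "fdatasync", os.fsync)
    block = b"\0" * block_size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, block)
            sync(fd)  # force the data to stable storage before the next write
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def summarize(latencies):
    """Return (synced sequential writes per second, p99 latency in seconds)."""
    ordered = sorted(latencies)
    p99 = ordered[min(len(ordered) - 1, int(len(ordered) * 0.99))]
    iops = len(ordered) / sum(ordered)
    return iops, p99


if __name__ == "__main__":
    iops, p99 = summarize(measure_sync_write_latencies("."))
    print(f"synced sequential IOPS: {iops:.0f}, p99 latency: {p99 * 1000:.2f} ms")
```

Because every write is followed by a sync, the resulting figure is closer to the sequential IOPS discussed above than to the concurrent IOPS that cloud providers usually publish.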
etcd requires only modest disk bandwidth, but more disk bandwidth buys faster recovery times when a failed member has to catch up with the cluster. Typically 10 MB/s will recover 100 MB of data within 15 seconds. For large clusters, 100 MB/s or higher is suggested for recovering 1 GB of data within 15 seconds.
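The bandwidth figures can be sanity-checked with simple arithmetic. The helper below is a hypothetical illustration that looks only at raw transfer time and takes 1 GB as 1000 MB; real recovery also involves network and processing overhead, which is presumably why the guidance says "within 15 seconds" rather than quoting the raw figure.

```python
def recovery_seconds(data_mb, bandwidth_mb_per_s):
    """Raw transfer time in seconds for data_mb at bandwidth_mb_per_s."""
    if bandwidth_mb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return data_mb / bandwidth_mb_per_s


print(recovery_seconds(100, 10))    # 10.0
print(recovery_seconds(1000, 100))  # 10.0
```

Both cases come to 10 seconds of raw transfer, which leaves some headroom under the 15-second target.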
When possible, back etcd's storage with an SSD. An SSD usually provides lower write latencies with less variance than a spinning disk, thus improving the stability and reliability of etcd. If using spinning disks, get the fastest disks possible (15,000 RPM). Using RAID 0 is also an effective way to increase disk speed, for both spinning disks and SSDs. With at least three cluster members, mirroring and/or parity variants of RAID are unnecessary; etcd's consistent replication already provides high availability.