etcd 0.4.0 with Standby Mode
May 20, 2014 · By Yicheng Qin
The etcd team has been focused on making it easier to scale and manage larger clusters, and is happy to announce a release with features to help: etcd v0.4.0. This release is an important step on our road to 1.0. (If you are new to etcd, our getting started guide can give you a quick overview of the project.)
This release will be available in the alpha channel of CoreOS in the next few days, and will roll out to the beta channel after it has proven solid.
What’s New in 0.4.0
Cluster Management API
This release introduces a documented API for removing machines from a cluster. This gives administrators the ability to remove downed machines from the cluster so that new machines can join.
This new endpoint also includes options for controlling two important new cluster-wide features: standby mode, and automatic node promotion/demotion.
The new “standby mode” adds extra resiliency to etcd. Every machine in an etcd cluster is now in one of two modes: peer mode or standby mode. Peers participate in the Raft consensus algorithm; in earlier releases of etcd, every node in a cluster was a peer. Standbys, on the other hand, do not participate in consensus, and instead redirect client requests to participating peers. This setup lets users keep a small number of machines participating directly in consensus while a much larger number of standbys stay up to date on the cluster's membership.
You can find more details about standby mode here.
To complement standby mode, etcd can automatically reconfigure the cluster by promoting and demoting nodes between standby and peer modes. As part of the cluster configuration, every cluster now has a defined maximum number of participating peers (activeSize). Any etcd process that attempts to join a cluster whose peer count is already at or above this maximum will start out as a standby. If the number of peers in the cluster drops below the configured activeSize, then nodes in standby will automatically be promoted to peers until the activeSize is reached.
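The join and promotion rules described above can be illustrated with a small toy model. This is only a sketch and not etcd's implementation: the `Cluster` class, its method names, and the first-in, first-promoted ordering of standbys are assumptions made for illustration, and any delays etcd applies before promoting or demoting a node are ignored.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    """Toy model of peer/standby bookkeeping (hypothetical, not etcd code)."""
    active_size: int
    peers: List[str] = field(default_factory=list)
    standbys: List[str] = field(default_factory=list)

    def join(self, name: str) -> str:
        # A joining node becomes a peer only while the peer count is below active_size.
        if len(self.peers) < self.active_size:
            self.peers.append(name)
            return "peer"
        self.standbys.append(name)
        return "standby"

    def remove_peer(self, name: str) -> None:
        # Removing a peer frees a slot, so standbys are promoted to fill it.
        self.peers.remove(name)
        self._promote()

    def _promote(self) -> None:
        # Assumption: the standby that has waited longest is promoted first.
        while len(self.peers) < self.active_size and self.standbys:
            self.peers.append(self.standbys.pop(0))


cluster = Cluster(active_size=3)
modes = [cluster.join(f"node{i}") for i in range(5)]
cluster.remove_peer("node1")
print(modes)                            # mode assigned to each joining node
print(cluster.peers, cluster.standbys)  # membership after the removal
```

With an activeSize of 3, the first three nodes join as peers and the last two as standbys; removing one peer then promotes a standby so the cluster returns to three peers.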
You can learn more about this feature in the API documentation.
Clients can now use HEAD to check for the existence of a key without receiving a response body. This is useful for avoiding unnecessary network costs when the body of the response isn’t required.
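As a rough sketch of how a client might use this, the helper below sends a HEAD request with Python's standard library and looks only at the status code. The function name `key_exists` is hypothetical, and the `/v2/keys/` path prefix and the treatment of 404 as "key not found" are assumptions about the key-value endpoint that should be checked against the API documentation for your version.

```python
import urllib.error
import urllib.request


def key_exists(base_url: str, key: str, timeout: float = 5.0) -> bool:
    """Return True if the key exists, using HEAD so no response body is sent."""
    # Assumption: keys are served under the /v2/keys/ prefix.
    url = f"{base_url}/v2/keys/{key.lstrip('/')}"
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        # Assumption: a missing key is reported as 404; other errors are re-raised.
        if err.code == 404:
            return False
        raise
```

A call such as `key_exists("http://127.0.0.1:4001", "message")` would then report whether the key is present, assuming an etcd instance is listening for client requests at that address.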
Updated Peer Discovery
When an instance starts up, etcd now uses the following sources, in order, to locate other peers in the cluster:
- log data in the
- a static
This improves on v0.3.0 by giving log data priority over the other sources of peers. In practice this means it’s easier for an etcd instance to reconnect to a cluster of which it was previously a member.
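The general idea of consulting sources in priority order can be sketched as follows. This is an illustration only: the `locate_peers` function, the source names, and the example addresses are hypothetical and do not reflect etcd's internal code or flags.

```python
from typing import Callable, Iterable, List, Tuple

# A source is a (name, lookup) pair; lookup returns the peers it knows about.
PeerSource = Tuple[str, Callable[[], List[str]]]


def locate_peers(sources: Iterable[PeerSource]) -> Tuple[str, List[str]]:
    """Try each source in priority order and return the first that yields peers."""
    for name, lookup in sources:
        peers = lookup()
        if peers:
            return name, peers
    return "none", []


# Hypothetical sources, highest priority first.
sources = [
    ("log", lambda: []),  # no previous cluster membership recorded
    ("static", lambda: ["10.0.0.2:7001", "10.0.0.3:7001"]),
]
print(locate_peers(sources))
```

Because the log is consulted first, an instance that was previously a member of a cluster rejoins it using its recorded peers, and lower-priority sources are used only when the log yields nothing.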
Name IP Migration
If a machine’s address is changed, etcd will now accept the new address. This is useful in the case of new DHCP leases or virtual machine migrations.
Deprecation of /mod Modules
Modules have been a great experimental testing ground for higher-level features built on top of etcd. They have seen some use, but so far they haven’t been consistently stable. We are focusing on improving the etcd core API and will return to these experimental modules in a later release. It was not an easy decision, but we are sure you will be happy with the results it allows us to deliver.
Minor Fixes and Improvements
Along with the features and fixes outlined above, we have made a number of smaller improvements:
- Set the NOCOW flag on logs when Btrfs is detected, to avoid fsync overhead
- Fixed all known data races; etcd now passes the Go race detector
- Fixed timeouts when using HTTPS
- Improved snapshot stability
Please join the etcd development mailing list to discuss etcd with us!
Update: 0.4.1 has been released.