In the previous post, we explored the theory behind Raft’s leader election and log replication processes. We learned how nodes transition between follower, candidate, and leader states, and how they use terms and voting to reach consensus on who should lead.
Now it’s time to roll up our sleeves and turn those concepts into working Ruby code. Don’t worry if you haven’t memorized every detail from the last post; we’ll refresh the important bits as we go along.
Note: The code snippets provided here are simplified to focus on the core concepts. You can find the complete working implementation in this repo.
Laying the groundwork
How nodes talk to each other
Before we dive into leader election, we need a way for nodes to communicate. In a real-world distributed system, you’d typically use something like gRPC, REST APIs, or even a message broker like Kafka.
For our implementation, we’ll use Ruby’s built-in DRb (Distributed Ruby) library, which is simple to set up and perfect for demo purposes.
Each node needs to run a DRb server listening on its port:
require 'drb/drb'
require 'logger'

module Raft
  class DRbServer
    def initialize(node, port)
      @node = node
      @port = port
      @logger = Logger.new(STDOUT)
    end

    def start
      uri = "druby://localhost:#{port}"
      DRb.start_service(uri, node)
      logger.info "DRb server started on #{uri}"
    end

    private

    attr_reader :node, :port, :logger
  end
end
To talk to other nodes (forming a cluster), we create remote node connections:
module Raft
  class RemoteNode
    def initialize(node_id, port)
      @node_id = node_id
      @uri = "druby://localhost:#{port}"
    end

    def request_vote(request)
      remote_node.request_vote(request)
    end

    def append_entries(request)
      remote_node.append_entries(request)
    end

    private

    attr_reader :node_id, :uri

    # Lazily connect to the remote node's DRb endpoint
    def remote_node
      @remote_node ||= DRbObject.new_with_uri(uri)
    end
  end
end
This abstraction lets us call methods on remote nodes as if they were local objects (in our case, the RaftNode class).
In production-grade systems, you’d replace this with proper RPC calls and add retry logic, but the concept remains the same.
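To make that concrete, here’s a hypothetical retry wrapper around RemoteNode’s DRb calls. The attempt count and backoff are arbitrary illustrative choices, not part of the demo implementation:

module Raft
  class RemoteNode
    MAX_ATTEMPTS = 3 # hypothetical: tune for your environment

    def request_vote(request)
      with_retries { remote_node.request_vote(request) }
    end

    private

    # Retry transient connection failures with a short linear backoff
    def with_retries
      attempts = 0
      begin
        yield
      rescue DRb::DRbConnError
        attempts += 1
        raise if attempts >= MAX_ATTEMPTS

        sleep(0.1 * attempts)
        retry
      end
    end
  end
end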
Setting up our Raft node
First, let’s define the three states a Raft node can be in:
module Raft
  module NodeState
    FOLLOWER = :follower
    CANDIDATE = :candidate
    LEADER = :leader
  end
end
Next, we need to create our Raft node class; this is where all the magic happens. Each node needs to track a few things. Let’s start with the most important ones:
module Raft
  class RaftNode
    attr_reader :id, :state, :current_term, :voted_for

    def initialize(id, port = nil)
      @id = id
      @port = port

      # Persistent state (must survive restarts)
      @current_term = 0
      @voted_for = nil

      # Node starts as a follower
      @state = NodeState::FOLLOWER

      # Cluster remote nodes and DRb server
      @remote_nodes = {}
      @drb_server = nil

      # For thread safety
      @mutex = Mutex.new

      # Timers for elections and heartbeats
      @election_timer = nil
      @heartbeat_timer = nil

      @logger = Logger.new(STDOUT)
      logger.info "Node #{id} initialized as #{state}"

      # Start election timer
      start_election_timer
    end

    private

    attr_reader :logger, :mutex, :port
    attr_writer :state, :current_term, :voted_for
    attr_accessor :drb_server, :election_timer, :heartbeat_timer, :remote_nodes
  end
end
Notice how every node starts as a follower with term 0. This makes sense: when a cluster first forms, no one is in charge yet, and everyone is waiting to see who will step up as leader.
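Later snippets also rely on a small leader? predicate that the post doesn’t show explicitly. A minimal version, with follower? and candidate? siblings added for convenience, might be:

def leader?
  state == NodeState::LEADER
end

def follower?
  state == NodeState::FOLLOWER
end

def candidate?
  state == NodeState::CANDIDATE
end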
The election timeout: Detecting leader failures
Remember from our last post that elections are triggered by timeouts. If a follower doesn’t hear from a leader for too long, it assumes something is wrong and starts an election. In Raft, this is known as the election timeout.
Let’s implement this timing mechanism:
def start_election_timer
  # Randomized timeout between 5-10 seconds
  election_timeout = rand(5..10)

  self.election_timer = Thread.new do
    sleep(election_timeout)

    # If we are not a leader, start an election
    if state != NodeState::LEADER
      logger.info 'Election timeout - starting election'
      start_election
    end
  end
end

def stop_election_timer
  return unless election_timer

  election_timer.kill if election_timer != Thread.current
  self.election_timer = nil
end

def reset_election_timer
  stop_election_timer
  start_election_timer
end
The randomization of the election timeout is key here. By giving each node a different timeout (say, between 5 and 10 seconds), we reduce the chances of multiple nodes starting elections simultaneously, which helps avoid split votes.
Starting an election: From follower to candidate
When a node’s election timer expires, it’s time to throw its hat in the ring and become a candidate.
Here’s how a node transitions from follower to candidate:
def become_candidate
  self.state = NodeState::CANDIDATE
  self.current_term += 1 # Increment term for new election
  self.voted_for = id    # Vote for self
  reset_election_timer

  logger.info "Became candidate (term #{current_term})"
end
Notice three important things happening here:
- We increment the term number (our logical clock).
- We vote for ourselves.
- We reset the election timer to retry if this election fails (no majority votes).
Requesting and accepting votes: From candidate to leader
Now comes the important part of actually running the election. A candidate needs to receive a majority of votes to become the new leader.
First, we need to send a RequestVote message to each node in parallel:
def majority_count
  # e.g. in a 3-node cluster: (2 + 1) / 2 + 1 = 2 votes needed
  (remote_nodes.size + 1) / 2 + 1
end

def start_election
  become_candidate
  logger.info "Starting election for term #{current_term}"

  # A mutable counter so the vote-counting threads can share it
  votes = { received: 1 } # We already voted for ourselves
  votes_needed = majority_count
  logger.info "Need #{votes_needed} votes, got 1 (self)"

  # Request votes from all other nodes in parallel
  remote_nodes.each do |node_id, remote_node|
    Thread.new do
      vote_request = Models::RequestVote::Request.new(candidate_id: id, term: current_term)
      response = remote_node.request_vote(vote_request)
      process_vote_response(response, votes, votes_needed)
    rescue StandardError => e
      logger.warn "Failed to get vote from #{node_id}: #{e.message}"
    end
  end
end
We spawn a separate thread for each vote request. This parallelism is important: we don’t want to wait on slow nodes when others are ready to vote immediately, and requesting votes sequentially could eat into the election timeout.
When vote responses come back, we handle them based on the term in the response:
- If we receive responses with older terms than ours (imagine we started an election in term 5, but while waiting for votes, we either started a new election (now in term 6) or received a message that bumped us to a higher term), we basically ignore them.
- If we receive responses with higher terms than ours, it tells us there’s been more recent activity in the cluster (perhaps another node won an election while we were campaigning), so we immediately step down to follower and adopt this higher term.
def process_vote_response(response, votes, votes_needed)
  mutex.synchronize do
    # Only count votes if we're still a candidate in the same term
    if state == NodeState::CANDIDATE && current_term == response.term
      if response.granted?
        votes[:received] += 1
        logger.info "Received vote (#{votes[:received]}/#{votes_needed})"

        # Did we win?
        if votes[:received] >= votes_needed
          logger.info "Won election with #{votes[:received]} votes!"
          become_leader
        end
      end
    elsif response.term > current_term
      # Oops, someone has a higher term - we're out of date
      logger.info "Discovered higher term #{response.term}"
      become_follower(response.term)
    end
  end
end
The mutex synchronization is important here because multiple vote responses might arrive simultaneously, and we need to ensure our vote counting is thread-safe.
Granting votes: Being a good citizen
Now let’s look at the other side, how nodes decide whether to grant their vote:
def request_vote(request)
  mutex.synchronize do
    logger.info "Received #{request}"

    # Rule 1: Reject if the candidate's term is old
    if request.term < current_term
      logger.info 'Rejecting vote - term too old'
      return Models::RequestVote::Response.new(term: current_term, vote_granted: false)
    end

    # Rule 2: Update our term if we see a newer one
    become_follower(request.term) if request.term > current_term

    # Rule 3: Grant vote if we haven't voted yet (or already voted for this candidate)
    can_vote = voted_for.nil? || voted_for == request.candidate_id

    if can_vote
      self.voted_for = request.candidate_id
      reset_election_timer # Reset timer when granting vote
      logger.info "Granted vote to #{request.candidate_id}"
      Models::RequestVote::Response.new(term: current_term, vote_granted: true)
    else
      logger.info 'Denied vote - already voted for someone else'
      Models::RequestVote::Response.new(term: current_term, vote_granted: false)
    end
  end
end
There are a lot of things packed into this voting logic:
- We only vote once per term (preventing multiple leaders in the same term).
- We reset our election timer when granting a vote (giving the new leader time to establish itself).
In the next post, we’ll add an additional safety check to ensure candidates have up-to-date logs before we vote for them. For now, our simplified implementation focuses on the core voting mechanism.
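By the way, the Models::RequestVote request and response objects used throughout this section are plain value objects. The post doesn’t show them, so here’s a minimal sketch assuming Ruby’s Struct (the repo’s versions may differ, e.g. with nicer to_s output for the logs):

module Raft
  module Models
    module RequestVote
      # What a candidate sends when asking for a vote
      Request = Struct.new(:candidate_id, :term, keyword_init: true)

      # What a node replies with
      Response = Struct.new(:term, :vote_granted, keyword_init: true) do
        def granted?
          vote_granted
        end
      end
    end
  end
end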
Becoming the leader: Victory!
When a candidate receives enough votes, it transitions to the leader state:
def become_leader
  self.state = NodeState::LEADER

  # Stop the election timer since we're now the leader
  stop_election_timer

  # Reset the heartbeat timer and start sending heartbeats
  reset_heartbeat_timer

  logger.info "Became leader (term #{current_term})"
end
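One detail glossed over here: reset_heartbeat_timer (and stop_heartbeat_timer, used later in become_follower) isn’t shown in the post. A minimal sketch, mirroring the election-timer helpers above (start_heartbeat_timer itself is defined in the Heartbeats section below):

def reset_heartbeat_timer
  stop_heartbeat_timer
  start_heartbeat_timer
end

def stop_heartbeat_timer
  return unless heartbeat_timer

  heartbeat_timer.kill if heartbeat_timer != Thread.current
  self.heartbeat_timer = nil
end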
The new leader needs to start sending heartbeats to all other nodes to announce its leadership and prevent other nodes from starting unnecessary elections.
Heartbeats: Keeping the cluster in sync
Heartbeats are how the leader maintains its leadership and keeps followers from starting new elections.
In Raft, heartbeats are actually empty AppendEntries messages. Here’s how the leader sends them:
def send_heartbeats
  return unless leader?

  logger.info 'Sending heartbeats to followers'

  remote_nodes.each do |node_id, remote_node|
    Thread.new do
      heartbeat = Models::AppendEntries::Request.new(
        leader_id: id,
        term: current_term,
        log_entries: [] # Empty = heartbeat!
      )
      response = remote_node.append_entries(heartbeat)

      # Check if someone has a higher term; if so, we step down to follower
      mutex.synchronize do
        if response.term > current_term
          logger.info "Discovered higher term #{response.term} - stepping down"
          become_follower(response.term)
        end
      end
    rescue StandardError => e
      logger.warn "Failed to send heartbeat to #{node_id}: #{e.message}"
    end
  end
end
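As with RequestVote, the Models::AppendEntries objects are simple value carriers. A minimal sketch of what they might look like (again assuming Struct; check the repo for the real definitions):

module Raft
  module Models
    module AppendEntries
      # Sent by the leader; an empty log_entries array makes it a heartbeat
      Request = Struct.new(:leader_id, :term, :log_entries, keyword_init: true)

      # Sent back by followers
      Response = Struct.new(:term, :success, keyword_init: true)
    end
  end
end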
The leader sends heartbeats at a regular interval:
def start_heartbeat_timer
  self.heartbeat_timer = Thread.new do
    while leader?
      send_heartbeats
      sleep(1.0) # Send a heartbeat every second
    end
  end
end
From the follower’s perspective, receiving a heartbeat (or any AppendEntries) resets its election timer:
def append_entries(request)
  mutex.synchronize do
    # Reply false if term < currentTerm
    if request.term < current_term
      return Models::AppendEntries::Response.new(term: current_term, success: false)
    end

    # Any valid AppendEntries from the current/newer term resets the election timer
    if request.term >= current_term
      become_follower(request.term)
      reset_election_timer # This prevents new elections!
    end

    # For now, just acknowledge the heartbeat
    Models::AppendEntries::Response.new(term: current_term, success: true)
  end
end
As long as the leader is alive and the network is functioning, followers keep resetting their election timers, and the leader remains in charge.
Only when heartbeats stop arriving do followers assume something is wrong and trigger a new election.
Handling edge cases: When things don’t go as planned
Split votes
What if no candidate gets a majority? This can happen when multiple nodes become candidates at nearly the same time. Each candidate will:
- Fail to achieve a majority
- Eventually time out
- Start a new election with a higher term
The randomized timeouts make it likely that one candidate will start slightly earlier in the next round and win.
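To get a feel for why this works, here’s a tiny standalone simulation (not part of the node code) of three nodes drawing random timeouts each round; whoever draws the shortest one campaigns first and usually wins before the others even time out:

# Simulate a few rounds of randomized election timeouts
3.times do |round|
  timeouts = %w[node1 node2 node3].to_h { |n| [n, rand(5..10)] }
  first = timeouts.min_by { |_, timeout| timeout }.first
  puts "Round #{round + 1}: #{timeouts} -> #{first} campaigns first"
end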
Network partitions
Consider what happens when the network splits. If a leader ends up in a minority partition, it can’t get acknowledgements from a majority of nodes.
Meanwhile, the majority partition will elect a new leader. When the partition heals, the old leader will see the higher term number and step down as a follower.
Discovering higher terms
Any time a node sees a message with a higher term than its own, it immediately becomes a follower:
def become_follower(term)
  old_state = state

  self.current_term = term if term > current_term
  self.voted_for = nil if term > current_term # Only clear our vote when the term advances
  self.state = NodeState::FOLLOWER

  # Stop heartbeat timer if we were leader
  stop_heartbeat_timer if old_state == NodeState::LEADER

  # Reset election timer
  reset_election_timer

  logger.info "Became follower (term #{current_term})"
end
This ensures that there’s only one leader per term and that nodes with stale information quickly get back in sync.
Putting it all together
Let’s see our implementation in action! We’ll run three nodes and watch them elect a leader.
First, let’s add two helper methods to our RaftNode class for setting up the cluster:
def setup_cluster_ports(cluster_config)
  cluster_config.each do |node_id, port|
    next if node_id == id # Skip self

    remote_nodes[node_id] = RemoteNode.new(node_id, port)
  end

  logger.info "Connected to cluster nodes: #{remote_nodes.keys}"
end

def start_drb_server
  self.drb_server = DRbServer.new(self, port)
  drb_server.start
end
Now we can create a script that runs a Raft node:
#!/usr/bin/env ruby

# (Require the Raft library files from the repo here.)

DEFAULT_CLUSTER = {
  'node1' => 8001,
  'node2' => 8002,
  'node3' => 8003
}.freeze

node_id = ARGV.fetch(0) # e.g. "node1"
port = DEFAULT_CLUSTER.fetch(node_id)

# Create and configure the node
node = Raft::RaftNode.new(node_id, port)
node.setup_cluster_ports(DEFAULT_CLUSTER)
node.start_drb_server

# Print the node's state every ~10 seconds
loop do
  sleep(1)

  if (Time.now.to_i % 10).zero?
    puts "#{Time.now} - Node #{node_id}: #{node.state} (term #{node.current_term})"
  end
end
Start each node in a separate terminal:
# Terminal 1
ruby demo/start_node.rb node1
# Terminal 2
ruby demo/start_node.rb node2
# Terminal 3
ruby demo/start_node.rb node3
When you start the nodes, you’ll see output like:
=== Starting Raft Node ===
Node ID: node1
Port: 8001
==========================
Node node1 initialized as follower
DRb server started on druby://localhost:8001
Node node1 started successfully!
12:34:15 - Node node1: follower (term 0)
[INFO] Election timeout - starting election
[INFO] Became candidate (term 1)
[INFO] Starting election for term 1
[INFO] Need 2 votes, got 1 (self)
[INFO] Received vote from node2 (2/2)
[INFO] Won election with 2 votes!
[INFO] Became leader (term 1)
12:34:25 - Node node1: leader (term 1)
[INFO] Sending heartbeats to followers
Meanwhile, on node2 and node3, you’ll see them receiving vote requests and heartbeats:
# Node 2 output
[INFO] Received RequestVote(candidate: node1, term: 1)
[INFO] Granted vote to node1
[INFO] Received Heartbeat(leader: node1, term: 1)
[INFO] Became follower (term 1)
12:34:25 - Node node2: follower (term 1)
You can see that node2 and node3 become followers, while node1 becomes the leader and starts sending heartbeats to them.
You can also play around by stopping the leader and watching a new election occur:
# After stopping node1 (the leader)
# Node 2 output
[INFO] Election timeout - starting election
[INFO] Became candidate (term 2)
[INFO] Starting election for term 2
[INFO] Received vote from node3 (2/2)
[INFO] Won election with 2 votes!
[INFO] Became leader (term 2)
Woohoo! This wasn’t too hard, was it?
What’s next?
We’ve successfully implemented leader election, but a leader without followers isn’t very useful. In the next post, we’ll implement log replication to make all nodes maintain an identical copy of the system’s state even when things go wrong.
Stay tuned for Part 3, where we’ll make our cluster actually do something useful!