Multi-Leader Replication
Core Concept
Multiple writable nodes with conflict resolution strategies
Overview
Multi-leader replication allows multiple nodes to accept writes simultaneously, with changes replicated between all leader nodes. This approach improves write availability and reduces latency for geographically distributed systems, but introduces the complexity of conflict resolution when concurrent writes occur.
This pattern is essential for globally distributed applications that require high write availability and low latency across multiple regions.
Architecture
```
      Region A                      Region B                      Region C
    [Leader A] ←── replication ──→ [Leader B] ←── replication ──→ [Leader C]
         ↓                              ↓                              ↓
   [Follower A1]                  [Follower B1]                  [Follower C1]
   [Follower A2]                  [Follower B2]                  [Follower C2]
```
Key Challenges
1. Conflict Detection and Resolution
```python
import time


class MultiLeaderReplication:
    """Sketch of a multi-leader node using vector clocks to detect conflicts."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.data = {}          # key -> list of (possibly conflicting) versions
        self.vector_clock = {}  # node_id -> logical counter
        self.peers = []         # other MultiLeaderReplication nodes

    def write(self, key, value):
        # Increment the local component of the vector clock
        self.vector_clock[self.node_id] = self.vector_clock.get(self.node_id, 0) + 1

        # Create a versioned entry carrying a snapshot of the clock
        entry = {
            'key': key,
            'value': value,
            'vector_clock': self.vector_clock.copy(),
            'node_id': self.node_id,
            'timestamp': time.time(),
        }

        # Apply locally, then ship to peers
        self.apply_write(entry)
        self.replicate_to_peers(entry)

    def apply_write(self, entry):
        key = entry['key']
        if key not in self.data:
            self.data[key] = []

        # Check for concurrent (conflicting) versions
        conflicts = self.detect_conflicts(key, entry)
        if conflicts:
            # Resolve the conflict using the configured strategy
            resolved = self.resolve_conflict(conflicts + [entry])
            self.data[key] = [resolved]
        else:
            self.data[key].append(entry)

    def detect_conflicts(self, key, new_entry):
        if key not in self.data:
            return []
        return [existing for existing in self.data[key]
                if self.are_concurrent(existing['vector_clock'],
                                       new_entry['vector_clock'])]

    def are_concurrent(self, vc_a, vc_b):
        # Two writes conflict when neither vector clock dominates the other
        nodes = set(vc_a) | set(vc_b)
        a_before_b = all(vc_a.get(n, 0) <= vc_b.get(n, 0) for n in nodes)
        b_before_a = all(vc_b.get(n, 0) <= vc_a.get(n, 0) for n in nodes)
        return not a_before_b and not b_before_a

    def replicate_to_peers(self, entry):
        # Synchronous fan-out for illustration; real systems replicate async
        for peer in self.peers:
            peer.receive_replicated_write(entry)

    def receive_replicated_write(self, entry):
        # Merge the sender's clock into ours, then apply the write
        for node, count in entry['vector_clock'].items():
            self.vector_clock[node] = max(self.vector_clock.get(node, 0), count)
        self.apply_write(entry)

    def resolve_conflict(self, conflicted_entries):
        # Last-writer-wins: keep the entry with the latest wall-clock timestamp
        return max(conflicted_entries, key=lambda e: e['timestamp'])
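```
A quick demonstration of the sketch above: two leaders accept concurrent writes to the same key while partitioned from each other, and exchanging the entries afterwards surfaces the conflict, which the LWW strategy resolves identically on both sides.
```python
a = MultiLeaderReplication('A')
b = MultiLeaderReplication('B')

# Concurrent writes: neither node has seen the other's update yet
a.write('cart', ['book'])    # vector clock on A: {'A': 1}
b.write('cart', ['laptop'])  # vector clock on B: {'B': 1}

# When connectivity returns, each node ships its entry to the other;
# {'A': 1} and {'B': 1} are concurrent, so LWW picks a single winner
# (exact-tie timestamps would need a deterministic tie-break, e.g. node id)
entry_a = a.data['cart'][0]
entry_b = b.data['cart'][0]
a.receive_replicated_write(entry_b)
b.receive_replicated_write(entry_a)

print(a.data['cart'][0]['value'])  # same winning value on both nodes
print(b.data['cart'][0]['value'])
```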
2. Conflict Resolution Strategies
Last-Writer-Wins (LWW):
- Simple timestamp-based resolution
- Risk of silently losing concurrent updates
- Good for systems where losing some updates is acceptable
Multi-Value Resolution:
- Keep all conflicting values as "siblings"
- Let the application resolve conflicts on read
- Used in Amazon's Dynamo design and in Riak
Custom Resolution Logic:
- Application-specific conflict resolution
- Complex but precise
- Example: merging shopping cart items (see the sketch below)
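As a sketch of custom resolution logic (the function name and cart representation are illustrative, not from any particular system), two conflicting cart versions can be merged by taking the union of items and the maximum quantity seen for each:
```python
def merge_carts(cart_a, cart_b):
    """Merge two conflicting cart versions: union of items, max quantity each.

    Carts are plain dicts mapping item_id -> quantity. Note this scheme cannot
    distinguish 'item removed on one side' from 'item never added', which is
    why production systems often use tombstones or OR-Set CRDTs instead.
    """
    merged = dict(cart_a)
    for item, qty in cart_b.items():
        merged[item] = max(merged.get(item, 0), qty)
    return merged

# Two concurrent versions of the same user's cart
print(merge_carts({'book': 1, 'pen': 2}, {'pen': 3, 'mug': 1}))
# {'book': 1, 'pen': 3, 'mug': 1}
```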
Implementation Patterns
Conflict-Free Replicated Data Types (CRDTs)
```python
class GCounterCRDT:
    """Grow-only counter CRDT."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.counts = {}  # node_id -> count

    def increment(self):
        """Increment this node's local counter."""
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + 1

    def value(self):
        """Current counter value: sum of all per-node counts."""
        return sum(self.counts.values())

    def merge(self, other):
        """Merge with another counter (conflict-free: element-wise max)."""
        result = GCounterCRDT(self.node_id)
        all_nodes = set(self.counts) | set(other.counts)
        for node in all_nodes:
            result.counts[node] = max(self.counts.get(node, 0),
                                      other.counts.get(node, 0))
        return result


class PNCounterCRDT:
    """Increment/decrement counter CRDT built from two grow-only counters."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.increments = GCounterCRDT(node_id)
        self.decrements = GCounterCRDT(node_id)

    def increment(self):
        self.increments.increment()

    def decrement(self):
        # Decrements are modeled as increments of a second G-counter
        self.decrements.increment()

    def value(self):
        return self.increments.value() - self.decrements.value()

    def merge(self, other):
        result = PNCounterCRDT(self.node_id)
        result.increments = self.increments.merge(other.increments)
        result.decrements = self.decrements.merge(other.decrements)
        return result
```
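Because merge is commutative, associative, and idempotent, replicas converge regardless of the order in which they exchange state. A quick demonstration:
```python
a = PNCounterCRDT('A')
b = PNCounterCRDT('B')

a.increment(); a.increment()  # node A sees +2
b.increment(); b.decrement()  # node B sees +1, then -1

# Merge in either direction; both replicas converge to the same value
merged_at_a = a.merge(b)
merged_at_b = b.merge(a)
print(merged_at_a.value(), merged_at_b.value())  # 2 2
```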
Real-World Examples
Amazon DynamoDB Global Tables
```python
# DynamoDB Global Tables handle multi-leader replication automatically;
# conflicts between regions are resolved with last-writer-wins
import time

import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('GlobalTable')

# Writes to any region are replicated to all other regions
table.put_item(
    Item={
        'id': '123',
        'data': 'value',
        'last_updated': int(time.time()),
    }
)
```
Apache Cassandra Multi-DC
```sql
-- Create a keyspace replicated across two datacenters
CREATE KEYSPACE mykeyspace
WITH REPLICATION = {
    'class': 'NetworkTopologyStrategy',
    'datacenter1': 3,
    'datacenter2': 3
};

-- Consistency level is set by the client per session or statement,
-- not in the CQL text itself. In cqlsh:
CONSISTENCY LOCAL_QUORUM;

INSERT INTO users (id, name, email)
VALUES (1, 'John', 'john@example.com');
```
Performance Considerations
Write Latency:
- Local writes: typically single-digit milliseconds
- Cross-region replication: commonly 50-200ms, depending on geography
- Conflict detection and resolution add overhead that varies with the strategy used
Throughput:
- Write throughput can scale with the number of leader nodes
- Ultimately limited by network bandwidth for replication traffic
- High conflict rates degrade performance
Best Practices
- Minimize Conflicts:
  - Partition data by geography/user so each record has one "home" leader (see the routing sketch below)
  - Use CRDTs when possible
  - Implement application-level coordination
- Monitor Conflicts:
  - Track conflict rates
  - Measure resolution latency
  - Alert on high-conflict scenarios
- Design for Conflicts:
  - Plan conflict resolution strategies up front
  - Test edge cases thoroughly
  - Provide manual resolution tools
- Network Optimization:
  - Compress replication data
  - Batch updates when possible
  - Use dedicated replication channels
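A common conflict-avoidance pattern is to give every record a home region and route its writes there, so concurrent cross-region writes to the same key never happen in the steady state. A minimal sketch, assuming hypothetical region names and a hash-based assignment (both are illustrative, not from any particular system):
```python
import hashlib

# Hypothetical region identifiers; in practice these map to real leader endpoints
REGIONS = ['us-east', 'eu-west', 'ap-south']

def home_region(user_id):
    """Deterministically pin each user to one leader region."""
    digest = hashlib.sha256(user_id.encode()).digest()
    return REGIONS[digest[0] % len(REGIONS)]

def route_write(user_id, key, value, leaders):
    """Send the write to the user's home leader; other regions only replicate."""
    region = home_region(user_id)
    leaders[region].write(f'{user_id}:{key}', value)

print(home_region('alice'))  # always the same region for the same user
```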
Trade-offs
Advantages:
- High write availability
- Low latency for geo-distributed writes
- No single point of failure
- Better performance for global applications
Disadvantages:
- Complex conflict resolution
- Potential data inconsistency
- Higher implementation complexity
- Difficult to reason about consistency
Multi-leader replication is powerful but complex, requiring careful design and conflict resolution strategies to maintain data integrity across distributed systems.