Design WhatsApp
System Design Challenge
What is WhatsApp?
WhatsApp is a messaging platform that allows users to send text messages, media files, and make voice/video calls. It's similar to Telegram, Signal, or iMessage. The service provides real-time messaging, group chats, and end-to-end encryption.
What distinguishes systems like WhatsApp is the combination of real-time messaging, end-to-end encryption, and media sharing. Understanding this design prepares you for interview questions about similar messaging platforms, because the core challenges (message delivery, encryption, group management, and real-time communication) are the same.
Functional Requirements
Core (Interview Focussed)
- Message Sending: Users can send text messages and media files.
- Group Chats: Users can create and participate in group conversations.
- Message Delivery: Ensure reliable message delivery and read receipts.
- End-to-End Encryption: Encrypt messages for security.
Out of Scope
- User authentication and accounts
- Voice and video calling
- Status updates and stories
- Business messaging
- Mobile app specific features
Non-Functional Requirements
Core (Interview Focussed)
- Low latency: Sub-second message delivery time.
- High availability: 99.9% uptime for messaging services.
- Scalability: Handle billions of messages per day.
- Security: Strong encryption and data protection.
Out of Scope
- Data retention policies
- Compliance and privacy regulations
💡 Interview Tip: Focus on low latency, high availability, and security. Interviewers care most about message delivery, encryption, and group management.
Core Entities
Entity | Key Attributes | Notes |
---|---|---|
Message | message_id, sender_id, recipient_id, content, timestamp | Indexed by recipient_id for fast delivery |
Chat | chat_id, chat_type, participants, created_at | Individual and group chats |
User | user_id, username, phone_number, status | User account information |
Media | media_id, message_id, file_type, file_url, size | Media file information |
Encryption | key_id, user_id, public_key, created_at | Public keys for end-to-end encryption; private keys stay on the user's device |
💡 Interview Tip: Focus on Message, Chat, and Encryption as they drive message delivery, group management, and security.
Core APIs
Message Management
- POST /messages { recipient_id, content, media_id }: Send a new message
- GET /messages/{message_id}: Get message details
- PUT /messages/{message_id}/read: Mark message as read
- GET /messages?chat_id=&limit=: Get chat messages
Chat Management
- POST /chats { chat_type, participants[] }: Create a new chat
- GET /chats/{chat_id}: Get chat details
- PUT /chats/{chat_id}/participants { participants[] }: Update chat participants
- GET /chats?user_id=&limit=: Get user's chats
Media Sharing
- POST /media/upload { file, file_type }: Upload media file
- GET /media/{media_id}: Get media file
- GET /media/{media_id}/download: Download media file
- DELETE /media/{media_id}: Delete media file
Encryption
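- POST /encryption/keys { public_key }: Upload public key
- GET /encryption/keys/{user_id}: Get user's public key
- POST /encryption/encrypt { message, recipient_key }: Encrypt message
- POST /encryption/decrypt { encrypted_message, private_key }: Decrypt message
Note: For encryption to be end-to-end, the encrypt and decrypt operations must run on the client device. A private key should never be sent to a server, so treat the last two entries as client-side library calls rather than HTTP endpoints.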
High-Level Design
System Architecture Diagram
Key Components
- Message Service: Handle message CRUD operations
- Chat Service: Manage individual and group chats
- Real-time Service: Handle WebSocket connections and real-time updates
- Encryption Service: Manage end-to-end encryption
- Media Service: Handle media file upload and sharing
- Database: Persistent storage for messages, chats, and users
Mapping Core Functional Requirements to Components
Functional Requirement | Responsible Components | Key Considerations |
---|---|---|
Message Sending | Message Service, Real-time Service | Message delivery, real-time updates |
Group Chats | Chat Service, Message Service | Group management, message distribution |
Message Delivery | Message Service, Real-time Service | Delivery guarantees, read receipts |
End-to-End Encryption | Encryption Service, Message Service | Key management, message encryption |
Detailed Design
Message Service
Purpose: Handle message creation, storage, and delivery.
Key Design Decisions:
- Message Storage: Store messages efficiently with encryption
- Delivery Guarantees: Ensure reliable message delivery
- Read Receipts: Track message read status
- Message Ordering: Maintain message order in conversations
Algorithm: Message delivery
1. Receive message from sender
2. Treat message content as opaque ciphertext: with end-to-end encryption, the sender's device encrypts it before sending
3. Store encrypted message in database
4. Determine recipients:
   - For individual chat: single recipient
   - For group chat: all group members
5. For each recipient:
   - Check recipient's online status
   - If online: send via WebSocket
   - If offline: store for later delivery
6. Update message status
7. Send delivery confirmation to sender
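The fan-out in steps 4 to 7 can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: `DeliveryRouter`, its method names, and the `"delivered"`/`"queued"` status strings are hypothetical, and a plain callable stands in for a WebSocket connection.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Message:
    message_id: str
    chat_id: str
    sender_id: str
    ciphertext: bytes  # opaque to the server; encrypted on the sender's device


class DeliveryRouter:
    """Hypothetical router: push to online users, queue for offline ones."""

    def __init__(self, chat_members):
        self.chat_members = chat_members        # chat_id -> set of user_ids
        self.online = {}                        # user_id -> callable(Message)
        self.offline_queue = defaultdict(list)  # user_id -> pending messages
        self.status = {}                        # (message_id, user_id) -> status

    def connect(self, user_id, send):
        """Register a live connection and flush anything queued while offline."""
        self.online[user_id] = send
        for msg in self.offline_queue.pop(user_id, []):
            send(msg)
            self.status[(msg.message_id, user_id)] = "delivered"

    def disconnect(self, user_id):
        self.online.pop(user_id, None)

    def deliver(self, msg):
        """Fan out to every chat member except the sender; return per-user status."""
        result = {}
        for user_id in sorted(self.chat_members[msg.chat_id] - {msg.sender_id}):
            send = self.online.get(user_id)
            if send is not None:
                send(msg)
                state = "delivered"
            else:
                self.offline_queue[user_id].append(msg)
                state = "queued"
            self.status[(msg.message_id, user_id)] = state
            result[user_id] = state
        return result
```

The returned dictionary is what the service could use to build the delivery confirmation for the sender. A real system would persist the offline queue and wait for a client acknowledgement before marking a message as delivered.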
Real-time Service
Purpose: Handle WebSocket connections and broadcast real-time updates.
Key Design Decisions:
- WebSocket Connections: Maintain persistent connections for real-time updates
- Message Broadcasting: Broadcast messages to relevant users
- Connection Management: Handle connection drops and reconnections
- Update Filtering: Send relevant updates to each user
Algorithm: Real-time message broadcasting
1. User connects to message stream
2. Send recent messages to user
3. When new message arrives:
   - For an individual chat, check whether the connected user is the recipient:
     - Send message via WebSocket
     - Update message status
   - For a group chat:
     - Send to all connected group members
4. Handle connection drops gracefully
5. On reconnect, send the user any messages missed while disconnected
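One common way to replay missed messages on reconnect is to give every message in a chat an increasing sequence number and have each client remember the last one it saw. The sketch below assumes that approach; `ChatStream`, `Subscriber`, and the sequence-number scheme are illustrative names and choices, not something the design above prescribes.

```python
class ChatStream:
    """Hypothetical per-chat log with monotonically increasing sequence numbers."""

    def __init__(self):
        self._log = []  # list of (seq, payload); seq starts at 1

    def append(self, payload):
        seq = len(self._log) + 1
        self._log.append((seq, payload))
        return seq

    def since(self, last_seen_seq):
        """Return every entry with seq > last_seen_seq, in order."""
        return self._log[last_seen_seq:]


class Subscriber:
    """Client-side view: tracks the last sequence number it has processed."""

    def __init__(self, stream):
        self.stream = stream
        self.last_seen_seq = 0
        self.received = []

    def on_message(self, seq, payload):
        # Ignore duplicates so a replay after reconnect is idempotent.
        if seq <= self.last_seen_seq:
            return
        self.received.append(payload)
        self.last_seen_seq = seq

    def reconnect(self):
        """Fetch and apply everything appended since the last seen message."""
        for seq, payload in self.stream.since(self.last_seen_seq):
            self.on_message(seq, payload)
```

Because the client deduplicates by sequence number, the server can resend generously after a dropped connection without showing a message twice.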
Encryption Service
Purpose: Manage end-to-end encryption for messages.
Key Design Decisions:
- Key Management: Generate and manage encryption keys
- Message Encryption: Encrypt messages on the sender's device before they are sent and stored
- Key Exchange: Secure key exchange between users
- Key Rotation: Rotate keys periodically for security
Algorithm: End-to-end encryption
1. Generate a key pair for the user on the user's device
2. Upload the public key to the server; keep the private key on the device
3. When sending message:
   - Get recipient's public key
   - Encrypt message with recipient's public key
   - Store encrypted message
4. When receiving message:
   - Decrypt message with private key
   - Display decrypted content
5. Handle key rotation:
   - Generate new keys periodically
   - Update key references
   - Keep retired private keys on the device so older messages remain readable (the server holds no private keys and cannot re-encrypt stored messages)
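The server-side part of key management and rotation can be sketched as a directory that holds public keys only. This is a minimal illustration under stated assumptions: `KeyDirectory`, `PublicKeyRecord`, the 30-day lifetime, and the `key_id` format are all hypothetical, and the actual cryptography (key generation, encryption, decryption) is assumed to happen on the client with a vetted library.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class PublicKeyRecord:
    key_id: str
    user_id: str
    public_key: bytes
    created_at: datetime
    expires_at: datetime


class KeyDirectory:
    """Hypothetical server-side directory. It stores public keys only."""

    def __init__(self, lifetime=timedelta(days=30)):
        self.lifetime = lifetime  # assumed rotation period
        self._keys = {}           # user_id -> list of records, oldest first

    def upload(self, user_id, public_key, now):
        """Register a new public key; older keys are kept until they expire."""
        records = self._keys.setdefault(user_id, [])
        record = PublicKeyRecord(
            key_id=f"{user_id}-{len(records) + 1}",
            user_id=user_id,
            public_key=public_key,
            created_at=now,
            expires_at=now + self.lifetime,
        )
        records.append(record)
        return record

    def current(self, user_id, now):
        """Newest unexpired key, or None if the user must upload a fresh one."""
        for record in reversed(self._keys.get(user_id, [])):
            if record.expires_at > now:
                return record
        return None
```

A sender would call `current()` for the recipient before encrypting, and a `None` result signals that the recipient's device needs to publish a new key.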
Chat Service
Purpose: Manage individual and group chat functionality.
Key Design Decisions:
- Chat Creation: Create individual and group chats
- Participant Management: Add/remove participants from group chats
- Chat Metadata: Track chat information and settings
- Chat History: Maintain chat message history
Algorithm: Group chat management
1. Create group chat:
   - Generate unique chat ID
   - Add initial participants
   - Set chat permissions
2. Add participant:
   - Validate user exists
   - Add to participants list
   - Send notification to existing members
3. Remove participant:
   - Remove from participants list
   - Update chat permissions
   - Send notification to remaining members
4. Update chat settings:
   - Modify chat name/description
   - Update permissions
   - Notify all participants
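The create, add, and remove steps can be sketched as a small in-memory class. The names here (`GroupChat`, the `"admin"`/`"member"` roles, the `notify` callback) are hypothetical, and the rule that only admins may change membership is one possible permission model rather than a requirement stated above.

```python
import uuid


class GroupChat:
    """Hypothetical in-memory group chat with admin-only membership changes."""

    def __init__(self, creator_id, participants, notify):
        self.chat_id = str(uuid.uuid4())  # unique chat ID
        self.roles = {creator_id: "admin"}
        self.notify = notify  # callable(user_id, event) supplied by the caller
        for user_id in participants:
            self.roles.setdefault(user_id, "member")

    def _require_admin(self, actor_id):
        if self.roles.get(actor_id) != "admin":
            raise PermissionError(f"{actor_id} is not an admin of this chat")

    def add_participant(self, actor_id, user_id, known_users):
        """Add a user and notify the members who were already in the chat."""
        self._require_admin(actor_id)
        if user_id not in known_users:
            raise ValueError(f"unknown user {user_id}")
        if user_id in self.roles:
            return False  # already a member; nothing to do
        existing = list(self.roles)
        self.roles[user_id] = "member"
        for member in existing:
            self.notify(member, f"{user_id} joined")
        return True

    def remove_participant(self, actor_id, user_id):
        """Remove a user and notify the members who remain."""
        self._require_admin(actor_id)
        if self.roles.pop(user_id, None) is None:
            return False
        for member in self.roles:
            self.notify(member, f"{user_id} left")
        return True
```

The sketch does not handle the case where the last admin leaves; a fuller version would promote another member or block the removal.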
Database Design
Messages Table
Field | Type | Description |
---|---|---|
message_id | VARCHAR(36) | Primary key |
chat_id | VARCHAR(36) | Associated chat |
sender_id | VARCHAR(36) | Message sender |
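content | TEXT | Plaintext message content; left NULL when end-to-end encryption is enabled, since the server must not see plaintext |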
media_id | VARCHAR(36) | Associated media |
encrypted_content | TEXT | Encrypted message |
timestamp | TIMESTAMP | Message timestamp |
status | VARCHAR(50) | Message status |
Indexes:
- idx_chat_timestamp on (chat_id, timestamp): Chat messages
- idx_sender_id on (sender_id): User messages
- idx_status on (status): Message status queries
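To show why the composite index on (chat_id, timestamp) matters, here is a minimal sketch using Python's built-in `sqlite3` module. It is an assumption-laden illustration: SQLite types replace the VARCHAR/TIMESTAMP types in the table, the timestamp is stored as epoch milliseconds, several columns are omitted for brevity, and `recent_messages` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE messages (
        message_id        TEXT PRIMARY KEY,
        chat_id           TEXT NOT NULL,
        sender_id         TEXT NOT NULL,
        encrypted_content BLOB NOT NULL,
        timestamp         INTEGER NOT NULL,  -- Unix epoch milliseconds (assumed)
        status            TEXT NOT NULL DEFAULT 'sent'
    );
    CREATE INDEX idx_chat_timestamp ON messages (chat_id, timestamp);
    """
)


def recent_messages(chat_id, limit):
    """Newest-first page of one chat's history, filtered and ordered by the indexed columns."""
    rows = conn.execute(
        "SELECT message_id FROM messages "
        "WHERE chat_id = ? ORDER BY timestamp DESC LIMIT ?",
        (chat_id, limit),
    )
    return [row[0] for row in rows]


conn.executemany(
    "INSERT INTO messages "
    "(message_id, chat_id, sender_id, encrypted_content, timestamp) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("m1", "c1", "alice", b"...", 1000),
        ("m2", "c1", "bob", b"...", 2000),
        ("m3", "c2", "alice", b"...", 1500),
    ],
)
```

Because the query filters on `chat_id` and orders by `timestamp`, the database can walk the index directly instead of sorting the chat's rows on every page load.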
Chats Table
Field | Type | Description |
---|---|---|
chat_id | VARCHAR(36) | Primary key |
chat_type | VARCHAR(50) | Chat type (individual/group) |
name | VARCHAR(255) | Chat name |
description | TEXT | Chat description |
created_at | TIMESTAMP | Chat creation |
last_message_at | TIMESTAMP | Last message time |
Indexes:
- idx_chat_type on (chat_type): Chat type queries
- idx_last_message_at on (last_message_at): Recent chats
Chat Participants Table
Field | Type | Description |
---|---|---|
participant_id | VARCHAR(36) | Primary key |
chat_id | VARCHAR(36) | Associated chat |
user_id | VARCHAR(36) | Participant user |
role | VARCHAR(50) | Participant role |
joined_at | TIMESTAMP | Join timestamp |
Indexes:
- idx_chat_id on (chat_id): Chat participants
- idx_user_id on (user_id): User chats
- unique_chat_user on (chat_id, user_id): Prevent duplicate participants
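The unique constraint on (chat_id, user_id) lets the database reject duplicate memberships even when two requests race. A minimal sketch with Python's built-in `sqlite3` module follows; the SQLite column types and the `add_participant` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE chat_participants (
        participant_id TEXT PRIMARY KEY,
        chat_id        TEXT NOT NULL,
        user_id        TEXT NOT NULL,
        role           TEXT NOT NULL DEFAULT 'member',
        UNIQUE (chat_id, user_id)
    )
    """
)


def add_participant(participant_id, chat_id, user_id):
    """Insert a membership row.

    Returns False when a constraint is violated, which here means the user is
    already in the chat or the participant_id has been reused.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO chat_participants "
                "(participant_id, chat_id, user_id) VALUES (?, ?, ?)",
                (participant_id, chat_id, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Relying on the constraint, rather than a check-then-insert in application code, avoids a window in which two concurrent requests both see the user as absent.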
Encryption Keys Table
Field | Type | Description |
---|---|---|
key_id | VARCHAR(36) | Primary key |
user_id | VARCHAR(36) | Key owner |
public_key | TEXT | Public encryption key |
key_type | VARCHAR(50) | Key type |
created_at | TIMESTAMP | Key creation |
expires_at | TIMESTAMP | Key expiration |
Indexes:
- idx_user_id on (user_id): User keys
- idx_expires_at on (expires_at): Key expiration queries
Scalability Considerations
Horizontal Scaling
- Message Service: Scale horizontally with load balancers
- Real-time Service: Scale WebSocket connections with load balancers
- Chat Service: Use consistent hashing for chat partitioning
- Database: Shard messages by chat_id so a conversation's history lives on one shard, and shard user and membership data by user_id
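Consistent hashing for chat partitioning can be sketched as a hash ring with virtual nodes. This is an illustrative implementation, not the only way to do it: `ConsistentHashRing`, the shard names, the use of SHA-256, and 100 virtual nodes per shard are all assumptions.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Hypothetical hash ring with virtual nodes for smoother key distribution."""

    def __init__(self, nodes, vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        # First 8 bytes of SHA-256, interpreted as an unsigned integer.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add_node(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, chat_id):
        """First virtual node at or after the key's hash, wrapping around."""
        if not self._ring:
            raise LookupError("ring is empty")
        # A 1-tuple sorts before any (hash, node) pair with the same hash.
        idx = bisect.bisect_left(self._ring, (self._hash(chat_id),))
        return self._ring[idx % len(self._ring)][1]
```

The property that makes this attractive for scaling is that removing or adding a shard only remaps the chats that belonged to that shard's segments of the ring; every other chat keeps its placement.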
Caching Strategy
- Redis: Cache recent messages and chat metadata
- Application Cache: Cache frequently accessed data
- Database Cache: Cache message and chat data
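A cache for recent messages is often built with the cache-aside pattern: read from the cache, fall back to the database on a miss, and invalidate on write. The sketch below uses an in-process LRU as a stand-in for Redis; `RecentMessageCache`, its capacity, and the `load_from_db` callback are hypothetical.

```python
from collections import OrderedDict


class RecentMessageCache:
    """Hypothetical cache-aside LRU keyed by chat_id; stands in for Redis."""

    def __init__(self, load_from_db, capacity=1000):
        self.load_from_db = load_from_db  # callable(chat_id) -> list of messages
        self.capacity = capacity
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, chat_id):
        if chat_id in self._entries:
            self._entries.move_to_end(chat_id)  # mark as most recently used
            self.hits += 1
            return self._entries[chat_id]
        self.misses += 1
        messages = self.load_from_db(chat_id)
        self._entries[chat_id] = messages
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return messages

    def invalidate(self, chat_id):
        """Call after a new message is written so readers do not see stale data."""
        self._entries.pop(chat_id, None)
```

Tracking hits and misses, as done here, gives a direct way to judge whether the cache capacity fits the working set of active chats.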
Performance Optimization
- Connection Pooling: Efficient database connections
- Batch Processing: Batch message operations for efficiency
- Async Processing: Non-blocking message processing
- Resource Monitoring: Monitor CPU, memory, and network usage
Monitoring and Observability
Key Metrics
- Message Latency: Average message delivery time
- WebSocket Connections: Number of active connections
- Encryption Performance: Time to encrypt/decrypt messages
- System Health: CPU, memory, and disk usage
Alerting
- High Latency: Alert when message delivery time exceeds threshold
- Connection Drops: Alert when WebSocket connections drop frequently
- Encryption Errors: Alert when encryption operations fail
- System Errors: Alert on message processing failures
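A high-latency alert can be sketched as a percentile check over recent delivery times. The function names and the 1000 ms threshold (taken from the sub-second latency requirement) are illustrative, and the use of the 95th percentile instead of the average is an assumption: tail latency is usually what users notice, but either statistic would fit the alert described above.

```python
import statistics


def p95(samples_ms):
    """95th percentile of the samples (needs at least two data points)."""
    # n=20 yields 19 cut points; the last one is the 95th percentile.
    return statistics.quantiles(samples_ms, n=20)[-1]


def latency_alert(samples_ms, threshold_ms=1000.0):
    """Return an alert string when p95 delivery latency exceeds the threshold."""
    if len(samples_ms) < 2:
        return None  # not enough data to estimate a percentile
    value = p95(samples_ms)
    if value > threshold_ms:
        return f"HIGH_LATENCY p95={value:.0f}ms threshold={threshold_ms:.0f}ms"
    return None
```

In practice the samples would come from a sliding window (for example, the last few minutes), so a brief spike does not page anyone while a sustained regression does.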
Trade-offs and Considerations
Consistency vs. Availability
- Choice: Strong consistency for message delivery within a chat
- Reasoning: An acknowledged message must not be lost or shown out of order, so delivery state has to be accurate as soon as it is reported
Latency vs. Security
- Choice: Use efficient encryption algorithms
- Reasoning: Balance between message security and delivery speed
Storage vs. Performance
- Choice: Use efficient storage for encrypted messages
- Reasoning: Balance between storage costs and query performance
Common Interview Questions
Q: How would you handle message delivery failures?
A: Retry unacknowledged sends, store messages for offline recipients until they reconnect, and mark a message as delivered only after the recipient's delivery confirmation arrives.
Q: How do you ensure message security?
A: Encrypt messages end to end so only the recipient's device can read them, manage keys securely, and authenticate messages so tampering is detectable.
Q: How would you scale this system globally?
A: Deploy regional messaging servers, use geo-distributed databases, and implement data replication strategies.
Q: How do you handle group chat management?
A: Track participants and their roles per chat, enforce permissions on membership and settings changes, and push real-time updates to members when the group changes.
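For the first question, the retry mechanism is commonly implemented as exponential backoff with jitter. The sketch below is one such variant under stated assumptions: `send_with_retry`, its default delays, and the convention that `send` returns `True` on acknowledgement are all hypothetical.

```python
import random
import time


def send_with_retry(send, message, max_attempts=5, base_delay=0.5,
                    max_delay=30.0, sleep=time.sleep, rng=random.random):
    """Call send(message) until it returns True (acknowledged) or attempts run out.

    Uses exponential backoff with full jitter. Returns the number of attempts
    used, or raises TimeoutError so the caller can queue the message instead.
    """
    for attempt in range(1, max_attempts + 1):
        if send(message):
            return attempt
        if attempt < max_attempts:
            # Double the cap after each failure, up to max_delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(rng() * cap)  # full jitter: wait a random time in [0, cap)
    raise TimeoutError(f"no acknowledgement after {max_attempts} attempts")
```

The jitter spreads retries from many clients over time, which helps avoid a synchronized burst of reconnects after an outage. The `sleep` and `rng` parameters exist only so the function can be tested without real delays.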
Key Takeaways
- Message Delivery: Reliable delivery mechanisms and read receipts are essential for messaging platforms
- End-to-End Encryption: Secure key management and message encryption ensure user privacy
- Group Management: Participant management and real-time updates enable group chat functionality
- Scalability: Horizontal scaling and partitioning are crucial for handling large-scale messaging
- Monitoring: Comprehensive monitoring ensures system reliability and performance