Design WhatsApp

System Design Challenge

hard
45-60 minutes
messaging, real-time-communication, end-to-end-encryption, media-sharing

What is WhatsApp?

WhatsApp is a messaging platform that allows users to send text messages, media files, and make voice/video calls. It's similar to Telegram, Signal, or iMessage. The service provides real-time messaging, group chats, and end-to-end encryption.

What sets systems like WhatsApp apart is the combination of real-time messaging, end-to-end encryption, and media sharing. Understanding WhatsApp prepares you for interview questions about similar messaging platforms, since the core design challenges (message delivery, encryption, group management, and real-time communication) remain the same.


Functional Requirements

Core (Interview Focused)

  • Message Sending: Users can send text messages and media files.
  • Group Chats: Users can create and participate in group conversations.
  • Message Delivery: Ensure reliable message delivery and read receipts.
  • End-to-End Encryption: Encrypt messages for security.

Out of Scope

  • User authentication and accounts
  • Voice and video calling
  • Status updates and stories
  • Business messaging
  • Mobile app specific features

Non-Functional Requirements

Core (Interview Focused)

  • Low latency: Sub-second message delivery time.
  • High availability: 99.9% uptime for messaging services.
  • Scalability: Handle billions of messages per day.
  • Security: Strong encryption and data protection.
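
To make these targets concrete, a rough back-of-envelope calculation can help. The sketch below converts a daily message volume into an average per-second rate and an availability target into allowed downtime per year. The figure of 100 billion messages per day is an assumption chosen purely for illustration; it is not a number from this design.

```python
SECONDS_PER_DAY = 24 * 60 * 60
HOURS_PER_YEAR = 365 * 24


def avg_messages_per_second(messages_per_day: int) -> float:
    """Average rate only; peak traffic is typically several times higher."""
    return messages_per_day / SECONDS_PER_DAY


def allowed_downtime_hours_per_year(availability: float) -> float:
    """Downtime budget implied by an availability target such as 0.999."""
    return (1.0 - availability) * HOURS_PER_YEAR


# Assumed volume, for illustration only.
ASSUMED_MESSAGES_PER_DAY = 100_000_000_000

print(round(avg_messages_per_second(ASSUMED_MESSAGES_PER_DAY)))  # about 1.16 million per second
print(round(allowed_downtime_hours_per_year(0.999), 2))          # 8.76 hours per year
```

In an interview, numbers like these are mainly useful for justifying sharding and horizontal scaling later in the design.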

Out of Scope

  • Data retention policies
  • Compliance and privacy regulations

💡 Interview Tip: Focus on low latency, high availability, and security. Interviewers care most about message delivery, encryption, and group management.


Core Entities

| Entity | Key Attributes | Notes |
|---|---|---|
| Message | message_id, sender_id, recipient_id, content, timestamp | Indexed by recipient_id for fast delivery |
| Chat | chat_id, chat_type, participants, created_at | Individual and group chats |
| User | user_id, username, phone_number, status | User account information |
| Media | media_id, message_id, file_type, file_url, size | Media file information |
| Encryption | key_id, user_id, public_key, created_at | End-to-end encryption keys; only public keys are stored server-side, private keys stay on the user's device |

💡 Interview Tip: Focus on Message, Chat, and Encryption as they drive message delivery, group management, and security.


Core APIs

Message Management

  • POST /messages { recipient_id, content, media_id } – Send a new message
  • GET /messages/{message_id} – Get message details
  • PUT /messages/{message_id}/read – Mark message as read
  • GET /messages?chat_id=&limit= – Get chat messages

Chat Management

  • POST /chats { chat_type, participants[] } – Create a new chat
  • GET /chats/{chat_id} – Get chat details
  • PUT /chats/{chat_id}/participants { participants[] } – Update chat participants
  • GET /chats?user_id=&limit= – Get user's chats

Media Sharing

  • POST /media/upload { file, file_type } – Upload media file
  • GET /media/{media_id} – Get media file
  • GET /media/{media_id}/download – Download media file
  • DELETE /media/{media_id} – Delete media file

Encryption

  • POST /encryption/keys { public_key } – Upload public key
  • GET /encryption/keys/{user_id} – Get user's public key
  • Encryption and decryption run on the client devices. The server exposes no encrypt or decrypt endpoints, because sending plaintext or a private key to the server would break end-to-end encryption.

High-Level Design

System Architecture Diagram (figure not included)

Key Components

  • Message Service: Handle message CRUD operations
  • Chat Service: Manage individual and group chats
  • Real-time Service: Handle WebSocket connections and real-time updates
  • Encryption Service: Manage end-to-end encryption
  • Media Service: Handle media file upload and sharing
  • Database: Persistent storage for messages, chats, and users

Mapping Core Functional Requirements to Components

| Functional Requirement | Responsible Components | Key Considerations |
|---|---|---|
| Message Sending | Message Service, Real-time Service | Message delivery, real-time updates |
| Group Chats | Chat Service, Message Service | Group management, message distribution |
| Message Delivery | Message Service, Real-time Service | Delivery guarantees, read receipts |
| End-to-End Encryption | Encryption Service, Message Service | Key management, message encryption |

Detailed Design

Message Service

Purpose: Handle message creation, storage, and delivery.

Key Design Decisions:

  • Message Storage: Store message content as ciphertext only; with end-to-end encryption the server never holds plaintext
  • Delivery Guarantees: Ensure reliable message delivery
  • Read Receipts: Track message read status
  • Message Ordering: Maintain message order in conversations

Algorithm: Message delivery

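1. Receive message from sender, already encrypted on the sender's device
2. Treat the content as opaque ciphertext; the server neither encrypts nor decrypts it
3. Store encrypted message in database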
4. Determine recipients:
   - For individual chat: single recipient
   - For group chat: all group members
5. For each recipient:
   - Check recipient's online status
   - If online: send via WebSocket
   - If offline: store for later delivery
6. Update message status
7. Send delivery confirmation to sender
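
The routing part of these steps (online recipients get a push, offline recipients get a queue entry) can be sketched as follows. This is a minimal in-memory illustration, not a definitive implementation: `Envelope`, `DeliveryRouter`, and the `online` map of push callbacks are hypothetical names, and the content is assumed to arrive already encrypted by the sender's device.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass
class Envelope:
    message_id: str
    chat_id: str
    sender_id: str
    ciphertext: bytes  # opaque to the server


class DeliveryRouter:
    """Hypothetical router: push to connected users, queue for the rest."""

    def __init__(self):
        self.online = {}                         # user_id -> callable(envelope)
        self.offline_queue = defaultdict(deque)  # user_id -> pending envelopes
        self.status = {}                         # (message_id, user_id) -> state

    def route(self, envelope, recipients):
        """Deliver to every recipient except the sender; return per-user state."""
        result = {}
        for user_id in recipients:
            if user_id == envelope.sender_id:
                continue
            push = self.online.get(user_id)
            if push is not None:
                push(envelope)  # stands in for a WebSocket send
                state = "delivered"
            else:
                self.offline_queue[user_id].append(envelope)
                state = "queued"
            self.status[(envelope.message_id, user_id)] = state
            result[user_id] = state
        return result
```

The returned dictionary is what the service could use to update message status and send the delivery confirmation back to the sender.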

Real-time Service

Purpose: Handle WebSocket connections and broadcast real-time updates.

Key Design Decisions:

  • WebSocket Connections: Maintain persistent connections for real-time updates
  • Message Broadcasting: Broadcast messages to relevant users
  • Connection Management: Handle connection drops and reconnections
  • Update Filtering: Send relevant updates to each user

Algorithm: Real-time message broadcasting

1. User connects to message stream
2. Send recent messages to user
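3. When new message arrives:
   - Look up the chat's recipients (one user for an individual chat, all members for a group chat)
   - For each recipient with an open connection:
     - Send message via WebSocket
     - Update message status
   - For each recipient without a connection:
     - Keep the message queued for delivery on reconnect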
4. Handle connection drops gracefully
5. Reconnect users with missed messages
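
One way to handle steps 4 and 5 is to give every message in a chat a sequence number and remember the highest number each user has received. The sketch below assumes that approach; `ChatStream` is a hypothetical name, and a plain Python list stands in for each WebSocket connection. A real service would advance `last_seen` only after the client acknowledges receipt.

```python
class ChatStream:
    """Hypothetical per-chat log with replay of missed messages on reconnect."""

    def __init__(self):
        self.log = []          # append-only list of (seq, payload)
        self.connections = {}  # user_id -> outbox list (stand-in for a socket)
        self.last_seen = {}    # user_id -> highest seq sent to that user

    def connect(self, user_id, outbox):
        """Register a connection and replay everything the user missed."""
        self.connections[user_id] = outbox
        since = self.last_seen.get(user_id, 0)
        for seq, payload in self.log:
            if seq > since:
                outbox.append((seq, payload))
                self.last_seen[user_id] = seq

    def disconnect(self, user_id):
        """Drop the connection; last_seen is kept for the next reconnect."""
        self.connections.pop(user_id, None)

    def publish(self, payload):
        """Append to the log and push to every currently connected user."""
        seq = len(self.log) + 1
        self.log.append((seq, payload))
        for user_id, outbox in self.connections.items():
            outbox.append((seq, payload))
            self.last_seen[user_id] = seq
        return seq
```

Sequence numbers also give clients a simple way to keep messages ordered within a conversation.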

Encryption Service

Purpose: Manage end-to-end encryption for messages.

Key Design Decisions:

  • Key Management: Generate and manage encryption keys
  • Message Encryption: Encrypt messages on the sender's device before they are sent or stored
  • Key Exchange: Secure key exchange between users
  • Key Rotation: Rotate keys periodically for security

Algorithm: End-to-end encryption

1. Generate encryption keys for user
2. Publish the public key to the server; keep the private key on the user's device only
3. When sending message:
   - Get recipient's public key
   - Encrypt message with recipient's public key
   - Store encrypted message
4. When receiving message:
   - Decrypt message with private key
   - Display decrypted content
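5. Handle key rotation:
   - Generate new keys periodically
   - Update key references
   - Keep retired private keys on the device for as long as old messages must stay readable (the server holds only ciphertext and cannot re-encrypt it)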
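
The steps above describe encrypting directly with the recipient's public key. The sketch below swaps in a related technique, a Diffie-Hellman key agreement, so that it can run with the Python standard library alone: each side combines its own private key with the other side's public key to derive the same symmetric key. Everything here is a toy for illustration. The group parameters `P` and `G`, the SHA-256 counter keystream, and all function names are assumptions, and none of it is secure enough for real use; production systems rely on vetted protocols and audited cryptographic libraries.

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman parameters: far too small for real security.
P = 2**127 - 1
G = 3


def generate_keypair():
    """Return (private, public); the private key never leaves the device."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


def shared_key(my_private, their_public):
    """Both parties derive the same 32-byte key from the agreed secret."""
    secret = pow(their_public, my_private, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).digest()


def _keystream(key, nonce, length):
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt(key, plaintext):
    """Toy encrypt-then-MAC; returns (nonce, ciphertext, tag)."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce, ciphertext, tag


def decrypt(key, nonce, ciphertext, tag):
    """Verify the tag first, then recover the plaintext."""
    expected = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("message authentication failed")
    stream = _keystream(key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))


# Only the public halves would ever be uploaded to the server.
alice_private, alice_public = generate_keypair()
bob_private, bob_public = generate_keypair()
alice_key = shared_key(alice_private, bob_public)
bob_key = shared_key(bob_private, alice_public)
```

The point of the sketch is the data flow: the server only ever sees public keys and the `(nonce, ciphertext, tag)` triple, which matches the Encryption Keys table holding no private key.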

Chat Service

Purpose: Manage individual and group chat functionality.

Key Design Decisions:

  • Chat Creation: Create individual and group chats
  • Participant Management: Add/remove participants from group chats
  • Chat Metadata: Track chat information and settings
  • Chat History: Maintain chat message history

Algorithm: Group chat management

1. Create group chat:
   - Generate unique chat ID
   - Add initial participants
   - Set chat permissions
2. Add participant:
   - Validate user exists
   - Add to participants list
   - Send notification to existing members
3. Remove participant:
   - Remove from participants list
   - Update chat permissions
   - Send notification to remaining members
4. Update chat settings:
   - Modify chat name/description
   - Update permissions
   - Notify all participants
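
The add and remove flows above can be sketched with a small in-memory model. `GroupChat`, its `notifications` list, and the `known_users` argument are hypothetical names used only for this illustration; a real service would persist participants and push notifications through the Real-time Service.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class GroupChat:
    name: str
    chat_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    participants: dict = field(default_factory=dict)   # user_id -> role
    notifications: list = field(default_factory=list)  # (recipient_id, event)

    def _notify(self, recipients, event):
        for user_id in recipients:
            self.notifications.append((user_id, event))

    def add_participant(self, user_id, known_users, role="member"):
        """Validate the user, add them, and notify the existing members."""
        if user_id not in known_users:
            raise ValueError(f"unknown user: {user_id}")
        if user_id in self.participants:
            return False  # already a member; nothing to do
        existing = list(self.participants)
        self.participants[user_id] = role
        self._notify(existing, f"{user_id} joined")
        return True

    def remove_participant(self, user_id):
        """Remove the user and notify the remaining members."""
        if self.participants.pop(user_id, None) is None:
            return False
        self._notify(list(self.participants), f"{user_id} left")
        return True
```

Returning `False` for a duplicate add keeps the operation idempotent, which is convenient when clients retry requests.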

Database Design

Messages Table

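| Field | Type | Description |
|---|---|---|
| message_id | VARCHAR(36) | Primary key |
| chat_id | VARCHAR(36) | Associated chat |
| sender_id | VARCHAR(36) | Message sender |
| content | TEXT | Message content (left empty for end-to-end encrypted messages, since the server stores only ciphertext) |
| media_id | VARCHAR(36) | Associated media |
| encrypted_content | TEXT | Encrypted message |
| timestamp | TIMESTAMP | Message timestamp |
| status | VARCHAR(50) | Message status |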

Indexes:

  • idx_chat_timestamp on (chat_id, timestamp) - Chat messages
  • idx_sender_id on (sender_id) - User messages
  • idx_status on (status) - Message status queries
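
As a quick illustration of how the (chat_id, timestamp) index serves the "get chat messages" query, the sketch below builds the table in an in-memory SQLite database. SQLite and the helper name `recent_messages` are assumptions made for the example; the design does not prescribe a particular database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE messages (
    message_id        VARCHAR(36) PRIMARY KEY,
    chat_id           VARCHAR(36) NOT NULL,
    sender_id         VARCHAR(36) NOT NULL,
    content           TEXT,
    media_id          VARCHAR(36),
    encrypted_content TEXT,
    timestamp         TIMESTAMP NOT NULL,
    status            VARCHAR(50) NOT NULL
);
CREATE INDEX idx_chat_timestamp ON messages (chat_id, timestamp);
CREATE INDEX idx_sender_id ON messages (sender_id);
CREATE INDEX idx_status ON messages (status);
""")

# Sample rows; ciphertext values are placeholders.
rows = [
    ("m1", "c1", "alice", "2024-01-01T10:00:00", "delivered"),
    ("m2", "c1", "bob",   "2024-01-01T10:00:05", "delivered"),
    ("m3", "c1", "alice", "2024-01-01T10:00:09", "sent"),
    ("m4", "c2", "carol", "2024-01-01T10:00:07", "sent"),
]
conn.executemany(
    "INSERT INTO messages (message_id, chat_id, sender_id, encrypted_content,"
    " timestamp, status) VALUES (?, ?, ?, 'ciphertext', ?, ?)",
    rows,
)


def recent_messages(conn, chat_id, limit):
    """Newest-first page of message ids for one chat."""
    cursor = conn.execute(
        "SELECT message_id FROM messages WHERE chat_id = ?"
        " ORDER BY timestamp DESC LIMIT ?",
        (chat_id, limit),
    )
    return [row[0] for row in cursor.fetchall()]
```

Because the index starts with chat_id and then orders by timestamp, the query can read one contiguous index range instead of scanning the table.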

Chats Table

| Field | Type | Description |
|---|---|---|
| chat_id | VARCHAR(36) | Primary key |
| chat_type | VARCHAR(50) | Chat type (individual/group) |
| name | VARCHAR(255) | Chat name |
| description | TEXT | Chat description |
| created_at | TIMESTAMP | Chat creation |
| last_message_at | TIMESTAMP | Last message time |

Indexes:

  • idx_chat_type on (chat_type) - Chat type queries
  • idx_last_message_at on (last_message_at) - Recent chats

Chat Participants Table

| Field | Type | Description |
|---|---|---|
| participant_id | VARCHAR(36) | Primary key |
| chat_id | VARCHAR(36) | Associated chat |
| user_id | VARCHAR(36) | Participant user |
| role | VARCHAR(50) | Participant role |
| joined_at | TIMESTAMP | Join timestamp |

Indexes:

  • idx_chat_id on (chat_id) - Chat participants
  • idx_user_id on (user_id) - User chats
  • unique_chat_user on (chat_id, user_id) - Prevent duplicate participants
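
The unique (chat_id, user_id) constraint lets the database itself reject duplicate memberships, even when two requests race. A minimal sketch, again assuming SQLite for illustration and using a hypothetical helper `add_participant`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE chat_participants (
    participant_id VARCHAR(36) PRIMARY KEY,
    chat_id        VARCHAR(36) NOT NULL,
    user_id        VARCHAR(36) NOT NULL,
    role           VARCHAR(50) NOT NULL,
    joined_at      TIMESTAMP   NOT NULL,
    CONSTRAINT unique_chat_user UNIQUE (chat_id, user_id)
)
""")


def add_participant(conn, participant_id, chat_id, user_id, role="member"):
    """Insert a membership row; return False if the user is already in the chat."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO chat_participants"
                " (participant_id, chat_id, user_id, role, joined_at)"
                " VALUES (?, ?, ?, ?, CURRENT_TIMESTAMP)",
                (participant_id, chat_id, user_id, role),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Relying on the constraint rather than a "check then insert" sequence avoids a race between concurrent add requests.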

Encryption Keys Table

| Field | Type | Description |
|---|---|---|
| key_id | VARCHAR(36) | Primary key |
| user_id | VARCHAR(36) | Key owner |
| public_key | TEXT | Public encryption key |
| key_type | VARCHAR(50) | Key type |
| created_at | TIMESTAMP | Key creation |
| expires_at | TIMESTAMP | Key expiration |

Indexes:

  • idx_user_id on (user_id) - User keys
  • idx_expires_at on (expires_at) - Key expiration queries

Scalability Considerations

Horizontal Scaling

  • Message Service: Scale horizontally with load balancers
  • Real-time Service: Scale WebSocket connections with load balancers
  • Chat Service: Use consistent hashing for chat partitioning
  • Database: Shard messages by chat_id so that a conversation's history stays on one shard; shard per-user data (chat lists, keys) by user_id
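
Consistent hashing, mentioned above for chat partitioning, maps each chat to a server in a way that moves only a small share of chats when a server is added or removed. The sketch below is one common formulation using virtual nodes; the class name `ConsistentHashRing`, the default of 100 virtual nodes, and the server names are assumptions for illustration.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Hash ring with virtual nodes; keys map to the next node clockwise."""

    def __init__(self, nodes, vnodes=100):
        self._ring = []  # sorted list of (hash, node)
        for node in nodes:
            self.add_node(node, vnodes)

    @staticmethod
    def _hash(key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node, vnodes=100):
        for i in range(vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def node_for(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # (h,) sorts before any (h, node), so this finds the first hash >= h.
        index = bisect.bisect_left(self._ring, (self._hash(key),))
        return self._ring[index % len(self._ring)][1]


ring = ConsistentHashRing(["chat-server-1", "chat-server-2", "chat-server-3"])
owner = ring.node_for("chat-42")  # the same chat_id always maps to the same server
```

When a server is removed, only the chats it owned are reassigned; chats on the other servers keep their placement.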

Caching Strategy

  • Redis: Cache recent messages and chat metadata
  • Application Cache: Cache frequently accessed data
  • Database Cache: Cache message and chat data
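
A cache of recent messages can be sketched as a bounded window per chat plus least-recently-used eviction across chats. The class below is an in-process stand-in used only to show the idea; it does not use any Redis API, and `RecentMessageCache` and its size limits are hypothetical.

```python
from collections import OrderedDict, deque


class RecentMessageCache:
    """Keeps the last `per_chat` messages for the `max_chats` most active chats."""

    def __init__(self, max_chats=1000, per_chat=50):
        self.max_chats = max_chats
        self.per_chat = per_chat
        self._chats = OrderedDict()  # chat_id -> deque of recent messages

    def append(self, chat_id, message):
        window = self._chats.get(chat_id)
        if window is None:
            window = deque(maxlen=self.per_chat)  # oldest entries fall off
            self._chats[chat_id] = window
        window.append(message)
        self._chats.move_to_end(chat_id)      # mark as most recently used
        while len(self._chats) > self.max_chats:
            self._chats.popitem(last=False)   # evict least recently used chat

    def get(self, chat_id):
        """Return cached messages oldest-first, or None on a cache miss."""
        window = self._chats.get(chat_id)
        if window is None:
            return None
        self._chats.move_to_end(chat_id)
        return list(window)
```

On a miss the caller would fall back to the database and may repopulate the cache from the result.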

Performance Optimization

  • Connection Pooling: Efficient database connections
  • Batch Processing: Batch message operations for efficiency
  • Async Processing: Non-blocking message processing
  • Resource Monitoring: Monitor CPU, memory, and network usage

Monitoring and Observability

Key Metrics

  • Message Latency: Average and tail (p95/p99) message delivery time, tracked against the sub-second target
  • WebSocket Connections: Number of active connections
  • Encryption Performance: Time to encrypt/decrypt messages
  • System Health: CPU, memory, and disk usage

Alerting

  • High Latency: Alert when message delivery time exceeds threshold
  • Connection Drops: Alert when WebSocket connections drop frequently
  • Encryption Errors: Alert when encryption operations fail
  • System Errors: Alert on message processing failures

Trade-offs and Considerations

Consistency vs. Availability

  • Choice: Strong consistency for message delivery
  • Reasoning: Message delivery requires immediate accuracy

Latency vs. Security

  • Choice: Use efficient encryption algorithms
  • Reasoning: Balance between message security and delivery speed

Storage vs. Performance

  • Choice: Use efficient storage for encrypted messages
  • Reasoning: Balance between storage costs and query performance

Common Interview Questions

Q: How would you handle message delivery failures?

A: Retry unacknowledged sends with backoff, store messages for offline recipients until they reconnect, and confirm delivery to the sender.
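
A retry mechanism of this kind is often written as a loop with exponential backoff. The sketch below assumes a hypothetical `send` callable that returns True once the message is acknowledged and raises ConnectionError on a network failure; the attempt count and delays are illustrative defaults.

```python
import time


def send_with_retry(send, message, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Try to send until acknowledged; return False if every attempt fails."""
    for attempt in range(max_attempts):
        try:
            if send(message):
                return True  # acknowledged
        except ConnectionError:
            pass  # treat like a missing acknowledgement and retry
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    return False
```

When the function returns False, the caller can fall back to storing the message for later delivery. Retries can produce duplicates, so receivers would typically deduplicate by message_id.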

Q: How do you ensure message security?

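A: Encrypt messages end to end so that only the participants' devices can read them, keep private keys on the device while the server distributes public keys, and authenticate each message so that tampering is detected.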

Q: How would you scale this system globally?

A: Deploy regional messaging servers, use geo-distributed databases, and implement data replication strategies.

Q: How do you handle group chat management?

A: Track membership and roles in a participants table, enforce permissions on adding or removing members and changing settings, and notify members of changes in real time.


Key Takeaways

  1. Message Delivery: Reliable delivery mechanisms and read receipts are essential for messaging platforms
  2. End-to-End Encryption: Secure key management and message encryption ensure user privacy
  3. Group Management: Participant management and real-time updates enable group chat functionality
  4. Scalability: Horizontal scaling and partitioning are crucial for handling large-scale messaging
  5. Monitoring: Comprehensive monitoring ensures system reliability and performance