Design Instagram

What is Instagram?

Instagram is a photo and video sharing social media platform that lets users upload media, follow other users, interact with posts through likes and comments, and discover content through personalized feeds and explore pages. It is similar to TikTok, Snapchat, and Pinterest in terms of visual content focus. Other social media platforms, such as Facebook, Twitter, and LinkedIn, follow similar patterns for social interactions and content distribution.

Real-time media processing, personalized feed generation, and content discovery at massive scale are what make systems like Instagram unique. By understanding Instagram, you can tackle interview questions for similar social media platforms, since the core design challenges—media processing pipelines, social graph management, feed algorithms, content discovery, and global content delivery—remain the same.

Functional Requirements

Media Upload & Processing: Handle photo/video uploads with real-time processing and multiple resolution generation.
Social Feed Generation: Create personalized feeds combining followed users' content with algorithmic recommendations.
Content Discovery: Enable search and exploration through hashtags, locations, and trending content.
Social Interactions: Support likes, comments, follows, and real-time notifications.

Out of Scope

Instagram Shopping and e-commerce features
Instagram Reels and advanced video editing
Instagram Live streaming capabilities
Advanced analytics for business accounts
Third-party API access and integrations
AR filters and advanced camera features

Non-Functional Requirements

Low latency media processing: Users should see processed content quickly, even under high load.
High availability: The system should remain accessible during peak traffic.
Consistency: Social interactions (likes, comments, follows) can be eventually consistent; user account data must be strongly consistent.

Out of Scope (Non-Functional)

Business continuity and disaster recovery (BCDR)
GDPR and other data privacy regulations

💡 Interview Tip: Focus on media processing pipelines, feed generation algorithms, and social graph management. Interviewers care most about scalability, real-time interactions, and content discovery.

Core Entities

Entity	Key Attributes	Notes
User	user_id, username, bio, follower_count, following_count, profile_picture	Indexed by username for fast search
Post	post_id, user_id, caption, hashtags, location, media_ids, created_at	Status: published, archived, deleted
Media	media_id, post_id, file_path, width, height, processing_status, variants	Multiple resolutions stored in S3
Comment	comment_id, post_id, user_id, text, parent_comment_id, created_at	Supports nested replies
Story	story_id, user_id, media_id, created_at, expires_at	Auto-expires after 24 hours
Hashtag	hashtag_id, name, post_count, trending_score	Links to posts for discovery
Follow	follower_id, following_id, created_at, notification_enabled	Social graph relationships

💡 Interview Tip: Focus on Posts, Media, and Follow as they drive media processing, social interactions, and feed generation.

Core APIs

Media Upload & Processing

POST /api/v1/media/upload – Upload a photo or video; returns a media_id and its initial processing status
GET /api/v1/media/{media_id}/status – Check processing status and get URLs
POST /api/v1/posts – Create post with media and metadata

Social Feed

GET /api/v1/feed/home?limit=&max_id= – Get personalized home feed
GET /api/v1/feed/explore?category=&limit= – Get explore page content
GET /api/v1/users/{user_id}/posts?limit=&max_id= – Get user's posts
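
The limit and max_id parameters on the feed endpoints suggest cursor-based pagination. The following is a minimal sketch of that idea, not the actual Instagram API behavior. It assumes post IDs are integers that sort by creation time; the UUID keys in the schema later in this document would need to be time-ordered, or the cursor would need to carry created_at instead. The names Post and page_of_posts are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Post:
    post_id: int    # assumed sortable by creation time
    author_id: int


def page_of_posts(
    posts: List[Post], limit: int, max_id: Optional[int] = None
) -> Tuple[List[Post], Optional[int]]:
    """Return up to `limit` posts with post_id < max_id, newest first,
    plus the cursor to send as max_id on the next request (None when done)."""
    newest_first = sorted(posts, key=lambda p: p.post_id, reverse=True)
    if max_id is not None:
        # Only posts strictly older than the cursor belong to the next page.
        newest_first = [p for p in newest_first if p.post_id < max_id]
    page = newest_first[:limit]
    has_more = len(newest_first) > limit
    next_max_id = page[-1].post_id if page and has_more else None
    return page, next_max_id
```

Because the cursor is a post ID rather than an offset, new posts arriving at the top of the feed do not shift or duplicate items on later pages.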

Social Interactions

POST /api/v1/posts/{post_id}/like – Like/unlike a post
POST /api/v1/posts/{post_id}/comments – Add comment to post
POST /api/v1/users/{user_id}/follow – Follow/unfollow user
GET /api/v1/posts/{post_id}/comments?limit=&max_id= – Get post comments

Content Discovery

GET /api/v1/search?q=&type=&limit= – Search users, posts, hashtags
GET /api/v1/hashtags/{hashtag}/posts?limit=&max_id= – Get posts by hashtag
GET /api/v1/trending/hashtags – Get trending hashtags

High-Level Design

Key Components

Client / Frontend: Web or mobile app for browsing feeds, uploading media, and social interactions
API Gateway: Routes requests and handles throttling and load balancing
Media Service: Handles photo/video uploads, processing, storage, and delivery
Feed Service: Generates personalized home feeds and explore pages using algorithmic ranking
Social Service: Manages follow relationships, likes, comments, and user interactions
Search Service: Provides content discovery through full-text search, hashtags, and recommendations
Notification Service: Delivers push, in-app, and email notifications triggered by social interactions
Cache / In-Memory Store: Speeds up feed generation, user sessions, and media metadata
Database / Persistent Storage: Stores users, posts, media metadata, and social graph
CDN: Global content delivery for media files and static assets

Mapping Core Functional Requirements to Components

Functional Requirement	Responsible Components	Key Considerations
Media Upload & Processing	Media Service, CDN, Database	Handle large files, multiple resolutions, processing queues
Social Feed Generation	Feed Service, Social Service, Cache	Personalized ranking, real-time updates, scalability
Content Discovery	Search Service, Cache	Fast search, trending algorithms, recommendations
Social Interactions	Social Service, Notification Service	Real-time updates, consistency, high throughput

💡 Interview Tip: Focus on Media Service, Feed Service, and Social Service; other components can be simplified.

Instagram Architecture

System Architecture Diagram

Data Flow & Component Interaction

System Architecture Diagram

This diagram illustrates the data flow and component interaction when a user uploads media, creates posts, loads feeds, and interacts with content in an Instagram-like system. It highlights the key components that ensure efficient media processing, personalized feed generation, and real-time social interactions.

Media Upload & Processing

The user initiates a media upload from the frontend.
The API Gateway routes the request to the Media Service, which handles file processing and storage.
Media metadata is stored in the database, while the actual files are stored in S3 and served through the CDN.
Multiple resolutions are generated asynchronously for optimal delivery.

Post Creation & Feed Update

The user creates a post with media and metadata.
The post data is stored in the database.
The Feed Service is triggered to update personalized feeds for followers.
Feed caches are updated to ensure fast retrieval for subsequent requests.

Feed Loading & Personalization

The user requests their home feed.
The Feed Service checks Redis cache for pre-computed feeds.
On cache miss, the system queries the database for posts from followed users.
Posts are ranked using algorithmic signals and cached for future requests.
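
The feed-loading steps above follow a cache-aside pattern. Here is a minimal sketch under stated assumptions: FeedCache is an in-memory stand-in for Redis, and fetch_followed_posts and rank are hypothetical callables standing in for the database query and the ranking step. A real deployment would also set a TTL on cached feeds, which this sketch omits.

```python
from typing import Callable, Dict, List, Optional


class FeedCache:
    """In-memory stand-in for Redis (assumption for illustration)."""

    def __init__(self) -> None:
        self._store: Dict[str, List[str]] = {}

    def get(self, key: str) -> Optional[List[str]]:
        return self._store.get(key)

    def set(self, key: str, value: List[str]) -> None:
        self._store[key] = list(value)


def load_home_feed(
    user_id: str,
    cache: FeedCache,
    fetch_followed_posts: Callable[[str], List[str]],
    rank: Callable[[List[str]], List[str]],
    limit: int = 20,
) -> List[str]:
    key = f"feed:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        # Cache hit: serve the pre-computed feed without touching the database.
        return cached[:limit]
    # Cache miss: query posts from followed users, rank them, then cache.
    ranked = rank(fetch_followed_posts(user_id))
    cache.set(key, ranked)
    return ranked[:limit]
```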

Social Interactions

The user likes a post.
The Social Service records the interaction and updates counters.
Cache is updated to reflect the new engagement metrics.
Real-time updates are sent to relevant users.
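
One way to keep like counters correct when a client retries is to make the like operation idempotent. This is a sketch, assuming an in-memory set in place of a table keyed by (post_id, user_id); the class name LikeStore is hypothetical.

```python
from typing import Dict, Set, Tuple


class LikeStore:
    """Tracks likes as (post_id, user_id) pairs so repeats are no-ops."""

    def __init__(self) -> None:
        self._likes: Set[Tuple[str, str]] = set()
        self._counts: Dict[str, int] = {}

    def like(self, post_id: str, user_id: str) -> bool:
        key = (post_id, user_id)
        if key in self._likes:
            return False  # already liked; the counter is not bumped again
        self._likes.add(key)
        self._counts[post_id] = self._counts.get(post_id, 0) + 1
        return True

    def unlike(self, post_id: str, user_id: str) -> bool:
        key = (post_id, user_id)
        if key not in self._likes:
            return False  # nothing to undo
        self._likes.remove(key)
        self._counts[post_id] -= 1
        return True

    def like_count(self, post_id: str) -> int:
        return self._counts.get(post_id, 0)
```

The counter only changes when the set membership changes, so a duplicated request cannot inflate engagement metrics.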

Key Design Highlights

Asynchronous Processing: Media processing happens in background for better user experience.
Intelligent Caching: Feed caches reduce database load and improve response times.
Personalized Ranking: Algorithmic feed generation balances relevance with discovery.
Real-time Updates: Social interactions are processed quickly with eventual consistency.

This flow supports efficient media processing, personalized content delivery, and responsive social interactions, which suits Instagram-like platforms where users expect fast, engaging experiences.

Database Design

Use Case	SQL Option	NoSQL Option	Recommendation	Reasoning
User Profiles	PostgreSQL	DynamoDB	PostgreSQL	Complex relationships, ACID compliance, social graph queries
Posts & Media	PostgreSQL	MongoDB	PostgreSQL	Complex queries, analytics, JSON support for metadata
Media Storage	-	S3	S3	Object storage, global CDN, multiple resolution support
Activity Feeds	PostgreSQL	Cassandra	Cassandra	Time-series data, high write volume, linear scalability
Social Graph	PostgreSQL	Neo4j	Neo4j	Graph relationships, recommendation algorithms, complex traversals
Search Index	PostgreSQL	Elasticsearch	Elasticsearch	Full-text search, content discovery, faceted search
Real-time Cache	-	Redis	Redis	Sub-millisecond performance, session storage, feed caching
Analytics	ClickHouse	BigQuery	ClickHouse	OLAP workload, real-time analytics, cost optimization

User Database Schema

Table: users
├── user_id (UUID, PRIMARY KEY)
├── username (VARCHAR, UNIQUE)
├── email (VARCHAR, UNIQUE)
├── bio (TEXT)
├── follower_count (INTEGER)
├── following_count (INTEGER)
├── is_verified (BOOLEAN)
└── created_at (TIMESTAMP)

Indexes:
- PRIMARY KEY (user_id)
- UNIQUE INDEX (username)
- INDEX (is_verified, follower_count)

Table: user_follows
├── follower_id (UUID, FOREIGN KEY)
├── following_id (UUID, FOREIGN KEY)
└── created_at (TIMESTAMP)

Indexes:
- PRIMARY KEY (follower_id, following_id)
- INDEX (following_id, created_at)

Post Database Schema

Table: posts
├── post_id (UUID, PRIMARY KEY)
├── user_id (UUID, FOREIGN KEY)
├── caption (TEXT)
├── hashtags (TEXT[])
├── location_name (VARCHAR)
├── created_at (TIMESTAMP)
├── like_count (INTEGER)
├── comment_count (INTEGER)
└── visibility (ENUM)

Indexes:
- PRIMARY KEY (post_id)
- INDEX (user_id, created_at DESC)
- INDEX (hashtags) USING GIN
- INDEX (visibility, created_at DESC)

Table: post_media
├── media_id (UUID, PRIMARY KEY)
├── post_id (UUID, FOREIGN KEY)
├── file_path (VARCHAR)
├── width (INTEGER)
├── height (INTEGER)
└── processing_status (ENUM)

Indexes:
- PRIMARY KEY (media_id)
- INDEX (post_id)

Social Interaction Schema

Table: post_likes
├── post_id (UUID, FOREIGN KEY)
├── user_id (UUID, FOREIGN KEY)
└── created_at (TIMESTAMP)

Indexes:
- PRIMARY KEY (post_id, user_id)
- INDEX (user_id, created_at DESC)

Table: post_comments
├── comment_id (UUID, PRIMARY KEY)
├── post_id (UUID, FOREIGN KEY)
├── user_id (UUID, FOREIGN KEY)
├── parent_comment_id (UUID, FOREIGN KEY, NULL for top-level comments)
├── text (TEXT)
└── created_at (TIMESTAMP)

Indexes:
- PRIMARY KEY (comment_id)
- INDEX (post_id, created_at)

Activity Feed Schema (Cassandra)

Table: user_feed
├── user_id (UUID, PARTITION KEY)
├── post_timestamp (TIMESTAMP, CLUSTERING KEY)
├── post_id (UUID, CLUSTERING KEY)
├── author_id (UUID)
├── caption (TEXT)
└── feed_rank_score (DOUBLE)

Clustering Order: CLUSTERING ORDER BY (post_timestamp DESC, post_id ASC)
TTL: 30 days for feed cleanup

Deep Dive on Components

Image and Video Processing Pipeline

Options Considered:

Synchronous Processing: Process media during upload request
- Pros: Immediate feedback, simple architecture
- Cons: High latency for uploads, poor user experience for large files
- Best for: Small images with minimal processing requirements
Asynchronous Processing: Upload first, process in background
- Pros: Fast upload response, better user experience
- Cons: Delayed media availability, complex status tracking
- Best for: Large files requiring extensive processing
Progressive Processing (Recommended): Quick preview + background optimization
- Pros: Fast initial response with progressive quality improvement
- Cons: Complex pipeline, multiple file versions
- Why chosen: Optimal user experience with comprehensive processing

How It Works:

The system implements a multi-stage media processing pipeline:

Upload Stage: User uploads photo/video directly to S3 storage
Quick Thumbnail: Generate small preview immediately for UI
Background Processing: Create multiple resolutions (thumbnail, small, medium, large)
Quality Optimization: Compress files for faster loading
CDN Distribution: Distribute processed media to global edge locations

Key Design Decisions:

Immediate Response: Users see thumbnail instantly while full processing happens in background
Multiple Resolutions: Serve appropriate size based on device and connection
Queue-based Processing: Handle high upload volumes without blocking users
Progressive Enhancement: Start with low quality, upgrade as processing completes
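
The background stage that creates multiple resolutions needs target dimensions for each variant. The sketch below plans those dimensions while preserving aspect ratio and never upscaling. The pixel widths in VARIANT_WIDTHS are illustrative assumptions, not values from the source, and plan_variants is a hypothetical helper; actual resizing would be done by an image library in the worker.

```python
from typing import Dict, Tuple

# Hypothetical target widths in pixels for the four variants named above.
VARIANT_WIDTHS: Dict[str, int] = {
    "thumbnail": 150,
    "small": 320,
    "medium": 640,
    "large": 1080,
}


def plan_variants(width: int, height: int) -> Dict[str, Tuple[int, int]]:
    """Return (width, height) per variant, keeping the source aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("source dimensions must be positive")
    plan: Dict[str, Tuple[int, int]] = {}
    for name, target_width in VARIANT_WIDTHS.items():
        # Never upscale: a small source is served at its original size.
        out_width = min(target_width, width)
        out_height = max(1, round(height * out_width / width))
        plan[name] = (out_width, out_height)
    return plan
```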

Feed Generation and Ranking Algorithm

Options Considered:

Chronological Feed: Show posts in reverse chronological order
- Pros: Simple implementation, predictable user experience
- Cons: Poor engagement, important content gets buried
- Best for: Real-time news feeds or small user bases
Interest-based Ranking: Rank posts by predicted user interest
- Pros: Higher engagement, personalized experience
- Cons: Echo chamber effect, complex algorithm tuning
- Best for: Content discovery and user engagement optimization
Hybrid Approach (Recommended): Combine chronological and interest signals
- Pros: Balances freshness with relevance, configurable by user
- Cons: Complex implementation, requires extensive experimentation
- Why chosen: Provides optimal user experience across different usage patterns

How It Works:

The ranking algorithm combines five groups of signals:

Content Signals: Post type, quality score, engagement velocity
User Relationship: Interaction history, follow recency, mutual connections
Temporal Signals: Post recency, user activity patterns, time zone
Personalization: Individual user preferences, demographic factors
Diversity: Content type mix, author diversity, topic variety

Key Design Decisions:

Multi-factor Scoring: Combine multiple signals for balanced ranking
Real-time Updates: Adjust rankings based on fresh engagement data
Diversity Filters: Prevent echo chambers by mixing content types
User Control: Allow users to switch between chronological and algorithmic feeds
A/B Testing: Continuously optimize algorithm parameters
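
To make multi-factor scoring and the chronological switch concrete, here is a minimal sketch. The weights, the six-hour half-life, and the Candidate fields are all assumptions chosen for illustration; a production ranker would typically learn such parameters and tune them through experiments.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical weights and decay half-life; not values from the source.
W_ENGAGEMENT = 0.4
W_AFFINITY = 0.4
W_RECENCY = 0.2
HALF_LIFE_HOURS = 6.0


@dataclass
class Candidate:
    post_id: str
    age_hours: float
    engagement: float  # content signal, normalized to [0, 1]
    affinity: float    # viewer-author relationship signal, normalized to [0, 1]


def recency(age_hours: float) -> float:
    """Temporal signal: halves every HALF_LIFE_HOURS."""
    return 0.5 ** (age_hours / HALF_LIFE_HOURS)


def score(c: Candidate) -> float:
    return (
        W_ENGAGEMENT * c.engagement
        + W_AFFINITY * c.affinity
        + W_RECENCY * recency(c.age_hours)
    )


def rank(candidates: List[Candidate], chronological: bool = False) -> List[Candidate]:
    if chronological:
        # User control: newest first, ignoring interest signals.
        return sorted(candidates, key=lambda c: c.age_hours)
    return sorted(candidates, key=score, reverse=True)
```

With these weights, an older post with strong engagement and affinity can outrank a brand-new post with weak signals, while the chronological flag restores strict recency ordering.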

Content Discovery and Search

Options Considered:

Basic Text Search: Simple keyword matching on captions and hashtags
- Pros: Fast implementation, low computational overhead
- Cons: Poor relevance, limited discovery capabilities
- Best for: Simple hashtag-based content organization
Advanced Search with ML: Use computer vision and NLP for content understanding
- Pros: Rich content discovery, semantic search capabilities
- Cons: High computational cost, complex infrastructure requirements
- Best for: Advanced content platforms with large user bases
Hybrid Search System (Recommended): Combine text, visual, and behavioral signals
- Pros: Comprehensive discovery, balanced cost/performance
- Cons: Moderate complexity, requires multiple data sources
- Why chosen: Optimal balance for social media platform requirements

How It Works:

The system implements a multi-modal search and discovery system:

Text Search: Elasticsearch with custom analyzers for hashtags, captions, and user mentions
Visual Search: Computer vision models for object detection, scene classification
Behavioral Search: User interaction patterns, trending content detection
Personalized Discovery: Machine learning models for content recommendation
Real-time Indexing: Stream processing for immediate content availability

Key Design Decisions:

Multi-modal Approach: Combine text, visual, and behavioral signals for comprehensive search
Real-time Trending: Use sliding windows and exponential decay for trending calculations
Personalized Results: Rank search results based on user preferences and history
Geographic Relevance: Show location-based content when relevant
Spam Detection: Filter out low-quality or spam content from search results
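
The exponential-decay part of the trending calculation can be sketched as follows. This is an illustration under assumptions: scores live in a plain dictionary rather than a shared store, timestamps are seconds supplied by the caller in non-decreasing order, and the one-hour default half-life is arbitrary. TrendingScores is a hypothetical name.

```python
import math
from typing import Dict, List, Tuple


class TrendingScores:
    """Per-hashtag score where each use adds weight and old uses fade."""

    def __init__(self, half_life_seconds: float = 3600.0) -> None:
        self._decay = math.log(2) / half_life_seconds
        # hashtag -> (score at last update, timestamp of last update)
        self._state: Dict[str, Tuple[float, float]] = {}

    def _decayed(self, tag: str, now: float) -> float:
        score, last = self._state.get(tag, (0.0, now))
        return score * math.exp(-self._decay * (now - last))

    def record(self, tag: str, now: float, weight: float = 1.0) -> None:
        # Decay the stored score to `now`, then add the new use.
        self._state[tag] = (self._decayed(tag, now) + weight, now)

    def top(self, k: int, now: float) -> List[Tuple[str, float]]:
        scored = [(tag, self._decayed(tag, now)) for tag in self._state]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:k]
```

Storing only the last score and timestamp per hashtag keeps memory constant per tag, since decay is applied lazily whenever the tag is read or updated.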

Real-time Notification System

Options Considered:

Database Polling: Periodically check for new notifications
- Pros: Simple implementation, reliable delivery
- Cons: High latency, unnecessary database load
- Best for: Low-frequency notifications or simple systems
Push-based System: Real-time event-driven notifications
- Pros: Low latency, efficient resource usage
- Cons: Complex implementation, potential message loss
- Best for: High-frequency, real-time social interactions
Hybrid System (Recommended): Push with polling fallback
- Pros: Real-time performance with reliability guarantees
- Cons: Complex architecture, multiple delivery paths
- Why chosen: Optimal for social media requiring real-time engagement

How It Works:

The system implements a real-time notification pipeline:

Event Generation: Capture user interactions (likes, comments, follows) as events
Event Processing: Filter, aggregate, and route notifications
Delivery Channels: Push notifications, in-app notifications, email
Preference Management: User notification preferences and delivery settings
Analytics: Track delivery rates and user engagement with notifications

Key Design Decisions:

Event-driven Architecture: Generate notifications from user interaction events
Aggregation Rules: Combine similar notifications to reduce spam (e.g., "5 people liked your post")
Multi-channel Delivery: Support push, in-app, and email notifications
User Preferences: Allow granular control over notification types and frequency
Delivery Guarantees: Ensure critical notifications are delivered reliably
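
The aggregation rule above ("5 people liked your post") can be sketched as a grouping step over a batch of events. The event tuple layout and message wording are assumptions for illustration; aggregate is a hypothetical helper.

```python
from typing import Dict, Iterable, List, Tuple

# (actor, verb, post_id), e.g. ("ann", "liked", "p1"); layout is an assumption.
Event = Tuple[str, str, str]


def aggregate(events: Iterable[Event]) -> List[str]:
    """Collapse events with the same verb and post into one message."""
    groups: Dict[Tuple[str, str], List[str]] = {}
    for actor, verb, post_id in events:
        actors = groups.setdefault((verb, post_id), [])
        if actor not in actors:  # count each actor once per group
            actors.append(actor)
    messages: List[str] = []
    for (verb, post_id), actors in groups.items():
        if len(actors) == 1:
            messages.append(f"{actors[0]} {verb} your post {post_id}")
        else:
            messages.append(f"{len(actors)} people {verb} your post {post_id}")
    return messages
```

In practice the batch would be bounded by a time window so that notifications are delayed only briefly before being grouped.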

Monitoring and Operations

Observability Architecture

Metrics Collection:

Business Metrics: Posts created/day, active users, engagement rate, media upload success rate
Infrastructure Metrics: API response times, database connection pools, cache hit ratios, CDN performance
Performance Metrics: Feed generation latency, media processing time, search query latency
User Experience: App crash rate, upload success rate, feed refresh time, notification delivery rate

Monitoring Stack:

Metrics: Prometheus + Grafana for dashboards and alerting
Logging: ELK stack for centralized logging and analysis
Tracing: Jaeger for distributed request flow analysis
Alerting: PagerDuty with severity-based escalation

Key Dashboards:

User Experience Dashboard:
- Feed generation latency p50, p95, p99
- Media upload success rate and processing time
- App crash rate and error rates
- User engagement metrics
Infrastructure Health Dashboard:
- API response times by endpoint
- Database connection pool utilization
- Cache hit ratios by service
- CDN performance metrics
Content Performance Dashboard:
- Posts created per hour
- Media processing queue depth
- Search query performance
- Trending content metrics

Operational Runbooks

Media Processing Pipeline Recovery:

Queue Backlog: Scale processing workers when queue depth > 1000 items
Processing Failures: Retry failed jobs with exponential backoff
Storage Issues: Monitor S3 upload success rate and CDN distribution
Performance: Optimize processing algorithms based on media type
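
Retrying failed jobs with exponential backoff, as the runbook suggests, might look like the sketch below. It adds random jitter, which is a common refinement not stated in the source. The function name run_with_retries, the defaults, and the injectable sleep and rng parameters are assumptions made so the behavior can be tested without waiting.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    job: Callable[[], T],
    max_attempts: int = 5,
    base: float = 1.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Run `job`, retrying on any exception with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return job()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay ceiling doubles each attempt: base, 2*base, 4*base, ...
            ceiling = min(cap, base * 2 ** attempt)
            # Jitter: sleep a random fraction of the ceiling to spread retries.
            sleep(ceiling * rng())
    raise ValueError("max_attempts must be at least 1")
```

A real worker would usually retry only on errors known to be transient and send jobs that exhaust their attempts to a dead-letter queue for inspection.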

Feed Generation Optimization:

Cache Warming: Pre-generate feeds for active users
Load Balancing: Distribute feed generation across multiple workers
Database Optimization: Monitor query performance and optimize indexes
Scaling: Add feed generation workers based on user activity

Database Performance:

Connection Pooling: Monitor connection usage and scale pools
Query Optimization: Analyze slow queries and optimize indexes
Partitioning: Monitor partition sizes and implement range partitioning
Replication: Ensure read replicas are healthy and up-to-date

Capacity Planning

Growth Projections:

Year 1: 100M users, 1B posts/day, 10TB media/day
Year 2: 500M users, 5B posts/day, 50TB media/day
Year 3: 1B users, 10B posts/day, 100TB media/day
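
A quick sanity check helps when presenting projections like these. The sketch below derives the per-user posting rate, the average media size per post, and the average write rate implied by the figures above. It assumes decimal terabytes (10^12 bytes) and traffic spread evenly across the day, neither of which the source states.

```python
TB = 10**12  # decimal terabyte (assumption; the text does not say TB vs TiB)
SECONDS_PER_DAY = 86_400

# (users, posts per day, media TB per day), copied from the projections above.
PROJECTIONS = {
    "Year 1": (100_000_000, 1_000_000_000, 10),
    "Year 2": (500_000_000, 5_000_000_000, 50),
    "Year 3": (1_000_000_000, 10_000_000_000, 100),
}


def implied_metrics(users: int, posts_per_day: int, media_tb_per_day: int) -> dict:
    return {
        "posts_per_user_per_day": posts_per_day / users,
        "avg_media_bytes_per_post": media_tb_per_day * TB / posts_per_day,
        "avg_posts_per_second": posts_per_day / SECONDS_PER_DAY,
    }


for year, figures in PROJECTIONS.items():
    print(year, implied_metrics(*figures))
```

Every year works out to 10 posts per user per day and about 10 KB of media per post. Both values look unrealistic for a photo-sharing workload, since a single processed photo is usually much larger than 10 KB, so in an interview it is worth stating your own per-post size and posting-rate assumptions and recomputing storage from them.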

Resource Scaling:

Media Storage: Scale S3 buckets and CDN capacity based on upload volume
Database: Add read replicas when query latency > 100ms
Processing: Scale workers based on queue depth and processing time
Caching: Expand Redis clusters when hit ratio < 80%

Cost Optimization:

Storage Tiering: Move old media to cheaper storage tiers
CDN Optimization: Use intelligent caching and compression
Database Optimization: Implement connection pooling and query optimization
Resource Right-sizing: Monthly review of instance utilization

Security and Compliance

Data Protection:

Encryption: AES-256 at rest, TLS in transit
Access Control: OAuth 2.0 with JWT tokens and RBAC
Audit Logging: Complete audit trail for all user actions
Data Retention: Automated deletion based on user preferences and regulations

Content Moderation:

Automated Detection: ML models for inappropriate content
Human Review: Escalation to human moderators for edge cases
User Reporting: Community-driven content flagging system
Appeal Process: User-friendly content appeal and review system

FAQ

Software Engineer Level

Q: How do you handle image uploads from different devices with varying quality?
A: Implement adaptive processing based on source device and connection quality. Use progressive JPEG encoding for better loading experience. Detect device capabilities and adjust processing parameters accordingly. Implement client-side compression for mobile devices to reduce upload time.

Expected Depth: Basic understanding of media processing, can explain one approach clearly
Red Flags: Over-engineering, not considering simple solutions first

Q: How do you ensure feed loading is fast for users with slow connections?
A: Implement progressive loading with skeleton screens. Use image thumbnails for initial load, then progressive enhancement. Compress images aggressively for slow connections. Implement offline caching for recently viewed content. Use adaptive bitrate for videos.

Expected Depth: Understanding of caching strategies and progressive loading
Red Flags: Only theoretical knowledge, no discussion of user experience

Senior Software Engineer Level

Q: How do you handle celebrity users with millions of followers for feed generation?
A: Implement tiered fan-out strategies with production-grade scaling. Use immediate fan-out to top 10K active followers, lazy loading for others. Deploy separate high-capacity queues for celebrity accounts (1000+ ops/sec capacity). Implement content caching with 24-hour TTL for viral posts. Use push-pull hybrid model with intelligent prefetching based on user activity patterns.

Expected Depth: Multiple solutions with trade-offs, real-world scaling experience
Red Flags: Only theoretical knowledge, no discussion of operational concerns
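
The push-pull hybrid from the celebrity question can be sketched in a simplified form: posts from ordinary authors are pushed into follower inboxes at write time, while posts from high-follower authors are stored once and merged in at read time. This simplification does not implement the "top 10K active followers" tier from the answer. The threshold of 3 is deliberately tiny for readability, and all names are hypothetical.

```python
from typing import Dict, List, Set

CELEBRITY_THRESHOLD = 3  # hypothetical; real systems would use a far larger cutoff


class FeedFanout:
    def __init__(self, followers: Dict[str, Set[str]]) -> None:
        self.followers = followers                        # author -> follower ids
        self.inboxes: Dict[str, List[str]] = {}           # follower -> pushed post ids
        self.celebrity_posts: Dict[str, List[str]] = {}   # author -> posts pulled on read

    def publish(self, author: str, post_id: str) -> None:
        fans = self.followers.get(author, set())
        if len(fans) >= CELEBRITY_THRESHOLD:
            # Pull path: store once instead of writing to every inbox.
            self.celebrity_posts.setdefault(author, []).append(post_id)
        else:
            # Push path: fan out on write to each follower's inbox.
            for follower in fans:
                self.inboxes.setdefault(follower, []).append(post_id)

    def read_feed(self, user: str) -> List[str]:
        feed = list(self.inboxes.get(user, []))
        # A real system would look up only the celebrities this user follows.
        for author, fans in self.followers.items():
            if user in fans:
                feed.extend(self.celebrity_posts.get(author, []))
        return feed
```

The trade-off is write amplification against read cost: pushing keeps reads cheap for most users, while pulling avoids millions of inbox writes for a single celebrity post.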

Q: How would you implement efficient hashtag trending algorithms?
A: Use sliding window algorithms with exponential decay and Redis sorted sets for real-time scoring. Implement Apache Kafka streams with 1-minute windows for trend calculation. Use geographic partitioning for localized trends. Apply ML-based spam detection with 99.5% accuracy. Implement circuit breakers for trending calculation failures.

Expected Depth: Advanced algorithms with performance considerations
Red Flags: Not considering spam detection or geographic variations

Staff Engineer Level

Q: How would you design this system for 10x growth in users and content?
A:

Implement microservices architecture with domain-driven design
Use event-driven architecture with CQRS for read/write separation
Design for multi-region deployment with data locality
Implement edge computing for content processing and delivery
Use machine learning for intelligent content caching and prefetching
Design auto-scaling systems with predictive scaling based on usage patterns

Expected Depth: End-to-end system design, cost considerations, organizational impact
Red Flags: Not considering team/operational complexity, ignoring cost

Q: How do you balance personalization with content discovery and creator fairness?
A: Implement multi-objective optimization in ranking algorithms that balance engagement, discovery, and fairness. Use exploration vs exploitation strategies to ensure diverse content exposure. Implement creator boost mechanisms for new or underrepresented creators. Use position bias correction in ranking algorithms. Provide transparency tools for creators to understand their reach.

Expected Depth: Complex algorithmic trade-offs with business impact
Red Flags: Not considering creator ecosystem or algorithmic bias

Performance Optimizations

Image and Video Optimization

Adaptive Quality: Serve different quality levels based on device and connection
Format Optimization: Use modern formats (WebP, AVIF) with fallbacks
Progressive Loading: Progressive JPEG and adaptive streaming for videos
Compression: Intelligent compression based on content type and viewing context

Feed Performance

Precomputed Feeds: Generate and cache feeds for active users
Pagination: Efficient cursor-based pagination for infinite scroll
Lazy Loading: Load content as user scrolls with predictive prefetching
Edge Caching: Cache popular content at CDN edge locations

Database Optimization

Read Replicas: Distribute read queries across multiple replicas
Partitioning: Time-based and hash-based partitioning for large tables
Indexing: Optimize indexes for common query patterns
Caching: Multi-level caching with Redis for hot data

Security Considerations

Content Security

Content Scanning: Automated detection of inappropriate content using ML
User Reporting: Community-driven moderation with reporting mechanisms
Access Controls: Fine-grained privacy controls for posts and profiles
Data Encryption: Encrypt sensitive data at rest and in transit

Platform Security

API Security: Rate limiting, authentication, and input validation
DDoS Protection: Use CDN and specialized DDoS protection services
Fraud Detection: Detect fake accounts and artificial engagement
Privacy Controls: Granular privacy settings and data protection measures
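
Rate limiting at the API layer is often implemented with a token bucket. The sketch below is one such implementation under assumptions: a single in-process bucket per client, with the caller supplying timestamps in seconds. A distributed gateway would keep this state in a shared store. TokenBucket and its parameters are hypothetical names.

```python
class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start full so initial bursts succeed
        self.last = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller would typically respond with HTTP 429
```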

Tips for Success

Start with Core Features: Focus on photo sharing and basic social features first
Emphasize Scale: Discuss media processing and feed generation at scale
User Experience: Consider mobile-first design and performance optimization
Content Quality: Address content moderation and recommendation systems
Global Considerations: Plan for international users and content delivery
Privacy and Safety: Address data protection and user safety concerns
Algorithm Transparency: Discuss ranking algorithms and their trade-offs
