Design YouTube
What is YouTube?
YouTube is a video sharing platform that allows users to upload, watch, and share videos. It's similar to Vimeo, TikTok, or Twitch. The service provides video hosting, streaming, and content discovery.
What sets systems like YouTube apart is the combination of video processing, CDN distribution, and content delivery. Once you understand YouTube, you can tackle interview questions about similar video platforms, because the core design challenges (video processing, CDN distribution, content delivery, and scalability) remain the same.
Functional Requirements
Core (Interview Focused)
- Video Upload: Users can upload videos of various formats and sizes.
- Video Streaming: Users can stream videos with different quality options.
- Content Discovery: Users can discover videos through search and recommendations.
- Video Processing: Process videos for different quality levels and formats.
Out of Scope
- User authentication and accounts
- Video monetization and ads
- Live streaming
- Video analytics and insights
- Mobile app specific features
Non-Functional Requirements
Core (Interview Focused)
- High availability: 99.9% uptime for video streaming.
- Scalability: Handle petabytes of video content.
- Performance: Fast video loading and streaming.
- Global distribution: Serve videos worldwide with low latency.
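To make the availability target concrete, the following sketch converts an uptime percentage into a downtime budget. It is plain arithmetic; the function name `downtime_budget_minutes` is illustrative, and the 365-day year is a simplifying assumption.

```python
def downtime_budget_minutes(availability_pct: float, days: float = 365.0) -> float:
    """Return the allowed downtime in minutes for a given availability percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


# Budget implied by the 99.9% target, per year and per 30-day month.
yearly = downtime_budget_minutes(99.9)
monthly = downtime_budget_minutes(99.9, days=30)
print(f"99.9% uptime allows about {yearly:.1f} minutes of downtime per year")
print(f"and about {monthly:.1f} minutes per 30-day month")
```

At 99.9%, the budget works out to roughly 8.8 hours per year, which is a useful number to quote when discussing redundancy and failover.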
Out of Scope
- Data retention policies
- Compliance and privacy regulations
💡 Interview Tip: Focus on high availability, scalability, and performance. Interviewers care most about video processing, CDN distribution, and content delivery.
Core Entities
Entity | Key Attributes | Notes |
---|---|---|
Video | video_id, title, description, duration, upload_date | Indexed by upload_date for recent videos |
User | user_id, username, email, subscriber_count | User account information |
VideoFile | file_id, video_id, quality, format, file_url, file_size | Video file information |
Category | category_id, name, description, video_count | Video categorization |
View | view_id, video_id, user_id, timestamp, duration | Video view tracking |
💡 Interview Tip: Focus on Video, VideoFile, and View as they drive video processing, content delivery, and analytics.
Core APIs
Video Management
- POST /videos { title, description, category, file } – Upload a new video
- GET /videos/{video_id} – Get video details
- PUT /videos/{video_id} { title, description } – Update video information
- DELETE /videos/{video_id} – Delete a video
Video Streaming
- GET /videos/{video_id}/stream?quality= – Stream video content
- GET /videos/{video_id}/thumbnail – Get video thumbnail
- GET /videos/{video_id}/formats – Get available video formats
- POST /videos/{video_id}/view – Record video view
Content Discovery
- GET /videos/search?query=&category=&limit= – Search for videos
- GET /videos/trending?category=&limit= – Get trending videos
- GET /videos/recommended?user_id=&limit= – Get recommended videos
- GET /videos/category/{category_id}?limit= – Get videos by category
User Management
- GET /users/{user_id}/videos – Get user's videos
- GET /users/{user_id}/subscriptions – Get user's subscriptions
- POST /users/{user_id}/subscribe – Subscribe to user
- GET /users/{user_id}/recommendations – Get personalized recommendations
High-Level Design
System Architecture Diagram
Key Components
- Video Service: Handle video CRUD operations
- Video Processing Service: Process videos for different formats and qualities
- CDN Service: Distribute video content globally
- Content Discovery Service: Handle search and recommendations
- Streaming Service: Manage video streaming and delivery
- Database: Persistent storage for videos, users, and metadata
Mapping Core Functional Requirements to Components
Functional Requirement | Responsible Components | Key Considerations |
---|---|---|
Video Upload | Video Service, Video Processing Service | File upload, video processing |
Video Streaming | Streaming Service, CDN Service | Content delivery, quality adaptation |
Content Discovery | Content Discovery Service, Database | Search algorithms, recommendation systems |
Video Processing | Video Processing Service, Storage | Transcoding, format conversion |
Detailed Design
Video Processing Service
Purpose: Process uploaded videos for different formats and quality levels.
Key Design Decisions:
- Transcoding: Convert videos to multiple formats and qualities
- Thumbnail Generation: Generate video thumbnails
- Metadata Extraction: Extract video metadata
- Quality Optimization: Optimize video quality for different devices
Algorithm: Video processing
1. Receive uploaded video file
2. Validate video format and size
3. Extract video metadata:
- Duration
- Resolution
- Frame rate
- Bitrate
4. Generate video thumbnails:
- Extract frames at intervals
- Generate thumbnail images
- Store thumbnails
5. Transcode video:
- Convert to multiple formats (MP4, WebM)
- Generate different quality levels
- Optimize for different devices
6. Store processed videos
7. Update video status
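The transcoding step above can be sketched as a job planner that fans one upload out into one job per quality level and format. This is a minimal sketch under stated assumptions: the `Rendition` type, the `LADDER` bitrates, and `plan_transcode_jobs` are all hypothetical names and values chosen for illustration, not figures from any real encoder configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    height: int        # output height in pixels
    bitrate_kbps: int  # target video bitrate (illustrative value)


# Hypothetical quality ladder; real bitrates depend on codec and content.
LADDER = [
    Rendition("240p", 240, 400),
    Rendition("360p", 360, 800),
    Rendition("480p", 480, 1200),
    Rendition("720p", 720, 2500),
    Rendition("1080p", 1080, 5000),
]
FORMATS = ["mp4", "webm"]


def plan_transcode_jobs(video_id: str, source_height: int) -> list[dict]:
    """Plan one job per (rendition, format), skipping rungs above the source height."""
    renditions = [r for r in LADDER if r.height <= source_height]
    if not renditions:
        # Sources smaller than the lowest rung still get that rung as a fallback.
        renditions = [LADDER[0]]
    return [
        {"video_id": video_id, "quality": r.name, "format": fmt,
         "bitrate_kbps": r.bitrate_kbps}
        for r in renditions
        for fmt in FORMATS
    ]


jobs = plan_transcode_jobs("vid-123", source_height=720)
for job in jobs:
    print(job)
```

Each planned job is independent of the others, so the jobs can be handed to separate workers and run in parallel.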
CDN Service
Purpose: Distribute video content globally with low latency.
Key Design Decisions:
- Global Distribution: Deploy CDN nodes worldwide
- Content Caching: Cache popular videos at edge locations
- Load Balancing: Distribute traffic across CDN nodes
- Cache Management: Manage cache expiration and updates
Algorithm: CDN content distribution
1. Receive video streaming request
2. Determine user's geographic location
3. Find nearest CDN node
4. Check if video is cached at node
5. If cached:
- Serve video from cache
- Update cache statistics
6. If not cached:
- Fetch video from origin server
- Cache video at edge node
- Serve video to user
7. Monitor CDN performance
8. Update cache policies
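The hit and miss paths in steps 4 to 6 can be illustrated with a small in-memory model of a single edge node. This is a sketch only: it assumes a least-recently-used eviction policy (the steps above do not name one), and `EdgeCache` and `origin` are hypothetical stand-ins for a real CDN node and origin server.

```python
from collections import OrderedDict
from typing import Callable


class EdgeCache:
    """A tiny LRU cache standing in for one CDN edge node."""

    def __init__(self, capacity: int, fetch_from_origin: Callable[[str], bytes]):
        self.capacity = capacity
        self.fetch_from_origin = fetch_from_origin
        self._store = OrderedDict()  # key -> cached bytes, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self._store:
            # Cache hit: mark as most recently used and update statistics.
            self._store.move_to_end(key)
            self.hits += 1
            return self._store[key]
        # Cache miss: fetch from origin, cache at the edge, then serve.
        self.misses += 1
        data = self.fetch_from_origin(key)
        self._store[key] = data
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict the least recently used entry
        return data


origin_calls = []


def origin(key: str) -> bytes:
    """Hypothetical origin server that records which keys were requested."""
    origin_calls.append(key)
    return f"bytes-of-{key}".encode()


edge = EdgeCache(capacity=2, fetch_from_origin=origin)
for segment in ["a", "a", "b", "c", "a"]:
    edge.get(segment)
print(f"hits={edge.hits} misses={edge.misses} origin_calls={origin_calls}")
```

The hit and miss counters are the raw inputs for the CDN hit rate that a monitoring system would track.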
Content Discovery Service
Purpose: Handle video search and recommendation systems.
Key Design Decisions:
- Search Algorithms: Use full-text search and content-based search
- Recommendation Engine: Generate personalized video recommendations
- Trending Algorithm: Identify trending videos
- Content Filtering: Filter content based on user preferences
Algorithm: Video recommendation
1. Analyze user behavior:
- Watch history
- Search history
- Like/dislike patterns
- Subscription preferences
2. Find similar users:
- Users with similar watch patterns
- Users with similar preferences
3. Generate recommendations:
- Videos liked by similar users
- Videos in preferred categories
- Trending videos
4. Rank recommendations:
- User preference score
- Video popularity
- Recency factor
5. Return personalized recommendations
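One simple way to realize the steps above is user-based collaborative filtering. The sketch below uses Jaccard similarity over watch histories and blends in a small popularity term. It assumes watch history is the only signal, omits the recency factor, and uses an arbitrary popularity weight of 0.1; `jaccard` and `recommend` are illustrative names, not part of any specific recommendation library.

```python
from collections import defaultdict


def jaccard(a: set, b: set) -> float:
    """Similarity of two watch histories: shared videos over all videos."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user: str, history: dict[str, set[str]],
              popularity: dict[str, int], k: int = 3) -> list[str]:
    """Rank unseen videos by similar users' histories plus a popularity bonus."""
    seen = history.get(user, set())
    scores = defaultdict(float)
    for other, watched in history.items():
        if other == user:
            continue
        sim = jaccard(seen, watched)
        if sim == 0:
            continue  # ignore users with no overlap
        for video in watched - seen:
            scores[video] += sim
    max_pop = max(popularity.values(), default=1) or 1

    def final_score(video: str) -> float:
        # 0.1 is an arbitrary weight chosen for illustration.
        return scores[video] + 0.1 * popularity.get(video, 0) / max_pop

    return sorted(scores, key=lambda v: (-final_score(v), v))[:k]


history = {
    "alice": {"v1", "v2"},
    "bob": {"v1", "v2", "v3"},
    "carol": {"v2", "v4"},
    "dave": {"v5"},
}
popularity = {"v3": 100, "v4": 1000, "v5": 50}
print(recommend("alice", history, popularity))
```

Here `v3` ranks above `v4` even though `v4` is more popular, because the user who watched `v3` overlaps more strongly with the target user.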
Streaming Service
Purpose: Manage video streaming and quality adaptation.
Key Design Decisions:
- Adaptive Streaming: Adjust video quality based on network conditions
- Buffering Management: Manage video buffering and preloading
- Quality Selection: Select appropriate video quality
- Stream Optimization: Optimize streaming performance
Algorithm: Adaptive streaming
1. Receive video streaming request
2. Detect user's network conditions:
- Bandwidth
- Latency
- Device capabilities
3. Select appropriate video quality:
- Start with medium quality
- Adjust based on network conditions
- Consider device capabilities
4. Stream video content:
- Send video chunks
- Monitor streaming performance
- Adjust quality as needed
5. Handle streaming errors:
- Retry failed requests
- Fallback to lower quality
- Notify user of issues
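The quality selection in step 3 can be sketched as a throughput-based rule: pick the highest rung of the quality ladder whose bitrate fits within a safety margin of the measured bandwidth. The ladder values, the 0.8 safety factor, and the name `select_quality` are assumptions made for illustration; production players typically also factor in buffer level.

```python
# Hypothetical ladder of (name, height in pixels, bitrate in kbps), ascending.
LADDER = [
    ("240p", 240, 400),
    ("360p", 360, 800),
    ("480p", 480, 1200),
    ("720p", 720, 2500),
    ("1080p", 1080, 5000),
]


def select_quality(throughput_kbps: float, device_max_height: int = 1080,
                   safety: float = 0.8) -> str:
    """Pick the highest quality that fits the bandwidth budget and the device."""
    budget = throughput_kbps * safety  # leave headroom for throughput swings
    choice = LADDER[0][0]              # lowest rung is the fallback
    for name, height, kbps in LADDER:
        if height <= device_max_height and kbps <= budget:
            choice = name
    return choice


print(select_quality(4000))                         # fast connection
print(select_quality(300))                          # slow connection
print(select_quality(10000, device_max_height=480)) # small screen
```

Calling this function again after each downloaded chunk, with a fresh throughput estimate, gives the "adjust quality as needed" behavior described in step 4.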
Database Design
Videos Table
Field | Type | Description |
---|---|---|
video_id | VARCHAR(36) | Primary key |
user_id | VARCHAR(36) | Video owner |
title | VARCHAR(255) | Video title |
description | TEXT | Video description |
category | VARCHAR(100) | Video category |
duration | INT | Video duration in seconds |
upload_date | TIMESTAMP | Upload timestamp |
view_count | BIGINT | Total views |
like_count | INT | Total likes |
Indexes:
- idx_user_id on (user_id) - User videos
- idx_category on (category) - Category-based queries
- idx_upload_date on (upload_date) - Recent videos
- idx_view_count on (view_count) - Popular videos
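As a quick sanity check of the Videos table and its indexes, the sketch below creates them in an in-memory SQLite database through Python's `sqlite3` module. SQLite is used only because it ships with Python; the `NOT NULL` constraints, default values, and sample rows are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE videos (
    video_id    VARCHAR(36) PRIMARY KEY,
    user_id     VARCHAR(36) NOT NULL,
    title       VARCHAR(255) NOT NULL,
    description TEXT,
    category    VARCHAR(100),
    duration    INT,
    upload_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    view_count  BIGINT DEFAULT 0,
    like_count  INT DEFAULT 0
);
CREATE INDEX idx_user_id     ON videos (user_id);
CREATE INDEX idx_category    ON videos (category);
CREATE INDEX idx_upload_date ON videos (upload_date);
CREATE INDEX idx_view_count  ON videos (view_count);
""")

# Sample rows (made up) to exercise a category query ordered by popularity.
conn.executemany(
    "INSERT INTO videos (video_id, user_id, title, category, duration, view_count) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("v1", "u1", "Intro to CDNs", "tech", 300, 10),
        ("v2", "u1", "Transcoding 101", "tech", 600, 50),
        ("v3", "u2", "Cat video", "pets", 30, 999),
    ],
)
popular_tech = conn.execute(
    "SELECT title FROM videos WHERE category = ? ORDER BY view_count DESC LIMIT 10",
    ("tech",),
).fetchall()
print(popular_tech)
```

Note that several tables in this design reuse index names such as idx_user_id and idx_video_id; in databases where index names are unique per schema, they would need a table prefix.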
Video Files Table
Field | Type | Description |
---|---|---|
file_id | VARCHAR(36) | Primary key |
video_id | VARCHAR(36) | Associated video |
quality | VARCHAR(20) | Video quality |
format | VARCHAR(10) | Video format |
file_url | TEXT | File storage URL |
file_size | BIGINT | File size in bytes |
Indexes:
- idx_video_id on (video_id) - Video files
- idx_quality on (quality) - Quality-based queries
Users Table
Field | Type | Description |
---|---|---|
user_id | VARCHAR(36) | Primary key |
username | VARCHAR(100) | Username |
email | VARCHAR(255) | Email address |
subscriber_count | INT | Number of subscribers |
video_count | INT | Number of videos |
created_at | TIMESTAMP | Account creation |
Indexes:
- idx_username on (username) - Username lookup
- idx_subscriber_count on (subscriber_count) - Popular creators
Views Table
Field | Type | Description |
---|---|---|
view_id | VARCHAR(36) | Primary key |
video_id | VARCHAR(36) | Viewed video |
user_id | VARCHAR(36) | Viewer (optional) |
timestamp | TIMESTAMP | View timestamp |
duration | INT | View duration in seconds |
Indexes:
- idx_video_id on (video_id) - Video views
- idx_user_id on (user_id) - User views
- idx_timestamp on (timestamp) - View history
Scalability Considerations
Horizontal Scaling
- Video Service: Scale horizontally with load balancers
- Video Processing Service: Scale video processing with distributed systems
- CDN Service: Scale CDN nodes globally
- Database: Shard videos and users by geographic regions
Caching Strategy
- CDN: Cache video content globally
- Redis: Cache video metadata and recommendations
- Application Cache: Cache frequently accessed data
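The metadata caching layer is commonly implemented with the cache-aside pattern: read from the cache first, fall back to the database on a miss, then populate the cache with a time-to-live. The sketch below uses an in-memory `TTLCache` class as a stand-in for Redis; the class, the `get_video_metadata` helper, and the 60-second TTL are all hypothetical.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache such as Redis, with per-entry expiry."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be simulated
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._data[key]  # entry expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, self.clock() + self.ttl)


def get_video_metadata(video_id, cache, load_from_db):
    """Cache-aside read: try the cache, then the database, then fill the cache."""
    cached = cache.get(video_id)
    if cached is not None:
        return cached
    value = load_from_db(video_id)
    cache.set(video_id, value)
    return value


now = [0.0]    # fake clock so the example does not need to sleep
db_calls = []  # records every database read


def load(video_id):
    db_calls.append(video_id)
    return {"video_id": video_id, "title": "Demo"}


cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])
get_video_metadata("v1", cache, load)  # miss: reads the database
get_video_metadata("v1", cache, load)  # hit: served from the cache
now[0] = 61.0                          # advance past the TTL
get_video_metadata("v1", cache, load)  # expired: reads the database again
print(db_calls)
```

The TTL bounds how stale cached metadata can become, which fits the eventual-consistency choice for metadata discussed under trade-offs.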
Performance Optimization
- Connection Pooling: Efficient database connections
- Batch Processing: Batch video processing for efficiency
- Async Processing: Non-blocking video processing
- Resource Monitoring: Monitor CPU, memory, and network usage
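Non-blocking processing is usually built around a job queue drained by a pool of workers. The sketch below shows the pattern with `asyncio`; the `transcode` coroutine is a placeholder that only sleeps, and the worker count of 3 is an arbitrary assumption. A production system would more likely use a durable message queue and separate worker machines.

```python
import asyncio


async def transcode(job: str) -> str:
    """Placeholder for real transcoding work."""
    await asyncio.sleep(0.01)
    return f"{job}:done"


async def worker(queue: asyncio.Queue, results: list) -> None:
    """Pull jobs from the queue until cancelled."""
    while True:
        job = await queue.get()
        try:
            results.append(await transcode(job))
        finally:
            queue.task_done()


async def process_all(jobs: list[str], concurrency: int = 3) -> list[str]:
    """Process all jobs with a fixed number of concurrent workers."""
    queue = asyncio.Queue()
    results = []
    for job in jobs:
        queue.put_nowait(job)
    workers = [asyncio.create_task(worker(queue, results))
               for _ in range(concurrency)]
    await queue.join()  # wait until every queued job has been processed
    for task in workers:
        task.cancel()   # workers loop forever, so stop them explicitly
    await asyncio.gather(*workers, return_exceptions=True)
    return results


results = asyncio.run(process_all([f"video-{i}" for i in range(5)]))
print(sorted(results))
```

Because the upload handler only enqueues a job, it can return to the user immediately while processing continues in the background.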
Monitoring and Observability
Key Metrics
- Video Upload Time: Average time to upload videos
- Streaming Latency: Average video streaming latency
- CDN Performance: CDN hit rate and response time
- System Health: CPU, memory, and disk usage
Alerting
- High Latency: Alert when streaming latency exceeds threshold
- CDN Failures: Alert when CDN nodes fail
- Processing Errors: Alert when video processing fails
- System Errors: Alert on video service failures
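A latency alert is often defined on a high percentile rather than the average, so that a slow tail is not hidden by many fast requests. The sketch below assumes a nearest-rank p95 and a 500 ms threshold; both choices, along with the function names and sample values, are illustrative.

```python
import math


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile of a non-empty list."""
    if not values:
        raise ValueError("percentile of empty list")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def should_alert(latencies_ms: list[float], threshold_ms: float, pct: int = 95) -> bool:
    """Fire when the chosen percentile of latency exceeds the threshold."""
    return percentile(latencies_ms, pct) > threshold_ms


healthy = [100.0] * 99 + [900.0]        # 1% slow requests
degraded = [100.0] * 90 + [900.0] * 10  # 10% slow requests
print(should_alert(healthy, threshold_ms=500))
print(should_alert(degraded, threshold_ms=500))
```

With these samples, a single slow request does not trigger the alert, but a tail of 10% slow requests does.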
Trade-offs and Considerations
Consistency vs. Availability
- Choice: Eventual consistency for video metadata, strong consistency for streaming
- Reasoning: Video metadata can tolerate slight delays, whereas streaming needs immediate accuracy
Storage vs. Performance
- Choice: Use CDN caching for better performance
- Reasoning: Balance between storage costs and streaming performance
Quality vs. Bandwidth
- Choice: Use adaptive streaming for optimal quality
- Reasoning: Balance between video quality and bandwidth usage
Common Interview Questions
Q: How would you handle video processing at scale?
A: Use distributed video processing, multiple transcoding pipelines, and efficient storage.
Q: How do you ensure global video delivery?
A: Use CDN distribution, edge caching, and geographic optimization.
Q: How would you scale this system globally?
A: Deploy regional video servers, use geo-distributed databases, and implement data replication strategies.
Q: How do you handle video recommendation accuracy?
A: Combine multiple recommendation algorithms, incorporate user feedback, and retrain continuously.
Key Takeaways
- Video Processing: Distributed transcoding and format conversion are essential for video platforms
- CDN Distribution: Global CDN deployment and edge caching enable fast video delivery
- Content Discovery: Search algorithms and recommendation systems improve user experience
- Scalability: Horizontal scaling and geographic partitioning are crucial for handling large-scale video content
- Monitoring: Comprehensive monitoring ensures system reliability and performance