Design Yelp
System Design Challenge
What is Yelp?
Yelp is a local business discovery platform that allows users to find and review local businesses. It's similar to Google Maps, Foursquare, or TripAdvisor. The service provides business search, reviews, ratings, and recommendations.
What sets systems like Yelp apart is the combination of geospatial search with business discovery and reviews. Once you understand Yelp, you can tackle interview questions about similar local business platforms, because the core design challenges (geospatial search, business discovery, review management, and recommendations) remain the same.
Functional Requirements
Core (Interview Focused)
- Business Search: Users can search for businesses by location and category.
- Business Discovery: Discover nearby businesses based on location.
- Review Management: Users can write and read business reviews.
- Recommendation System: Provide personalized business recommendations.
Out of Scope
- User authentication and accounts
- Business owner tools and analytics
- Reservation and booking systems
- Payment processing
- Mobile app specific features
Non-Functional Requirements
Core (Interview Focused)
- Low latency: Sub-second response time for search queries.
- High availability: 99.9% uptime for business discovery.
- Scalability: Handle millions of businesses and reviews.
- Accuracy: Accurate business information and recommendations.
Out of Scope
- Data retention policies
- Compliance and privacy regulations
💡 Interview Tip: Focus on low latency, high availability, and scalability. Interviewers care most about geospatial search, business discovery, and review management.
Core Entities
Entity | Key Attributes | Notes |
---|---|---|
Business | business_id, name, category, location, rating, review_count | Indexed by location for geospatial search |
Review | review_id, business_id, user_id, rating, content, timestamp | Indexed by business_id for business reviews |
User | user_id, username, email, review_count, helpful_votes | User account information |
Category | category_id, name, parent_category, business_count | Business categorization |
Location | location_id, latitude, longitude, address, city | Geographic location data |
💡 Interview Tip: Focus on Business, Review, and Location as they drive business discovery, review management, and geospatial search.
Core APIs
Business Search
- GET /businesses/search?query=&location=&category=&radius= – Search for businesses
- GET /businesses/{business_id} – Get business details
- GET /businesses/nearby?latitude=&longitude=&radius=&category= – Find nearby businesses
- GET /businesses?category=&city=&limit= – List businesses with filters
Review Management
- POST /businesses/{business_id}/reviews { rating, content } – Write a business review
- GET /businesses/{business_id}/reviews?sort=&limit= – Get business reviews
- PUT /reviews/{review_id} { rating, content } – Update a review
- DELETE /reviews/{review_id} – Delete a review
User Management
- GET /users/{user_id} – Get user profile
- GET /users/{user_id}/reviews – Get user's reviews
- GET /users/{user_id}/recommendations – Get personalized recommendations
- POST /reviews/{review_id}/helpful – Mark review as helpful
Categories
- GET /categories – Get all business categories
- GET /categories/{category_id}/businesses – Get businesses in category
- GET /categories/{category_id}/subcategories – Get subcategories
- GET /categories/search?query= – Search categories
High-Level Design
System Architecture Diagram
Key Components
- Business Service: Handle business CRUD operations
- Search Service: Process search queries and geospatial search
- Review Service: Manage business reviews and ratings
- Recommendation Service: Generate personalized business recommendations
- Geospatial Service: Handle location-based queries
- Database: Persistent storage for businesses, reviews, and users
Mapping Core Functional Requirements to Components
Functional Requirement | Responsible Components | Key Considerations |
---|---|---|
Business Search | Search Service, Geospatial Service | Search algorithms, geospatial indexing |
Business Discovery | Geospatial Service, Business Service | Location-based queries, business data |
Review Management | Review Service, Database | Review storage, rating calculation |
Recommendation System | Recommendation Service, Review Service | Personalization, business ranking |
Detailed Design
Search Service
Purpose: Process search queries and provide relevant business results.
Key Design Decisions:
- Search Algorithms: Use full-text search and geospatial search
- Result Ranking: Rank results by relevance, rating, and distance
- Query Processing: Parse and optimize search queries
- Caching: Cache frequent search results
Algorithm: Business search
1. Receive search query with location
2. Parse query parameters:
- Search terms
- Location coordinates
- Category filters
- Radius constraints
3. Execute search:
- Full-text search on business names/descriptions
- Geospatial search for location-based results
- Category filtering
4. Rank results by:
- Text relevance score
- Business rating
- Distance from user
- Review count
5. Return ranked results
6. Cache results for performance
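The ranking step above can be sketched as a weighted score over the four listed signals. This is a minimal illustration, not a tuned ranking function: the `Candidate` type, the weights, and the normalization choices (log-scaled review count, linear distance decay inside the search radius) are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    text_score: float   # relevance from the full-text engine, assumed normalized to 0..1
    rating: float       # average rating on the 1..5 scale
    distance_km: float  # distance from the user
    review_count: int


def rank_score(c: Candidate, radius_km: float,
               weights=(0.4, 0.3, 0.2, 0.1)) -> float:
    """Combine the four ranking signals into one score in 0..1.

    The weights are illustrative assumptions, not recommended values.
    """
    if radius_km <= 0:
        raise ValueError("radius_km must be positive")
    w_text, w_rating, w_dist, w_reviews = weights
    rating_part = (c.rating - 1.0) / 4.0                      # map 1..5 to 0..1
    distance_part = max(0.0, 1.0 - c.distance_km / radius_km)  # closer is better
    # Log scale so 10,000 reviews do not drown out every other signal.
    reviews_part = min(1.0, math.log10(1 + c.review_count) / 4.0)
    return (w_text * c.text_score + w_rating * rating_part
            + w_dist * distance_part + w_reviews * reviews_part)


def rank(candidates, radius_km):
    """Return candidates ordered from best to worst score."""
    return sorted(candidates, key=lambda c: rank_score(c, radius_km), reverse=True)
```

In practice the weights would be tuned against click or conversion data; the point here is only that each signal is normalized before being combined.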
Geospatial Service
Purpose: Handle location-based queries and geospatial search.
Key Design Decisions:
- Geospatial Indexing: Use R-tree or similar for spatial queries
- Distance Calculation: Calculate distances efficiently
- Location Validation: Validate location coordinates
- Proximity Search: Find businesses within specified radius
Algorithm: Geospatial search
1. Receive location-based query
2. Validate location coordinates
3. Query geospatial index:
- Find businesses within radius
- Filter by category if specified
- Sort by distance
4. Calculate distances:
- Use Haversine formula for accuracy
- Cache distance calculations
5. Return proximity-ranked results
6. Update search statistics
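As a rough sketch of steps 2 to 5, the code below validates coordinates, applies a cheap bounding-box pre-filter, and then uses the Haversine formula for the final distance check. It scans a plain list instead of querying a real spatial index, and the dictionary keys (`latitude`, `longitude`, `category`) are assumed to mirror the Businesses table. The bounding box is an approximation that is adequate for city-scale radii away from the poles and the antimeridian.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def validate(lat, lon):
    """Reject coordinates outside the valid latitude/longitude ranges."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"invalid coordinates: {lat}, {lon}")


def nearby(businesses, lat, lon, radius_km, category=None):
    """Return (distance_km, business) pairs within radius_km, nearest first."""
    validate(lat, lon)
    # Approximate bounding box in degrees; longitude degrees shrink with cos(lat).
    lat_delta = math.degrees(radius_km / EARTH_RADIUS_KM)
    lon_delta = lat_delta / max(math.cos(math.radians(lat)), 1e-6)
    results = []
    for b in businesses:
        if category and b["category"] != category:
            continue
        if abs(b["latitude"] - lat) > lat_delta:
            continue
        if abs(b["longitude"] - lon) > lon_delta:  # ignores antimeridian wrap
            continue
        d = haversine_km(lat, lon, b["latitude"], b["longitude"])
        if d <= radius_km:
            results.append((d, b))
    results.sort(key=lambda pair: pair[0])
    return results
```

A production system would replace the list scan with an index lookup (R-tree, geohash, or similar) and keep the Haversine check only for the surviving candidates.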
Review Service
Purpose: Manage business reviews and calculate ratings.
Key Design Decisions:
- Review Storage: Store reviews efficiently with metadata
- Rating Calculation: Calculate business ratings from reviews
- Review Validation: Validate review content and ratings
- Review Moderation: Moderate reviews for quality
Algorithm: Review processing
1. Receive review submission
2. Validate review:
- Check rating range (1-5)
- Validate content length
- Check for spam/abuse
3. Store review in database
4. Update business rating:
- Recalculate average rating
- Update review count
- Update rating distribution
5. Update user review count
6. Trigger recommendation updates
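Steps 2 and 4 can be illustrated with a running aggregate, so that a new review updates the rating without rescanning every stored review. This is a sketch under assumptions: `RatingSummary`, `validate_review`, and the 5000-character limit are hypothetical names and values, and the spam check is left out.

```python
from dataclasses import dataclass, field


@dataclass
class RatingSummary:
    """Per-business aggregate kept alongside the stored reviews."""
    review_count: int = 0
    rating_sum: int = 0
    distribution: dict = field(
        default_factory=lambda: {stars: 0 for stars in range(1, 6)})

    @property
    def average(self) -> float:
        if self.review_count == 0:
            return 0.0
        return round(self.rating_sum / self.review_count, 2)


def validate_review(rating, content, max_len=5000):
    """Check the rating range (1-5) and content length; raise ValueError if invalid."""
    if isinstance(rating, bool) or not isinstance(rating, int) or not 1 <= rating <= 5:
        raise ValueError("rating must be an integer from 1 to 5")
    if not content.strip() or len(content) > max_len:
        raise ValueError("content must be non-empty and within the length limit")


def apply_review(summary: RatingSummary, rating: int, content: str) -> float:
    """Validate a review, fold it into the aggregate, and return the new average."""
    validate_review(rating, content)
    summary.review_count += 1
    summary.rating_sum += rating
    summary.distribution[rating] += 1
    return summary.average
```

Storing the sum and count rather than only the average keeps updates exact; edits and deletions can subtract the old rating the same way.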
Recommendation Service
Purpose: Generate personalized business recommendations.
Key Design Decisions:
- Collaborative Filtering: Use user behavior for recommendations
- Content-based Filtering: Use business attributes for recommendations
- Hybrid Approach: Combine multiple recommendation methods
- Real-time Updates: Update recommendations based on user activity
Algorithm: Business recommendation
1. Analyze user preferences:
- Review history
- Rating patterns
- Category preferences
- Location patterns
2. Find similar users:
- Users with similar review patterns
- Users with similar preferences
3. Generate recommendations:
- Businesses liked by similar users
- Businesses in preferred categories
- Businesses in preferred locations
4. Rank recommendations:
- User preference score
- Business rating
- Distance from user
5. Return personalized recommendations
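The "find similar users" and "generate recommendations" steps can be sketched with a small user-based collaborative filter. This is one possible approach, assuming ratings are available as a nested dictionary of user to business to stars; cosine similarity and the similarity-weighted average are illustrative choices, and distance and category signals are omitted for brevity.

```python
import math
from collections import defaultdict


def cosine(a: dict, b: dict) -> float:
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[k] * b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user_id: str, ratings: dict, top_n: int = 5) -> list:
    """Return business ids the user has not rated, best predicted rating first."""
    mine = ratings[user_id]
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for other, theirs in ratings.items():
        if other == user_id:
            continue
        sim = cosine(mine, theirs)
        if sim <= 0:
            continue
        for business_id, stars in theirs.items():
            if business_id in mine:
                continue  # only recommend businesses the user has not reviewed
            weighted_sum[business_id] += sim * stars
            weight_total[business_id] += sim
    # Predicted rating is the similarity-weighted average of other users' ratings.
    predicted = {b: weighted_sum[b] / weight_total[b] for b in weighted_sum}
    return sorted(predicted, key=lambda b: (-predicted[b], b))[:top_n]
```

For example, given a user who rated a cafe highly, businesses rated highly by other users with overlapping review history rise to the top of the list.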
Database Design
Businesses Table
Field | Type | Description |
---|---|---|
business_id | VARCHAR(36) | Primary key |
name | VARCHAR(255) | Business name |
category | VARCHAR(100) | Business category |
latitude | DECIMAL(10,8) | Business latitude |
longitude | DECIMAL(11,8) | Business longitude |
address | TEXT | Business address |
city | VARCHAR(100) | Business city |
rating | DECIMAL(3,2) | Average rating |
review_count | INT | Number of reviews |
created_at | TIMESTAMP | Business creation |
Indexes:
- idx_category on (category) - Category-based queries
- idx_city on (city) - City-based queries
- idx_rating on (rating) - Rating-based queries
- idx_location on (latitude, longitude) - Geospatial queries
Reviews Table
Field | Type | Description |
---|---|---|
review_id | VARCHAR(36) | Primary key |
business_id | VARCHAR(36) | Associated business |
user_id | VARCHAR(36) | Review author |
rating | INT | Review rating (1-5) |
content | TEXT | Review content |
helpful_votes | INT | Helpful votes count |
created_at | TIMESTAMP | Review creation |
Indexes:
- idx_business_id on (business_id) - Business reviews
- idx_user_id on (user_id) - User reviews
- idx_rating on (rating) - Rating-based queries
- idx_created_at on (created_at) - Recent reviews
Users Table
Field | Type | Description |
---|---|---|
user_id | VARCHAR(36) | Primary key |
username | VARCHAR(100) | Username |
email | VARCHAR(255) | Email address |
review_count | INT | Number of reviews |
helpful_votes | INT | Helpful votes received |
created_at | TIMESTAMP | Account creation |
Indexes:
- idx_username on (username) - Username lookup
- idx_review_count on (review_count) - Active reviewers
Categories Table
Field | Type | Description |
---|---|---|
category_id | VARCHAR(36) | Primary key |
name | VARCHAR(100) | Category name |
parent_category | VARCHAR(100) | Parent category |
business_count | INT | Number of businesses |
Indexes:
- idx_name on (name) - Category lookup
- idx_parent_category on (parent_category) - Subcategories
Scalability Considerations
Horizontal Scaling
- Business Service: Scale horizontally with load balancers
- Search Service: Use consistent hashing for search partitioning
- Review Service: Scale review processing with distributed systems
- Database: Shard businesses and reviews by geographic regions
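To make the consistent-hashing bullet concrete, here is a minimal hash ring with virtual nodes. It is a sketch under assumptions: the node names, the `cell:` key format, the use of MD5 as the hash, and the virtual node count are all hypothetical choices for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        if not nodes:
            raise ValueError("at least one node is required")
        ring = []
        for node in nodes:
            # Each physical node gets several positions to smooth the distribution.
            for i in range(vnodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._hashes = [h for h, _ in ring]
        self._nodes = [n for _, n in ring]

    @staticmethod
    def _hash(key: str) -> int:
        # MD5 is used only for its even spread, not for security.
        return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")

    def node_for(self, key: str) -> str:
        """Return the first node clockwise from the key's position on the ring."""
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._hashes)
        return self._nodes[idx]
```

The property that matters for partitioning is that adding a search node only moves the keys that now belong to the new node; every other key stays on the node it was already assigned to.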
Caching Strategy
- Redis: Cache search results and business data
- CDN: Cache static content and images
- Application Cache: Cache frequently accessed data
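One way to cache search results is to round coordinates when building the cache key, so that nearby users share an entry, and to expire entries after a short time. The sketch below uses an in-process dictionary as a stand-in for Redis; the key layout, the three-decimal rounding (roughly 100 m), and the `TTLCache` class are assumptions for illustration.

```python
import time


def search_cache_key(query, lat, lon, radius_km, category="", precision=3):
    """Build a cache key; rounding coordinates lets nearby users share entries."""
    return ":".join([
        "search",
        query.strip().lower(),
        f"{lat:.{precision}f}",
        f"{lon:.{precision}f}",
        f"{radius_km:g}",
        category.strip().lower(),
    ])


class TTLCache:
    """Tiny in-process cache with per-entry expiry (stand-in for Redis)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop the stale entry
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self._clock() + self._ttl, value)
```

A short TTL bounds how stale ratings and review counts can be in cached results, which fits the eventual consistency the design accepts for ratings.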
Performance Optimization
- Connection Pooling: Efficient database connections
- Batch Processing: Batch review updates for efficiency
- Async Processing: Non-blocking search processing
- Resource Monitoring: Monitor CPU, memory, and network usage
Monitoring and Observability
Key Metrics
- Search Latency: Average search response time
- Review Processing Time: Average time to process reviews
- Recommendation Accuracy: Accuracy of business recommendations
- System Health: CPU, memory, and disk usage
Alerting
- High Latency: Alert when search time exceeds threshold
- Review Processing Errors: Alert when review processing fails
- Recommendation Errors: Alert when recommendation generation fails
- System Errors: Alert on business processing failures
Trade-offs and Considerations
Consistency vs. Availability
- Choice: Eventual consistency for ratings, strong consistency for reviews
- Reasoning: Ratings can tolerate slight delays, reviews need immediate accuracy
Latency vs. Accuracy
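- Choice: Use approximate geospatial matching, such as coarse index cells or bounding boxes, to select candidates quickly
- Reasoning: Exact distance checks against every business are costly, so approximate candidate selection trades a small amount of accuracy for lower response time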
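A simple illustration of this trade-off is a fixed-size grid: businesses are bucketed by cell, and a query returns everything in the surrounding cells without computing exact distances. The `GridIndex` class and the 0.01 degree cell size are assumptions for the sketch; real systems more often use geohashes, quadtrees, or an R-tree, and the approach below ignores the antimeridian and the narrowing of cells at high latitudes.

```python
import math
from collections import defaultdict


class GridIndex:
    """Approximate proximity lookup using fixed-size latitude/longitude cells."""

    def __init__(self, cell_deg=0.01):
        self.cell_deg = cell_deg          # 0.01 degrees of latitude is about 1.1 km
        self._cells = defaultdict(list)   # (row, col) -> list of business ids

    def _cell(self, lat, lon):
        return (math.floor(lat / self.cell_deg), math.floor(lon / self.cell_deg))

    def add(self, business_id, lat, lon):
        self._cells[self._cell(lat, lon)].append(business_id)

    def candidates(self, lat, lon, rings=1):
        """Return ids in the query cell and its neighbours, with no exact distance check."""
        row, col = self._cell(lat, lon)
        found = []
        for d_row in range(-rings, rings + 1):
            for d_col in range(-rings, rings + 1):
                found.extend(self._cells.get((row + d_row, col + d_col), []))
        return found
```

The result may include businesses slightly outside the requested radius and is returned unsorted; an exact distance pass over this much smaller candidate set restores accuracy when it is needed.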
Storage vs. Performance
- Choice: Use efficient storage for business and review data
- Reasoning: Balance between storage costs and query performance
Common Interview Questions
Q: How would you handle geospatial search at scale?
A: Use geospatial indexing, efficient distance calculations, and geographic partitioning to handle geospatial search at scale.
Q: How do you ensure review quality?
A: Use review validation, moderation systems, and user feedback to ensure review quality.
Q: How would you scale this system globally?
A: Deploy regional search servers, use geo-distributed databases, and implement data replication strategies.
Q: How do you handle business recommendation accuracy?
A: Use multiple recommendation algorithms, user feedback, and continuous learning to improve recommendation accuracy.
Key Takeaways
- Geospatial Search: Efficient spatial indexing and distance calculations are essential for location-based search
- Review Management: Review validation and rating calculation ensure accurate business information
- Recommendation System: Multiple recommendation methods provide better user experience
- Scalability: Horizontal scaling and geographic partitioning are crucial for handling large-scale business data
- Monitoring: Comprehensive monitoring ensures system reliability and performance