Concepts
Essential computer science and distributed systems concepts, organized to follow the structure of Designing Data-Intensive Applications
Foundations of Data Systems
Reliability, scalability, maintainability, data models, storage, and encoding
Distributed Data
Replication, partitioning, transactions, and distributed system problems
Derived Data
Batch processing, stream processing, and the future of data systems
Resilience & Fault Tolerance
Cross-cutting patterns for building resilient distributed systems
Storage & Retrieval
Database internals, indexing strategies, and storage engines
Data Encoding & Evolution
Serialization formats, schema evolution, and compatibility
Replication Strategies
Data replication patterns and conflict resolution
Partitioning & Sharding
Data distribution strategies across multiple nodes
Transactions & Consistency
ACID properties, isolation levels, and consistency models
Batch Processing
Large-scale data processing with MapReduce and beyond
Stream Processing
Real-time data processing and event streaming
Security & Privacy
Encryption, authentication, and data privacy in distributed systems