How columnar storage speeds up analytical workloads and improves compression
Apache Spark, Flink batch, and modern dataflow architectures
Sort-merge, hash, and broadcast joins in distributed systems