Serialize and Deserialize Binary Tree

Serialize a binary tree to a string and deserialize it back to the original tree

Language Selection

Choose your preferred programming language

Showing: Python

Serialize and Deserialize Binary Tree

Problem Statement

Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection link to be reconstructed later in the same or another computer environment.

Design an algorithm to serialize and deserialize a binary tree. There is no restriction on how your serialization/deserialization algorithm should work. You just need to ensure that a binary tree can be serialized to a string and this string can be deserialized to the original tree structure.

Input/Output Specifications

  • Input: The root of a binary tree where 0 <= number of nodes <= 10^4 and -1000 <= Node.val <= 1000
  • Output: A string representation of the tree (serialization) and the reconstructed tree (deserialization)

Constraints

  • 0 <= number of nodes <= 10^4
  • -1000 <= Node.val <= 1000

Examples

Example 1:

Input: root = [1,2,3,null,null,4,5]
Output: [1,2,3,null,null,4,5]

Example 2:

Input: root = []
Output: []

Example 3:

Input: root = [1]
Output: [1]

Solution Approaches

Approach 1: Preorder Traversal with Null Markers - Optimal

Algorithm Explanation:

  1. Use preorder traversal to serialize the tree
  2. Use “null” or “#” to mark null nodes
  3. For deserialization, use the same preorder traversal to reconstruct the tree
  4. Use a queue or list to track the serialized values

Implementation:

Python:

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

class Codec:
    def serialize(self, root):
        """
        Encodes a tree to a single string using preorder traversal
        Time: O(n)
        Space: O(n)
        """
        def preorder(node):
            if not node:
                return "null"
            return str(node.val) + "," + preorder(node.left) + "," + preorder(node.right)
        
        return preorder(root)
    
    def deserialize(self, data):
        """
        Decodes your encoded data to tree using preorder traversal
        Time: O(n)
        Space: O(n)
        """
        def build_tree():
            if not self.values:
                return None
            
            val = self.values.popleft()
            if val == "null":
                return None
            
            node = TreeNode(int(val))
            node.left = build_tree()
            node.right = build_tree()
            return node
        
        from collections import deque
        self.values = deque(data.split(","))
        return build_tree()

Java:

import java.util.*;

class TreeNode {
    int val;
    TreeNode left;
    TreeNode right;
    TreeNode(int x) { val = x; }
}

class Codec {
    public String serialize(TreeNode root) {
        StringBuilder sb = new StringBuilder();
        preorder(root, sb);
        return sb.toString();
    }
    
    private void preorder(TreeNode node, StringBuilder sb) {
        if (node == null) {
            sb.append("null,");
            return;
        }
        
        sb.append(node.val).append(",");
        preorder(node.left, sb);
        preorder(node.right, sb);
    }
    
    public TreeNode deserialize(String data) {
        Queue<String> values = new LinkedList<>(Arrays.asList(data.split(",")));
        return buildTree(values);
    }
    
    private TreeNode buildTree(Queue<String> values) {
        if (values.isEmpty()) {
            return null;
        }
        
        String val = values.poll();
        if ("null".equals(val)) {
            return null;
        }
        
        TreeNode node = new TreeNode(Integer.parseInt(val));
        node.left = buildTree(values);
        node.right = buildTree(values);
        return node;
    }
}

Complexity Analysis:

  • Time Complexity: O(n) - Visit each node once during serialization and deserialization
  • Space Complexity: O(n) - Recursion stack and string storage

Key Insights

  1. Preorder Traversal: Natural choice for serialization as it preserves tree structure
  2. Null Markers: Essential for reconstructing the tree correctly
  3. String Format: Use comma-separated values for easy parsing
  4. Recursive Reconstruction: Use the same traversal order for deserialization

Edge Cases

  1. Empty Tree: Handle null root
  2. Single Node: Handle tree with only root
  3. Skewed Tree: Handle left-skewed or right-skewed trees
  4. Negative Values: Handle negative node values
  5. Large Trees: Handle trees with many nodes

Test Cases

def test_serialize_deserialize():
    # Test case 1
    root1 = TreeNode(1)
    root1.left = TreeNode(2)
    root1.right = TreeNode(3)
    root1.right.left = TreeNode(4)
    root1.right.right = TreeNode(5)
    
    codec = Codec()
    serialized = codec.serialize(root1)
    deserialized = codec.deserialize(serialized)
    
    # Verify the tree structure is preserved
    assert deserialized.val == 1
    assert deserialized.left.val == 2
    assert deserialized.right.val == 3
    assert deserialized.right.left.val == 4
    assert deserialized.right.right.val == 5
    
    # Test case 2: Empty tree
    serialized_empty = codec.serialize(None)
    deserialized_empty = codec.deserialize(serialized_empty)
    assert deserialized_empty is None
    
    print("All tests passed!")

test_serialize_deserialize()

Follow-up Questions

  1. Serialize BST: How would you serialize a Binary Search Tree?
  2. Compress Serialization: How would you compress the serialized string?
  3. Version Control: How would you handle versioning of serialized trees?
  4. Network Transfer: How would you handle large trees over network?
  5. Validation: How would you validate the serialized string?

Common Mistakes

  1. Missing Null Markers: Not including null nodes in serialization
  2. Wrong Traversal Order: Using different orders for serialization and deserialization
  3. String Parsing: Incorrect parsing of serialized string
  4. Index Management: Off-by-one errors in deserialization
  5. Edge Case Handling: Not handling empty tree or single node

Interview Tips

  1. Start with Preorder: Begin with preorder traversal approach
  2. Explain Null Markers: Describe why null markers are necessary
  3. Discuss Alternatives: Mention BFS approach as alternative
  4. Handle Edge Cases: Cover empty tree and special cases
  5. Code Carefully: Pay attention to string parsing and tree reconstruction

Concept Explanations

Serialization: The process of converting a data structure into a format that can be stored or transmitted and later reconstructed.

Deserialization: The process of reconstructing a data structure from its serialized form.

Preorder Traversal: A tree traversal method where we visit the root, then the left subtree, then the right subtree.

Null Markers: Special markers (like “null” or “#”) used to indicate the absence of a node in the serialized representation.

When to Use: Use this pattern when you need to store, transmit, or reconstruct tree structures, such as in file systems, databases, or network protocols.