Taro Logo

Serialize and Deserialize Binary Tree

Hard
Apple logo
Apple
Topics:
TreesRecursion

Problem Statement

You are tasked with designing an algorithm to serialize and deserialize a binary tree. The goal is to convert the tree into a string representation that can be stored or transmitted and then reconstructed back into the original tree structure.

Clarifications:

  • The serialization should capture the structure and data of the binary tree.
  • The deserialization should accurately reconstruct the original binary tree from the serialized string.
  • There are no restrictions on how serialization/deserialization should work, and there are multiple valid approaches.
  • You may use standard tree node definitions like:
    class TreeNode:
        def __init__(self, val=0, left=None, right=None):
            self.val = val
            self.left = left
            self.right = right
    

Examples:

  1. Example 1:

    • Input Tree: [1, 2, 3, null, null, 4, 5] (represented in level order)
    • Output String: (Your serialized string representation of the tree)
    • Deserialized Tree: [1, 2, 3, null, null, 4, 5] (must be structurally identical to the input)
  2. Example 2:

    • Input Tree: [] (empty tree)
    • Output String: (Your serialized string representation of the empty tree)
    • Deserialized Tree: [] (must be an empty tree)

Constraints:

  • The number of nodes in the tree can range from 0 to 104.
  • Node values will be integers between -1000 and 1000.

Your task is to implement two functions:

  • serialize(root): Takes the root of a binary tree as input and returns a string representation of the tree.
  • deserialize(data): Takes the serialized string as input and returns the root of the reconstructed binary tree.

Explain your approach, its time and space complexity, and handle edge cases such as empty trees. Consider alternative approaches and their trade-offs.

Solution


Serialize and Deserialize a Binary Tree

This problem requires us to design algorithms to serialize a binary tree into a string representation and then deserialize this string back into the original binary tree structure. Let's explore a couple of approaches.

1. Naive Approach: Level Order Traversal with Null Markers

A simple approach is to perform a level order (breadth-first) traversal of the tree. While traversing, we record the value of each node. If a node is null, we record a special marker (e.g., "null"). This creates a string representation that captures the tree's structure.

Serialization:

  1. Start with the root node.
  2. Use a queue for level order traversal.
  3. For each node, add its value to the string. If the node is null, add "null".
  4. Add the node's left and right children to the queue (even if they are null).
  5. Repeat until the queue is empty.

Deserialization:

  1. Split the string into an array of node values.
  2. Create the root node from the first value.
  3. Use a queue to reconstruct the tree level by level.
  4. For each node in the queue, create its left and right children from the array of values.
  5. Add the children to the queue.

Code (Python):

class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

class Codec:

    def serialize(self, root):
        if not root:
            return "[]"
        
        queue = [root]
        result = []
        
        while queue:
            node = queue.pop(0)
            if node:
                result.append(str(node.val))
                queue.append(node.left)
                queue.append(node.right)
            else:
                result.append("null")
        
        return '[' + ','.join(result) + ']'

    def deserialize(self, data):
        if data == "[]":
            return None
        
        vals = data[1:-1].split(',')
        root = TreeNode(int(vals[0]))
        queue = [root]
        i = 1
        
        while queue:
            node = queue.pop(0)
            
            if vals[i] != "null":
                node.left = TreeNode(int(vals[i]))
                queue.append(node.left)
            i += 1
            
            if vals[i] != "null":
                node.right = TreeNode(int(vals[i]))
                queue.append(node.right)
            i += 1
        
        return root

Time Complexity: O(N), where N is the number of nodes in the tree. We visit each node once during both serialization and deserialization.

Space Complexity: O(N) in the worst-case scenario (e.g., a complete binary tree), where we store all nodes in the queue during level order traversal.

2. Optimal Approach: Preorder Traversal

Another effective approach is to use a preorder traversal along with null markers. This method can be more space-efficient in some cases.

Serialization:

  1. Perform a preorder traversal of the tree.
  2. For each node, add its value to the string. If the node is null, add "null".

Deserialization:

  1. Split the string into an array of node values.
  2. Recursively reconstruct the tree using the preorder traversal sequence.

Code (Python):

class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

class Codec:

    def serialize(self, root):
        def preorder(node):
            if not node:
                result.append("null")
                return
            
            result.append(str(node.val))
            preorder(node.left)
            preorder(node.right)

        result = []
        preorder(root)
        return ','.join(result)

    def deserialize(self, data):
        def build_tree():
            val = next(vals_iter)
            if val == "null":
                return None
            
            node = TreeNode(int(val))
            node.left = build_tree()
            node.right = build_tree()
            return node

        vals = data.split(',')
        vals_iter = iter(vals)
        return build_tree()

Time Complexity: O(N), where N is the number of nodes in the tree. We visit each node once during both serialization and deserialization.

Space Complexity: O(N) in the worst-case scenario (e.g., a skewed tree), due to the recursive call stack during preorder traversal and deserialization.

Edge Cases:

  • Empty Tree: Both algorithms should handle an empty tree gracefully by returning an appropriate empty representation (e.g., "[]" or "null").
  • Tree with only one node: Both algorithms should be able to serialize and deserialize a tree containing only a root node.
  • Skewed Trees: Be aware of the space complexity implications for skewed trees, especially when using recursion.

Both approaches effectively solve the problem, but the choice between them might depend on specific requirements or constraints related to space efficiency or implementation complexity.