Serialize and Deserialize Binary Tree

Medium
7 days ago

Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection link to be reconstructed later in the same or another computer environment.

Design an algorithm to serialize and deserialize a binary tree. There is no restriction on how your serialization/deserialization algorithm should work. You just need to ensure that a binary tree can be serialized to a string and this string can be deserialized to the original tree structure.

Clarification: The input/output format is the same as how LeetCode serializes a binary tree. You do not necessarily need to follow this format, so please be creative and come up with different approaches yourself.

Example 1: Given the binary tree represented as the array [1,2,3,null,null,4,5], the serialization should result in a string that can be deserialized back to the same tree structure.

Example 2: For an empty tree represented as [], the serialization should handle it appropriately and the deserialization should return an empty tree.

Constraints:

  • The number of nodes in the tree is in the range [0, 10^4].
  • -1000 <= Node.val <= 1000
Sample Answer
## Serialization and Deserialization of a Binary Tree

This problem asks us to design an algorithm to serialize and deserialize a binary tree. The goal is to convert a tree into a string representation and then reconstruct the original tree from that string.

### Naive Approach (Level-Order Traversal with Null Markers)

A straightforward approach is to use level-order traversal. We can traverse the tree level by level and record the value of each node. We'll use a special marker (e.g., "null") to represent null nodes. This way, we preserve the structure of the tree.

#### Serialization

1.  Start with the root node.
2.  Use a queue for level-order traversal.
3.  For each node:
    *   Append its value to the string (or "null" if it's null).
    *   Enqueue its left and right children (even if they are null).
4.  Join the values with a delimiter (e.g., comma).

#### Deserialization

1.  Split the string by the delimiter.
2.  Create the root node from the first value.
3.  Use a queue to keep track of nodes to process.
4.  For each value in the split string:
    *   If it's not "null", create a node.
    *   Assign the node to the left or right child of the current node (from the queue).
    *   Enqueue the new node.

Here's a Python implementation:

```python
class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

class Codec:

    def serialize(self, root):
        if not root:
            return ""

        queue = [root]
        result = []

        while queue:
            node = queue.pop(0)
            if node:
                result.append(str(node.val))
                queue.append(node.left)
                queue.append(node.right)
            else:
                result.append("null")

        return ",".join(result)

    def deserialize(self, data):
        if not data:
            return None

        nodes = data.split(",")
        root = TreeNode(int(nodes[0]))
        queue = [root]
        i = 1

        while queue:
            node = queue.pop(0)

            if nodes[i] != "null":
                node.left = TreeNode(int(nodes[i]))
                queue.append(node.left)
            i += 1

            if nodes[i] != "null":
                node.right = TreeNode(int(nodes[i]))
                queue.append(node.right)
            i += 1

        return root

Example Usage:

# Create a sample tree
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
root.right.left = TreeNode(4)
root.right.right = TreeNode(5)

# Serialize the tree
codec = Codec()
serialized_tree = codec.serialize(root)
print("Serialized tree:", serialized_tree)

# Deserialize the tree
deserialized_tree = codec.deserialize(serialized_tree)

# You can add a helper function to print the tree to verify
# that the deserialized tree matches the original

Optimal Approach (Preorder Traversal)

Another approach is to use preorder traversal. This is a bit more concise and can be more efficient in terms of space.

Serialization

  1. Traverse the tree in preorder fashion.
  2. For each node:
    • Append its value to the string (or "null" if it's null).
  3. Join the values with a delimiter (e.g., comma).

Deserialization

  1. Split the string by the delimiter.
  2. Use a recursive helper function to reconstruct the tree.
  3. The helper function processes the values one by one.

Here's a Python implementation:

class TreeNode:
    def __init__(self, x):
        self.val = x
        self.left = None
        self.right = None

class Codec:

    def serialize(self, root):
        def preorder(node):
            if not node:
                return "null,"
            return str(node.val) + "," + preorder(node.left) + preorder(node.right)

        return preorder(root)

    def deserialize(self, data):
        def helper():
            val = next(nodes)
            if val == "null":
                return None
            node = TreeNode(int(val))
            node.left = helper()
            node.right = helper()
            return node

        nodes = iter(data.split(","))
        return helper()

Example Usage:

# Create a sample tree
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
root.right.left = TreeNode(4)
root.right.right = TreeNode(5)

# Serialize the tree
codec = Codec()
serialized_tree = codec.serialize(root)
print("Serialized tree:", serialized_tree)

# Deserialize the tree
deserialized_tree = codec.deserialize(serialized_tree)

# You can add a helper function to print the tree to verify
# that the deserialized tree matches the original

Big(O) Runtime Analysis

Level-Order Traversal

  • Serialization: O(N), where N is the number of nodes in the tree. We visit each node once.
  • Deserialization: O(N), where N is the number of nodes in the tree. We process each value in the string once.

Preorder Traversal

  • Serialization: O(N), where N is the number of nodes in the tree. We visit each node once.
  • Deserialization: O(N), where N is the number of nodes in the tree. We process each value in the string once.

Big(O) Space Usage Analysis

Level-Order Traversal

  • Serialization: O(N), where N is the number of nodes in the tree. In the worst case (a complete binary tree), the queue can hold N/2 nodes.
  • Deserialization: O(N), where N is the number of nodes in the tree. In the worst case (a complete binary tree), the queue can hold N/2 nodes.

Preorder Traversal

  • Serialization: O(N), where N is the number of nodes in the tree. This is due to the recursion stack.
  • Deserialization: O(N), where N is the number of nodes in the tree. This is due to the recursion stack.

Edge Cases

  1. Empty Tree:
    • Both serialization methods should handle an empty tree gracefully (return an empty string or a special marker).
  2. Tree with Only One Node:
    • The serialization should produce a string with a single value.
  3. Tree with Skewed Structure:
    • The algorithms should correctly handle left-skewed or right-skewed trees.
  4. Large Tree:
    • For very large trees, consider using a more compact serialization format to reduce the size of the string. Compression algorithms can also be applied to the serialized string.
  5. Values with Commas:
    • If the node values themselves contain commas, choose a different delimiter or escape the commas in the serialized string.

By handling these edge cases, we can ensure that the serialization and deserialization algorithms are robust and work correctly for any binary tree.