Vertical Order Traversal of a Binary Tree

Hard
Google
Topics:
Trees

Given the root of a binary tree, calculate the vertical order traversal of the binary tree.

For each node at position (row, col), its left and right children will be at positions (row + 1, col - 1) and (row + 1, col + 1) respectively. The root of the tree is at (0, 0).

The vertical order traversal of a binary tree is a list of top-to-bottom orderings for each column index starting from the leftmost column and ending on the rightmost column. There may be multiple nodes in the same row and same column. In such a case, sort these nodes by their values.
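
One way to read this ordering rule is as a sort on (col, row, val) triples followed by a grouping on col. The snippet below is a minimal sketch of that reading; the `triples` data and the `group_by_column` helper are made up for illustration and are not part of the problem statement.

```python
from itertools import groupby

# Hypothetical (col, row, val) triples; two nodes share row 2, column 0.
triples = [(0, 0, 1), (-1, 1, 2), (1, 1, 3), (0, 2, 6), (0, 2, 5)]

def group_by_column(triples):
    # Tuples compare element by element: column first, then row, then value.
    ordered = sorted(triples)
    return [
        [val for _, _, val in group]
        for _, group in groupby(ordered, key=lambda triple: triple[0])
    ]

print(group_by_column(triples))  # [[2], [1, 5, 6], [3]]
```

Because the value is the last element of each triple, it only decides the order when both column and row are equal, which is exactly the tie-breaking case.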

Return the vertical order traversal of the binary tree.

Example 1:

Consider the following binary tree:

    3
   / \
  9  20
    /  \
   15   7

The vertical order traversal would be: [[9], [3, 15], [20], [7]]

  • Column -1: Only node 9 is in this column.
  • Column 0: Nodes 3 and 15 are in this column in that order from top to bottom.
  • Column 1: Only node 20 is in this column.
  • Column 2: Only node 7 is in this column.

Example 2:

Consider the following binary tree:

      1
     / \
    2   3
   / \ / \
  4   5 6  7

The vertical order traversal would be: [[4], [2], [1, 5, 6], [3], [7]]

  • Column -2: Only node 4 is in this column.
  • Column -1: Only node 2 is in this column.
  • Column 0: Nodes 1, 5, and 6 are in this column. 1 is at the top, so it comes first. 5 and 6 are at the same position (2, 0), so we order them by their value, 5 before 6.
  • Column 1: Only node 3 is in this column.
  • Column 2: Only node 7 is in this column.
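
The coordinates in this example can be checked with a short script. This is only a sketch: the `positions` helper is a hypothetical name, and it assumes node values are unique so they can serve as dictionary keys.

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def positions(root):
    """Map each node value to its (row, col); assumes unique values."""
    found = {}
    stack = [(root, 0, 0)]
    while stack:
        node, row, col = stack.pop()
        if node is None:
            continue
        found[node.val] = (row, col)
        # Children sit one row down, one column to the left or right.
        stack.append((node.left, row + 1, col - 1))
        stack.append((node.right, row + 1, col + 1))
    return found

# The tree from Example 2.
root = TreeNode(1,
                TreeNode(2, TreeNode(4), TreeNode(5)),
                TreeNode(3, TreeNode(6), TreeNode(7)))
pos = positions(root)
print(pos[5], pos[6])  # (2, 0) (2, 0)
```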

Clarifications:

  • How should the output be formatted?
  • What should be returned if the root is null?
  • What are the constraints for node values? (e.g., non-negative)
  • How should nodes at the same row and column be handled?

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of values for the node values in the binary tree? Are negative values possible?
  2. What should I return if the root is null or the tree is empty?
  3. In the case of multiple nodes at the same row and column, what is the tie-breaking rule for ordering them?
  4. Are there any memory constraints I should be aware of, given the potential size of the tree?
  5. How should the output be formatted exactly? Specifically, should it be a list of lists, and how should the inner lists be structured?

Brute Force Solution

Approach

The brute force method for vertical order traversal of a binary tree essentially tries to capture every single node and its horizontal position. It then organizes them based on these horizontal positions. It is an exhaustive approach that doesn't take any shortcuts.

Here's how the algorithm would work step-by-step:

  1. First, imagine each node in the tree has a 'horizontal level' number. The root starts at level 0.
  2. Traverse the tree and, for each node, calculate its horizontal level and its row (depth). If you go left, subtract one from the level. If you go right, add one to the level. Either way, the row increases by one.
  3. Store each node's value along with its calculated horizontal level and row.
  4. Once you've visited every node, find the smallest and largest horizontal levels that exist in your stored data.
  5. Now, go through each horizontal level, starting from the smallest and ending at the largest.
  6. For each horizontal level, collect all the node values that have that particular level, ordered from the top row to the bottom row. If several nodes share a row, order them by value.
  7. Finally, present the node values for each horizontal level in order from smallest to largest level. This is the vertical order traversal.

Code Implementation

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def vertical_order_traversal_brute_force(root):
    # Each entry is (level, row, value): horizontal level, depth, node value.
    node_levels = []

    def calculate_node_levels(node, level, row):
        if not node:
            return

        node_levels.append((level, row, node.val))
        calculate_node_levels(node.left, level - 1, row + 1)
        calculate_node_levels(node.right, level + 1, row + 1)

    calculate_node_levels(root, 0, 0)

    # An empty tree has no horizontal levels to iterate through.
    if not node_levels:
        return []

    min_level = min(level for level, _, _ in node_levels)
    max_level = max(level for level, _, _ in node_levels)

    vertical_order = []

    # Iterate through each horizontal level, left to right.
    for level in range(min_level, max_level + 1):
        current_level_nodes = []

        # Collect (row, value) pairs at the current horizontal level.
        for node_level, node_row, node_value in node_levels:
            if node_level == level:
                current_level_nodes.append((node_row, node_value))

        # Order top to bottom, breaking same-row ties by value.
        current_level_nodes.sort()
        vertical_order.append([value for _, value in current_level_nodes])

    return vertical_order

Big(O) Analysis

Time Complexity
O(n^2) in the worst case. The traversal visits each of the n nodes once and records its horizontal level in constant time, so building the list takes O(n). Finding the smallest and largest levels is another O(n) pass. The grouping loop then rescans the entire list once for every horizontal level between the smallest and the largest, which costs O(n * w) for w levels. A fully skewed tree has w = n, giving O(n^2). Ordering the nodes inside each level by row and value adds at most O(n log n) in total, which does not change the bound.
Space Complexity
O(n). The brute force approach stores one entry per node in a list, and the output holds every node value once more. The recursion stack adds space proportional to the height of the tree, which is at most n for a skewed tree. The dominant space usage is therefore proportional to the number of nodes, giving O(n).

Optimal Solution

Approach

The key to solving this problem efficiently is to think about the tree's structure in terms of columns. We want to group nodes by their horizontal distance from the root (their column) and then order the nodes within each group from top to bottom.

Here's how the algorithm would work step-by-step:

  1. Imagine each node in the tree is located on a horizontal line relative to the root. The root is at position zero.
  2. As we explore the tree, keep track of the horizontal position of each node. Left children are one position less than their parent, and right children are one position more.
  3. Use a way to remember which nodes belong to which horizontal position (or column).
  4. Do a traversal of the tree (like breadth-first search) where you visit each node. As you visit a node, store its value, together with its row, into the correct column based on its horizontal position.
  5. Once you've visited every node, you'll have all the nodes grouped by their column.
  6. Finally, go through your columns from left to right, and within each column, list the node values from top to bottom, ordering nodes that share a row by value. This gives you the vertical order.

Code Implementation

from collections import deque

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def vertical_order(root):
    if not root:
        return []

    column_to_nodes = {}
    queue = deque([(root, 0, 0)])

    while queue:
        node, row_index, column_index = queue.popleft()

        # Store (row, value) for the current column index.
        if column_index not in column_to_nodes:
            column_to_nodes[column_index] = []
        column_to_nodes[column_index].append((row_index, node.val))

        if node.left:
            queue.append((node.left, row_index + 1, column_index - 1))
        if node.right:
            queue.append((node.right, row_index + 1, column_index + 1))

    # Get column indexes in sorted order.
    sorted_columns = sorted(column_to_nodes.keys())

    result = []
    # Within each column, order by row and break same-row ties by value.
    for column_index in sorted_columns:
        ordered_nodes = sorted(column_to_nodes[column_index])
        result.append([value for _, value in ordered_nodes])

    return result
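
A possible variant, sketched here under the assumption that the tie-breaking rule from the problem statement applies: breadth-first search already reaches rows from top to bottom, so only nodes that share both a row and a column need sorting. Processing the queue one row at a time makes that explicit. The function name `vertical_traversal_by_level` is hypothetical.

```python
from collections import defaultdict, deque

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def vertical_traversal_by_level(root):
    if root is None:
        return []

    columns = defaultdict(list)
    queue = deque([(root, 0)])

    while queue:
        # Everything currently in the queue belongs to the same row.
        row_values = defaultdict(list)
        for _ in range(len(queue)):
            node, col = queue.popleft()
            row_values[col].append(node.val)
            if node.left:
                queue.append((node.left, col - 1))
            if node.right:
                queue.append((node.right, col + 1))

        # Sort only the values that share this row and column.
        for col, values in row_values.items():
            columns[col].extend(sorted(values))

    return [columns[col] for col in sorted(columns)]

# Example 2 with 5 and 6 swapped, so discovery order alone would give 6, 5.
root = TreeNode(1,
                TreeNode(2, TreeNode(4), TreeNode(6)),
                TreeNode(3, TreeNode(5), TreeNode(7)))
print(vertical_traversal_by_level(root))  # [[4], [2], [1, 5, 6], [3], [7]]
```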

Big(O) Analysis

Time Complexity
O(n log n). The breadth-first search (BFS) visits each of the n nodes once, and each insertion into the column dictionary takes O(1) on average, so the traversal itself is O(n). Sorting the distinct column indices costs O(c log c) for c columns, where c is at most n. To satisfy the ordering rule, the nodes in each column are ordered by row, with same-row ties broken by value; sorting columns whose sizes sum to n costs at most O(n log n) in total. The overall time complexity is therefore O(n log n) in the worst case.
Space Complexity
O(n). The BFS queue holds roughly one row of the tree at a time; in a complete binary tree the last row contains about n/2 nodes. The dictionary that groups nodes by column stores one entry per node, so it holds all n nodes. Therefore, the space complexity is O(n).

Edge Cases

Case: How to Handle

  • Null or empty tree: Return an empty list if the root is null, as there are no nodes to traverse.
  • Single-node tree: Return a list containing one list with the root node's value, representing the single column.
  • Tree with all nodes having the same value: Columns and rows depend only on position, so nodes with equal values are still placed in the correct vertical order.
  • Highly skewed tree (e.g., all nodes on the left): Every node lands in its own column. A recursive traversal can exceed the recursion limit on very deep trees, so prefer an iterative traversal such as BFS.
  • Large tree (potential memory issues): The output contains every node value, so O(n) memory is unavoidable. An iterative traversal avoids the extra cost of a deep call stack.
  • Negative node values: Values are only compared with each other, so negative values need no special handling.
  • Nodes with zero value: Zero is an ordinary value. Test for a missing node explicitly rather than relying on the truthiness of its value.
  • Integer overflow with column indices: A column index never exceeds the number of nodes in magnitude, so a standard integer type suffices for any tree that fits in memory. Python integers do not overflow.
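
The skewed-tree case can be exercised with a small script. This is a sketch rather than a reference implementation: `columns_iterative` is a hypothetical helper that uses an explicit stack instead of recursion, and the chain length of 5000 is an arbitrary choice meant to be deeper than a typical default recursion limit.

```python
from collections import defaultdict

class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def columns_iterative(root):
    """Vertical order using an explicit stack, so depth is not limited by recursion."""
    if root is None:
        return []

    entries = []  # (col, row, val) triples
    stack = [(root, 0, 0)]
    while stack:
        node, row, col = stack.pop()
        entries.append((col, row, node.val))
        if node.left is not None:
            stack.append((node.left, row + 1, col - 1))
        if node.right is not None:
            stack.append((node.right, row + 1, col + 1))

    grouped = defaultdict(list)
    for col, _, val in sorted(entries):  # column, then row, then value
        grouped[col].append(val)
    return [grouped[col] for col in sorted(grouped)]

# Build a left-skewed chain iteratively: 0 at the root, 4999 at the bottom.
depth = 5000
root = TreeNode(0)
current = root
for value in range(1, depth):
    current.left = TreeNode(value)
    current = current.left

result = columns_iterative(root)
print(len(result), result[0], result[-1])  # 5000 [4999] [0]
```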