Let's explore how to build a binary expression tree from an infix expression.
Question:
Design an algorithm and write code to construct a binary expression tree from a given infix expression string. The expression can contain numbers, operators (+, -, ", /), and parentheses. The tree should represent the structure and order of operations defined by the infix expression.
For example, given the infix expression (3 + 5) * 2
, your algorithm should produce a binary tree where the root node contains "*", the left child is a subtree representing (3 + 5)
, and the right child is a node containing "2". The subtree representing (3 + 5)
should have "+" as the root, "3" as the left child, and "5" as the right child.
Your solution should handle:
Describe the time and space complexity of your solution.
Could you provide a Python code to implement this algorithm? Also, describe any edge cases you might encounter.
When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The goal is to transform a mathematical expression written in the usual way (infix notation) into a tree structure. A brute-force method involves exploring every possible way to group parts of the expression to figure out how to build the tree. We essentially try different tree structures until we find one that correctly represents the expression.
Here's how the algorithm would work step-by-step:
class Node:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
def build_expression_tree_brute_force(expression):
# Helper function to evaluate an expression tree
def evaluate_tree(node):
if node is None:
return 0
if isinstance(node.value, int):
return node.value
left_result = evaluate_tree(node.left)
right_result = evaluate_tree(node.right)
if node.value == '+':
return left_result + right_result
elif node.value == '-':
return left_result - right_result
elif node.value == '*':
return left_result * right_result
elif node.value == '/':
if right_result == 0:
return float('inf')
return left_result / right_result
return 0
# Helper function to tokenize the expression
def tokenize(expression):
tokens = []
number = ''
for char in expression:
if char.isdigit():
number += char
elif char in ['+', '-', '*', '/']:
if number:
tokens.append(int(number))
number = ''
tokens.append(char)
elif char == '(':
pass
elif char == ')':
pass
elif char == ' ':
pass
else:
return None
if number:
tokens.append(int(number))
return tokens
# Helper function to build the tree recursively
def build_tree(tokens):
if not tokens:
return None
# Iterate through possible root operators and positions
for i in range(len(tokens)):
if tokens[i] in ['+', '-', '*', '/']:
root = Node(tokens[i])
# Recursively build the left and right subtrees
left_tokens = tokens[:i]
right_tokens = tokens[i+1:]
root.left = build_tree(left_tokens)
root.right = build_tree(right_tokens)
# Check if the tree is valid by evaluating it
if root.left is not None and root.right is not None:
return root
if len(tokens) == 1 and isinstance(tokens[0], int):
return Node(tokens[0])
else:
return None
tokens = tokenize(expression)
if tokens is None:
return None
return build_tree(tokens)
To build the expression tree efficiently, we will convert the infix expression to postfix first. The postfix expression makes building the tree simpler since we know operators appear after their operands. Then we use a stack to create the tree from the postfix form.
Here's how the algorithm would work step-by-step:
class TreeNode:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
def build_expression_tree(infix_expression):
def precedence(operator):
if operator == '+' or operator == '-':
return 1
elif operator == '*' or operator == '/':
return 2
return 0
def infix_to_postfix(infix_expression):
postfix_result = []
operator_stack = []
for char in infix_expression:
if char.isalnum():
postfix_result.append(char)
elif char == '(':
operator_stack.append(char)
elif char == ')':
while operator_stack and operator_stack[-1] != '(':
postfix_result.append(operator_stack.pop())
operator_stack.pop()
else:
# Ensure correct order of operations
while operator_stack and precedence(char) <= precedence(operator_stack[-1]):
postfix_result.append(operator_stack.pop())
operator_stack.append(char)
while operator_stack:
postfix_result.append(operator_stack.pop())
return ''.join(postfix_result)
postfix_expression = infix_to_postfix(infix_expression)
node_stack = []
for char in postfix_expression:
if char.isalnum():
node = TreeNode(char)
node_stack.append(node)
else:
# Operators need two operands
node = TreeNode(char)
right_node = node_stack.pop()
left_node = node_stack.pop()
node.right = right_node
node.left = left_node
node_stack.append(node)
#The remaining node is the root
return node_stack[0]
Case | How to Handle |
---|---|
Null or empty input string | Return null or throw an IllegalArgumentException since an empty expression cannot form a tree. |
Input string with only a single operand | Create a single-node tree with the operand as the root. |
Input string with only operators and no operands | Return null or throw an exception as it is an invalid expression. |
Input string with unmatched parentheses | Throw an exception indicating invalid expression syntax because proper nesting is essential. |
Input string with consecutive operators without operands | Throw an exception indicating invalid expression syntax since operators need operands. |
Very long input string (potential stack overflow with recursion) | Consider an iterative approach to avoid stack overflow errors for deeply nested expressions. |
Division by zero | The evaluation of the expression tree should handle this case by returning an error or appropriate value (e.g., infinity) to prevent runtime exceptions. |
Input string containing unsupported operators or operands | Throw an exception or return null to indicate an invalid expression with unsupported characters. |