Taro Logo

Maximum Sum of Edge Values in a Graph

Hard
Asked by:
Profile picture
1 view
Topics:
Greedy AlgorithmsGraphsArrays

You are given an undirected connected graph of n nodes, numbered from 0 to n - 1. Each node is connected to at most 2 other nodes.

The graph consists of m edges, represented by a 2D array edges, where edges[i] = [ai, bi] indicates that there is an edge between nodes ai and bi.

You have to assign a unique value from 1 to n to each node. The value of an edge will be the product of the values assigned to the two nodes it connects.

Your score is the sum of the values of all edges in the graph.

Return the maximum score you can achieve.

Example 1:

Input: n = 4, edges = [[0,1],[1,2],[2,3]]

Output: 23

Explanation:

The diagram above illustrates an optimal assignment of values to nodes. The sum of the values of the edges is: (1 * 3) + (3 * 4) + (4 * 2) = 23.

Example 2:

Input: n = 6, edges = [[0,3],[4,5],[2,0],[1,3],[2,4],[1,5]]

Output: 82

Explanation:

The diagram above illustrates an optimal assignment of values to nodes. The sum of the values of the edges is: (1 * 2) + (2 * 4) + (4 * 6) + (6 * 5) + (5 * 3) + (3 * 1) = 82.

Constraints:

  • 1 <= n <= 5 * 104
  • m == edges.length
  • 1 <= m <= n
  • edges[i].length == 2
  • 0 <= ai, bi < n
  • ai != bi
  • There are no repeated edges.
  • The graph is connected.
  • Each node is connected to at most 2 other nodes.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Given that the graph is connected and every node has a degree of at most 2, can I assume the graph will always be either a simple path or a simple cycle?
  2. The constraints state `1 <= n` and `1 <= m`, but for `n=1`, it's impossible to form an edge between distinct nodes. Should I assume `n` will always be at least 2?
  3. The maximum possible score could be very large, potentially exceeding the limit of a 32-bit integer. Should I use a 64-bit integer type for the sum to prevent overflow?
  4. The problem mentions the graph contains `n` nodes, numbered 0 to `n-1`, and is connected. Does this guarantee that all nodes from 0 to `n-1` have a degree of at least 1 and are part of this single component?
  5. Is it correct to assume that the initial node labels `0` to `n-1` are arbitrary, and the core of the problem is about optimally assigning the values `1` to `n` to positions within the graph's structure (e.g., endpoints vs. internal nodes)?

Brute Force Solution

Approach

To find the best possible total value for all the connections in a network, we need to assign a value to each point. The brute force method simply tries out every single possible combination of value assignments for all the points in the entire network.

Here's how the algorithm would work step-by-step:

  1. First, consider that every point in the network can be assigned one of two possible values.
  2. Systematically list every single combination of assignments. For instance, start with assigning the first value to all points, then try changing one point's value, then another, until every conceivable combination has been generated.
  3. For each one of these complete combinations, we need to calculate its total score.
  4. To do this, look at every connection between two points in the network.
  5. The value of a connection depends on the values we assigned to the two points it links together. Calculate this value for every single connection.
  6. Add up the values from all the connections to get a grand total for that specific combination of assignments.
  7. Keep track of the largest grand total you've seen so far.
  8. After calculating the grand total for every single possible combination, the highest one you've recorded is the final answer.

Code Implementation

def max_sum_brute_force(nodes_count, edges_list):
    maximum_sum_found = 0

    def find_max_sum_recursively(edge_index, current_selected_edges):
        nonlocal maximum_sum_found
        
        if edge_index == len(edges_list):
            # This check is necessary to determine if the generated subset is 'legal' by the problem's rules.

            is_valid_subset = True
            nodes_in_subset = set()
            for edge in current_selected_edges:
                node_one, node_two, _ = edge
                if node_one in nodes_in_subset or node_two in nodes_in_subset:
                    is_valid_subset = False
                    break
                nodes_in_subset.add(node_one)
                nodes_in_subset.add(node_two)

            if is_valid_subset:
                current_sum = sum(edge[2] for edge in current_selected_edges)
                
                # We must update our maximum if this valid group's sum is greater than any found before.

                maximum_sum_found = max(maximum_sum_found, current_sum)
            return

        # To generate every possible subset, we must explore the case where the current edge is included.

        find_max_sum_recursively(edge_index + 1, current_selected_edges + [edges_list[edge_index]])

        # And we must also explore the case where it is not included to cover all combinations.

        find_max_sum_recursively(edge_index + 1, current_selected_edges)

    find_max_sum_recursively(0, [])

    return maximum_sum_found

Big(O) Analysis

Time Complexity
O(m * 2^n)The primary cost driver is the need to check every possible combination of value assignments for the 'n' points in the network. Since each point can have one of two values, there are 2^n distinct combinations to generate and evaluate. For each of these combinations, the algorithm must calculate the total score by iterating through all 'm' connections, which takes a time proportional to m. The total number of operations is therefore the product of the number of combinations and the work required for each one. This results in an overall time complexity that approximates m * 2^n, which simplifies to O(m * 2^n).
Space Complexity
O(N)The primary space requirement comes from storing one complete combination of value assignments for all the points. With N being the number of points in the network, representing a single combination requires an auxiliary data structure, such as an array, of size N. If a recursive method is used to systematically generate these combinations, the recursion call stack would also grow to a maximum depth of N. Since the algorithm evaluates one full combination at a time, the auxiliary space is determined by the storage for a single combination, resulting in a complexity of O(N).

Optimal Solution

Approach

Instead of checking every possible combination of connections, the ideal strategy solves the problem from the bottom up. By first calculating the best results for smaller sections of the graph, we can efficiently build up to the best solution for the entire structure.

Here's how the algorithm would work step-by-step:

  1. Start from the 'leaves' of the graph—the nodes at the very edge—and work your way inward towards a central 'root' node.
  2. For every node, we will calculate two separate best-case scores for the portion of the graph branching out from it.
  3. The first score is the maximum sum possible if the node IS connected to its parent. The second score is the maximum sum if it IS NOT connected to its parent.
  4. If a node IS connected to its parent, it's busy and cannot also connect to any of its children. Its score is therefore the sum of the best possible scores from each child's individual branch.
  5. If a node IS NOT connected to its parent, it is free. We must figure out its best move, which is either connecting to one of its children or connecting to none of them.
  6. We calculate the potential total value for each option—pairing with child A, pairing with child B, and so on—and compare them all against the value of pairing with no children.
  7. The 'IS NOT connected' score is simply the highest value found among all of these choices. After repeating this for every node up to the root, the final answer is the root's 'IS NOT connected' score.

Code Implementation

def maximum_sum_of_edge_values(number_of_nodes, edges, values):
    adjacency_list = [[] for _ in range(number_of_nodes)]
    for first_node, second_node in edges:
        adjacency_list[first_node].append(second_node)
        adjacency_list[second_node].append(first_node)

    def calculate_max_scores_for_subtree(current_node, parent_node):
        sum_of_children_free_scores = 0
        potential_gains_from_connection = []

        for neighbor_node in adjacency_list[current_node]:
            if neighbor_node == parent_node:
                continue
            
            child_score_if_taken, child_score_if_free = calculate_max_scores_for_subtree(neighbor_node, current_node)
            
            sum_of_children_free_scores += child_score_if_free
            
            current_edge_value = values[current_node] + values[neighbor_node]
            # This calculates the net score change from connecting to a child versus leaving it free.
            gain_from_this_connection = current_edge_value + child_score_if_taken - child_score_if_free
            potential_gains_from_connection.append(gain_from_this_connection)

        # If this node is 'taken' by its parent, it cannot form an edge with any of its own children.
        max_score_if_taken_by_parent = sum_of_children_free_scores

        max_gain_from_connecting_down = 0
        if potential_gains_from_connection:
            max_gain_from_connecting_down = max(potential_gains_from_connection)
        
        # If 'free', this node can either not connect down, or form one edge with the most profitable child.
        max_score_if_free_from_parent = sum_of_children_free_scores + max(0, max_gain_from_connecting_down)

        return (max_score_if_taken_by_parent, max_score_if_free_from_parent)

    # The root node has no parent, so its final score is its best possible 'free' state score.
    _, final_total_score = calculate_max_scores_for_subtree(0, -1)
    return final_total_score

Big(O) Analysis

Time Complexity
O(n)The algorithm's cost is driven by a single, bottom-up traversal of the graph structure, visiting each of the n nodes exactly once. At each node, the calculations for both the 'connected' and 'not connected' scores require iterating through its direct children. This means the work done at a given node is proportional to its number of children, or its degree. The total operations across the entire traversal is the sum of the degrees of all nodes, which for a connected graph of n nodes is 2*(n-1), simplifying to a linear time complexity of O(n).
Space Complexity
O(N)The "bottom-up" strategy described implies a recursive traversal, like a Depth-First Search (DFS), to process nodes from the leaves to the root. This recursive approach utilizes the system's call stack to manage the function calls for each node. In the worst-case scenario, such as a graph that resembles a long chain, the recursion depth can be as large as the total number of nodes, N. Consequently, the memory used by the call stack is the dominant factor, resulting in an auxiliary space complexity proportional to N.

Edge Cases

Large value of n causing the total score to overflow standard 32-bit integers.
How to Handle:
The final sum must be stored in a 64-bit integer type (e.g., long long in C++, long in Java) to prevent overflow.
The graph is the smallest possible path, with n=2 nodes and one edge.
How to Handle:
The path-finding logic correctly identifies the two endpoints and assigns the only possible values, 1 and 2.
The graph is the smallest possible cycle, a triangle with n=3 nodes.
How to Handle:
The cycle-finding logic correctly traverses the loop and applies the optimal value assignment for a 3-node cycle.
The graph is a path with an odd number of nodes.
How to Handle:
The value assignment strategy correctly handles the single central node in the path.
The graph is a cycle with an odd number of nodes.
How to Handle:
The assignment logic correctly handles the unequal number of odd and even values that need to be arranged around the cycle.
Correctly identifying the graph structure as either a path or a cycle.
How to Handle:
The solution must first analyze node degrees; a path will have two degree-1 nodes, while a cycle will have none.
Choosing the starting node for traversal in a path graph.
How to Handle:
The traversal must start at one of the two unique degree-1 nodes to correctly sequence the nodes for value assignment.
Maximum-sized inputs requiring an efficient solution.
How to Handle:
The solution must have a time complexity of O(n) or O(n+m) to pass within the time limit for n up to 5 * 10^4.