Taro Logo

The Earliest Moment When Everyone Become Friends

Medium
Google logo
Google
8 views
Topics:
ArraysGreedy Algorithms

You are given an array logs where each element logs[i] = [timestamp_i, x_i, y_i] indicates that at time timestamp_i, person x_i and person y_i become friends. You are also given an integer n representing the total number of people. Friendship is transitive. That is, if A is a friend of B, and B is a friend of C, then A is a friend of C.

Your task is to find the earliest time when all the n people are connected (i.e., every person is a friend of every other person). Return the earliest timestamp at which this happens. If there is no time at which all people are connected, return -1.

For example:

  1. logs = [[20190101,0,1],[20190104,3,4],[20190107,2,3],[20190211,1,5],[20190224,2,4],[20190301,0,3],[20190312,1,2],[20190322,4,5]], n = 6

    In this case, the people become connected at the following times:

    • At timestamp 20190101, persons 0 and 1 become friends.
    • At timestamp 20190104, persons 3 and 4 become friends.
    • At timestamp 20190107, persons 2 and 3 become friends.
    • At timestamp 20190211, persons 1 and 5 become friends.
    • At timestamp 20190224, persons 2 and 4 become friends.
    • At timestamp 20190301, persons 0 and 3 become friends. All persons 0, 1, 2, 3, 4 and 5 are not yet all connected.
    • At timestamp 20190312, persons 1 and 2 become friends. All persons 0, 1, 2, 3, 4 and 5 are not yet all connected.
    • At timestamp 20190322, persons 4 and 5 become friends. At this moment, all people become friends. The earliest time when everyone is connected is 20190322.
  2. logs = [[0,2,0],[1,0,1],[3,0,3],[4,1,2],[5,1,3]], n = 4

    The output is 5, because that is the first time that all nodes are connected.

Can you implement an algorithm to efficiently solve this problem?

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the possible ranges for the 'friendship' timestamps, and what data type are they represented as (e.g., integers, floats)?
  2. If not all individuals become friends, what value should I return?
  3. Is the input guaranteed to be sorted by timestamp, or do I need to sort it first?
  4. What are the valid ranges and data types for the individual IDs?
  5. How is the input structured? Is it an array of tuples (timestamp, user1, user2) or another format?

Brute Force Solution

Approach

The brute force approach means we will explore every possible friendship event order to see when everyone is connected. We'll go through each friendship event, one at a time, and check if, after adding that friendship, everyone is friends with each other.

Here's how the algorithm would work step-by-step:

  1. Start by considering each friendship event in the order they happened.
  2. After adding a new friendship event to our network, check if all people are now connected, either directly or indirectly, through their friendships.
  3. To check if everyone is connected, look at each person individually and try to find a path of friendships that links them to every other person.
  4. If even one person can't be linked to everyone else, then everyone isn't yet friends. Move on to consider the next friendship event.
  5. Keep repeating steps 2, 3, and 4. As soon as you find the exact moment a friendship event causes everyone to be connected, that's your answer!
  6. If you check every friendship event and never find a moment when everyone is connected, then it's impossible for everyone to become friends.

Code Implementation

def earliest_moment_everyone_friends(number_people, friendships):
    for current_time, person_one, person_two in friendships:
        # Add the current friendship to the network
        friend_network = build_friend_network(friendships, current_time)

        # Check if everyone is connected
        if all_connected(number_people, friend_network):
            return current_time

    return -1

def build_friend_network(friendships, current_time):
    friend_network = {}
    for time_stamp, person_one, person_two in friendships:
        if time_stamp <= current_time:
            if person_one not in friend_network:
                friend_network[person_one] = []
            if person_two not in friend_network:
                friend_network[person_two] = []
            friend_network[person_one].append(person_two)
            friend_network[person_two].append(person_one)
    return friend_network

def all_connected(number_people, friend_network):
    if not friend_network:
        return number_people == 1

    first_person = next(iter(friend_network))
    visited = {first_person}
    queue = [first_person]

    # Perform BFS to find all connected people
    while queue:
        person = queue.pop(0)
        if person in friend_network:
            for friend in friend_network[person]:
                if friend not in visited:
                    visited.add(friend)
                    queue.append(friend)

    # Verify that everyone is connected
    return len(visited) == number_people

# This is an example of usage for this function
# number_people = 4
# friendships = [[0, 1, 2], [1, 2, 3], [2, 1, 4]]
# result = earliest_moment_everyone_friends(number_people, friendships)
# print(result)

Big(O) Analysis

Time Complexity
O(n * m^2)We iterate through each of the 'n' friendship events. For each event, we check if everyone is connected. Checking if everyone is connected involves, for each of the 'm' people, finding a path to every other person. Finding a path for one person to every other person requires checking up to 'm' other people, potentially through multiple connections. Thus, the connectivity check for a single friendship event takes O(m^2) time. Therefore, the overall time complexity is O(n * m^2).
Space Complexity
O(N^2)The provided brute force approach requires a way to represent the friendship network to determine if everyone is connected. A common way to represent this is using an adjacency matrix of size N x N, where N is the number of people. This matrix uses N^2 space to store the direct or indirect connections between each pair of individuals. Thus, the auxiliary space used scales quadratically with the number of people, resulting in a space complexity of O(N^2).

Optimal Solution

Approach

The goal is to find the earliest time when everyone is connected. To do this efficiently, we'll use a method to track connections and efficiently merge groups of people until everyone belongs to the same group, stopping at the first moment this happens.

Here's how the algorithm would work step-by-step:

  1. Imagine each person starts in their own individual group.
  2. Process the connections one at a time, in the order they happened.
  3. For each connection, check if the two people involved are already in the same group. If they are, it means they are already indirectly connected, so skip this connection.
  4. If the two people are in different groups, merge those groups together. This represents these people becoming part of a larger connected community.
  5. Keep track of how many groups you have. Every time you merge two groups, the total number of groups goes down by one.
  6. As you merge groups, check to see if the number of groups has reached one. This means everyone is now connected.
  7. If everyone is connected, record the time of that last connection that made it happen. This is the earliest time everyone becomes friends.
  8. If you process all connections and everyone never ends up in the same group, then no such time exists.

Code Implementation

def find_earliest_moment(number_of_people, connections):

    parent = list(range(number_of_people))
    number_of_groups = number_of_people

    def find(person):
        if parent[person] != person:
            parent[person] = find(parent[person])
        return parent[person]

    def union(person1, person2):
        nonlocal number_of_groups
        root_person1 = find(person1)
        root_person2 = find(person2)

        if root_person1 != root_person2:
            parent[root_person1] = root_person2
            number_of_groups -= 1

    for timestamp, person1, person2 in connections:
        # Process connections in chronological order.

        union(person1 - 1, person2 - 1)

        # Check if all people are now in a single group.
        if number_of_groups == 1:

            return timestamp

    # If no such moment exists.
    return -1

Big(O) Analysis

Time Complexity
O(N log N)The primary cost driver is sorting the connections, which takes O(N log N) time where N is the number of connections. The Union-Find operations (find and union) inside the loop have a time complexity that is nearly constant with path compression and union by rank optimization, making them negligible compared to the sorting step. Therefore, the overall time complexity is dominated by the sorting, resulting in O(N log N).
Space Complexity
O(N)The algorithm needs to maintain a data structure to represent the groups each person belongs to. In the worst case, each person starts in their own group, requiring an auxiliary array or dictionary of size N to map each person to their group, where N is the number of people. When merging groups, this data structure is updated. The number of groups and time of connection variables are constant and do not affect the overall space complexity. Therefore, the dominant space usage is O(N).

Edge Cases

CaseHow to Handle
Empty log or no friendships establishedReturn -1 or appropriate error code if the log is empty, indicating no friendship ever occurred.
Only one person existsReturn 0 if only one person exists since everyone is already 'friends' from the start.
Log is not sorted by timestampSort the log entries by timestamp as a preprocessing step to ensure correct order of events.
Disjointed groups that never fully connectReturn -1 if the algorithm completes and there are still disjoint groups, indicating that everyone never becomes friends.
Maximum number of people is very large, impacting Union-Find performance.Use path compression and union by rank in the Union-Find data structure for optimal performance with a large number of people.
Integer overflow when calculating timestamp or representing person IDs.Use appropriate data types (e.g., long) to handle large timestamps and IDs to avoid integer overflow.
Invalid person IDs (e.g., negative or out of range).Validate person IDs at the beginning, raising an exception or returning an error if they are invalid.
Log contains redundant friendship connections between the same two people at different timestamps.The Union-Find data structure ensures that subsequent connections between already connected people are ignored, preventing incorrect earliest moment calculation.