Taro Logo

Get Watched Videos by Your Friends

Medium
Guidewire logo
Guidewire
3 views
Topics:
ArraysGraphsBreadth-first Search

There are n people, each person has a unique id between 0 and n-1. Given the arrays watchedVideos and friends, where watchedVideos[i] and friends[i] contain the list of watched videos and the list of friends respectively for the person with id = i.

Level 1 of videos are all watched videos by your friends, level 2 of videos are all watched videos by the friends of your friends and so on. In general, the level k of videos are all watched videos by people with the shortest path exactly equal to k with you. Given your id and the level of videos, return the list of videos ordered by their frequencies (increasing). For videos with the same frequency order them alphabetically from least to greatest. 

Example 1:

Input: watchedVideos = [["A","B"],["C"],["B","C"],["D"]], friends = [[1,2],[0,3],[0,3],[1,2]], id = 0, level = 1
Output: ["B","C"] 
Explanation: 
You have id = 0 (green color in the figure) and your friends are (yellow color in the figure):
Person with id = 1 -> watchedVideos = ["C"] 
Person with id = 2 -> watchedVideos = ["B","C"] 
The frequencies of watchedVideos by your friends are: 
B -> 1 
C -> 2

Example 2:

Input: watchedVideos = [["A","B"],["C"],["B","C"],["D"]], friends = [[1,2],[0,3],[0,3],[1,2]], id = 0, level = 2
Output: ["D"]
Explanation: 
You have id = 0 (green color in the figure) and the only friend of your friends is the person with id = 3 (yellow color in the figure).

Constraints:

  • n == watchedVideos.length == friends.length
  • 2 <= n <= 100
  • 1 <= watchedVideos[i].length <= 100
  • 1 <= watchedVideos[i][j].length <= 8
  • 0 <= friends[i].length < n
  • 0 <= friends[i][j] < n
  • 0 <= id < n
  • 1 <= level < n
  • if friends[i] contains j, then friends[j] contains i

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can you please clarify the data types and range of values for the user IDs and video IDs?
  2. What should I return if a user has no friends, or if none of their friends have watched any videos?
  3. If multiple videos have the same view count among friends, how should I order them in the output?
  4. Is the 'level' parameter guaranteed to be non-negative? What happens if the 'level' exceeds the depth of the friendship network?
  5. Are there any constraints on the size of the user and video lists, or the depth of the friendship network? This might impact memory usage.

Brute Force Solution

Approach

The brute force approach involves exploring all possible friends within the specified levels. For each friend found, collect all the videos they watched and then count how many times each video appears.

Here's how the algorithm would work step-by-step:

  1. Start with the given person and their list of friends.
  2. Go through the first level of friends to see who they are.
  3. Then go through the friends of those friends to find the second level of friends, and so on, up to the specified level.
  4. As you find each friend, note the videos they have watched.
  5. Once you have found all friends within the level, create a big list of all the videos watched by those friends.
  6. Count how many times each video appears in the big list.
  7. Finally, present the videos sorted by how often they were watched.

Code Implementation

def watched_videos_by_friends_brute_force(watched_videos, friend_connections, person_id, level):

    friends_at_level = get_friends_at_level(friend_connections, person_id, level)

    watched_videos_by_friends = []

    # Iterate through each friend at the specified level.
    for friend in friends_at_level:
        watched_videos_by_friends.extend(watched_videos[friend])

    video_counts = {}
    for video in watched_videos_by_friends:
        if video in video_counts:
            video_counts[video] += 1
        else:
            video_counts[video] = 1

    # Sort videos by frequency and then alphabetically.
    sorted_videos = sorted(video_counts.items(), key=lambda item: (item[1], item[0]))

    result = [video for video, count in sorted_videos]
    return result

def get_friends_at_level(friend_connections, person_id, level):
    queue = [(person_id, 0)]
    visited = {person_id}
    friends_at_target_level = []

    while queue:
        current_person, current_level = queue.pop(0)

        if current_level == level:
            friends_at_target_level.append(current_person)
            continue

        # Traverse the graph breadth first
        for friend in friend_connections[current_person]:

            if friend not in visited:
                visited.add(friend)
                queue.append((friend, current_level + 1))

    return friends_at_target_level

Big(O) Analysis

Time Complexity
O(N + F + V log V)Let N be the number of people in the network, F be the total number of friendships explored within the given level, and V be the total number of videos watched by the friends within the level. The breadth-first search to find friends up to the specified level takes O(N + F) time, where O(N) accounts for processing each person and O(F) accounts for exploring the friendship edges. Counting the video occurrences takes O(V) time. Sorting the videos based on frequency takes O(V log V) time. Thus, the overall time complexity is O(N + F + V log V).
Space Complexity
O(N)The algorithm maintains a queue to track friends to explore at each level. In the worst-case scenario, this queue could contain all the friends of the starting person up to the specified level. Let N be the total number of individuals connected to the starting person within the given levels, including friends at each level. Additionally, a hash map is used to count the frequency of each video watched by these friends, which could potentially store a count for each unique video, meaning at most N videos. Therefore, the auxiliary space used is proportional to N, resulting in a space complexity of O(N).

Optimal Solution

Approach

The goal is to find videos watched by friends within a certain degree of separation. We'll use a method similar to how information spreads in social networks, focusing on expanding our search layer by layer and keeping track of who watched what.

Here's how the algorithm would work step-by-step:

  1. Start with the initial person and consider them level zero.
  2. Find all the people who are directly friends with the initial person (level one friends).
  3. Repeat this process to find friends of friends (level two friends), then friends of friends of friends (level three friends), and so on, up to the specified degree.
  4. As you discover each friend, keep track of the videos they have watched.
  5. Once you've explored all friends within the specified degree of separation, count how many times each video was watched across all those friends.
  6. Finally, sort the videos by the number of times they were watched, starting with the most popular, and return this list.

Code Implementation

def get_watched_videos_by_friends(watched_videos, friends, person_id, level):    video_counts = {}
    visited = {person_id}
    queue = [(person_id, 0)]

    while queue:
        current_person, current_level = queue.pop(0)

        if current_level <= level:
            # Process videos watched by friends within the level
            for video in watched_videos[current_person]:
                video_counts[video] = video_counts.get(video, 0) + 1

        if current_level < level:
            # Explore friends of the current person
            for friend in friends[current_person]:
                if friend not in visited:
                    visited.add(friend)
                    queue.append((friend, current_level + 1))

    # Avoid returning videos watched by the target
    for video in watched_videos[person_id]:
        if video in video_counts:
            del video_counts[video]

    # Sort videos by frequency, then alphabetically
    sorted_videos = sorted(video_counts.items(), key=lambda item: (item[1], item[0]))

    result = [video for video, count in sorted_videos]
    return result

Big(O) Analysis

Time Complexity
O(V + E + VlogV)The algorithm involves traversing the graph of friends up to the specified degree. V represents the number of people we visit (vertices) and E represents the total number of friendships (edges) we traverse to reach those people; thus the breadth-first search has a time complexity of O(V+E). Additionally, counting video views takes O(V) time because each friend's videos need to be processed. Finally, sorting the videos by frequency takes O(VlogV) time in the worst case, where V represents the number of unique videos watched by the friends. Therefore, the dominant operations are graph traversal and sorting, resulting in an overall time complexity of O(V + E + VlogV).
Space Complexity
O(N)The space complexity is primarily determined by the auxiliary space used to store the friends to explore at each degree of separation and the videos watched by those friends. In the worst-case scenario, we might need to store all individuals in the social network in a queue-like structure for exploration, where N represents the number of individuals. Additionally, we need to store the videos watched by these individuals, which could also potentially scale linearly with the number of individuals if each watches a unique set of videos. Therefore, the auxiliary space complexity is O(N).

Edge Cases

CaseHow to Handle
Empty watchedVideos or friends listsReturn an empty list if either the watchedVideos or friends lists are empty, as there are no videos to watch or friends to consider.
No common friends between user and other usersIf no one is a friend of the target user, the algorithm should still return an empty list of videos.
Cycle in the friend networkUse a visited set to prevent infinite loops and ensure each friend is only processed once.
Very large friend network (scalability)The solution should use a breadth-first search (BFS) approach, which is suitable for exploring large networks, and minimize memory usage with efficient data structures.
watchedVideos list containing null or empty stringsFilter out null or empty strings from the watchedVideos lists to avoid errors during processing or counting.
Large number of videos watched by individual friendsUse a hash map to efficiently count the frequency of watched videos, even with a large number of videos per friend.
Input level exceeding the number of friendsReturn an empty list if the input level is greater than or equal to the number of friends, as there won't be any videos at that level.
Integer overflow in frequency countsUse a data type with sufficient capacity (e.g., long in Java) for the video frequency counts to prevent integer overflows when dealing with a very large number of friends or videos.