Group Anagrams Interview Question for Google

from collections import defaultdict


def groupAnagrams(strs):
    """Groups anagrams together in a list of strings.

    Args:
        strs (list[str]): A list of strings.

    Returns:
        list[list[str]]: A list of lists, where each inner list contains anagrams.
    """
    anagram_groups = defaultdict(list)
    for s in strs:
        # Use a sorted string as the key for anagrams
        key = "".join(sorted(s))
        anagram_groups[key].append(s)
    return list(anagram_groups.values())

# Example usage
strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
result = groupAnagrams(strs)
print(result)  # Output: [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

Explanation:

Initialization: A defaultdict is used to store anagrams. The keys are sorted strings, and the values are lists of strings that are anagrams of each other.
Iteration: Iterate through each string in the input list.
Key Generation: For each string, generate a key by sorting the characters in the string. This ensures that anagrams will have the same key.
Grouping: Append the original string to the list associated with its key in the anagram_groups dictionary.
Result: Convert the values of the anagram_groups dictionary to a list and return it.

Brute Force Solution

A brute force solution would involve comparing each string with every other string to check if they are anagrams. This is inefficient.

def groupAnagrams_brute_force(strs):
    if not strs:
        return []

    groups = []
    used = [False] * len(strs)

    for i in range(len(strs)):
        if used[i]:
            continue

        group = [strs[i]]
        used[i] = True

        for j in range(i + 1, len(strs)):
            if not used[j] and is_anagram(strs[i], strs[j]):
                group.append(strs[j])
                used[j] = True

        groups.append(group)

    return groups


def is_anagram(s1, s2):
    if len(s1) != len(s2):
        return False
    return sorted(s1) == sorted(s2)

# Example usage:
strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(groupAnagrams_brute_force(strs))

Big(O) Run-time Analysis

Optimal Solution:

Time Complexity: O(N * KlogK), where N is the number of strings in the input list, and K is the maximum length of a string. For each string, we sort it, which takes KlogK time.

Brute Force Solution:

Time Complexity: O(N^2 * KlogK) where N is the number of strings, and K is the average length of the string. We have to compare each string to every other string which is O(N^2), and then doing a KlogK comparison of strings.

Big(O) Space Usage Analysis

Optimal Solution:

Space Complexity: O(N * K), where N is the number of strings, and K is the maximum length of a string. This is because, in the worst case, each string is unique, and we store each string in the anagram_groups dictionary.

Brute Force Solution:

Space Complexity: O(N * K). In the worst case, we may have to store all N strings of average length K into our output array.

Edge Cases

Empty Input List: If the input list is empty, the function should return an empty list.
Empty Strings: If the input list contains empty strings, they should be grouped together.
Single String: If the input list contains only one string, the function should return a list containing a list with that single string.
Large Input: The solution should handle a large number of strings efficiently using the hashmap approach to avoid performance issues.
Non-Anagrams: If there are strings that are not anagrams of any other string, they should be in their own group.