Taro Logo

Group Anagrams

Medium
5 views
2 months ago

Given an array of strings strs, group the anagrams together. You can return the answer in any order.

Example 1:

Input: strs = ["eat","tea","tan","ate","nat","bat"] Output: [["bat"],["nat","tan"],["ate","eat","tea"]]

Explanation:

There is no string in strs that can be rearranged to form "bat". The strings "nat" and "tan" are anagrams as they can be rearranged to form each other. The strings "ate", "eat", and "tea" are anagrams as they can be rearranged to form each other.

Example 2:

Input: strs = [""] Output: [[""]]

Example 3:

Input: strs = ["a"] Output: [["a"]]

Constraints:

1 <= strs.length <= 10^4 0 <= strs[i].length <= 100 strs[i] consists of lowercase English letters.

Sample Answer
from collections import defaultdict


def groupAnagrams(strs):
    """Groups anagrams together in a list of strings.

    Args:
        strs (list[str]): A list of strings.

    Returns:
        list[list[str]]: A list of lists, where each inner list contains anagrams.
    """
    anagram_groups = defaultdict(list)
    for s in strs:
        # Use a sorted string as the key for anagrams
        key = "".join(sorted(s))
        anagram_groups[key].append(s)
    return list(anagram_groups.values())

# Example usage
strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
result = groupAnagrams(strs)
print(result)  # Output: [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]

Explanation:

  1. Initialization: A defaultdict is used to store anagrams. The keys are sorted strings, and the values are lists of strings that are anagrams of each other.
  2. Iteration: Iterate through each string in the input list.
  3. Key Generation: For each string, generate a key by sorting the characters in the string. This ensures that anagrams will have the same key.
  4. Grouping: Append the original string to the list associated with its key in the anagram_groups dictionary.
  5. Result: Convert the values of the anagram_groups dictionary to a list and return it.

Brute Force Solution

A brute force solution would involve comparing each string with every other string to check if they are anagrams. This is inefficient.

def groupAnagrams_brute_force(strs):
    if not strs:
        return []

    groups = []
    used = [False] * len(strs)

    for i in range(len(strs)):
        if used[i]:
            continue

        group = [strs[i]]
        used[i] = True

        for j in range(i + 1, len(strs)):
            if not used[j] and is_anagram(strs[i], strs[j]):
                group.append(strs[j])
                used[j] = True

        groups.append(group)

    return groups


def is_anagram(s1, s2):
    if len(s1) != len(s2):
        return False
    return sorted(s1) == sorted(s2)

# Example usage:
strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
print(groupAnagrams_brute_force(strs))

Big(O) Run-time Analysis

Optimal Solution:

  • Time Complexity: O(N * KlogK), where N is the number of strings in the input list, and K is the maximum length of a string. For each string, we sort it, which takes KlogK time.

Brute Force Solution:

  • Time Complexity: O(N^2 * KlogK) where N is the number of strings, and K is the average length of the string. We have to compare each string to every other string which is O(N^2), and then doing a KlogK comparison of strings.

Big(O) Space Usage Analysis

Optimal Solution:

  • Space Complexity: O(N * K), where N is the number of strings, and K is the maximum length of a string. This is because, in the worst case, each string is unique, and we store each string in the anagram_groups dictionary.

Brute Force Solution:

  • Space Complexity: O(N * K). In the worst case, we may have to store all N strings of average length K into our output array.

Edge Cases

  1. Empty Input List: If the input list is empty, the function should return an empty list.
  2. Empty Strings: If the input list contains empty strings, they should be grouped together.
  3. Single String: If the input list contains only one string, the function should return a list containing a list with that single string.
  4. Large Input: The solution should handle a large number of strings efficiently using the hashmap approach to avoid performance issues.
  5. Non-Anagrams: If there are strings that are not anagrams of any other string, they should be in their own group.