Given an array of strings strs
, group the anagrams together. You can return the answer in any order.
Example 1:
Input: strs = ["eat","tea","tan","ate","nat","bat"]
Output: [["bat"],["nat","tan"],["ate","eat","tea"]]
Explanation:
"bat"
."nat"
and "tan"
are anagrams as they can be rearranged to form each other."ate"
, "eat"
, and "tea"
are anagrams as they can be rearranged to form each other.Example 2:
Input: strs = [""]
Output: [[""]]
Example 3:
Input: strs = ["a"]
Output: [["a"]]
Constraints:
1 <= strs.length <= 104
0 <= strs[i].length <= 100
strs[i]
consists of lowercase English letters.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The brute force strategy involves comparing every single word with every other word in the list to see if they are anagrams of each other. We will go through the list one by one, and for each word, we'll check all the remaining words to find its anagram partners.
Here's how the algorithm would work step-by-step:
def are_anagrams(first_word, second_word):
if len(first_word) != len(second_word):
return False
return sorted(first_word) == sorted(second_word)
def group_anagrams_brute_force(list_of_words):
grouped_anagrams = []
words_already_grouped = [False] * len(list_of_words)
# Iterate through each word to either form a new group or join an existing one.
for current_index, current_word in enumerate(list_of_words):
if words_already_grouped[current_index]:
continue
# Create a new group for the current word as it hasn't been grouped yet.
new_anagram_group = [current_word]
words_already_grouped[current_index] = True
# Compare the current word against all subsequent words to find its anagrams.
for next_index in range(current_index + 1, len(list_of_words)):
if not words_already_grouped[next_index]:
next_word = list_of_words[next_index]
if are_anagrams(current_word, next_word):
new_anagram_group.append(next_word)
# Mark this subsequent word as grouped so we don't process it again.
words_already_grouped[next_index] = True
grouped_anagrams.append(new_anagram_group)
return grouped_anagrams
The key insight is that words which are anagrams of each other will look identical if we rearrange their letters into a standard order. We can use this sorted version of each word as a unique signature to group all the original words that share it.
Here's how the algorithm would work step-by-step:
from collections import defaultdict
def group_anagrams(words):
# This dictionary will serve as our collection of labeled boxes.
# Using defaultdict simplifies adding the first word for a new anagram group.
anagram_groups = defaultdict(list)
for original_word in words:
# Sorting the letters of a word creates a unique signature for all its anagrams.
# This signature acts as the label for our boxes.
sorted_word_tuple = tuple(sorted(original_word))
# Place the original word into the box corresponding to its sorted letter signature.
anagram_groups[sorted_word_tuple].append(original_word)
return list(anagram_groups.values())
Case | How to Handle |
---|---|
Input array is empty | The solution should return an empty list as there are no strings to group. |
Input array contains a single string | The solution should return a list containing a single list with just that string. |
Input array contains empty strings | All empty strings are anagrams of each other and should be grouped together. |
All strings in the array are anagrams of each other | The solution should return a single list containing all the input strings. |
No strings in the array are anagrams of each other | The solution should return a list of lists, where each inner list contains only one string. |
Strings contain non-alphabetic characters like numbers or symbols | The sorting or character counting key generation handles these characters correctly, grouping strings with the same set of characters. |
Input contains strings with different cases (e.g., 'Act' and 'cat') | These are treated as distinct strings because of case sensitivity, so they will not be grouped together unless normalized first. |
Very large input array or very long strings | An efficient O(N*K) solution using a character count map as the key is preferred over an O(N*K*logK) sorting-based solution to avoid performance issues. |