Taro Logo

Find Resultant Array After Removing Anagrams

Easy
Asked by:
Profile picture
Profile picture
43 views
Topics:
ArraysStrings

You are given a 0-indexed string array words, where words[i] consists of lowercase English letters.

In one operation, select any index i such that 0 < i < words.length and words[i - 1] and words[i] are anagrams, and delete words[i] from words. Keep performing this operation as long as you can select an index that satisfies the conditions.

Return words after performing all operations. It can be shown that selecting the indices for each operation in any arbitrary order will lead to the same result.

An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase using all the original letters exactly once. For example, "dacb" is an anagram of "abdc".

Example 1:

Input: words = ["abba","baba","bbaa","cd","cd"]
Output: ["abba","cd"]
Explanation:
One of the ways we can obtain the resultant array is by using the following operations:
- Since words[2] = "bbaa" and words[1] = "baba" are anagrams, we choose index 2 and delete words[2].
  Now words = ["abba","baba","cd","cd"].
- Since words[1] = "baba" and words[0] = "abba" are anagrams, we choose index 1 and delete words[1].
  Now words = ["abba","cd","cd"].
- Since words[2] = "cd" and words[1] = "cd" are anagrams, we choose index 2 and delete words[2].
  Now words = ["abba","cd"].
We can no longer perform any operations, so ["abba","cd"] is the final answer.

Example 2:

Input: words = ["a","b","c","d","e"]
Output: ["a","b","c","d","e"]
Explanation:
No two adjacent strings in words are anagrams of each other, so no operations are performed.

Constraints:

  • 1 <= words.length <= 100
  • 1 <= words[i].length <= 10
  • words[i] consists of lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of lengths for each string in the input array?
  2. Is the input array guaranteed to contain only lowercase English letters?
  3. If adjacent strings are anagrams, should I remove *both* of them, or just the second one in the sequence?
  4. What should be returned if the input array is null or empty?
  5. Is the comparison of anagrams case-sensitive?

Brute Force Solution

Approach

The brute force approach involves going through the list of words one by one. We compare each word to the word right after it to see if they are anagrams of each other. If they are, we remove the second word.

Here's how the algorithm would work step-by-step:

  1. Start with the first word in the list.
  2. Check if the current word and the next word are anagrams. Two words are anagrams if they contain the exact same letters, just in a different order.
  3. If the current word and the next word are anagrams, remove the next word from the list.
  4. If they are not anagrams, move on to the next word as the current word.
  5. Keep doing this until we reach the end of the list.
  6. The remaining words in the list will be the words after removing anagrams that were next to each other.

Code Implementation

def find_resultant_array_after_removing_anagrams(words):
    index = 0
    while index < len(words) - 1:
        # Compare current word to the next word.
        current_word = words[index]
        next_word = words[index + 1]

        # Check if the current word and the next word are anagrams.
        if sorted(current_word) == sorted(next_word):
            # Remove the next word if it is an anagram.
            words.pop(index + 1)

        # Move to the next word if they are not anagrams.
        else:
            index += 1

    return words

Big(O) Analysis

Time Complexity
O(n*k)The outer loop iterates up to n times in the worst case, where n is the number of words. Inside the loop, the anagram check involves comparing the sorted versions of two adjacent words, each of length at most k, where k is the maximum length of a word. Sorting a word of length k takes O(k log k) time. Since we are comparing already existing words we are not re-sorting each time, but comparing the words by character which takes O(k) time. Therefore the complexity of the anagram check is O(k). Thus, the overall time complexity becomes O(n*k).
Space Complexity
O(1)The algorithm iterates through the input list and potentially modifies it in-place. The plain English explanation describes comparing adjacent words and removing the second one if they are anagrams. This does not inherently require any auxiliary data structures that scale with the input size N (where N is the number of words in the input list). While string comparison might use temporary variables, these are constant in size. Therefore, the auxiliary space complexity is constant.

Optimal Solution

Approach

The core idea is to go through the list of words one by one, checking if consecutive words are anagrams. If two adjacent words are anagrams, we remove the second word. This continues until no adjacent anagrams remain, resulting in the final list.

Here's how the algorithm would work step-by-step:

  1. Start by looking at the first word in the list.
  2. Compare this word with the next word to see if they are anagrams (meaning they contain the same letters, possibly in a different order).
  3. If the two words are anagrams, remove the second word from the list.
  4. If the two words are not anagrams, move to the next word in the list and repeat the anagram check.
  5. Keep doing this, comparing each word with the word that comes after it, until you reach the end of the list.
  6. After going through the entire list once, repeat the whole process from the beginning to ensure that all adjacent anagrams have been removed, as removing one pair might have created a new pair.
  7. Continue repeating this process until you can go through the entire list without removing any words, meaning there are no more adjacent anagrams.

Code Implementation

def find_resultant_array_after_removing_anagrams(words):
    while True:
        removed_anagram = False
        index = 0
        while index < len(words) - 1:
            # Anagram check ensures order doesn't matter.
            if sorted(words[index]) == sorted(words[index + 1]):

                words.pop(index + 1)
                removed_anagram = True

                # Reset index to recheck after removal.
                index = 0

            else:
                index += 1

        # If no anagrams were removed, the loop can terminate.
        if not removed_anagram:
            break

    return words

Big(O) Analysis

Time Complexity
O(n*k)The outer loop iterates until no adjacent anagrams are found. In the worst case, this could iterate n times. The inner loop iterates through the list of strings, which has a maximum length of n. Checking if two strings are anagrams has a complexity of O(k), where k is the maximum length of a string. Therefore, the overall time complexity is O(n*k), because in each outer loop we iterate a list of n strings and perform the anagram check (of length k). The number of steps scales linearly with the list and the cost of an anagram check.
Space Complexity
O(1)The algorithm iterates through the list of words and removes anagrams in-place. No auxiliary data structures, such as temporary lists or hash maps, are explicitly created to store information about the words during the comparison or removal process. The space used for temporary variables like loop counters or to facilitate the in-place removal of elements is constant regardless of the number of words in the input list. Therefore, the auxiliary space complexity is O(1).

Edge Cases

Empty input list
How to Handle:
Return an empty list immediately as there are no elements to process.
List with a single string
How to Handle:
Return the original list as a single string cannot form an anagram pair.
List with two anagrams
How to Handle:
The algorithm should correctly identify and remove one of the anagrams, resulting in a list with only one element.
List with consecutive anagrams (e.g., ['abc', 'cba', 'bac'])
How to Handle:
The solution should remove 'cba' and 'bac' leaving just 'abc'.
List with non-anagram strings
How to Handle:
The solution should retain all strings that are not anagrams of their adjacent element.
List with strings of varying lengths, including very long strings
How to Handle:
Ensure that the anagram comparison method (e.g., sorting) is efficient enough to handle long strings, and consider potential performance bottlenecks.
List with Unicode strings
How to Handle:
Ensure the anagram detection works correctly with Unicode characters, considering encoding and character representation.
List with strings containing spaces or special characters
How to Handle:
Consider whether spaces and special characters should be considered when determining if two strings are anagrams, and handle accordingly.