Taro Logo

Minimum Number of Steps to Make Two Strings Anagram

Medium
Google logo
Google
3 views
Topics:
Strings

You are given two strings of the same length s and t. In one step you can choose any character of t and replace it with another character. Return the minimum number of steps to make t an anagram of s.

An Anagram of a string is a string that contains the same characters with a different (or the same) ordering.

For example:

  1. If s = "bab", and t = "aba", the output is 1. You can replace the first 'a' in t with b, t = "bba" which is an anagram of s.
  2. If s = "leetcode", and t = "practice", the output is 5. You must replace 'p', 'r', 'a', 'i' and 'c' from t with proper characters to make t an anagram of s.
  3. If s = "anagram", and t = "mangaar", the output is 0. Because "anagram" and "mangaar" are anagrams.

Write a function to efficiently compute the minimum number of steps to make t an anagram of s. What is the time and space complexity of your approach?

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Are the input strings case-sensitive, or should I treat uppercase and lowercase letters as the same?
  2. Can the input strings be empty or null?
  3. Are the input strings guaranteed to contain only ASCII characters, or should I consider Unicode?
  4. If the strings are already anagrams of each other, should I return 0?
  5. Is there a maximum length for the input strings?

Brute Force Solution

Approach

The brute force method to find the minimum steps to make two strings anagrams involves checking every possible manipulation of the first string. We generate all possible variations of the first string and compare them to the second string, counting the necessary steps for each.

Here's how the algorithm would work step-by-step:

  1. Think of each character in the first string as something you can potentially change.
  2. Start by considering every possible way to change just one character in the first string to match the second string.
  3. Next, consider all the ways you could change two characters, then three, and so on.
  4. For each of these possibilities, compare the altered first string to the second string.
  5. If the altered string matches the second string (meaning they are anagrams), record the number of changes it took to get there.
  6. Keep doing this, increasing the number of changes you are trying each time, until you find a combination that makes the strings anagrams.
  7. Of all the possible combinations you have checked, pick the one with the smallest number of changes. That's your answer: the minimum number of steps.

Code Implementation

def min_steps_to_anagram_brute_force(first_string, second_string):
    string_length = len(first_string)
    minimum_changes = float('inf')

    # Iterate through all possible number of changes
    for number_of_changes in range(string_length + 1):
        possible_combinations = generate_combinations(string_length, number_of_changes)

        for combination in possible_combinations:
            modified_string = list(first_string)

            # Iterate through the indices to be changed
            for index in combination:
                # Try all possible characters for the replacement
                for character_code in range(ord('a'), ord('z') + 1):
                    replacement_character = chr(character_code)
                    modified_string[index] = replacement_character
                    modified_string_joined = "".join(modified_string)

                    # Check if the modified string is an anagram
                    if is_anagram(modified_string_joined, second_string):

                        # Update the minimum changes if needed
                        minimum_changes = min(minimum_changes, number_of_changes)

    if minimum_changes == float('inf'):
        return -1
    else:
        return minimum_changes

def generate_combinations(string_length, number_of_changes):
    if number_of_changes == 0:
        return [[]]

    if number_of_changes > string_length:
        return []

    combinations = []
    if number_of_changes == 1:
        for i in range(string_length):
            combinations.append([i])
        return combinations

    for i in range(string_length):
        remaining_length = string_length - (i + 1)
        remaining_changes = number_of_changes - 1
        sub_combinations = generate_combinations(remaining_length + i, remaining_changes)

        for sub_combination in sub_combinations:
            if len(sub_combination) > 0:
                valid = True
                for index in sub_combination:
                    if index < i:
                        valid = False
                        break
                if valid:
                    new_combination = [i]
                    for index in sub_combination:
                        new_combination.append(index)
                    combinations.append(new_combination)
            else:
                combinations.append([i])

    return combinations

def is_anagram(first_string, second_string):
    if len(first_string) != len(second_string):
        return False

    first_string_counts = {}
    for char in first_string:
        first_string_counts[char] = first_string_counts.get(char, 0) + 1

    second_string_counts = {}
    for char in second_string:
        second_string_counts[char] = second_string_counts.get(char, 0) + 1

    return first_string_counts == second_string_counts

Big(O) Analysis

Time Complexity
O(26^n * n!)The brute force approach involves generating all possible variations of the first string by changing characters. For a string of length n, with 26 possible characters for each position, the number of possible altered strings we need to consider increases dramatically. For each of the n positions, we may try substituting any of the 26 characters. In the worst case, we have to try every possible combination of character substitutions. Consider generating all permutations of the first string. There are n! permutations. And, in each case, each character can be swapped with the other 26 possible characters, giving us 26^n permutations. Comparing each altered string to the second string takes O(n) time. This results in an incredibly high time complexity.
Space Complexity
O(N!)The brute force approach, as described, involves generating all possible variations of the first string to compare against the second. In the worst case, we're exploring all permutations of the input string. The number of possible permutations for a string of length N is N! Therefore, potentially storing all these variations either implicitly (via recursion) or explicitly requires auxiliary space proportional to the number of permutations. This results in a space complexity of O(N!).

Optimal Solution

Approach

To find the fewest changes to make two words anagrams, we focus on what's different between them. We count the letters in each word and then see which letters need to be added or removed to make them match.

Here's how the algorithm would work step-by-step:

  1. Count how many times each letter appears in the first word.
  2. Count how many times each letter appears in the second word.
  3. Compare the counts for each letter. Find the letters where the counts are different.
  4. For each letter where the first word has more, figure out how many extra of that letter there are.
  5. Add up all the extra counts from the previous step. This total is the minimum number of steps needed to make the words anagrams.

Code Implementation

def min_steps_to_anagram(first_string, second_string):
    first_string_letter_counts = {}
    second_string_letter_counts = {}

    for letter in first_string:
        first_string_letter_counts[letter] = first_string_letter_counts.get(letter, 0) + 1
    
    for letter in second_string:
        second_string_letter_counts[letter] = second_string_letter_counts.get(letter, 0) + 1
    
    steps_needed = 0

    #We iterate through the first string's character counts
    for letter, count in first_string_letter_counts.items():
        if letter in second_string_letter_counts:
            if count > second_string_letter_counts[letter]:

                #Calculate how many extra letters are present
                steps_needed += count - second_string_letter_counts[letter]
        else:
            #When the letter isn't in the second string, add the count
            steps_needed += count

    #We must account for letters ONLY in the second string 
    for letter, count in second_string_letter_counts.items():
        if letter not in first_string_letter_counts:
            steps_needed += count

    return steps_needed

Big(O) Analysis

Time Complexity
O(n)The solution iterates through the first string of length n to count character frequencies, taking O(n) time. It then iterates through the second string, also of length n, to count character frequencies, again taking O(n) time. Finally, it iterates through the character counts (at most 26, which is constant) to find the differences and sum the extra counts. Because the loop iterating character counts is constant, it has no bearing on the Big O. Therefore, the dominant operation is O(n) + O(n), which simplifies to O(n).
Space Complexity
O(1)The algorithm uses frequency counts for each letter in both strings. Since the number of possible characters is fixed (e.g., 26 for lowercase English letters), the space required to store these counts remains constant regardless of the input string length. Specifically, it stores two frequency arrays, each of a fixed size. Therefore, the auxiliary space complexity is O(1).

Edge Cases

CaseHow to Handle
Both strings are emptyReturn 0, as empty strings are anagrams of each other with zero steps.
One string is empty, the other is notReturn the length of the non-empty string, as all characters need to be changed.
Strings have different lengthsReturn the difference in lengths, as the extra characters in the longer string must be changed to match the shorter string's character frequencies after necessary changes have been made to the shorter string.
Strings are identical (already anagrams)Return 0, as no steps are required.
Strings contain only one distinct character, with vastly different countsThe frequency counts in the character count difference will handle this correctly.
Strings contain unicode charactersThe character count difference must support unicode characters or extended ASCII.
Strings are very long (approaching maximum string length)Ensure the character count uses a data structure capable of handling large character frequencies without memory issues.
Strings contain case-sensitive characters (e.g., 'a' vs. 'A')Consider converting both strings to lowercase or uppercase to handle case-insensitive anagrams or consider them different characters during counting.