Taro Logo

Minimum Operations to Make Character Frequencies Equal

Hard
Asked by:
Profile picture
17 views
Topics:
Strings

You are given a string s.

A string t is called good if all characters of t occur the same number of times.

You can perform the following operations any number of times:

  • Delete a character from s.
  • Insert a character in s.
  • Change a character in s to its next letter in the alphabet.

Note that you cannot change 'z' to 'a' using the third operation.

Return the minimum number of operations required to make s good.

Example 1:

Input: s = "acab"

Output: 1

Explanation:

We can make s good by deleting one occurrence of character 'a'.

Example 2:

Input: s = "wddw"

Output: 0

Explanation:

We do not need to perform any operations since s is initially good.

Example 3:

Input: s = "aaabc"

Output: 2

Explanation:

We can make s good by applying these operations:

  • Change one occurrence of 'a' to 'b'
  • Insert one occurrence of 'c' into s

Constraints:

  • 3 <= s.length <= 2 * 104
  • s contains only lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of the input string, and what characters can it contain (e.g., only lowercase English letters)?
  2. If it's impossible to make the character frequencies equal, what should I return?
  3. Do I need to handle an empty input string, and if so, what should the return value be?
  4. Is the goal to minimize the number of operations *across all characters*, or for each character to have the most frequent count?
  5. By 'operation', do you mean decrementing the count of a character by one? Can I only decrement, or can I also increment?

Brute Force Solution

Approach

The brute force approach to this problem involves exploring every conceivable way to modify the frequencies of characters in a string. We'll try making all frequencies the same by repeatedly reducing some frequencies, checking if the resulting string meets the requirement of having all frequencies equal.

Here's how the algorithm would work step-by-step:

  1. First, count how many times each letter appears in the word.
  2. Then, imagine a target frequency. Start with the lowest possible target frequency (which is 1).
  3. Now, for each target frequency, go through each letter's count.
  4. If a letter's count is bigger than the target frequency, figure out how many operations it would take to reduce it to the target frequency.
  5. Add up all the operations needed for each letter for this specific target frequency.
  6. If a letter's count is smaller than the target frequency, we don't do anything because we are only allowed to reduce counts, not increase them.
  7. Keep track of the number of operations required for each target frequency you test.
  8. Repeat this whole process with a slightly higher target frequency, like 2, then 3, and so on, up to the highest frequency that exists in the original word.
  9. After trying all target frequencies, find the smallest number of operations you recorded. That's the answer.

Code Implementation

def minimum_operations_to_equalize_frequencies_brute_force(word):
    character_counts = {}
    for character in word:
        character_counts[character] = character_counts.get(character, 0) + 1

    frequencies = list(character_counts.values())
    minimum_operations = float('inf')

    # Iterate through possible target frequencies.
    for target_frequency in range(1, max(frequencies) + 1):
        operations_for_target = 0
        # Calculate operations needed for this target.
        for frequency in frequencies:
            # We only reduce frequencies, never increase them.
            if frequency > target_frequency:

                operations_for_target += frequency - target_frequency

        minimum_operations = min(minimum_operations, operations_for_target)

    # Handle edge case for empty string
    if minimum_operations == float('inf'):
        return 0
    return minimum_operations

Big(O) Analysis

Time Complexity
O(n^2)The algorithm first counts character frequencies, which takes O(n) time, where n is the length of the input string. Then, it iterates through possible target frequencies, up to the maximum frequency which is at most n. For each target frequency, it iterates through all distinct character counts (at most 26, a constant). In each of these inner iterations, it calculates the operations needed to reduce frequencies, which is a constant-time operation. Thus, the total number of operations is bounded by n * 26, which simplifies to O(n^2) since we have n target frequencies and for each we iterate through the frequencies which are at most n. Ignoring the constants it leads to approximately n * n operations, which simplifies to O(n^2).
Space Complexity
O(1)The algorithm's space complexity is determined by the character frequency counts and a variable to store the minimum operations. The frequency counts are stored for each unique character, which is at most 26 (lowercase English alphabet). The minimum operations variable requires constant space. Thus, the space used remains constant regardless of the input string's length, making the space complexity O(1).

Optimal Solution

Approach

The goal is to find the smallest number of changes needed to make all character frequencies the same. The optimal strategy involves counting the frequency of each character, then systematically trying each possible frequency value and calculating the cost to achieve that frequency.

Here's how the algorithm would work step-by-step:

  1. First, count how many times each character appears in the string.
  2. Next, go through each possible frequency value, from 1 up to the maximum frequency any character has.
  3. For each of these possible frequency values, figure out how many changes you would need to make to get every character to have that frequency.
  4. To do this, for each character, if its frequency is higher than the target frequency, subtract the target frequency from the characters frequency. This is the number of characters we'd need to delete.
  5. If a character's frequency is lower than the target frequency, subtract the character's frequency from the target frequency. This is the number of characters we'd need to add. (Since adding characters is not possible we do not need to consider adding.)
  6. Sum up the number of deletions that are needed to achieve the frequency. This becomes the current total operations to set frequency equal to a given value.
  7. Remember the smallest number of changes you find across all the possible frequencies.
  8. Return the smallest number of changes you found.

Code Implementation

def min_operations_to_equal_frequency(input_string):
    frequency_map = {}
    for char in input_string:
        frequency_map[char] = frequency_map.get(char, 0) + 1

    frequencies = list(frequency_map.values())
    max_frequency = max(frequencies)
    min_operations = len(input_string)

    for target_frequency in range(1, max_frequency + 1):
        current_operations = 0

        # Calculate operations needed for the target frequency.
        for frequency in frequencies:
            if frequency > target_frequency:

                current_operations += frequency - target_frequency

        # Update minimum operations if current is smaller.
        min_operations = min(min_operations, current_operations)

    return min_operations

Big(O) Analysis

Time Complexity
O(n)The algorithm first counts character frequencies, which takes O(n) time where n is the length of the string. Then, it iterates through potential frequency values from 1 to the maximum frequency, which is at most n. For each frequency, it iterates through the character counts (at most 26, which is constant) to compute the number of operations, taking O(1) time. Since we iterate through a maximum of n frequencies and do constant work inside, the overall time complexity is O(n) + O(n)*O(1) = O(n). Thus, the dominant factor is the initial frequency counting and the iteration through all possible frequencies.
Space Complexity
O(1)The algorithm uses a frequency count, which in the worst case, stores the frequencies of all unique characters in the input string. However, the number of unique characters is limited by the character set (e.g., 26 for lowercase English letters). It also uses a few integer variables to track the target frequency and the minimum operations. Therefore, the space used is constant and independent of the input string's length, N.

Edge Cases

Empty string input
How to Handle:
Return 0 since there are no characters to modify.
String with a single character
How to Handle:
Return 0 as all character frequencies are already equal (frequency of 1).
String with all identical characters
How to Handle:
Return 0, as frequencies are already equal.
String with maximum allowed length (consider memory)
How to Handle:
Ensure the frequency counting method scales efficiently (e.g., using a fixed-size array for character frequencies if the character set is limited).
String with a highly skewed character distribution (one character dominates)
How to Handle:
The core logic should still function correctly, accurately calculating necessary operations to balance frequencies.
Input string with all lowercase letters only.
How to Handle:
The frequency counting can be done with a simple array index from 'a'.
Integer overflow when calculating the number of operations (large frequencies)
How to Handle:
Use a data type with a larger range (e.g., long) to store the frequencies and intermediate calculations to prevent overflow.
When the same number of operations yields multiple character frequency solutions
How to Handle:
The problem seeks a minimum number of operations and does not ask for which characters or frequencies must be changed, hence any of the equally optimal solutions will satisfy the requirment.