Taro Logo

Reconstruct Original Digits from English

Medium
Asked by:
Profile picture
6 views
Topics:
Strings

Given a string s containing an out-of-order English representation of digits 0-9, return the digits in ascending order.

Example 1:

Input: s = "owoztneoer"
Output: "012"

Example 2:

Input: s = "fviefuro"
Output: "45"

Constraints:

  • 1 <= s.length <= 105
  • s[i] is one of the characters ["e","g","f","i","h","o","n","s","r","u","t","w","v","x","z"].
  • s is guaranteed to be valid.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the expected range for the length of the input string? Should I anticipate very large inputs?
  2. Will the input string only contain lowercase English letters?
  3. Is the input guaranteed to be a jumbled concatenation of valid English digit words (zero, one, two, ..., nine), or could it contain other characters or invalid words?
  4. If the input string does not represent a valid combination of digit words, what should be the return value? Should I return an empty string, null, or throw an exception?
  5. Are the digit words guaranteed to be in English, or could they be in another language?

Brute Force Solution

Approach

The brute force way to solve this puzzle is to try every possible combination of digits, represented by their English spellings, to see if they match the input string. This means trying to construct the original digits, one by one, and checking if we can form the input.

Here's how the algorithm would work step-by-step:

  1. Consider all possible counts of the digit zero.
  2. For each count of zero, subtract the letters used to spell out that many zeros from the input string.
  3. If after removing letters, you have invalid letters (letters with negative counts), reject this path.
  4. Repeat the process above for digit one. Remember to check if the path is invalid before proceeding with any further steps.
  5. Continue this process for all digits from zero to nine.
  6. If, at the end, all letters are used from the input and there are no invalid paths, you have found a solution. This means that the counts of all the digits represent the original number sequence.
  7. If there are multiple solutions, determine which solution is the correct one by forming numbers with discovered digits and comparing them with original input.

Code Implementation

def reconstruct_original_digits_brute_force(input_string):
    all_digits = ['zero', 'one', 'two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine']
    digit_counts = [0] * 10
    best_solution = None

    def backtrack(remaining_string, current_digit_counts, digit_index):
        nonlocal best_solution

        #If all digits have been considered
        if digit_index == 10:
            if not remaining_string:
                current_solution = "".join([str(i) * current_digit_counts[i] for i in range(10)])
                if best_solution is None or current_solution < best_solution:
                    best_solution = current_solution
                return
            else:
                return

        # Try different counts for the current digit
        for count in range(11):
            new_remaining_string = remaining_string
            valid_path = True

            #Subtract the letters used to spell out that many zeros from the input string
            for _ in range(count):
                for char in all_digits[digit_index]:
                    if char in new_remaining_string:
                        new_remaining_string = new_remaining_string.replace(char, '', 1)
                    else:
                        valid_path = False
                        break
                if not valid_path:
                    break

            if valid_path:
                current_digit_counts[digit_index] = count

                # Recursive call to the next digit with the updated string
                backtrack(new_remaining_string, current_digit_counts[:], digit_index + 1)

    backtrack(input_string, digit_counts, 0)

    return best_solution if best_solution else ''

Big(O) Analysis

Time Complexity
O(10!)The algorithm iterates through all possible counts for each digit from zero to nine. In the worst case, where the input string contains many letters allowing various digit counts, the number of iterations will be proportional to the number of ways to arrange digits from 0 to 9 which is factorial of the number of digits. Since there are 10 digits, the time complexity is O(10!). Since factorials grow faster than exponential it can also be seen as greater than O(2^n), however given the prompt only covers digits 0 through 9, the factorial representation is the most accurate. Since the maximum possible length of the string is constrained, this algorithm effectively has a constant time complexity, dominated by the possible permutations of digit counts.
Space Complexity
O(1)The provided solution explores possible digit counts, subtracting letters from the input string as it goes. While the algorithm conceptually modifies the input string, no persistent auxiliary data structures that scale with the input size are described. The described operations involve counting and subtracting characters which could be done in place or with a constant amount of extra space, such as storing counts for each digit or letter. Therefore, the space complexity remains constant, regardless of the input string's length N.

Optimal Solution

Approach

The problem involves figuring out which digits are hidden within a jumbled word made from spelling out those digits. The efficient approach is to count the letters in the jumbled word and then deduce the digits based on unique letter patterns. We strategically identify digits that have unique letter occurrences to count them directly.

Here's how the algorithm would work step-by-step:

  1. Count how many times each letter appears in the input word.
  2. Identify digits based on unique letters. For example, the letter 'z' only appears in 'zero', so count how many 'z's there are to find the number of zeros.
  3. Similarly, the letter 'w' is unique to 'two', 'u' is unique to 'four', 'x' is unique to 'six', and 'g' is unique to 'eight'. Use these to determine counts of two, four, six, and eight.
  4. After finding these, some letters are no longer unique. For instance, the letter 'h' appears in 'three' and 'eight'. Since we know how many 'eights' there are, we can subtract that count from the number of 'h's to find the number of 'threes'.
  5. Continue this process: 'o' is in 'zero', 'two', 'four'. After finding counts for zero, two and four, subtract those counts from the total 'o' count to determine the number of 'ones'.
  6. Follow a similar deduction process to find counts for 'five', 'seven', and 'nine' by eliminating overlapping letters from previously determined digits. This involves checking letters like 'f', 'v', 's', 'i', and 'n'.
  7. Finally, create a string of digits by repeating each digit according to its calculated count, sorted in ascending order.

Code Implementation

def reconstruct_original_digits(input_string):
    letter_counts = {}
    for char in input_string:
        letter_counts[char] = letter_counts.get(char, 0) + 1

    digit_counts = [0] * 10

    # Count digits based on unique letters.
    digit_counts[0] = letter_counts.get('z', 0)

    digit_counts[2] = letter_counts.get('w', 0)

    digit_counts[4] = letter_counts.get('u', 0)

    digit_counts[6] = letter_counts.get('x', 0)

    digit_counts[8] = letter_counts.get('g', 0)

    # Deduce remaining digits after handling unique letters.
    digit_counts[3] = letter_counts.get('h', 0) - digit_counts[8]

    digit_counts[1] = letter_counts.get('o', 0) - digit_counts[0] - digit_counts[2] - digit_counts[4]

    digit_counts[5] = letter_counts.get('f', 0) - digit_counts[4]

    digit_counts[7] = letter_counts.get('v', 0) - digit_counts[5]

    # 'i' appears in five, six, eight, nine. Determine nine last
    digit_counts[9] = letter_counts.get('i', 0) - digit_counts[5] - digit_counts[6] - digit_counts[8]

    result = ''
    for digit in range(10):
        result += str(digit) * digit_counts[digit]

    return result

Big(O) Analysis

Time Complexity
O(n)The algorithm first counts the frequency of each letter in the input string of length n. This step takes O(n) time. Subsequently, it iterates through a fixed number of digits (0-9) and their corresponding English spellings to count the occurrences of each digit based on unique letters and deduction. Since the number of digits is constant (10), this part of the algorithm takes constant time, O(1). Finally, constructing the resulting string involves iterating through the counts of each digit, which again takes at most O(n) time in the worst case (e.g., the input string is all 'one'). Overall, the dominant factor is the initial letter counting and string construction, leading to a time complexity of O(n).
Space Complexity
O(1)The solution uses a fixed-size array or hash map to count letter frequencies, an array to store digit counts, and a string builder for the final result. The size of the frequency map and digit counts are determined by the English alphabet (26 letters) and the number of digits (10), respectively, both of which are constants. Therefore, the auxiliary space used does not depend on the input string's length, denoted as N, and remains constant. Thus, the space complexity is O(1).

Edge Cases

CaseHow to Handle
Null or empty input stringReturn an empty list or null, based on problem specifications, since there are no digits to reconstruct.
Input string contains characters other than lowercase English lettersReject the input or preprocess by removing/ignoring non-alphabetic characters to avoid incorrect counts.
Input string is extremely long (close to the maximum allowable string length)Ensure the character counting and digit reconstruction processes are efficient enough to handle large inputs without exceeding time limits.
Input string contains letters that don't form any valid digit wordsReturn an empty list or a predefined error value, indicating no valid digit reconstruction is possible.
Input string contains a combination of letters that could lead to multiple valid digit sequencesThe algorithm should deterministically choose one valid sequence based on a predefined priority (e.g., always prioritizing certain digit words).
The frequency counts of certain letters are zero or negative after processing some digitsHandle the edge case gracefully, by either detecting the error and returning an empty list or by continuing the algorithm ignoring the negative/zero values as they imply invalid cases.
The reconstructed digits, when concatenated, result in leading zeros (e.g., 'zero zero')Ensure the resulting list is properly formatted by stripping leading zeros if required, depending on problem specifications.
Integer overflow when counting character frequencies or digit occurrences.Use appropriate data types (e.g., long) for storing frequency counts to prevent potential integer overflows.