Taro Logo

Decoded String at Index

Medium
a month ago

You are given an encoded string s. To decode the string to a tape, the encoded string is read one character at a time and the following steps are taken:

  • If the character read is a letter, that letter is written onto the tape.
  • If the character read is a digit d, the entire current tape is repeatedly written d - 1 more times in total.

Given an integer k, return the k<sup>th</sup> letter (1-indexed) in the decoded string.

Example 1:

Input: s = "leet2code3", k = 10
Output: "o"
Explanation: The decoded string is "leetleetcodeleetleetcodeleetleetcode".
The 10<sup>th</sup> letter in the string is "o".

Example 2:

Input: s = "ha22", k = 5
Output: "h"
Explanation: The decoded string is "hahahaha".
The 5<sup>th</sup> letter is "h".

Example 3:

Input: s = "a2345678999999999999999", k = 1
Output: "a"
Explanation: The decoded string is "a" repeated 8301530446056247680 times.
The 1<sup>st</sup> letter is "a".
Sample Answer
def decodeAtIndex(s: str, k: int) -> str:
    size = 0
    # Calculate the size of the decoded string
    for char in s:
        if char.isdigit():
            size *= int(char)
        else:
            size += 1

    for i in range(len(s) - 1, -1, -1):
        char = s[i]
        k %= size
        if k == 0 and char.isalpha():
            return char

        if char.isdigit():
            size //= int(char)
        else:
            size -= 1

    return ""

Naive Approach

The naive approach would be to actually decode the entire string as described in the problem statement. This means iterating through the input string s, and for each character:

  1. If the character is a letter, append it to the decoded string.
  2. If the character is a digit d, repeat the current decoded string d-1 times and append the result to the decoded string.

Once the entire string s is processed, we can simply return the kth character of the decoded string. However, this is highly inefficient and not practical for large decoded strings or large values of k.

# Naive Approach (will cause MemoryError for larger inputs)
def decodeAtIndexNaive(s: str, k: int) -> str:
    decoded_string = ""
    for char in s:
        if char.isdigit():
            repeat_count = int(char) - 1
            decoded_string *= (repeat_count + 1)
        else:
            decoded_string += char
    return decoded_string[k - 1]

The major problem with this approach is the memory usage. As the decoded string grows exponentially, it will quickly exhaust available memory.

Optimal Approach

The optimal approach is to work backward, without ever actually constructing the full decoded string. The core idea is to determine the length of the decoded string and then trace back to find the k-th character.

  1. Determine the Decoded String Length: Iterate through the input string s to compute the total length (size) of the fully decoded string. When a letter is encountered, increment the size. When a digit d is encountered, multiply the size by d.

  2. Work Backwards: Iterate through the input string s in reverse order. At each character, adjust k and the decoded size.

    • If k is 0 and the current character is a letter, we've found the k-th character and return it.
    • If the current character is a digit d, divide the size by d and update k as k % size. This simulates undoing the repetition. If k becomes 0, it means k was a multiple of the previous size. We return the k-th character in the previous iteration. However, we don't return here.
    • If the current character is a letter, decrement the size.
def decodeAtIndex(s: str, k: int) -> str:
    size = 0
    # Calculate the size of the decoded string
    for char in s:
        if char.isdigit():
            size *= int(char)
        else:
            size += 1

    for i in range(len(s) - 1, -1, -1):
        char = s[i]
        k %= size
        if k == 0 and char.isalpha():
            return char

        if char.isdigit():
            size //= int(char)
        else:
            size -= 1

    return ""

Big(O) Runtime Analysis

  • Outer Loop (Calculating Size): Iterates through the string s once. This loop has a time complexity of O(N), where N is the length of s.
  • Inner Loop (Working Backwards): Iterates through the string s in reverse order once. This loop also has a time complexity of O(N).
  • Overall: O(N) + O(N) = O(N).

Therefore, the overall time complexity of the optimal approach is O(N), where N is the length of the input string s.

Big(O) Space Usage Analysis

  • The algorithm uses a constant amount of extra space for storing variables like size, k, and char. It doesn't create any large data structures that depend on the input size.

Therefore, the space complexity is O(1), which means constant space.

Edge Cases and How to Handle Them

  1. Empty Input String:

    • If the input string s is empty, the loops will not execute, and an empty string might be returned which should not happen as problem statement says that s will start with a letter, so this case is handled implicitly and no additional check is required.
  2. k = 1:

    • If k is 1, the algorithm should correctly identify the first character in the decoded string. The modulo operation ensures this.
  3. Large Repetition Factors:

    • If the input string contains large digits, the size can become very large. The modulo operation (k %= size) is essential to avoid unnecessarily large computations and potential overflow errors.
  4. k greater than the Decoded Length:

    • The problem guarantees that k is less than or equal to the length of the decoded string, so this case is implicitly handled correctly.
  5. String contains only a single character:

    • If the input string contains only a single letter, the algorithm should return that letter when k is 1.
  6. Overflow:

    • In languages like Python, integer overflow is automatically handled. However, in languages like C++ or Java, it's important to ensure that intermediate calculations do not exceed the maximum representable value of integers. This is handled by constantly using modulo operations.