Taro Logo

Shortest Palindrome

Medium
1 views
2 months ago

You are given a string s. You can convert s to a palindrome by adding characters in front of it.

Return the shortest palindrome you can find by performing this transformation.

Example 1: Input: s = "aacecaaa" Output: "aaacecaaa"

Example 2: Input: s = "abcd" Output: "dcbabcd"

Constraints: 0 <= s.length <= 5 * 10^4 s consists of lowercase English letters only.

Sample Answer
class Solution:
    def shortestPalindrome(self, s: str) -> str:
        # Naive Approach (TLE):
        # For each prefix of s, check if it's a palindrome.
        # If it is, construct the shortest palindrome.
        # Time complexity: O(n^2), where n is the length of s.
        # Space complexity: O(n)

        # Optimized Approach: KMP
        # The idea is to find the longest prefix of s that is also a suffix of its reverse.
        # 1. Reverse the string s to get rev_s
        # 2. Create a new string t = s + '#' + rev_s
        # 3. Compute the longest prefix suffix (LPS) array for t using the KMP algorithm.
        # 4. The value of LPS[-1] is the length of the longest prefix of s that is also a suffix of rev_s.
        # 5. The shortest palindrome is rev_s[:len(s) - LPS[-1]] + s

        def kmp_table(w):
            T = [0] * len(w)
            pos = 1
            cnd = 0
            while pos < len(w):
                if w[pos] == w[cnd]:
                    T[pos] = cnd + 1
                    pos += 1
                    cnd += 1
                elif cnd > 0:
                    cnd = T[cnd - 1]
                else:
                    pos += 1
            return T

        rev_s = s[::-1]
        t = s + '#' + rev_s
        lps = kmp_table(t)
        return rev_s[:len(s) - lps[-1]] + s

# Example Usage
solution = Solution()
print(solution.shortestPalindrome("aacecaaa"))  # Output: aaacecaaa
print(solution.shortestPalindrome("abcd"))     # Output: dcbabcd

Explanation:

The problem asks us to find the shortest palindrome that can be formed by adding characters to the beginning of a given string s. A naive approach would involve checking every prefix of s to see if it's a palindrome and constructing the shortest possible palindrome from that. However, this is inefficient.

The provided solution uses an optimized KMP (Knuth-Morris-Pratt) based approach. Here's how it works:

  1. Reverse the string: Create the reversed version of the input string, rev_s.
  2. Concatenate with a separator: Create a new string t by concatenating s, a special character #, and rev_s. The # ensures that we don't accidentally consider matches that cross the boundary between s and its reverse.
  3. Compute LPS array: Compute the Longest Proper Prefix which is also a Suffix (LPS) array for string t. The LPS array at each index i stores the length of the longest proper prefix of t[:i+1] which is also a suffix of t[:i+1].
  4. Determine the prefix: The last element of the LPS array, lps[-1], gives the length of the longest prefix of s that is also a suffix of rev_s.
  5. Construct the shortest palindrome: The shortest palindrome is formed by taking the portion of rev_s that isn't already a prefix of s (i.e., rev_s[:len(s) - lps[-1]]) and prepending it to the original string s.

Time Complexity: O(n)

  • Reversing the string s: O(n)

  • Creating the string t: O(n)

  • Computing the LPS array using the KMP algorithm: O(n)

  • Constructing the shortest palindrome: O(n)

    Therefore, the overall time complexity is O(n).

Space Complexity: O(n)

  • rev_s: O(n)

  • t: O(n)

  • lps: O(n)

    Therefore, the overall space complexity is O(n).

Edge Cases:

  • Empty string: If the input string s is empty, the function should return an empty string, which it does correctly.
  • Already a palindrome: If the input string s is already a palindrome, the function should return the string itself. The algorithm correctly identifies this because lps[-1] will be equal to the length of s.
  • String with only one character: If the input string s has only one character, it is already a palindrome, and the function should return the string itself. The algorithm handles this case correctly.
  • String with repeating characters: Strings like "aaaa" are correctly handled as they are already palindromes.

KMP Algorithm Explanation:

The kmp_table function computes the LPS array (also known as the prefix function) for a given string w. This table is a crucial part of the KMP algorithm. The LPS array is used to avoid unnecessary comparisons when a mismatch occurs during the search for a pattern in a text. The algorithm efficiently determines the longest proper prefix of a string that is also a suffix. Here's a step-by-step explanation of the code:

  1. Initialization:

    • T = [0] * len(w): Initializes the LPS array T with zeros. The length of T is the same as the length of the input string w.
    • pos = 1: pos is the index of the character currently being processed in the string w. It starts from 1 because the LPS value for the first character is always 0.
    • cnd = 0: cnd (candidate) is the index of the end of the longest proper prefix that is also a suffix. It starts from 0.
  2. Loop through the string:

    • The while pos < len(w) loop iterates through the string w from the second character onwards.
  3. Character Match:

    • if w[pos] == w[cnd]:
      • If the current character w[pos] matches the character at index cnd, it means we can extend the length of the longest proper prefix that is also a suffix.
      • T[pos] = cnd + 1: We update the LPS value for the current position pos to cnd + 1.
      • pos += 1: Move to the next character in the string.
      • cnd += 1: Increment cnd to reflect the extended prefix/suffix.
  4. Character Mismatch:

    • elif cnd > 0:
      • If there is a mismatch and cnd is greater than 0, it means we have a non-empty candidate prefix. We need to find a shorter prefix that might match.
      • cnd = T[cnd - 1]: We update cnd to the LPS value of the previous index (cnd - 1). This is the crucial step in the KMP algorithm that avoids restarting the search from the beginning. It leverages the information already computed in the LPS array.
  5. No Match (cnd is 0):

    • else:
      • If cnd is 0, it means there is no matching prefix. We simply move to the next character in the string.
      • pos += 1
  6. Return the LPS array:

    • return T: After the loop completes, the function returns the computed LPS array T.

In summary, the kmp_table function efficiently computes the LPS array, which is a key component of the KMP algorithm, by iteratively comparing characters and leveraging previously computed LPS values to avoid redundant comparisons. The LPS array is then used to find the shortest palindrome by prepending the necessary characters to the original string.