Taro Logo

Longest Happy Prefix

Hard
Bloomberg LP logo
Bloomberg LP
0 views
Topics:
StringsArraysTwo Pointers

A string is called a happy prefix if is a non-empty prefix which is also a suffix (excluding itself).

Given a string s, return the longest happy prefix of s. Return an empty string "" if no such prefix exists.

Example 1:

Input: s = "level"
Output: "l"
Explanation: s contains 4 prefix excluding itself ("l", "le", "lev", "leve"), and suffix ("l", "el", "vel", "evel"). The largest prefix which is also suffix is given by "l".

Example 2:

Input: s = "ababab"
Output: "abab"
Explanation: "abab" is the largest prefix which is also suffix. They can overlap in the original string.

Constraints:

  • 1 <= s.length <= 105
  • s contains only lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of the input string?
  2. Is the input string guaranteed to be non-empty?
  3. If there are multiple happy prefixes of different lengths, should I return the longest one?
  4. If no happy prefix exists (other than the empty string), what should I return?
  5. Is the input string composed of only ASCII characters, or should I account for Unicode?

Brute Force Solution

Approach

The brute force approach to finding the longest happy prefix is like comparing every possible beginning of a word to its end. We check increasingly longer prefixes to see if they match the suffix of the same length.

Here's how the algorithm would work step-by-step:

  1. Start by looking at the very first letter and the very last letter. See if they're the same.
  2. Next, look at the first two letters and the last two letters. See if those groups are the same.
  3. Keep doing this, each time adding one more letter to both the beginning and the end groups.
  4. For each comparison, if the beginning group and the ending group are identical, remember the length of that group.
  5. Continue until you've reached a point where the beginning group and the ending group overlap or meet in the middle.
  6. The longest group you remembered that had matching beginning and ending sections is your answer.

Code Implementation

def longest_happy_prefix_brute_force(input_string):
    longest_happy_prefix = ""

    for prefix_length in range(1, len(input_string)):
        # Check prefixes of increasing length

        prefix = input_string[:prefix_length]
        suffix = input_string[len(input_string) - prefix_length:]

        # Need to compare the prefix with the suffix to find matches
        if prefix == suffix:

            longest_happy_prefix = prefix

    return longest_happy_prefix

Big(O) Analysis

Time Complexity
O(n²)The brute force approach iterates through possible prefix lengths from 1 up to n-1. For each prefix length, it compares the prefix of that length with the suffix of the same length. The comparison of the prefix and suffix of length 'k' takes O(k) time. Since 'k' iterates from 1 to n-1, the total time complexity is proportional to 1 + 2 + 3 + ... + (n-1), which is the sum of an arithmetic series. This sum is approximately n(n-1)/2 which simplifies to O(n²).
Space Complexity
O(1)The provided algorithm iteratively compares prefixes and suffixes of the input string without using any auxiliary data structures that scale with the input size, N, which is the length of the input string. It only stores the length of the longest matching prefix/suffix encountered so far. Since the number of stored variables remains constant irrespective of the input string's length, the space complexity is constant.

Optimal Solution

Approach

The goal is to find the longest part of a word that also appears at the beginning. Instead of checking every possible part, we use a clever trick to quickly identify matching sections, avoiding redundant comparisons.

Here's how the algorithm would work step-by-step:

  1. Imagine you have two 'pointers', one starting at the beginning of the word and the other at the end.
  2. Move both 'pointers' inward, one character at a time.
  3. At each step, check if the part of the word from the very beginning up to the first 'pointer' is exactly the same as the part of the word from the second 'pointer' to the very end.
  4. Keep doing this until the two 'pointers' meet or cross each other.
  5. The longest match you found before the pointers crossed is your answer - that's the longest part that's both a prefix and a suffix.

Code Implementation

def longest_happy_prefix(word):
    length_of_word = len(word)
    prefix_end_index = 0
    suffix_start_index = length_of_word - 1
    longest_match = ""

    # Iterate inward from both ends of the word.
    while prefix_end_index < suffix_start_index:
        prefix = word[0:prefix_end_index + 1]
        suffix = word[suffix_start_index:length_of_word]

        # Check if prefix and suffix match.
        if prefix == suffix:
            # Update the longest match found so far.
            longest_match = prefix

        # Move pointers inward.
        prefix_end_index += 1
        suffix_start_index -= 1

    return longest_match

Big(O) Analysis

Time Complexity
O(n²)The algorithm involves comparing a prefix of length 'i' with a suffix of length 'i' for increasing values of 'i' up to n/2. For each value of 'i', the comparison of the prefix and suffix takes O(i) time. Since 'i' iterates up to approximately n/2, the total time complexity is proportional to the sum of 'i' from 1 to n/2, which is (n/2) * (n/2 + 1) / 2. This simplifies to approximately n²/8 + n/4. Therefore, the dominant term is n², resulting in a time complexity of O(n²).
Space Complexity
O(1)The described algorithm uses two pointers that track positions within the input string, but it doesn't create any new data structures (like arrays, lists, or hash maps) to store intermediate results or track visited locations. The space required for these two pointers remains constant regardless of the length of the input string, denoted as N. Therefore, the auxiliary space complexity is constant. The algorithm's space usage does not grow with the input size.

Edge Cases

CaseHow to Handle
Null or empty input stringReturn an empty string immediately as there can be no prefix or suffix.
String of length 1Return an empty string as a prefix and suffix cannot overlap in this case.
String with all identical charactersIteratively reduce prefix length until a mismatch is found or the prefix is empty.
Very long string (approaching maximum string length)Ensure the chosen algorithm (e.g., KMP or rolling hash) is efficient and avoids excessive memory allocation or stack overflow.
String where the only happy prefix is the empty stringThe algorithm should correctly identify and return the empty string as the longest happy prefix when no other match exists.
String with overlapping prefix and suffix (e.g., 'ababab')The algorithm should correctly identify the longest overlapping prefix that is also a suffix, handling this overlap properly.
String containing unicode charactersThe algorithm should handle unicode strings correctly, considering that character comparisons may involve multi-byte representations.
String with maximum length and complex patternEnsure the time complexity of the chosen algorithm is optimal (e.g. O(n) using KMP) to handle the maximum input size within acceptable time limits.