Taro Logo

Find Beautiful Indices in the Given Array II

Hard
Palantir logo
Palantir
0 views
Topics:
ArraysTwo PointersSliding Windows

You are given a 0-indexed string s, a string a, a string b, and an integer k.

An index i is beautiful if:

  • 0 <= i <= s.length - a.length
  • s[i..(i + a.length - 1)] == a
  • There exists an index j such that:
    • 0 <= j <= s.length - b.length
    • s[j..(j + b.length - 1)] == b
    • |j - i| <= k

Return the array that contains beautiful indices in sorted order from smallest to largest.

Example 1:

Input: s = "isawsquirrelnearmysquirrelhouseohmy", a = "my", b = "squirrel", k = 15
Output: [16,33]
Explanation: There are 2 beautiful indices: [16,33].
- The index 16 is beautiful as s[16..17] == "my" and there exists an index 4 with s[4..11] == "squirrel" and |16 - 4| <= 15.
- The index 33 is beautiful as s[33..34] == "my" and there exists an index 18 with s[18..25] == "squirrel" and |33 - 18| <= 15.
Thus we return [16,33] as the result.

Example 2:

Input: s = "abcd", a = "a", b = "a", k = 4
Output: [0]
Explanation: There is 1 beautiful index: [0].
- The index 0 is beautiful as s[0..0] == "a" and there exists an index 0 with s[0..0] == "a" and |0 - 0| <= 4.
Thus we return [0] as the result.

Constraints:

  • 1 <= k <= s.length <= 5 * 105
  • 1 <= a.length, b.length <= 5 * 105
  • s, a, and b contain only lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the possible ranges and data types for the values within the input array, and can it contain negative numbers or zeros?
  2. What is the expected behavior if either string `a` or `b` is empty? Should I return an empty list or throw an exception?
  3. How large can the input array `nums`, and the strings `a` and `b` be (maximum length)?
  4. If an index `i` satisfies the 'beautiful index' criteria multiple times, should it be included multiple times in the result, or only once?
  5. Is the order of the returned 'beautiful indices' significant, or can I return them in any order?

Brute Force Solution

Approach

The brute force approach to finding beautiful indices is to check every possible index to see if it meets the beautiful criteria. We will go through all potential starting points and exhaustively check all other indices to determine which ones are 'beautiful'.

Here's how the algorithm would work step-by-step:

  1. For every position in the main piece of text, consider it a potential starting point.
  2. For each of these potential starting points, check if the first required pattern is found beginning at that location.
  3. If the first pattern is found, search the entire main piece of text to find any instances where the second pattern appears.
  4. For each appearance of the second pattern, check if it's close enough to the starting point of the first pattern based on the allowed distance.
  5. If the distance requirement is met, mark the starting position of the first pattern as a 'beautiful index'.
  6. Repeat this process for every position in the main piece of text to identify all 'beautiful indices'.

Code Implementation

def find_beautiful_indices_brute_force(text, pattern1, pattern2, distance):
    beautiful_indices = []

    for first_pattern_start in range(len(text)):
        # Check if pattern1 exists starting at this index
        if text[first_pattern_start:first_pattern_start + len(pattern1)] == pattern1:

            for second_pattern_start in range(len(text)):
                # Find all occurrences of pattern2 in the text
                if text[second_pattern_start:second_pattern_start + len(pattern2)] == pattern2:
                    # Check if pattern2 is within the distance from pattern1
                    if abs(first_pattern_start - second_pattern_start) <= distance:

                        # Mark the index as beautiful if the condition is met
                        beautiful_indices.append(first_pattern_start)
                        break # avoid duplicates

    return sorted(list(set(beautiful_indices)))

Big(O) Analysis

Time Complexity
O(n*m1*n*m2)The algorithm iterates through each index of the text array (n). For each index, it checks if pattern1 (length m1) matches. If it matches, it iterates through the entire text array again (n) to find matches of pattern2 (length m2). For each potential pattern2 match, it calculates the distance. Thus, the time complexity is proportional to n * m1 * n * m2, where m1 and m2 are the lengths of pattern1 and pattern2 respectively. This simplifies to O(n*m1*n*m2).
Space Complexity
O(1)The brute force approach iterates through the text and patterns using index variables. The described algorithm does not create any auxiliary data structures like lists, hash maps, or sets to store intermediate results or visited locations. Therefore, the space required is limited to a few constant-size variables, independent of the text's length (N). This results in constant auxiliary space complexity.

Optimal Solution

Approach

The efficient method avoids redundant checks by leveraging string matching algorithms. It first locates all occurrences of the search strings within the large text, then identifies 'beautiful' positions based on distance criteria between the located occurrences. This prevents repeatedly scanning the entire large text.

Here's how the algorithm would work step-by-step:

  1. First, find all starting places where the first search word appears in the big block of text. Record each of these locations.
  2. Next, do the same thing for the second search word: find all starting places and record them.
  3. Now, go through each starting place of the first search word.
  4. For each starting place, check if there's a starting place of the second search word nearby (within the given maximum distance).
  5. If a nearby starting place is found, mark the starting place of the first word as a 'beautiful' location.
  6. Keep doing this for all starting places of the first search word, marking each one as beautiful if it meets the distance requirement.
  7. Finally, return a list of all the marked 'beautiful' locations.

Code Implementation

def find_beautiful_indices(string, search_word1, search_word2, maximum_distance):    beautiful_indices = []    search_word1_indices = []    search_word2_indices = []    # Find all occurrences of search_word1 in string    for index in range(len(string) - len(search_word1) + 1):      if string[index:index + len(search_word1)] == search_word1:        search_word1_indices.append(index)    # Find all occurrences of search_word2 in string    for index in range(len(string) - len(search_word2) + 1):      if string[index:index + len(search_word2)] == search_word2:        search_word2_indices.append(index)    # Iterate through each index of search_word1    for search_word1_index in search_word1_indices:      # Check if a search_word2 index is within maximum_distance      for search_word2_index in search_word2_indices:
        if abs(search_word1_index - search_word2_index) <= maximum_distance:          # Mark index as beautiful if condition is met          beautiful_indices.append(search_word1_index)          break
    # Removing duplicates while preserving order    beautiful_indices = sorted(list(set(beautiful_indices)))    return beautiful_indices

Big(O) Analysis

Time Complexity
O(n*m)Finding all occurrences of the search strings within the large text of size n using string matching algorithms takes O(n) time for each search string. Since we have two search strings, the total time for finding all occurrences is O(n + n) which simplifies to O(n) assuming the lengths of the search strings are constant or relatively small compared to n. The nested loop iterates through all starting indices of the first word (at most n) and then iterates through all starting indices of the second word (at most m, potentially up to n in the worst case where search words overlap frequently), performing a constant-time distance check in each inner iteration. Thus, the overall time complexity becomes O(n * m) where m represents the number of occurrences of the second search word. In the worst case where there are a lot of occurences, m becomes proportional to n leading to O(n*n) which is O(n^2).
Space Complexity
O(N)The dominant space complexity stems from storing the starting indices of both the first search word and the second search word within the large text. In the worst-case scenario, where either search word appears at every position in the large text (of length N), the algorithm would store N indices for each word. Therefore, the auxiliary space is proportional to 2N, which simplifies to O(N).

Edge Cases

CaseHow to Handle
Empty s or p stringReturn an empty list if either s or p is empty, as no valid indices can exist.
s and p are identical stringsReturn a list containing only the index 0 since 'p' will be found at the very start of 's'.
p is longer than sReturn an empty list, as 'p' cannot be a substring of 's' if it's longer.
s and p contain only one repeating characterThe solution should efficiently find all occurrences; the KMP algorithm handles repeating character patterns correctly.
s contains many overlapping occurrences of pThe solution must iterate and check for each overlapping occurence to correctly identify all indices.
Very long s and relatively short p, approaching maximum string lengthEnsure the algorithm's time complexity is considered and optimizes the search for performance.
s contains Unicode charactersThe KMP algorithm should work correctly as it compares code points and not rely on ASCII assumption.
s and p are extremely long strings, potentially causing memory issuesConsider a rolling hash or similar technique to avoid storing the entire string in memory at once.