Given two strings needle
and haystack
, return the index of the first occurrence of needle
in haystack
, or -1
if needle
is not part of haystack
.
Example 1:
Input: haystack = "sadbutsad", needle = "sad" Output: 0 Explanation: "sad" occurs at index 0 and 6. The first occurrence is at index 0, so we return 0.
Example 2:
Input: haystack = "leetcode", needle = "leeto" Output: -1 Explanation: "leeto" did not occur in "leetcode", so we return -1.
Constraints:
1 <= haystack.length, needle.length <= 104
haystack
and needle
consist of only lowercase English characters.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
We want to find the first spot where a smaller piece of text appears inside a larger piece of text. The brute force approach involves checking every possible starting position within the larger text to see if the smaller text matches at that location. We continue this process until we find a match, or exhaust all possible starting positions.
Here's how the algorithm would work step-by-step:
def find_first_occurrence(
larger_text,
smaller_text):
larger_text_length = len(larger_text)
smaller_text_length = len(smaller_text)
for starting_position in range(
larger_text_length - smaller_text_length + 1):
# Iterate through all possible starting positions
# within the larger text.
match = True
for character_index in range(smaller_text_length):
if larger_text[starting_position + \
character_index] != \
smaller_text[character_index]:
match = False
break
# If all characters match, return
# the starting position.
if match:
return starting_position
# If no match is found, return -1.
return -1
Instead of checking every single possible starting point, the optimal approach focuses on quickly skipping over sections that can't possibly contain the answer. This involves a preprocessing step that smartly identifies patterns to avoid unnecessary comparisons later on, which dramatically speeds up the search.
Here's how the algorithm would work step-by-step:
def find_first_occurrence(text, pattern):
text_length = len(text)
pattern_length = len(pattern)
if pattern_length == 0:
return 0
# Create potential starting point map.
potential_starts = []
for i in range(text_length - pattern_length + 1):
potential_starts.append(i)
# Only compare pattern at possible start positions.
for start_index in potential_starts:
if text[start_index:start_index + pattern_length] == pattern:
return start_index
# Pattern isn't in text.
return -1
Case | How to Handle |
---|---|
needle is an empty string | Return 0, as an empty string is considered to be present at the beginning of any string. |
haystack is an empty string, and needle is not | Return -1, as the needle cannot be found in an empty haystack. |
Both needle and haystack are empty strings | Return 0, as an empty needle is at the beginning of an empty haystack. |
needle is longer than haystack | Return -1, as the needle cannot be found if it's longer than the haystack. |
needle appears at the very beginning of haystack | Standard iterative string comparison should correctly identify index 0 as the starting point. |
needle appears at the very end of haystack | Ensure the loop condition allows checking up to the last possible start index for the needle. |
needle appears multiple times in haystack | The algorithm should return the index of the *first* occurrence and terminate there. |
haystack and needle contain very long strings | Using a naive string comparison approach could result in O(m*n) time complexity; consider more efficient algorithms like KMP. |