Taro Logo

Find the Length of the Longest Common Prefix

Medium
Capital One logo
Capital One
0 views
Topics:
ArraysStrings

You are given two arrays with positive integers arr1 and arr2.

A prefix of a positive integer is an integer formed by one or more of its digits, starting from its leftmost digit. For example, 123 is a prefix of the integer 12345, while 234 is not.

A common prefix of two integers a and b is an integer c, such that c is a prefix of both a and b. For example, 5655359 and 56554 have common prefixes 565 and 5655 while 1223 and 43456 do not have a common prefix.

You need to find the length of the longest common prefix between all pairs of integers (x, y) such that x belongs to arr1 and y belongs to arr2.

Return the length of the longest common prefix among all pairs. If no common prefix exists among them, return 0.

Example 1:

Input: arr1 = [1,10,100], arr2 = [1000]
Output: 3
Explanation: There are 3 pairs (arr1[i], arr2[j]):
- The longest common prefix of (1, 1000) is 1.
- The longest common prefix of (10, 1000) is 10.
- The longest common prefix of (100, 1000) is 100.
The longest common prefix is 100 with a length of 3.

Example 2:

Input: arr1 = [1,2,3], arr2 = [4,4,4]
Output: 0
Explanation: There exists no common prefix for any pair (arr1[i], arr2[j]), hence we return 0.
Note that common prefixes between elements of the same array do not count.

Constraints:

  • 1 <= arr1.length, arr2.length <= 5 * 104
  • 1 <= arr1[i], arr2[i] <= 108

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of any individual string within the input array?
  2. Can the input array `strs` be empty or contain null or empty strings?
  3. Is the prefix case-sensitive?
  4. If no common prefix exists, should I return 0, or an empty string's length (which would also be 0), or something else?
  5. Are there any special characters or unicode characters that could be present in the strings?

Brute Force Solution

Approach

The goal is to find the longest set of characters that appear at the beginning of all the words in a given list. The brute force method involves checking progressively longer prefixes until we find one that doesn't match all words. If nothing matches, then the common prefix is nothing or 'empty'.

Here's how the algorithm would work step-by-step:

  1. Start by assuming the longest common prefix is the entire first word.
  2. Now, check if this entire prefix appears at the beginning of *every* other word in the list.
  3. If it does, then that's the answer!
  4. If not, shorten the prefix by one character from the end.
  5. Repeat the process: check if this shorter prefix appears at the beginning of *every* other word.
  6. Continue shortening the prefix one character at a time and checking until you find a prefix that appears at the beginning of all the words.
  7. If you shorten the prefix to nothing and still haven't found a match, then there is no common prefix, and the answer is nothing.

Code Implementation

def find_longest_common_prefix_brute_force(list_of_strings):
    if not list_of_strings:
        return ""

    longest_common_prefix = list_of_strings[0]

    # Iterate while the prefix is not empty
    while longest_common_prefix:
        # Check if current prefix is a prefix of all strings
        is_common = True
        for current_string in list_of_strings[1:]:
            if not current_string.startswith(longest_common_prefix):
                is_common = False
                break

        # If it is a common prefix, return it
        if is_common:
            return longest_common_prefix

        # Shorten the prefix by one character

        longest_common_prefix = longest_common_prefix[:-1]

    # If the loop finishes, no common prefix was found
    return ""

Big(O) Analysis

Time Complexity
O(S*N)Let S be the length of the shortest string in the input array and N be the number of strings in the input array. The outer loop iterates at most S times, shortening the prefix of the first string. Inside the outer loop, we iterate through the remaining N-1 strings to check if they start with the current prefix. Therefore, the time complexity is proportional to S multiplied by N, giving us O(S*N).
Space Complexity
O(1)The provided algorithm iteratively shortens the initial prefix. While there's implicit string slicing, these slices are often implemented in-place or return a reference to the original string (depending on the language), thus not creating a full copy. The dominant space usage comes from storing the length of the prefix and loop counters which are independent of the input size N (the number of words, or the length of the words). Therefore, the auxiliary space required is constant.

Optimal Solution

Approach

To efficiently find the longest common prefix of a list of words, we'll compare the words character by character. The goal is to stop as soon as we find a character that doesn't match across all words.

Here's how the algorithm would work step-by-step:

  1. Start by examining the first letter of every word in the list.
  2. If the first letter is the same for all words, move to the second letter and repeat the check.
  3. Continue this process, comparing letters at the same position in each word, until you find a position where the letters are different or you reach the end of one of the words.
  4. The longest common prefix is the string formed by all the letters that matched before you found a difference or reached the end of a word.

Code Implementation

def find_length_of_longest_common_prefix(list_of_strings):
    if not list_of_strings:
        return 0

    prefix_length = 0
    # Iterate char by char thru the first string.
    for character_index in range(len(list_of_strings[0])):
        character = list_of_strings[0][character_index]

        # Compare current char against all other strings.
        for string_index in range(1, len(list_of_strings)):
            # Check if index is out of range for the current string.
            if character_index >= len(list_of_strings[string_index]) or \
               list_of_strings[string_index][character_index] != character:
                return prefix_length

        # All chars matched for all strings.
        prefix_length += 1

    return prefix_length

Big(O) Analysis

Time Complexity
O(S)The algorithm iterates through the characters of the strings until a mismatch is found or the end of the shortest string is reached. The outer loop's iterations are bounded by the length of the shortest string in the input array, say L. The inner loop iterates through all the strings in the input array (say there are N strings). Thus the time complexity is O(L * N). We can also define S as the sum of all the string lengths, therefore L * N is less than or equal to S, which is the total number of characters across all strings. In the worst-case, we would need to compare every character of every word and the Big O time complexity can be expressed as O(S), where S represents total characters of all strings.
Space Complexity
O(1)The algorithm compares characters at the same position across all words. It doesn't use any additional data structures like arrays, lists, or hash maps to store intermediate results or track visited words. Only a constant amount of extra space is used to store index variables for iterating through the characters. Therefore, the space complexity is O(1), indicating constant auxiliary space usage regardless of the input size (number of words or length of the words).

Edge Cases

CaseHow to Handle
Empty input arrayReturn 0 immediately as there are no strings to compare.
Null input arrayThrow an IllegalArgumentException or return 0 after a null check, indicating invalid input.
Array with a single stringReturn the length of the single string, as it is trivially the longest common prefix.
All strings are identicalReturn the length of any of the strings, as they all have the same length and are the common prefix.
No common prefix exists (first characters are all different)The algorithm should correctly identify this and return 0.
One string is a prefix of another string (e.g., 'flower' and 'flow')The algorithm should correctly identify 'flow' as the longest common prefix and return its length.
Very long strings (performance consideration)The algorithm's efficiency should be considered to avoid excessive iteration in very large strings; iterative solutions are generally preferred.
Strings containing non-ASCII charactersThe algorithm should work correctly with Unicode/UTF-8 encoded characters, ensuring correct character comparisons.