Taro Logo

Permutation in String

Medium
Goldman Sachs logo
Goldman Sachs
0 views
Topics:
StringsSliding Windows

Given two strings s1 and s2, return true if s2 contains a permutation of s1, or false otherwise.

In other words, return true if one of s1's permutations is the substring of s2.

Example 1:

Input: s1 = "ab", s2 = "eidbaooo"
Output: true
Explanation: s2 contains one permutation of s1 ("ba").

Example 2:

Input: s1 = "ab", s2 = "eidboaoo"
Output: false

Constraints:

  • 1 <= s1.length, s2.length <= 104
  • s1 and s2 consist of lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can the input strings s1 and s2 contain characters outside of the standard ASCII character set?
  2. What are the maximum lengths of s1 and s2? This will help me think about potential performance bottlenecks.
  3. Is it possible for either s1 or s2 to be null or empty strings? If so, what should the return value be?
  4. Is the comparison case-sensitive (e.g., should 'abc' be considered a permutation of 'bAc')?
  5. If s1 is longer than s2, can a permutation of s1 exist within s2? Should I return false immediately in that scenario?

Brute Force Solution

Approach

The brute force method for this problem is about checking every possible arrangement. We will try every possible rearrangement of the shorter string to see if it is contained within the larger string. This involves creating and comparing a large number of possibilities.

Here's how the algorithm would work step-by-step:

  1. First, consider all possible rearrangements of the letters in the shorter string.
  2. Then, go through the longer string piece by piece, looking at sections that are the same length as the shorter string.
  3. For each section of the longer string, check if it's an identical rearrangement of the shorter string that you've already generated.
  4. If you find one that matches, you've found a solution and can stop.
  5. If you go through every rearrangement and check every section of the longer string without finding a match, then no such rearrangement exists.

Code Implementation

def permutation_in_string_brute_force(short_string, long_string):
    from itertools import permutations

    # Generate all possible permutations of the short string
    all_permutations = [''.join(permutation) for permutation in permutations(short_string)]

    # Iterate through the long string with a sliding window
    for index in range(len(long_string) - len(short_string) + 1):
        substring_to_check = long_string[index:index + len(short_string)]

        # Iterate through all the permutations of the short string
        for possible_permutation in all_permutations:
            # Check if a permutation is a match to the current substring
            if substring_to_check == possible_permutation:

                # If match found, return true
                return True

    # If no permutation matches any substring, return false
    return False

Big(O) Analysis

Time Complexity
O(n! * m)Let n be the length of the shorter string (s1) and m be the length of the longer string (s2). Generating all permutations of the shorter string takes O(n!) time. For each permutation, we iterate through the longer string (s2) with length m, comparing each substring of length n to the generated permutation, taking O(m) time per permutation. Thus, the overall time complexity is O(n! * m).
Space Complexity
O(N!)The brute force method generates all possible rearrangements of the shorter string. In the worst-case scenario, the shorter string has length N, where N represents the length of the shorter string. Generating all permutations of a string of length N requires storing up to N! permutations. Thus, the auxiliary space is dominated by the space needed to store these permutations, resulting in O(N!) space complexity.

Optimal Solution

Approach

The trick is to avoid generating all possible arrangements. Instead, we focus on checking if the characters in the smaller string are present in a sliding window of the larger string, and if those characters appear with the correct frequency.

Here's how the algorithm would work step-by-step:

  1. First, count how many times each character appears in the smaller string.
  2. Next, slide a window across the larger string, where the window's length is the same as the smaller string's length.
  3. For each window, count how many times each character appears in the window.
  4. Compare the character counts of the window with the character counts of the smaller string. If they match exactly, you've found a permutation!
  5. If you find a match, you can stop and say that a permutation exists.
  6. If you reach the end of the larger string without finding a match, then no permutation exists.

Code Implementation

def find_permutation(smaller_string, larger_string):
    smaller_string_length = len(smaller_string)
    larger_string_length = len(larger_string)

    if smaller_string_length > larger_string_length:
        return False

    # Store char frequencies of the smaller string
    smaller_string_char_counts = {}
    for character in smaller_string:
        smaller_string_char_counts[character] = smaller_string_char_counts.get(character, 0) + 1

    # Initialize the sliding window's char counts
    window_char_counts = {}
    for i in range(smaller_string_length):
        character = larger_string[i]
        window_char_counts[character] = window_char_counts.get(character, 0) + 1

    # Check if the initial window is a permutation
    if window_char_counts == smaller_string_char_counts:
        return True

    # Slide the window across the larger string
    for i in range(1, larger_string_length - smaller_string_length + 1):
        # Remove the leftmost character from the window
        left_character = larger_string[i - 1]
        window_char_counts[left_character] -= 1

        if window_char_counts[left_character] == 0:
            del window_char_counts[left_character]

        # Add the rightmost character to the window
        right_character = larger_string[i + smaller_string_length - 1]
        window_char_counts[right_character] = window_char_counts.get(right_character, 0) + 1

        # Compare the current window with the smaller string
        if window_char_counts == smaller_string_char_counts:
            # Permutation found
            return True

    # No permutation found
    return False

Big(O) Analysis

Time Complexity
O(n)The time complexity is determined by the sliding window approach. Calculating character counts for the smaller string takes constant time as the string size is limited to 26 characters. Sliding the window of size equal to the smaller string across the larger string of size n involves n-k+1 operations (where k is the length of the smaller string). Comparing the character counts of the window with the character counts of the smaller string also takes constant time. Therefore, the overall time complexity is dominated by the single pass through the larger string, resulting in O(n).
Space Complexity
O(1)The algorithm uses constant auxiliary space. It stores character counts for the smaller string and the current window of the larger string. Since the character set size is limited (e.g., 26 for lowercase English letters), the character counts are stored in fixed-size data structures regardless of the input string lengths. Therefore, the space used does not depend on the size of the input strings and remains constant.

Edge Cases

CaseHow to Handle
s1 or s2 is null or emptyReturn false if either s1 or s2 is null or empty as a permutation cannot exist.
s1 is longer than s2Return false if s1 is longer than s2, as s2 cannot contain a permutation of s1.
s1 and s2 are identicalReturn true if s1 and s2 are identical, as s2 trivially contains a permutation of s1.
s1 and s2 contain only one characterCompare the characters directly and return true if they match, false otherwise.
s1 contains duplicate charactersThe sliding window and character frequency map approach will accurately account for duplicates.
s2 contains many repeating characters, but not a permutation of s1The frequency map comparison will not find a match and correctly return false.
Maximum length strings with Unicode charactersEnsure the chosen frequency map representation (e.g., hash map) can handle the Unicode character range and doesn't overflow, while considering time complexity constraints for very large inputs.
s1 is very long and contains many distinct characters, compared to a short s2The initial frequency map calculation of s1 might be expensive, but the sliding window should efficiently compare it with s2's substrings.