Longest Subsequence Repeated k Times Interview Question for Meta

You are given a string s of length n, and an integer k. You are tasked to find the longest subsequence repeated k times in string s.

A subsequence is a string that can be derived from another string by deleting some or no characters without changing the order of the remaining characters.

A subsequence seq is repeated k times in the string s if seq * k is a subsequence of s, where seq * k represents a string constructed by concatenating seq k times.

For example, "bba" is repeated 2 times in the string "bababcba", because the string "bbabba", constructed by concatenating "bba" 2 times, is a subsequence of the string "**b**a**bab**c**ba**".

Return the longest subsequence repeated k times in string s. If multiple such subsequences are found, return the lexicographically largest one. If there is no such subsequence, return an empty string.

For example:

s = "letsleetcode", k = 2
Output: "let"
Explanation: There are two longest subsequences repeated 2 times: "let" and "ete".
"let" is the lexicographically largest one.

s = "bb", k = 2
Output: "b"
Explanation: The longest subsequence repeated 2 times is "b".

s = "ab", k = 2
Output: ""
Explanation: There is no subsequence repeated 2 times. Empty string is returned.

Can you provide an efficient algorithm and code implementation to solve this problem? What is the time and space complexity of your solution?

Naive Approach

A brute-force approach would involve generating all possible subsequences of the input string s and then checking if each subsequence, when repeated k times, is also a subsequence of s. The longest lexicographically largest subsequence that meets this condition would be the answer.

Algorithm

Generate all subsequences of s.
For each subsequence seq, create a string repeated_seq by concatenating seq k times.
Check if repeated_seq is a subsequence of s.
Keep track of the longest subsequence that satisfies the condition and is lexicographically largest.

Code (Python)

def is_subsequence(s, t):
    i = 0
    j = 0
    while i < len(s) and j < len(t):
        if s[i] == t[j]:
            i += 1
        j += 1
    return i == len(s)

def longest_repeated_subsequence_naive(s, k):
    n = len(s)
    longest_subsequence = ""
    
    for i in range(2**n):
        subsequence = ""
        for j in range(n):
            if (i >> j) & 1:
                subsequence += s[j]
        
        repeated_subsequence = subsequence * k
        
        if is_subsequence(repeated_subsequence, s):
            if len(subsequence) > len(longest_subsequence) or \
               (len(subsequence) == len(longest_subsequence) and subsequence > longest_subsequence):
                longest_subsequence = subsequence
                
    return longest_subsequence

Time Complexity

Generating all subsequences takes O(2ⁿ) time.
Checking if a subsequence is repeated k times and is a subsequence of s takes O(k * len(subsequence) + n) in the worst case for each subsequence.
Overall, the time complexity is O(2ⁿ * (k * n + n)), which simplifies to O(2ⁿ * k * n).

Space Complexity

O(n) to store the subsequence and the repeated subsequence.

Optimal Approach: Dynamic Programming + Greedy

The optimal approach combines dynamic programming to precompute character positions and a greedy strategy to build the longest repeated subsequence.

Algorithm

Precompute Character Positions: Create a list of lists pos where pos[c] stores the indices of character c in the string s. This allows for quick lookups of character positions during subsequence construction.
Greedy Construction: Start with an empty subsequence. Iterate through characters 'z' to 'a' (lexicographically largest first). For each character c, try to append it to the current subsequence. Check if the new subsequence (repeated k times) is a subsequence of s. If it is, append c to the subsequence and continue. If not, move to the next character.
Subsequence Check: To efficiently check if a subsequence seq * k is a subsequence of s, maintain an index pointer into s. For each character in seq * k, find its next occurrence in s starting from the current index. If any character cannot be found, the subsequence is invalid.

Code (Python)

def longest_repeated_subsequence(s, k):
    pos = {c: [] for c in 'abcdefghijklmnopqrstuvwxyz'}
    for i, c in enumerate(s):
        pos[c].append(i)

    res = ""
    idx = -1

    for char in sorted(pos.keys(), reverse=True):
        new_res = res + char
        cur_idx = -1
        possible = True

        for _ in range(k):
            for c in new_res:
                indices = pos[c]
                next_idx = next((i for i in indices if i > cur_idx), None)

                if next_idx is None:
                    possible = False
                    break
                cur_idx = next_idx
            if not possible:
                break

        if possible:
            res = new_res

    return res

Time Complexity

O(n) to create the pos dictionary.
O(26 * k * len(res) * n) in the worst case, where len(res) is the length of the result which is at most n. In the worst case, len(res) can be n.
Overall, the time complexity is approximately O(n * k * n) which simplifies to O(kn²).

Space Complexity

O(26 * n) to store the pos dictionary (at most 26 characters, each with at most n positions).
O(n) to store the resulting subsequence.
Overall, the space complexity is O(n).

Edge Cases

Empty String: If the input string s is empty, the longest repeated subsequence is an empty string.
No Repeated Subsequence: If there is no subsequence that can be repeated k times, the function should return an empty string. Example: `s =

Longest Subsequence Repeated k Times

Solution

Naive Approach

Algorithm

Code (Python)

Time Complexity

Space Complexity

Optimal Approach: Dynamic Programming + Greedy

Algorithm

Code (Python)

Time Complexity

Space Complexity

Edge Cases