Given a string s
consisting of lowercase English letters, return the first letter to appear twice.
Note:
a
appears twice before another letter b
if the second occurrence of a
is before the second occurrence of b
.s
will contain at least one letter that appears twice.For example:
Example 1:
s = "abccbaacz"
Output:
"c"
Explanation: The letter 'a' appears on the indexes 0, 5 and 6. The letter 'b' appears on the indexes 1 and 4. The letter 'c' appears on the indexes 2, 3 and 7. The letter 'z' appears on the index 8. The letter 'c' is the first letter to appear twice, because out of all the letters the index of its second occurrence is the smallest.
Example 2:
s = "abcdd"
Output:
"d"
Explanation: The only letter that appears twice is 'd' so we return 'd'.
What is the most efficient algorithm to solve this problem? Describe the Big O time and space complexity. Provide code. What are the edge cases?
The brute force approach involves iterating through the string and, for each character, checking if it appears again later in the string. This can be done by using nested loops.
def first_twice_naive(s: str) -> str:
for i in range(len(s)):
for j in range(i + 1, len(s)):
if s[i] == s[j]:
return s[i]
return ""
The outer loop runs n
times, and the inner loop runs up to n
times in the worst case, so the time complexity is O(n^2).
The space complexity is O(1) because it only uses a constant amount of extra space.
A more efficient solution utilizes a hash set to keep track of the characters encountered so far. As we iterate through the string, if a character is already in the set, it means we have found the first character that appears twice.
def first_twice_optimal(s: str) -> str:
seen = set()
for char in s:
if char in seen:
return char
seen.add(char)
return ""
The optimal solution iterates through the string once, so the time complexity is O(n).
The space complexity is O(1), or more precisely O(26), because in the worst-case scenario, we might store all 26 lowercase English letters in the set. This can be considered constant space since the size of the set is bounded by a fixed number and does not depend on the input string size beyond that fixed number.