Taro Logo

Find Common Characters

Easy
Google logo
Google
1 view
Topics:
ArraysStrings

Given a string array words, return an array of all characters that show up in all strings within the words (including duplicates). You may return the answer in any order.

For example:

  • Input: words = ["bella","label","roller"] Output: ["e","l","l"]
  • Input: words = ["cool","lock","cook"] Output: ["c","o"]

Constraints:

  • 1 <= words.length <= 100
  • 1 <= words[i].length <= 100
  • words[i] consists of lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the range of possible characters in the strings (e.g., lowercase English letters only, ASCII, Unicode)?
  2. If a character appears multiple times in *all* strings, should it appear that many times in the output?
  3. What should the function return if the input array is empty or if any of the strings in the input array are empty?
  4. Is the order of the output characters significant, or can they be in any order?
  5. Are all input strings guaranteed to be non-null?

Brute Force Solution

Approach

The brute force approach to finding common characters in a list of words involves checking every possible character across all the words. We will start by looking at the first word and comparing each character of this word with all the other words to see if it exists in all of them. This is like an exhaustive search, ensuring no possibility is missed.

Here's how the algorithm would work step-by-step:

  1. Take the first word from the list.
  2. Consider the first character of the first word.
  3. Check if this character is present in every other word in the list.
  4. If the character is present in every other word, mark it as a common character.
  5. Move to the next character in the first word and repeat the previous two steps.
  6. Continue this process for all characters in the first word.
  7. If a common character appears multiple times in the first word, make sure to account for its maximum possible occurrences based on how many times it shows up in other words.
  8. Collect all the characters that have been identified as common across all words. These are the common characters.

Code Implementation

def find_common_characters_brute_force(words):
    first_word = words[0]
    common_characters = []

    for char_index in range(len(first_word)):
        character = first_word[char_index]
        is_common = True

        # Check if the character is present in all other words
        for word_index in range(1, len(words)):
            if character not in words[word_index]:
                is_common = False
                break

        if is_common:
            # If char is common, count occurrences in each word
            min_occurrences = float('inf')
            for word in words:
                occurrences = word.count(character)
                min_occurrences = min(min_occurrences, occurrences)

            # Add the character to the result
            # based on minimum occurrences.
            for _ in range(min_occurrences):
                common_characters.append(character)

            # Remove the character from the words so it isn't double counted
            for word_index in range(len(words)):
                words[word_index] = words[word_index].replace(character, '', min_occurrences)

    return common_characters

Big(O) Analysis

Time Complexity
O(N * M * K)Let N be the length of the first word, M be the number of words in the input list, and K be the average length of the remaining words. The outer loop iterates through each character of the first word (N iterations). Inside this loop, for each character, we iterate through the remaining M-1 words. For each of those words, we need to check if the character exists (which can take up to K operations in the worst case, scanning the word). Therefore, the total time complexity is approximately N * (M-1) * K which simplifies to O(N * M * K).
Space Complexity
O(1)The provided algorithm, based on the plain English explanation, iterates through the characters of the first word and compares them against all other words. It appears the algorithm primarily operates in-place or uses a constant number of variables to track the presence or count of common characters. No auxiliary data structures that scale with the input size (number of words or word lengths) are explicitly mentioned or implied. Therefore, the auxiliary space complexity remains constant, independent of the input size.

Optimal Solution

Approach

To efficiently find common characters across multiple words, we'll count how often each letter appears in each word. Then, we identify the letters that appear in all words and, for each such letter, determine the minimum number of times it appears in any of the words.

Here's how the algorithm would work step-by-step:

  1. For each word, count how many times each letter appears.
  2. Find the letters that appear in *every* word.
  3. For each common letter, find the smallest number of times it appears in any of the words.
  4. Collect each common letter into a result, repeating each letter as many times as the smallest count.
  5. The resulting collection of letters are the common characters.

Code Implementation

def find_common_characters(words):
    letter_counts = []
    for word in words:
        word_count = {}
        for char in word:
            word_count[char] = word_count.get(char, 0) + 1
        letter_counts.append(word_count)

    # Determine all distinct characters across all words
    all_chars = set()
    for word_count in letter_counts:
        all_chars.update(word_count.keys())

    common_characters = []
    for character in all_chars:
        # Check if the character is present in all words
        is_common = True
        for word_count in letter_counts:
            if character not in word_count:
                is_common = False
                break

        # If the character is in all words, find min count
        if is_common:

            # Find the minimum frequency of this char
            min_frequency = float('inf')
            for word_count in letter_counts:
                min_frequency = min(min_frequency, word_count[character])

            # Add the character to the result the appropriate number of times
            for _ in range(min_frequency):
                common_characters.append(character)

    return common_characters

Big(O) Analysis

Time Complexity
O(n*m)Let n be the number of words in the input array and m be the average length of a word. Step 1 involves iterating through each of the n words and, for each word, iterating up to m times to count the character frequencies. Step 2 and 3 effectively iterate through the alphabet (constant time since the alphabet size is fixed at 26) and, for each letter, check its count across all n words. Step 4 constructs the output list, which can take up to min(m) operations in the worst case, repeated for at most 26 characters, dominated by the initial frequency counting process. Therefore, the time complexity is primarily determined by iterating through all characters in all words, resulting in O(n*m).
Space Complexity
O(1)The algorithm uses a fixed-size array (or hash map) of size 26 to store character counts for each word. Regardless of the number of words (N) or the size of each word, the space used by these count arrays remains constant since it only depends on the number of lowercase English letters. Therefore, the auxiliary space complexity is O(1).

Edge Cases

CaseHow to Handle
Input array of strings is null or emptyReturn an empty list, as there are no strings to compare for common characters.
One or more input strings are null or emptyTreat an empty string as having no characters, thus the intersection with it will always be empty.
Input array contains only one stringReturn the unique characters in that single string as a list, since those are common to itself.
All input strings are identicalReturn a list of the unique characters in any one of the identical strings, maintaining their original frequency.
Input strings contain non-alphabetic characters (numbers, symbols, whitespace)Include only alphabetic characters in the resulting common characters, ignoring other characters in the strings.
One string contains a character repeated many times, but another string contains it only onceThe character should only appear in the result as many times as it appears in the string with the fewest occurrences of that character.
Input strings are very large (potential memory issues)Use a character counting approach (hash map/array) to avoid storing large intermediate strings and optimize space.
No common characters exist among all input stringsReturn an empty list, indicating no commonality.