Taro Logo

Find Words That Can Be Formed by Characters

Easy
Meta logo
Meta
1 view
Topics:
ArraysStrings

You are given an array of strings words and a string chars.

A string is good if it can be formed by characters from chars (each character can only be used once for each word in words).

Return the sum of lengths of all good strings in words.

For example:

words = ["cat","bt","hat","tree"], chars = "atach" Output: 6 Explanation: The strings that can be formed are "cat" and "hat" so the answer is 3 + 3 = 6.

words = ["hello","world","leetcode"], chars = "welldonehoneyr" Output: 10 Explanation: The strings that can be formed are "hello" and "world" so the answer is 5 + 5 = 10.

Could you please provide an efficient algorithm to solve this problem, considering time and space complexity?

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the constraints on the lengths of the `words` array and the `chars` string?
  2. Can `words` or `chars` be empty or null?
  3. Are the characters in `words` and `chars` guaranteed to be lowercase English letters?
  4. If a word can be formed multiple times from `chars`, should its length be counted multiple times or only once?
  5. What should be returned if no words in the `words` array can be formed from `chars`?

Brute Force Solution

Approach

The brute force method for this problem is all about checking every word individually to see if it can be formed using the provided characters. We examine each word and compare its letters against the available characters to determine if it's a valid word that can be formed.

Here's how the algorithm would work step-by-step:

  1. Take the first word from the list of words.
  2. Go through each letter in that word one by one.
  3. For each letter, check if that letter is available in the set of characters we can use.
  4. If a letter is not available, then the word cannot be formed; so, reject that word.
  5. If all letters in the word are available (and we potentially remove each used character from our available characters as we go), then that word can be formed using the characters.
  6. Record the length of the words which can be formed.
  7. Repeat these steps for every word in the list of words.

Code Implementation

def find_words_that_can_be_formed_by_characters_brute_force(words, characters):
    total_length_of_valid_words = 0
    for word in words:
        # Check if the current word can be formed
        if can_form_word(word, characters):
            total_length_of_valid_words += len(word)

    return total_length_of_valid_words

def can_form_word(word, characters):
    available_characters = list(characters)
    for char in word:
        # Need to find and remove the character from available chars
        if char in available_characters:
            available_characters.remove(char)
        else:
            # If not found, word can't be formed
            return False

    return True

Big(O) Analysis

Time Complexity
O(N * M)Let N be the number of words in the input list `words`, and let M be the maximum length of a word in `words`. The algorithm iterates through each of the N words. For each word, it iterates through each of its characters, which takes at most M steps. Therefore, in the worst-case scenario, the algorithm performs approximately N * M operations to check all the words and their characters, resulting in a time complexity of O(N * M).
Space Complexity
O(1)The described algorithm iterates through each word and its letters, checking character availability. It might implicitly modify the available characters (removing used ones), but the plain English description doesn't require storing any additional data structures dependent on the input size to facilitate this check. The only memory used is for potentially a few temporary variables needed for iteration, which remains constant regardless of the input words or available characters, implying O(1) auxiliary space.

Optimal Solution

Approach

We want to determine which words can be made using only the characters from a given source string. The clever approach here is to count the characters we have available, and then check if each word can be formed using those counts, avoiding unnecessary string manipulations.

Here's how the algorithm would work step-by-step:

  1. First, count the number of times each character appears in the source string of available characters.
  2. Next, for each word in the list of words, do the following:
  3. Count the number of times each character appears in the current word.
  4. Compare the character counts of the word to the character counts of the available characters.
  5. If the word requires more of a specific character than is available in the source string, then we know this word cannot be formed, and we move on to the next word.
  6. If the word can be formed entirely from the available characters, then we add its length to a running total.
  7. After processing all the words, return the total length of the words that can be formed.

Code Implementation

def find_words_that_can_be_formed_by_characters(words, characters):
    character_counts = {}
    for char in characters:
        character_counts[char] = character_counts.get(char, 0) + 1

    total_length = 0

    for word in words:
        word_counts = {}
        can_form = True
        for char in word:
            word_counts[char] = word_counts.get(char, 0) + 1

        # Check if the word can be formed
        # using the available characters.
        for char, count in word_counts.items():
            if char not in character_counts or character_counts[char] < count:
                can_form = False
                break

        if can_form:
            # Adds the length of the word
            # to the total length if it can be formed.
            total_length += len(word)

    return total_length

Big(O) Analysis

Time Complexity
O(n + m * k)First, we count character frequencies in the available characters string, which takes O(n) time, where n is the length of the available characters string. Then, for each word in the list of words (m words), we count character frequencies (k, the length of the word) and compare them to the available character counts. Thus, the inner loop takes O(k) time. So, the total time complexity will be O(n + m * k), where n is the length of the available characters string, m is the number of words, and k is the average length of the words.
Space Complexity
O(1)The algorithm primarily uses two dictionaries to store character counts: one for the source string and one for each word. The size of these dictionaries is bounded by the size of the alphabet, which is constant (e.g., 26 for lowercase English letters). Therefore, the auxiliary space used is independent of the input size (number of words or the length of the source string) and remains constant. This fixed amount of memory makes the space complexity O(1).

Edge Cases

CaseHow to Handle
words is null or emptyReturn 0 as no words can be formed if the input list is empty or null.
chars is null or emptyReturn 0, because no word can be formed if no characters are available.
words contains empty stringsAn empty string has length 0 and should be included in the sum if chars is not empty.
chars contains repeated charactersThe solution should correctly count the occurrences of each character in chars.
A word in words contains characters not present in charsThe word should not be included in the sum.
A word contains a character that appears more times than in charsThe word should not be included in the sum.
words array contains duplicate wordsEach word is assessed independently so duplicates do not affect correctness and the lengths are correctly summed without special handling.
words and chars contain only one type of character (e.g. 'a')The frequency of 'a' in each word compared to 'a' in chars is verified to ensure correctness.