
Uncommon Words from Two Sentences

#1595 Most Asked · Easy
Topics: Arrays, Strings

A sentence is a string of single-space separated words where each word consists only of lowercase letters.

A word is uncommon if it appears exactly once in one of the sentences, and does not appear in the other sentence.

Given two sentences s1 and s2, return a list of all the uncommon words. You may return the answer in any order.

Example 1:

Input: s1 = "this apple is sweet", s2 = "this apple is sour"

Output: ["sweet","sour"]

Explanation:

The word "sweet" appears only in s1, while the word "sour" appears only in s2.

Example 2:

Input: s1 = "apple apple", s2 = "banana"

Output: ["banana"]

Constraints:

  • 1 <= s1.length, s2.length <= 200
  • s1 and s2 consist of lowercase English letters and spaces.
  • s1 and s2 do not have leading or trailing spaces.
  • All the words in s1 and s2 are separated by a single space.

Solution


Clarifying Questions

When you get asked this question in a real interview, the prompt will often be ambiguous (especially at FAANG). Make sure to ask questions like these in that case:

  1. Are the input sentences case-sensitive, or should I treat them as case-insensitive?
  2. Can the input sentences be empty or contain only whitespace?
  3. What characters are considered word separators (e.g., spaces, punctuation)?
  4. If a word appears in both sentences but only once in each, is it considered uncommon?
  5. If there are multiple valid lists of uncommon words, is the order of words in the output important?

Brute Force Solution

Approach

The brute force method identifies uncommon words by meticulously checking each word in both sentences. We compare each word against every other word to see if it appears only once across the two sentences. This exhaustive comparison guarantees we find all uncommon words.

Here's how the algorithm would work step-by-step:

  1. First, take the first sentence and look at each word in it, one at a time.
  2. For each word in the first sentence, check if it appears anywhere else in the first sentence.
  3. Also, check if it appears anywhere in the second sentence.
  4. If the word appears only once and not anywhere else in either sentence, remember it as a potential uncommon word.
  5. Next, take the second sentence and repeat the same process. Look at each word in the second sentence, one at a time.
  6. For each word in the second sentence, check if it appears anywhere else in the second sentence.
  7. Also, check if it appears anywhere in the first sentence.
  8. If the word appears only once and not anywhere else in either sentence, remember it as a potential uncommon word.
  9. Finally, combine all the potential uncommon words you found from both sentences. This will be your list of uncommon words.

Code Implementation

def uncommon_from_sentences(first_sentence, second_sentence):
    first_sentence_words = first_sentence.split()
    second_sentence_words = second_sentence.split()
    uncommon_words = []

    for word_from_first in first_sentence_words:
        # Count occurrences of the word within the first sentence itself
        occurrences_in_first = 0
        for another_word_first in first_sentence_words:
            if word_from_first == another_word_first:
                occurrences_in_first += 1

        is_uncommon = occurrences_in_first == 1

        # Checking against the second sentence
        if is_uncommon:
            for word_from_second in second_sentence_words:
                if word_from_first == word_from_second:
                    is_uncommon = False
                    break

        if is_uncommon:
            uncommon_words.append(word_from_first)

    for word_from_second in second_sentence_words:
        # Count occurrences of the word within the second sentence itself
        occurrences_in_second = 0
        for another_word_second in second_sentence_words:
            if word_from_second == another_word_second:
                occurrences_in_second += 1

        is_uncommon = occurrences_in_second == 1

        # Checking against the first sentence
        if is_uncommon:
            for word_from_first in first_sentence_words:
                if word_from_second == word_from_first:
                    is_uncommon = False
                    break

        if is_uncommon:
            uncommon_words.append(word_from_second)

    # A qualifying word appears exactly once overall, so no
    # deduplication pass is needed before returning.
    return uncommon_words
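
For a quick sanity check, the function can be run against the problem's two examples (the output order may differ, since results are collected sentence by sentence):

print(uncommon_from_sentences("this apple is sweet", "this apple is sour"))
# Expected, in some order: ['sweet', 'sour']

print(uncommon_from_sentences("apple apple", "banana"))
# Expected: ['banana']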

Big O Analysis

Time Complexity
O(n²). For each of the n words in the first sentence, the algorithm scans that same sentence (up to n comparisons) and the second sentence (up to n more comparisons), and the same work is repeated for the n words of the second sentence. That is roughly 2 · n · 2n = 4n² comparisons, which simplifies to O(n²).
Space Complexity
O(n). The only auxiliary structure is the list of uncommon words. In the worst case, where every word is unique across both sentences, this list holds all of the words, so the space required grows linearly with the total number of words n and the auxiliary space complexity is O(n).

Optimal Solution

Approach

The best way to solve this is to count how often each word appears in both sentences. Then, we can easily identify words that appear exactly once in one of the sentences and not at all in the other sentence.

Here's how the algorithm would work step-by-step:

  1. First, take both sentences and split them up into individual words.
  2. Next, count how many times each word appears in the first sentence.
  3. Do the same and count how many times each word appears in the second sentence.
  4. Now, go through the words from the first sentence. If a word appears only once in the first sentence AND it doesn't appear at all in the second sentence, then it's an uncommon word and we add it to our list.
  5. Then, do the same with the words from the second sentence. If a word appears only once in the second sentence AND it doesn't appear at all in the first sentence, we add it to our list of uncommon words.
  6. Finally, we have a list of all the uncommon words that meet our criteria.

Code Implementation

def find_uncommon_words(sentence1, sentence2):
    words_sentence1 = sentence1.split()
    words_sentence2 = sentence2.split()

    # Count how many times each word appears in each sentence
    frequency_sentence1 = {}
    for word in words_sentence1:
        frequency_sentence1[word] = frequency_sentence1.get(word, 0) + 1

    frequency_sentence2 = {}
    for word in words_sentence2:
        frequency_sentence2[word] = frequency_sentence2.get(word, 0) + 1

    uncommon_words = []

    # Words that appear exactly once in sentence1 and never in sentence2
    for word in words_sentence1:
        if frequency_sentence1[word] == 1 and word not in frequency_sentence2:
            uncommon_words.append(word)

    # Words that appear exactly once in sentence2 and never in sentence1
    for word in words_sentence2:
        if frequency_sentence2[word] == 1 and word not in frequency_sentence1:
            uncommon_words.append(word)

    return uncommon_words
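
The same frequency-counting idea can also be expressed more compactly with Python's collections.Counter. This is a minimal alternative sketch, not the implementation above; find_uncommon_words_counter is a name introduced here for illustration:

from collections import Counter

def find_uncommon_words_counter(sentence1, sentence2):
    # A word is uncommon exactly when it appears once in the combined
    # word list of both sentences, so summing the two counters suffices.
    combined_counts = Counter(sentence1.split()) + Counter(sentence2.split())
    return [word for word, count in combined_counts.items() if count == 1]

Summing the counters works because a word that appears exactly once in one sentence and never in the other appears exactly once overall, and vice versa.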

Big O Analysis

Time Complexity
O(n). Let n be the total number of words across both sentences. Splitting the sentences into words takes O(n) time, counting word frequencies with the hash maps takes O(n) time, and the final passes that identify uncommon words also take O(n) time. The overall time complexity is therefore O(n).
Space Complexity
O(n). The two hash maps of word counts hold at most one entry per distinct word, and in the worst case (all words unique) they grow in proportion to the total number of words n. The output list is at most the same size, so the auxiliary space complexity is O(n).

Edge Cases

One or both input sentences are empty strings
How to Handle:
split() on an empty string produces no words, so an empty sentence simply contributes nothing; words that appear exactly once in the other sentence are still returned, and two empty sentences yield an empty list.
Input sentences contain leading/trailing whitespace
How to Handle:
Python's split() with no arguments ignores leading and trailing whitespace, so no explicit trimming is needed; otherwise, strip each sentence before splitting.
Input sentences contain multiple spaces between words
How to Handle:
split() with no arguments also collapses runs of spaces, so each word is counted correctly without extra normalization.
Input sentences contain punctuation marks
How to Handle:
Remove or replace punctuation marks before processing words to avoid counting 'word!' differently from 'word'.
All words are common to both sentences
How to Handle:
The result will be an empty list since no uncommon words exist.
One or both sentences contain very long words
How to Handle:
The constraints cap each sentence at 200 characters, so even the longest possible word is small; the counting approach handles long words without any special casing.
Sentences with extremely large number of words (scalability)
How to Handle:
Using a hash map for word counts allows for efficient O(n) time complexity even with large sentences.
Case sensitivity (same word with different capitalization)
How to Handle:
Convert all words to lowercase before counting so that differently capitalized spellings are treated as the same word (see the preprocessing sketch after this list).
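
If the input were messier than the constraints guarantee (mixed case, punctuation, stray whitespace), a small normalization pass before counting would cover several of the edge cases above. This is a sketch under those assumptions; normalize_sentence is a hypothetical helper, not part of the problem statement:

import string

def normalize_sentence(sentence):
    # Hypothetical preprocessing: lowercase the text, strip punctuation,
    # and let split() collapse any repeated or surrounding whitespace.
    cleaned = sentence.lower().translate(str.maketrans('', '', string.punctuation))
    return cleaned.split()

print(normalize_sentence("  This apple, is SWEET!  "))
# ['this', 'apple', 'is', 'sweet']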