Taro Logo

Longest Word in Dictionary

Medium
Pinterest logo
Pinterest
0 views
Topics:
ArraysStrings

Given an array of strings words representing an English Dictionary, return the longest word in words that can be built one character at a time by other words in words.

If there is more than one possible answer, return the longest word with the smallest lexicographical order. If there is no answer, return the empty string.

Note that the word should be built from left to right with each additional character being added to the end of a previous word. 

Example 1:

Input: words = ["w","wo","wor","worl","world"]
Output: "world"
Explanation: The word "world" can be built one character at a time by "w", "wo", "wor", and "worl".

Example 2:

Input: words = ["a","banana","app","appl","ap","apply","apple"]
Output: "apple"
Explanation: Both "apply" and "apple" can be built from other words in the dictionary. However, "apple" is lexicographically smaller than "apply".

Constraints:

  • 1 <= words.length <= 1000
  • 1 <= words[i].length <= 30
  • words[i] consists of lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can the input dictionary contain empty strings or null values?
  2. If there are multiple longest words with the same length, which one should I return (e.g., the lexicographically smallest, the first one encountered, or is it arbitrary)?
  3. Are the words in the dictionary guaranteed to contain only lowercase English letters, or might there be other characters (uppercase, numbers, special symbols)?
  4. Is the dictionary guaranteed to be sorted in any way?
  5. Is it possible for the input dictionary to be empty, and if so, what should I return in that case?

Brute Force Solution

Approach

The brute force approach to finding the longest word that can be built from other words involves checking every single word in the dictionary to see if it meets the criteria. We'll examine each word and methodically determine if it can be constructed by combining other words from the provided list. This is done by exhaustively checking all possible combinations of words.

Here's how the algorithm would work step-by-step:

  1. Start by picking the first word in the list of words.
  2. Check if the first word can be formed by combining other words in the list.
  3. To do this, try every possible combination of words from the dictionary and see if any of those combinations exactly match our first word.
  4. If you find a combination that matches, mark the first word as 'buildable'.
  5. If you don't find a matching combination after trying everything, mark the first word as 'not buildable'.
  6. Repeat this process for every single word in the list.
  7. After checking all the words, find the longest 'buildable' word.
  8. If there are multiple 'buildable' words of the same maximum length, choose the one that comes first alphabetically.

Code Implementation

def longest_word_in_dictionary_brute_force(words):
    longest_buildable_word = ""

    for current_word in words:
        if can_be_built(current_word, words):
            #Check if this buildable word is longer or same length but lexicographically smaller

            if len(current_word) > len(longest_buildable_word) or \
               (len(current_word) == len(longest_buildable_word) and current_word < longest_buildable_word):
                longest_buildable_word = current_word

    return longest_buildable_word

def can_be_built(target_word, words):
    if target_word == "":
        return True

    #Iterate through all possible word combinations
    for i in range(1 << len(words)):
        combined_word = ""
        for j in range(len(words)):
            if (i >> j) & 1:
                combined_word += words[j]

        #Check if any combination equals target
        if combined_word == target_word:
            # If we're comparing the word against itself, we must exclude it from the list to check
            if target_word not in words or words.count(target_word) > 1:
                return True
            else:
                words_copy = words.copy()
                words_copy.remove(target_word)
                if target_word in words_copy and words_copy.count(target_word) > 0:
                    return True
                else:
                    continue

    return False

Big(O) Analysis

Time Complexity
O(n^(m+1))The brute force approach iterates through each of the n words in the input list. For each word, it attempts to construct it by combining other words. In the worst case, each word can be constructed by combining at most m other words from the dictionary. Therefore, we need to check all possible combinations of the words, and the number of possible combinations can grow exponentially with m. In addition, for each combination of m words we generate, we are comparing it with the target word of length m as well; if m is proportional to n, we can consider string comparisons to be n time. Thus, the time complexity can be expressed as O(n * n^m). Considering the construction part (n^m) + the outer loop that goes through all the words again (n), the total complexity can be thought of as O(n^(m+1)).
Space Complexity
O(N^M)The described brute force approach involves checking all possible combinations of words to form each target word. Let N be the number of words in the dictionary, and M be the maximum length of a word. For each word in the dictionary, we explore combinations of other words. In the worst case, the number of combinations we could generate is exponential, and storing all these combinations could require O(N^M) auxiliary space, where we are potentially storing a combination of nearly all words in the dictionary to compare to a single target word. Moreover, the 'buildable' status of each word could be stored in a boolean array of size N, however, the dominant term will be the space required to store the combinations. Thus, the space complexity is dominated by generating and storing possible combinations of words.

Optimal Solution

Approach

The best way to find the longest word from a dictionary that can be built one character at a time is to first organize the dictionary and then search strategically. We'll sort the words to make checking easier and then build our potential longest word character by character.

Here's how the algorithm would work step-by-step:

  1. First, put all the words in the dictionary in alphabetical order. If two words have the same length, the one that comes earlier alphabetically should be prioritized.
  2. Start by looking at the shortest words in the dictionary. A word has to be able to be built from shorter words.
  3. Check if each word can be formed by adding just one letter to another word already found. If it can, we know it's a valid word.
  4. Keep track of the longest valid word found so far.
  5. Continue checking words, building them character by character from shorter words, until we've checked every word in the dictionary.
  6. The longest valid word we found is the answer.

Code Implementation

def longest_word(dictionary):
    dictionary.sort(key=lambda word: (len(word), word))
    longest_word_found = ""
    valid_words = set()

    for current_word in dictionary:
        # Words of length 1 are inherently valid.
        if len(current_word) == 1:
            valid_words.add(current_word)
            if len(current_word) > len(longest_word_found):
                longest_word_found = current_word

        else:
            # Check if the word can be built from a shorter word.
            prefix = current_word[:-1]
            if prefix in valid_words:
                valid_words.add(current_word)

                # Update the longest word if applicable.
                if len(current_word) > len(longest_word_found):
                    longest_word_found = current_word

    return longest_word_found

Big(O) Analysis

Time Complexity
O(n log n)Sorting the input list of words, where n is the number of words, takes O(n log n) time. We then iterate through the sorted list of words. For each word, we check if it can be formed by adding one character to a previous valid word. The operation to check if a word is formable from a previously found set of words takes O(k) time in the worst case (where k is the length of the word). Since we've already sorted the list by length, the number of 'previous' words we need to check will grow with the length of each new word. Assuming the average word length is less than log n (which is reasonable for a dictionary), the dominant term remains the initial sorting. Therefore, the overall time complexity is approximated by the sorting step which is O(n log n).
Space Complexity
O(N)The algorithm uses a set (or similar data structure) to keep track of the valid words that can be built one character at a time. In the worst-case scenario, all the words in the dictionary could be valid, meaning the set might store all N words, where N is the number of words in the dictionary. Therefore, the space used by the algorithm is proportional to the number of words in the dictionary, resulting in O(N) space complexity.

Edge Cases

CaseHow to Handle
Empty dictionary arrayReturn an empty string immediately as there are no words to process.
Dictionary contains only empty stringsReturn the empty string, as it's the longest valid word constructible.
No word in dictionary can be built from other words (all letters mixed up)Return an empty string, indicating no valid word can be constructed.
Dictionary contains only one wordIf the word has length 1, return it, else check if its prefixes exist (which they don't), return ''.
Dictionary contains extremely long wordsEnsure the solution doesn't cause stack overflow errors due to excessive recursion by utilizing iterative methods.
Dictionary contains words with non-alphabetic charactersPreprocess the dictionary to remove or ignore words containing non-alphabetic characters (or throw an error).
Multiple words in dictionary are of same maximum length and validReturn the lexicographically smallest word among all valid words of the maximum length.
Integer overflow when calculating the word lengthUse appropriate data types or length limits to prevent potential integer overflow when storing word length.