Taro Logo

Verifying an Alien Dictionary

Easy
Google logo
Google
1 view
Topics:
ArraysStringsTwo Pointers

In an alien language, surprisingly, they also use English lowercase letters, but possibly in a different order. The order of the alphabet is some permutation of lowercase letters.

Given a sequence of words written in the alien language, and the order of the alphabet, determine if the given words are sorted lexicographically in this alien language.

Example 1:

words = ["hello","leetcode"], order = "hlabcdefgijkmnopqrstuvwxyz"
Output: true
Explanation: As 'h' comes before 'l' in this language, then the sequence is sorted.

Example 2:

words = ["word","world","row"], order = "worldabcefghijkmnpqstuvxyz"
Output: false
Explanation: As 'd' comes after 'l' in this language, then words[0] > words[1], hence the sequence is unsorted.

Example 3:

words = ["apple","app"], order = "abcdefghijklmnopqrstuvwxyz"
Output: false
Explanation: The first three characters "app" match, and the second string is shorter (in size.) According to lexicographical rules "apple" > "app", because 'l' > '', where '' is defined as the blank character which is less than any other character.

Write a function to solve this problem. The function should take a list of strings words and a string order as input and return a boolean value indicating whether the words are sorted lexicographically according to the given order.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of a word in the `words` array, and what is the maximum number of words in the `words` array?
  2. Can the `order` string contain any characters that are not present in the words? Should I assume all characters in `words` are in `order`?
  3. Is the `order` string guaranteed to contain all 26 lowercase English letters, or could it be a subset of the alphabet?
  4. If two words are identical up to the length of the shorter word, and one word is shorter than the other, how should that be treated in terms of the alien lexicographical order?
  5. What should I return if the input `words` array is empty or contains only one word?

Brute Force Solution

Approach

We are given words that are supposed to be in order based on a strange alien alphabet, and we want to check if they really are. The brute force way to do this is to directly compare each pair of adjacent words to make sure they follow the alphabet's rules.

Here's how the algorithm would work step-by-step:

  1. Go through the words one at a time, looking at each word and the word that comes right after it.
  2. For each pair of words, compare the letters in the same position, going from left to right.
  3. If the letters are different, see which letter comes first in the alien alphabet. If the first word's letter should come after the second word's letter, then the words are in the wrong order, and you can stop.
  4. If all the letters are the same up to the end of one of the words, then the shorter word should come first. If it doesn't, then the words are in the wrong order, and you can stop.
  5. Keep doing this for all the pairs of words. If you get through all the pairs without finding any words out of order, then the words are correctly sorted according to the alien alphabet.

Code Implementation

def is_alien_sorted(words, order):
    order_map = {char: index for index, char in enumerate(order)}

    def compare_words(word1, word2):
        minimum_length = min(len(word1), len(word2))

        # Compare letters up to min length
        for index in range(minimum_length):
            char1 = word1[index]
            char2 = word2[index]

            if char1 != char2:
                # Check alien alphabet order
                if order_map[char1] > order_map[char2]:

                    return False

                return True

        # Handle cases where one word is a prefix of the other
        if len(word1) > len(word2):

            return False

        return True

    # Iterate through each pair of adjacent words
    for i in range(len(words) - 1):

        # Verify that the current word comes before the next
        if not compare_words(words[i], words[i + 1]):
            return False

    return True

Big(O) Analysis

Time Complexity
O(N * M)The primary operation involves iterating through pairs of adjacent words to compare them based on the alien alphabet. Let N be the number of words and M be the maximum length of a word. We iterate through N-1 pairs of words. For each pair, we compare the characters until we find a difference or reach the end of the shorter word, taking at most M comparisons. Therefore, the time complexity is O((N-1) * M), which simplifies to O(N * M).
Space Complexity
O(1)The algorithm iterates through the input words and compares characters within each word. It uses a constant amount of extra space to store the order of the alien alphabet, which can be represented as a fixed-size array or hash map, independent of the number of words or the length of the words. The comparison process involves a few index variables for iterating, but these take up constant space. Therefore, the auxiliary space complexity is O(1), indicating constant space usage.

Optimal Solution

Approach

We need to figure out if a list of words is sorted according to a strange alien alphabet. The key idea is to check pairs of adjacent words to see if they're in the correct order based on the given alien alphabet.

Here's how the algorithm would work step-by-step:

  1. First, create a quick lookup to know the alien order of each letter. Think of it like creating a secret code key.
  2. Then, go through the list of words, comparing each word to the one that comes after it.
  3. When comparing two words, look at their letters one by one, until you find a difference or one word runs out of letters.
  4. If the first difference you find is in the wrong order according to the alien alphabet, then the words are not in the correct order, and you can stop.
  5. If one word is a prefix of the other (like 'apple' and 'app'), and the shorter word comes later in the list, then the words are not sorted correctly.
  6. If you make it through comparing all the words without finding any mistakes, then the list is sorted correctly.

Code Implementation

def isAlienSorted(words, order):
    alien_order_map = {char: index for index, char in enumerate(order)}

    # Compare each adjacent pair of words
    for i in range(len(words) - 1):
        word1 = words[i]
        word2 = words[i+1]
        
        min_length = min(len(word1), len(word2))

        # Iterate through letters until a diff is found
        for j in range(min_length):
            if word1[j] != word2[j]:
                # Check if order is incorrect
                if alien_order_map[word1[j]] > alien_order_map[word2[j]]:
                    return False
                break
        else:
            # Check if word1 is a prefix of word2
            # If word1 is longer, then the order is wrong
            if len(word1) > len(word2):
                return False

    return True

Big(O) Analysis

Time Complexity
O(C * N)Let N be the number of words in the input list `words`, and let C be the maximum length of a word in `words`. The algorithm iterates through the list of words once, comparing adjacent pairs. The comparison of each pair of words involves iterating through the characters of the words until a difference is found or one word is exhausted. In the worst case, the algorithm compares all characters of the shorter word. Therefore, the time complexity for comparing two words is O(C). Since we compare N-1 pairs of words, the overall time complexity is O((N-1) * C), which simplifies to O(C * N).
Space Complexity
O(1)The algorithm creates a lookup table to store the alien alphabet order. Since the size of any alphabet is limited to a maximum of 26 characters, this lookup table consumes constant space, regardless of the number of words. The algorithm then iterates through the input words and compares them, requiring only a few variables to track the current indices during comparisons. Therefore, the auxiliary space used is constant, independent of the input word list size.

Edge Cases

CaseHow to Handle
words is null or emptyReturn true if words is null or empty, as an empty set of words is considered sorted.
order is null or emptyIf order is null or empty and words is not, words can only be sorted if it contains only one word; otherwise it is false.
words contains an empty stringAn empty string is considered lexicographically smaller than any non-empty string, so it should be handled correctly in the comparison.
order contains duplicate charactersThe order string should be considered invalid, and the program should likely return an error or false.
words are all identicalIdentical words should be considered sorted, so the comparison function should return true.
One word is a prefix of another (e.g., 'apple' and 'app')The shorter word should come before the longer word; if the shorter word is lexicographically *after* the longer word, return false.
Large input size (many words and/or long words)The solution should iterate efficiently through the words and characters without excessive memory usage, ideally using a dictionary (hashmap) for character lookups in O(1) time.
Words contain characters not in the order stringThese characters are considered invalid and should result in a false return or error condition, as the ordering cannot be determined.