Taro Logo

Verifying an Alien Dictionary

Easy
Amazon logo
Amazon
Topics:
ArraysStringsTwo Pointers

You are given a list of words that are sorted based on an alien language. You are also given a string order which describes the ordering of the letters in this alien language. The alien language only uses lowercase English letters, but the ordering is different.

Determine if the given list of words is sorted lexicographically according to the alien language's order. In other words, verify that the words appear in the correct order, considering the custom letter ordering.

For example:

  • words = ["hello","leetcode"], order = "hlabcdefgijkmnopqrstuvwxyz" Output: true Explanation: 'h' comes before 'l' in this language, so the sequence is sorted.

  • words = ["word","world","row"], order = "worldabcefghijkmnpqstuvxyz" Output: false Explanation: 'd' comes after 'l' in this language, hence the sequence is unsorted.

  • words = ["apple","app"], order = "abcdefghijklmnopqrstuvwxyz" Output: false Explanation: The first three characters "app" match, and the second string is shorter. According to lexicographical rules "apple" > "app", because 'l' > '' (empty string).

Consider these edge cases:

  • What if the input words list is empty?
  • What if the words contain duplicate entries?
  • How should you handle words of different lengths?

Can you provide an efficient algorithm to solve this problem?

Solution


Naive Solution

A brute force solution would involve comparing each pair of adjacent words in the given words array based on the provided alien order. We can iterate through the words, and for each pair, compare the characters one by one based on their index in the order string. If we find a case where the first word is lexicographically greater than the second, we return false. If we reach the end of the words without finding any such violations, we return true.

Code (Python)

def isAlienSorted_naive(words, order):
    for i in range(len(words) - 1):
        word1, word2 = words[i], words[i+1]
        min_len = min(len(word1), len(word2))
        
        for j in range(min_len):
            idx1, idx2 = order.index(word1[j]), order.index(word2[j])
            if idx1 > idx2:
                return False
            elif idx1 < idx2:
                break # words are in correct order, move to next pair
        else:
            # if we reached the end of the shorter word, check prefixes
            if len(word1) > len(word2):
                return False
    return True

Time and Space Complexity (Naive)

  • Time Complexity: O(N * M), where N is the number of words and M is the maximum length of a word. In the worst case, we might have to compare each character of every word.
  • Space Complexity: O(1). We are using only a constant amount of extra space.

Optimal Solution

To optimize the solution, we can create a mapping of each character to its index in the order string. This allows for O(1) lookup time when comparing characters. The rest of the logic remains the same as the naive solution.

Code (Python)

def isAlienSorted(words, order):
    order_map = {char: index for index, char in enumerate(order)}

    for i in range(len(words) - 1):
        word1, word2 = words[i], words[i+1]
        min_len = min(len(word1), len(word2))
        
        for j in range(min_len):
            idx1, idx2 = order_map[word1[j]], order_map[word2[j]]
            if idx1 > idx2:
                return False
            elif idx1 < idx2:
                break # words are in correct order, move to next pair
        else:
            # if we reached the end of the shorter word, check prefixes
            if len(word1) > len(word2):
                return False
    return True

Time and Space Complexity (Optimal)

  • Time Complexity: O(N * M), where N is the number of words and M is the maximum length of a word. Creating the order_map takes O(1) because order length is fixed.
  • Space Complexity: O(1). The order_map stores 26 characters, which is constant space.

Edge Cases

  • Empty input: The code handles empty input gracefully, as it iterates through the words array, which will be empty.
  • Words with different lengths: The code correctly handles words of different lengths by comparing the prefixes and considering the shorter word as a prefix of the longer word.
  • Identical words: If the words are identical, the inner loop completes without triggering the if idx1 > idx2 condition, and the outer loop continues to the next pair, which is the desired behavior.
  • Invalid Characters: Assumes only lowercase english letters. The prompt doesn't specifically talk about handling invalid characters, but the code would throw an exception if there are any invalid characters.