You're given a sorted list of words words
from an alien language, and you want to determine the order of the characters in that language. The words are sorted lexicographically according to the rules of the alien language. Assume that the alien language only uses lowercase English letters.
For example, if you're given the list ["wrt", "wrf", "er", "ett", "rftt"]
, the correct order of the characters would be "wertf"
. From the input, we can deduce the following:
w
comes before e
r
comes before e
e
comes before t
r
comes before f
Another example: ["z", "x"]
should result in "zx"
.
However, there are some edge cases to consider. What if the input is ["abc", "ab"]
? In this case, the input is invalid since "abc" is lexicographically greater than "ab", so you should return an empty string.
Your goal is to write a function that takes a list of strings words
and returns a string representing the order of characters in the alien language. If the order cannot be determined or the input is invalid, return an empty string. The function should be as efficient as possible.
When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The alien dictionary problem asks us to figure out the order of letters in an alien language based on a list of words. A brute force approach involves trying out every possible order of letters and checking if that order is consistent with the given list of words.
Here's how the algorithm would work step-by-step:
import itertools
def alien_dictionary_brute_force(words):
unique_characters = set(''.join(words))
possible_orders = list(itertools.permutations(unique_characters))
for alphabet_order in possible_orders:
alphabet_map = {character: index for index, character in enumerate(alphabet_order)}
is_valid_order = True
# Compare adjacent words to determine validity
for i in range(len(words) - 1):
word1 = words[i]
word2 = words[i + 1]
min_length = min(len(word1), len(word2))
for j in range(min_length):
if word1[j] != word2[j]:
# Found a difference, check alphabet order
if alphabet_map[word1[j]] > alphabet_map[word2[j]]:
is_valid_order = False
break
else:
break
else:
# If word1 is longer than word2 and all preceding chars are same, invalid
if len(word1) > len(word2):
is_valid_order = False
if not is_valid_order:
break
# If a valid order is found, return letters in the order
if is_valid_order:
return ''.join(alphabet_order)
return ""
The alien dictionary problem is about finding the order of letters in an alien language, given a sorted list of words in that language. The optimal strategy involves building a graph representing letter dependencies and then using that graph to determine the order of letters.
Here's how the algorithm would work step-by-step:
def alien_order(words):
# Build the graph of letter dependencies.
adjacency_list = {character: [] for word in words for character in word}
in_degree = {character: 0 for character in adjacency_list}
for i in range(len(words) - 1):
word1 = words[i]
word2 = words[i + 1]
min_length = min(len(word1), len(word2))
if len(word1) > len(word2) and word1[:min_length] == word2:
return ""
for j in range(min_length):
if word1[j] != word2[j]:
if word2[j] not in adjacency_list[word1[j]]:
adjacency_list[word1[j]].append(word2[j])
in_degree[word2[j]] += 1
break
# Add nodes with in_degree of zero to the queue
queue = [character for character in in_degree if in_degree[character] == 0]
result = []
while queue:
character = queue.pop(0)
result.append(character)
# Iterate through neighbors
for neighbor in adjacency_list[character]:
in_degree[neighbor] -= 1
# If in-degree becomes 0, add to queue
if in_degree[neighbor] == 0:
queue.append(neighbor)
# Check for cycles in the graph.
if len(result) != len(in_degree):
return ""
return "".join(result)
Case | How to Handle |
---|---|
Empty input list of words | Return an empty string since no order can be determined. |
List with a single word | Return the unique characters in the single word, preserving order. |
Words with no discernible order (e.g., ['abc', 'bac']) | The algorithm should detect a cycle in the graph and return an empty string. |
Conflicting order (e.g., ['ab', 'ba']) leading to a cycle | The algorithm should detect a cycle in the graph and return an empty string. |
Words contain characters outside the lowercase English alphabet | Throw an exception or filter out invalid characters to prevent errors and ensure correct ordering is determined from only valid alphabet characters. |
Maximum length of the input list leading to high memory usage for adjacency list | Ensure that the adjacency list implementation scales efficiently to avoid exceeding memory limits by considering using appropriate data structures and optimize memory allocation. |
A character only appears at the end of the words and does not have dependency | Make sure topological sort still includes character without any dependencies. |
Inconsistent ordering: 'abc', 'ab' | This implies 'c' < '', which is invalid and constitutes a cycle, so return an empty string. |