Taro Logo

Unique Morse Code Words

Easy
Asked by:
Profile picture
Profile picture
Profile picture
37 views
Topics:
ArraysStrings

International Morse Code defines a standard encoding where each letter is mapped to a series of dots and dashes, as follows:

  • 'a' maps to ".-",
  • 'b' maps to "-...",
  • 'c' maps to "-.-.", and so on.

For convenience, the full table for the 26 letters of the English alphabet is given below:

[".-","-...","-.-.","-..",".","..-.","--.","....","..",".---","-.-",".-..","--","-.","---",".--.","--.-",".-.","...","-","..-","...-",".--","-..-","-.--","--.."]

Given an array of strings words where each word can be written as a concatenation of the Morse code of each letter.

  • For example, "cab" can be written as "-.-..--...", which is the concatenation of "-.-.", ".-", and "-...". We will call such a concatenation the transformation of a word.

Return the number of different transformations among all words we have.

Example 1:

Input: words = ["gin","zen","gig","msg"]
Output: 2
Explanation: The transformation of each word is:
"gin" -> "--...-."
"zen" -> "--...-."
"gig" -> "--...--."
"msg" -> "--...--."
There are 2 different transformations: "--...-." and "--...--.".

Example 2:

Input: words = ["a"]
Output: 1

Constraints:

  • 1 <= words.length <= 100
  • 1 <= words[i].length <= 12
  • words[i] consists of lowercase English letters.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the maximum length of each word in the input array, and what is the maximum number of words?
  2. Can the input array contain empty strings or null values?
  3. Should the output be case-sensitive; that is, should 'abc' and 'Abc' be considered the same word for the purposes of Morse code transformation?
  4. Are all input words guaranteed to contain only lowercase English letters?
  5. If the input array is empty, what should the return value be?

Brute Force Solution

Approach

The brute force strategy for this problem is to translate each word into its Morse code representation, then compare all the resulting Morse code sequences. We want to identify how many unique Morse code translations we end up with after processing all the words.

Here's how the algorithm would work step-by-step:

  1. First, for each word in the given list, find its Morse code translation by looking up each letter's Morse code equivalent.
  2. Then, put all the resulting Morse code translations into a collection.
  3. Finally, count how many unique Morse code sequences are in that collection, and that will be your answer.

Code Implementation

def unique_morse_representations(words):
    morse_code = [".-","-...","-.-.","-..",".","..-","--.","....","..",".---","-.-",".-..","--","-.","---",".--.","--.-",".-.","...","-","..-","...-",".--","-..-","-.--","--.."]
    morse_translations = set()

    for word in words:
        morse_word = ""

        # Convert each word to its morse code equivalent
        for letter in word:
            morse_word += morse_code[ord(letter) - ord('a')]

        # Store the morse code representation
        morse_translations.add(morse_word)

    # Return the count of unique morse code representations
    return len(morse_translations)

Big(O) Analysis

Time Complexity
O(n*m)Let n be the number of words in the input array, and let m be the average length of a word. The outer loop iterates through each of the n words. Inside this loop, translating a single word to Morse code takes O(m) time, where m is the length of the word, because we iterate through each character of the word. The process of adding a Morse code translation to a set takes, on average, O(1) time, but in the worst-case it may take O(m) to compute the hash, giving a total cost of O(n*m) since the translation and set insertion are performed for each of the n words. Then, determining the number of unique Morse code sequences from a collection requires iterating through all n words, with each comparison taking up to m time. The total runtime is therefore O(n*m).
Space Complexity
O(N)The auxiliary space is dominated by the collection used to store the Morse code translations of each word. In the worst case, all N words in the input array will translate to unique Morse code sequences, requiring the collection to store N strings. The space used by this collection is therefore proportional to the number of input words. Thus, the space complexity is O(N).

Optimal Solution

Approach

The key to solving this problem efficiently is to transform each word into its Morse code representation and then count how many different Morse code sequences we've generated. We can achieve this by using a collection to keep track of all the unique transformations we encounter.

Here's how the algorithm would work step-by-step:

  1. Create a translation table that links each letter of the alphabet to its Morse code equivalent.
  2. For each word in the input list, transform it into its Morse code representation by looking up each letter in the translation table and combining the results.
  3. Add each generated Morse code representation to a collection that automatically only stores unique items. This will avoid double-counting identical transformations.
  4. Once you've processed all the words, the number of items in your unique collection is the answer.

Code Implementation

def uniqueMorseRepresentations(words):
    morse_code_table = [".-","-...","-.-.","-..",".","..-.","--.",
                        "....","..",".---","-.-",".-..","--","-.",
                        "---",".--.","--.-",".-.","...","-","..-",
                        "...-",".--","-..-","-.--","--.."]
    unique_transformations = set()

    for word in words:
        morse_transformation = ""
        # Build Morse code translation for each word
        for letter in word:
            morse_transformation += morse_code_table[ord(letter) - ord('a')]
        
        # Only store unique transformations.
        unique_transformations.add(morse_transformation)

    # Count the number of unique Morse code transformations
    return len(unique_transformations)

Big(O) Analysis

Time Complexity
O(N * L)Let N be the number of words in the input list and L be the maximum length of a word. The algorithm iterates through each of the N words. For each word, it translates each character into its Morse code representation. Since the length of each word is at most L, translating a single word takes O(L) time. Adding to a set takes, on average, O(1) time. Therefore, the overall time complexity is dominated by iterating through the words and translating them to Morse code, resulting in O(N * L).
Space Complexity
O(N)The dominant space complexity comes from storing the unique Morse code transformations in a collection (e.g., a set). In the worst-case scenario, all N words in the input array `words` transform into unique Morse code representations. Therefore, the collection could potentially store N distinct Morse code strings, leading to a space complexity of O(N), where N is the number of words in the input array. The translation table itself has a fixed size (26 characters) and doesn't scale with the input, so it contributes O(1) space. The temporary Morse code string for each word is also bounded by the length of the longest word, but the unique Morse code set is the primary contributor to space complexity.

Edge Cases

Null or empty input array
How to Handle:
Return 0, indicating no unique Morse code representations.
Input array with a single word
How to Handle:
Return 1, as a single word will always have a unique Morse code representation.
Input array with words that map to the same Morse code
How to Handle:
The set will only store unique Morse code transformations, avoiding duplicates.
Very long words that might cause string building issues in Morse conversion
How to Handle:
Ensure StringBuilder or equivalent is used to avoid quadratic time complexity for string concatenation.
Words containing characters outside the standard English alphabet (a-z)
How to Handle:
Handle non-alphabetic characters by either ignoring them or throwing an IllegalArgumentException.
Large input array size leading to memory constraints with large sets
How to Handle:
Consider memory usage of the set and optimize for space if necessary using alternative data structures.
Input contains mixed case words which will affect result
How to Handle:
Convert all words to lowercase before Morse code conversion to ensure correct matching.
Empty string as one of the words in the input array
How to Handle:
Treat empty string as a special case, and either ignore or map it to a predefined morse code (e.g., empty string).