Taro Logo

Valid Word Abbreviation

Easy
Meta logo
Meta
Topics:
StringsTwo Pointers

You are given a non-empty string word and an abbreviation abbr, return whether the string matches with the given abbreviation.

A string such as "word" contains only the following types of characters:

  • lowercase English letters

and abbreviation of it may be:

  • lowercase English letters
  • integers (representing the number of lowercase English letters bridged over in the original string)

Can you write a function that validates if a word can be abbreviated by the given abbreviation?

Example 1:

Input: word = "internationalization", abbr = "i12iz4n"
Output: true

Example 2:

Input: word = "apple", abbr = "a2e"
Output: false

Example 3:

Input: word = "a", abbr = "01"
Output: false

Explain your solution's time and space complexity. Make sure to handle edge cases such as leading zeros in the abbreviation.

Solution


Naive Approach

A straightforward approach is to iterate through both the word and the abbreviation, comparing characters and expanding the numbers in the abbreviation. This involves maintaining two pointers, one for the word and one for the abbreviation. When we encounter a digit in the abbreviation, we parse the full number, advance the word pointer by that number, and continue. If at any point the pointers go out of bounds or a character mismatch occurs (outside of number expansions), the function returns false.

Code (Naive)

def isValidWordAbbreviationNaive(word: str, abbr: str) -> bool:
    i, j = 0, 0
    while i < len(word) and j < len(abbr):
        if abbr[j].isdigit():
            if abbr[j] == '0': # Leading zero is invalid
                return False
            k = j
            while k < len(abbr) and abbr[k].isdigit():
                k += 1
            num = int(abbr[j:k])
            i += num
            j = k
        elif word[i] == abbr[j]:
            i += 1
            j += 1
        else:
            return False

    return i == len(word) and j == len(abbr)

Optimal Approach

This problem can be solved efficiently using a two-pointer approach. We iterate through the abbreviation string, and for each character, we check if it's a digit or a character. If it's a digit, we extract the complete number and increment the word index by that amount. If it's a character, we compare it with the current character in the word. If the characters don't match or the word index exceeds the word length, we return false. Finally, we check if the word index equals the word length.

Code (Optimal)

def isValidWordAbbreviation(word: str, abbr: str) -> bool:
    word_index = 0
    abbr_index = 0

    while abbr_index < len(abbr):
        if abbr[abbr_index].isdigit():
            if abbr[abbr_index] == '0':
                return False # No leading zeros

            num_start = abbr_index
            while abbr_index < len(abbr) and abbr[abbr_index].isdigit():
                abbr_index += 1

            num = int(abbr[num_start:abbr_index])
            word_index += num

        else:
            if word_index >= len(word) or word[word_index] != abbr[abbr_index]:
                return False
            word_index += 1
            abbr_index += 1

    return word_index == len(word)

Edge Cases:

  • Empty word and empty abbreviation: Should return true.
  • Empty word, non-empty abbreviation: Should return false if the abbreviation is not "0", otherwise true.
  • Non-empty word, empty abbreviation: Should return false.
  • Leading zeros in abbreviation numbers: Should return false.
  • Abbreviation number exceeds the word length.
  • Abbreviation only contains numbers.
  • Word only contains numbers (should still work).

Time and Space Complexity:

  • Time Complexity: O(N), where N is the length of the abbreviation. In the worst case, we iterate through the entire abbreviation string once.
  • Space Complexity: O(1). We use only a constant amount of extra space for variables, regardless of the input size.