You are given a non-empty string word
and an abbreviation abbr
, return whether the string matches with the given abbreviation.
A string such as "word" contains only the following types of characters:
and abbreviation of it may be:
Can you write a function that validates if a word can be abbreviated by the given abbreviation?
Example 1:
Input: word = "internationalization", abbr = "i12iz4n"
Output: true
Example 2:
Input: word = "apple", abbr = "a2e"
Output: false
Example 3:
Input: word = "a", abbr = "01"
Output: false
Explain your solution's time and space complexity. Make sure to handle edge cases such as leading zeros in the abbreviation.
When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The brute force approach to checking if an abbreviation is valid involves systematically comparing the abbreviation to the full word. We expand the abbreviation and character-by-character compare it to the word. If they match, the abbreviation is deemed valid, otherwise its not.
Here's how the algorithm would work step-by-step:
def is_valid_word_abbreviation(word, abbreviation):
word_index = 0
abbreviation_index = 0
while abbreviation_index < len(abbreviation) and word_index < len(word):
if abbreviation[abbreviation_index].isdigit():
# Handle consecutive digits by accumulating the number
number_of_skips = 0
while abbreviation_index < len(abbreviation) and abbreviation[abbreviation_index].isdigit():
if abbreviation[abbreviation_index] == '0' and number_of_skips == 0:
return False
number_of_skips = number_of_skips * 10 + int(abbreviation[abbreviation_index])
abbreviation_index += 1
# Skip characters in the word based on the parsed number
word_index += number_of_skips
else:
# Compare characters if the abbreviation contains a character
if word[word_index] != abbreviation[abbreviation_index]:
return False
word_index += 1
abbreviation_index += 1
# Check if we've reached the end of both strings
return word_index == len(word) and abbreviation_index == len(abbreviation)
The problem requires checking if an abbreviation correctly matches a given word. The efficient strategy involves comparing the word and the abbreviation sequentially, consuming characters from both as we go. Numerical segments in the abbreviation indicate a skip in the word.
Here's how the algorithm would work step-by-step:
def is_valid_word_abbreviation(word, abbreviation):
word_index = 0
abbreviation_index = 0
while abbreviation_index < len(abbreviation) and word_index < len(word):
if abbreviation[abbreviation_index].isdigit():
if abbreviation[abbreviation_index] == '0':
return False
number_of_skips = 0
# Build the number from consecutive digits
while abbreviation_index < len(abbreviation) and abbreviation[abbreviation_index].isdigit():
number_of_skips = number_of_skips * 10 + int(abbreviation[abbreviation_index])
abbreviation_index += 1
# Advance word index by number. Necessary to 'skip' characters.
word_index += number_of_skips
else:
# Characters must match.
if word[word_index] != abbreviation[abbreviation_index]:
return False
word_index += 1
abbreviation_index += 1
# Both strings must be exhausted simultaneously.
return word_index == len(word) and abbreviation_index == len(abbreviation)
Case | How to Handle |
---|---|
Null or empty word | Return false immediately if the word is null or empty, since there's nothing to abbreviate. |
Null or empty abbreviation | Return false immediately if the abbreviation is null or empty, unless the word is also empty, then return true. |
Word is shorter than abbreviation | Return false since the abbreviation cannot match a shorter word. |
Abbreviation contains invalid characters (non-numeric/non-alphabetic) | Return false if invalid character in abbreviation or throw an exception for robustness. |
Abbreviation starts with zero followed by other digits (e.g., '02') | Return false since numbers with leading zeros are typically considered invalid. |
Abbreviation ends with a number | Carefully handle string indices to avoid out-of-bounds exception when processing the last numeric sequence. |
Abbreviation contains a number larger than the remaining word length | Return false, since that number of characters will go out of bounds. |
Very long strings to avoid potential integer overflow when parsing numbers in the abbreviation | Use long data type to store the parsed number if concerned about int overflow. |