Taro Logo

Validate IP Address

Medium
Asked by:
Profile picture
Profile picture
Profile picture
Profile picture
+6
39 views
Topics:
Strings

Given a string queryIP, return "IPv4" if IP is a valid IPv4 address, "IPv6" if IP is a valid IPv6 address or "Neither" if IP is not a correct IP of any type.

A valid IPv4 address is an IP in the form "x1.x2.x3.x4" where 0 <= xi <= 255 and xi cannot contain leading zeros. For example, "192.168.1.1" and "192.168.1.0" are valid IPv4 addresses while "192.168.01.1", "192.168.1.00", and "192.168@1.1" are invalid IPv4 addresses.

A valid IPv6 address is an IP in the form "x1:x2:x3:x4:x5:x6:x7:x8" where:

  • 1 <= xi.length <= 4
  • xi is a hexadecimal string which may contain digits, lowercase English letter ('a' to 'f') and upper-case English letters ('A' to 'F').
  • Leading zeros are allowed in xi.

For example, "2001:0db8:85a3:0000:0000:8a2e:0370:7334" and "2001:db8:85a3:0:0:8A2E:0370:7334" are valid IPv6 addresses, while "2001:0db8:85a3::8A2E:037j:7334" and "02001:0db8:85a3:0000:0000:8a2e:0370:7334" are invalid IPv6 addresses.

Example 1:

Input: queryIP = "172.16.254.1"
Output: "IPv4"
Explanation: This is a valid IPv4 address, return "IPv4".

Example 2:

Input: queryIP = "2001:0db8:85a3:0:0:8A2E:0370:7334"
Output: "IPv6"
Explanation: This is a valid IPv6 address, return "IPv6".

Example 3:

Input: queryIP = "256.256.256.256"
Output: "Neither"
Explanation: This is neither a IPv4 address nor a IPv6 address.

Constraints:

  • queryIP consists only of English letters, digits and the characters '.' and ':'.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Regarding the 'no leading zeros' rule for IPv4, just to confirm, is '0' considered a valid component by itself, whereas a component like '01' would be invalid?
  2. For the hexadecimal characters in an IPv6 address, should they be treated as case-insensitive? For example, is 'a' equivalent to 'A'?
  3. The definition for IPv6 mentions eight groups of hexadecimal digits. Does this validation need to account for abbreviated formats, such as the use of a double colon ('::') to represent consecutive groups of zeros?
  4. How should the validator handle strings with leading or trailing delimiters, like '.192.168.1.1', or consecutive delimiters, such as '192.168..1.1'?
  5. What is the expected output for an empty string or other malformed inputs that don't fit either pattern, such as a string containing both dots and colons?

Brute Force Solution

Approach

The brute force strategy involves testing the input string against two distinct sets of rules, one for each type of IP address. We first try to validate it as an IPv4 address by exhaustively checking every rule. If that fails, we then try to validate it as an IPv6 address, again by checking all of its specific rules.

Here's how the algorithm would work step-by-step:

  1. First, attempt to treat the string as an IPv4 address by trying to split it into pieces using the dot character.
  2. If you don't get exactly four pieces from the split, it cannot be an IPv4 address, so you move on to check for IPv6.
  3. If you do get four pieces, examine each one to make sure it follows all IPv4 rules: it must be a number between 0 and 255, and it cannot have extra zeros at the beginning unless the number itself is just zero.
  4. If all four pieces pass these checks, you've confirmed it's a valid IPv4 address.
  5. If the string failed the IPv4 check, you then attempt to treat it as an IPv6 address by splitting it into pieces using the colon character.
  6. If you don't get exactly eight pieces, it cannot be a valid IPv6 address.
  7. If you do get eight pieces, examine each one to make sure it follows all IPv6 rules: it must be one to four characters long, and every character must be a valid hexadecimal digit (0-9 or a-f).
  8. If all eight pieces pass these checks, you've confirmed it's a valid IPv6 address.
  9. If the string fails to satisfy the rules for both IPv4 and IPv6, then it is neither.

Code Implementation

class Solution:
    def validIPAddress(self, queryIP: str) -> str:
        ipv4_components = queryIP.split('.')

        # An IPv4 address must be composed of exactly four numerical components separated by dots.

        if len(ipv4_components) == 4:
            is_valid_ipv4_format = True
            for ipv4_component in ipv4_components:
                if not ipv4_component.isdigit():
                    is_valid_ipv4_format = False
                    break
                
                # Leading zeros are disallowed because they can be ambiguous, except for "0" itself.

                if len(ipv4_component) > 1 and ipv4_component[0] == '0':
                    is_valid_ipv4_format = False
                    break

                if not 0 <= int(ipv4_component) <= 255:
                    is_valid_ipv4_format = False
                    break
            
            if is_valid_ipv4_format:
                return "IPv4"

        ipv6_components = queryIP.split(':')

        # If IPv4 fails, check for IPv6, which has eight 1-4 digit hexadecimal components.

        if len(ipv6_components) == 8:
            is_valid_ipv6_format = True
            for ipv6_component in ipv6_components:
                if not (1 <= len(ipv6_component) <= 4):
                    is_valid_ipv6_format = False
                    break
                
                for hex_character in ipv6_component:
                    if hex_character not in "0123456789abcdefABCDEF":
                        is_valid_ipv6_format = False
                        break
                
                if not is_valid_ipv6_format:
                    break

            if is_valid_ipv6_format:
                return "IPv6"

        return "Neither"

Big(O) Analysis

Time Complexity
O(n) – Let n be the length of the input string. The dominant cost comes from splitting the string by a delimiter and then validating the resulting parts. Splitting the string requires a single pass over all n characters. Subsequently, validating each part involves checking every character within them, which, when summed up across all parts, is equivalent to another pass over the n characters. Since these operations are performed sequentially a constant number of times, the total operations are directly proportional to n, resulting in a linear time complexity of O(n).
Space Complexity
O(N) – The primary contributor to auxiliary space is the splitting operation described in the steps. When the input string of length N is split by either a dot or a colon, a new temporary list or array is created to store the resulting substrings or 'pieces'. The total size of all the characters stored in this new list is proportional to the length of the original input string. While a few variables are used for iteration, their space usage is constant and negligible, making the overall auxiliary space complexity O(N).

Optimal Solution

Approach

The approach is to first determine if the address looks like an IPv4 (with dots) or an IPv6 (with colons). Then, we break the address into its parts based on that separator and meticulously check if each part follows the strict rules for that specific address type.

Here's how the algorithm would work step-by-step:

  1. First, look for the separator character in the address. If it contains dots, we'll test it against IPv4 rules. If it contains colons, we'll test it against IPv6 rules. If it contains neither or both, it's invalid.
  2. To check for a valid IPv4 address, we first confirm that it's made of exactly four parts separated by dots.
  3. Then, we examine each of the four parts. Each part must be a number between 0 and 255, without any letters or symbols.
  4. We also make sure that numbers don't have extra zeros at the beginning. For example, '05' is not allowed, but '5' or '0' are fine. If all four parts pass these checks, the address is a valid IPv4.
  5. To check for a valid IPv6 address, we first confirm that it's made of exactly eight parts separated by colons.
  6. Next, we examine each of the eight parts. Each part must be between one and four characters long.
  7. Finally, every character in each of these parts must be a valid hexadecimal digit (0-9 or A-F, case doesn't matter). If all eight parts pass these checks, the address is a valid IPv6.
  8. If the address fails to meet all the rules for either type, we conclude that it is neither.

Code Implementation

class Solution:
    def validIPAddress(self, query_ip: str) -> str:
        # A '.' indicates a potential IPv4 address, so we attempt to validate it as such first.

        if '.' in query_ip:
            ipv4_parts = query_ip.split('.')
            if len(ipv4_parts) != 4:
                return "Neither"

            for part in ipv4_parts:
                # Each IPv4 segment must be a number from 0-255 and not have ambiguous leading zeros.

                if not part.isdigit():
                    return "Neither"
                
                if len(part) > 1 and part.startswith('0'):
                    return "Neither"
                
                if not 0 <= int(part) <= 255:
                    return "Neither"

            return "IPv4"

        # If it's not IPv4, a ':' indicates a potential IPv6 address.

        elif ':' in query_ip:
            ipv6_parts = query_ip.split(':')
            if len(ipv6_parts) != 8:
                return "Neither"
            
            valid_hex_characters = "0123456789abcdefABCDEF"
            for part in ipv6_parts:
                # Each IPv6 segment must be 1-4 valid hexadecimal characters.

                if not (1 <= len(part) <= 4):
                    return "Neither"
                
                for character in part:
                    if character not in valid_hex_characters:
                        return "Neither"

            return "IPv6"

        return "Neither"

Big(O) Analysis

Time Complexity
O(n) – The time complexity is determined by the length of the input string, let's call it n. The dominant cost comes from splitting the string into its numeric or hexadecimal parts based on the delimiter, which requires a single pass over the string. This splitting operation takes O(n) time. Following the split, the algorithm validates a constant number of parts, 4 for IPv4 or 8 for IPv6, each with a constant maximum length, making the validation itself an O(1) operation. Thus, the overall runtime is dictated by the initial O(n) split, simplifying the total complexity to O(n).
Space Complexity
O(N) – The dominant factor in space complexity comes from the step where the algorithm must 'break the address into its parts'. This action creates a new data structure, typically a list or an array, to hold the string components of the IP address after splitting by the separator. Let N be the length of the input IP string; the total size of all the created substring components is proportional to N. Other operations, such as iterating through these parts and checking their validity, use a constant amount of extra space, making the overall auxiliary space complexity linear with respect to the input string's length.

Edge Cases

CaseHow to Handle
String with leading, trailing, or consecutive delimiters (e.g., '.1.2.3.4', '1.2.3.4.', '1..2.3.4').Splitting the string by the delimiter will produce empty parts, which must be rejected by component validation checks.
An IPv4 component with a leading zero, such as '01', which is invalid.The solution must explicitly check if a component has a length greater than one and starts with '0'.
An incorrect number of components for either format (e.g., 3 for IPv4 or 9 for IPv6).The length of the array after splitting the string must be exactly 4 for IPv4 or 8 for IPv6.
An IPv4 component that is numerically outside the valid 0-255 range (e.g., '256' or '-1').After parsing a component as a number, its value must be checked to be within the inclusive [0, 255] range.
An IPv6 component with zero characters or more than four characters (e.g., 'fffff').Each component's length must be validated to be between 1 and 4 characters inclusive.
A component contains non-digit characters for IPv4 or non-hexadecimal characters for IPv6.The validation logic for each component must strictly enforce the allowed character sets.
Input string contains both '.' and ':' delimiters.A preliminary check can quickly identify and reject strings containing both types of separators.
The empty string or a string with just a delimiter as input.The solution should handle these gracefully by returning 'Neither' without causing errors.