Taro Logo

Validate IP Address

Medium
Nvidia logo
Nvidia
2 views
Topics:
Strings

Given a string queryIP, determine if it's a valid IPv4 or IPv6 address. Return "IPv4" if it's a valid IPv4 address, "IPv6" if it's a valid IPv6 address, or "Neither" if it's neither. Here are the requirements for each IP address type:

IPv4: An IPv4 address is in the form "x1.x2.x3.x4", where:

  1. 0 <= xi <= 255
  2. xi cannot contain leading zeros.

Examples:

  • "192.168.1.1" is valid.
  • "192.168.01.1" is invalid (leading zero).
  • "256.256.256.256" is invalid (value > 255).

IPv6: An IPv6 address is in the form "x1:x2:x3:x4:x5:x6:x7:x8", where:

  1. 1 <= xi.length <= 4
  2. xi is a hexadecimal string (digits, 'a'-'f', 'A'-'F').
  3. Leading zeros are allowed.

Examples:

  • "2001:0db8:85a3:0000:0000:8a2e:0370:7334" is valid.
  • "2001:db8:85a3:0:0:8A2E:0370:7334" is valid.
  • "2001:0db8:85a3::8A2E:037j:7334" is invalid (invalid character).
  • "02001:0db8:85a3:0000:0000:8a2e:0370:7334" is invalid (invalid format).

Could you implement a function to solve this problem?

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What constitutes a valid IP address format beyond the basic four numbers separated by dots? Are there any restrictions on leading zeros, number ranges (e.g., must each number be between 0 and 255), or whitespace?
  2. Should I only consider IPv4 addresses, or should I also handle IPv6 addresses? If both, how should I differentiate between them in the input string?
  3. What should I return if the input string is empty or null? Should I throw an exception, return a specific error code, or return a boolean value indicating validity (false)?
  4. Are there any specific allowed characters besides digits and dots? Should I expect and handle cases with leading/trailing whitespace or other unexpected characters?
  5. If the IP address is invalid, should I return a boolean (false) or a more descriptive error message (e.g., a string describing the validation failure)?

Brute Force Solution

Approach

The brute force way to validate an IP address is to check every possible format and value allowed. We essentially try everything until we either find a valid IP or determine it's impossible. It’s like trying every combination lock sequence until it opens.

Here's how the algorithm would work step-by-step:

  1. First, check if the input string contains the correct number of separators (dots) to be a valid IP address; it should have exactly three.
  2. Next, split the string by the dots. Now you have four parts that might represent numbers.
  3. For each part, examine it closely to make sure it contains only digits and nothing else.
  4. If it does contain something other than a digit, then this is not a valid IP address.
  5. Also make sure that none of these parts are empty.
  6. For each numerical part, check if it's within the valid range; it must be between 0 and 255.
  7. If any part is outside this range, the IP address is invalid.
  8. Finally, ensure that none of the parts have leading zeros unless the part is simply '0' itself. So '01' is invalid, but '0' is valid.
  9. If all of these checks pass, then we can confidently say the IP address is valid; otherwise, it is invalid.

Code Implementation

def validate_ip_address(ip_address):
    number_of_periods = ip_address.count('.')
    # Need exactly 3 periods for a valid IPv4 address
    if number_of_periods != 3:
        return False

    parts = ip_address.split('.')

    if len(parts) != 4:
        return False

    for part in parts:
        # Empty parts are invalid
        if not part:
            return False

        if not part.isdigit():
            return False

        # Convert the part to an integer for range checking
        try:
            numeric_part = int(part)
        except ValueError:
            return False

        # Check if the part is within the valid range (0-255)
        if numeric_part < 0 or numeric_part > 255:
            return False

        # Leading zeros are invalid, unless the part is just '0'
        if len(part) > 1 and part[0] == '0':
            return False

    return True

Big(O) Analysis

Time Complexity
O(n)The algorithm splits the input string of length n (where n is the length of the IP address string) into four parts based on the dots. After the split, it iterates through these four parts. Within the loop, checks are done on each part: digit validation, range validation, and leading zero check. These individual checks are proportional to the length of the individual number string, which is bounded by the original string length, but are limited to four parts (at most), so these are effectively constant time operations. Therefore, the time complexity is dominated by the initial split operation, making it O(n).
Space Complexity
O(1)The algorithm primarily uses a fixed number of variables to store the split IP address parts. The space required for these variables (typically four strings or integers depending on implementation details, but a fixed number) does not scale with the size of the input IP address string (N). Therefore, the auxiliary space complexity is constant.

Optimal Solution

Approach

To validate an IP address efficiently, we need to break down the address into its components and check if each part conforms to the rules. Instead of writing many nested if statements, it's better to make focused checks and return immediately if something is wrong.

Here's how the algorithm would work step-by-step:

  1. First, see if the input is formatted like an IPv4 address (four numbers separated by dots) or an IPv6 address (groups of characters separated by colons). If neither, it's invalid.
  2. If it's IPv4-like, split the string into four parts using the dots as dividers.
  3. For each part, check if it is a number between 0 and 255. If not, it's invalid. Also, check that there aren't leading zeros, unless the number is just zero itself.
  4. If it's IPv6-like, split the string into parts using colons.
  5. Check that there are at most 8 parts in the IPv6 address. If more, it's invalid.
  6. For each IPv6 part, check if it's made of valid hexadecimal characters (numbers 0-9 and letters a-f or A-F) and is not longer than four characters.
  7. Handle the special case of consecutive colons (::) which can represent a group of zeroed-out parts. Ensure there's only one instance of :: and adjust the count accordingly.
  8. If all parts pass the checks, and the overall structure matches either IPv4 or IPv6 standards, then the IP address is valid.

Code Implementation

def validate_ip_address(ip_address):    if '.' in ip_address:
        return validate_ipv4(ip_address)
    elif ':' in ip_address:
        return validate_ipv6(ip_address)
    else:
        return "Neither"

def validate_ipv4(ip_address):    address_segments = ip_address.split('.')
    if len(address_segments) != 4:
        return "Neither"

    for segment in address_segments:
        if not segment.isdigit():
            return "Neither"
        
        # Check if the segment has leading zeros and is not just zero
        if len(segment) > 1 and segment[0] == '0':
            return "Neither"
        
        try:
            numeric_segment = int(segment)
            if numeric_segment < 0 or numeric_segment > 255:
                return "Neither"
        except ValueError:
            return "Neither"
            
    return "IPv4"

def validate_ipv6(ip_address):    address_segments = ip_address.split(':')

    # Check for too many segments. The maximum number of segments is 8
    if len(address_segments) > 8:
        return "Neither"

    double_colon_found = False
    for segment in address_segments:
        if not segment:
            if double_colon_found:
                return "Neither" # Multiple :: are invalid
            double_colon_found = True
            continue
        
        if len(segment) > 4:
            return "Neither" # Max length of segment is 4

        # Check if all characters are valid hexadecimal
        try:
            int(segment, 16)
        except ValueError:
            return "Neither"

    # Check if the number of segments is less than 8, :: must be present
    if len(address_segments) < 8 and not double_colon_found:
        return "Neither"
        
    return "IPv6"

Big(O) Analysis

Time Complexity
O(1)The input string is first split into either IPv4 or IPv6 components. For IPv4, splitting the string and validating each of the four numerical components takes constant time. For IPv6, splitting and validating up to 8 hexadecimal groups also takes constant time, as the maximum number of groups and the maximum length of each group (4 characters) are fixed. Therefore, the overall time complexity remains constant regardless of the input IP address.
Space Complexity
O(N)The primary auxiliary space comes from splitting the input string into parts based on delimiters (dots or colons). In the worst-case IPv6 scenario, the input string may contain N characters, and splitting it could result in a list of strings each of length N in the worst case which will be discarded after computation. Therefore the space complexity is O(N), where N is the length of the input string.

Edge Cases

CaseHow to Handle
Null or empty string inputReturn false immediately as an empty string cannot be a valid IP address.
String with leading/trailing whitespaceTrim the input string to remove leading/trailing whitespace before validation.
String with multiple consecutive dotsReturn false, as consecutive dots indicate invalid segment separation.
Segments with leading zeros (e.g., '01', '00')Return false if any segment has leading zeros, unless it is a single '0'.
Segments outside the range [0, 255]Return false if any segment is less than 0 or greater than 255.
String with non-numeric charactersReturn false if the string contains any characters other than digits and dots.
String with more or fewer than 4 segmentsReturn false if the string has more or fewer than four segments separated by dots.
Integer overflow when parsing segmentsEnsure that integer parsing handles values that might cause overflow, potentially using a try-catch block or checking the string length of each segment before parsing.