Taro Logo

Simplify Path

Medium
Capital One logo
Capital One
1 view
Topics:
StringsStacks

You are given an absolute path for a Unix-style file system, which always begins with a slash '/'. Your task is to transform this absolute path into its simplified canonical path.

The rules of a Unix-style file system are as follows:

  • A single period '.' represents the current directory.
  • A double period '..' represents the previous/parent directory.
  • Multiple consecutive slashes such as '//' and '///' are treated as a single slash '/'.
  • Any sequence of periods that does not match the rules above should be treated as a valid directory or file name. For example, '...' and '....' are valid directory or file names.

The simplified canonical path should follow these rules:

  • The path must start with a single slash '/'.
  • Directories within the path must be separated by exactly one slash '/'.
  • The path must not end with a slash '/', unless it is the root directory.
  • The path must not have any single or double periods ('.' and '..') used to denote current or parent directories.

Return the simplified canonical path.

Example 1:

Input: path = "/home/"

Output: "/home"

Explanation:

The trailing slash should be removed.

Example 2:

Input: path = "/home//foo/"

Output: "/home/foo"

Explanation:

Multiple consecutive slashes are replaced by a single one.

Example 3:

Input: path = "/home/user/Documents/../Pictures"

Output: "/home/user/Pictures"

Explanation:

A double period ".." refers to the directory up a level (the parent directory).

Example 4:

Input: path = "/../"

Output: "/"

Explanation:

Going one level up from the root directory is not possible.

Example 5:

Input: path = "/.../a/../b/c/../d/./"

Output: "/.../b/d"

Explanation:

"..." is a valid name for a directory in this problem.

Constraints:

  • 1 <= path.length <= 3000
  • path consists of English letters, digits, period '.', slash '/' or '_'.
  • path is a valid absolute Unix path.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. Can the input path contain any characters other than alphanumeric characters, '/', '.', and potentially spaces? If so, how should I handle them?
  2. Are there any constraints on the length of the input path string?
  3. If the simplified path resolves to the root directory, should I return just "/" or something else?
  4. How should I handle consecutive slashes in the input path (e.g., "//home//foo")?
  5. Is the input path guaranteed to be a valid absolute path, or could it contain invalid sequences of '..' that navigate above the root directory?

Brute Force Solution

Approach

The brute force method for simplifying a file path involves exploring all possible interpretations of the given path. We systematically go through each part of the path, considering all options for handling special components like '.', '..', and multiple slashes. This approach explores every possible path transformation until a simplified version is found.

Here's how the algorithm would work step-by-step:

  1. First, split the given path string into individual parts separated by slashes.
  2. Then, go through these parts one by one.
  3. For each part, consider it as a normal directory name, a 'go to parent directory' command ('..'), a 'stay in the current directory' command ('.') or an empty part (meaning extra slashes).
  4. If it's a normal directory, add it to a list of directories representing our current path.
  5. If it's '..', go back one level in the list if possible, effectively removing the last directory added. If we are already at the root, do nothing.
  6. If it's '.' or an empty part, ignore it.
  7. After processing all parts, construct the simplified path by joining the directories in our list with slashes.
  8. Make sure the simplified path starts with a single slash.

Code Implementation

def simplify_path(path):    path_parts = path.split('/')
    stack = []

    for part in path_parts:
        # Ignore empty strings or current directory references
        if part == '' or part == '.':
            continue

        # Handle moving up a directory level
        if part == '..':
            if stack:
                stack.pop()

        # Add the directory to the stack
        else:
            stack.append(part)

    # Construct the simplified path
    simplified_path = '/' + '/'.join(stack)

    # Ensure the path is not just '/', which should represent the root
    if not simplified_path:
        return '/'

    return simplified_path

Big(O) Analysis

Time Complexity
O(n)The algorithm first splits the input path string of length n into parts, which takes O(n) time. Then, it iterates through each part of the path. The operations inside the loop, such as adding to a list or removing from a list, take O(1) time on average. Finally, constructing the simplified path from the list of directories also takes O(n) time since each directory is appended to a string. Therefore, the overall time complexity is O(n) + O(n) + O(n), which simplifies to O(n).
Space Complexity
O(N)The dominant space usage comes from the list of directories created to represent the current path. In the worst-case scenario, where the input path contains N directory names without any '..' components, the list will grow to store all N directory names. Therefore, the auxiliary space used is proportional to the number of directory names in the input path. Consequently, the space complexity is O(N), where N is the number of parts in the input path string after splitting by '/'.

Optimal Solution

Approach

The goal is to clean up a file path, like removing extra slashes and going back directories. We can use a simple way to keep track of only the important parts of the path and ignore the rest. This approach will give us the shortest, most correct path.

Here's how the algorithm would work step-by-step:

  1. First, split the given path into separate parts wherever there's a slash.
  2. Create an empty place to store the important directory names.
  3. Go through each part of the split path.
  4. If a part is just a dot (meaning 'stay in the current directory'), ignore it.
  5. If a part is two dots (meaning 'go up one directory'), check if you have any directories stored. If you do, remove the last one.
  6. If a part is a normal directory name, add it to the storage.
  7. After checking all parts, create the final cleaned path by joining the stored directory names with slashes. Always start with a slash at the beginning.

Code Implementation

def simplify_path(path): 
    path_components = path.split('/')
    stack_of_directories = []

    for component in path_components:
        if component == '' or component == '.':
            continue

        if component == '..':
            # Handle '..' by popping from the stack if it's not empty
            if stack_of_directories:
                stack_of_directories.pop()

        else:
            # Add valid directory names to the stack
            stack_of_directories.append(component)

    # Construct the simplified path
    simplified_path = '/' + '/'.join(stack_of_directories)

    return simplified_path

Big(O) Analysis

Time Complexity
O(n)The algorithm first splits the path string into a list of components based on the delimiter '/'. This operation takes O(n) time, where n is the length of the input path string. The algorithm then iterates through this list of components once. Inside the loop, operations like checking the component's value ('.', '..', or a directory name) and adding/removing elements to/from a stack or list take constant time, O(1). Since the loop iterates through all n components, the total time complexity of the loop is O(n). Finally, joining the elements from the stack with '/' takes O(m) time where m is the number of directories in the result, and in the worst case m will be proportional to n, so this operation can be considered O(n) as well. Therefore, the overall time complexity is dominated by the splitting and iteration steps, resulting in O(n).
Space Complexity
O(N)The solution uses a list to store the directory names of the simplified path. In the worst-case scenario, the input path contains N directory names that all need to be stored in this list. Therefore, the space required for the list grows linearly with the input size N, where N represents the number of components after splitting the initial path string. Consequently, the auxiliary space complexity is O(N).

Edge Cases

CaseHow to Handle
Null or empty input path stringReturn a single forward slash '/' representing the root directory.
Path containing only '.'Return a single forward slash '/' representing the root directory, as '.' means current directory.
Path containing only '..'Return a single forward slash '/' representing the root directory, as '..' from root stays at root.
Path with multiple consecutive slashes '//'Treat multiple consecutive slashes as a single slash, ensuring no empty directory names are pushed to stack.
Path ending with '/.'Ignore the '/.' at the end, effectively removing it during stack processing.
Path with mixed valid directory names, '.' and '..'Process each component sequentially, updating the stack appropriately for directory names, '.' and '..'.
Path with deeply nested directories and many '..' to move upThe stack ensures correct navigation to parent directories, even when many levels deep.
Path with trailing slashes after processingRemove trailing slashes from the final constructed string before returning to adhere to required format.