Taro Logo

Remove Sub-Folders from the Filesystem

Medium
3 views
a month ago

Given a list of folders folder, return the folders after removing all sub-folders in those folders. You may return the answer in any order. If a folder[i] is located within another folder[j], it is called a sub-folder of it. A sub-folder of folder[j] must start with folder[j], followed by a /. For example, /a/b is a sub-folder of /a, but /b is not a sub-folder of /a/b/c. The format of a path is one or more concatenated strings of the form: / followed by one or more lowercase English letters. For example, /leetcode and /leetcode/problems are valid paths while an empty string and / are not.

Example 1: Input: folder = [/a,/a/b,/c/d,/c/d/e,/c/f] Output: [/a,/c/d,/c/f]

Example 2: Input: folder = [/a,/a/b/c,/a/b/d] Output: [/a]

Example 3: Input: folder = [/a/b/c,/a/b/ca,/a/b/d] Output: [/a/b/c,/a/b/ca,/a/b/d]

Sample Answer
class Solution:
    def removeSubfolders(self, folder: List[str]) -> List[str]:
        folder.sort()
        result = [folder[0]]
        for i in range(1, len(folder)):
            if not folder[i].startswith(result[-1] + "/"):
                result.append(folder[i])
        return result

Explanation:

  1. Sort the folders: Sorting the folders lexicographically ensures that a parent folder always appears before its subfolders in the sorted list. This is crucial for the algorithm's efficiency.
  2. Initialize the result: Start with the first folder in the sorted list as the initial element in the result list. This folder is guaranteed to not be a subfolder of any previous folder since it's the first.
  3. Iterate through the remaining folders:
    • For each folder, check if it's a subfolder of the last folder added to the result list.
    • A folder is considered a subfolder if it starts with the last folder in result followed by a /. This condition ensures that it's a direct descendant.
    • If the current folder is not a subfolder, add it to the result list.
  4. Return the result: After iterating through all the folders, the result list contains the folders that are not subfolders of any other folder.

Example:

Consider the input folder = ["/a", "/a/b", "/c/d", "/c/d/e", "/c/f"].

  1. Sort: folder becomes ["/a", "/a/b", "/c/d", "/c/d/e", "/c/f"].
  2. Initialize: result = ["/a"].
  3. Iterate:
    • "/a/b" starts with "/a/", so it's a subfolder. Skip.
    • "/c/d" does not start with "/a/", so add it to result. result = ["/a", "/c/d"].
    • "/c/d/e" starts with "/c/d/", so it's a subfolder. Skip.
    • "/c/f" does not start with "/c/d/", so add it to result. result = ["/a", "/c/d", "/c/f"].
  4. Return: result is ["/a", "/c/d", "/c/f"].

Time and Space Complexity

  • Time Complexity: O(N log N + N * K), where N is the number of folders and K is the average length of the folder strings. The N log N term comes from sorting the folders, and the N * K term comes from iterating through the folders and checking if one is a subfolder of another (the startswith operation).
  • Space Complexity: O(N), where N is the number of folders. This is because, in the worst case, all folders might be distinct and added to the result list. The sorting operation might also take up O(N) space depending on the sorting algorithm used by the programming language.