
Subdomain Visit Count

Difficulty: Medium
Company: Roblox
Topics: Arrays, Strings

A website domain "discuss.leetcode.com" consists of various subdomains. At the top level, we have "com", at the next level, we have "leetcode.com" and at the lowest level, "discuss.leetcode.com". When we visit a domain like "discuss.leetcode.com", we will also visit the parent domains "leetcode.com" and "com" implicitly.
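To make the implicit visits concrete, here is a minimal sketch that lists a domain together with every parent domain it implies. The function name `implied_domains` is hypothetical and not part of the problem statement.

```python
def implied_domains(domain: str) -> list[str]:
    """Return the domain itself followed by each of its parent domains."""
    labels = domain.split(".")
    # Joining the labels from position i onward yields one suffix per level.
    return [".".join(labels[i:]) for i in range(len(labels))]


print(implied_domains("discuss.leetcode.com"))
# ['discuss.leetcode.com', 'leetcode.com', 'com']
```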

A count-paired domain is a domain that has one of the two formats "rep d1.d2.d3" or "rep d1.d2" where rep is the number of visits to the domain and d1.d2.d3 is the domain itself.

  • For example, "9001 discuss.leetcode.com" is a count-paired domain that indicates that discuss.leetcode.com was visited 9001 times.
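As an illustration, a count-paired domain can be split into its two parts at the single space. This sketch assumes well-formed input as described above; `parse_cpdomain` is a hypothetical helper name.

```python
def parse_cpdomain(cpdomain: str) -> tuple[int, str]:
    """Split 'rep domain' into an integer visit count and the domain string."""
    # Assumption: exactly one space separates the count from the domain.
    rep, domain = cpdomain.split(" ", 1)
    return int(rep), domain


print(parse_cpdomain("9001 discuss.leetcode.com"))
# (9001, 'discuss.leetcode.com')
```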

Given an array of count-paired domains cpdomains, return an array of the count-paired domains of each subdomain in the input. You may return the answer in any order.

Example 1:

Input: cpdomains = ["9001 discuss.leetcode.com"]
Output: ["9001 leetcode.com","9001 discuss.leetcode.com","9001 com"]
Explanation: We only have one website domain: "discuss.leetcode.com".
As discussed above, the subdomains "leetcode.com" and "com" will also be visited, so all three will be visited 9001 times.

Example 2:

Input: cpdomains = ["900 google.mail.com", "50 yahoo.com", "1 intel.mail.com", "5 wiki.org"]
Output: ["901 mail.com","50 yahoo.com","900 google.mail.com","5 wiki.org","5 org","1 intel.mail.com","951 com"]
Explanation: We will visit "google.mail.com" 900 times, "yahoo.com" 50 times, "intel.mail.com" once and "wiki.org" 5 times.
For the subdomains, we will visit "mail.com" 900 + 1 = 901 times, "com" 900 + 50 + 1 = 951 times, and "org" 5 times.

Constraints:

  • 1 <= cpdomains.length <= 100
  • 1 <= cpdomains[i].length <= 100
  • cpdomains[i] follows either the "rep_i d1_i.d2_i.d3_i" format or the "rep_i d1_i.d2_i" format.
  • rep_i is an integer in the range [1, 10^4].
  • d1_i, d2_i, and d3_i consist of lowercase English letters.

Solution
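One common way to approach this, sketched here rather than presented as the reference solution, is to tally visits in a hash map keyed by subdomain: for each entry, add its count to the full domain and to every parent domain. The function name `subdomain_visits` is an assumption, and the sketch relies on the input being well-formed as the constraints promise.

```python
from collections import Counter


def subdomain_visits(cpdomains: list[str]) -> list[str]:
    """Sum visit counts for every domain and each of its parent domains."""
    counts = Counter()
    for cpdomain in cpdomains:
        rep, domain = cpdomain.split(" ")
        visits = int(rep)
        labels = domain.split(".")
        # Credit the visits to each suffix: d1.d2.d3, then d2.d3, then d3.
        for i in range(len(labels)):
            counts[".".join(labels[i:])] += visits
    # Rebuild count-paired strings; the problem allows any order.
    return [f"{visits} {domain}" for domain, visits in counts.items()]


result = subdomain_visits(
    ["900 google.mail.com", "50 yahoo.com", "1 intel.mail.com", "5 wiki.org"]
)
print(sorted(result))
```

Each domain has at most three labels, so the work per entry is bounded and the total running time is linear in the combined length of the input strings. Because the output order is unspecified, comparing sorted lists is a convenient way to check the result against the examples.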


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What is the expected range for the count values in each count-paired domain string?
  2. Can the input array `cpdomains` be empty or contain null/empty strings?
  3. What is the expected format for the returned list of strings? Is the order of the domains important?
  4. If a subdomain appears multiple times in the input with different counts, should the counts be summed?
  5. Are the domain names guaranteed to be valid, or do I need to handle invalid characters or formats?

Edge Cases

| Case | How to Handle |
| --- | --- |
| Empty input array cpdomains | Return an empty list, as there are no domains to process. |
| Null input array cpdomains | Throw an IllegalArgumentException or return an empty list to handle the invalid input. |
| Empty string in cpdomains | Skip the empty string and continue processing other valid entries to avoid errors. |
| String in cpdomains without a space separating count and domain | Return an empty list or throw an exception as the input is malformed and cannot be parsed. |
| String in cpdomains with multiple spaces | Split the string by spaces, taking only the first element as the count and the rest as the domain, handling potential extra spaces. |
| Non-integer count in cpdomains | Throw an exception or return an empty list or skip the record, logging an error, since the count must be a valid integer. |
| Domain with leading/trailing dots | Trim the domain string before processing subdomains to avoid invalid subdomain entries. |
| Integer overflow when summing counts | Use a larger data type like `long` to store the counts to avoid potential integer overflow issues. |
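The handling choices above could be combined into a defensive variant along the following lines. This is only a sketch under several assumptions: the name `subdomain_visits_lenient` is hypothetical, `ValueError` stands in for `IllegalArgumentException`, malformed entries are skipped rather than rejected, and an entry must split into exactly two whitespace-separated tokens. Python integers have arbitrary precision, so the overflow case needs no special treatment here.

```python
from collections import Counter
from typing import Optional


def subdomain_visits_lenient(cpdomains: Optional[list[str]]) -> list[str]:
    """Tally subdomain visits while skipping entries that cannot be parsed."""
    if cpdomains is None:
        raise ValueError("cpdomains must not be None")
    counts = Counter()
    for entry in cpdomains:
        # split() with no argument collapses runs of whitespace, so extra
        # spaces between the count and the domain are tolerated.
        parts = entry.split()
        if len(parts) != 2:
            continue  # empty string, missing space, or stray tokens
        count_text, raw_domain = parts
        if not (count_text.isascii() and count_text.isdigit()):
            continue  # non-integer count
        domain = raw_domain.strip(".")  # drop leading/trailing dots
        if not domain:
            continue
        labels = domain.split(".")
        for i in range(len(labels)):
            counts[".".join(labels[i:])] += int(count_text)
    return [f"{visits} {name}" for name, visits in counts.items()]


messy = ["900 google.mail.com", "", "nospace", "x1 a.com", "5  .wiki.org."]
print(sorted(subdomain_visits_lenient(messy)))
```

Whether to skip, reject, or log bad entries is exactly the kind of decision worth confirming with the interviewer before writing any of this.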