Given a list of accounts
where each element accounts[i]
is a list of strings. accounts[i][0]
is a name, and the rest of the elements are emails representing emails of the account.
We want to merge these accounts. Two accounts belong to the same person if they share a common email. Accounts with the same name may belong to different people.
After merging, return the accounts in the following format: the first element is the name, and the rest are emails in sorted order. The accounts themselves can be returned in any order.
Example 1:
Input: accounts = [["John","johnsmith@mail.com","john_newyork@mail.com"],["John","johnsmith@mail.com","john00@mail.com"],["Mary","mary@mail.com"],["John","johnnybravo@mail.com"]]
Output: [["John","john00@mail.com","john_newyork@mail.com","johnsmith@mail.com"],["Mary","mary@mail.com"],["John","johnnybravo@mail.com"]]
Example 2:
Input: accounts = [["Gabe","Gabe0@m.co","Gabe3@m.co","Gabe1@m.co"],["Kevin","Kevin3@m.co","Kevin5@m.co","Kevin0@m.co"],["Ethan","Ethan5@m.co","Ethan4@m.co","Ethan0@m.co"],["Hanzo","Hanzo3@m.co","Hanzo1@m.co","Hanzo0@m.co"],["Fern","Fern5@m.co","Fern1@m.co","Fern0@m.co"]]
Output: [["Ethan","Ethan0@m.co","Ethan4@m.co","Ethan5@m.co"],["Gabe","Gabe0@m.co","Gabe1@m.co","Gabe3@m.co"],["Hanzo","Hanzo0@m.co","Hanzo1@m.co","Hanzo3@m.co"],["Kevin","Kevin0@m.co","Kevin3@m.co","Kevin5@m.co"],["Fern","Fern0@m.co","Fern1@m.co","Fern5@m.co"]]
Constraints:
1 <= accounts.length <= 1000
2 <= accounts[i].length <= 10
1 <= accounts[i][j].length <= 30
accounts[i][0]
consists of English letters.accounts[i][j] (for j > 0)
is a valid email.When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:
The goal is to combine email accounts that belong to the same person. The brute force way involves checking every possible pair of accounts and merging them if they share an email. We continue merging until no more accounts can be combined.
Here's how the algorithm would work step-by-step:
def accounts_merge_brute_force(accounts):
merged_accounts = accounts[:] # Start with a copy
number_of_accounts = len(accounts)
for first_index in range(number_of_accounts):
for second_index in range(number_of_accounts):
# Ensure we don't compare the same account to itself
if first_index == second_index:
continue
first_account = merged_accounts[first_index]
second_account = merged_accounts[second_index]
first_account_emails = set(first_account[1:])
second_account_emails = set(second_account[1:])
# Check if the accounts share any email addresses
if not first_account_emails.isdisjoint(second_account_emails):
# Merge the accounts
merged_name = first_account[0]
merged_emails = set(first_account[1:] + second_account[1:])
merged_account = [merged_name] + sorted(list(merged_emails))
merged_accounts[first_index] = merged_account
# Mark the second account as empty to avoid reprocessing
merged_accounts[second_index] = [""]
# Remove any empty accounts from the final result
final_accounts = [account for account in merged_accounts if account != [""]]
# Need to remove dupes after all merging is complete
final_result = []
for account in final_accounts:
if account not in final_result:
final_result.append(account)
return final_result
The optimal approach involves grouping emails belonging to the same person and then merging these groups. This is done by treating emails as connections in a network and identifying connected components to find all emails associated with each account. Finally, each group of emails is combined into a single account.
Here's how the algorithm would work step-by-step:
def accounts_merge(accounts):
email_to_name = {}
email_to_account_index = {}
graph = {}
# Build the graph and email mappings
for account_index, account in enumerate(accounts):
account_name = account[0]
for email_index in range(1, len(account)):
email = account[email_index]
email_to_name[email] = account_name
email_to_account_index[email] = account_index
if email not in graph:
graph[email] = []
if email_index > 1:
previous_email = account[email_index - 1]
graph[email].append(previous_email)
graph[previous_email].append(email)
merged_accounts = []
visited = set()
# Iterate through all emails
for email in email_to_name:
if email not in visited:
# Perform DFS to find connected emails
stack = [email]
visited.add(email)
connected_emails = [email]
while stack:
current_email = stack.pop()
for neighbor in graph.get(current_email, []):
if neighbor not in visited:
visited.add(neighbor)
stack.append(neighbor)
connected_emails.append(neighbor)
# Sort and add the merged account
connected_emails.sort()
account_name = email_to_name[email]
merged_accounts.append([account_name] + connected_emails)
return merged_accounts
Case | How to Handle |
---|---|
Null or empty accounts list | Return an empty list if the input accounts list is null or empty, as there are no accounts to merge. |
Single account with multiple email addresses | The algorithm should correctly group all emails belonging to that single account into a single merged account. |
Multiple accounts with the same name and different email addresses, and accounts with same email and different names | The algorithm relies on the email addresses for merging, so accounts with the same name but different emails should not be merged, accounts with same email should be merged regardless of name. |
Accounts with no email addresses | These accounts should be treated as isolated accounts and included in the result without merging. |
Maximum number of accounts and emails exceeding memory limits | Consider using data structures and algorithms that optimize memory usage, such as using a more efficient disjoint-set data structure or processing accounts in batches. |
Cyclic email relationships (e.g., A linked to B, B linked to C, and C linked to A) | The union-find data structure should correctly handle cycles and merge all connected components into a single set. |
Email addresses in different casing but considered the same (e.g., 'Test@example.com' vs 'test@example.com') | Normalize email addresses to lowercase before processing to ensure case-insensitive comparison and merging. |
Accounts with very large number of emails in each account | Algorithm should be efficient even with large lists in each account and should be tested to ensure no time out errors occur. |