Every valid email consists of a local name and a domain name, separated by the '@' sign. Besides lowercase letters, the email may contain one or more '.' or '+'.
For example, in "alice@leetcode.com", "alice" is the local name, and "leetcode.com" is the domain name.
If you add periods '.' between some characters in the local name part of an email address, mail sent there will be forwarded to the same address without dots in the local name. Note that this rule does not apply to domain names. For example, "alice.z@leetcode.com" and "alicez@leetcode.com" forward to the same email address.
If you add a plus '+' in the local name, everything after the first plus sign will be ignored. This allows certain emails to be filtered. Note that this rule does not apply to domain names. For example, "m.y+name@email.com" will be forwarded to "my@email.com".
It is possible to use both of these rules at the same time.
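The two rules can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `normalize` is made up for this example, and it assumes the input contains exactly one '@'.

```python
def normalize(email):
    """Return the address that actually receives the mail (hypothetical helper)."""
    # Assumption: exactly one '@', so partition separates local name and domain.
    local_name, _, domain_name = email.partition('@')
    # Plus rule: ignore everything from the first '+' onward.
    local_name = local_name.split('+', 1)[0]
    # Dot rule: dots in the local name are ignored.
    local_name = local_name.replace('.', '')
    # Neither rule applies to the domain name, so it is left untouched.
    return local_name + '@' + domain_name


print(normalize("alice.z@leetcode.com"))
print(normalize("m.y+name@email.com"))
```

Applied to the two addresses from the text, this prints "alicez@leetcode.com" and "my@email.com".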
Given an array of strings emails where we send one email to each emails[i], return the number of different addresses that actually receive mails.
Example 1:
Input: emails = ["test.email+alex@leetcode.com","test.e.mail+bob.cathy@leetcode.com","testemail+david@lee.tcode.com"]
Output: 2
Explanation: "testemail@leetcode.com" and "testemail@lee.tcode.com" actually receive mails.
Example 2:
Input: emails = ["a@leetcode.com","b@leetcode.com","c@leetcode.com"]
Output: 3
Constraints:
1 <= emails.length <= 100
1 <= emails[i].length <= 100
emails[i] consist of lowercase English letters, '+', '.' and '@'.
Each emails[i] contains exactly one '@' character.
Local names do not start with a '+' character.
Domain names end with the ".com" suffix.
Domain names must contain at least one character before the ".com" suffix.
When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask clarifying questions in that case.
The brute force approach for counting unique email addresses processes each email in turn: it applies the two local-name rules to get the address that actually receives the mail, then compares that address against every previously processed one and keeps it only if no match is found. It's like manually checking every email against every other email to see if they would end up at the same inbox.
Here's how the algorithm would work step-by-step:

    def num_unique_emails_brute_force(emails):
        unique_email_addresses = []
        for email in emails:
            # Split the email into local and domain parts
            local_name, domain_name = email.split('@')
            # Remove periods from the local name
            local_name = local_name.replace('.', '')
            # Split at the '+' sign and take the part before it
            if '+' in local_name:
                local_name = local_name.split('+')[0]
            cleaned_email = local_name + '@' + domain_name
            # Check if this cleaned email is already in the unique list
            if cleaned_email not in unique_email_addresses:
                # Only add if the cleaned email is truly unique
                unique_email_addresses.append(cleaned_email)
        return len(unique_email_addresses)

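To see why the list-based check is the expensive part, the sketch below counts the string comparisons made while deduplicating addresses that have already been cleaned. `count_comparisons` is a hypothetical helper written for this illustration; it spells out, as an explicit inner loop, the scan that a `not in` test performs on a Python list.

```python
def count_comparisons(cleaned_emails):
    """Deduplicate with a list and count comparisons (illustrative helper)."""
    seen = []
    comparisons = 0
    for address in cleaned_emails:
        found = False
        # Scan every address kept so far, stopping at the first match.
        for known in seen:
            comparisons += 1
            if known == address:
                found = True
                break
        if not found:
            seen.append(address)
    return len(seen), comparisons


# Worst case under the stated constraints: 100 addresses, all distinct.
distinct = ["user%d@leetcode.com" % i for i in range(100)]
print(count_comparisons(distinct))
```

With 100 distinct addresses, the scan performs 0 + 1 + ... + 99 = 4950 comparisons, so the work grows quadratically with the number of emails.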
The goal is to count how many distinct addresses remain after applying the special rules for local names (handling periods and plus signs). We standardize each email first, so addresses that deliver to the same inbox become identical strings, and store the results in a set. The set discards duplicates automatically and checks membership in constant time on average, so no email has to be compared against every previously seen one.
Here's how the algorithm would work step-by-step:

    def unique_email_addresses(emails):
        unique_emails = set()
        for email in emails:
            local_name, domain_name = email.split('@')
            # Remove all periods from the local name
            local_name = local_name.replace('.', '')
            # Handle the plus sign by truncating the local name
            if '+' in local_name:
                plus_index = local_name.find('+')
                local_name = local_name[:plus_index]
            # Standardize the email address
            standardized_email = local_name + '@' + domain_name
            # Track unique email addresses using a set
            unique_emails.add(standardized_email)
        return len(unique_emails)

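As a variation, the same standardization can be done in one left-to-right scan without `split` or `replace`. The sketch below is an alternative technique, not the code above; the names `normalize_single_pass` and `count_unique` are invented here, and the scan assumes each email contains exactly one '@', as the constraints state.

```python
def normalize_single_pass(email):
    """Standardize an email in one scan (assumes exactly one '@')."""
    kept = []
    index = 0
    # Copy local-name characters, skipping dots, until '+' or '@' is reached.
    while email[index] not in '+@':
        if email[index] != '.':
            kept.append(email[index])
        index += 1
    # Skip anything between '+' and '@', then keep '@' and the domain verbatim.
    at_index = email.index('@', index)
    kept.append(email[at_index:])
    return ''.join(kept)


def count_unique(emails):
    # A set comprehension collapses addresses that standardize to the same string.
    return len({normalize_single_pass(email) for email in emails})


print(count_unique([
    "test.email+alex@leetcode.com",
    "test.e.mail+bob.cathy@leetcode.com",
    "testemail+david@lee.tcode.com",
]))
```

On the input from Example 1 this prints 2, matching the expected output.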
Case | How to Handle |
---|---|
Empty email array | Return 0 if the input email array is empty. |
Null email array | Raise an error (for example, ValueError in Python or IllegalArgumentException in Java) or return 0 if the input email array is null. |
Email with only local name | Handle emails without a domain by treating the local name as the full email (e.g. 'test' becomes 'test'). |
Email with only domain name | Reject emails without local names or treat them as invalid and exclude them from the count. |
Email with consecutive dots in local name | Remove every dot; the rule strips all dots from the local name, so consecutive dots need no special handling. |
Email with plus sign at the end of local name | The plus sign is dropped and the characters before it are kept, since nothing follows it. |
Extremely long email addresses causing potential memory issues | Limit the length of processed email strings to prevent excessive memory consumption and potential denial of service. |
Emails with invalid characters (e.g., control characters, Unicode) | Filter out or sanitize emails containing invalid characters to ensure correct processing. |
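
Several of the table's choices can be combined into one defensive variant. The sketch below is only one reading of those rows: it returns 0 for a null or empty array, counts an email with no '@' as-is, and skips emails whose local name is empty. The function name and the decision to also skip local names that become empty after standardization are assumptions of this example, and it does not cover length limits or character filtering.

```python
def count_unique_emails_defensive(emails):
    """Count distinct receiving addresses, tolerating malformed input (sketch)."""
    # Null or empty input: nothing receives mail.
    if not emails:
        return 0
    unique_emails = set()
    for email in emails:
        local_name, at_sign, domain_name = email.partition('@')
        if not at_sign:
            # No '@' at all: treat the whole string as the address, unchanged.
            unique_emails.add(email)
            continue
        # Apply the plus rule, then the dot rule, to the local name only.
        local_name = local_name.split('+', 1)[0].replace('.', '')
        if not local_name:
            # Assumption: an empty local name is invalid and is not counted.
            continue
        unique_emails.add(local_name + '@' + domain_name)
    return len(unique_emails)


print(count_unique_emails_defensive(["a..b@x.com", "ab+@x.com", "@x.com", "test"]))
```

Here the first two inputs standardize to "ab@x.com", "@x.com" is skipped, and "test" is counted unchanged, so the call prints 2.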