Taro Logo

Encode and Decode TinyURL

Medium
Google logo
Google
2 views
Topics:
StringsArraysDatabase Problems

TinyURL is a URL shortening service where you enter a URL such as https://leetcode.com/problems/design-tinyurl and it returns a short URL such as http://tinyurl.com/4e9iAk. Design a class to encode a URL and decode a tiny URL.

There is no restriction on how your encode/decode algorithm should work. You just need to ensure that a URL can be encoded to a tiny URL and the tiny URL can be decoded to the original URL.

Implement the Solution class:

  • Solution() Initializes the object of the system.
  • String encode(String longUrl) Returns a tiny URL for the given longUrl.
  • String decode(String shortUrl) Returns the original long URL for the given shortUrl. It is guaranteed that the given shortUrl was encoded by the same object.

For example, if the input URL is https://leetcode.com/problems/design-tinyurl, the encode function should return a shortened URL (e.g., http://tinyurl.com/4e9iAk), and the decode function, when given the shortened URL, should return the original URL https://leetcode.com/problems/design-tinyurl.

Solution


URL Shortener (TinyURL) Design

Problem Statement

Design a class to encode a URL and decode a tiny URL. There is no restriction on how your encode/decode algorithm should work. You just need to ensure that a URL can be encoded to a tiny URL and the tiny URL can be decoded to the original URL.

Implement the Solution class:

  • Solution() Initializes the object of the system.
  • String encode(String longUrl) Returns a tiny URL for the given longUrl.
  • String decode(String shortUrl) Returns the original long URL for the given shortUrl. It is guaranteed that the given shortUrl was encoded by the same object.

Naive Approach

A simple approach is to use a hash map to store the mapping between the long URL and the short URL. When encoding, we can generate a unique short URL (e.g., using a counter or random string) and store the mapping in the hash map. When decoding, we can simply look up the long URL in the hash map using the short URL.

Code (Python)

import random
import string

class Solution:
    def __init__(self):
        self.url_to_short = {}
        self.short_to_url = {}
        self.base = "http://tinyurl.com/"

    def encode(self, longUrl):
        if longUrl in self.url_to_short:
            return self.url_to_short[longUrl]

        short_url = self.base + self.generate_short_url()
        while short_url in self.short_to_url:
            short_url = self.base + self.generate_short_url()

        self.url_to_short[longUrl] = short_url
        self.short_to_url[short_url] = longUrl
        return short_url

    def decode(self, shortUrl):
        return self.short_to_url[shortUrl]

    def generate_short_url(self):
        return ''.join(random.choices(string.ascii_letters + string.digits, k=6))

Time Complexity

  • encode: O(1) on average, O(n) in the worst case (if short URL generation has many collisions).
  • decode: O(1)

Space Complexity

  • O(n), where n is the number of long URLs stored.

Optimal Approach

The optimal approach remains conceptually the same, using a hash map to store the mappings. The key is in the short URL generation to avoid collisions and ensure uniqueness. A good strategy is using a base-62 encoding scheme based on an auto-incrementing integer ID.

Code (Python)

class Solution:
    def __init__(self):
        self.url_to_short = {}
        self.short_to_url = {}
        self.counter = 0
        self.base = "http://tinyurl.com/"

    def encode(self, longUrl):
        if longUrl in self.url_to_short:
            return self.url_to_short[longUrl]

        self.counter += 1
        short_url = self.base + self.id_to_short_url(self.counter)

        self.url_to_short[longUrl] = short_url
        self.short_to_url[short_url] = longUrl
        return short_url

    def decode(self, shortUrl):
        return self.short_to_url[shortUrl]

    def id_to_short_url(self, n):
        characters = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
        short_url = ""
        while n > 0:
            short_url = characters[n % 62] + short_url
            n //= 62
        return short_url

Time Complexity

  • encode: O(1) on average (amortized, assuming integer increment is O(1)), O(L) in the worst case where L is the length of the generated short URL.
  • decode: O(1)

Space Complexity

  • O(n), where n is the number of long URLs stored.

Edge Cases

  • URL already exists: The code handles the case where the same long URL is encoded multiple times by returning the existing short URL.
  • Collision of short URLs: The counter-based method inherently avoids collisions.
  • Invalid short URL during decode: The problem statement guarantees that the shortUrl was encoded by the same object, so we don't need to handle invalid short URLs.