Sys Design Question: Improving Cost on a CDN Design

Senior Software Engineer at Upstart
4 months ago

I was recently asked to design an autocomplete system. I suggested a solution where results by prefix are physically cached to a CDN. I was told that this solution isn't optimal because it represents an inefficient storage approach. I'm trying to understand an optimal solution.

Some alternatives I can think of:

  1. S3 - better storage cost, too slow
  2. Redis - about as fast, but more expensive
  3. Dynamo - about as fast, I don't know how it compares on cost

I think the two main cost components will be storage and reads. I'm having trouble finding useful comparative data on these. Is it commonly understood that Dynamo is cheaper than a CDN at scale, or is there some other design I'm not thinking of?
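
To make the storage concern concrete, here is a minimal sketch of what "results by prefix" ends up storing. The term list, the scores, and the helper names (`prefix_cache`, `stored_chars`) are made up for illustration, and character counts stand in for bytes; this is not real pricing or sizing data. It only shows that each term is copied under many prefixes, so the cached payload grows much faster than the raw term list.

```python
from typing import Dict, List


def prefix_cache(terms: Dict[str, int], k: int) -> Dict[str, List[str]]:
    """Precompute the top-k terms for every prefix, as a per-prefix cache would."""
    buckets: Dict[str, List[str]] = {}
    for term in terms:
        for end in range(1, len(term) + 1):
            buckets.setdefault(term[:end], []).append(term)
    # Keep the k highest-scoring terms per prefix (ties broken alphabetically).
    return {
        prefix: sorted(matches, key=lambda t: (-terms[t], t))[:k]
        for prefix, matches in buckets.items()
    }


def stored_chars(cache: Dict[str, List[str]]) -> int:
    """Rough payload size: every key plus every cached result string."""
    return sum(len(p) + sum(len(t) for t in results) for p, results in cache.items())


# Hypothetical popularity scores, for illustration only.
terms = {"car": 50, "card": 40, "care": 30, "cart": 20, "cat": 10}
cache = prefix_cache(terms, k=3)
raw_chars = sum(len(t) for t in terms)

print(f"{len(terms)} terms, {raw_chars} chars raw")
print(f"{len(cache)} cached prefixes, {stored_chars(cache)} chars stored")
```

Even on five short terms the cached form is several times larger than the raw list, and the gap widens as terms get longer and k grows. Whether that matters depends on the actual storage and read prices, which this sketch does not model.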

Discussion

(2 comments)
  • 1
    Tech Lead @ Robinhood, Meta, Course Hero
    4 months ago

    Autocomplete is purely about speed (especially if you're doing live suggestions as the user types). Wouldn't something like Redis be the fastest, since it's in-memory? Physically caching to a CDN seems heavy, making it inefficient in terms of performance.

    Zooming out, a good system design interview (assuming the interviewer is high-quality) is about explaining the trade-offs and discussing (often conflicting) opinions. As a senior engineer, you will inevitably run into the scenario on the job where you propose a technical design and another senior or staff engineer disagrees. At that point, you need to make a decision:

    1. Defend your approach entirely
    2. Abandon your approach and take theirs
    3. Find a middle-ground and incorporate elements of their approach into yours (often the best)

    For a senior engineer in a system design interview, I would expect them to clearly take 1 of the 3 approaches, demonstrating conviction and strong reasoning.

    For me, I would state upfront that performance would be my priority (prioritize the end user) and I'm happy to sacrifice $$$ cost for that.

    I asked ChatGPT as well and the response was helpful (it mentioned a Trie, which is a nice call): https://chatgpt.com/share/1f99202f-cea8-41c7-9276-e615fe975563
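
For reference, the trie idea can be sketched roughly as follows. This is a minimal in-memory version, not the design from the linked chat: the class names, the integer score, and the `suggest` method are all assumptions for illustration. It stores each term once along a shared path and computes the top-k at query time, trading read-time work for less storage than per-prefix caching.

```python
import heapq
from typing import Dict, List, Tuple


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_term = False  # True when a complete term ends at this node
        self.score = 0        # hypothetical popularity score for that term


class Trie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, term: str, score: int) -> None:
        node = self.root
        for ch in term:
            node = node.children.setdefault(ch, TrieNode())
        node.is_term = True
        node.score = score

    def suggest(self, prefix: str, k: int = 5) -> List[str]:
        # Walk down to the node that represents the prefix.
        node = self.root
        for ch in prefix:
            child = node.children.get(ch)
            if child is None:
                return []
            node = child
        # Collect every term in the subtree with an iterative depth-first walk.
        found: List[Tuple[int, str]] = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.is_term:
                found.append((current.score, text))
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        # Highest score first, ties broken alphabetically.
        top = heapq.nsmallest(k, found, key=lambda item: (-item[0], item[1]))
        return [text for _, text in top]


trie = Trie()
for term, score in {"car": 50, "card": 40, "care": 30, "cart": 20, "cat": 10}.items():
    trie.insert(term, score)
print(trie.suggest("car", k=2))
```

The subtree walk makes each lookup proportional to the number of terms under the prefix, so a production version would likely cache top-k lists at hot nodes, which reintroduces some of the storage cost.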

    • 1
      Senior Software Engineer [OP]
      Upstart
      4 months ago

      "Wouldn't something like Redis be the fastest as it's in-memory?"

      I think the distributed aspect of a CDN will still often win compared to a non-distributed Redis, but if we can have a distributed Redis acting as a CDN (so, not ElastiCache) then I would agree that would be faster. I'll keep this in mind as an alternate design next go-round, thanks!

      Edit: Also using ChatGPT, I found a couple of edge cloud offerings for key-value stores, including Cloudflare Workers KV and Fastly Edge Dictionaries:
      https://chatgpt.com/share/d426a3e9-9177-44d0-98f9-bb0801e3d0cb
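
One possible middle ground between the edge and a central store is a read-through lookup: keep only short, hot prefixes at the edge with a TTL, and send long-tail prefixes to the origin. The sketch below is only an illustration of that idea. `EdgeCache` is an in-memory stand-in, not the Cloudflare or Fastly API, and `lookup`, `max_cached_len`, and the TTL value are assumptions.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class EdgeCache:
    """In-memory stand-in for an edge key-value store with a per-entry TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, key: str) -> Optional[List[str]]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired, drop it
            return None
        return value

    def put(self, key: str, value: List[str]) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


def lookup(prefix: str, edge: EdgeCache,
           origin: Callable[[str], List[str]],
           max_cached_len: int = 3) -> List[str]:
    """Serve short prefixes from the edge; send long-tail prefixes to the origin."""
    if len(prefix) > max_cached_len:
        return origin(prefix)
    cached = edge.get(prefix)
    if cached is not None:
        return cached
    result = origin(prefix)
    edge.put(prefix, result)
    return result


# Usage with a hypothetical origin (Redis, Dynamo, or a trie service would go here).
edge = EdgeCache(ttl_seconds=60)
print(lookup("ca", edge, lambda p: [p + "r", p + "t"]))
```

The cutoff on prefix length bounds edge storage, since the number of possible short prefixes is small, while the origin absorbs the long tail at higher latency.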