
Sequentially Ordinal Rank Tracker

Hard
Topics: Arrays, Strings

A scenic location is represented by its name and attractiveness score, where name is a unique string among all locations and score is an integer. Locations can be ranked from the best to the worst. The higher the score, the better the location. If the scores of two locations are equal, then the location with the lexicographically smaller name is better.
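The ranking rule above can be expressed as a single sort key. The following is a minimal sketch, assuming each location is stored as a (name, score) pair; the helper name rank_locations is hypothetical and not part of the required API.

```python
def rank_locations(locations):
    """Return (name, score) pairs ordered from best to worst.

    Negating the score sorts higher scores first; the name then breaks
    ties in ascending lexicographic order.
    """
    return sorted(locations, key=lambda loc: (-loc[1], loc[0]))


locations = [("bradford", 2), ("branford", 3), ("alps", 2)]
print([name for name, _ in rank_locations(locations)])
# ['branford', 'alps', 'bradford']
```

Storing (-score, name) tuples directly gives the same order without a key function, which is convenient for structures that only support natural ordering.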

You are building a system that tracks the ranking of locations with the system initially starting with no locations. It supports:

  • Adding scenic locations, one at a time.
  • Querying the ith best location of all locations already added, where i is the number of times the system has been queried (including the current query).
    • For example, when the system is queried for the 4th time, it returns the 4th best location of all locations already added.

Note that the test data are generated so that at any time, the number of queries does not exceed the number of locations added to the system.

Implement the SORTracker class:

  • SORTracker() Initializes the tracker system.
  • void add(string name, int score) Adds a scenic location with name and score to the system.
  • string get() Queries and returns the ith best location, where i is the number of times this method has been invoked (including this invocation).

Example 1:

Input
["SORTracker", "add", "add", "get", "add", "get", "add", "get", "add", "get", "add", "get", "get"]
[[], ["bradford", 2], ["branford", 3], [], ["alps", 2], [], ["orland", 2], [], ["orlando", 3], [], ["alpine", 2], [], []]
Output
[null, null, null, "branford", null, "alps", null, "bradford", null, "bradford", null, "bradford", "orland"]

Explanation
SORTracker tracker = new SORTracker(); // Initialize the tracker system.
tracker.add("bradford", 2); // Add location with name="bradford" and score=2 to the system.
tracker.add("branford", 3); // Add location with name="branford" and score=3 to the system.
tracker.get();              // The sorted locations, from best to worst, are: branford, bradford.
                            // Note that branford precedes bradford due to its higher score (3 > 2).
                            // This is the 1st time get() is called, so return the best location: "branford".
tracker.add("alps", 2);     // Add location with name="alps" and score=2 to the system.
tracker.get();              // Sorted locations: branford, alps, bradford.
                            // Note that alps precedes bradford even though they have the same score (2).
                            // This is because "alps" is lexicographically smaller than "bradford".
                            // Return the 2nd best location "alps", as it is the 2nd time get() is called.
tracker.add("orland", 2);   // Add location with name="orland" and score=2 to the system.
tracker.get();              // Sorted locations: branford, alps, bradford, orland.
                            // Return "bradford", as it is the 3rd time get() is called.
tracker.add("orlando", 3);  // Add location with name="orlando" and score=3 to the system.
tracker.get();              // Sorted locations: branford, orlando, alps, bradford, orland.
                            // Return "bradford".
tracker.add("alpine", 2);   // Add location with name="alpine" and score=2 to the system.
tracker.get();              // Sorted locations: branford, orlando, alpine, alps, bradford, orland.
                            // Return "bradford".
tracker.get();              // Sorted locations: branford, orlando, alpine, alps, bradford, orland.
                            // Return "orland".
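To check an implementation against Example 1, the operation and argument lists can be replayed mechanically. The sketch below assumes a hypothetical helper named replay and a deliberately simple ReferenceTracker that exists only to make the snippet self-contained; any class exposing add(name, score) and get() could be passed in instead.

```python
class ReferenceTracker:
    """Simple (slow) tracker used only to replay the example."""

    def __init__(self):
        self.locations = []  # (-score, name) tuples
        self.queries = 0     # number of get() calls so far

    def add(self, name, score):
        self.locations.append((-score, name))

    def get(self):
        self.queries += 1
        # Sort best to worst and pick the entry for this query number
        return sorted(self.locations)[self.queries - 1][1]


def replay(tracker_cls, operations, arguments):
    """Run the operations in order and collect each call's return value."""
    tracker = None
    results = []
    for op, args in zip(operations, arguments):
        if op == "SORTracker":
            tracker = tracker_cls()
            results.append(None)
        else:
            results.append(getattr(tracker, op)(*args))
    return results


operations = ["SORTracker", "add", "add", "get", "add", "get", "add", "get",
              "add", "get", "add", "get", "get"]
arguments = [[], ["bradford", 2], ["branford", 3], [], ["alps", 2], [],
             ["orland", 2], [], ["orlando", 3], [], ["alpine", 2], [], []]
print(replay(ReferenceTracker, operations, arguments))
```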

Constraints:

  • name consists of lowercase English letters, and is unique among all locations.
  • 1 <= name.length <= 10
  • 1 <= score <= 10^5
  • At any time, the number of calls to get does not exceed the number of calls to add.
  • At most 4 * 10^4 calls in total will be made to add and get.

Solution


Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

  1. What are the ranges of the score values and name lengths that can be added to the tracker?
  2. If get() is called when no locations have been added, what should be returned, or should an exception be thrown?
  3. Can two locations share the same score? If so, is the tie always broken by the lexicographically smaller name?
  4. What is the expected frequency of add() and get() calls? This will influence the choice of data structure.
  5. Can I assume the number of get() calls never exceeds the number of add() calls, so that the ith best location always exists?

Brute Force Solution

Approach

We want to keep track of locations as they come in and find the location that sits at a specific rank when they are all sorted from best to worst. The most straightforward way is to store every location in a plain list, re-sort the whole list on each query, and pick the location at the position given by the number of queries made so far.

Here's how the algorithm would work step-by-step:

  1. When a new location arrives, append its name and score to a simple list along with all the previous locations.
  2. On each query, increment a query counter i and sort the entire list from best to worst: higher score first, with ties broken by the lexicographically smaller name.
  3. Once the list is sorted, the location at position i is the answer. For example, on the 3rd query, select the third location in the sorted list.

Code Implementation

class SORTracker:

    def __init__(self):
        # Entries are (-score, name): ascending tuple order is best to worst
        self.locations = []
        # Number of get() calls made so far
        self.query_count = 0

    def add(self, name, score):
        # Add the location to our list of locations
        self.locations.append((-score, name))

    def get(self):
        self.query_count += 1

        # Create sorted copy, ordered from best to worst
        ranked_locations = sorted(self.locations)

        # Ranks are 1-indexed, but lists are 0-indexed
        adjusted_index = self.query_count - 1

        # Return the name at the requested rank
        return ranked_locations[adjusted_index][1]

Big(O) Analysis

Time Complexity
O(n log n) per query. Adding a new location to the list is O(1). However, each query sorts the entire list of n locations, which takes O(n log n) with a comparison sort (Python's sorted uses Timsort). Selecting the element at the requested rank after sorting takes O(1) time. The sorting step therefore dominates, and q queries cost O(q * n log n) in total.
Space Complexity
O(n). The algorithm stores all n added locations in a list, and each query builds a sorted copy of that list. The auxiliary space is therefore proportional to the number of locations added, giving O(n).

Optimal Solution

Approach

This problem needs a way to maintain a growing collection of locations and efficiently return the ith best one, where i is the number of queries made so far. The best method keeps the locations organized so that the answer can be read off directly, rather than re-sorting everything on each query. Because i grows by exactly one per query, a single counter is enough to track which position to return next.

Here's how the algorithm would work step-by-step:

  1. Use a data structure that keeps the locations sorted automatically as they come in. Something like a sorted list or a binary search tree would work.
  2. When a new location is added, put it into the right spot so that the structure remains ordered from best to worst: higher score first, with ties broken by the lexicographically smaller name.
  3. Keep a counter of how many times get() has been called.
  4. On the ith call to get(), return the location at position i in the sorted structure, then advance the counter. So, on the fourth query, the fourth best location is returned.
  5. The key is to maintain the sorted order so each query is fast. Using a simple list and re-sorting on every query would be too slow for a large number of locations.

Code Implementation

import bisect

class SORTracker:

    def __init__(self):
        # Entries are (-score, name): ascending tuple order is best to worst
        self.locations = []
        # Number of get() calls made so far
        self.query_count = 0

    def add(self, name, score):
        # Use bisect to insert the location in sorted order
        bisect.insort(self.locations, (-score, name))

    def get(self):
        # The ith call reads 0-based index i - 1, which is the current count
        name = self.locations[self.query_count][1]
        self.query_count += 1
        return name

# Your SORTracker object will be instantiated and called as such:
# sor_tracker = SORTracker()
# sor_tracker.add(name, score)
# best = sor_tracker.get()

Big(O) Analysis

Time Complexity
O(log n) per operation with a balanced structure. Finding the correct position for a new location takes O(log n) using binary search. Note that when the sorted structure is a Python list (as with bisect.insort), the insertion itself shifts elements and costs O(n) in the worst case; a balanced binary search tree or a pair of heaps keeps add at O(log n). Returning the ith best location is an O(1) index lookup in a sorted list once the number of queries is tracked. Insertion is therefore the dominant operation.
Space Complexity
O(n). The primary auxiliary space usage comes from storing the locations in a sorted data structure. This could be a sorted list or a binary search tree, both of which need to hold all n locations that are added. Therefore, the space required grows linearly with n. No other significant data structures are used.
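One way to reach the O(log n) bound using only the Python standard library is a two-heap arrangement. This is a different technique from the sorted list shown earlier, sketched here under stated assumptions: the class names TwoHeapTracker and _WorstFirst are hypothetical, and the wrapper exists only because heapq has no built-in way to reverse string ordering. One heap holds the locations already returned by get() with the worst on top; the other holds the remaining locations with the best on top.

```python
import heapq


class _WorstFirst:
    """Wrapper that makes the worst location the smallest item in a min-heap."""

    __slots__ = ("score", "name")

    def __init__(self, score, name):
        self.score = score
        self.name = name

    def __lt__(self, other):
        # Lower score is worse; on equal scores the larger name is worse
        if self.score != other.score:
            return self.score < other.score
        return self.name > other.name


class TwoHeapTracker:
    def __init__(self):
        self.returned = []  # min-heap of _WorstFirst: the i best locations
        self.pending = []   # min-heap of (-score, name): the rest, best first

    def add(self, name, score):
        # Insert the new location among the i best, then evict the worst of
        # them so that `returned` keeps exactly i items.
        worst = heapq.heappushpop(self.returned, _WorstFirst(score, name))
        heapq.heappush(self.pending, (-worst.score, worst.name))

    def get(self):
        # The best pending location is the next one in rank order
        neg_score, name = heapq.heappop(self.pending)
        heapq.heappush(self.returned, _WorstFirst(-neg_score, name))
        return name
```

Each call performs a constant number of heap pushes and pops, so both add and get run in O(log n). The trade-off is that the full sorted order is never materialized, which is acceptable here because only the next rank is ever requested.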

Edge Cases

Case | How to Handle
Empty input stream | Initialize the tracker and handle subsequent insertions/queries appropriately, likely returning an empty result or a default rank for empty tracker.
Stream with only one element | The rank of that element is 1 or 0 based on the indexing convention, and future insertions should be compared to it.
Large input stream exceeding memory limits | Use an external sorting approach or approximate ranking techniques with bounded memory usage, or alternatively divide and conquer.
Input stream with all identical values | All elements will have the same rank, potentially requiring special handling for how identical ranks are assigned (e.g., dense vs. fractional ranking).
Input stream containing duplicate values with same ordinal rank | When duplicates are encountered for the same ordinal rank, the tracker must consistently handle the ranking, e.g., always assign lower/higher rank based on the original arrival order of the numbers.
Querying for a rank when the tracker is empty | Return a predefined default rank (e.g., -1, 0) or throw an exception to indicate that no rank can be determined.
Values inserted are close to the integer limits (MAX_INT, MIN_INT) | Check for potential integer overflows during calculations related to ranking or differences between values, and use long or appropriate data types as necessary.
Maintaining consistent ranks when equal elements are inserted and deleted | Ensure that the data structure supports efficient deletion and rank updates to correctly reflect the new ranking after deleting an item.
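For the empty-tracker case, the exception-based option might look like the following. This is a minimal sketch, assuming the caller prefers an error over a sentinel value; the class name GuardedTracker and the choice of IndexError are illustrative assumptions, since the problem guarantees this situation never occurs in the test data.

```python
import bisect


class GuardedTracker:
    def __init__(self):
        self.locations = []  # (-score, name) tuples, kept best to worst
        self.queries = 0     # number of successful get() calls

    def add(self, name, score):
        bisect.insort(self.locations, (-score, name))

    def get(self):
        # Refuse the query when there is no ith best location to return
        if self.queries >= len(self.locations):
            raise IndexError("get() called more times than add()")
        name = self.locations[self.queries][1]
        self.queries += 1
        return name
```

A failed query leaves the counter unchanged, so the tracker stays usable once more locations are added.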