Memoize

Medium

Asked by:

17 views

Topics:

Dynamic Programming

Given a function fn, return a memoized version of that function.

A memoized function is a function that will never be called twice with the same inputs. Instead it will return a cached value.

You can assume there are 3 possible input functions: sum, fib, and factorial.

sum accepts two integers a and b and returns a + b. Assume that if a value has already been cached for the arguments (b, a) where a != b, it cannot be used for the arguments (a, b). For example, if the arguments are (3, 2) and (2, 3), two separate calls should be made.
fib accepts a single integer n and returns 1 if n <= 1 or fib(n - 1) + fib(n - 2) otherwise.
factorial accepts a single integer n and returns 1 if n <= 1 or factorial(n - 1) * n otherwise.

Example 1:

Input:
fnName = "sum"
actions = ["call","call","getCallCount","call","getCallCount"]
values = [[2,2],[2,2],[],[1,2],[]]
Output: [4,4,1,3,2]
Explanation:
const sum = (a, b) => a + b;
const memoizedSum = memoize(sum);
memoizedSum(2, 2); // "call" - returns 4. sum() was called as (2, 2) was not seen before.
memoizedSum(2, 2); // "call" - returns 4. However sum() was not called because the same inputs were seen before.
// "getCallCount" - total call count: 1
memoizedSum(1, 2); // "call" - returns 3. sum() was called as (1, 2) was not seen before.
// "getCallCount" - total call count: 2

Example 2:

Input:
fnName = "factorial"
actions = ["call","call","call","getCallCount","call","getCallCount"]
values = [[2],[3],[2],[],[3],[]]
Output: [2,6,2,2,6,2]
Explanation:
const factorial = (n) => (n <= 1) ? 1 : (n * factorial(n - 1));
const memoFactorial = memoize(factorial);
memoFactorial(2); // "call" - returns 2.
memoFactorial(3); // "call" - returns 6.
memoFactorial(2); // "call" - returns 2. However factorial was not called because 2 was seen before.
// "getCallCount" - total call count: 2
memoFactorial(3); // "call" - returns 6. However factorial was not called because 3 was seen before.
// "getCallCount" - total call count: 2

Example 3:

Input:
fnName = "fib"
actions = ["call","getCallCount"]
values = [[5],[]]
Output: [8,1]
Explanation:
fib(5) = 8 // "call"
// "getCallCount" - total call count: 1

Constraints:

0 <= a, b <= 10⁵
1 <= n <= 10
1 <= actions.length <= 10⁵
actions.length === values.length
actions[i] is one of "call" and "getCallCount"
fnName is one of "sum", "factorial" and "fib"

Try coding it in LeetCode

Solution

Clarifying Questions

When you get asked this question in a real-life environment, it will often be ambiguous (especially at FAANG). Make sure to ask these questions in that case:

What type of function are we expected to memoize? Are there any constraints on its input parameters or return type (e.g., primitives, objects, other functions)?
Are there any constraints on the number of arguments the function to be memoized can take?
What is the expected behavior if the function to be memoized throws an error? Should the memoized function also throw, or should it handle the error in some other way?
What is the scope or lifetime of the memoized values? Should the memoization cache be cleared at any point?
How should the cache key be generated for the function arguments? Are there any considerations for non-primitive argument types (e.g., objects, arrays) and how their equality should be determined?

Brute Force Solution

Approach

The brute force approach to memoization involves computing the value of a function for every possible input, without remembering previous results. It's like solving a puzzle repeatedly from scratch, even if you've already solved it before. This method explores every path to guarantee a solution, but it can be very inefficient.

Here's how the algorithm would work step-by-step:

Whenever you need to find the answer for a particular input to your function, calculate it directly.
Do not check if you have already calculated the answer before.
Repeat the entire calculation from the beginning every single time you need the answer, regardless of whether you've done it before or not.
Do this for every single input you encounter while running your program.
Return the answer you calculated directly.

Code Implementation

def fibonacci_brute_force(number):

    # Base case: if number is 0 or 1, return the number itself
    if number <= 1:
        return number

    # Recursively calculate Fibonacci without memoization
    fibonacci_minus_one = fibonacci_brute_force(number - 1)

    # Recursively calculate Fibonacci without memoization
    fibonacci_minus_two = fibonacci_brute_force(number - 2)

    # Calculate the Fibonacci number by summing the previous two
    return fibonacci_minus_one + fibonacci_minus_two

Big(O) Analysis

Time Complexity

O(exponential) – The provided solution describes a brute-force approach without memoization. This means that for any recursive function, each function call will recompute the result from scratch, potentially leading to overlapping subproblems being solved multiple times. The time complexity of this kind of solution depends heavily on the recursive structure of the problem. Typically, with overlapping subproblems and no memoization, recursive solutions have exponential time complexity, denoted as O(exponential) or often more specifically as O(b^n) where b is the branching factor and n is the depth of the recursion.

Space Complexity

O(1) – The plain English explanation states that no previous results are remembered or stored. The provided method calculates the answer directly each time it is needed without utilizing any data structures to store intermediate results or visited states. Therefore, the auxiliary space required is constant and independent of the input size N. No additional memory is allocated beyond the immediate calculations, leading to a space complexity of O(1).

Optimal Solution

Approach

The key to solving this problem efficiently is to avoid recomputing the same values over and over. We achieve this by storing previously computed results in a 'memory' or 'cache' so we can reuse them when needed.

Here's how the algorithm would work step-by-step:

First, create a 'memory' to store the results of function calls. Think of this 'memory' as a lookup table or notebook.
When the function is called with a specific input, check if the result for that input already exists in the 'memory'.
If the result is in the 'memory', simply return the stored result, avoiding redundant calculations.
If the result is NOT in the 'memory', perform the calculation as usual.
Before returning the result, store it in the 'memory' associated with the input it was calculated from. This ensures that future calls with the same input can quickly retrieve the result.
By storing and reusing results, we significantly reduce the total number of computations, leading to faster performance, especially for functions that are called repeatedly with the same inputs.

Code Implementation

def memoize(function):
    memory = {}

    def wrapper(*args):
        # Check if the result is already stored in memory.
        if args in memory:
            return memory[args]

        # If not in memory, compute the result.
        result = function(*args)

        # Store the result in memory before returning.
        memory[args] = result
        return result

    return wrapper

Big(O) Analysis

Time Complexity

O(1) – Memoization optimizes by storing previously computed results. When a function is called, we first check the 'memory' (cache). If the result for the input is already in the cache, it's retrieved in O(1) time. If the result is not in the cache, the function executes (but subsequent calls with the same input will be O(1)). Therefore, assuming sufficient cache hits, the amortized time complexity approaches O(1).

Space Complexity

O(N) – The algorithm uses a 'memory' (cache) to store the results of function calls. In the worst-case scenario, where each function call has a unique input, the 'memory' will store results for all N unique inputs encountered during execution. Thus, the auxiliary space required grows linearly with the number of unique inputs, N. This 'memory' is the primary driver of space complexity, where N represents the number of unique inputs to the function.

Edge Cases

Case	How to Handle
Null or undefined function input	Throw an IllegalArgumentException or return null to indicate invalid input and prevent errors
Function with no arguments	If arguments are dynamically constructed, the cache key generation must account for this to avoid collision with other calls
Function with arguments that are objects without a proper toString or serialization method	Implement a custom key generation that stringifies the object arguments or uses a hash function to create a unique key
Maximum recursion depth with recursive memoized functions	Implement a maximum cache size and eviction policy to prevent excessive memory usage and potential stack overflow errors
Function with arguments that mutate after the function call	Deep copy the arguments during key generation or avoid mutating arguments that are memoized
Integer overflow during cache key generation (e.g., hashing)	Use a sufficiently large integer type or a modulo operator to prevent overflow during hash calculations
Cache pollution with rarely used function calls	Implement a Least Recently Used (LRU) or Least Frequently Used (LFU) cache eviction strategy
Concurrency issues with multiple threads accessing the cache	Use thread-safe data structures (e.g., ConcurrentHashMap) and synchronization mechanisms to ensure data integrity

Null or undefined function input

How to Handle:

Throw an IllegalArgumentException or return null to indicate invalid input and prevent errors

Function with no arguments

How to Handle:

If arguments are dynamically constructed, the cache key generation must account for this to avoid collision with other calls

Function with arguments that are objects without a proper toString or serialization method

How to Handle:

Implement a custom key generation that stringifies the object arguments or uses a hash function to create a unique key

Maximum recursion depth with recursive memoized functions

How to Handle:

Implement a maximum cache size and eviction policy to prevent excessive memory usage and potential stack overflow errors

Function with arguments that mutate after the function call

How to Handle:

Deep copy the arguments during key generation or avoid mutating arguments that are memoized

Integer overflow during cache key generation (e.g., hashing)

How to Handle:

Use a sufficiently large integer type or a modulo operator to prevent overflow during hash calculations

Cache pollution with rarely used function calls

How to Handle:

Implement a Least Recently Used (LRU) or Least Frequently Used (LFU) cache eviction strategy

Concurrency issues with multiple threads accessing the cache

How to Handle:

Use thread-safe data structures (e.g., ConcurrentHashMap) and synchronization mechanisms to ensure data integrity