Memoization Techniques in Ruby

Memoization is an optimization technique that speeds up method calls by caching the results of expensive operations (such as database queries or API calls) so the work is done only once during the execution flow. In Ruby, you typically cache the result in an instance variable and return it the next time the method is invoked.

How to Memoize

Let's look at the simplest way to memoize in Ruby:

def users
  @users ||= User.all
end

The ||= operator above basically reads like this: @users || @users = User.all. Since unassigned instance variables return nil by default in Ruby, @users = User.all will be executed the first time the method is invoked.
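To watch the caching happen without a database, here is a minimal sketch. The UserDirectory class and its load_users method are made up for illustration (load_users stands in for User.all), and the counter exists only to show how often the expensive work runs.

```ruby
# Hypothetical stand-in for a model; load_users plays the role of User.all.
class UserDirectory
  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  def users
    # load_users only runs while @users is still nil (or false).
    @users ||= load_users
  end

  private

  def load_users
    @query_count += 1 # pretend this is an expensive database query
    %w[ada grace linus]
  end
end

directory = UserDirectory.new
directory.users
directory.users
puts directory.query_count # prints 1
```

The second call never reaches load_users because @users already holds a truthy value.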

So we're making a database query to fetch all users and storing the value in @users. Subsequent calls to the users method return the value stored in @users (basically acting like a cache). It's a common convention to name the instance variable after the method being memoized. You might also come across a different convention, e.g. @_users, but these are all just instance variables (and the naming is only a convention).

Memoizing code that spans multiple lines

Sometimes the code you want to memoize doesn't fit on one line. In this case, you can memoize using a begin...end block:

def active_store_users
  @active_store_users ||= begin
    active_stores = Store.where(active: true)
    User.where(store_id: active_stores.select(:id))
  end
end

The begin...end block wraps the statements we're memoizing. The block evaluates to the value of its last expression, and that value is what gets assigned to @active_store_users.
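The same pattern can be tried outside Rails. In the sketch below, the Report class and its build_count counter are invented for the example; the point is only that the whole begin...end block runs once and its last expression becomes the cached value.

```ruby
# Hypothetical class: summarises a list of numbers and caches the summary.
class Report
  attr_reader :build_count

  def initialize(numbers)
    @numbers = numbers
    @build_count = 0
  end

  def summary
    @summary ||= begin
      @build_count += 1 # counts how often the block actually runs
      total = @numbers.sum
      average = total.to_f / @numbers.size
      { total: total, average: average } # last expression is the cached value
    end
  end
end

report = Report.new([2, 4, 6])
report.summary
report.summary
puts report.build_count # prints 1
```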

A better approach?

Because of the way the ||= operator works, there's a gotcha to be aware of. Let's look at the memoization below:

def user
  @user ||= User.first
end

Remember that this is equivalent to @user || @user = User.first.

If the User.first query returns a falsey value such as nil or false, then @user is assigned that falsey value. Consequently, every subsequent invocation of the method triggers the query again, since the first part of the || (or) expression is falsey 😥. So we need to distinguish between @user holding a falsey value and @user never having been assigned.
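Here is a small sketch of the gotcha. UserFinder and find_first_user are hypothetical names; find_first_user stands in for User.first against an empty table, so it always returns nil.

```ruby
# Hypothetical class demonstrating that ||= cannot cache a nil result.
class UserFinder
  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  def user
    @user ||= find_first_user
  end

  private

  def find_first_user
    @query_count += 1 # pretend this hits the database
    nil               # simulate "no users found"
  end
end

finder = UserFinder.new
3.times { finder.user }
puts finder.query_count # prints 3
```

Three calls, three "queries": the nil result was stored, but ||= treats it the same as "not cached yet".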

So, a better approach to memoization would be:

def user
  return @user if defined?(@user)
  @user = User.first
end

defined?(@user) returns a truthy value (the string "instance-variable") as long as @user has previously been assigned, whether the assigned value is truthy or falsey, and nil otherwise. So we now only run the query on the first invocation. Exactly what we want! 😌
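A runnable sketch of the defined? approach, again with made-up names: CachedUserFinder and find_first_user are assumptions for the example, and find_first_user always returns nil to mimic an empty table.

```ruby
# Hypothetical class: memoizes even when the result is nil.
class CachedUserFinder
  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  def user
    # defined? is nil until @user has been assigned at least once.
    return @user if defined?(@user)

    @user = find_first_user
  end

  private

  def find_first_user
    @query_count += 1 # pretend this hits the database
    nil               # simulate "no users found"
  end
end

finder = CachedUserFinder.new
3.times { finder.user }
puts finder.query_count # prints 1
```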

Memoizing with parameters

How do we memoize a method that accepts parameters?

Knowing how a hash data structure works, could it possibly help us here? Yes! We can instantiate a hash as an instance variable and store the results of invocation for each parameter(s) so that subsequent calls with the same parameter(s) can return the value stored in the hash instead.

Conveniently, it turns out that Ruby has a hash initializer that works perfectly for this use case. You can initialize a hash with Hash.new and pass a code block:

hash = Hash.new { |h, key| h[key] = calculate_value_for_the_key }

Every time a key is accessed in the hash that hasn't been set before, Ruby executes the code block to calculate the value for that key and store it in the hash.
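To make that behaviour concrete, here is a tiny sketch using squares as the "expensive" calculation. build_square_cache is a hypothetical helper name; Hash.new with a block, key? and size are standard Ruby.

```ruby
# Hypothetical helper: builds a hash that computes and stores squares on demand.
def build_square_cache
  Hash.new do |cache, number|
    puts "computing #{number}"      # only printed on a cache miss
    cache[number] = number * number # store the result under the missing key
  end
end

squares = build_square_cache
squares[4]        # miss: runs the block and stores 16
squares[4]        # hit: returns the stored 16, the block is not run
squares.key?(5)   # false; key? does not trigger the block
squares[5]        # miss: runs the block and stores 25
puts squares.size # prints 2
```

Note that the block has to do the assignment itself. Without cache[number] = ..., the value would be recomputed on every access.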

Now let's see how we can use a hash to memoize a method that accepts a parameter:

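def store_users(store)
  @store_users ||= Hash.new do |users, store_id|
    # Use the block's store_id, not the outer store: the block is created once
    # and would otherwise keep querying for the first store it ever saw.
    users[store_id] = User.where(store_id: store_id)
  end
  @store_users[store.id]
end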

So the store_users method fetches each store's users once and then memoizes the result for subsequent invocations, keyed by the store's id 😎.
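If you'd like to try this without Rails, the sketch below swaps the database for an in-memory constant. Store, UserLookup and USERS_BY_STORE are all hypothetical stand-ins; the lookup inside the block is keyed by the block's store_id argument.

```ruby
Store = Struct.new(:id) # hypothetical minimal store object

class UserLookup
  # Hypothetical data: store id => user names.
  USERS_BY_STORE = { 1 => %w[ada grace], 2 => %w[linus] }.freeze

  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  def store_users(store)
    @store_users ||= Hash.new do |users, store_id|
      @query_count += 1 # stands in for the database query
      users[store_id] = USERS_BY_STORE[store_id]
    end
    @store_users[store.id]
  end
end

lookup = UserLookup.new
lookup.store_users(Store.new(1)) # query
lookup.store_users(Store.new(2)) # query
lookup.store_users(Store.new(1)) # cached
lookup.store_users(Store.new(3)) # query, result is nil
lookup.store_users(Store.new(3)) # cached, even though the value is nil
puts lookup.query_count # prints 3
```

A nice side effect: because the block only runs for keys that are missing from the hash, a nil result is cached too.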

How about multiple parameters?

If you are planning to memoize a method that accepts multiple parameters, you may want to question whether it is worth the decreased legibility of the code. Nevertheless, if you have to, you can combine the parameters into a single unique key for the hash. An array of the parameters works well, e.g.

hash[[a, b]] = calculate_value_for_the_keys

Where a and b are the parameters of the method to memoize. Avoid a key like a + b: different pairs of parameters can add up (or concatenate) to the same key and would then share one cache entry.
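One way to build such a key is an array of the arguments, since arrays with equal elements are equal hash keys. The sketch below uses an invented ShippingCalculator with a made-up pricing formula, purely to show the keying.

```ruby
# Hypothetical class: memoizes a two-argument method with an array key.
class ShippingCalculator
  attr_reader :calculations

  def initialize
    @calculations = 0
  end

  def cost(weight_kg, zone)
    @costs ||= Hash.new do |cache, key|
      kg, z = key                    # unpack the [weight_kg, zone] key
      @calculations += 1             # counts cache misses
      cache[key] = kg * 2 + z * 10   # made-up formula for illustration
    end
    @costs[[weight_kg, zone]]
  end
end

calculator = ShippingCalculator.new
puts calculator.cost(1, 2)   # prints 22
puts calculator.cost(2, 1)   # prints 14
puts calculator.cost(1, 2)   # prints 22 (served from the cache)
puts calculator.calculations # prints 2
```

With a summed key, cost(1, 2) and cost(2, 1) would both land on key 3 and the second call would wrongly return 22.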

A gem maybe?

For the most part, you should be fine memoizing your methods in your codebase as needed. But, if you are quite concerned about sacrificing legibility for speed, you might want to consider using a gem that handles memoization for you with a nice, friendly API, thus giving you both legibility and speed.

Memoist is one such gem. It is in fact an extraction of the memoization helper that Rails used to ship (ActiveSupport::Memoizable) and later dropped in 2011. So, check it out and see if it fits your needs.

Of course, using a gem comes with the extra overhead of managing YAD (yet another dependency), so tread with caution as always 😅

Conclusion

We learned that memoization is a technique for speeding up methods that perform time-consuming operations by caching their results, thus avoiding redundant work. It is especially effective when such methods are invoked multiple times.

It is worth noting that memoization is usually a short-lived, in-memory caching technique. In Ruby, instance variables are typically used as the cache store, which means the cache only lives as long as the object that owns those instance variables. In Rails, that is typically one request-response cycle, so you shouldn't have issues dealing with stale data.
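The per-object lifetime is easy to see in a quick sketch. RequestContext and its class-level loads counter are hypothetical; each new instance starts with an empty cache, much like each new request would.

```ruby
# Hypothetical class: each instance keeps its own memoized settings.
class RequestContext
  @loads = 0 # class-level counter shared by all instances

  class << self
    attr_accessor :loads
  end

  def settings
    @settings ||= begin
      self.class.loads += 1 # counts how often settings are actually loaded
      { theme: "dark" }
    end
  end
end

first = RequestContext.new
first.settings
first.settings  # cached on the first object

second = RequestContext.new
second.settings # a new object starts with no cache, so this loads again

puts RequestContext.loads # prints 2
```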