Memoization is an optimization technique for speeding up method calls by caching the results of expensive operations (such as database queries, API calls, etc.) so you only do the work once during the execution flow. In Ruby, you typically cache the result in an instance variable and return it the next time the method is invoked.
How to Memoize
Let's look at the simplest way to memoize in Ruby:
def users
  @users ||= User.all
end
The ||= operator above basically reads like this: @users || @users = User.all (more details here if you're curious why). Since unassigned instance variables return nil by default in Ruby, @users = User.all will be executed the first time the method is invoked.
So we're making a database query to fetch all users and storing the value in @users. Subsequent calls to the users method return the value stored in @users (basically acting like a cache). It's a common convention to name the instance variable after the method being memoized. You might also come across a different convention, e.g. @_users, but they're all instance variables (and the naming is just a convention).
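To see the caching without a database, here is a minimal plain-Ruby sketch. The UserDirectory class and its fetch_users helper are hypothetical stand-ins for User.all; the counter only exists to show how often the expensive work runs.

```ruby
# Hypothetical stand-in for a model; no Rails required.
class UserDirectory
  attr_reader :query_count

  def initialize
    @query_count = 0
  end

  def users
    # Runs fetch_users only while @users is nil (or false).
    @users ||= fetch_users
  end

  private

  # Pretend this is an expensive database query.
  def fetch_users
    @query_count += 1
    ["ada", "grace"]
  end
end

directory = UserDirectory.new
directory.users
directory.users
puts directory.query_count # prints 1
```

The second call never reaches fetch_users because @users already holds a truthy value.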
Memoizing code that spans multiple lines
Sometimes the code you want to memoize doesn't fit on one line. In this case, you memoize using a begin...end block:
def active_store_users
  @active_store_users ||= begin
    active_stores = Store.where(active: true)
    User.where(store_id: active_stores.select(:id))
  end
end
The begin...end is a code block that wraps the statements we're memoizing. The value of the last expression in the block is what gets returned and then assigned to @active_store_users.
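The same pattern in plain Ruby, as a sketch: the Report class and its order data are made up for illustration, and the runs counter shows that the whole block executes only once.

```ruby
# Hypothetical example class; the data is invented for illustration.
class Report
  attr_reader :runs

  def initialize(orders)
    @orders = orders
    @runs = 0
  end

  def paid_total
    @paid_total ||= begin
      @runs += 1
      paid = @orders.select { |order| order[:paid] }
      # The last expression is the value of the block, and what gets cached.
      paid.sum { |order| order[:amount] }
    end
  end
end

report = Report.new([
  { amount: 10, paid: true },
  { amount: 5, paid: false },
  { amount: 7, paid: true }
])
report.paid_total
puts report.paid_total # prints 17
puts report.runs       # prints 1
```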
A better approach?
Because of the way the ||= operator works, there's a gotcha to be aware of. Let's look at the memoization below:
def user
  @user ||= User.first
end
Remember that this is equivalent to @user || @user = User.first.
If the User.first query returns a falsey value such as nil or false, then @user is assigned that falsey value. Consequently, every subsequent invocation of the method triggers the query again, since the first part of the || (or) expression is falsey 😥. So we need to distinguish between when @user is actually falsey and when @user has never been assigned.
So, a better approach to memoization would be:
def user
  return @user if defined?(@user)

  @user = User.first
end
defined?(@user) evaluates to a truthy value (the string "instance-variable") as long as @user has previously been assigned a value, whether truthy or falsey, and to nil otherwise. So we now only invoke the query on the first execution. Exactly what we want! 😌
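A plain-Ruby sketch of the difference, assuming a lookup that legitimately returns nil. The Finder class and its find_nothing helper are hypothetical; only the call counts matter.

```ruby
# Hypothetical class comparing the two styles when the result is nil.
class Finder
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  # ||= re-runs the lookup on every call because nil is falsey.
  def with_or_equals
    @with_or_equals ||= find_nothing
  end

  # defined? only checks whether the variable was ever assigned.
  def with_defined
    return @with_defined if defined?(@with_defined)

    @with_defined = find_nothing
  end

  private

  # Pretend this is a query that finds no record.
  def find_nothing
    @lookups += 1
    nil
  end
end

finder = Finder.new
3.times { finder.with_or_equals }
puts finder.lookups # prints 3

finder = Finder.new
3.times { finder.with_defined }
puts finder.lookups # prints 1
```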
Memoizing with parameters
How do we memoize a method that accepts parameters?
Knowing how a hash data structure works, could it possibly help us here? Yes! We can keep a hash in an instance variable and store the result of each invocation under its parameter(s), so that subsequent calls with the same parameter(s) return the value stored in the hash instead.
Conveniently, it turns out that Ruby has a hash initializer that works perfectly for this use case. You can initialize a hash with Hash.new and pass a code block:
hash = Hash.new { |hash, key| hash[key] = calculate_value_for_the_key }
Every time a key is accessed in the hash that hasn't been set before, Ruby executes the code block to calculate the value for that key and store it in the hash.
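A small runnable sketch of that behaviour. Squaring a number stands in for the expensive calculation, and the calls counter is only there to show when the block runs.

```ruby
calls = 0

# The block runs only for keys that are not in the hash yet.
squares = Hash.new do |hash, key|
  calls += 1
  hash[key] = key * key
end

squares[4] # block runs and stores 16
squares[4] # served from the hash
squares[5] # new key, so the block runs again

puts calls                # prints 2
puts squares.keys.inspect # prints [4, 5]
```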
Now let's see how we can use a hash to memoize a method that accepts a parameter:
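def store_users(store)
  # The block is created once, on the first call, so it must use its own
  # store_id argument rather than the store captured from that first call.
  @store_users ||= Hash.new do |users, store_id|
    users[store_id] = User.where(store_id: store_id)
  end
  @store_users[store.id]
end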
So the store_users method fetches each store's users once and then memoizes the result for subsequent invocations, keyed by the store's id value 😎.
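Here is a plain-Ruby sketch of the same idea. The PriceList class and its slow_lookup helper are hypothetical. One detail worth noticing: the hash remembers a key even when the stored value is nil, so individual results do not suffer from the ||= gotcha.

```ruby
# Hypothetical class; slow_lookup stands in for a query per argument.
class PriceList
  PRICES = { "apple" => 3, "pear" => 4 }.freeze

  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  def price_for(name)
    # The block works from its own key argument, because the block is
    # created once and reused for every later call.
    @prices ||= Hash.new do |cache, key|
      cache[key] = slow_lookup(key)
    end
    @prices[name]
  end

  private

  def slow_lookup(name)
    @lookups += 1
    PRICES[name] # nil for unknown names
  end
end

list = PriceList.new
list.price_for("apple")
list.price_for("apple")
list.price_for("kiwi") # nil, but still cached
list.price_for("kiwi")
puts list.lookups # prints 2
```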
How about multiple parameters?
If you are planning to memoize a method that accepts multiple parameters, you may want to question the value of doing so, given the decreased legibility of the code. Nevertheless, if you have to, you can combine the parameters to form a unique key for the hash, e.g.
hash[[a, b]] = calculate_value_for_the_keys
Where a and b are the parameters for the method to memoize. An array key keeps the parameters distinct, whereas a + b can collide (1 + 2 and 2 + 1 produce the same key).
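A plain-Ruby sketch of a two-parameter method that uses an array of the arguments as the key. The Geometry class is hypothetical, and the multiplication stands in for expensive work.

```ruby
# Hypothetical class; the multiplication stands in for expensive work.
class Geometry
  attr_reader :computations

  def initialize
    @computations = 0
  end

  def area(width, height)
    @areas ||= {}
    key = [width, height]
    # key? distinguishes "never computed" from a cached falsey value.
    return @areas[key] if @areas.key?(key)

    @computations += 1
    @areas[key] = width * height
  end
end

geometry = Geometry.new
geometry.area(2, 3)
geometry.area(2, 3) # cached
geometry.area(3, 2) # different key, computed separately
puts geometry.computations # prints 2
```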
A gem maybe?
For the most part, you should be fine memoizing your methods in your codebase as needed. But, if you are quite concerned about sacrificing legibility for speed, you might want to consider using a gem that handles memoization for you with a nice, friendly API, thus giving you both legibility and speed.
Memoist is one such gem and is in fact an extension of what Rails used to have that was later removed in 2011. So, check it out and see if it fits your needs.
Of course, using a gem comes with the extra overhead of managing YAD (yet another dependency), so tread with caution as always 😅
Conclusion
We learned that memoization is a technique for speeding up methods that perform time-consuming operations by caching their results, thus avoiding redundant work. This is effective especially when such methods are being invoked multiple times.
It is worth noting that memoization is often a short-lived, in-memory caching technique. In Ruby, instance variables are often used as the cache store, meaning that the cache only lives as long as the object that holds those instance variables. In Rails, this is typically one request-response cycle, so you shouldn't have issues dealing with stale data.
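A quick sketch of that lifetime, using a hypothetical Session class: each new object starts with an empty cache, so nothing memoized on one instance leaks into another.

```ruby
# Hypothetical class; creating an object stands in for expensive work.
class Session
  def token
    @token ||= Object.new
  end
end

first = Session.new
puts first.token.equal?(first.token)  # prints true: same cached object

second = Session.new
puts first.token.equal?(second.token) # prints false: separate cache
```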