A Simple Correction

In yesterday's post, in relation to "how does .present? work on ActiveRecord::Relation", I said that present? performs an existence check (SELECT 1 AS one FROM ... LIMIT 1) because it calls exists? underneath. This is actually wrong: it loads the relation.

Jonathan Mast corrected me on Twitter. It turns out I should have paid closer attention! Here is the actual implementation of blank? on ActiveRecord::Relation on Rails master:

# Returns true if relation is blank.
def blank?
  records.blank?
end

records with no @ in front of it? Usually ActiveRecord::Relation accesses @records directly in its internals. @records can be thought of as "the plain Array which holds the loaded ActiveRecord objects", so what's records?

def records # :nodoc:
  load
  @records
end

D'oh! So blank? works exactly the way I said it DIDN'T work! It doesn't do an existence check (SELECT 1 ...), it loads the entire relation! Of course, this makes it dangerous for a different reason. Imagine a view like this:

- if @my_records.present?
  - @my_records.first(3).each do |record|

You could be loading the entire relation and then only using a small slice of it.
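To make that cost concrete, here is a minimal sketch in plain Ruby. ToyRelation is a hypothetical stand-in, not the real ActiveRecord::Relation: it only mimics the blank? -> records -> load chain shown above and logs the SQL it pretends to run. Plain Ruby has no Array#blank?, so empty? stands in for it.

```ruby
# Toy stand-in for ActiveRecord::Relation (hypothetical; not the Rails class).
class ToyRelation
  attr_reader :queries

  def initialize(rows)
    @rows = rows      # pretend these rows live in the database
    @records = nil    # filled in by #load, like @records in Rails
    @queries = []     # log of the SQL this toy pretends to run
  end

  def loaded?
    !@records.nil?
  end

  # Fetch every row once; later calls reuse @records.
  def load
    unless loaded?
      @queries << 'SELECT "users".* FROM "users"'
      @records = @rows.dup
    end
    self
  end

  def records
    load
    @records
  end

  # Plain Ruby has no Array#blank?, so empty? stands in for it here.
  def blank?
    records.empty?
  end

  def present?
    !blank?
  end

  # This toy only models the loaded case: first(n) slices the in-memory records.
  def first(n)
    records.first(n)
  end
end

users = ToyRelation.new((1..1000).map { |i| "user#{i}" })
shown = users.present? ? users.first(3) : []

puts users.queries.inspect  # a single full SELECT with no LIMIT
puts users.records.size     # every row is held in memory
puts shown.inspect          # only three of them are used
```

In this model, present? pays for all 1000 rows even though the "view" only renders three of them.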

What about the other "existence predicates" which people seem to use interchangeably: exists?, empty?, any?, none? and blank?

Here's a table to summarize how this behavior works in Rails 5.1+:

method	SQL generated	memoized?	implementation	Runs query if `loaded?`
present?	SELECT "users".* FROM "users"	yes (`load`)	Object (!blank?)	no
blank?	SELECT "users".* FROM "users"	yes (`load`)	`load`; `blank?`	no
any?	SELECT 1 AS one FROM "users" LIMIT 1	no unless `loaded`	`!empty?`	no
empty?	SELECT 1 AS one FROM "users" LIMIT 1	no unless `loaded`	`!exists?` unless `loaded?`	no
none?	SELECT 1 AS one FROM "users" LIMIT 1	no unless `loaded`	`empty?`	no
exists?	SELECT 1 AS one FROM "users" LIMIT 1	no	ActiveRecord::Calculations	yes
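The Rails 5.1+ rows can be read as a small state machine: existence predicates ask the database until the relation is loaded, after which only exists? still issues a query. The sketch below models that in plain Ruby. ToyRelation51 and its constants are hypothetical names for illustration, and the SQL strings are copied from the table rather than generated.

```ruby
# Hypothetical model of the Rails 5.1+ table; not the real ActiveRecord code.
class ToyRelation51
  EXISTS_SQL = 'SELECT 1 AS one FROM "users" LIMIT 1'
  LOAD_SQL   = 'SELECT "users".* FROM "users"'

  attr_reader :queries

  def initialize(rows)
    @rows = rows      # pretend database contents
    @records = nil    # nil until #load runs
    @queries = []     # log of pretend SQL
  end

  def loaded?
    !@records.nil?
  end

  def load
    unless loaded?
      @queries << LOAD_SQL
      @records = @rows.dup
    end
    self
  end

  # Never memoized: every call logs a query.
  def exists?
    @queries << EXISTS_SQL
    !@rows.empty?
  end

  # Uses the in-memory records when loaded, otherwise an existence query.
  def empty?
    return @records.empty? if loaded?
    !exists?
  end

  def any?
    !empty?
  end

  def none?
    empty?
  end
end

users = ToyRelation51.new(%w[alice bob])
users.any?     # existence query
users.none?    # a second existence query; nothing was memoized
users.load     # full SELECT
users.empty?   # answered from memory, no query
users.exists?  # still queries, even though the relation is loaded
puts users.queries
```

Four queries are logged here: two existence checks, the full load, and the final exists? call.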

However, empty? had its implementation changed in Rails 5.1. In previous versions (Rails 5.0, Rails 4.2 and lower), empty? and any? work differently. none? also had its implementation changed in Rails 5.0! Here's the table for Rails 5.0, which changed the implementation of none? to match any? and empty?:

method	SQL generated	memoized?	implementation	Runs query if `loaded?`
present?	SELECT "users".* FROM "users"	yes (`load`)	Object (!blank?)	no
blank?	SELECT "users".* FROM "users"	yes (`load`)	`load`; `blank?`	no
any?	SELECT COUNT(*) FROM "users"	no unless `loaded`	`!empty?`	no
empty?	SELECT COUNT(*) FROM "users"	no unless `loaded`	count(:all) == 0	no
none?	SELECT COUNT(*) FROM "users"	no unless `loaded`	`empty?`	no
exists?	SELECT 1 AS one FROM "users" LIMIT 1	no	ActiveRecord::Calculations	yes

And here's the table for Rails 4.2:

method	SQL generated	memoized?	implementation	Runs query if `loaded?`
present?	SELECT "users".* FROM "users"	yes	Object (!blank?)	no
blank?	SELECT "users".* FROM "users"	yes	to_a.blank?	no
any?	SELECT COUNT(*) FROM "users"	no unless `loaded`	`!empty?`	no
empty?	SELECT COUNT(*) FROM "users"	no unless `loaded`	count(:all) == 0	no
none?	SELECT "users".* FROM "users"	yes (`load` called)	Array	no
exists?	SELECT 1 AS one FROM "users" LIMIT 1	no	ActiveRecord::Calculations	yes

none? wasn't defined on ActiveRecord::Relation in Rails 4, so calling it on an ActiveRecord::Relation loads the records and calls none? on the resulting Array. The implementation of blank? also changed, though the effects are still the same.
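A rough way to picture the Rails 4.2 none? behavior is a relation that forwards any method it does not define to its loaded Array. The sketch below is an assumption-laden toy (ToyRelation42 is a hypothetical name, and the real delegation code in Rails is more involved), but it shows why an undefined predicate ends up triggering a full load while empty? only counts.

```ruby
# Hypothetical model of Rails 4.2-style delegation; not the real Rails code.
class ToyRelation42
  attr_reader :queries

  def initialize(rows)
    @rows = rows      # pretend database contents
    @records = nil    # nil until #to_a runs
    @queries = []     # log of pretend SQL
  end

  # Loads every row once and memoizes the result.
  def to_a
    if @records.nil?
      @queries << 'SELECT "users".* FROM "users"'
      @records = @rows.dup
    end
    @records
  end

  # Defined on the relation itself, so it can use a cheaper COUNT query.
  def empty?
    return @records.empty? unless @records.nil?
    @queries << 'SELECT COUNT(*) FROM "users"'
    @rows.size.zero?
  end

  # Anything the relation does not define is forwarded to the loaded Array.
  def method_missing(name, *args, &block)
    if Array.method_defined?(name)
      to_a.public_send(name, *args, &block)
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    Array.method_defined?(name) || super
  end
end

users = ToyRelation42.new(%w[alice bob])
users.none?   # not defined above, so it loads everything and asks the Array
puts users.queries.inspect
```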

If you'd like to test the above tables, I have a simple script you can drop into your Rails application and run with rails runner script.rb.

These six predicate methods, which are English-language synonyms all asking the same question, have completely different implementations and performance implications, and these consequences depend on which version of Rails you are using. It's all about ten times more complicated than I thought when I wrote the original article! So, let me distill all of the above into some concrete advice:

present? and blank? should not be used if the ActiveRecord::Relation will never be used in its entirety after you call present? or blank?. For example, @my_relation.present?; @my_relation.first(3).each.
any?, none? and empty? should probably be replaced with present? or blank? unless you will only take a section of the ActiveRecord::Relation using first or last. If you are going to use the entire relation anyway, they generate an extra existence query on top of the full load. In essence, change @users.any?; @users.each... to @users.present?; @users.each... or @users.load.any?; @users.each..., but @users.any?; @users.first(3).each is fine.
exists? is a lot like count: it is never memoized, and always executes a SQL query. Most people probably do not actually want this behavior, and would be better off using present? or blank?.
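The advice above can be checked against a toy model that counts queries and fetched rows for each usage pattern. CountingRelation and measure are hypothetical helpers written for this sketch; they follow the Rails 5.1+ table, and the LIMIT query for first(n) is simplified (real Rails also adds an ORDER BY on the primary key).

```ruby
# Hypothetical stand-in for ActiveRecord::Relation that counts its own work.
class CountingRelation
  attr_reader :queries, :rows_fetched

  def initialize(rows)
    @rows = rows
    @records = nil
    @queries = []
    @rows_fetched = 0
  end

  def loaded?
    !@records.nil?
  end

  def load
    unless loaded?
      @queries << 'SELECT "users".* FROM "users"'
      @records = @rows.dup
      @rows_fetched += @records.size
    end
    self
  end

  def to_a
    load
    @records
  end

  def present?
    !to_a.empty?
  end

  def each(&block)
    to_a.each(&block)
  end

  # Rails 5.1+ behavior: answer from memory if loaded, else ask the database.
  def any?
    return !@records.empty? if loaded?
    @queries << 'SELECT 1 AS one FROM "users" LIMIT 1'
    !@rows.empty?
  end

  # Slices memory if loaded, otherwise runs a (simplified) LIMIT query.
  def first(n)
    return @records.first(n) if loaded?
    @queries << "SELECT \"users\".* FROM \"users\" LIMIT #{n}"
    @rows_fetched += [n, @rows.size].min
    @rows.first(n)
  end
end

# Runs one usage pattern against a fresh relation and reports
# [number of queries, number of rows fetched].
def measure(rows)
  relation = CountingRelation.new(rows)
  yield relation
  [relation.queries.size, relation.rows_fetched]
end

rows = (1..1000).map { |i| "user#{i}" }

present_then_slice = measure(rows) { |r| r.first(3) if r.present? }
any_then_slice     = measure(rows) { |r| r.first(3) if r.any? }
any_then_each      = measure(rows) { |r| r.each { |u| u } if r.any? }
present_then_each  = measure(rows) { |r| r.each { |u| u } if r.present? }
load_any_then_each = measure(rows) { |r| r.each { |u| u } if r.load.any? }

p present_then_slice  # => [1, 1000]
p any_then_slice      # => [2, 3]
p any_then_each       # => [2, 1000]
p present_then_each   # => [1, 1000]
p load_any_then_each  # => [1, 1000]
```

In this model, any? followed by first(3) costs two small queries, while any? followed by each pays for an existence check that present? or load.any? would have avoided.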

Also note, from the complexity of the tables above, that ActiveRecord's definition of API stability may not extend to its generated SQL. From AR's perspective, SQL is an implementation detail, which means that performance could change significantly across minor versions as certain methods may generate different queries.

nateberkopec/correction.md
