Girders Blog
Notes on building internet applications

ActiveRecord#find using :include=>association no longer necessarily joins the association

Oct 30, 2008
Before Rails 2.1, adding an :include=>[:association] in your find method caused ActiveRecord to generate SQL using a join. Since 2.1, it MAY NOT execute as a join.

The join executes a large query and returned potentially duplicate records for a one-to-many association. After 2.1, the query is broken down and eager-loaded using an additional query per association, passing the set of id‘s to load, and avoiding the duplicate rows.

The new method eliminates duplicates, but can incur more database overhead. If you are loading a very large set of records (more than a “page”), you may need to “force” the join or use find_by_sql instead.

When you specify a “table.column” syntax within a

:conditions=>["child.name=?", name]

or

:order=>'child.name'

then ActiveRecord will build the older, full query with the join because you are referencing columns from another table to build. This will cause the duplicate rows to reappear.

Whenever you reference a column from another table in a condition or order clause, ALWAYS use the table name to prefix the column, even if it not ambiguous among the tables involved. Otherwise the query will not be executed as a join and you will receive an SQL error referencing the “missing” column.

You can “force” a join by adding a reference to the other tables in your :conditions or :options parameters, even if the test or sort is irrelevant.