Handling order with distinct: true in Rails and PostgreSQL
Problem Statement
When using ActiveRecord in Rails with PostgreSQL, you may encounter a common issue when trying to chain `.distinct` with `.order`:
This can raise a PostgreSQL error like:
PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
PostgreSQL is strict when using DISTINCT. If you use .distinct and specify an ORDER BY, PostgreSQL requires that every column in the ORDER BY must also be selected.
We want to retrieve unique records, based on some field(s), in a specific order — without breaking PostgreSQL’s rules.
Solution 1: Use select to include the order column
Add both the column(s) you want to DISTINCT by and the ones you want to ORDER BY in the select clause:
User.select('DISTINCT ON (users.email) users.*').order('users.email, users.created_at DESC')
Solution 2: Use Subquery to Get Distinct IDs, Then Fetch Full Records
Step 1: Select distinct IDs in subquery
distinct_ids = User.select(:email).distinct.pluck(:id)
Step 2: Use those IDs to load full records with ordering
User.where(id: distinct_ids).order(created_at: :desc)
Solution 3: Use DISTINCT ON in Raw SQL for Custom Use
If you need very specific behavior, you can drop to SQL:
User.find_by_sql <<~SQL
SELECT DISTINCT ON (email) *
FROM users
ORDER BY email, created_at DESC
SQL
You’ll get the latest user for each email, sorted by created_at.
Example Use Case
Say you want to get the most recent orders per customer:
Order.select('DISTINCT ON (customer_id) *').order('customer_id, created_at DESC')
Work with our skilled Ruby on Rails developers to accelerate your project and boost its performance.
Hire Ruby on Rails Developer