Skip to content

Instantly share code, notes, and snippets.

@zclfly
Forked from cpjolicoeur/gist:3590737
Created May 7, 2014 07:21
Show Gist options
  • Save zclfly/e8ee8bce53c2ea6204d9 to your computer and use it in GitHub Desktop.
Save zclfly/e8ee8bce53c2ea6204d9 to your computer and use it in GitHub Desktop.

I'm hunting for the best solution on how to handle keeping large sets of DB records "sorted" in a performant manner.

Problem Description

Most of us have work on projects at some point where we have needed to have ordered lists of objects. Whether it be a to-do list sorted by priority, or a list of documents that a user can sort in whatever order they want.

A traditional approach for this on a Rails project is to use something like the acts_as_list gem, or something similar. These systems typically add some sort of "postion" or "sort order" column to each record, which is then used when querying out the records in a traditional order by position SQL query.

This approach seems to work fine for smaller datasets, but can be hard to manage on large data sets with hundreds (or thousands) of records needing to be sorted. Changing the sort position of even a single object will require updating every single record in the database that is in the same sort group. This requires potentially thousands of writes to the database for every sort order update.

Possible Solutions

One typical alternative approach is to store a single, serialized column that contains the full sort order (for instance a serialized array column that contains the ordered list of record ids). Using this approach, we are now required to query the records out from the database using this specified order.

Ideally, it would be nice if the database just used the order of the in clause so a query of select * from foo where id in (67,23,1362,24) returned the records in that order, but I'm not aware of a database that does this.

MySQL has a built-in ORDER BY Field() clause where you could do something like ORDER BY FIELD(67,23,1362,24)

PostgreSQL doesn't seem to have any similar functions. There are two somewhat widely discussed solutions that I have come across.

  1. Using the PostgreSQL CASE clause like the following:
    SELECT * FROM foo ORDER BY
    CASE
      WHEN id=67 THEN 1
      WHEN id=23 THEN 2
      WHEN id=1362 THEN 3
      WHEN id=24 THEN 4
    END
  1. Using a custom JOIN table like the following:
    SELECT * FROM foo
    JOIN (VALUES (67,1), (23,2), (1362,3), (24,4)) as x(id, ordering) ON foo.id = x.id
    ORDER BY x.ordering

I'm currently using solution #2 as it was easier to implement as a module that I can include in any AR::Base class and have it work well with other scopes.

I'm wondering if there is an obvious solution I'm just missing, or any alternatives or alternate implementations of the db schema to begin with that might make this a non-issue.

What are the best ways to keep a large set of data sorted according to user specified order with PostgreSQL?

Sidenote

For this particular use case that I'm dealing with, we allow sorting more than one object at a time. The objects (images) are laid out in a grid on the UI and the user can select a single image to reorder via drag and drop, or they can select multiple images to be reordered via drag and drop at the same time. This makes using something like the acts_as_list#insert_at method less useful. I suppose we could call #insert_at or manually #update_attribute(:position, N) one at a time for each item that was moved in position, but I'd like to do it all in one fell swoop if possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment