Created
November 12, 2008 03:19
-
-
Save metaskills/24073 to your computer and use it in GitHub Desktop.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# Debugging limiting association joins for SQL Server Adapter. | |
Giving the following limiting SQL that would be normaly be generatd for the | |
test_eager_load_has_many_through_has_many(AssociationsJoinModelTest) case in | |
ActiveRecord for SQL Server. In this case the ORDER BY is totally moot. | |
SELECT DISTINCT TOP 1 [authors].id | |
FROM [authors] | |
LEFT OUTER JOIN [posts] ON ([authors].[id] = [posts].[author_id]) | |
LEFT OUTER JOIN [comments] ON ([comments].[post_id] = [posts].[id]) | |
WHERE (name = 'David') ORDER BY comments.id | |
Note that this SQL is generated by #construct_finder_sql_for_association_limiting | |
method. We can take two routes here. First we can follow the PostgreSQL adapter | |
again and resort to adding a connection #distinct and #add_order_by_for_association_limiting! | |
method that would take the the entire SQL into a sub select and use aliasing. In IRB | |
it would looks something like this for #distinct | |
# Arg Values | |
columns = '[authors].id' | |
order_by = 'comments.id' | |
# Execution And Return | |
>> order_columns = order_by.split(',').collect { |s| s.split.first } | |
=> ["comments.id"] | |
>> order_columns.delete_if &:blank? | |
=> ["comments.id"] | |
>> order_columns = order_columns.zip((0...order_columns.size).to_a).map { |s,i| "#{s} AS alias_#{i}" } | |
=> ["comments.id AS alias_0"] | |
>> sql = "DISTINCT ON (#{columns}) #{columns}, " | |
=> "DISTINCT ON ([authors].id) [authors].id, " | |
>> sql << order_columns * ', ' | |
=> "DISTINCT ON ([authors].id) [authors].id, comments.id AS alias_0" | |
Now for PostgreSQLs #add_order_by_for_association_limiting! method. | |
# Arg Values | |
sql = %|SELECT DISTINCT TOP 1 [authors].id | |
FROM [authors] | |
LEFT OUTER JOIN [posts] ON ([authors].[id] = [posts].[author_id]) | |
LEFT OUTER JOIN [comments] ON ([comments].[post_id] = [posts].[id]) | |
WHERE (name = 'David')| | |
options = {:order => 'comments.id'} | |
# Execution And Return | |
>> order = options[:order].split(',').collect { |s| s.strip }.reject(&:blank?) | |
=> ["comments.id"] | |
>> order.map! { |s| 'DESC' if s =~ /\bdesc$/i } | |
=> [nil] | |
>> order.zip((0...order.size).to_a) | |
=> [[nil, 0]] | |
>> order = order.zip((0...order.size).to_a).map { |s,i| "id_list.alias_#{i} #{s}" }.join(', ') | |
=> "id_list.alias_0 " | |
Now minus the fact that there is no DISTINCT ON in SQL Server, the translated SQL | |
would look like this and is quite valid. Despite that two columns are being returned | |
for the select_all, rails will only look for the column that matches the primary_key. | |
So the alias_n will doubtfully be an issue. | |
# Before | |
SELECT DISTINCT TOP 1 [authors].id | |
FROM [authors] | |
LEFT OUTER JOIN [posts] ON ([authors].[id] = [posts].[author_id]) | |
LEFT OUTER JOIN [comments] ON ([comments].[post_id] = [posts].[id]) | |
WHERE (name = 'David') ORDER BY comments.id | |
# Adapter Implementation | |
def distinct(columns, order_by) | |
return "DISTINCT #{columns}" if order_by.blank? | |
order_columns = order_by.split(',').collect { |s| s.split.first } | |
order_columns.delete_if(&:blank?) | |
order_columns = order_columns.zip((0...order_columns.size).to_a).map{ |s,i| "#{s} AS alias_#{i}" }.join(', ') | |
"DISTINCT #{columns}, #{order_columns}" | |
end | |
def add_order_by_for_association_limiting!(sql, options) | |
return sql if options[:order].blank? | |
order = options[:order].split(',').collect { |s| s.strip }.reject(&:blank?) | |
order.map! { |s| 'DESC' if s =~ /\bdesc$/i } | |
order = order.zip((0...order.size).to_a).map { |s,i| "id_list.alias_#{i} #{s}" }.join(', ') | |
sql.replace "SELECT * FROM (#{sql}) AS id_list ORDER BY #{order}" | |
end | |
# After | |
SELECT * FROM ( | |
SELECT DISTINCT TOP 1 [authors].id, comments.id AS alias_0 | |
FROM [authors] | |
LEFT OUTER JOIN [posts] ON ([authors].[id] = [posts].[author_id]) | |
LEFT OUTER JOIN [comments] ON ([comments].[post_id] = [posts].[id]) | |
WHERE (name = 'David') | |
) AS id_list ORDER BY id_list.alias_0 | |
However there is one fatal flaw in this implementation. If the ordered by column | |
is not unique for the DISTINCT column in the catesian product, un-unique values | |
will be returned. For instance #test_select_limited_ids_list in HasAndBelongsToManyAssociationsTest | |
will fail with <["1", "2"]> expected but was <["1", "1", "1", "2"]>. There are two | |
possible recourses. | |
First the fact of the matter is that the order by is mostly moot in the limiting | |
SQL, as far as I can tell from the tests. So this implementation would work just | |
fine. | |
def add_order_by_for_association_limiting!(sql, options) | |
sql | |
end | |
However instead of this, we should make an attempt like the PostgreSQL adapter to | |
maintain ordering for distinct values. Based on that goal the following URL is an | |
excellent article on our implementation: | |
http://weblogs.sqlteam.com/jeffs/archive/2007/12/13/select-distinct-order-by-error.aspx | |
# Adapter Implementation | |
def add_order_by_for_association_limiting!(sql, options) | |
return sql if options[:order].blank? | |
columns = sql.match(/SELECT\s+DISTINCT(.*)FROM/)[1].strip | |
sql.sub!(/SELECT\s+DISTINCT/,'SELECT') | |
sql << "GROUP BY #{columns} ORDER BY #{order_to_min_set(options[:order])}" | |
end | |
def order_to_min_set(order) | |
dir = '' | |
orders = order.split(',').map(&:strip).reject(&:blank?) | |
mins = orders.map do |o| | |
if o =~ /\b(asc|desc)$/i | |
dir = $1 | |
o = o.sub($1,'').strip | |
end | |
"MIN(#{o})" | |
end | |
"#{mins.join(', ')} #{dir}".strip | |
end | |
# Working SQL | |
SELECT TOP 1 [authors].id | |
FROM [authors] | |
LEFT OUTER JOIN [posts] ON ([authors].[id] = [posts].[author_id]) | |
LEFT OUTER JOIN [comments] ON ([comments].[post_id] = [posts].[id]) | |
WHERE (name = 'David') | |
GROUP BY [authors].id | |
ORDER BY MIN(comments.id), MIN(comments.post_id) |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment