-
-
Save odrotbohm/3497047 to your computer and use it in GitHub Desktop.
JPQL: select p from Person p | |
left outer join p.address address | |
order by address.city | |
SQL: select person0_ from Person person0_ | |
left outer join Address address1_ on person0_.id=address1_.person_id | |
order by address1_.city | |
JPQL: select p from Person p | |
left outer join p.address address | |
order by p.address.city | |
SQL: select person0_ from Person person0_ | |
left outer join Address address1_ on person0_.id=address1_.person_id | |
cross join Address address2_ where person0_.id=address2_.person_id | |
order by address2_.city | |
The core difference here is that qualifying the sorting criteria to p.address.city | |
instead of address.city results in an additional JOIN clause added which results in | |
persons not having an address being ruled out from the result set. Environment is | |
Hibernate 4.1.5.SP1, H2 1.3.168. | |
1. Should the two JPQL queries result in the SQL queries shown? In particular | |
shouldn't they actually generate the very same SQL? | |
2. Why does the second JPQL query add the additional JOIN which effectively rules | |
out persons without an address? | |
3. Is it a good idea that adding a sort criterion potentially adds an additional | |
join which influences the actual result set return (i.e. "the number of results | |
changes just because I added a sort criteria") |
An addition: Say we have the following statement:
select p, privateAddress.city, companyAddress.city from Person p
left outer join p.addresses privateAddress
left outer join p.addresses companyAddress
where
privateAddress.type='PRIVATE' and companyAddress.type='COMPANY'
Regardless whether this looks like the best possible modeling - what exactly should
ORDER BY addresses.city
now do when added, following your recommendation to implicitly / automatically re-use widening joins when having an order by expression?
Nothing at all, it's invalid as addesses
is not known at all. p.addresses
would have been. Does the query actually make sense? Why would you want to outer join the tables (retaining null values) if you actually have a where
clause that will definitely return false for these values?
Again, as soon as you manually add joins you do so for a reason and then it's up to you to define all related clauses correctly. But if you don't do (select p from Person p
), adding an order by
clause simply must not restrict the result set more than actually expressed in the query.
OK, I seem to have way too much time (I haven't), but well - it''s an interesting problem.
So, here's another addition.
- my last comment over-emphasizes the concept of re-using an existing outer join. If applying the rule of "non-aliased order by path expressions implicitly add an outer join to the resulting SQL
FROM
clause, it would be clear what the query does and the actual selection in terms of rows would not be affected. - Navigation expressions are all about navigating an object graph, since JPA tries to hide the relational stuff and give us a "purely" object oriented view on our data, right? So when applying the rule from point 1, you would effectively say: this
ORDER BY
is not an object graph expression any more, because in an object graph we can only walk along transitions - and not non-transitions, which is what a left outer join would construct. So ifORDER BY a.b
would result in an outer join, the path expression applied to the object graph would read [a | ].b, which feels utterly wrong to me in an object model.
Regarding your comment to the addresses-example:
Of course it should have read ORDER BY p.addresses
. The example does make sense as a classic "map rows to columns" example, relying on an effective query planner and executer in the DBMS, causing least possible communication overhead between DB and JVM. More realistic would be: Generate result rows of form
firstname, lastname, home phone, business phone
given that phone numbers are attributes of an entity with 1:n relation to Persons and a discriminator column type
But again, the example not really good to make any case for my point in opposite to yours, as mentioned in my last comment - so forget about it :)
I think what I'm trying to say is that this is an issue related the "M" in ORM. M stands for consistent mappings of idioms only available in / suitable for the O tier to the R tier. The path navigation is clearly an O concept, which we have to map to R. For the sake of consistency, it always imposes a side effect, namely an inner join.
My second SQL example was a bit wrong. Better would have been
Here you can see a real side effect of the path expression, without being anyhow explicit about joins. How does this side effect differ from the one imposed by order by, rethinking both examples with both possible join semantics? The consistency lies therein that a path navigation expression solely and always consistently affects the dynaset definition (
FROM
part) of your query, and not the projection (SELECT
), selection (WHERE
) orORDER BY
. So regardless where it is used, it should have the same meaning by having the same side effect on the dynaset. And no, it is not a side effect on theWHERE
clause, although it may look that way in the generated SQL. Semantically it only adds anINNER JOIN
to theFROM
clause, which Hibernate for whatever reason de-composes to a selection on a cartesian product. This is valid in relational algebra and the correct decomposition of the non-basicJOIN
operation - but it is totally idiotic to use it in a DMBS with extended relational algebra representation, meaning anything newer than Oracle 8i :)To be clear about, I was not talking about SQL at all, so of course I would not feel good about side effects on my dynaset imposed by
ORDER BY
in my SQL. Ironically SQL would not even allow me to generate side effects, because each and everything I want to use for ordering has to be explicitly represented by myFROM
clause, including the right and thoughtfully chosen join needed to order by a criterion in a related table.PS: The statement corrected above, although not intended, would have been correct for a "German-speaking" DBMS, since in Germany 'Bielefeld' equals null :)