-
-
Save codebje/58d1b12e7a2d0ed31b3a to your computer and use it in GitHub Desktop.
public <T> T streamQuery(String sql, Function<Stream<SqlRowSet>, ? extends T> streamer, Object... args) { | |
return jdbcTemplate.query(sql, resultSet -> { | |
final SqlRowSet rowSet = new ResultSetWrappingSqlRowSet(resultSet); | |
final boolean parallel = false; | |
// The ResultSet API has a slight impedance mismatch with Iterators, so this conditional | |
// simply returns an empty iterator if there are no results | |
if (!rowSet.next()) { | |
return streamer.apply(StreamSupport.stream(Spliterators.emptySpliterator(), parallel)); | |
} | |
Spliterator<SqlRowSet> spliterator = Spliterators.spliteratorUnknownSize(new Iterator<SqlRowSet>() { | |
private boolean first = true; | |
@Override | |
public boolean hasNext() { | |
return !rowSet.isLast(); | |
} | |
@Override | |
public SqlRowSet next() { | |
if (!first || !rowSet.next()) { | |
throw new NoSuchElementException(); | |
} | |
first = false; // iterators can be unwieldy sometimes | |
return rowSet; | |
} | |
}, Spliterator.IMMUTABLE); | |
return streamer.apply(StreamSupport.stream(spliterator, parallel)); | |
}, args); | |
} |
fun queryStream(sql: String, converter: (SqlRowSet) -> T, args: Array): Stream {
val rowSet = jdbcTemplate.queryForRowSet(sql, *args);
...
Meh, that's totally missing the point!
The idea is lazy consumption (on the fly row processing), but with JdbcTemplate::queryForRowSet
you're just aggregating results in memory! For that to workJdbcTemplate
would have to keep the db connection open after making that call and it just doesn't do that.
You can replace your whole snippet with jdbcTemplate.queryForList(...).stream()
to get the same result.
Fortunately, there is JdbcTemplate.queryForStream() method since Spring 5.3
Thus, this approach is not useful anymore
Meh, that's totally missing the point! The idea is lazy consumption (on the fly row processing), but with
JdbcTemplate::queryForRowSet
you're just aggregating results in memory! For that to workJdbcTemplate
would have to keep the db connection open after making that call and it just doesn't do that.
@sabirove It's been a while since I looked at this. This actually does work lazily. RowSet fetches on demand using a database cursor. That is the whole point of this approach. You have to indeed keep the connection open and process everything in a single connection for this to work because otherwise the database closes the cursor.
I used this to stream export tables that definitely would not fit in memory on the small vm I used to run this on. You need some additional Spring hackery to be able to write to the response directly of course (it normally would buffer everything in memory and then write to the response). Not that trivial with Spring Flux but I managed to do that. Basically, this allows you stream many gigabytes of data to a file with a simple curl command. It starts streaming data right away while it is still fetching data from the database and it writes at the speed it can fetch data from the database. Nice if you want your data in csv or ndjson format via a rest api, which was in fact my use case.
@wakedeer Nice that Spring added proper support for this. Makes this a bit easier.
Thanks for putting me on the right track with this. I am using this from Kotlin and came up with a Streaming version that works fine if you use it in a transaction. Also, you have a bug in your exit condition and you are skipping the last row. I fixed this by making hasNext do most of the work here (a good idea when implementing iterators) and by streaming the converted rows and not the rows themselves. This also gets rid of the problem you have with the first row.
This is Kotlin but you should be able to convert this back to Java easily. I use JdbcTemplate together with a TransactionTemplate to make sure the connection stays open while I stream the results.