@parj
Created April 27, 2023 20:28
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;

import java.io.BufferedWriter;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.sql.*;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class DatabaseToJson {

    private static final int PAGE_SIZE = 100_000;
    private static final Gson gson = new GsonBuilder().create();

    public static void main(String[] args)
            throws SQLException, IOException, InterruptedException {
        // Set up database connection
        String url = "jdbc:mysql://localhost:3306/mydatabase";
        String user = "myuser";
        String password = "mypassword";
        Connection connection = DriverManager.getConnection(url, user, password);

        // Set up file writer
        BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
                new FileOutputStream("output.json"), StandardCharsets.UTF_8));

        // Set up executor service
        ExecutorService executor = Executors.newFixedThreadPool(10);

        // Set up pagination
        int page = 0;
        boolean hasNextPage = true;

        while (hasNextPage) {
            page++;

            // Query database for a page of data
            PreparedStatement statement = connection.prepareStatement(
                    "SELECT * FROM mytable LIMIT ? OFFSET ?");
            statement.setInt(1, PAGE_SIZE);
            statement.setInt(2, (page - 1) * PAGE_SIZE);
            ResultSet resultSet = statement.executeQuery();

            // Read each row on this thread (a ResultSet is not thread-safe and is
            // closed at the end of the page), then serialize and write it on the pool
            int rowCount = 0;
            while (resultSet.next()) {
                rowCount++;
                final Object value = resultSet.getObject(1); // Assuming first column contains data
                executor.submit(() -> {
                    try {
                        String json = gson.toJson(value);
                        // Synchronize so concurrent tasks don't interleave partial lines
                        synchronized (writer) {
                            writer.write(json);
                            writer.newLine();
                        }
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                });
            }
            hasNextPage = rowCount == PAGE_SIZE;

            // Close database resources
            resultSet.close();
            statement.close();
        }

        // Clean up: wait for pending writes to finish before closing the writer
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.HOURS);
        writer.close();
        connection.close();
    }
}
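The page loop's termination rule (stop once a page comes back with fewer than PAGE_SIZE rows) can be checked in isolation. A minimal, database-free sketch of the same arithmetic, using a hypothetical in-memory "table" in place of LIMIT/OFFSET:

```java
import java.util.ArrayList;
import java.util.List;

public class PaginationSketch {
    // Same rule as the gist: compute the OFFSET for a 1-based page number.
    static int offsetFor(int page, int pageSize) {
        return (page - 1) * pageSize;
    }

    // Fetch one "page" from an in-memory list, standing in for LIMIT/OFFSET.
    static List<Integer> fetchPage(List<Integer> table, int page, int pageSize) {
        int from = Math.min(offsetFor(page, pageSize), table.size());
        int to = Math.min(from + pageSize, table.size());
        return table.subList(from, to);
    }

    public static void main(String[] args) {
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < 250; i++) table.add(i); // 250 rows, pageSize 100

        int pageSize = 100;
        int page = 0;
        boolean hasNextPage = true;
        int total = 0;
        while (hasNextPage) {
            page++;
            List<Integer> rows = fetchPage(table, page, pageSize);
            total += rows.size();
            // A short (or empty) page means we've reached the end of the table.
            hasNextPage = rows.size() == pageSize;
        }
        System.out.println(page);  // 3
        System.out.println(total); // 250
    }
}
```

With 250 rows and a page size of 100, the loop fetches pages of 100, 100, and 50 rows, then stops — the same behavior the gist relies on when the last database page runs short.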
/*
Here's how this code works:

1. It opens a connection to a MySQL database.
2. It sets up a buffered file writer to write JSON data to disk.
3. It sets up a fixed-size executor service to serialize and write rows in parallel.
4. It paginates through the table with LIMIT/OFFSET, one page at a time.
5. For each row, it reads the column value on the main thread (a ResultSet is not
   thread-safe and is closed at the end of each page), then submits a task that
   converts the value to JSON and appends it to the file; writes are synchronized
   so concurrent tasks don't interleave partial lines.
6. It keeps fetching pages until a page comes back with fewer than PAGE_SIZE rows.
7. It shuts down the executor, waits for outstanding write tasks to finish, and
   only then closes the file writer and the database connection.

To meet performance and scalability goals, this code:

- Uses a fixed-size thread pool to cap the number of threads and their overhead.
- Offloads JSON serialization and file writes to the pool so the main thread can
  keep reading rows from the database.
- Uses pagination to bound how much data is queried and held in memory at once.
- Uses a buffered writer to minimize I/O overhead.
- Uses Gson to convert row values to JSON.
*/
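The cleanup ordering in step 7 matters: closing the writer before the pool has drained would lose (or fail) any writes still queued. A stdlib-only sketch of the same pattern — fixed pool, synchronized writes to a shared writer, then shutdown, awaitTermination, and close — using a StringWriter as a hypothetical stand-in for the output file:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelWriteSketch {
    public static String writeAll(int rows) throws Exception {
        StringWriter target = new StringWriter(); // stands in for the output file
        BufferedWriter writer = new BufferedWriter(target);
        ExecutorService executor = Executors.newFixedThreadPool(10);

        for (int i = 0; i < rows; i++) {
            final int row = i;
            executor.submit(() -> {
                try {
                    // Synchronize so concurrent tasks don't interleave partial lines.
                    synchronized (writer) {
                        writer.write("{\"row\":" + row + "}");
                        writer.newLine();
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
        }

        // Order matters: stop accepting tasks, wait for the pool to drain,
        // and only then close (and flush) the writer.
        executor.shutdown();
        executor.awaitTermination(1, TimeUnit.MINUTES);
        writer.close();
        return target.toString();
    }

    public static void main(String[] args) throws Exception {
        String out = writeAll(1000);
        System.out.println(out.split("\\R").length); // 1000 complete lines
    }
}
```

Every line comes out whole because the write-plus-newline pair happens inside one synchronized block; without it, ten threads sharing one BufferedWriter could interleave fragments of different rows.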