Skip to content

Instantly share code, notes, and snippets.

@ProjectCleverWeb
Last active January 17, 2025 10:28
Show Gist options
  • Save ProjectCleverWeb/d2362b082af1d7054ebfd464f202ec1b to your computer and use it in GitHub Desktop.
Save ProjectCleverWeb/d2362b082af1d7054ebfd464f202ec1b to your computer and use it in GitHub Desktop.

(My)SQL Batch Multi-Line Insert Data Export/Extracter for DataGrip

This is a better version of the multi-line insert script for DataGrip, than what is currently floating around.

Compatibility

This is likely compatible with most, if not all, other SQL dialects. However, I am only choosing to directly support & test MySQL.

If you want to make this compatible with other SQL dialects go ahead and fork this and add support yourself and mention it in the comments. I will NOT add support for other dialects. (It's a gist, not a repo that I want to continue to support forever)

Install (DataGrip)

  1. Go to the "Data Extracters" drop down. (in results bar)
  2. Select "Go to Scripts Directory"
  3. Create a new file called "SQL Batch Multi-Line Insert.sql.groovy" (The double extension is important for syntax highlighting)
  4. Paste & save the below script
  5. (optional) Adjust the configuration variables at the top to your liking

Features

  • Batch inserts (defaults to 1000 rows at a time)
  • Minify
  • Custom indentation
  • etc. (see the configuration variables)
/**
* MySQL Multi-Line Insert
* =======================
* See: https://gist.github.com/ProjectCleverWeb/d2362b082af1d7054ebfd464f202ec1b
*
* Notes:
* - I used a output buffer variable since in most languages variables are
* significantly faster than function calls and in this case the syntax for
* using a variable was both shorter and easier to read.
* - Table & column names are quoted to help improve compatibility, but in most
* cases it safe to remove them.
* - Having keywords set to "do not change" for the letter-case will assume
* lowercase keywords. (DataGrip determines this)
*
* @author Nicholas Summers
* @license MIT
*
* Available context bindings:
* COLUMNS List<DataColumn>
* ROWS Iterable<DataRow>
* OUT { append() }
* FORMATTER { format(row, col); formatValue(Object, col) }
* plus ALL_COLUMNS, TABLE, DIALECT
*
* where:
* DataRow { rowNumber(); first(); last(); data(): List<Object>; value(column): Object }
* DataColumn { columnNumber(), name() }
*/
// Basic Config
BATCH_SIZE = 1000
COLUMNS_ON_ONE_LINE = true
INDENT_ROWS = true
MINIFY = false
SHOW_CONFIG = true
// Imports
import com.intellij.openapi.project.Project
import com.intellij.psi.codeStyle.CodeStyleSettingsManager
// Custom Functions
def getIndentCharacter(Project project) {
def settings = CodeStyleSettingsManager.getSettings(project)
return settings.getIndentOptions().USE_TAB_CHARACTER ? "\t" : " ".repeat(settings.getIndentOptions().INDENT_SIZE)
}
// Config From IDE
INDENT = getIndentCharacter(PROJECT)
REAL_EOL = System.getProperty("line.separator") // End of line delimiter
KEYWORDS_LOWERCASE = com.intellij.database.util.DbSqlUtil.areKeywordsLowerCase(PROJECT)
LAST_INDEX = COLUMNS.size() - 1
// Advanced Config
DEFAULT_TABLE_NAME = "TABLE_NAME"
VALUE_QUOTE = "\'"
COLUMN_QUOTE = "`"
TABLE_QUOTE = "`"
EOL = REAL_EOL
OB = "" // Output Buffer Variable
RI = 0 // Row Iterator Variable
BI = 0 // Batch Iterator Variable
VALUE_DELIMITER = ","
COLUMN_DELIMITER = ","
STATEMENT_DELIMITER = ";"
COLUMN_INDENT = INDENT
ROW_INDENT = INDENT_ROWS ? INDENT : ""
KW_INSERT_INTO = KEYWORDS_LOWERCASE ? "insert into " : "INSERT INTO "
KW_VALUES = KEYWORDS_LOWERCASE ? "values" : "VALUES"
KW_NULL = KEYWORDS_LOWERCASE ? "null" : "NULL"
KW_TRUE = KEYWORDS_LOWERCASE ? "true" : "TRUE"
KW_FALSE = KEYWORDS_LOWERCASE ? "false" : "FALSE"
INLINE_COMMENT_PREFIX = "-- "
BLOCK_COMMENT_PREFIX = "/*"
BLOCK_COMMENT_SUFFIX = "*/"
// Main loop function
def mmli() {
ROWS.each { row ->
mmliPrintRow(COLUMNS, row)
}
OUT.append(
STATEMENT_DELIMITER + REAL_EOL + REAL_EOL +
INLINE_COMMENT_PREFIX + "Inserts Created: " + RI + REAL_EOL +
INLINE_COMMENT_PREFIX + "Batches Created: " + BI + REAL_EOL
)
}
// Generate table name, with parent if it is available
TABLE_NAME = ""
if (TABLE != null) {
TABLE_PARENT = TABLE.getParent().getName()
if (TABLE_PARENT != null && TABLE_PARENT != "") {
TABLE_NAME += TABLE_QUOTE + TABLE_PARENT + TABLE_QUOTE + "."
}
TABLE_NAME += TABLE_QUOTE + TABLE.getName() + TABLE_QUOTE
} else {
TABLE_NAME += TABLE_QUOTE + DEFAULT_TABLE_NAME + TABLE_QUOTE
}
// Show config in comments
if (SHOW_CONFIG) {
OB += BLOCK_COMMENT_PREFIX + EOL
OB += "Batch Size: " + BATCH_SIZE + EOL
OB += "Columns On One Line: " + (COLUMNS_ON_ONE_LINE ? KW_TRUE : KW_FALSE) + EOL
OB += "Indent Character: '" + INDENT + "'" + EOL
OB += "Indent Rows: " + (INDENT_ROWS ? KW_TRUE : KW_FALSE) + EOL
OB += "Minify: " + (MINIFY ? KW_TRUE : KW_FALSE) + EOL
OB += BLOCK_COMMENT_SUFFIX + EOL
OB += EOL
}
// Make extra whitespace empty strings to minify
if (MINIFY) {
EOL = ""
COLUMN_INDENT = ""
ROW_INDENT = ""
}
// Handles batching & printing the buffer.
def mmliPrintRow(columns, row) {
if (RI % BATCH_SIZE == 0) {
// Handle a beginning row
BI++ // Count batches
if (RI != 0) {
// Separate each new statement
OB += STATEMENT_DELIMITER + REAL_EOL + REAL_EOL
}
mmliHandleBeginStatement(columns)
} else {
// Handle a continuation row
OB += VALUE_DELIMITER + EOL
}
// Handle a row's column data
mmliHandleRow(columns, row)
// Print, clear buffer, & increment row count
OUT.append(OB)
OB = ''
RI++
}
// Print the beginning of an insert
def mmliHandleBeginStatement(columns) {
OB += KW_INSERT_INTO + TABLE_NAME + " (" + EOL + COLUMN_INDENT
// Loop through each column and print its name
columns.eachWithIndex { column, index ->
OB += COLUMN_QUOTE + column.name() + COLUMN_QUOTE
if (index != LAST_INDEX) {
OB += COLUMN_DELIMITER
if (!COLUMNS_ON_ONE_LINE) {
OB += EOL + COLUMN_INDENT
}
}
}
OB += EOL + ")" + EOL + KW_VALUES + EOL
}
// Print the values to insert
def mmliHandleRow(columns, row) {
OB += ROW_INDENT + "(" // open new row
// Loop through each column and print its value
columns.eachWithIndex { column, index ->
def stringValue = FORMATTER.format(row, column)
def rawValue = row.value(column)
// Handle MySQL backslashes
if (DIALECT.getDbms().isMysql()) {
stringValue = stringValue.replace("\\", "\\\\")
}
stringValue = stringValue.replace(VALUE_QUOTE, VALUE_QUOTE + VALUE_QUOTE)
// Determine value type so it can be printed correctly
if (rawValue instanceof Boolean) {
OB += rawValue ? KW_TRUE : KW_FALSE
} else if (rawValue == null) {
OB += KW_NULL
} else if (rawValue instanceof byte[]) {
OB += "0x" + rawValue.encodeHex().toString()
} else if (rawValue instanceof Number) {
OB += stringValue
} else {
OB += VALUE_QUOTE + stringValue + VALUE_QUOTE
}
// Delimit each value
if (index != LAST_INDEX) {
OB += VALUE_DELIMITER
}
}
OB += ")" // close new row
}
// Run
mmli()
@Sherpard
Copy link

Sherpard commented Nov 2, 2020

It has issues with boolean types, but other than that, it works wonderfully, Thank you!

@ProjectCleverWeb
Copy link
Author

@Sherpard - Can you describe what you mean when you say it has issues with boolean types?

As far as I am aware, MySQL only stores booleans as single-digit integers.

@Sherpard
Copy link

Sherpard commented Nov 4, 2020

@ProjectClever That's true, but the extractor takes a 'true' or 'false' with quotes, instead of a quoteless true/false (accepted) or a 0/1.

@ProjectCleverWeb
Copy link
Author

@Shepard - I cannot replicate the issue you are describing. Would you be able to generate a sample DDL I could test against?

Also, please make sure that the error in question isn't caused by a column that has a type of ENUM('TRUE','FALSE'). Enum's store their values as integers but always return strings when SELECTed.

@Sherpard
Copy link

Sherpard commented Nov 5, 2020

Sure, here you got
https://gist.github.com/Sherpard/114e762a9c57ae30638917c5eb583e39

data type being used is "bit", sorry for the confusion with boolean...

@ProjectCleverWeb
Copy link
Author

@Shepard - Thanks! That helped me find & fix the issue. Honestly, I wasn't even aware BIT was a column type. The script has been updated with that fix as well as a few other improvements.

@tamert
Copy link

tamert commented Oct 8, 2021

Youniverse mann.. thank you so much!

@Rastishka
Copy link

Bugreport:
Сhecking a string at line 159 "058879e309724477".toString().isNumber() returns TRUE, and VALUE_QUOTEs are not added.
When this value is inserted into DB it causes error "Data truncation: Illegal double '058879e309724477' value found during parsing"

@ProjectCleverWeb
Copy link
Author

@Rastishka This is a bit late because I didn't see your comment until very recently, but I did update it to fix your reported issue. I don't have a quick way of confirming the change worked, but the change is fairly simple so it should work fine.

@CodeBrauer
Copy link

Great script, just found a small little problem with manual UUID v4 generation:

SELECT UUID_TO_BIN(LOWER(CONCAT(
        LPAD(HEX(FLOOR(RAND() * 0xffff)), 4, '0'),
        LPAD(HEX(FLOOR(RAND() * 0xffff)), 4, '0'), '-',
        LPAD(HEX(FLOOR(RAND() * 0xffff)), 4, '0'), '-',
        '4',
        LPAD(HEX(FLOOR(RAND() * 0x0fff)), 3, '0'), '-',
        HEX(FLOOR(RAND() * 4 + 8)),
        LPAD(HEX(FLOOR(RAND() * 0x0fff)), 3, '0'), '-',
        LPAD(HEX(FLOOR(RAND() * 0xffff)), 4, '0'),
        LPAD(HEX(FLOOR(RAND() * 0xffff)), 4, '0'),
        LPAD(HEX(FLOOR(RAND() * 0xffff)), 4, '0')))) AS id,

Will result in binary data that will be treated as string like this:

insert into `TABLE_NAME` (`id`) values ('0x9FF77ED19A304867B860287A38469FF1');

So if you're using BINARY(16) for UUIDs in MySQL the import will fail due
[1406] Data truncation: Data too long for column 'id' at row 1

-- Correct
insert into `TABLE_NAME` (`id`) values (0x9FF77ED19A304867B860287A38469FF1);

@ProjectCleverWeb
Copy link
Author

@CodeBrauer Looks like it was a problem with binary in general. I have updated the script and this issue should be fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment