Skip to content

Instantly share code, notes, and snippets.

@tpot
Created June 11, 2025 05:45
Show Gist options
  • Save tpot/3f6769e4bca3d55aed04c0632f50fd01 to your computer and use it in GitHub Desktop.
Save tpot/3f6769e4bca3d55aed04c0632f50fd01 to your computer and use it in GitHub Desktop.
DuckDB SQL to create CSV from a list of files

This SQL snippet takes a list of newline-separated filenames, for example from the output of find or fd, and creates a CSV file with the columns defined below. This is useful for creating a spreadsheet that can be imported into Excel or other tool for filtering or analysis.

  • path: original filename
  • dirname: the directory component of path, NULL if the file has no directory
  • basename: the filename component of path
  • extension: the extension component of the path, NULL for no extension

The dirname and basename columns are similar to the output of dirname(1) and basename(1) general commands.

COPY (
SELECT
path,
CASE
WHEN instr(path, '/') > 0 THEN
left(path, length(path) - length(reverse(split_part(reverse(path), '/', 1))) - 1)
END AS dirname,
reverse(split_part(reverse(path), '/', 1)) AS basename,
CASE
WHEN instr(basename, '.') > 0 THEN
reverse(split_part(reverse(basename), '.', 1))
END AS extension
FROM read_csv(
'files.txt',
header = FALSE,
column_names = ['path'],
delim = '\t'
)
) TO 'files.csv' (HEADER, DELIMITER ',');
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment