Last active
September 2, 2024 06:51
Find orphan files in a Zotero database.
# Install the packages below (in the `library` function) if you don't
# already have them.

# library(checkmate)
# library(magrittr)
# library(purrr)
# library(readr)
# library(stringr)

#' List all files linked to a reference in a Zotero library
#'
#' @description
#'
#' This function reads a CSV file exported from Zotero and extracts the
#' information about the files linked to the references in the library.
#'
#' @details
#'
#' To export your library from Zotero, go to the menu `File > Export Library...`
#' and choose the CSV format.
#'
#' @param lib_file A string with the path to the Zotero library exported as
#'   a CSV file (important!).
#' @param basename A [`logical`][base::logical()] flag indicating whether the
#'   function should return only the file names (`TRUE`) or the full paths to
#'   the files (`FALSE`) (default: `TRUE`).
#'
#' @return A [`character`][base::as.character] vector with the names of the
#'   files linked to the references in the Zotero library.
#'
#' @noRd
list_linked_files <- function(lib_file = file.choose(),
                              basename = TRUE) {
  checkmate::assert_file_exists(lib_file, access = "r")
  checkmate::assert_flag(basename)

  out <-
    lib_file |>
    readr::read_csv(col_types = readr::cols(.default = "c")) |>
    magrittr::extract2("File Attachments") |>
    # Split on "; " only when a drive letter (e.g., "C:") follows, so a
    # "; " inside a file name is not treated as a separator.
    stringr::str_split("; (?=[A-Z]:)") |>
    unlist() |>
    stringr::str_squish() |>
    # Drop a single trailing non-alphanumeric character (e.g., a stray ";").
    stringr::str_remove("[^A-Za-z0-9]$") |>
    purrr::discard(is.na)

  if (isTRUE(basename)) {
    basename(out)
  } else {
    out
  }
}

#' Find orphan files in a Zotero library
#'
#' @description
#'
#' This function compares the files in a folder with the files linked to the
#' references in a Zotero library and returns the names of the orphan files.
#'
#' @param lib_file A string with the path to the Zotero library exported as
#'   a CSV file (important!).
#' @param file_folder A string with the path to the folder containing the files
#'   linked to the references in the Zotero library. The default,
#'   `utils::choose.dir()`, is only available on Windows; on other systems,
#'   pass the path explicitly.
#'
#' @return A [`character`][base::as.character] vector with the names of the
#'   orphan files.
#'
#' @noRd
find_orphan_files <- function(lib_file = file.choose(),
                              file_folder = utils::choose.dir()) {
  checkmate::assert_file_exists(lib_file, access = "r")
  checkmate::assert_directory_exists(file_folder, access = "rw")

  linked_files <- list_linked_files(lib_file, basename = TRUE)
  real_files <- list.files(file_folder) |> basename()

  # Orphans are the files in the folder that no reference links to.
  real_files[!real_files %in% linked_files]
}
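
The two steps the functions above perform (splitting the `File Attachments` column and comparing the result with the folder contents) can be tried without a Zotero export. The following is a minimal base-R sketch, not the gist's own code: the attachment string, the file names, and the temporary folder are all made up for illustration, and the `"; "` separator is inferred from the split pattern used above.

```r
# Hypothetical "File Attachments" cell, mimicking the "; "-separated Windows
# paths that the split pattern above expects (made up for illustration).
cell <- paste0(
  "C:/Zotero/storage/AAAA/Smith; Jones 2020.pdf; ",
  "C:/Zotero/storage/BBBB/Lee 2021.pdf"
)

# Split on "; " only when a drive letter follows, so the "; " inside
# "Smith; Jones 2020.pdf" is left alone.
linked_paths <- strsplit(cell, "; (?=[A-Z]:)", perl = TRUE)[[1]]
linked_files <- basename(linked_paths)

# Build a throwaway folder holding the two linked files and one orphan.
folder <- file.path(tempdir(), "zotero-orphan-demo")
dir.create(folder, showWarnings = FALSE)
invisible(file.create(file.path(folder, c(linked_files, "Orphan 2019.pdf"))))

# Orphans are the files on disk that no reference links to.
real_files <- list.files(folder)
orphans <- real_files[!real_files %in% linked_files]
print(orphans)

# Clean up the demo folder.
unlink(folder, recursive = TRUE)
```

Here `orphans` should contain only `"Orphan 2019.pdf"`, since the other two files appear in the attachment string.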
Hi @wlperry ,
I added some documentation to the functions. See if that helps.
Super cool - thanks - this is the best!!!
Wow - thank you so much
bill
Hi, I was trying this code on a Mac but am not sure what file to point this function to... I am on a Mac and the library is stored as zotero.sqlite. I wonder if you could help at all. Thanks!
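
Going by the function documentation, `lib_file` should point to a CSV export of the library (`File > Export Library...`), not to `zotero.sqlite`. Two parts of the script also look Windows-specific: the split pattern only recognizes paths that start with a drive letter, and `utils::choose.dir()` exists only on Windows, so `file_folder` would have to be passed explicitly on a Mac. Below is a minimal base-R sketch of a split that also accepts absolute POSIX paths. `split_attachments()` is a hypothetical helper, not part of the gist, and the sample paths assume that a macOS export lists absolute paths separated by `"; "`, which has not been checked against a real export.

```r
# Hypothetical helper (not part of the gist): split a "File Attachments" cell
# whose entries are absolute paths, either Windows ("C:/...") or POSIX ("/...").
split_attachments <- function(cell) {
  # Split on "; " only when a drive letter or a leading "/" follows.
  parts <- strsplit(cell, "; (?=[A-Za-z]:|/)", perl = TRUE)[[1]]
  trimws(parts)
}

# Made-up macOS-style cell; a real export may look different.
cell <- paste0(
  "/Users/me/Zotero/storage/AAAA/Smith 2020.pdf; ",
  "/Users/me/Zotero/storage/BBBB/Lee 2021.pdf"
)

linked_files <- basename(split_attachments(cell))
print(linked_files)
```

If the assumption about the export format holds, the same pattern could replace the one passed to `stringr::str_split()` in `list_linked_files()`.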