@nanxstats
Created August 29, 2024 05:12
Minimal example for analyzing R package changelogs with OpenAI API
# Download the tidyverse DESCRIPTION file and list the packages it imports
desc <- tempfile()
curl::curl_download("https://cran.r-project.org/web/packages/tidyverse/DESCRIPTION", destfile = desc)
deps <- desc::desc_get_deps(desc)
pkgs <- deps[deps$type %in% c("Imports"), "package"]

# Fetch each package's NEWS page from CRAN as HTML
urls <- paste0("https://cloud.r-project.org/web/packages/", pkgs, "/news/news.html")
html <- vector("list", length(urls))
for (i in seq_along(urls)) html[[i]] <- rawToChar(curl::curl_fetch_memory(urls[i])$content)

# Convert each HTML page to Markdown with pandoc
markdown <- vector("list", length(urls))
for (i in seq_along(urls)) markdown[[i]] <- paste0(pandoc::pandoc_convert(text = html[[i]], from = "html", to = "markdown"), collapse = "\n")
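# Drop packages without a NEWS page. Logical indexing is used here because
# `x[-which(cond)]` returns an empty list when nothing matches.
markdown <- markdown[!vapply(markdown, grepl, logical(1), pattern = "# Not Found", fixed = TRUE)]
# Use only the first three changelogs in the prompt
content <- paste(markdown[1:3], collapse = "\n\n")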
# Build the prompt: instructions followed by the combined changelogs
prompt <- paste(
  "Analyze the following Markdown changelog for multiple R packages.",
  "For each package, determine if there are major changes between the",
  "last version and the previous version. Consider major changes to include",
  "any significant new features, breaking changes, or major bug fixes.",
  "Ignore minor tweaks, small bug fixes, or documentation updates.",
  "Return the results in the CSV tabular format with the following columns:",
  "Package Name, Last Version, Previous Version, Major Changes Present",
  "(Yes/No), Description of Major Changes (if any).",
  "Here are the Markdown changelogs:\n\n",
  content
)
# Send the prompt to the OpenAI chat completions API (requires OPENAI_API_KEY)
result <- gptstudio::openai_create_chat_completion(
  prompt = prompt,
  model = "gpt-4o",
  openai_api_key = Sys.getenv("OPENAI_API_KEY")
)
# Print the model's reply; an example reply follows as comments
cat(result$choices[[1]]$message$content)
# Here is the summary of major changes in the given changelogs for the broom and conflicted packages serialized into a CSV format:
#
# ```csv
# Package Name,Last Version,Previous Version,Major Changes Present (Yes/No),Description of Major Changes (if any)
# broom,development version,1.0.5,Yes,"- Added support for `conf.level` in `augment.lm()`; Added support for columns `adj.r.squared` and `npar` in `glance()` method for objects from `mgcv::gam`; Deprecated tidiers for margins and sp packages."
# broom,1.0.5,1.0.4,No,""
# broom,1.0.4,1.0.3,No,""
# broom,1.0.3,1.0.2,No,""
# broom,1.0.2,1.0.1,No,""
# broom,1.0.1,1.0.0,No,""
# broom,1.0.0,0.8.0,Yes,"- First 'production' release; Major overhaul including notable error handling changes; introduced guidelines for backward compatibility."
# broom,0.8.0,0.7.12,No,""
# broom,0.7.12,0.7.11,No,""
# broom,0.7.11,0.7.10,No,""
# broom,0.7.10,0.7.9,No,""
# broom,0.7.9,0.7.8,No,""
# broom,0.7.8,0.7.7,No,""
# broom,0.7.7,0.7.6,No,""
# broom,0.7.6,0.7.5,No,""
# broom,0.7.5,0.7.4,No,""
# broom,0.7.4,0.7.3,Yes,"- Introduced tidier support for several new model objects and improved functionality of existing tidiers."
# broom,0.7.3,0.7.2,No,""
# broom,0.7.2,0.7.1,No,""
# broom,0.7.1,0.7.0,Yes,"- Introduced new tidiers; Improved existing tidiers with interval arguments; Bug fixes."
# broom,0.7.0,0.5.6,Yes,"- Major release with several new tidiers, soft-deprecations, and planned hard-deprecations. Changed reporting of degrees of freedom for `lm` objects; Moved away from supporting `summary.*()` objects."
# conflicted,1.2.0,1.1.0,Yes,"- New `conflicts_prefer()` to declare multiple preferences at once; Disambiguation message now provides clickable preferences."
# conflicted,1.1.0,1.0.4,Yes,"- New `conflicted_prefer_all()` and `conflicted_prefer_matching()` to prefer functions en masse; Improved conflict detection and resolution."
# conflicted,1.0.4,1.0.3,No,""
# conflicted,1.0.3,1.0.2,No,""
# conflicted,1.0.2,1.0.1,No,""
# conflicted,1.0.1,1.0.0,No,""
# conflicted,1.0.0,Initial Release,Yes,"- Initial release with `conflict_scout()` and `conflict_prefer()` functions to manage conflicts in R packages."
# cli,3.6.3,3.6.2,No,""
# cli,3.6.2,3.6.1,No,""
# cli,3.6.1,3.6.0,No,""
# cli,3.6.0,3.5.0,Yes,"- New `keypress()` function to read a single key press from a terminal; several enhancements and new hash functions."
# cli,3.5.0,3.4.1,Yes,"- New `pretty_print_code()` for syntax highlighting at the R console; new hash functions."
# cli,3.4.1,3.4.0,No,""
# cli,3.4.0,3.3.0,Yes,"- Experimental ANSI hyperlinks in RStudio and terminals; improved vector collapsing behavior."
# cli,3.3.0,3.2.0,Yes,"- Improved behavior of ANSI hyperlinks; detection of terminal capabilities updated."
# cli,3.2.0,3.1.1,Yes,"- Ensured CLI compatibility with ESS background themes and color handling."
# cli,3.1.1,3.1.0,No,""
# cli,3.1.0,2.5.0,Yes,"- Integrated with the new rlang 1.0 for handling errors; new hashing functions and other utilities."
# cli,2.5.0,2.4.0,Yes,"- Updated and added styles, improved CLI utilities including robust method handling."
# cli,2.4.0,2.3.1,No,""
# cli,2.3.1,2.3.0,No,""
# cli,2.3.0,2.2.0,Yes,"- Enhanced styling and strip functions, better handling of terminal colors."
# cli,2.2.0,2.1.0,No,""
# cli,2.1.0,2.0.2,No,""
# cli,2.0.2,2.0.1,No,""
# cli,2.0.1,2.0.0,No,""
# cli,2.0.0,1.1.0,Yes,"- Introduced a new set of functions for creating semantic CLI elements; bug fixes for better dynamic tty handling."
# cli,1.1.0,1.0.1,No,""
# cli,1.0.1,1.0.0,No,""
# cli,1.0.0,Initial Release,Yes,"- Initial CRAN release with basic CLI handling utilities."
# ```
#
# This table provides a clear summary of the major changes for each package between versions, focusing on significant new features, breaking changes, or major bug fixes.
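
The reply above is free text with the table wrapped in a csv code fence, so it needs one more step before it can be filtered or joined in R. The following is a minimal sketch of that step and is not part of the original gist. `extract_csv()` is a hypothetical helper, and it assumes the model keeps wrapping the table in a fenced block as in the example output, which is not guaranteed.

```r
# Hypothetical helper: pull the first fenced block out of a model reply
# and parse it as CSV. Assumes the reply contains at least one pair of
# fence lines (lines starting with three backticks).
extract_csv <- function(text) {
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]
  fences <- grep("^```", lines)
  if (length(fences) < 2 || fences[2] - fences[1] < 2) {
    stop("No non-empty fenced block found in the response.")
  }
  body <- lines[seq(fences[1] + 1, fences[2] - 1)]
  # check.names = FALSE keeps headers such as "Package Name" unchanged
  utils::read.csv(text = paste(body, collapse = "\n"), check.names = FALSE)
}

# Usage with the `result` object from the script above (not run here):
# changes <- extract_csv(result$choices[[1]]$message$content)
# changes[changes[[4]] == "Yes", ]
```

Stopping with an error when no fence is found is a deliberate choice in this sketch: a reply in an unexpected shape is easier to notice than a silently empty data frame.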