@JonathanWillitts
Last active November 5, 2024 15:46
Updates EDC dev environment (by pulling all existing repos, cloning new (or missing) clinicedc repos, and installing repos that aren't editable as editable)

Setting up edc_source dev environment

This document describes using the update_edc_code.sh script for both:

  • the initial setup of the edc_source development environment
  • updating the edc_source development environment, to pull down any recent additions/changes

The latest version of the script can be viewed at: https://gist.github.com/JonathanWillitts/13f37c3c169ff71d2488f683e7fbbf53#file-update_edc_code-sh, or downloaded directly from: https://gist.githubusercontent.com/JonathanWillitts/13f37c3c169ff71d2488f683e7fbbf53/raw/update_edc_code.sh

First time setup of 'edc_source':

For first time setup of the edc_source development environment, complete the following:

Prerequisites:

Ensure the GitHub CLI (gh) is installed; if it is not, install it:

$ gh
Work seamlessly with GitHub from the command line.

USAGE
  gh <command> <subcommand> [flags]
...

Ensure you are authenticated with the GitHub host; if not, configure authentication (this allows the script to query the clinicedc organisation for new repos):

$ gh auth status
github.com
  ✓ Logged in to github.com as ...
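
Both prerequisite checks can also be scripted. The following is a minimal sketch and not part of update_edc_code.sh: have_command and check_gh_ready are hypothetical helper names, and the sketch assumes that 'gh auth status' exits with a non-zero status when no account is logged in.

```shell
# Hypothetical helper: succeeds if <name> can be found on PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Hypothetical helper: prints a one-line status for the GitHub CLI and
# returns non-zero if it is missing or not authenticated.
check_gh_ready() {
    if ! have_command gh; then
        echo "gh: not installed"
        return 1
    fi
    # Assumption: 'gh auth status' fails when not logged in to any host.
    if ! gh auth status >/dev/null 2>&1; then
        echo "gh: installed, but not authenticated (try 'gh auth login')"
        return 1
    fi
    echo "gh: ready"
}
```

Running check_gh_ready before the install steps gives an early, readable failure instead of an error part-way through the update script.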

Install/setup:

Set up the directory structure, and download the update_edc_code.sh script:

$ mkdir edc_source
$ cd edc_source
$ curl --remote-name https://gist.githubusercontent.com/JonathanWillitts/13f37c3c169ff71d2488f683e7fbbf53/raw/update_edc_code.sh

Create Conda environment:

$ conda create --yes --name edc-dev python=3.12

Activate environment, and install initial development prerequisites:

$ conda activate edc-dev
$ pip install \
  -r https://raw.githubusercontent.com/clinicedc/edc/develop/requirements.tests/tox.txt \
  -r https://raw.githubusercontent.com/clinicedc/edc/develop/requirements.tests/test_utils.txt \
  -r https://raw.githubusercontent.com/clinicedc/edc/develop/requirements.tests/third_party_dev.txt \
  -r https://raw.githubusercontent.com/clinicedc/edc/develop/requirements.tests/lint.txt
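
The four requirements files share one base URL, so the same command can be assembled in a loop, which makes adding or removing a file a one-word change. This is only a sketch: edc_requirements_cmd is a hypothetical helper name, and it prints the command rather than running it.

```shell
# Hypothetical helper: prints the pip command for the four requirements files.
edc_requirements_cmd() {
    base_url="https://raw.githubusercontent.com/clinicedc/edc/develop/requirements.tests"
    # Build up "-r <url>" pairs in the function's own positional parameters.
    set --
    for name in tox test_utils third_party_dev lint; do
        set -- "$@" -r "${base_url}/${name}.txt"
    done
    echo pip install "$@"
}

edc_requirements_cmd
```

To run the command for real rather than print it, pass the output to the shell, for example: eval "$(edc_requirements_cmd)".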

Clone and install the edc and trial-specific repos:

$ bash ./update_edc_code.sh

See On running 'update_edc_code.sh' for more details on running the script.

To update 'edc_source':

The edc_source directory can be updated at any time by running the following:

$ conda activate edc-dev
$ cd edc_source
$ bash ./update_edc_code.sh

See On running 'update_edc_code.sh' for more details on running the script.
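
The script only offers editable installs when a non-base Conda environment is active, so it can be useful to confirm the environment before starting. A minimal sketch of that check follows; is_dedicated_conda_env is a hypothetical helper name, and it relies on the CONDA_DEFAULT_ENV variable that the script itself reads.

```shell
# Hypothetical helper: succeeds only when the environment name is non-empty
# and is not "base". Uses CONDA_DEFAULT_ENV when no argument is given.
is_dedicated_conda_env() {
    env_name=${1-${CONDA_DEFAULT_ENV-}}
    [ -n "${env_name}" ] && [ "${env_name}" != "base" ]
}

if is_dedicated_conda_env; then
    echo "Conda environment looks suitable for editable installs."
else
    echo "Activate a dedicated environment first, e.g. 'conda activate edc-dev'."
fi
```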

On running 'update_edc_code.sh':

The update_edc_code.sh script can be used for configuring an edc_source development environment for the first time, or for updating it.

The script will:

  1. Provide details of what it is about to do, and allow the user to continue/abort
  2. Perform a 'git pull' on all top-level directories it finds in the directory it is running in
  3. Clone any missing GitHub repos found in clinicedc, that were pushed after the cutoff date defined in the script (Jan 2021 by default)
  4. Clone any prerequisite GitHub repos from erikvw
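  5. Clone any protocol-specific trial repos that are defined in the script, currently those under: ambition-trial, effect-trial, inte-africa-trial, intecomm-trial, meta-trial, mocca-trial and respond-africa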
  6. Run pre-commit install on all configured repos
  7. Offer to install any repos that aren't currently installed as editable, as editable
    • Note: To avoid clobbering the system or Conda base Python environments with editable packages, this step requires a (non-base) Conda environment to be activated
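
Step 3 compares the cut-off value with each repo's pushedAt timestamp as a string. Assuming the comparison is made in plain character order (which is how jq orders strings), a value such as 2021-01z sorts after every January 2021 timestamp, because 'z' sorts after '-'. The sketch below reproduces that comparison with expr; pushed_on_or_after is a hypothetical helper name and is not part of the script.

```shell
# Hypothetical helper: succeeds if <pushed_at> sorts at or after <cut_off>
# in byte order. The "x" prefix forces expr to compare strings, not numbers.
pushed_on_or_after() {
    LC_ALL=C expr "x$1" '>=' "x$2" >/dev/null
}

for ts in "2021-01-31T23:59:59Z" "2021-02-01T00:00:00Z"; do
    if pushed_on_or_after "${ts}" "2021-01z"; then
        echo "${ts}: kept"
    else
        echo "${ts}: filtered out"
    fi
done
# Prints:
#   2021-01-31T23:59:59Z: filtered out
#   2021-02-01T00:00:00Z: kept
```

Under that assumption, a full timestamp such as 2020-01-13T02:40:21Z can be used as the cut-off value in the same way, and a repo pushed at exactly that time is kept.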
update_edc_code.sh:

#!/bin/bash
################################################################################
# Functions
###########
# Performs git pull on specified dir
# Usage: git_pull_dir <dir>
# e.g. git_pull_dir ./edc-utils
# Note: will skip <dir> if not a git repo
function git_pull_dir {
    local dir=$1
    if [[ -d "${dir}/.git" ]]; then
        echo "Pulling '${dir}' ..."
        git -C "${dir}" pull
    else
        echo "Skipping '${dir}' (not a git repo) ..."
    fi
    echo
}
# Queries GitHub for repos contained in specified <entity_name>, that were last
# pushed to after the specified <date>.
# Usage: get_repos_pushed_after <entity_name> <date>
# e.g. get_repos_pushed_after clinicedc 2021-01z
# e.g. get_repos_pushed_after clinicedc 2020-01-13T02:40:21Z
function get_repos_pushed_after {
    local gh_entity_name=$1
    local date=$2
    gh repo list "${gh_entity_name}" \
        --json name,pushedAt \
        --no-archived \
        --limit 100 \
        --jq "map(select(.pushedAt | . >= \"${date}\"))"
}
# Parses JSON passed on stdin, returning only (new line separated) repo name
# values.
function parse_repo_names {
    # Note: the Python source must start at column 0 inside the quotes.
    python -c \
'
import json
import sys
gh_json = json.load(sys.stdin)
exclusions = [
    ".github",
    "demo-repository",
    "canned-views",
    "clinicedc.github.io",
    "edc-calendar",
    "edc-call-manager",
    "edc-icecap-a",
    "edc-model-wrapper",
    "edc-pregnancy-utils",
    "edc-protocol-violation",
    "edc-reference",
    "edc-stata",
    "edc-subject-model-wrappers",
    "edc-tb",
]
repo_names = [repo["name"] for repo in gh_json if repo["name"] not in exclusions]
print(*repo_names, sep="\n")
'
}
# Function to clone repo from GitHub.
# Usage: clone_repo <entity_name> <repo_name>
# e.g. clone_repo clinicedc edc-utils
# Note: will silently skip repo if folder already exists.
function clone_repo {
    local entity_name=$1
    local repo_name=$2
    if [[ -z "$1" || -z "$2" ]]; then
        echo "Missing entity or repository name (args passed: 1='$1' 2='$2'). Exiting..."
        exit 1
    fi
    if [[ ! -d "${repo_name}" ]]; then
        local repo_url=https://github.com/${entity_name}/${repo_name}.git
        echo "Cloning from '${repo_url}' ..."
        git clone --branch develop "${repo_url}"
        echo "----------------"
    fi
}
################################################################################
# Script
###########
# Define settings
gh_entity_name="clinicedc"
cut_off_date="2021-01z"
# Export functions and paths to be run via xargs
export -f git_pull_dir
export -f clone_repo
export root_src_dir=${PWD}
echo -e "\n ____________________________________________________________\n" \
"| EDC code update script:\n" \
"| -----------------------\n" \
"| Warning!\n" \
"|\n" \
"| This script will:\n" \
"| - clone any missing GitHub repos found in '${gh_entity_name}' that were pushed after '${cut_off_date}'\n" \
"| - perform a 'git pull' on all top-level directories it finds in: '${PWD}'\n" \
"| - offer to install repos that aren't editable as editable, to: '${CONDA_DEFAULT_ENV}'\n" \
"|______________________________\n"
continue_prompt="Do you want to continue? Please answer [y]es to continue, or [N]o to abort: "
while true; do
    read -p "${continue_prompt}" user_response
    case $user_response in
        [yY] | [yY][eE][sS])
            echo -e "Continuing ...\n"
            break
            ;;
        [nN] | [nN][oO] | '')
            # No repo is selected at this point, so just report the exit.
            echo "Exiting ..."
            exit 1
            ;;
        * )
            # Unrecognised answer: prompt again.
            ;;
    esac
done
echo "Searching '${gh_entity_name}' organisation for repos pushed after '${cut_off_date}' and cloning if missing ..."
get_repos_pushed_after "${gh_entity_name}" "${cut_off_date}" \
    | parse_repo_names \
    | tr '\n' '\0' \
    | xargs -0 -n1 -I repo_name bash -c "clone_repo '${gh_entity_name}' 'repo_name'"
echo -e "Finished searching and cloning new '${gh_entity_name}' repos.\n"
echo "Searching for and updating existing git repos found in: ${PWD} ..."
find . -mindepth 1 -maxdepth 1 -type d -not -path '*/\.*' -print0 \
    | sort -z \
    | xargs -0 bash -c '
        # The trailing "_" fills $0, so every directory arrives as an argument.
        for arg do
            git_pull_dir "${arg}"
        done' _
echo -e "Finished updating existing git repos.\n"
echo "Cloning 'erikvw/' edc specific repos ..."
clone_repo "erikvw" "canned-views"
clone_repo "erikvw" "django-audit-fields"
clone_repo "erikvw" "django-crypto-fields"
clone_repo "erikvw" "django-revision"
echo -e "Finished cloning 'erikvw/' edc specific repos.\n"
echo "Cloning protocol specific repos ..."
clone_repo "ambition-trial" "ambition-edc"
clone_repo "ambition-trial" "ambition-form-validators"
clone_repo "effect-trial" "effect-edc"
clone_repo "effect-trial" "effect-form-validators"
clone_repo "inte-africa-trial" "inte-edc"
clone_repo "intecomm-trial" "intecomm-edc"
clone_repo "intecomm-trial" "intecomm-eligibility"
clone_repo "intecomm-trial" "intecomm-form-validators"
clone_repo "intecomm-trial" "intecomm-rando"
clone_repo "meta-trial" "meta-edc"
clone_repo "mocca-trial" "mocca-edc"
clone_repo "respond-africa" "respond-africa"
echo -e "Finished cloning protocol specific repos.\n"
echo "Running 'pre-commit install' on configured repos: ${PWD} ..."
find . -mindepth 1 -maxdepth 1 -type d -not -path '*/\.*' -print0 \
    | sort -z \
    | xargs -0 bash -c '
        # The trailing "_" fills $0; without it the first directory is skipped.
        for arg do
            target_dir="${root_src_dir}/${arg}"
            echo "Processing: ${target_dir}"
            git_dir="${target_dir}/.git"
            if [ ! -d "${git_dir}" ]
            then
                echo "... skipping: ${git_dir} is not a git repo ..."
            else
                cd "${target_dir}"
                echo "...in $(pwd)"
                echo "...installing pre-commit in: ${target_dir}"
                pre-commit install
                config_file="${target_dir}/.pre-commit-config.yaml"
                if [ ! -s "${config_file}" ]
                then
                    echo "...skipping update for: ${target_dir} has no .pre-commit-config.yaml ..."
                else
                    echo "...updating pre-commit in: $(pwd)"
                    pre-commit autoupdate
                    echo "...running pre-commit in: $(pwd)"
                    pre-commit run --all-files
                fi
                cd "${root_src_dir}"
            fi
            echo
        done' _
echo -e "Finished running 'pre-commit install'.\n"
echo "Searching for packages installed as editable ..."
editable_packages=$(pip list -e | tail -n +3 | cut -f1 -d ' ')
echo -e "Found: ${editable_packages}\n"
echo "Searching for local dirs ..."
local_dirs=($(find . -mindepth 1 -maxdepth 1 -type d -not -path '*/\.*' | sort))
echo -e "Found:" "${local_dirs[@]}" "\n"
echo "Searching for local repos that aren't currently pip installed as editable..."
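# Entries are compared against directory names, so they use the repo spelling
# (e.g. "canned-views", as cloned above).
repos_to_ignore=(
    "edc"
    "adverse-event-app"
    "ambition-edc"
    "ambition-form-validators"
    "canned-views"
    "inte-edc"
    "mocca-edc"
    "respond-africa"
)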
non_editable_repos=()
for local_dir in "${local_dirs[@]}"; do
    # Strip the leading "./" that find adds
    dir_name=${local_dir:2}
    if [[ -d "${dir_name}/.git" ]] \
        && [[ "${dir_name}" != _* ]] \
        && [[ ${editable_packages} != *"${dir_name}"* ]] \
        && [[ ! " ${repos_to_ignore[*]} " =~ " ${dir_name} " ]] \
    ; then
        non_editable_repos+=("${dir_name}")
    fi
done
echo "Found:"
printf '%s\n' "${non_editable_repos[@]}"
# Try to avoid clobbering system or conda base python environments with editable packages
echo -e "\nChecking environment ..."
echo "Python: $(which python)"
echo -e "Pip: $(which pip)\n"
if [[ -z "${CONDA_DEFAULT_ENV}" ]]; then
    echo "Warning: Not in conda environment."
    echo "Please select a dedicated conda environment (e.g. 'conda activate edc-dev') and try again."
    echo "Exiting..."
    exit 1
fi
if [[ "${CONDA_DEFAULT_ENV}" == base ]]; then
    echo "Warning: Current conda environment is '${CONDA_DEFAULT_ENV}'."
    echo "Please select a dedicated conda environment (e.g. 'conda activate edc-dev') and try again."
    echo "Exiting..."
    exit 1
fi
# Prompt to install each non editable repo as editable
install_all=false
install_prompt="Please answer [y]es, [N]o, or [a]ll to install all remaining repos: "
for repo in "${non_editable_repos[@]}"; do
    while true; do
        if [[ "${install_all}" == true ]]; then
            user_response=all
        else
            echo -e "\nInstall '${repo}' repo as editable? "
            read -p "${install_prompt}" user_response
            # If response is [a]ll, set install_all flag true to skip future prompts
            user_response=$(echo "${user_response}" | tr '[:upper:]' '[:lower:]')
            if [[ "${user_response}" == "a" || "${user_response}" == "all" ]]; then
                install_all=true
            fi
        fi
        case $user_response in
            [yY] | [yY][eE][sS] | [aA] | [aA][lL][lL])
                echo "Installing '${repo}' as editable ..."
                pip install -e "${repo}"
                break
                ;;
            [nN] | [nN][oO] | '')
                echo -e "Skipping '${repo}' ..."
                break
                ;;
            * )
                ;;
        esac
    done
done
echo "Finished."