@jim60105
Last active June 23, 2025 01:02
Remove all '#' characters from commit messages in a Git repository history.
#!/bin/bash
# Copyright (C) 2025 陳鈞, licensed under GPL-3.0-or-later
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
# ==================================================================
#
# Automated Git Commit Message '#' Cleaner Script
# Purpose: Remove all '#' characters from commit messages in a Git repository history.
# This script is fully automated and requires no manual editing of commit messages.
#
# Usage:
#   ./git-cleanup-number-signs.sh [options] [start-commit]
#
# Options:
#   -h, --help      Show this help message and exit
#   -n, --dry-run   Preview mode, show changes without modifying history
#   -f, --force     Force execution without confirmation prompt
#
# Arguments:
#   start-commit    The commit hash to start processing from (exclusive).
#                   If omitted, all commits are processed.
#
# Examples:
#   ./git-cleanup-number-signs.sh d61eef44d0e57bacf9417131a764cd5dd219f069
#   ./git-cleanup-number-signs.sh --dry-run HEAD~10
#   ./git-cleanup-number-signs.sh --force
#
# Features:
# - Scans commit history for messages containing '#'
# - Optionally previews changes before applying
# - Backs up the current branch before rewriting history
# - Uses git filter-branch to rewrite commit messages
# - Cleans up filter-branch backup refs and temporary files
# - Displays a summary and sample of modified commits
set -euo pipefail
# Color definitions for log output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m'
# Logging functions for different message types
log_info() {
echo -e "${BLUE}ℹ️ $1${NC}"
}
log_success() {
echo -e "${GREEN}✅ $1${NC}"
}
log_warning() {
echo -e "${YELLOW}⚠️ $1${NC}"
}
log_error() {
echo -e "${RED}❌ $1${NC}"
}
# Print usage instructions
usage() {
cat << EOF
Usage: $0 [options] [start-commit]

Options:
  -h, --help      Show this help message
  -n, --dry-run   Preview changes only, do not modify history
  -f, --force     Force execution without confirmation

Arguments:
  start-commit    The commit hash to start processing from (exclusive)
                  If not specified, all commits will be processed

Examples:
  $0 d61eef44d0e57bacf9417131a764cd5dd219f069
  $0 --dry-run HEAD~10
  $0 --force
EOF
}
# Default values for script options
START_COMMIT=""
DRY_RUN=false
FORCE=false
# Parse command-line arguments
while [[ $# -gt 0 ]]; do
case $1 in
-h|--help)
usage
exit 0
;;
-n|--dry-run)
DRY_RUN=true
shift
;;
-f|--force)
FORCE=true
shift
;;
-* )
log_error "Unknown option: $1"
usage
exit 1
;;
* )
if [[ -z "$START_COMMIT" ]]; then
START_COMMIT="$1"
else
log_error "Too many arguments"
usage
exit 1
fi
shift
;;
esac
done
# Ensure script is run inside a Git repository
if ! git rev-parse --git-dir > /dev/null 2>&1; then
log_error "Not inside a Git repository"
exit 1
fi
# Ensure working directory is clean before rewriting history
if ! git diff-index --quiet HEAD --; then
log_error "Working directory has uncommitted changes. Please commit or stash first."
exit 1
fi
# Validate the start commit hash if provided
if [[ -n "$START_COMMIT" ]] && ! git rev-parse --verify "$START_COMMIT" > /dev/null 2>&1; then
log_error "Invalid commit hash: $START_COMMIT"
exit 1
fi
# Get the current branch name
CURRENT_BRANCH=$(git branch --show-current)
if [[ -z "$CURRENT_BRANCH" ]]; then
log_error "Unable to determine current branch"
exit 1
fi
# Scan for commits that need modification (contain '#')
log_info "Scanning for commits to modify..."
COMMITS_TO_MODIFY=()
COMMITS_INFO=() # Store commit details for reporting
if [[ -n "$START_COMMIT" ]]; then
# Process commits after the specified start commit
REV_RANGE="${START_COMMIT}..HEAD"
else
# Process all commits reachable from HEAD
REV_RANGE="HEAD"
fi
while IFS= read -r commit; do
full_msg=$(git log --format=%B -n 1 "$commit")
if [[ "$full_msg" == *"#"* ]]; then
COMMITS_TO_MODIFY+=("$commit")
author=$(git log --format=%an -n 1 "$commit")
# Store the committer date: the summary lookup filters with --since/--until,
# which compare against the committer date, not the author date
date=$(git log --format=%cI -n 1 "$commit")
COMMITS_INFO+=("$commit|$author|$date|$full_msg")
fi
done < <(git rev-list --reverse "$REV_RANGE")
# Exit if no commits require modification
if [[ ${#COMMITS_TO_MODIFY[@]} -eq 0 ]]; then
log_success "No commit messages containing '#' found."
exit 0
fi
# List commits that will be modified
log_info "Found ${#COMMITS_TO_MODIFY[@]} commit(s) to modify:"
for commit in "${COMMITS_TO_MODIFY[@]}"; do
title=$(git log --format=%s -n 1 "$commit")
full_msg=$(git log --format=%B -n 1 "$commit")
body=$(git log --format=%b -n 1 "$commit")
if [[ "$title" == *"#"* ]]; then
echo " ${commit:0:8}: $title [# in subject]"
elif [[ "$body" == *"#"* ]]; then
echo " ${commit:0:8}: $title [# in body]"
fi
done
# Preview mode: show before/after commit messages without modifying history
if [[ "$DRY_RUN" == true ]]; then
log_info "Preview of commit message changes:"
for commit in "${COMMITS_TO_MODIFY[@]}"; do
echo "--- Commit ${commit:0:8} ---"
echo "Before:"
git log --format=%B -n 1 "$commit" | sed 's/^/ /'
echo "After:"
git log --format=%B -n 1 "$commit" | sed 's/#//g' | sed 's/^/ /'
echo
done
exit 0
fi
# Confirm before proceeding unless --force is specified
if [[ "$FORCE" != true ]]; then
log_warning "This operation will rewrite Git history!"
read -p "Are you sure you want to continue? (y/N): " -n 1 -r
echo
if [[ ! $REPLY =~ ^[Yy]$ ]]; then
log_info "Operation cancelled."
exit 0
fi
fi
# Create a backup branch before rewriting history
BACKUP_BRANCH="backup-${CURRENT_BRANCH}-$(date +%Y%m%d-%H%M%S)"
git branch "$BACKUP_BRANCH"
log_info "Backup branch created: $BACKUP_BRANCH"
# Prepare to rewrite commit messages using git filter-branch
log_info "Starting automated commit message rewrite..."
# Create a temporary script for the message filter
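TEMP_SCRIPT=$(mktemp)
# Remove the temporary script even if filter-branch fails and 'set -e' aborts the run
trap 'rm -f "$TEMP_SCRIPT"' EXIT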
cat > "$TEMP_SCRIPT" << 'EOF'
#!/bin/bash
# Remove all '#' characters from commit message
sed 's/#//g'
EOF
chmod +x "$TEMP_SCRIPT"
# Run git filter-branch to rewrite commit messages
if [[ -n "$START_COMMIT" ]]; then
# Only process commits after the specified start commit
log_info "Processing range: ${START_COMMIT}..HEAD"
git filter-branch -f --msg-filter "$TEMP_SCRIPT" "${START_COMMIT}..HEAD"
else
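# Process every commit reachable from the current branch.
# Rewriting HEAD instead of --all keeps the backup branch created earlier untouched;
# with --all the backup branch would be rewritten along with everything else.
log_info "Processing all commits on $CURRENT_BRANCH"
git filter-branch -f --msg-filter "$TEMP_SCRIPT" -- HEAD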
fi
# Remove the temporary message filter script
rm -f "$TEMP_SCRIPT"
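# Clean up filter-branch backup refs and run garbage collection.
# Query the refs through Git rather than testing for .git/refs/original, so this
# also works from a subdirectory and when the refs have been packed.
ORIGINAL_REFS=$(git for-each-ref --format="%(refname)" refs/original/)
if [[ -n "$ORIGINAL_REFS" ]]; then
log_info "Cleaning up filter-branch backup refs..."
while IFS= read -r ref; do
git update-ref -d "$ref"
done <<< "$ORIGINAL_REFS"
git reflog expire --expire=now --all
git gc --prune=now --quiet
fi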
log_success "All '#' characters have been removed from commit messages."
log_info "Backup branch: $BACKUP_BRANCH"
log_warning "If you are satisfied with the result, you may delete the backup branch: git branch -D $BACKUP_BRANCH"
log_warning "You must force-push to remote: git push --force-with-lease"
# Display summary statistics
log_info "Modification summary:"
log_success "Successfully processed ${#COMMITS_TO_MODIFY[@]} commit(s)."
# Show up to 3 examples of modified commit messages
log_info "Sample of modified commits (showing up to 3):"
count=0
for info in "${COMMITS_INFO[@]}"; do
if [[ $count -ge 3 ]]; then
break
fi
IFS='|' read -r orig_hash orig_author orig_date orig_msg <<< "$info"
# After filter-branch, '#' has been removed from the message
new_msg=$(echo "$orig_msg" | sed 's/#//g')
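# Attempt to find the new commit in the rewritten branch by author, date, and subject line.
# --fixed-strings matches the patterns literally, so the subject needs no regex escaping.
# Searching HEAD (not --all) avoids matching the original commits kept on the backup branch.
new_commit=$(git log HEAD --format="%H" --fixed-strings --author="$orig_author" --since="$orig_date" --until="$orig_date" --grep="$(echo "$new_msg" | head -n1)" -n 1 2>/dev/null || echo "")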
if [[ -n "$new_commit" ]]; then
echo " Modified commit ${new_commit:0:8}:"
git log --format=%B -n 1 "$new_commit" | head -3 | sed 's/^/ /'
echo
else
echo " Could not find corresponding new commit (author/date/message may not be unique)"
fi
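# Plain assignment: '((count++))' returns status 1 when count is 0, which aborts under 'set -e'
count=$((count + 1))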
done
@jim60105
Author

Commit messages in one of my forks accidentally backlinked to the upstream project's issues and PRs; the `#` numbers were meant to refer to my own internal backlog numbering system. This script removes every `#` from the commit messages in the history. Once it has finished, force-push; GitHub should drop the backlinks after a while (this behavior is based on community experiments rather than official documentation and may stop working in the future).
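
As a rough illustration of what the rewrite does to each message, the filter boils down to a single `sed` substitution. The sketch below applies it to made-up sample messages; the message text and the `strip_number_signs` helper name are hypothetical and only there for demonstration.

```shell
# Minimal sketch: the same substitution the script's message filter applies.
# strip_number_signs is a hypothetical helper name used only for illustration.
strip_number_signs() {
    sed 's/#//g'
}

# Hypothetical commit messages; '#' is removed wherever it appears.
printf '%s\n' 'Fix login timeout (#42)' | strip_number_signs
printf '%s\n' 'Refs #7 and #19: tidy up logging' | strip_number_signs
```

Every `#` is dropped, not only those followed by digits, so a message such as `Use C# bindings` would change as well. Running the script with `--dry-run` first shows which messages are affected.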

@jim60105
Author

Note: I have only run this on a branch with a linear history. I haven't tested it on a history that contains merge commits, and it is likely to fail there.
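
One way to find out in advance whether a branch falls into the untested case is to count its merge commits before running the script. The sketch below relies on `git rev-list --merges --count`; the `count_merge_commits` helper name is an assumption made for illustration and is not part of the script.

```shell
# Minimal sketch: count merge commits reachable from a revision (default: HEAD).
# count_merge_commits is a hypothetical helper, not part of the script above.
count_merge_commits() {
    git rev-list --merges --count "${1:-HEAD}"
}

# Only run the check inside a Git repository that has at least one commit.
if git rev-parse --verify --quiet HEAD > /dev/null 2>&1; then
    merges=$(count_merge_commits HEAD)
    if [ "$merges" -eq 0 ]; then
        echo "History is linear: no merge commits found."
    else
        echo "Found $merges merge commit(s); test the rewrite on a throwaway clone first."
    fi
fi
```

A count of zero means the branch matches the linear-history case the script has actually been run on.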
